[Bug tree-optimization/71040] New: [7 Regression] ICE: verify_gimple failed (error: invalid operand in unary operation; error: incorrect sharing of tree nodes) w/ -O3

2016-05-09 Thread asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71040

Bug ID: 71040
   Summary: [7 Regression] ICE: verify_gimple failed (error:
invalid operand in unary operation; error: incorrect
sharing of tree nodes) w/ -O3
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: asolokha at gmx dot com
  Target Milestone: ---

gcc-7.0.0-alpha20160508 ICEs when compiling the following reduced snippet w/
-O3:

struct
{
  int b8 : 24;
} ud, ce;
int et;

void
fo (void);

void
p4 (int pn)
{
  while (et != 0)
{
  short int *uc = (short int *)
  if (pn != 0)
fo ();
  *uc = ce.b8;
  ++et;
}
}


% gcc-7.0.0-alpha20160508 -c -O3 apdexura.c
apdexura.c: In function 'p4':
apdexura.c:11:1: error: invalid operand in unary operation
 p4 (int pn)
 ^~
# VUSE <.MEM_8(D)>
ce_b8_lsm0.5_6 = () MEM[(struct  *)];
apdexura.c:11:1: error: invalid operand in unary operation
apdexura.c:11:1: error: incorrect sharing of tree nodes
MEM[(struct  *)]
# VUSE <.MEM_8(D)>
_23 = (short int) MEM[(struct  *)];
apdexura.c:11:1: internal compiler error: verify_gimple failed

Re: [C PATCH] Warn for optimize attribute on decl after definition (PR c/70255)

2016-05-09 Thread Martin Sebor

On 05/09/2016 08:45 AM, Marek Polacek wrote:

In this PR, Richi pointed out that we don't warn for the case when a
declaration with attribute optimize follows the definition which is lacking
that attribute.  This patch adds such a warning.  Though the question is
whether this shouldn't apply to more attributes than just "optimize".  And,
as can be seen in the testcase, we'll warn even for the case when the
definition has
   optimize ("no-associative-math,O2")
and the declaration
   optimize ("O2,no-associative-math")
Not sure if we have something better than attribute_value_equal, though.


There is attribute_list_equal which seems more appropriate given
that there could be more than one attribute optimize associated
with a function, and the order of the attributes shouldn't matter.
attribute_value_equal only returns true if all attributes are
the same and in the same order.  I would not expect GCC to warn
on the following, for example:

  int __attribute__ ((optimize ("no-reciprocal-math"),
  optimize ("no-associative-math")))
  f () { return 0; }

  int __attribute__ ((optimize ("no-associative-math"),
  optimize ("no-reciprocal-math")))
  f ();
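
The order-insensitive comparison argued for above can be sketched outside of GCC. This is only an illustration, not GCC's attribute_list_equal: the helper names split_options and same_optimize_options are hypothetical, and the sketch assumes the options inside an optimize string can be treated as an unordered set, which ignores any case where option order changes the meaning.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper: split a comma-separated option string into a set,
// so that the order in which the options were written no longer matters.
static std::set<std::string> split_options(const std::string &text) {
  std::set<std::string> options;
  std::istringstream in(text);
  std::string item;
  while (std::getline(in, item, ',')) {
    if (!item.empty())
      options.insert(item);
  }
  return options;
}

// Hypothetical helper: true when both strings name the same options,
// regardless of order.
bool same_optimize_options(const std::string &a, const std::string &b) {
  return split_options(a) == split_options(b);
}

// Example use with the two spellings quoted earlier in this thread.
void example() {
  assert(same_optimize_options("no-associative-math,O2",
                               "O2,no-associative-math"));
  assert(!same_optimize_options("O2", "O3"));
}
```

A comparison along these lines would accept both spellings from the testcase, at the cost of also accepting reorderings that might matter to the option parser.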

Martin



Re: Re: Re: GCC 6.1 Hard-coded C++ header paths and relocation problem on Windows

2016-05-09 Thread lh mouse
I suspect the problem is in gcc/gcc/incpath.c.
I had been looking through that file for a few days but eventually gave up.

It is worth mentioning that adding '-iprefix /this/need/not/exist' makes the
problem vanish.
This might have something to do with the following line in incpath.c (it should
be line #132 on gcc-6-branch at the moment):
```
  if (iprefix && (len = cpp_GCC_INCLUDE_DIR_len) != 0)
```
Still I have no idea about how relocated paths are pulled in.

I am looking forward to a patch for the relocation problem.

--   
Best regards,
lh_mouse
2016-05-10

-
From: Brett Neumeier
Sent: 2016-05-10 04:31
To: lh_mouse
Cc: Jonathan Wakely, gcc
Subject: Re: Re: GCC 6.1 Hard-coded C++ header paths and relocation problem on Windows

On Tue, May 3, 2016 at 10:01 AM, lh_mouse  wrote:
> Should I file a bug report then?
> We need some Linux testers, though not many people on Linux relocate 
> compilers.

For what it's worth -- I encountered the same problem on a GNU/Linux
system. In my specific situation, I'm cross-compiling GCC using an
AMD64-to-mips64el cross-toolchain, and installing the resulting GCC in
a sysroot directory. When I try to use that GCC on a target device
where (of course) the sysroot directory becomes "/", the hard-coded
"/path/to/sysroot" from the host system is still used to find the C++
headers, resulting in the same ".../include/c++/6.1.1/cstdlib:75:25:
fatal error: stdlib.h: No such file or directory" error message you
got.

Changing #include_next to #include in cstdlib and cmath fixed my
problem -- so, thank you very much for this discussion! It helped at
least one other person.

Please let me know if there's any other testing I can do to help.

Cheers,

Brett




Re: Re: Re: GCC 6.1 Hard-coded C++ header paths and relocation problem on Windows

2016-05-09 Thread lh mouse
We use neither --with-sysroot nor --with-build-sysroot.
The reason is that the hard-coded path in the GCC repository (that is, the
/mingw/ one) does not actually exist.

In order to build GCC for mingw targets, we use one of two workarounds:
0) Make a symlink (or rather, a copy, since Windows does not support symlinks) 
as /mingw/, as mentioned in 
https://sourceforge.net/p/mingw-w64/wiki2/Native%20Win64%20compiler/
1) Replace the non-existent path with an existent one, as done in 
https://github.com/lhmouse/MINGW-packages/blob/master/mingw-w64-gcc-git/PKGBUILD#L112

--   
Best regards,
lh_mouse
2016-05-10

-
From: Andrew Pinski
Sent: 2016-05-10 05:10
To: Brett Neumeier
Cc: lh_mouse, Jonathan Wakely, gcc
Subject: Re: Re: GCC 6.1 Hard-coded C++ header paths and relocation problem on Windows

On Mon, May 9, 2016 at 1:31 PM, Brett Neumeier  wrote:
> On Tue, May 3, 2016 at 10:01 AM, lh_mouse  wrote:
>> Should I file a bug report then?
>> We need some Linux testers, though not many people on Linux relocate 
>> compilers.
>
> For what it's worth -- I encountered the same problem on a GNU/Linux
> system. In my specific situation, I'm cross-compiling GCC using an
> AMD64-to-mips64el cross-toolchain, and installing the resulting GCC in
> a sysroot directory. When I try to use that GCC on a target device
> where (of course) the sysroot directory becomes "/", the hard-coded
> "/path/to/sysroot" from the host system is still used to find the C++
> headers, resulting in the same ".../include/c++/6.1.1/cstdlib:75:25:
> fatal error: stdlib.h: No such file or directory" error message you
> got.
>
> Changing #include_next to #include in cstdlib and cmath fixed my
> problem -- so, thank you very much for this discussion! It helped at
> least one other person.
>
> Please let me know if there's any other testing I can do to help.


This sounds like a good use of --with-build-sysroot instead of just
--with-sysroot.
I use the following for the Canadian cross:
--with-sysroot=/ --with-build-sysroot=${SYSROOT}

Thanks,
Andrew

>
> Cheers,
>
> Brett




[Bug tree-optimization/71039] New: [7 Regression] ICE: verify_ssa failed (error: definition in block 4 does not dominate use in block 5) w/ -O1 and above

2016-05-09 Thread asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71039

Bug ID: 71039
   Summary: [7 Regression] ICE: verify_ssa failed (error:
definition in block 4 does not dominate use in block
5) w/ -O1 and above
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: asolokha at gmx dot com
  Target Milestone: ---

gcc-7.0.0-alpha20160508 snapshot ICEs when compiling the following reduced
snippet at -O1 and above:

struct wv
{
  int qi;
} qp, *ft;
void *pb;

void
wz (void)
{
  struct wv *vf = pb ? (struct wv *) : 
  *ft = *vf;
}

% x86_64-pc-linux-gnu-gcc-7.0.0-alpha20160508 -c -O1 fo7dullr.c 
fo7dullr.c: In function 'wz':
fo7dullr.c:8:1: error: definition in block 4 does not dominate use in block 5
 wz (void)
 ^~
for SSA_NAME: ft.2_2 in statement:
# .MEM_7 = VDEF <.MEM_4(D)>
*ft.2_2 = MEM[(struct wv *)];
fo7dullr.c:8:1: internal compiler error: verify_ssa failed

[PATCH, rs6000] Fix PR70963: Wrong code for V2DF/V2DI vec_cts with zero scale factor

2016-05-09 Thread Bill Schmidt
Hi,

PR70963 reports a problem with vec_cts when used to convert vector double to 
vector long long.
This is due to a register with an undefined value that is generated only when 
the scale factor is
zero.  This patch adds logic to provide the correct value when the scale factor 
is zero.

The problem from the PR is in the define_expand for vsx_xvcvdpsxds_scale.  The 
define_expand
for vsx_xvcvdpuxds_scale clearly has the same problem, although it is not 
possible to reach this
via a call to vec_cts.  The raw builtin __builtin_vsx_xvcvdpuxds_scale can be 
used, however, and
I’ve shown this in the test case.
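
The shape of the fix can be illustrated with a scalar analogue. This is a sketch only, not the RTL expander: convert_scaled is a hypothetical name, the sketch assumes the scale step multiplies by 2^scale, and it ignores the saturation that the real vector conversion performs. The point is that the temporary is always given a defined value, including when the scale factor is zero.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical scalar analogue of the fixed expander logic.
// With scale == 0 the input is used directly; otherwise it is scaled first.
// In the buggy shape, the temporary was only written when scale != 0.
long long convert_scaled(double x, int scale) {
  double tmp;
  if (scale == 0) {
    tmp = x;                      // reuse the operand, no scaling needed
  } else {
    tmp = std::ldexp(x, scale);   // x * 2^scale
  }
  return static_cast<long long>(tmp);  // truncates toward zero
}

// Example use covering the zero and non-zero scale paths.
void example() {
  assert(convert_scaled(3.75, 0) == 3);
  assert(convert_scaled(1.5, 2) == 6);
}
```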

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions.  
Is this ok for
trunk, and eventual backport to 6 and 5?

Thanks,
Bill


[gcc]

2016-05-09  Bill Schmidt  

* config/rs6000/vsx.md (vsx_xvcvdpsxds_scale): Generate correct
code for a zero scale factor.
(vsx_xvcvdpuxds_scale): Likewise.

[gcc/testsuite]

2016-05-09  Bill Schmidt  

* gcc.target/powerpc/pr70963.c: New.


Index: gcc/config/rs6000/vsx.md 
===
--- gcc/config/rs6000/vsx.md(revision 236051)   
+++ gcc/config/rs6000/vsx.md(working copy)  
@@ -1717,10 +1717,15 @@
 {  
   rtx op0 = operands[0];   
   rtx op1 = operands[1];   
-  rtx tmp = gen_reg_rtx (V2DFmode);
+  rtx tmp; 
   int scale = INTVAL(operands[2]); 
-  if (scale != 0)  
-rs6000_scale_v2df (tmp, op1, scale);   
+  if (scale == 0)  
+tmp = op1; 
+  else 
+{  
+  tmp  = gen_reg_rtx (V2DFmode);   
+  rs6000_scale_v2df (tmp, op1, scale); 
+}  
   emit_insn (gen_vsx_xvcvdpsxds (op0, tmp));   
   DONE;
 }) 
@@ -1741,10 +1746,15 @@
 {  
   rtx op0 = operands[0];   
   rtx op1 = operands[1];   
-  rtx tmp = gen_reg_rtx (V2DFmode);
+  rtx tmp; 
   int scale = INTVAL(operands[2]); 
-  if (scale != 0)  
-rs6000_scale_v2df (tmp, op1, scale);   
+  if (scale == 0)  
+tmp = op1; 
+  else 
+{  
+  tmp = gen_reg_rtx (V2DFmode);
+  rs6000_scale_v2df (tmp, op1, scale); 
+}  
   emit_insn (gen_vsx_xvcvdpuxds (op0, tmp));   
   DONE;
 }) 
Index: gcc/testsuite/gcc.target/powerpc/pr70963.c   
===
--- gcc/testsuite/gcc.target/powerpc/pr70963.c  (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr70963.c  (working copy)  
@@ -0,0 +1,39 @@
+/* { dg-do run { target { powerpc64*-*-* && vsx_hw } } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */   
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */  
+/* { dg-options "-maltivec" } */  

[Bug libstdc++/71038] New: copy_file(...) returns false on successful copy.

2016-05-09 Thread eric at efcs dot ca
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71038

Bug ID: 71038
   Summary: copy_file(...) returns false on successful copy.
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: eric at efcs dot ca
  Target Milestone: ---

The title says it all. copy_file always seems to return false.

Example:

#include <experimental/filesystem>
#include <fstream>
#include <cassert>

using namespace std::experimental::filesystem;

int main() {
std::ofstream out("/tmp/foo.txt");
out << "hello world!\n";
out.close();
const path p = "/tmp/foo.txt";
const path to = "/tmp/bar.txt";
bool ret = copy_file(p, to);
assert(ret == true);
}

[Bug libstdc++/71037] New: Exceptions thrown from "filesystem::canonical(...)" should contain both paths.

2016-05-09 Thread eric at efcs dot ca
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71037

Bug ID: 71037
   Summary: Exceptions thrown from "filesystem::canonical(...)"
should contain both paths.
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: eric at efcs dot ca
  Target Milestone: ---

The filesystem error thrown from canonical only contains the first path, not
the base. Since the base path can be user specified the exception should
contain this as well.

#include <experimental/filesystem>
#include <cassert>

using namespace std::experimental::filesystem;

int main() {
  const path p = "DNE";
  const path base = "BASE";
  try {
    canonical(p, base);
    assert(false);
  } catch (filesystem_error const& err) {
    assert(err.path1() == p);
    assert(err.path2() == base); // FIRES
  }
}

[Bug libstdc++/71036] New: create_directory(p, ...) reports a failure when 'p' is an existing directory

2016-05-09 Thread eric at efcs dot ca
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71036

Bug ID: 71036
   Summary: create_directory(p, ...) reports a failure when 'p' is
an existing directory
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: eric at efcs dot ca
  Target Milestone: ---

The create_directory functions should not report an error when the directory
they are trying to create already exists. The create_directory(...) functions
seem to report this error.

#include <experimental/filesystem>
#include <cassert>

using namespace std::experimental::filesystem;

int main() {
  path p = "/tmp/foo";
  assert(!exists(p));
  create_directory(p); // create the directory once
  create_directory(p); // THROWS!
}

[Bug c++/71035] GNU does not give error on declaration of non literal type in template function

2016-05-09 Thread Judy.Ward at intel dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71035

--- Comment #2 from Judy Ward  ---
Yes I have a beta copy of EDG 
4.11 which has relaxed constexpr and they give an error. Unfortunately some
Boost code (I think inadvertently) relies on not giving a diagnostic.

Yes I see your point that this is really a QOI issue but GNU does seem
inconsistent and EDG will have to emulate that inconsistency.

Thanks
Judy

> On May 9, 2016, at 7:38 PM, msebor at gcc dot gnu.org 
>  wrote:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71035
> 
> Martin Sebor  changed:
> 
>   What|Removed |Added
> 
>   Keywords||accepts-invalid
> Status|UNCONFIRMED |NEW
>   Last reconfirmed||2016-05-09
> CC||msebor at gcc dot gnu.org
> Ever confirmed|0   |1
>  Known to fail||4.9.3, 5.3.0, 6.1.0
>   Severity|normal  |enhancement
> 
> --- Comment #1 from Martin Sebor  ---
> Hi Judy!
> 
> I'll take a stab at this -- let me know if I missed something.  I agree that
> similarly to the non-template case, (in the absence of a valid explicit
> specialization) diagnosing the constexpr function template below would be
> useful, even though in p5 and p6 of [dcl.constexpr], the standard leaves both
> cases as a matter of QoI:
> 
> -6-  If the instantiated template specialization of a constexpr function
> template or member function of a class template would fail to satisfy the
> requirements for a constexpr function or constexpr constructor, that
> specialization is still a constexpr function or constexpr constructor, even
> though a call to such a function cannot appear in a constant expression.  If 
> no
> specialization of the template would satisfy the requirements for a constexpr
> function or constexpr constructor when considered as a non-template function 
> or
> constructor, the template is ill-formed; no diagnostic required.
> 
> (I read the last sentence as referring to implicit specializations of the
> template definition, not explicit ones with valid definitions.)
> 
> Thus, I'm inclined to view this bug not as a defect but an enhancement 
> request.
> Let me know if you disagree.
> 
> Clang is more strict than GCC here by issuing the optional diagnostic.
> 
> My copy of EDG (version 4.10) rejects the program because it doesn't fully
> implement the C++ 14 rules: a) it doesn't recognize void as a literal type, 
> and
> b) it doesn't allow statements in constexpr functions.
> 
> -- 
> You are receiving this mail because:
> You reported the bug.

[Bug c/71013] [7 Regression] c-common.c:12810:37: error: 'LLONG_MAX' was not declared in this scope

2016-05-09 Thread danglin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71013

--- Comment #4 from John David Anglin  ---
Created attachment 38460
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38460&action=edit
Patch

This fixes the build failure on hppa64-hpux.  Not sure it's the right place
or even the right fix.

[PATCH] PR driver/69265: add hint for options with misspelled arguments

2016-05-09 Thread David Malcolm
opts-common.c's cmdline_handle_error handles invalid arguments
for options with CL_ERR_ENUM_ARG by building a string listing the
valid arguments.  By also building a vec of valid arguments, we
can use find_closest_string and provide a hint if we see a close
misspelling.
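
The idea of picking a hint from a list of valid arguments can be sketched as follows. This is a standalone illustration, not the code in spellcheck.c: edit_distance and closest_candidate are hypothetical names, the distance is a plain Levenshtein distance, and the cutoff of half the argument length is an arbitrary assumption rather than GCC's actual rule.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Levenshtein distance between two strings, using two rolling rows.
std::size_t edit_distance(const std::string &a, const std::string &b) {
  std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
  for (std::size_t j = 0; j <= b.size(); ++j)
    prev[j] = j;
  for (std::size_t i = 1; i <= a.size(); ++i) {
    cur[0] = i;
    for (std::size_t j = 1; j <= b.size(); ++j) {
      std::size_t subst = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
      cur[j] = std::min({prev[j] + 1, cur[j - 1] + 1, subst});
    }
    std::swap(prev, cur);
  }
  return prev[b.size()];
}

// Returns the candidate closest to arg, or an empty string when nothing is
// within the (assumed) cutoff of half the argument length.
std::string closest_candidate(const std::string &arg,
                              const std::vector<std::string> &candidates) {
  std::string best;
  std::size_t best_dist = arg.size() / 2 + 1;  // assumed cutoff
  for (const std::string &c : candidates) {
    std::size_t d = edit_distance(arg, c);
    if (d < best_dist) {
      best_dist = d;
      best = c;
    }
  }
  return best;
}

// Example use with the TLS model names from the new test case.
void example() {
  const std::vector<std::string> models = {
      "global-dynamic", "initial-exec", "local-dynamic", "local-exec"};
  assert(closest_candidate("global-dinamic", models) == "global-dynamic");
}
```

Returning an empty string when no candidate is close mirrors the patch's behaviour of falling back to the plain "valid arguments" message when there is no hint.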

Successfully bootstrapped on x86_64-pc-linux-gnu.

OK for trunk?

gcc/ChangeLog:
PR driver/69265
* Makefile.in (GCC_OBJS): Move spellcheck.o to...
(OBJS-libcommon-target): ...here.
* opts-common.c: Include spellcheck.h.
(cmdline_handle_error): Build a vec of valid options and use it
to provide hints for misspelled arguments.

gcc/testsuite/ChangeLog:
PR driver/69265
* gcc.dg/spellcheck-options-11.c: New test case.
---
 gcc/Makefile.in  |  4 ++--
 gcc/opts-common.c| 11 ++-
 gcc/testsuite/gcc.dg/spellcheck-options-11.c |  7 +++
 3 files changed, 19 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/spellcheck-options-11.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 6c5adc0..525482f 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1159,7 +1159,7 @@ CXX_TARGET_OBJS=@cxx_target_objs@
 FORTRAN_TARGET_OBJS=@fortran_target_objs@
 
 # Object files for gcc many-languages driver.
-GCC_OBJS = gcc.o gcc-main.o ggc-none.o spellcheck.o
+GCC_OBJS = gcc.o gcc-main.o ggc-none.o
 
 c-family-warn = $(STRICT_WARN)
 
@@ -1548,7 +1548,7 @@ OBJS-libcommon = diagnostic.o diagnostic-color.o 
diagnostic-show-locus.o \
 # compiler and containing target-dependent code.
 OBJS-libcommon-target = $(common_out_object_file) prefix.o params.o \
opts.o opts-common.o options.o vec.o hooks.o common/common-targhooks.o \
-   hash-table.o file-find.o
+   hash-table.o file-find.o spellcheck.o
 
 # This lists all host objects for the front ends.
 ALL_HOST_FRONTEND_OBJS = $(foreach v,$(CONFIG_LANGUAGES),$($(v)_OBJS))
diff --git a/gcc/opts-common.c b/gcc/opts-common.c
index bb68982..4e1ef49 100644
--- a/gcc/opts-common.c
+++ b/gcc/opts-common.c
@@ -24,6 +24,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "opts.h"
 #include "options.h"
 #include "diagnostic.h"
+#include "spellcheck.h"
 
 static void prune_options (struct cl_decoded_option **, unsigned int *);
 
@@ -1113,6 +1114,7 @@ cmdline_handle_error (location_t loc, const struct 
cl_option *option,
   for (i = 0; e->values[i].arg != NULL; i++)
len += strlen (e->values[i].arg) + 1;
 
+  auto_vec <const char *> candidates;
   s = XALLOCAVEC (char, len);
   p = s;
   for (i = 0; e->values[i].arg != NULL; i++)
@@ -1123,9 +1125,16 @@ cmdline_handle_error (location_t loc, const struct 
cl_option *option,
  memcpy (p, e->values[i].arg, arglen);
  p[arglen] = ' ';
  p += arglen + 1;
+ candidates.safe_push (e->values[i].arg);
}
   p[-1] = 0;
-  inform (loc, "valid arguments to %qs are: %s", option->opt_text, s);
+  const char *hint = find_closest_string (arg, &candidates);
+  if (hint)
+   inform (loc, "valid arguments to %qs are: %s; did you mean %qs?",
+   option->opt_text, s, hint);
+  else
+   inform (loc, "valid arguments to %qs are: %s", option->opt_text, s);
+
   return true;
 }
 
diff --git a/gcc/testsuite/gcc.dg/spellcheck-options-11.c 
b/gcc/testsuite/gcc.dg/spellcheck-options-11.c
new file mode 100644
index 000..8e27141
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/spellcheck-options-11.c
@@ -0,0 +1,7 @@
+/* Verify that we provide a hint if the user misspells an option argument
+   (PR driver/69265).  */
+
+/* { dg-do compile } */
+/* { dg-options "-ftls-model=global-dinamic" } */
+/* { dg-error "unknown TLS model 'global-dinamic'"  "" { target *-*-* } 0 } */
+/* { dg-message "valid arguments to '-ftls-model=' are: global-dynamic 
initial-exec local-dynamic local-exec; did you mean 'global-dynamic'?"  "" { 
target *-*-* } 0 } */
-- 
1.8.5.3



[Bug c++/71035] GNU does not give error on declaration of non literal type in template function

2016-05-09 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71035

Martin Sebor  changed:

   What|Removed |Added

   Keywords||accepts-invalid
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-05-09
 CC||msebor at gcc dot gnu.org
 Ever confirmed|0   |1
  Known to fail||4.9.3, 5.3.0, 6.1.0
   Severity|normal  |enhancement

--- Comment #1 from Martin Sebor  ---
Hi Judy!

I'll take a stab at this -- let me know if I missed something.  I agree that
similarly to the non-template case, (in the absence of a valid explicit
specialization) diagnosing the constexpr function template below would be
useful, even though in p5 and p6 of [dcl.constexpr], the standard leaves both
cases as a matter of QoI:

-6-  If the instantiated template specialization of a constexpr function
template or member function of a class template would fail to satisfy the
requirements for a constexpr function or constexpr constructor, that
specialization is still a constexpr function or constexpr constructor, even
though a call to such a function cannot appear in a constant expression.  If no
specialization of the template would satisfy the requirements for a constexpr
function or constexpr constructor when considered as a non-template function or
constructor, the template is ill-formed; no diagnostic required.

(I read the last sentence as referring to implicit specializations of the
template definition, not explicit ones with valid definitions.)

Thus, I'm inclined to view this bug not as a defect but an enhancement request.
 Let me know if you disagree.

Clang is more strict than GCC here by issuing the optional diagnostic.

My copy of EDG (version 4.10) rejects the program because it doesn't fully
implement the C++ 14 rules: a) it doesn't recognize void as a literal type, and
b) it doesn't allow statements in constexpr functions.

[Bug target/68945] enable libcilkrts on SPARC

2016-05-09 Thread ebotcazou at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68945

--- Comment #11 from Eric Botcazou  ---
> * In runtime/config/sparc/os-unix-sysdep.c (__cilkrts_getticks) I needed
> different
>   32- and 64-bit versions.  I tested the result in standalone program which
> just
>   printed the result.

This looks good to me.

> * One thing I wonder about is runtime/config/sparc/os-fence.h: when using
>   __sync_synchronize, gcc emits membar #StoreLoad, while Stefan's patch had
>   membar#LoadLoad | #LoadStore | #StoreStore | #StoreLoad.  It seems that
>   all but #StoreLoad are no-ops for TSO SPARC CPUs, but I'd better get this
> right.

__sync_synchronize emits the minimum memory barrier for the memory model, which
is TSO on Solaris so only #StoreLoad is needed.  The 4 flags are needed for RMO
theoretically, but I'm not sure RMO ever existed in real life.

PING^4 [PATCH, GCC 5] PR 70613, -fabi-version docs don't match implementation

2016-05-09 Thread Jim Wilson
On Mon, May 2, 2016 at 12:13 PM, Jim Wilson  wrote:
> Here is a patch to correct the -fabi-version docs on the GCC 5 branch.
> https://gcc.gnu.org/ml/gcc-patches/2016-04/msg00480.html

Maybe I didn't put enough info in the email the first 3 times?

You can see the default -fabi-version in gcc/c-family/c-opts.c on the
gcc-5 branch which has

  /* Change flag_abi_version to be the actual current ABI level for the
 benefit of c_cpp_builtins.  */
  if (flag_abi_version == 0)
flag_abi_version = 9;

You can see in the docs that -fabi-version only goes up to 8.

https://gcc.gnu.org/onlinedocs/gcc-5.3.0/gcc/C_002b_002b-Dialect-Options.html#C_002b_002b-Dialect-Options
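
One way to cross-check the documentation against a given compiler is to look at what the compiler itself reports. The sketch below is an illustration under assumptions: abi_version_macro and nullptr_alignment are hypothetical helper names, __GXX_ABI_VERSION is the macro G++ predefines for its ABI level (its exact encoding should be checked against the manual), and the alignment of decltype(nullptr) is the property that the patch below documents as corrected in version 9.

```cpp
#include <cassert>
#include <cstddef>

// Value of __GXX_ABI_VERSION, or 0 when the compiler does not define it.
long abi_version_macro() {
#ifdef __GXX_ABI_VERSION
  return __GXX_ABI_VERSION;
#else
  return 0;
#endif
}

// Alignment of decltype(nullptr) as seen by this compiler and ABI level.
std::size_t nullptr_alignment() {
  return alignof(std::nullptr_t);
}

// Example use: the size is fixed by the standard, the alignment is the
// ABI-dependent part.
void example() {
  assert(sizeof(std::nullptr_t) == sizeof(void *));
  assert(nullptr_alignment() >= 1);
}
```

Printing both values from builds with different -fabi-version settings would show whether the installed compiler matches what the docs claim.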

As for how we got here...
I see that the patch for bug 65945 was back ported to the gcc-5
branch, which required a partial backport of the patch for bug 44282,
which added abi version 9.  The original patch for 44282 is missing
the doc change.

The missing doc change was then added here
https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=228017
which has the invoke.texi hunk we need, but is missing a ChangeLog
entry for it.  So it appears all we need is a partial backport of this
invoke.texi hunk.  This is mostly documenting a change to -Wabi, so we
only need parts of two hunks that document -fabi-version=9 and mention
gcc-5.2.

The patch is attached again.

Jim
Index: ChangeLog
===
--- ChangeLog	(revision 234867)
+++ ChangeLog	(working copy)
@@ -1,3 +1,12 @@
+2016-04-11  Jim Wilson  
+
+	Partial backport from trunk r228017.
+	2015-09-22  Jason Merrill  
+
+	PR c++/70613
+	* doc/invoke.texi (-fabi-version): Document version 9.
+	(-Wabi): Document version 9.  Mention version 8 is default for GCC 5.1.
+
 2016-04-09  Oleg Endo  
 
 	Backport from mainline
Index: doc/invoke.texi
===
--- doc/invoke.texi	(revision 234867)
+++ doc/invoke.texi	(working copy)
@@ -2118,6 +2118,9 @@ scope.
 Version 8, which first appeared in G++ 4.9, corrects the substitution
 behavior of function types with function-cv-qualifiers.
 
+Version 9, which first appeared in G++ 5.2, corrects the alignment of
+@code{nullptr_t}.
+
 See also @option{-Wabi}.
 
 @item -fabi-compat-version=@var{n}
@@ -2619,7 +2622,15 @@ When mangling a function type with function-cv-qua
 un-qualified function type was incorrectly treated as a substitution
 candidate.
 
-This was fixed in @option{-fabi-version=8}.
+This was fixed in @option{-fabi-version=8}, the default for GCC 5.1.
+
+@item
+@code{decltype(nullptr)} incorrectly had an alignment of 1, leading to
+unaligned accesses.  Note that this did not affect the ABI of a
+function with a @code{nullptr_t} parameter, as parameters have a
+minimum alignment.
+
+This was fixed in @option{-fabi-version=9}, the default for GCC 5.2.
 @end itemize
 
 It also warns about psABI-related changes.  The known psABI changes at this


Re: show size of stack needed by functions

2016-05-09 Thread Eric Botcazou
> Output of -fstack-usage is not accurate
> ===
> 
> This article mentions a "call cost":
> https://mcuoneclipse.com/2015/08/21/gnu-static-stack-usage-analysis/
> 
> I checked for myself, by looking at the change of the stackpointer with a
> debugger, and, yes, there seems to be a constant mismatch (2 bytes with
> avr-gcc-5.3) between change of stack pointer and output of -fstack-usage.
> In some rare cases there are more differences, which I didn't understand
> yet.

That's a bug, very likely in the AVR back-end, which must be fixed by someone 
who knows the AVR architecture.

> Wishes:
> - Add stack-usage in output of -fdump-ipa-cgraph, so that you don't need to
> relate information from two input files at all. I guess this is not
> trivial. Or is it?

It's not difficult, but there is a conflict between them because -fstack-usage 
is designed to be conservatively correct while -fdump-ipa-cgraph is not (it 
does not dump the full callgraph).

-- 
Eric Botcazou


[Bug target/70947] regrename Go breakage on powerpc64

2016-05-09 Thread amodra at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70947

--- Comment #1 from Alan Modra  ---
Author: amodra
Date: Mon May  9 23:12:20 2016
New Revision: 236052

URL: https://gcc.gnu.org/viewcvs?rev=236052&root=gcc&view=rev
Log:
[RS6000] Stop regrename twiddling with split-stack prologue

PR target/70947
* config/rs6000/rs6000.c (rs6000_expand_split_stack_prologue): Stop
regrename modifying insns saving lr before __morestack call.
* config/rs6000/rs6000.md (split_stack_return): Similarly for
insns restoring lr after __morestack call.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/rs6000/rs6000.c
trunk/gcc/config/rs6000/rs6000.md

[Bug fortran/71014] associate statement inside omp parallel do appears to disable default private attribute for inner loop indices

2016-05-09 Thread klindsay at ucar dot edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71014

--- Comment #9 from Keith Lindsay  ---
Harald,

Thanks for your tips on validation/sanitizing tools.

I am not sufficiently fluent in standard-ese to know what "associated
do-loop(s)" means. It doesn't help that BLOCK and ASSOCIATE appear in other
contexts in the OpenMP standard, making it challenging to locate information
about them in the standard.

If I replace ASSOCIATE with BLOCK/END BLOCK, I do see the same problems.

I added -fdump-tree-original to the gfortran invocation and compiled code with
and without the BLOCK construct. The generated intermediate files have the
difference

@@ -6,7 +6,7 @@
   static integer(kind=4) s_true = 5050;

   s = {};
-  #pragma omp parallel private(i)
+  #pragma omp parallel
 {
   {
 #pragma omp for private(j) nowait

(There are other differences that appear to simply be renaming of labels, I'm
ignoring those.) So it does seem that the presence of the BLOCK construct is
changing how the compiler assigns attributes to the inner loop index. I didn't
think this was correct, but perhaps I'm misunderstanding how the OpenMP and
Fortran standards interact.
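
For comparison, here is a C++ analogue of the data-sharing question. It is not the Fortran from this report: nested_sum is a hypothetical function, and unlike Fortran, C and C++ do not make an inner sequential loop index private automatically when it is declared outside the parallel region, so the sketch spells out private(j). Without -fopenmp the pragma is ignored and the code runs serially with the same result.

```cpp
#include <cassert>

// Sums j over an n_outer x n_inner loop nest. The inner index j is declared
// outside the loops, as Fortran loop indices are, and is therefore listed in
// an explicit private clause; s is combined with a reduction.
int nested_sum(int n_outer, int n_inner) {
  int s = 0;
  int i, j;
#pragma omp parallel for private(j) reduction(+ : s)
  for (i = 1; i <= n_outer; ++i) {
    for (j = 1; j <= n_inner; ++j) {
      s += j;
    }
  }
  return s;
}

// Example use: 10 * (1 + ... + 10) and 1 + ... + 100.
void example() {
  assert(nested_sum(10, 10) == 550);
  assert(nested_sum(1, 100) == 5050);
}
```

If the private(j) clause were dropped in an OpenMP build, the threads would share j and the result could vary from run to run, which is the kind of symptom a missing private attribute on an inner index would produce.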

At this point, I would like to know if the compiler is in the right in doing
this. If it is, then I would change my coding practice. If it isn't, then I
assume that gfortran developers would want to know about this.

Keith

[Bug fortran/71032] explicit interface and must not have attributes generates gfortran: internal compiler error: Abort trap: 6 (program f951)

2016-05-09 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71032

Dominique d'Humieres  changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
   Last reconfirmed||2016-05-09
 Ever confirmed|0   |1

--- Comment #1 from Dominique d'Humieres  ---
There is no ICE (only normal errors) when compiling the tests with gfortran
6.1.0 and trunk (7.0).

The ICE has been fixed between revisions r223447 (2015-05-20, ICE) and r223694
(2015-05-26, errors), likely r223614 (last patch for pr44054). AFAICT there is
no plan to back port the fix to the gcc-5 branch. Unless someone objects, I'll
close this PR as fixed in the coming days.

[Bug target/70963] vec_cts/vec_ctf intrinsics produce wrong results for 64-bit floating point

2016-05-09 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70963

--- Comment #4 from Bill Schmidt  ---
OK, there is an obvious bug in the define_expand for vsx_xvcvdpsxds_scale.  If
the scale factor is 0, wrong code is always generated.  I'll get a patch going.

Re: [C PATCH] Warn for optimize attribute on decl after definition (PR c/70255)

2016-05-09 Thread Joseph Myers
On Mon, 9 May 2016, Marek Polacek wrote:

> In this PR, Richi pointed out that we don't warn for the case when a
> declaration with attribute optimize follows the definition which is lacking
> that attribute.  This patch adds such a warning.  Though the question is
> whether this shouldn't apply to more attributes than just "optimize".  And,
> as can be seen in the testcase, we'll warn even for the case when the
> definition has
>   optimize ("no-associative-math,O2")
> and the declaration
>   optimize ("O2,no-associative-math")
> Not sure if we have something better than attribute_value_equal, though.
> 
> (The C++ FE lacks this kind of warning; I opened PR71024 for that.)
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk?

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


[Bug c++/71035] New: GNU does not give error on declaration of non literal type in template function

2016-05-09 Thread Judy.Ward at intel dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71035

Bug ID: 71035
   Summary: GNU does not give error on declaration of non literal
type in template function
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Judy.Ward at intel dot com
  Target Milestone: ---

Both EDG and Clang give an error on the code below.

GNU only gives an error if call() is not a template function (see -DERROR
below) or if call() is used in a way that requires it to be constexpr (i.e. in
a static_assert).

I think this should be an error.

struct C
{
   C(); // constructor is not declared constexpr
};

#ifdef ERROR
#else
template <class T>
#endif
constexpr void call()
{
   C c;
}

int main() {
#ifdef ERROR
   call();
#else
   call<int>();
#endif
   return 0;
}

[Bug target/70963] vec_cts/vec_ctf intrinsics produce wrong results for 64-bit floating point

2016-05-09 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70963

--- Comment #3 from Bill Schmidt  ---
Note also that your asm constraints are wrong.  You need VSX registers, not
Altivec registers, so you should be using the "wa" constraint instead of the
"v" constraint.  This is why you get some apparently wrong register numbers
with your asm results.

[Bug target/70957] testsuite/gcc.target/powerpc/vsx-elemrev-4.c fails on power7

2016-05-09 Thread seurer at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70957

--- Comment #9 from Bill Seurer  ---
Systems where I see it fail:
granola
yavin3

Systems where I do not:
bns

All are power7 BE systems.  I didn't do anything special on any of the systems.
 I ran configure like this on all of them:

/home/seurer/gcc/gcc-test/configure --prefix=/home/seurer/gcc/install/gcc-test 
 --enable-languages=c,fortran,c++ --disable-multilib --disable-libsanitizer
--disable-bootstrap

Different compiler versions were used to build on all of the systems.

seurer@bns:~/gcc/build/gcc-test$ $CC --version
gcc (SUSE Linux) 4.3.4 [gcc-4_3-branch revision 152973]

On bns I also tried a bootstrapped gcc 7.0 and the test case still worked
there.

seurer@yavin3:~/gcc/build/gcc-test$ $CC --version
gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16)


seurer@granola:~/gcc/build/gcc-test$ $CC --version
gcc (GCC) 5.3.0

[Bug target/70963] vec_cts/vec_ctf intrinsics produce wrong results for 64-bit floating point

2016-05-09 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70963

--- Comment #2 from Bill Schmidt  ---
The xxswapd's are a bit of a red herring.  These are part of the little-endian
normalization code that is required with the funky lxvd2x and stxvd2x
instructions.  The problem appears to be the register assignment on the
instructions generated for vec_cts and vec_ctf.  The use of vs12 on vec_cts is
an obvious problem, since vs12 doesn't contain any value assigned in the
function.  The code for vec_ctf looks fine.  So we need to figure out what
happened with the register number on xvcvdpsxds.

The problem still exists on trunk.

Re: SafeStack proposal in GCC

2016-05-09 Thread Joseph Myers
On Mon, 9 May 2016, Michael Matz wrote:

> Sure.  Same QoI bug in my book.  (And I'm not motivated enough to find out 
> if the various C standards weren't just following POSIX when setjmp was 
> included, or really the other way around).

Standards for setjmp and longjmp date back at least as far as the 1984 
/usr/group Standard, which was a base document for the C standard library 
and for POSIX (and from there, they date back to at least V7 Unix).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Fix PR c++/70822 (bogus error with parenthesized SCOPE_REF)

2016-05-09 Thread Patrick Palka
On Fri, Apr 29, 2016 at 11:55 AM, Patrick Palka  wrote:
> The problem here is that some code paths are not prepared to handle a
> non-dependent PAREN_EXPR, which my fix for PR c++/70106 introduced.  In
> particular lvalue_kind() returns clk_none for a PAREN_EXPR which makes
> build_x_unary_op() emit a bogus error for an expression like &(A::b).
> (If the PAREN_EXPR were dependent then lvalue_kind() wouldn't get called
> in the first place, build_x_unary_op() would exit early.)
>
> This patch replaces the 70106 fix.  Instead of wrapping a SCOPE_REF in a
> PAREN_EXPR, this patch overloads the REF_PARENTHESIZED_P to apply to
> SCOPE_REFs too.  This makes sense to me because the two tree codes are
> closely related (e.g. a SCOPE_REF before instantiation may become a
> COMPONENT_REF after instantiation) so they should be treated similarly
> by force_paren_expr().
>
> There are two rather simpler ways to fix this PR.  One is to make
> lvalue_kind() recurse into PAREN_EXPRs (although other parts of the FE
> may be mishandling non-dependent PAREN_EXPRs as well), and the other is
> to make force_paren_expr() never return a non-dependent PAREN_EXPR,
> which can be achieved by building the PAREN_EXPR with build_nt().  I am
> not sure which approach is best for GCC 7 and for GCC 6.
>
> Somewhat unrelated to the fix: I couldn't find an existing test that
> checked that force_paren_expr handles SCOPE_REFs properly wrt auto
> deduction so I added one.
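
For readers wondering why the parentheses have to be remembered on a SCOPE_REF at all, here is a minimal C++14 sketch. It is only an illustration and not the new test from the patch; the names A, by_value and by_reference are made up. Under decltype(auto), a qualified name and the same name in parentheses deduce different types, and that difference is what REF_PARENTHESIZED_P has to carry through instantiation:

```cpp
#include <type_traits>

// Hypothetical type for illustration; not taken from the patch or its tests.
struct A { static int b; };
int A::b = 42;

// Unparenthesized id-expression: decltype(auto) deduces the declared type, int.
decltype(auto) by_value() { return A::b; }

// Parenthesized expression: an lvalue, so decltype(auto) deduces int&.
decltype(auto) by_reference() { return (A::b); }

static_assert(std::is_same<decltype(by_value()), int>::value,
              "plain qualified name deduces int");
static_assert(std::is_same<decltype(by_reference()), int&>::value,
              "parenthesized qualified name deduces int&");
```
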
>
> Bootstrap and regtesting in progress on x86_64-pc-linux-gnu.
>
> gcc/cp/ChangeLog:
>
> PR c++/70822
> PR c++/70106
> * cp-tree.h (REF_PARENTHESIZED_P): Make this flag apply to
> SCOPE_REFs too.
> * pt.c (tsubst_qualified_id): If REF_PARENTHESIZED_P is set
> on the qualified_id then propagate it to the resulting
> expression.
> (do_auto_deduction): Check REF_PARENTHESIZED_P on SCOPE_REFs
> too.
> * semantics.c (force_paren_expr): If given a SCOPE_REF, just set
> its REF_PARENTHESIZED_P flag.
>
> gcc/testsuite/ChangeLog:
>
> PR c++/70822
> PR c++/70106
> * g++.dg/cpp1y/auto-fn31.C: New test.
> * g++.dg/cpp1y/paren4.C: New test.
> ---
>  gcc/cp/cp-tree.h   |  4 ++--
>  gcc/cp/pt.c| 15 +++
>  gcc/cp/semantics.c | 13 +++--
>  gcc/testsuite/g++.dg/cpp1y/auto-fn31.C | 33 +
>  gcc/testsuite/g++.dg/cpp1y/paren4.C| 14 ++
>  5 files changed, 63 insertions(+), 16 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp1y/auto-fn31.C
>  create mode 100644 gcc/testsuite/g++.dg/cpp1y/paren4.C
>
> diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
> index 2caf7ce..0df5953 100644
> --- a/gcc/cp/cp-tree.h
> +++ b/gcc/cp/cp-tree.h
> @@ -170,7 +170,7 @@ operator == (const cp_expr , tree rhs)
>TARGET_EXPR_DIRECT_INIT_P (in TARGET_EXPR)
>FNDECL_USED_AUTO (in FUNCTION_DECL)
>DECLTYPE_FOR_LAMBDA_PROXY (in DECLTYPE_TYPE)
> -  REF_PARENTHESIZED_P (in COMPONENT_REF, INDIRECT_REF)
> +  REF_PARENTHESIZED_P (in COMPONENT_REF, INDIRECT_REF, SCOPE_REF)
>AGGR_INIT_ZERO_FIRST (in AGGR_INIT_EXPR)
>CONSTRUCTOR_MUTABLE_POISON (in CONSTRUCTOR)
> 3: (TREE_REFERENCE_EXPR) (in NON_LVALUE_EXPR) (commented-out).
> @@ -3403,7 +3403,7 @@ extern void decl_shadowed_for_var_insert (tree, tree);
> some of the time in C++14 mode.  */
>
>  #define REF_PARENTHESIZED_P(NODE) \
> -  TREE_LANG_FLAG_2 (TREE_CHECK2 ((NODE), COMPONENT_REF, INDIRECT_REF))
> +  TREE_LANG_FLAG_2 (TREE_CHECK3 ((NODE), COMPONENT_REF, INDIRECT_REF, 
> SCOPE_REF))
>
>  /* Nonzero if this AGGR_INIT_EXPR provides for initialization via a
> constructor call, rather than an ordinary function call.  */
> diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
> index e7ec629..7adf308 100644
> --- a/gcc/cp/pt.c
> +++ b/gcc/cp/pt.c
> @@ -13741,8 +13741,10 @@ tsubst_qualified_id (tree qualified_id, tree args,
>  {
>if (is_template)
> expr = build_min_nt_loc (loc, TEMPLATE_ID_EXPR, expr, template_args);
> -  return build_qualified_name (NULL_TREE, scope, expr,
> -  QUALIFIED_NAME_IS_TEMPLATE (qualified_id));
> +  tree r = build_qualified_name (NULL_TREE, scope, expr,
> +QUALIFIED_NAME_IS_TEMPLATE 
> (qualified_id));
> +  REF_PARENTHESIZED_P (r) = REF_PARENTHESIZED_P (qualified_id);
> +  return r;
>  }
>
>if (!BASELINK_P (name) && !DECL_P (expr))
> @@ -13822,6 +13824,9 @@ tsubst_qualified_id (tree qualified_id, tree args,
>&& TREE_CODE (expr) != OFFSET_REF)
>  expr = convert_from_reference (expr);
>
> +  if (REF_PARENTHESIZED_P (qualified_id))
> +expr = force_paren_expr (expr);
> +
>return expr;
>  }
>
> @@ -23966,8 +23971,10 @@ do_auto_deduction (tree type, tree init, tree 
> auto_node,
>
>if (AUTO_IS_DECLTYPE 

[Bug fortran/56226] Add support for DEC UNION and MAP extensions

2016-05-09 Thread sgk at troutmask dot apl.washington.edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56226

--- Comment #30 from Steve Kargl  ---
On Mon, May 09, 2016 at 02:55:01PM +, fritzoreese at gmail dot com wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56226
> 
> --- Comment #29 from Fritz Reese  ---
> (In reply to Andreas Schwab from comment #25)
> > FAIL: gfortran.dg/dec_union_4.f90   -O0  execution test
> > FAIL: gfortran.dg/dec_union_4.f90   -O1  execution test
> > FAIL: gfortran.dg/dec_union_4.f90   -O2  execution test
> > FAIL: gfortran.dg/dec_union_4.f90   -O3 -fomit-frame-pointer -funroll-loops
> > -fpeel-loops -ftracer -finline-functions  execution test
> > FAIL: gfortran.dg/dec_union_4.f90   -O3 -g  execution test
> > FAIL: gfortran.dg/dec_union_4.f90   -Os  execution test
> 
> It was silly of me to disregard endian-ness in this test case. Fixed:
> 
> https://gcc.gnu.org/ml/fortran/2016-05/msg00018.html
> 

Thanks for the patch.  I'll take care of this on Saturday,
if no one else commits before then.

Re: show size of stack needed by functions

2016-05-09 Thread Sebastian
Hi,
sorry for reopening a very old thread, it took some time until I got around to 
writing a script that parses the output of -fdump-ipa-cgraph and -fstack-usage.
I'm using gcc 5.3 currently.

It's mostly what I need, I get all the information about the callgraph that I 
wanted to get (what's inlined, which functions have their address taken, which 
functions contain indirect calls).
With optimization, some of these indirect calls even disappear. Nice.

But there are some things that turned out not so nice, which required 
workarounds. Any advice what to do about them? Put them into bugzilla, each as 
a separate ticket?

Do you think that nowadays it would be better (simpler, more portable to 
new gcc versions) to write a plugin instead of parsing these dumps?
I tried with a python plugin, but the call graph information I got in form of 
some gimple fragments wasn't very accessible. Did I miss the right API with 
which everything is easy, or do you have other hints on how to get the same 
information as in -fdump-ipa-cgraph and -fstack-usage?


TMPDIR
======

This one is no big problem. But maybe someone can improve the documentation on 
this.

When you build with -flto, some of the output doesn't go into the same folder 
as the executable. It goes into $TMPDIR instead.
Except on Windows - there it goes into GetTempPath(). So my Makefile needs to 
handle Windows as a special case, and use a different environment variable 
there (TMP instead of TMPDIR).

Wishes:
- Document that the output of -fstack-usage goes into TMPDIR sometimes, and 
document the difference on Windows.
- On Windows, use GetTempPath() only as a fallback when TMPDIR is not set
- Output into the same directory as the executable


Output of -fstack-usage is not accurate
=======================================

This article mentions a "call cost": 
https://mcuoneclipse.com/2015/08/21/gnu-static-stack-usage-analysis/

I checked for myself, by looking at the change of the stackpointer with a 
debugger, and, yes, there seems to be a constant mismatch (2 bytes with 
avr-gcc-5.3) between change of stack pointer and output of -fstack-usage. In 
some rare cases there are more differences, which I didn't understand yet.

Wishes:
- Document that there is a call cost not included in -fstack-usage. Document 
what the call cost is, for each target architecture.
- Preferred: Make output of -fstack-usage accurate
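
Until the output is accurate, a script has to add the call cost itself. A rough sketch of that arithmetic, assuming the mismatch really is a constant per call (the 2 bytes above were only observed with avr-gcc-5.3); the function name and parameters are made up for illustration:

```cpp
#include <cstddef>

// Sum the per-function sizes reported by -fstack-usage along one call chain
// and add a fixed per-call cost for every function on the chain.
// call_cost_bytes is target specific and has to be measured; it is an
// assumption here, not something GCC reports.
std::size_t chain_stack_bytes(const std::size_t *frame_bytes, std::size_t depth,
                              std::size_t call_cost_bytes)
{
    std::size_t total = 0;
    for (std::size_t i = 0; i < depth; ++i)
        total += frame_bytes[i] + call_cost_bytes;
    return total;
}
```

For a chain of three functions reported as 10, 4 and 6 bytes and a call cost of 2, this gives 26 bytes instead of the 20 that the raw numbers suggest.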


Symbol identifiers in output of -fstack-usage are not unique
============================================================

In the output of -fstack-usage, there is the name of the source file, and the 
name of the symbol.
Unfortunately, with -flto, this is not unique.

Example: static inline functions declared in header files. Usually, the stack 
size will be the same in all translation units (and in my project I fortunately 
don't have this problem currently). But there is a chance that the stack size 
of a function changes depending on some macro that is different for different 
translation units.

Fixing this would have the additional benefit that my script could be simpler 
because it would not need to know which object file name (as given in the 
output of -fdump-ipa-cgraph) corresponds to which source file name (as given in 
the output of -fstack-usage).

Wishes:
- Add stack-usage in output of -fdump-ipa-cgraph, so that you don't need to 
relate information from two input files at all.
  I guess this is not trivial. Or is it?
  Can you, in cgraph_node::dump, simply access the stack_usage "su" of struct 
function *get_fun (void);
  and in a sufficiently late pass the values will be valid?
- In -fstack-usage, use the same unique identifiers as in -fdump-ipa-cgraph.
  The problem with that: This fix would probably break some other tools which 
parse the output of -fstack-usage.
  I'm considering to patch gcc myself.
  But because of the mentioned problem, there is no big chance that this will 
be accepted up-stream, is there?
  On first glance, this isn't hard - I found this in output_stack_usage:
  /* We don't want to print the full qualified name because it can be long,
 so we strip the scope prefix, but we may need to deal with the suffix
 created by the compiler.  */
  But on second glance, I noticed that this is a suffix with a dot - not the 
"order" of a cgraph_node with a slash.
  Is it possible to access this order in output_stack_usage?


Regards,
Sebastian



On Thu, 14 Oct 2010 00:16:29 +0200
Richard Guenther  wrote:

> On Wed, Oct 13, 2010 at 11:43 PM, Sebastian
>  wrote:
> > On Wed, Oct 13, 2010 H.J. Lu wrote:  
> >> GCC 4.6.0 has -fstack-usage.  
> > Thanks. That's probably the reason I didn't find it in current manuals.
> >
> > On Wed, Oct 13, 2010 Ian Lance Taylor wrote:  
> >> The mailing list gcc@gcc.gnu.org is for the development of gcc itself.
> >> This question would be more appropriate for the mailing list
> >> 

Re: SafeStack proposal in GCC

2016-05-09 Thread Michael Matz
Hi,

On Mon, 9 May 2016, Rich Felker wrote:

> > Done.  I never understood why they left in the hugely unuseful 
> > {sig,}{set,long}jmp() but removed the actually useful *context() 
> > (amended somehow like above).
> 
> Because those are actually part of the C language

Sure.  Same QoI bug in my book.  (And I'm not motivated enough to find out 
if the various C standards weren't just following POSIX when setjmp was 
included, or really the other way around).

> (the non-sig versions, but the sig versions are needed to work around 
> broken unices that made the non-sig versions save/restore signal mask 
> and thus too slow to ever use). They're also much more useful for 
> actually reasonable code (non-local exit across functions that were 
> badly designed with no error paths)

Trivially obtainable with getcontext/setcontext as well.

> as opposed to just nasty hacks that 
> are mostly/entirely UB anyway (coroutines, etc.).

Well, we differ in the definition of reasonable :)  And I certainly don't 
see any material difference in undefined behaviour between both classes of 
functions.  Both are "special" regarding compilers (e.g. returning 
multiple times) and usage.  But as the *jmp() functions can be implemented 
with *context(), but not the other way around, it automatically follows 
(to me!) that the latter are more useful, if for nothing else than basic 
building blocks.  (there are coroutine libs that try to emulate a real 
makecontext with setjmp/longjmp on incapable architectures.  As this is 
impossible for all corner cases they are broken and generally awful on 
them :) )
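
To make the "non-local exit with getcontext/setcontext" point concrete, a minimal sketch, assuming a platform that still ships the ucontext functions (glibc does). The names exit_point, exited and deep_callee are made up for illustration:

```cpp
#include <ucontext.h>

static ucontext_t exit_point;   // plays the role of a jmp_buf
static volatile int exited;     // volatile: re-read from memory after the switch

static void deep_callee()
{
    exited = 1;
    setcontext(&exit_point);    // like longjmp: does not return on success
}

// Returns 1 once control has come back through setcontext().
int run_with_nonlocal_exit()
{
    exited = 0;
    getcontext(&exit_point);    // like setjmp: execution resumes after this call
    if (!exited) {
        deep_callee();
        return -1;              // only reached if setcontext() failed
    }
    return 1;
}
```

Since getcontext() gives no indication of whether it returned directly or via setcontext(), the flag has to do the job that setjmp's return value does.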


Ciao,
Michael.


[Bug tree-optimization/71034] abs(f) u>= 0. is always true

2016-05-09 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71034

--- Comment #2 from Marc Glisse  ---
(In reply to Andrew Pinski from comment #1)
> I think these are the optimizations that should be done:
> abs(x) < 0 -> x != x

for x=NaN, abs(x) is NaN, and NaN<0 is false. So the current simplification to
false seems correct.

> abs(x) >= 0 -> x u== x

x == x. I'd like to canonicalize it to x ord x, but that's a different issue.

> abs(x) == 0 -> x == 0
> abs(x) <= 0 -> x == 0 (since this is an ordered comparison)

ok

> abs(x) u< 0 -> false
> abs(x) u>= 0 -> false

u<, u>= are true if an argument is NaN... u< can simplify to x unord x, and u>=
is always true.

> abs(x) u== 0 -> x == 0

x u== 0
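
The cases above can be checked outside the compiler. A small sketch, assuming IEEE semantics (no -ffast-math), that spells the unordered comparisons with the quiet comparison functions from <cmath>; the function names are made up:

```cpp
#include <cmath>

// Ordered comparisons: false whenever an operand is NaN.
bool abs_lt_zero(double x) { return std::fabs(x) < 0.0; }   // always false
bool abs_ge_zero(double x) { return std::fabs(x) >= 0.0; }  // same as x == x

// "u<" is "unordered or less", i.e. the negation of the ordered ">=".
bool abs_ult_zero(double x)
{
    return !std::isgreaterequal(std::fabs(x), 0.0);         // same as isnan(x)
}

// "u>=" is "unordered or greater-or-equal", i.e. the negation of the ordered "<".
bool abs_uge_zero(double x)
{
    return !std::isless(std::fabs(x), 0.0);                 // always true
}
```
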

[Bug tree-optimization/71034] abs(f) u>= 0. is always true

2016-05-09 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71034

--- Comment #1 from Andrew Pinski  ---
I think these are the optimizations that should be done:
abs(x) < 0 -> x != x
abs(x) >= 0 -> x u== x
abs(x) == 0 -> x == 0
abs(x) <= 0 -> x == 0 (since this is an ordered comparison)
abs(x) u< 0 -> false
abs(x) u>= 0 -> false
abs(x) u== 0 -> x == 0

I hope I did not mess this up and that I got the unordered comparisons correct.

Re: SafeStack proposal in GCC

2016-05-09 Thread Ian Lance Taylor
On Mon, May 9, 2016 at 2:03 PM, Joel Sherrill  wrote:
>
> On 5/9/2016 3:41 PM, Ian Lance Taylor wrote:
>>
>> On Mon, May 9, 2016 at 1:07 PM, Joel Sherrill 
>> wrote:
>>>
>>>
>>> One complication on RTEMS which is a single process, multi-threaded RTOS
>>> is that we can no longer check the stack bounds. For threads, we know
>>> where the stack memory is and the range for each thread. For ucontext_t,
>>> it seems this knowledge is unknown to the RTOS.
>>>
>>> Thus it would become the responsibility of the run-time using ucontext_t
>>> to put in fence patterns and check those.
>>
>>
>> On RTEMS and similar systems, you could write makecontext to register
>> the stack (whose start and length are known to the function) with the
>> RTOS.
>
>
> Ahh... slow today. swapcontext() would have to work with the stack
> checker.
>  Interesting.. the stack usage reporting would have to be taught
> about the ucontext_t's in the system and report those as well.
>
> Am I missing something or is there no way to know when a context
> goes out of existence in the API?

That is correct.  Good point.
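
A sketch of what such a registration could look like. This is a hypothetical registry, not an RTEMS or POSIX interface; a makecontext() wrapper would pass in uc_stack.ss_sp and uc_stack.ss_size, and the stack checker would query it:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical registry, not an RTEMS or POSIX API.
struct StackRegion { std::uintptr_t base; std::size_t size; };

static StackRegion regions[8];
static std::size_t region_count;

// What a makecontext() wrapper could call with the context's stack bounds.
bool register_context_stack(void *base, std::size_t size)
{
    if (region_count == sizeof regions / sizeof regions[0])
        return false;                       // registry full
    regions[region_count++] = { reinterpret_cast<std::uintptr_t>(base), size };
    return true;
}

// What a stack checker could ask, for example at swapcontext() time.
bool in_registered_stack(const void *sp)
{
    const std::uintptr_t p = reinterpret_cast<std::uintptr_t>(sp);
    for (std::size_t i = 0; i < region_count; ++i)
        if (p >= regions[i].base && p - regions[i].base < regions[i].size)
            return true;
    return false;
}
```

There is deliberately no unregister function here: as noted above, the API gives no point at which one could be called.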

Ian


[Bug tree-optimization/71034] New: abs(f) u>= 0. is always true

2016-05-09 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71034

Bug ID: 71034
   Summary: abs(f) u>= 0. is always true
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: glisse at gcc dot gnu.org
  Target Milestone: ---

A very simple missed optimization, we optimize abs(x)<0 to false (in forwprop,
haven't found the exact place yet) but not abs(x) u>= 0 to true. I noticed it
because cdce now produces this comparison for sqrt, which causes a small
regression on a dead sqrt(abs(x)).

int f(double x){
  x=__builtin_fabs(x);
  // return x<0;
  return !__builtin_isless(x,0);
}

Re: [PATCH] Make basic asm implicitly clobber memory

2016-05-09 Thread Bernd Edlinger
On 05/09/16 15:46, Bernd Schmidt wrote:
> On 05/09/2016 03:37 PM, Bernd Edlinger wrote:
>> On 05/09/16 09:56, Richard Biener wrote:
>>>
>>> At least it sounds to me that its semantics can be fully expressed
>>> with generic asms?  (Maybe apart from the only-if-ASM_STRING-is-empty
>>> part)
>>>
>>
>> That was also my first idea too.
>>
>> In simple cases an asm ("whatever"); should do the same as
>> asm ("whatever" ::: );
>>
>> Adding a "memory" to the clobber list would be simple that's true.
>>
>> But in general it can be pretty complicated, especially if the
>> string contains the special characters % { | }.
>
> Is the only difference in how the string is output? Maybe we can have a
> slightly different form of ASM_OPERANDS (with a bit set, or with the
> string wrapped in something else) to indicate that it's old-style.

Most of the difference is what happens in final.c, and adding a new
attribute to the ASM_OPERANDS tree node is definitely one option.
I tried to implement it in a way that causes the least confusion.

There are lots of different tree representations for an extended asm
statement in general, but only one for a basic asm.

An extended asm that has no outputs and no clobbers is an ASM_OPERAND
node with an optional vector of ASM_INPUTs containing the input constraints:

ASM_OPERAND { "asm", "", 0, VEC { inputs...}, VEC { ASM_INPUT ("x")...}

but if it has any CLOBBERS, it will look like this:

PARALLEL { ASM_OPERAND, CLOBBER... }

if it has one output, and zero clobbers we have:

SET { x, ASM_OPERAND }

and in case we have more than one output we have:

PARALLEL { SET { x, ASM_OPERAND }... , CLOBBER... }


A basic asm is just an ASM_INPUT that is not underneath an ASM_OPERAND.

But to add any CLOBBERs to this ASM_INPUT it has to be in PARALLEL
with the CLOBBERs, so that would look like this:

PARALLEL { ASM_INPUT{ "asm" }, CLOBBER... }


There are lots of places where we need to know if a statement is an
assembler statement, in most places this is done in this way:

GET_CODE (PATTERN (insn)) == ASM_INPUT
|| asm_noperands (PATTERN (insn)) >= 0

There are a handful of places where it is done in this way:

GET_CODE (PATTERN (insn)) == ASM_INPUT
|| extract_asm_operands (PATTERN (insn)) != NULL_RTX

extract_asm_operands locates the ASM_OPERAND node from an extended
asm that can have either of the several forms above, but in most
cases the result is not looked at.  Making extract_asm_operands
return anything but an ASM_OPERANDS is impossible, but making
asm_noperands return 0 for a PARALLEL { ASM_INPUT, CLOBBER... }
is not too complicated.

Fortunately, all the remaining uses of extract_asm_operands really
mean an extended asm.

Hope that explains my idea.
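
For anyone following along at the source level, the two statement forms look like this. A minimal sketch using the GNU asm extension (GCC or Clang assumed); the function names are made up:

```cpp
static int shared_value;

// Extended asm with an empty template and an explicit "memory" clobber:
// the compiler must assume the statement may read or write any memory.
int store_then_reload(int v)
{
    shared_value = v;
    __asm__ __volatile__ ("" ::: "memory");
    return shared_value;   // has to be reloaded after the clobber
}

// Basic asm: no operands and no clobber list. The patch under discussion is
// about treating this form as if it clobbered memory as well.
void basic_asm_statement()
{
    __asm__ ("");
}
```
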


Thanks
Bernd.


Re: Re: GCC 6.1 Hard-coded C++ header paths and relocation problem on Windows

2016-05-09 Thread Andrew Pinski
On Mon, May 9, 2016 at 1:31 PM, Brett Neumeier  wrote:
> On Tue, May 3, 2016 at 10:01 AM, lh_mouse  wrote:
>> Should I file a bug report then?
>> We need some Linux testers, though not many people on Linux relocate 
>> compilers.
>
> For what it's worth -- I encountered the same problem on a GNU/Linux
> system. In my specific situation, I'm cross-compiling GCC using an
> AMD64-to-mips64el cross-toolchain, and installing the resulting GCC in
> a sysroot directory. When I try to use that GCC on a target device
> where (of course) the sysroot directory becomes "/", the hard-coded
> "/path/to/sysroot" from the host system is still used to find the C++
> headers, resulting in the same ".../include/c++/6.1.1/cstdlib:75:25:
> fatal error: stdlib.h: No such file or directory" error message you
> got.
>
> Changing #include_next to #include in cstdlib and cmath fixed my
> problem -- so, thank you very much for this discussion! It helped at
> least one other person.
>
> Please let me know if there's any other testing I can do to help.


This sounds like a good use of --with-build-sysroot instead of just
--with-sysroot.
I use the following for the Canadian cross:
--with-sysroot=/ --with-build-sysroot=${SYSROOT}

Thanks,
Andrew

>
> Cheers,
>
> Brett


Re: Machine constraints list

2016-05-09 Thread Joseph Myers
On Mon, 9 May 2016, David Wohlferd wrote:

> In my defense, I can't find any official list of which are 'tertiary' and
> which are deprecated (https://gcc.gnu.org/ml/gcc/2016-03/msg00010.html).

Deprecated targets are exactly those in the "# Obsolete configurations." 
list in config.gcc (targets requiring --enable-obsolete to build them).  
The only architecture for which all configurations are currently 
deprecated is mep.

Of course, such deprecated targets should still have all their 
documentation present, bug reports kept open, etc. - it's only when 
support for a target is actually removed from GCC that documentation etc. 
is removed and bug reports closed (as WONTFIX).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: SafeStack proposal in GCC

2016-05-09 Thread Joel Sherrill



On 5/9/2016 3:41 PM, Ian Lance Taylor wrote:
> On Mon, May 9, 2016 at 1:07 PM, Joel Sherrill  wrote:
>>
>> One complication on RTEMS which is a single process, multi-threaded RTOS
>> is that we can no longer check the stack bounds. For threads, we know
>> where the stack memory is and the range for each thread. For ucontext_t,
>> it seems this knowledge is unknown to the RTOS.
>>
>> Thus it would become the responsibility of the run-time using ucontext_t
>> to put in fence patterns and check those.
>
> On RTEMS and similar systems, you could write makecontext to register
> the stack (whose start and length are known to the function) with the
> RTOS.

Ahh... slow today. swapcontext() would have to work with the stack
checker.

Interesting.. the stack usage reporting would have to be taught
about the ucontext_t's in the system and report those as well.

Am I missing something or is there no way to know when a context
goes out of existence in the API?

> Ian

--joel



Re: Machine constraints list

2016-05-09 Thread David Wohlferd

On 5/9/2016 6:42 AM, paul_kon...@dell.com wrote:
>> On May 8, 2016, at 6:27 PM, David Wohlferd  wrote:
>>
>> If these architectures aren't supported anymore, is it time to drop some of
>> these from this page?
>
> Your theory is quite mistaken.  A lot of the ones you labeled "drop" are
> supported.  Quite possibly all of them.

Ok, I see that.  Spot checking some of the architectures, they are still
getting periodic checkins.

In my defense, I can't find any official list of which are 'tertiary'
and which are deprecated (https://gcc.gnu.org/ml/gcc/2016-03/msg00010.html).

That said, there are still a lot of entries on that machine constraint page.

How about if I re-organize the list similar to function attributes
(https://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html)?  Or at a
minimum, add @anchors for each architecture so there are links?

dw


[gomp4] backport fix for PR70626

2016-05-09 Thread Cesar Philippidis
This patch backports the change in the way that 'acc parallel loop'
reductions are handled in trunk. Before, the reduction clause only used
to be associated with the split acc loop. Now the reduction clause is
associated with both the loop and the parallel region. That's beneficial
because the gimplifier adds implicit copy clauses if necessary for any
reduction variable attached to a parallel construct.
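
For context, this is the shape of code the change is about. A minimal sketch of a combined construct, not one of the new tests; sum_array is a made-up name. Built without -fopenacc the pragma is ignored and the loop simply runs serially:

```cpp
// Combined construct: "parallel" and "loop" share one clause list, so the
// reduction clause has to reach both the parallel region and the loop.
int sum_array(const int *data, int n)
{
    int sum = 0;
#pragma acc parallel loop reduction(+:sum) copyin(data[0:n])
    for (int i = 0; i < n; ++i)
        sum += data[i];
    return sum;
}
```
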

I had to update reduction-2.f95 because of the way that gomp4 implements
device_type, which tends to rearrange the ordering of the clauses. Also,
libgomp.oacc-c++/template-reduction.C is broken in gomp4, so I had to
xfail it. Apparently, it exposed an async bug. My forthcoming patch
which uses firstprivate pointers for subarrays should fix it.

This patch has been committed to gomp-4_0-branch.

Cesar
2016-05-09  Cesar Philippidis  

	Backport trunk r235651:
	2016-04-29  Cesar Philippidis  

	gcc/c-family/
	PR middle-end/70626
	* c-common.h (c_oacc_split_loop_clauses): Add boolean argument.
	* c-omp.c (c_oacc_split_loop_clauses): Use it to duplicate
	reduction clauses in acc parallel loops.

	gcc/c/
	PR middle-end/70626
	* c-parser.c (c_parser_oacc_loop): Don't augment mask with
	OACC_LOOP_CLAUSE_MASK.
	(c_parser_oacc_kernels_parallel): Update call to
	c_oacc_split_loop_clauses.

	gcc/cp/
	PR middle-end/70626
	* parser.c (cp_parser_oacc_loop): Don't augment mask with
	OACC_LOOP_CLAUSE_MASK.
	(cp_parser_oacc_kernels_parallel): Update call to
	c_oacc_split_loop_clauses.

	gcc/fortran/
	PR middle-end/70626
	* trans-openmp.c (gfc_trans_oacc_combined_directive): Duplicate
	the reduction clause in both parallel and loop directives.

	gcc/testsuite/
	PR middle-end/70626
	* c-c++-common/goacc/combined-reduction.c: New test.
	* gfortran.dg/goacc/reduction-2.f95: Add check for kernels reductions.

	libgomp/
	PR middle-end/70626
	* testsuite/libgomp.oacc-c++/template-reduction.C: Adjust test.
	* testsuite/libgomp.oacc-c-c++-common/combined-reduction.c: New test.
	* testsuite/libgomp.oacc-fortran/combined-reduction.f90: New test.

diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index ef3493e..daa77f9 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1285,7 +1285,7 @@ extern bool c_omp_check_loop_iv (tree, tree, walk_tree_lh);
 extern bool c_omp_check_loop_iv_exprs (location_t, tree, tree, tree, tree,
    walk_tree_lh);
 extern tree c_finish_oacc_wait (location_t, tree, tree);
-extern tree c_oacc_split_loop_clauses (tree, tree *);
+extern tree c_oacc_split_loop_clauses (tree, tree *, bool);
 extern void c_omp_split_clauses (location_t, enum tree_code, omp_clause_mask,
  tree, tree *);
 extern tree c_omp_declare_simd_clauses_to_numbers (tree, tree);
diff --git a/gcc/c-family/c-omp.c b/gcc/c-family/c-omp.c
index 4d3f7dc..614ee1f 100644
--- a/gcc/c-family/c-omp.c
+++ b/gcc/c-family/c-omp.c
@@ -861,9 +861,10 @@ c_omp_check_loop_iv_exprs (location_t stmt_loc, tree declv, tree decl,
#pragma acc parallel loop  */
 
 tree
-c_oacc_split_loop_clauses (tree clauses, tree *not_loop_clauses)
+c_oacc_split_loop_clauses (tree clauses, tree *not_loop_clauses,
+			   bool is_parallel)
 {
-  tree next, loop_clauses;
+  tree next, loop_clauses, nc;
 
   loop_clauses = *not_loop_clauses = NULL_TREE;
   for (; clauses ; clauses = next)
@@ -882,7 +883,23 @@ c_oacc_split_loop_clauses (tree clauses, tree *not_loop_clauses)
 	case OMP_CLAUSE_SEQ:
 	case OMP_CLAUSE_INDEPENDENT:
 	case OMP_CLAUSE_PRIVATE:
+	  OMP_CLAUSE_CHAIN (clauses) = loop_clauses;
+	  loop_clauses = clauses;
+	  break;
+
+	  /* Reductions must be duplicated on both constructs.  */
 	case OMP_CLAUSE_REDUCTION:
+	  if (is_parallel)
+	{
+	  nc = build_omp_clause (OMP_CLAUSE_LOCATION (clauses),
+ OMP_CLAUSE_REDUCTION);
+	  OMP_CLAUSE_DECL (nc) = OMP_CLAUSE_DECL (clauses);
+	  OMP_CLAUSE_REDUCTION_CODE (nc)
+		= OMP_CLAUSE_REDUCTION_CODE (clauses);
+	  OMP_CLAUSE_CHAIN (nc) = *not_loop_clauses;
+	  *not_loop_clauses = nc;
+	}
+
 	  OMP_CLAUSE_CHAIN (clauses) = loop_clauses;
 	  loop_clauses = clauses;
 	  break;
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 48fa26a..0f2d871 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -14012,6 +14012,8 @@ static tree
 c_parser_oacc_loop (location_t loc, c_parser *parser, char *p_name,
 		omp_clause_mask mask, tree *cclauses, bool *if_p)
 {
+  bool is_parallel = ((mask >> PRAGMA_OACC_CLAUSE_REDUCTION) & 1) == 1;
+
   strcat (p_name, " loop");
   mask |= OACC_LOOP_CLAUSE_MASK;
 
@@ -14020,7 +14022,7 @@ c_parser_oacc_loop (location_t loc, c_parser *parser, char *p_name,
 	cclauses == NULL);
   if (cclauses)
 {
-  clauses = c_oacc_split_loop_clauses (clauses, cclauses);
+  clauses = c_oacc_split_loop_clauses (clauses, cclauses, is_parallel);
   if (*cclauses)
 	*cclauses = c_finish_omp_clauses (*cclauses, C_ORT_ACC);
   if (clauses)
@@ -14128,8 +14130,6 @@ 

[Bug middle-end/70988] missing buffer overflow detection in chained strcat calls

2016-05-09 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70988

Martin Sebor  changed:

   What|Removed |Added

Summary|missing buffer overflow |missing buffer overflow
   |warning on chained strcat   |detection in chained strcat
   |calls   |calls
  Known to fail||4.5.3, 4.8.3, 4.9.3, 5.3.0,
   ||6.1.0

--- Comment #1 from Martin Sebor  ---
Furthermore, in cases where GCC does optimize multiple chained strcat calls
into calls to __builtin_memcpy (which are then expanded into inline assembly)
as in the test case below, it fails to add the instrumentation necessary to
detect the buffer overflow.
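
For comparison with the unchecked calls in the test case that follows, this is the kind of check the missing instrumentation would amount to. It is a hand-written sketch with a made-up helper, not a libc function and not what _FORTIFY_SOURCE expands to:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: append src to dst only if the result, including the
// terminating NUL, fits in capacity bytes.
bool checked_strcat(char *dst, std::size_t capacity, const char *src)
{
    const std::size_t used = std::strlen(dst);
    const std::size_t extra = std::strlen(src);
    if (used >= capacity || extra >= capacity - used)
        return false;                        // would overflow: dst untouched
    std::memcpy(dst + used, src, extra + 1); // copy including the NUL
    return true;
}
```

With a 4-byte buffer, appending "abc" to the empty string succeeds and every later append is rejected.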

$ cat xxx.c && /home/msebor/build/gcc-trunk-git/gcc/xgcc
-B/home/msebor/build/gcc-trunk-git/gcc -D_FORTIFY_SOURCE=2 -O2 -S -Wall -Wextra
-Wpedantic -fdump-tree-optimized=/dev/stdout xxx.c && ./a.out 
#include <string.h>

void  __attribute__ ((noclone, noinline))
f (const char *s)
{
  __builtin_printf ("\"%s\"\n", s);
}

void  __attribute__ ((noclone, noinline))
g (void)
{
  char a [4] = "";
  strcat (a, "abc");
  strcat (a, "def");
  strcat (a, "ghi");
  strcat (a, "jkl");
  f (a);
}

int main ()
{
  g ();
}

;; Function f (f, funcdef_no=24, decl_uid=2214, cgraph_uid=24, symbol_order=24)

__attribute__((noinline, noclone))
f (const char * s)
{
  <bb 2>:
  __builtin_printf ("\"%s\"\n", s_2(D)); [tail call]
  return;

}



;; Function g (g, funcdef_no=25, decl_uid=2217, cgraph_uid=25, symbol_order=25)

__attribute__((noinline, noclone))
g ()
{
  char a[4];

  :
  MEM[(char * {ref-all})] = "abc";
  __builtin_memcpy ([(void *) + 3B], "def", 4);
  __builtin_memcpy ([(void *) + 6B], "ghi", 4);
  __builtin_memcpy ([(void *) + 9B], "jkl", 4);
  f ();
  a ={v} {CLOBBER};
  return;

}



;; Function main (main, funcdef_no=26, decl_uid=2220, cgraph_uid=26,
symbol_order=26) (executed once)

main ()
{
  :
  g ();
  return 0;

}


"�@abcdef"

[Bug middle-end/70626] bogus results in 'acc parallel loop' reductions

2016-05-09 Thread cesar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70626

--- Comment #6 from cesar at gcc dot gnu.org ---
Author: cesar
Date: Mon May  9 20:42:47 2016
New Revision: 236049

URL: https://gcc.gnu.org/viewcvs?rev=236049&root=gcc&view=rev
Log:
Backport trunk r235651:
2016-04-29  Cesar Philippidis  

gcc/c-family/
PR middle-end/70626
* c-common.h (c_oacc_split_loop_clauses): Add boolean argument.
* c-omp.c (c_oacc_split_loop_clauses): Use it to duplicate
reduction clauses in acc parallel loops.

gcc/c/
PR middle-end/70626
* c-parser.c (c_parser_oacc_loop): Don't augment mask with
OACC_LOOP_CLAUSE_MASK.
(c_parser_oacc_kernels_parallel): Update call to
c_oacc_split_loop_clauses.

gcc/cp/
PR middle-end/70626
* parser.c (cp_parser_oacc_loop): Don't augment mask with
OACC_LOOP_CLAUSE_MASK.
(cp_parser_oacc_kernels_parallel): Update call to
c_oacc_split_loop_clauses.

gcc/fortran/
PR middle-end/70626
* trans-openmp.c (gfc_trans_oacc_combined_directive): Duplicate
the reduction clause in both parallel and loop directives.

gcc/testsuite/
PR middle-end/70626
* c-c++-common/goacc/combined-reduction.c: New test.
* gfortran.dg/goacc/reduction-2.f95: Add check for kernels reductions.

libgomp/
PR middle-end/70626
* testsuite/libgomp.oacc-c++/template-reduction.C: Adjust test.
* testsuite/libgomp.oacc-c-c++-common/combined-reduction.c: New test.
* testsuite/libgomp.oacc-fortran/combined-reduction.f90: New test.


Added:
   
branches/gomp-4_0-branch/gcc/testsuite/c-c++-common/goacc/combined-reduction.c
branches/gomp-4_0-branch/gcc/testsuite/c-c++-common/goacc/pr70688.c
   
branches/gomp-4_0-branch/libgomp/testsuite/libgomp.oacc-c-c++-common/combined-reduction.c
   
branches/gomp-4_0-branch/libgomp/testsuite/libgomp.oacc-fortran/combined-reduction.f90
Modified:
branches/gomp-4_0-branch/gcc/c-family/ChangeLog.gomp
branches/gomp-4_0-branch/gcc/c-family/c-common.h
branches/gomp-4_0-branch/gcc/c-family/c-omp.c
branches/gomp-4_0-branch/gcc/c/ChangeLog.gomp
branches/gomp-4_0-branch/gcc/c/c-parser.c
branches/gomp-4_0-branch/gcc/cp/ChangeLog.gomp
branches/gomp-4_0-branch/gcc/cp/parser.c
branches/gomp-4_0-branch/gcc/cp/semantics.c
branches/gomp-4_0-branch/gcc/fortran/ChangeLog.gomp
branches/gomp-4_0-branch/gcc/fortran/gfortran.h
branches/gomp-4_0-branch/gcc/fortran/match.c
branches/gomp-4_0-branch/gcc/fortran/trans-openmp.c
branches/gomp-4_0-branch/gcc/testsuite/ChangeLog.gomp
   
branches/gomp-4_0-branch/gcc/testsuite/gfortran.dg/goacc/combined-directives.f90
branches/gomp-4_0-branch/gcc/testsuite/gfortran.dg/goacc/reduction-2.f95
   
branches/gomp-4_0-branch/libgomp/testsuite/libgomp.oacc-c++/template-reduction.C

Re: SafeStack proposal in GCC

2016-05-09 Thread Rich Felker
On Mon, May 09, 2016 at 10:03:02PM +0200, Michael Matz wrote:
> Hi,
> 
> On Mon, 9 May 2016, Rich Felker wrote:
> 
> > > > The *context APIs are deprecated and I'm not sure they're worth 
> > > > supporting with this. It would be a good excuse to get people to 
> > > > stop using them.
> > > 
> > > How?  POSIX decided to remove the facilities without any adequate 
> > > replacement (threads aren't).
> > 
> > Threads work just as well as the ucontext api for coroutines. Due to the 
> > requirement to save/restore signal masks, the latter requires a syscall, 
> > making it no faster than a voluntary context switch via futex syscall.
> 
> Uhm, no.  If you disregard efficiency, sure, POSIX threads are sometimes a 
> replacement on some platforms.  They still have completely different 
> activation models (being synchronous with *context, for which you need 
> even further slow synchronization in a threading model).

switch_to_next:
sem_post(next->sem);
while (sem_wait(self->sem));

It can actually be done more idiomatically with cond vars, but I don't
see a way to make it as efficient.
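
A minimal sketch of the semaphore hand-off above, assuming POSIX threads
and unnamed semaphores are available. The struct coro type, the helper
names, and the fixed round count are hypothetical and only serve the
illustration; unlike the three-line version, the wait loop here retries
only on EINTR.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

#define ROUNDS 3

/* Hypothetical per-coroutine state: one semaphore that is posted
   whenever it is this coroutine's turn to run.  */
struct coro { sem_t sem; };

static struct coro main_co, worker_co;
static char trace[2 * ROUNDS + 1];
static int trace_len;

/* Append one character to the trace of who ran when.  */
static void record(char c)
{
  assert(trace_len < 2 * ROUNDS);
  trace[trace_len++] = c;
  trace[trace_len] = '\0';
}

/* Block until SELF is activated; retry if a signal interrupts the wait.  */
static void wait_turn(struct coro *self)
{
  while (sem_wait(&self->sem) == -1 && errno == EINTR)
    ;
}

/* The switch_to_next step: wake NEXT, then park SELF.  */
static void coro_switch(struct coro *self, struct coro *next)
{
  sem_post(&next->sem);
  wait_turn(self);
}

static void *worker(void *arg)
{
  (void) arg;
  wait_turn(&worker_co);            /* parked until first activated */
  for (int i = 0; i < ROUNDS; i++)
    {
      record('w');
      if (i + 1 < ROUNDS)
        coro_switch(&worker_co, &main_co);
    }
  sem_post(&main_co.sem);           /* final hand-back, no more waiting */
  return NULL;
}

/* Run the ping-pong and return the trace, or NULL on thread failure.  */
static const char *run_pingpong(void)
{
  pthread_t tid;

  trace_len = 0;
  trace[0] = '\0';
  sem_init(&main_co.sem, 0, 0);
  sem_init(&worker_co.sem, 0, 0);
  if (pthread_create(&tid, NULL, worker, NULL) != 0)
    return NULL;
  for (int i = 0; i < ROUNDS; i++)
    {
      record('m');
      coro_switch(&main_co, &worker_co);
    }
  pthread_join(tid, NULL);
  sem_destroy(&main_co.sem);
  sem_destroy(&worker_co.sem);
  return trace;
}
```

Only one of the two threads is runnable at any time, so the trace needs
no further locking; each switch costs a sem_post plus a blocking
sem_wait.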

> > Most of the other hacks people used the ucontext API for were complete 
> > hacks with undefined behavior, anyway.
> 
> Sure, that doesn't imply the facility should be removed.  I can misuse all 
> kinds of stuff.

Indeed.

> > BTW it's not even possible to implement makecontext on most targets due 
> > to the wacky variadic calling convention it uses -- in most ABIs, 
> > there's simply no way to shift the variadic args into the right slots 
> > for calling the start function for the new context without knowing their 
> > types, and the implementation has no way to know the types. So it's 
> > really an unusably broken API.
> 
> Of course.  But _that_ implies that a workable replacement should have 
> been put in place, not the unrealistic stance POSIX took with the removal:
>   makecontext2(ucontext_t *ucp, void (*func)(void*), void* cookie);

It could have been done even more simply, without a new function, by
just saying the behavior is undefined unless func has type
void(*)(void), argc==1, and the first variadic arg has type void*.

> Done.  I never understood why they left in the hugely 
> unuseful {sig,}{set,long}jmp() but removed the actually useful *context()
> (amended somehow like above).

Because those are actually part of the C language (the non-sig
versions, but the sig versions are needed to work around broken unices
that made the non-sig versions save/restore signal mask and thus too
slow to ever use). They're also much more useful for actually
reasonable code (non-local exit across functions that were badly
designed with no error paths) as opposed to just nasty hacks that are
mostly/entirely UB anyway (coroutines, etc.).

Rich


Re: SafeStack proposal in GCC

2016-05-09 Thread Ian Lance Taylor
On Mon, May 9, 2016 at 1:07 PM, Joel Sherrill  wrote:
>
> One complication on RTEMS which is a single process, multi-threaded RTOS
> is that we can no longer check the stack bounds. For threads, we know
> where the stack memory is and the range for each thread. For ucontext_t,
> it seems this knowledge is unknown to the RTOS.
>
> Thus it would become the responsibility of the run-time using ucontext_t
> to put in fence patterns and check those.

On RTEMS and similar systems, you could write makecontext to register
the stack (whose start and length are known to the function) with the
RTOS.

Ian


[gomp4] backport the *finish_omp_clauses changes

2016-05-09 Thread Cesar Philippidis
This patch backports the *finish_omp_clauses changes I made to the c and
c++ front ends in trunk revision 235780. As with the cilk patch, enough
had changed in gomp-4_0-branch that the trunk patch did not apply
cleanly on that branch.

I've applied this patch to gomp-4_0-branch.

Cesar
2016-05-09  Cesar Philippidis  

	Backport trunk r235780:
	2016-05-02  Cesar Philippidis  

	gcc/c-family/
	* c-common.h (enum c_omp_region_type): Define.

	gcc/c/
	* c-parser.c (c_parser_oacc_all_clauses): Update call to
	c_finish_omp_clauses.
	(c_parser_omp_all_clauses): Likewise.
	(c_parser_oacc_cache): Likewise.
	(c_parser_oacc_loop): Likewise.
	(omp_split_clauses): Likewise.
	(c_parser_omp_declare_target): Likewise.
	(c_parser_cilk_all_clauses): Likewise.
	(c_parser_cilk_for): Likewise.
	* c-typeck.c (c_finish_omp_clauses): Replace bool arguments
	is_omp, declare_simd, and is_cilk with enum c_omp_region_type ort.

	gcc/cp/
	* cp-tree.h (finish_omp_clauses): Update prototype.
	* parser.c (cp_parser_oacc_all_clauses): Update call to
	finish_omp_clauses.
	(cp_parser_omp_all_clauses): Likewise.
	(cp_parser_omp_for_loop): Likewise.
	(cp_omp_split_clauses): Likewise.
	(cp_parser_oacc_cache): Likewise.
	(cp_parser_oacc_loop): Likewise.
	(cp_parser_omp_declare_target): Likewise.
	(cp_parser_cilk_simd_all_clauses): Likewise.
	(cp_parser_cilk_for): Likewise.
	* pt.c (tsubst_omp_clauses): Replace allow_fields and declare_simd
	arguments with enum c_omp_region_type ort.
	(tsubst_omp_clauses): Update calls to finish_omp_clauses.
	(tsubst_omp_attribute): Update calls to tsubst_omp_clauses.
	(tsubst_omp_for_iterator): Update calls to finish_omp_clauses.
	(tsubst_expr): Update calls to tsubst_omp_clauses.
	* semantics.c (finish_omp_clauses): Replace bool arguments
	allow_fields, declare_simd, and is_cilk with bitmask ort.
	(finish_omp_for): Update call to finish_omp_clauses.


diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 59da6c8..ef3493e 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1259,6 +1259,15 @@ enum c_omp_clause_split
   C_OMP_CLAUSE_SPLIT_TASKLOOP = C_OMP_CLAUSE_SPLIT_FOR
 };
 
+enum c_omp_region_type
+{
+  C_ORT_OMP			= 1 << 0,
+  C_ORT_CILK			= 1 << 1,
+  C_ORT_ACC			= 1 << 2,
+  C_ORT_DECLARE_SIMD		= 1 << 3,
+  C_ORT_OMP_DECLARE_SIMD	= C_ORT_OMP | C_ORT_DECLARE_SIMD,
+};
+
 extern tree c_finish_omp_master (location_t, tree);
 extern tree c_finish_omp_taskgroup (location_t, tree);
 extern tree c_finish_omp_critical (location_t, tree, tree, tree);
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 7667715..48fa26a 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -13367,7 +13367,7 @@ c_parser_oacc_all_clauses (c_parser *parser, omp_clause_mask mask,
   c_parser_skip_to_pragma_eol (parser);
 
   if (finish_p)
-return c_finish_omp_clauses (clauses, true, false);
+return c_finish_omp_clauses (clauses, C_ORT_ACC);
 
   return clauses;
 }
@@ -13652,8 +13652,8 @@ c_parser_omp_all_clauses (c_parser *parser, omp_clause_mask mask,
   if (finish_p)
 {
   if ((mask & (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_UNIFORM)) != 0)
-	return c_finish_omp_clauses (clauses, false, true, true);
-  return c_finish_omp_clauses (clauses, false, true);
+	return c_finish_omp_clauses (clauses, C_ORT_OMP_DECLARE_SIMD);
+  return c_finish_omp_clauses (clauses, C_ORT_OMP);
 }
 
   return clauses;
@@ -13685,7 +13685,7 @@ c_parser_oacc_cache (location_t loc, c_parser *parser)
   tree stmt, clauses;
 
   clauses = c_parser_omp_var_list_parens (parser, OMP_CLAUSE__CACHE_, NULL);
-  clauses = c_finish_omp_clauses (clauses, true, false);
+  clauses = c_finish_omp_clauses (clauses, C_ORT_ACC);
 
   c_parser_skip_to_pragma_eol (parser);
 
@@ -14022,9 +14022,9 @@ c_parser_oacc_loop (location_t loc, c_parser *parser, char *p_name,
 {
   clauses = c_oacc_split_loop_clauses (clauses, cclauses);
   if (*cclauses)
-	*cclauses = c_finish_omp_clauses (*cclauses, true, false);
+	*cclauses = c_finish_omp_clauses (*cclauses, C_ORT_ACC);
   if (clauses)
-	clauses = c_finish_omp_clauses (clauses, true, false);
+	clauses = c_finish_omp_clauses (clauses, C_ORT_ACC);
 }
 
   tree block = c_begin_compound_stmt (true);
@@ -15228,7 +15228,7 @@ omp_split_clauses (location_t loc, enum tree_code code,
   c_omp_split_clauses (loc, code, mask, clauses, cclauses);
   for (i = 0; i < C_OMP_CLAUSE_SPLIT_COUNT; i++)
 if (cclauses[i])
-  cclauses[i] = c_finish_omp_clauses (cclauses[i], false, true);
+  cclauses[i] = c_finish_omp_clauses (cclauses[i], C_ORT_OMP);
 }
 
 /* OpenMP 4.0:
@@ -16759,7 +16759,7 @@ c_parser_omp_declare_target (c_parser *parser)
 {
   clauses = c_parser_omp_var_list_parens (parser, OMP_CLAUSE_TO_DECLARE,
 	  clauses);
-  clauses = c_finish_omp_clauses (clauses, false, true);
+  clauses = c_finish_omp_clauses (clauses, C_ORT_OMP);
   c_parser_skip_to_pragma_eol (parser);
 }
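
A small sketch of how a single bitmask argument can stand in for the
separate bool parameters this patch removes. The enum is copied from the
c-common.h hunk above; ort_has and ort_from_bools are hypothetical helpers
written for this illustration and are not part of the patch. The Cilk
mapping in particular is an assumption, since the corresponding call-site
hunk is not shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Copied from the c-common.h hunk of the patch.  */
enum c_omp_region_type
{
  C_ORT_OMP			= 1 << 0,
  C_ORT_CILK			= 1 << 1,
  C_ORT_ACC			= 1 << 2,
  C_ORT_DECLARE_SIMD		= 1 << 3,
  C_ORT_OMP_DECLARE_SIMD	= C_ORT_OMP | C_ORT_DECLARE_SIMD,
};

/* Hypothetical helper: true if every bit of FLAG is set in ORT.  */
static bool
ort_has (enum c_omp_region_type ort, enum c_omp_region_type flag)
{
  return (ort & flag) == flag;
}

/* Hypothetical helper: translate the old argument list
   (is_oacc, is_omp, declare_simd, is_cilk) into the new bitmask.  */
static enum c_omp_region_type
ort_from_bools (bool is_oacc, bool is_omp, bool declare_simd, bool is_cilk)
{
  int ort = 0;

  if (is_omp)
    ort |= C_ORT_OMP;
  if (is_cilk)
    ort |= C_ORT_CILK;
  if (is_oacc)
    ort |= C_ORT_ACC;
  if (declare_simd)
    ort |= C_ORT_DECLARE_SIMD;
  return (enum c_omp_region_type) ort;
}
```

With these helpers, the old call pattern (clauses, true, false)
corresponds to C_ORT_ACC and (clauses, false, true, true) to
C_ORT_OMP_DECLARE_SIMD, matching the replacements in the hunks above.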
   

Re: Re: GCC 6.1 Hard-coded C++ header paths and relocation problem on Windows

2016-05-09 Thread Brett Neumeier
On Tue, May 3, 2016 at 10:01 AM, lh_mouse  wrote:
> Should I file a bug report then?
> We need some Linux testers, though not many people on Linux relocate 
> compilers.

For what it's worth -- I encountered the same problem on a GNU/Linux
system. In my specific situation, I'm cross-compiling GCC using an
AMD64-to-mips64el cross-toolchain, and installing the resulting GCC in
a sysroot directory. When I try to use that GCC on a target device
where (of course) the sysroot directory becomes "/", the hard-coded
"/path/to/sysroot" from the host system is still used to find the C++
headers, resulting in the same ".../include/c++/6.1.1/cstdlib:75:25:
fatal error: stdlib.h: No such file or directory" error message you
got.

Changing #include_next to #include in cstdlib and cmath fixed my
problem -- so, thank you very much for this discussion! It helped at
least one other person.

Please let me know if there's any other testing I can do to help.

Cheers,

Brett


[Bug fortran/71014] associate statement inside omp parallel do appears to disable default private attribute for inner loop indices

2016-05-09 Thread anlauf at gmx dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71014

--- Comment #8 from Harald Anlauf  ---
(In reply to Keith Lindsay from comment #6)
> Harald,
> 
> The problem does go away if I add a PRIVATE(i) clause to the OMP directive.
> 
> However, my understanding of OpenMP in fortran is that all loop iteration
> variables, even inner nested loops, in an OpenMP PARALLEL DO construct (and
> some others) are private by default. I.e., they do not need to be declared
> private in the OMP directive. (I think this specification is different than
> the specification for inner loops in C.)

The OpenMP 4.5 standard says (2.15.1.1, for Fortran):

- The loop iteration variable(s) in the associated do-loop(s) of a do,
  parallel do, taskloop, or distribute construct is (are) private.
[...]
- A loop iteration variable for a sequential loop in a parallel or task
  generating construct is private in the innermost such construct that
  encloses the loop.
[...]

Now the question is: what are the "associated do-loop(s)"?  And
what exactly does the second item above mean?

I'm not too versed in reading the standard, and I tend to be on the
conservative side here and interpret the ASSOCIATE construct to
generate something like a BLOCK that makes things interesting, so
that the i loop is to be dealt with separately.  (I usually declare
all loop variables explicitly.)

What do you get when replacing ASSOCIATE by BLOCK/END BLOCK?
I'd expect you experience the same problem.

> Indeed, if I comment out the associate construct, the problem goes away. So
> I'm inferring that the associate construct is interfering with the inner
> loop index being assigned the private attribute.
> 
> Keith

I would recommend always validating OpenMP-parallelized code with a
suitable tool, such as valgrind/helgrind, gcc/thread-sanitizer, or
Intel Inspector.  Don't rely on just running your code.

(I get a possible data race when running your original code under
valgrind with gfortran and ifort, and a cross-thread stack access
with ifort/Intel Inspector for the line doing the accumulation.
However, I also get this for the fixed code.)

If you need more insight on what gfortran is doing, add the option
"-fdump-tree-original" and compare the resulting intermediate files
*.003t.original.

[gomp4] backport fix for PR69363

2016-05-09 Thread Cesar Philippidis
I've applied this patch to gomp-4_0-branch which backports some cilk
changes in the c and c++ front ends to gomp-4_0-branch. These changes
were necessary for my recent finish_omp_clauses patch, which I'll be
committing next.

Cesar
2016-05-09  Cesar Philippidis  

	Backport trunk r235290:
	2016-04-20  Ilya Verbin  

	gcc/c-family/
	PR c++/69363
	* c-cilkplus.c (c_finish_cilk_clauses): Remove function.
	* c-common.h (c_finish_cilk_clauses): Remove declaration.

	gcc/c/
	PR c++/69363
	* c-parser.c (c_parser_cilk_all_clauses): Use c_finish_omp_clauses
	instead of c_finish_cilk_clauses.
	* c-tree.h (c_finish_omp_clauses): Add new default argument.
	* c-typeck.c (c_finish_omp_clauses): Add new argument.  Allow
	floating-point variables in the linear clause for Cilk Plus.

	gcc/cp/
	PR c++/69363
	* cp-tree.h (finish_omp_clauses): Add new default argument.
	* parser.c (cp_parser_cilk_simd_all_clauses): Use finish_omp_clauses
	instead of c_finish_cilk_clauses.
	* semantics.c (finish_omp_clauses): Add new argument.  Allow
	floating-point variables in the linear clause for Cilk Plus.

	gcc/testsuite/
	PR c++/69363
	* c-c++-common/cilk-plus/PS/clauses3.c: Adjust dg-error string.
	* c-c++-common/cilk-plus/PS/clauses4.c: New test.
	* c-c++-common/cilk-plus/PS/pr69363.c: New test.


diff --git a/gcc/c-family/c-cilkplus.c b/gcc/c-family/c-cilkplus.c
index 3e7902fd..9f1f364 100644
--- a/gcc/c-family/c-cilkplus.c
+++ b/gcc/c-family/c-cilkplus.c
@@ -41,56 +41,6 @@ c_check_cilk_loop (location_t loc, tree decl)
   return true;
 }
 
-/* Validate and emit code for <#pragma simd> clauses.  */
-
-tree
-c_finish_cilk_clauses (tree clauses)
-{
-  for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
-{
-  tree prev = clauses;
-
-  /* If a variable appears in a linear clause it cannot appear in
-	 any other OMP clause.  */
-  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_LINEAR)
-	for (tree c2 = clauses; c2; c2 = OMP_CLAUSE_CHAIN (c2))
-	  {
-	if (c == c2)
-	  continue;
-	enum omp_clause_code code = OMP_CLAUSE_CODE (c2);
-
-	switch (code)
-	  {
-	  case OMP_CLAUSE_LINEAR:
-	  case OMP_CLAUSE_PRIVATE:
-	  case OMP_CLAUSE_FIRSTPRIVATE:
-	  case OMP_CLAUSE_LASTPRIVATE:
-	  case OMP_CLAUSE_REDUCTION:
-		break;
-
-	  case OMP_CLAUSE_SAFELEN:
-		goto next;
-
-	  default:
-		gcc_unreachable ();
-	  }
-
-	if (OMP_CLAUSE_DECL (c) == OMP_CLAUSE_DECL (c2))
-	  {
-		error_at (OMP_CLAUSE_LOCATION (c2),
-			  "variable appears in more than one clause");
-		inform (OMP_CLAUSE_LOCATION (c),
-			"other clause defined here");
-		// Remove problematic clauses.
-		OMP_CLAUSE_CHAIN (prev) = OMP_CLAUSE_CHAIN (c2);
-	  }
-	  next:
-	prev = c2;
-	  }
-}
-  return clauses;
-}
-
 /* Calculate number of iterations of CILK_FOR.  */
 
 tree
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index ddd5c07..59da6c8 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1370,7 +1370,6 @@ extern enum stv_conv scalar_to_vector (location_t loc, enum tree_code code,
    tree op0, tree op1, bool);
 
 /* In c-cilkplus.c  */
-extern tree c_finish_cilk_clauses (tree);
 extern tree c_validate_cilk_plus_loop (tree *, int *, void *);
 extern bool c_check_cilk_loop (location_t, tree);
 
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 1a4356f..7667715 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -17728,7 +17728,7 @@ c_parser_cilk_all_clauses (c_parser *parser)
 
  saw_error:
   c_parser_skip_to_pragma_eol (parser);
-  return c_finish_cilk_clauses (clauses);
+  return c_finish_omp_clauses (clauses, false, false, false, true);
 }
 
 /* This function helps parse the grainsize pragma for a _Cilk_for statement.
diff --git a/gcc/c/c-tree.h b/gcc/c/c-tree.h
index 70b7bd9..1703162 100644
--- a/gcc/c/c-tree.h
+++ b/gcc/c/c-tree.h
@@ -661,7 +661,7 @@ extern tree c_begin_omp_task (void);
 extern tree c_finish_omp_task (location_t, tree, tree);
 extern void c_finish_omp_cancel (location_t, tree);
 extern void c_finish_omp_cancellation_point (location_t, tree);
-extern tree c_finish_omp_clauses (tree, bool, bool, bool = false);
+extern tree c_finish_omp_clauses (tree, bool, bool, bool = false, bool = false);
 extern tree c_build_va_arg (location_t, tree, location_t, tree);
 extern tree c_finish_transaction (location_t, tree, int);
 extern bool c_tree_equal (tree, tree);
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 4813d4b..067ce82 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -12496,7 +12496,8 @@ c_find_omp_placeholder_r (tree *tp, int *, void *data)
Remove any elements from the list that are invalid.  */
 
 tree
-c_finish_omp_clauses (tree clauses, bool is_oacc, bool is_omp, bool declare_simd)
+c_finish_omp_clauses (tree clauses, bool is_oacc, bool is_omp,
+		  bool declare_simd, bool is_cilk)
 {
   bitmap_head generic_head, firstprivate_head, lastprivate_head;
   bitmap_head 

[Bug c++/69363] ICE when doing a pragma simd reduction with max

2016-05-09 Thread cesar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69363

--- Comment #8 from cesar at gcc dot gnu.org ---
Author: cesar
Date: Mon May  9 20:23:31 2016
New Revision: 236047

URL: https://gcc.gnu.org/viewcvs?rev=236047&root=gcc&view=rev
Log:
Backport trunk r235290:
2016-04-20  Ilya Verbin  

gcc/c-family/
PR c++/69363
* c-cilkplus.c (c_finish_cilk_clauses): Remove function.
* c-common.h (c_finish_cilk_clauses): Remove declaration.

gcc/c/
PR c++/69363
* c-parser.c (c_parser_cilk_all_clauses): Use c_finish_omp_clauses
instead of c_finish_cilk_clauses.
* c-tree.h (c_finish_omp_clauses): Add new default argument.
* c-typeck.c (c_finish_omp_clauses): Add new argument.  Allow
floating-point variables in the linear clause for Cilk Plus.

gcc/cp/
PR c++/69363
* cp-tree.h (finish_omp_clauses): Add new default argument.
* parser.c (cp_parser_cilk_simd_all_clauses): Use finish_omp_clauses
instead of c_finish_cilk_clauses.
* semantics.c (finish_omp_clauses): Add new argument.  Allow
floating-point variables in the linear clause for Cilk Plus.

gcc/testsuite/
PR c++/69363
* c-c++-common/cilk-plus/PS/clauses3.c: Adjust dg-error string.
* c-c++-common/cilk-plus/PS/clauses4.c: New test.
* c-c++-common/cilk-plus/PS/pr69363.c: New test.


Added:
branches/gomp-4_0-branch/gcc/testsuite/c-c++-common/cilk-plus/PS/clauses4.c
branches/gomp-4_0-branch/gcc/testsuite/c-c++-common/cilk-plus/PS/pr69363.c
Modified:
branches/gomp-4_0-branch/gcc/c-family/ChangeLog.gomp
branches/gomp-4_0-branch/gcc/c-family/c-cilkplus.c
branches/gomp-4_0-branch/gcc/c-family/c-common.h
branches/gomp-4_0-branch/gcc/c/ChangeLog.gomp
branches/gomp-4_0-branch/gcc/c/c-parser.c
branches/gomp-4_0-branch/gcc/c/c-tree.h
branches/gomp-4_0-branch/gcc/c/c-typeck.c
branches/gomp-4_0-branch/gcc/cp/ChangeLog.gomp
branches/gomp-4_0-branch/gcc/cp/cp-tree.h
branches/gomp-4_0-branch/gcc/cp/parser.c
branches/gomp-4_0-branch/gcc/cp/semantics.c
branches/gomp-4_0-branch/gcc/testsuite/ChangeLog.gomp
branches/gomp-4_0-branch/gcc/testsuite/c-c++-common/cilk-plus/PS/clauses3.c

Re: SafeStack proposal in GCC

2016-05-09 Thread Joel Sherrill



On 5/9/2016 3:03 PM, Michael Matz wrote:
> Hi,
>
> On Mon, 9 May 2016, Rich Felker wrote:
>
> > > > The *context APIs are deprecated and I'm not sure they're worth
> > > > supporting with this. It would be a good excuse to get people to
> > > > stop using them.
> > >
> > > How?  POSIX decided to remove the facilities without any adequate
> > > replacement (threads aren't).
> >
> > Threads work just as well as the ucontext api for coroutines. Due to the
> > requirement to save/restore signal masks, the latter requires a syscall,
> > making it no faster than a voluntary context switch via futex syscall.
>
> Uhm, no.  If you disregard efficiency, sure, POSIX threads are sometimes a
> replacement on some platforms.  They still have completely different
> activation models (being synchronous with *context, for which you need
> even further slow synchronization in a threading model).

One complication on RTEMS which is a single process, multi-threaded RTOS
is that we can no longer check the stack bounds. For threads, we know
where the stack memory is and the range for each thread. For ucontext_t,
it seems this knowledge is unknown to the RTOS.

Thus it would become the responsibility of the run-time using ucontext_t
to put in fence patterns and check those.

> > Most of the other hacks people used the ucontext API for were complete
> > hacks with undefined behavior, anyway.
>
> Sure, that doesn't imply the facility should be removed.  I can misuse all
> kinds of stuff.
>
> > BTW it's not even possible to implement makecontext on most targets due
> > to the wacky variadic calling convention it uses -- in most ABIs,
> > there's simply no way to shift the variadic args into the right slots
> > for calling the start function for the new context without knowing their
> > types, and the implementation has no way to know the types. So it's
> > really an unusably broken API.
>
> Of course.  But _that_ implies that a workable replacement should have
> been put in place, not the unrealistic stance POSIX took with the removal:
>   makecontext2(ucontext_t *ucp, void (*func)(void*), void* cookie);
> Done.  I never understood why they left in the hugely
> unuseful {sig,}{set,long}jmp() but removed the actually useful *context()
> (amended somehow like above).
>
> Ciao,
> Michael.

--joel


[Bug c/71033] Segmentation fault c + intel assembly, unable to use EBX

2016-05-09 Thread formateu at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71033

Mateusz Forc  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #2 from Mateusz Forc  ---
(In reply to Uroš Bizjak from comment #1)
> x86 ABI requires that %ebx is preserved across function call. So, you need
> to save it to stack in f.s and restore it before function returns. Or, you
> can use %edx instead, which can be clobbered in function.

Re: SafeStack proposal in GCC

2016-05-09 Thread Joel Sherrill



On 5/9/2016 2:45 PM, Ian Lance Taylor wrote:
> On Mon, May 9, 2016 at 12:41 PM, Joel Sherrill
>  wrote:
>>
>> On 5/9/2016 2:25 PM, Ian Lance Taylor wrote:
>>>
>>> On Fri, May 6, 2016 at 10:42 PM, Rich Felker  wrote:
>>>>
>>>> The *context APIs are deprecated and I'm not sure they're worth
>>>> supporting with this. It would be a good excuse to get people to stop
>>>> using them.
>>>
>>> The gccgo library uses them, because there is no working alternative.
>>
>> FWIW when this transition occurred, that's when the RTEMS port broke.
>> We don't have these methods.
>
> Yes, that was unfortunate, but it was a significant increase in efficiency.
>
>> It would be an interesting exercise to see if they could be
>> implemented in terms of our internal thread context management
>> APIs but no one has ever looked into it deeply.
>
> They are short functions, and easy to implement.  They don't need to
> use any thread context management, they just manipulate registers.
> The catch is that, because they manipulate registers, they are
> inherently machine-specific.

I suppose we could reuse implementations from *BSD for a subset
of targets. Those would likely be the targets folks care about
anyway.

Hmm... would those make sense to add to newlib? I am thinking
they are similar to setjmp/longjmp and shouldn't need supervisor
mode access.

> Ian

[Bug fortran/71027] -fsanitize=address catches out of bounds access on assumed size array only with -O0

2016-05-09 Thread zeccav at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71027

--- Comment #2 from Vittorio Zecca  ---
Yes, you are right, and probably in real programs the subroutine would
not be optimized away.
Thank you for the explanation.

Re: SafeStack proposal in GCC

2016-05-09 Thread Michael Matz
Hi,

On Mon, 9 May 2016, Rich Felker wrote:

> > > The *context APIs are deprecated and I'm not sure they're worth 
> > > supporting with this. It would be a good excuse to get people to 
> > > stop using them.
> > 
> > How?  POSIX decided to remove the facilities without any adequate 
> > replacement (threads aren't).
> 
> Threads work just as well as the ucontext api for coroutines. Due to the 
> requirement to save/restore signal masks, the latter requires a syscall, 
> making it no faster than a voluntary context switch via futex syscall.

Uhm, no.  If you disregard efficiency, sure, POSIX threads are sometimes a 
replacement on some platforms.  They still have completely different 
activation models (being synchronous with *context, for which you need 
even further slow synchronization in a threading model).

> Most of the other hacks people used the ucontext API for were complete 
> hacks with undefined behavior, anyway.

Sure, that doesn't imply the facility should be removed.  I can misuse all 
kinds of stuff.

> BTW it's not even possible to implement makecontext on most targets due 
> to the wacky variadic calling convention it uses -- in most ABIs, 
> there's simply no way to shift the variadic args into the right slots 
> for calling the start function for the new context without knowing their 
> types, and the implementation has no way to know the types. So it's 
> really an unusably broken API.

Of course.  But _that_ implies that a workable replacement should have 
been put in place, not the unrealistic stance POSIX took with the removal:
  makecontext2(ucontext_t *ucp, void (*func)(void*), void* cookie);
Done.  I never understood why they left in the hugely 
unuseful {sig,}{set,long}jmp() but removed the actually useful *context()
(amended somehow like above).


Ciao,
Michael.


[Bug c/71033] Segmentation fault c + intel assembly, unable to use EBX

2016-05-09 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71033

--- Comment #1 from Uroš Bizjak  ---
The x86 ABI requires that %ebx be preserved across function calls. So, you
need to save it to the stack in f.s and restore it before the function
returns. Or, you can use %edx instead, which can be clobbered in a function.
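
For illustration only, here is a sketch of the same rule expressed with
GCC extended inline assembly rather than a separate f.s file, which is a
different technique from the one in the report. Listing "ebx" in the
clobber list tells the compiler the register is overwritten, so the
compiler takes care of preserving it for the caller. The function name
add_via_ebx is hypothetical, and the non-x86 fallback exists only so the
file builds elsewhere.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example: add two numbers using %ebx as scratch space.  */
static uint32_t
add_via_ebx (uint32_t a, uint32_t b)
{
#if defined(__i386__) || defined(__x86_64__)
  uint32_t result;

  __asm__ ("movl %1, %%ebx\n\t"
	   "addl %2, %%ebx\n\t"
	   "movl %%ebx, %0"
	   : "=r" (result)
	   : "r" (a), "r" (b)
	   : "ebx", "cc");	/* %ebx is callee-saved: declare the clobber */
  return result;
#else
  /* Portable fallback for targets without %ebx.  */
  return a + b;
#endif
}

/* Small self-check of the sketch.  */
static void
check_add_via_ebx (void)
{
  assert (add_via_ebx (2, 3) == 5);
}
```

In hand-written assembly the equivalent is an explicit push of %ebx on
entry and a pop before ret, or simply using a caller-saved register such
as %edx.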

Re: SafeStack proposal in GCC

2016-05-09 Thread Ian Lance Taylor
On Mon, May 9, 2016 at 12:48 PM, Joel Sherrill
 wrote:
>
> On 5/9/2016 2:45 PM, Ian Lance Taylor wrote:
>>
>> On Mon, May 9, 2016 at 12:41 PM, Joel Sherrill
>>  wrote:
>>>
>>>
>>> On 5/9/2016 2:25 PM, Ian Lance Taylor wrote:
>>>>
>>>> On Fri, May 6, 2016 at 10:42 PM, Rich Felker  wrote:
>>>>>
>>>>> The *context APIs are deprecated and I'm not sure they're worth
>>>>> supporting with this. It would be a good excuse to get people to stop
>>>>> using them.
>>>>
>>>> The gccgo library uses them, because there is no working alternative.
>>>
>>>
>>>
>>> FWIW when this transition occurred, that's when the RTEMS port broke.
>>> We don't have these methods.
>>
>>
>> Yes, that was unfortunate, but it was a significant increase in
>> efficiency.
>>
>>> It would be an interesting exercise to see if they could be
>>> implemented in terms of our internal thread context management
>>> APIs but no one has ever looked into it deeply.
>>
>>
>> They are short functions, and easy to implement.  They don't need to
>> use any thread context management, they just manipulate registers.
>> The catch is that, because they manipulate registers, they are
>> inherently machine-specific.
>
>
> I suppose we could reuse implementations from *BSD for a subset
> of targets. Those would likely be the targets folks care about
> anyway.
>
> Hmm... would those make sense to add to newlib? I am thinking
> they are similar to setjmp/longjmp and shouldn't need supervisor
> mode access.

Makes sense to me.

Ian


Re: SafeStack proposal in GCC

2016-05-09 Thread Ian Lance Taylor
On Mon, May 9, 2016 at 12:41 PM, Joel Sherrill
 wrote:
>
> On 5/9/2016 2:25 PM, Ian Lance Taylor wrote:
>>
>> On Fri, May 6, 2016 at 10:42 PM, Rich Felker  wrote:
>>>
>>>
>>> The *context APIs are deprecated and I'm not sure they're worth
>>> supporting with this. It would be a good excuse to get people to stop
>>> using them.
>>
>>
>> The gccgo library uses them, because there is no working alternative.
>
>
> FWIW when this transition occurred, that's when the RTEMS port broke.
> We don't have these methods.

Yes, that was unfortunate, but it was a significant increase in efficiency.

> It would be an interesting exercise to see if they could be
> implemented in terms of our internal thread context management
> APIs but no one has ever looked into it deeply.

They are short functions, and easy to implement.  They don't need to
use any thread context management, they just manipulate registers.
The catch is that, because they manipulate registers, they are
inherently machine-specific.

Ian


Re: SafeStack proposal in GCC

2016-05-09 Thread Ian Lance Taylor
On Mon, May 9, 2016 at 12:35 PM, Rich Felker  wrote:
> On Mon, May 09, 2016 at 09:02:33PM +0200, Michael Matz wrote:
>> Hi,
>>
>> On Sat, 7 May 2016, Rich Felker wrote:
>>
>> > > > * sigaltstack and swapcontext are broken too.
>> > >
>> > > We have prototype that supports swapcontext that we're happy to
>> > > release, but it clearly requires more work before being ready to merge
>> > > upstream.
>> >
>> > The *context APIs are deprecated and I'm not sure they're worth
>> > supporting with this. It would be a good excuse to get people to stop
>> > using them.
>>
>> How?  POSIX decided to remove the facilities without any adequate
>> replacement (threads aren't).
>
> Threads work just as well as the ucontext api for coroutines. Due to
> the requirement to save/restore signal masks, the latter requires a
> syscall, making it no faster than a voluntary context switch via
> futex syscall.

No, threads do not work as well.  You can not ignore efficiency
considerations.  Coroutine switching is much faster with setcontext,
even though it does a system call.


> Most of the other hacks people used the ucontext API for were complete
> hacks with undefined behavior, anyway.

And yet code like gccgo's library works, on many different systems.


> BTW it's not even possible to implement makecontext on most targets
> due to the wacky variadic calling convention it uses -- in most ABIs,
> there's simply no way to shift the variadic args into the right slots
> for calling the start function for the new context without knowing
> their types, and the implementation has no way to know the types. So
> it's really an unusably broken API.

I certainly agree that makecontext is broken by design.  But it is
easy enough, and safe, to use makecontext with functions that take no
arguments.  I would fully support an effort to propagate a replacement
for makecontext that can actually work more generally.  I think the
current approach of simply dropping support is unwise.

Ian
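
[Editorial sketch] Ian's point that makecontext is safe with start functions that take no arguments can be illustrated as follows. This is only a minimal sketch, assuming a platform whose libc provides the ucontext functions (glibc does; as the thread notes, RTEMS does not). The names run_coroutine_demo, coroutine, co_stack and steps are made up for the illustration and do not come from gccgo.

```c
#include <stddef.h>
#include <ucontext.h>

/* Illustrative names only; nothing here is taken from gccgo's library. */
static ucontext_t main_ctx, co_ctx;
static char co_stack[64 * 1024];  /* memory designated as the coroutine's stack */
static int steps;

/* Start function with no arguments, so makecontext is called with argc == 0
   and the variadic-argument problem discussed above never arises. */
static void coroutine(void)
{
    steps++;                          /* first activation */
    swapcontext(&co_ctx, &main_ctx);  /* yield back to the caller */
    steps++;                          /* resumed by the second swapcontext */
    /* Returning from here continues at uc_link, i.e. main_ctx. */
}

/* Runs the coroutine to completion and returns the number of steps it
   executed, or -1 if getcontext failed. */
int run_coroutine_demo(void)
{
    steps = 0;
    if (getcontext(&co_ctx) != 0)
        return -1;
    co_ctx.uc_stack.ss_sp = co_stack;
    co_ctx.uc_stack.ss_size = sizeof co_stack;
    co_ctx.uc_link = &main_ctx;
    makecontext(&co_ctx, coroutine, 0);

    swapcontext(&main_ctx, &co_ctx);  /* run until the first yield */
    swapcontext(&main_ctx, &co_ctx);  /* resume until the function returns */
    return steps;
}
```

Calling run_coroutine_demo() from a main function should return 2: one step before the yield and one after the resume. Note that the stack is just a caller-supplied array, which matches the requirement stated elsewhere in the thread that coroutine support needs the ability to designate some area of memory as stack space.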


Re: [PATCH] Remove constraints from further i386 define_expand patterns

2016-05-09 Thread Uros Bizjak
On Mon, May 9, 2016 at 6:49 PM, Jakub Jelinek  wrote:
> Hi!
>
> I believe this cleans up all remaining define_expands (have looked
> at tmp-mddump.md with sed picking up only define_expand patterns in there
> and have been looking for any constraints and none were left).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

reload_noff_store and reload_noff_load are part of the secondary_reload
infrastructure, and these expanders must have constraints, as stated in
the documentation for TARGET_SECONDARY_RELOAD:

 You do this by setting 'sri->icode' to the instruction code of a
 pattern in the md file which performs the move.  Operands 0 and 1
 are the output and input of this copy, respectively.  Operands from
 operand 2 onward are for scratch operands.  These scratch operands
 must have a mode, and a single-register-class output constraint.

It is true that the documentation mentions only scratch operands, so it is
probably also OK to remove the constraints from the non-scratch operands of
these two patterns. Please confirm this with a reload expert.

Others are OK.

Uros.

> 2016-05-09  Jakub Jelinek  
>
> * config/i386/i386.md (reload_noff_store, reload_noff_load, set_got,
> set_got_labelled, lwp_llwpcb, lwp_lwpval3, lwp_lwpins3):
> Remove constraints from expanders.
> * config/i386/sse.md (vec_interleave_high,
> vec_interleave_low, _vpermi2var3_maskz,
> _vpermt2var3_maskz): Likewise.
>
> --- gcc/config/i386/i386.md.jj  2016-05-09 11:38:36.0 +0200
> +++ gcc/config/i386/i386.md 2016-05-09 13:33:12.883238591 +0200
> @@ -1891,9 +1891,9 @@ (define_insn "*popfl1"
>  ;; Reload patterns to support multi-word load/store
>  ;; with non-offsetable address.
>  (define_expand "reload_noff_store"
> -  [(parallel [(match_operand 0 "memory_operand" "=m")
> -  (match_operand 1 "register_operand" "r")
> -  (match_operand:DI 2 "register_operand" "=")])]
> +  [(parallel [(match_operand 0 "memory_operand")
> + (match_operand 1 "register_operand")
> + (match_operand:DI 2 "register_operand")])]
>"TARGET_64BIT"
>  {
>rtx mem = operands[0];
> @@ -1907,9 +1907,9 @@ (define_expand "reload_noff_store"
>  })
>
>  (define_expand "reload_noff_load"
> -  [(parallel [(match_operand 0 "register_operand" "=r")
> -  (match_operand 1 "memory_operand" "m")
> -  (match_operand:DI 2 "register_operand" "=r")])]
> +  [(parallel [(match_operand 0 "register_operand")
> + (match_operand 1 "memory_operand")
> + (match_operand:DI 2 "register_operand")])]
>"TARGET_64BIT"
>  {
>rtx mem = operands[1];
> @@ -12522,7 +12522,7 @@ (define_expand "prologue"
>
>  (define_expand "set_got"
>[(parallel
> - [(set (match_operand:SI 0 "register_operand" "=r")
> + [(set (match_operand:SI 0 "register_operand")
>(unspec:SI [(const_int 0)] UNSPEC_SET_GOT))
>(clobber (reg:CC FLAGS_REG))])]
>"!TARGET_64BIT"
> @@ -12542,7 +12542,7 @@ (define_insn "*set_got"
>
>  (define_expand "set_got_labelled"
>[(parallel
> - [(set (match_operand:SI 0 "register_operand" "=r")
> + [(set (match_operand:SI 0 "register_operand")
>(unspec:SI [(label_ref (match_operand 1))]
>   UNSPEC_SET_GOT))
>(clobber (reg:CC FLAGS_REG))])]
> @@ -19041,7 +19041,7 @@ (define_insn "fnclex"
>  ;
>
>  (define_expand "lwp_llwpcb"
> -  [(unspec_volatile [(match_operand 0 "register_operand" "r")]
> +  [(unspec_volatile [(match_operand 0 "register_operand")]
> UNSPECV_LLWP_INTRINSIC)]
>"TARGET_LWP")
>
> @@ -19055,7 +19055,7 @@ (define_insn "*lwp_llwpcb1"
> (set_attr "length" "5")])
>
>  (define_expand "lwp_slwpcb"
> -  [(set (match_operand 0 "register_operand" "=r")
> +  [(set (match_operand 0 "register_operand")
> (unspec_volatile [(const_int 0)] UNSPECV_SLWP_INTRINSIC))]
>"TARGET_LWP"
>  {
> @@ -19079,9 +19079,9 @@ (define_insn "lwp_slwpcb"
> (set_attr "length" "5")])
>
>  (define_expand "lwp_lwpval3"
> -  [(unspec_volatile [(match_operand:SWI48 1 "register_operand" "r")
> -(match_operand:SI 2 "nonimmediate_operand" "rm")
> -(match_operand:SI 3 "const_int_operand" "i")]
> +  [(unspec_volatile [(match_operand:SWI48 1 "register_operand")
> +(match_operand:SI 2 "nonimmediate_operand")
> +(match_operand:SI 3 "const_int_operand")]
> UNSPECV_LWPVAL_INTRINSIC)]
>"TARGET_LWP"
>;; Avoid unused variable warning.
> @@ -19101,11 +19101,11 @@ (define_insn "*lwp_lwpval3_1"
>
>  (define_expand "lwp_lwpins3"
>[(set (reg:CCC FLAGS_REG)
> -   (unspec_volatile:CCC [(match_operand:SWI48 1 "register_operand" "r")
> - (match_operand:SI 2 "nonimmediate_operand" "rm")
> - 

Re: SafeStack proposal in GCC

2016-05-09 Thread Joel Sherrill



On 5/9/2016 2:25 PM, Ian Lance Taylor wrote:
> On Fri, May 6, 2016 at 10:42 PM, Rich Felker  wrote:
>>
>> The *context APIs are deprecated and I'm not sure they're worth
>> supporting with this. It would be a good excuse to get people to stop
>> using them.
>
> The gccgo library uses them, because there is no working alternative.

FWIW when this transition occurred, that's when the RTEMS port broke.
We don't have these methods.

It would be an interesting exercise to see if they could be
implemented in terms of our internal thread context management
APIs but no one has ever looked into it deeply.

> In general coroutine support requires the ability to designate some
> area of memory as stack space.
>
> Ian

--joel



[Bug c/71033] New: Segmentation fault c + intel assembly, unable to use EBX

2016-05-09 Thread formateu at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71033

Bug ID: 71033
   Summary: Segmentation fault c + intel assembly, unable to use
EBX
   Product: gcc
   Version: 6.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: formateu at gmail dot com
  Target Milestone: ---

Created attachment 38459
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38459&action=edit
the preprocessed file

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/6.1.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /build/gcc-multilib/src/gcc/configure --prefix=/usr
--libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/
--enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++ --enable-shared
--enable-threads=posix --enable-libmpx --with-system-zlib --with-isl
--enable-__cxa_atexit --disable-libunwind-exceptions --enable-clocale=gnu
--disable-libstdcxx-pch --disable-libssp --enable-gnu-unique-object
--enable-linker-build-id --enable-lto --enable-plugin
--enable-install-libiberty --with-linker-hash-style=gnu
--enable-gnu-indirect-function --enable-multilib --disable-werror
--enable-checking=release
Thread model: posix
gcc version 6.1.1 20160501 (GCC) 


The program calls an Intel-syntax x86 assembly function from main. Using the
EBX register inside that function causes a segmentation fault after the
function returns. It seems that gcc is using EBX instead of EBP before the
function call. The program works properly when compiled with clang.
The bug was first noticed with gcc 5.3.0, but is still present in the latest
repository version.

Used makefile:

CC=gcc
CFLAGS= -Wall -m32 -O0 -save-temps

all: main.o f.o
  $(CC) $(CFLAGS) main.o f.o -o fun

main.o: main.c
  $(CC) $(CFLAGS) -c main.c -o main.o

f.o: f.s
  nasm -f elf -g f.s -o f.o

command : make && ./fun 2

main.c :
#include "f.h" //only void f(char*)

int main(int argc, char *argv[])
{
  if(argc < 2) {
    return 1;
  }

  f(argv[1]);

  return 0;
}

f.s :
;f.i is not generated

  section .text
  global f
f:
  push ebp
  mov ebp, esp
  mov eax, [ebp+8]
  mov ebx, 9
begin:
  mov cl, [eax]
  cmp cl, 0 
  jz end
  add cl, 1
  mov [eax], cl
  inc eax
  jmp begin
end:
  mov esp, ebp
  pop ebp
  ret
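
[Editorial sketch] One thing worth checking in f.s above: in the System V i386 calling convention EBX is a callee-saved register, and f overwrites it without saving and restoring it. That is a plausible cause of the crash after return, though the report itself does not confirm it. For comparison, when the same kind of code is written as GCC extended asm inside C, listing the register as clobbered lets the compiler do the save and restore. The function name below is made up, and the asm path is only taken on x86 targets with a GNU-compatible compiler.

```c
/* Returns 9. On x86 targets the value is routed through EBX inside an asm
   statement; "ebx" in the clobber list tells the compiler that the register
   is overwritten, so it preserves the caller's value around the statement. */
int load_nine_via_ebx(void)
{
    int result = 9;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ volatile ("movl $9, %%ebx\n\t"
                      "movl %%ebx, %0"
                      : "=r" (result)   /* output: any register except EBX */
                      :                 /* no inputs */
                      : "ebx");         /* clobbered register */
#endif
    return result;
}
```

In a standalone assembly file there is no compiler to do this, so the equivalent step would be for the routine itself to push EBX on entry and pop it before returning.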

[Bug fortran/71014] associate statement inside omp parallel do appears to disable default private attribute for inner loop indices

2016-05-09 Thread klindsay at ucar dot edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71014

--- Comment #7 from Keith Lindsay  ---
The Linux system that I'm working on has multiple versions of gcc/gfortran
installed. I've compiled and run my example program with different versions and
have found the following:

Versions 4.9.0, 4.9.1, 4.9.2, 4.9.3, 5.0.1, 5.1.0
Program, as is, works (i.e. no reported mismatches over 10 runs)

Versions 5.2.0, 5.3.0, 6.0.1, 6.1.0
Program, as is, reports mismatches (and sometimes hangs)

So it seems like the behavior I'm seeing originated in 5.2.0.

Re: CppCoreGuidelines warnings

2016-05-09 Thread Jason Merrill
On Mon, May 9, 2016 at 5:18 AM, Jonathan Wakely  wrote:
> On 8 May 2016 at 02:10, Christopher Di Bella wrote:
>> Hi all,
>>
>> I've been tracking gcc-digest for a bit, but would like to be a little
>> more involved in the development of gcc.
>>
>> I haven't been able to find anything about the CppCoreGuidelines in
>> gcc -- I'm wondering if there's a warning system in the pipeline that
>> I might have missed in the digest thread? If so, great, who do I need
>> to contact about helping out?
>>
>> If not, I'd like to get a start on implementing a warning system for
>> them. I'll create a branch, but I doubt it'll be ready for gcc 7.1's
>> release.
>
> Hi, I don't think anyone is working on that yet.
>
> See https://gcc.gnu.org/contribute.html for some prerequisites to
> contributing significant changes to GCC.
>
> I don't know the status of the static analysis tool that Microsoft were
> planning to release, which would do a lot of the checking. To
> incorporate the checks into GCC would probably involve changes to both
> the C++ front-end and the C++ library, but I would welcome such
> changes.

As would I.  You can coordinate with me about front end changes.

Thanks,
Jason


Re: SafeStack proposal in GCC

2016-05-09 Thread Rich Felker
On Mon, May 09, 2016 at 09:02:33PM +0200, Michael Matz wrote:
> Hi,
> 
> On Sat, 7 May 2016, Rich Felker wrote:
> 
> > > > * sigaltstack and swapcontext are broken too.
> > > 
> > > We have a prototype that supports swapcontext that we're happy to 
> > > release, but it clearly requires more work before being ready to merge 
> > > upstream.
> > 
> > The *context APIs are deprecated and I'm not sure they're worth 
> > supporting with this. It would be a good excuse to get people to stop 
> > using them.
> 
> How?  POSIX decided to remove the facilities without any adequate 
> replacement (threads aren't).

Threads work just as well as the ucontext api for coroutines. Due to
the requirement to save/restore signal masks, the latter requires a
syscall, making it no faster than a voluntary context switch via
futex syscall.

Most of the other hacks people used the ucontext API for were complete
hacks with undefined behavior, anyway.

BTW it's not even possible to implement makecontext on most targets
due to the wacky variadic calling convention it uses -- in most ABIs,
there's simply no way to shift the variadic args into the right slots
for calling the start function for the new context without knowing
their types, and the implementation has no way to know the types. So
it's really an unusably broken API.

Rich


[Bug fortran/71032] New: explicit interface and must not have attributes generates gfortran: internal compiler error: Abort trap: 6 (program f951)

2016-05-09 Thread kendrick.killian at colostate dot edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71032

Bug ID: 71032
   Summary: explicit interface and must not have attributes
generates gfortran: internal compiler error: Abort
trap: 6 (program f951)
   Product: gcc
   Version: 5.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kendrick.killian at colostate dot edu
  Target Milestone: ---

Created attachment 38458
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38458&action=edit
source file that generates f951 Segmentation fault

I had a routine that typed a function and declared it as external. I then added
the function as a "contains" routine, an obvious over-specification. The compiler
issued the error message "Procedure ... has an explicit interface and must not
have attributes declared", followed by the internal compiler errors shown in the
compiler output below.

GCC VERSION:
gfortran -v
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/usr/local/gfortran/libexec/gcc/x86_64-apple-darwin14/5.2.0/lto-wrapper
Target: x86_64-apple-darwin14
Configured with: ../gcc-5.2.0/configure --prefix=/usr/local/gfortran
--with-gmp=/Users/fx/devel/gcc/deps-static/x86_64
--enable-languages=c,c++,fortran,objc,obj-c++ --build=x86_64-apple-darwin14
Thread model: posix
gcc version 5.2.0 (GCC) 

system type:
  Model Name:   MacBook Pro
  Model Identifier: MacBookPro8,2
  Processor Name:   Intel Core i7
  System Version:   OS X 10.11.4 (15E65)
  Kernel Version:   Darwin 15.4.0
  Developer Tools:
Version:7.3.1 (7D1014)
Location:   /Applications/Xcode.app

Compile Command:
gfortran -O3  -fno-underscoring -Wunused -Waliasing -Wampersand -Wsurprising
-Wno-tabs  -c internalerr.f90


compiler output:
catanf.f:5.28:
Included at internalerr.f90:40:

  real function carctanf(x,a,b,c,d)
1
internalerr.f90:25.39:

  real agdrat, bgdrat, carctanf
   2
Error: Procedure 'carctanf' at (1) has an explicit interface and must not have
attributes declared at (2)
f951: internal compiler error: Segmentation fault: 11

f951: internal compiler error: Abort trap: 6
gfortran: internal compiler error: Abort trap: 6 (program f951)
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.


Files:
Sorry, I don't know where the preprocessed file (*.i*) is.
The short attached source file generates the error.

[Bug fortran/71014] associate statement inside omp parallel do appears to disable default private attribute for inner loop indices

2016-05-09 Thread klindsay at ucar dot edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71014

--- Comment #6 from Keith Lindsay  ---
Harald,

The problem does go away if I add a PRIVATE(i) clause to the OMP directive.

However, my understanding of OpenMP in Fortran is that all loop iteration
variables, even inner nested loops, in an OpenMP PARALLEL DO construct (and
some others) are private by default. I.e., they do not need to be declared
private in the OMP directive. (I think this specification is different than the
specification for inner loops in C.)

Indeed, if I comment out the associate construct, the problem goes away. So I'm
inferring that the associate construct is interfering with the inner loop index
being assigned the private attribute.

Keith
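
[Editorial sketch] Keith's aside about C can be made concrete. As far as I read the OpenMP rules, in C only the iteration variable of the loop associated with the directive is made private automatically; an inner loop index declared outside the construct stays shared unless it is listed in a private clause. The sketch below assumes that reading; grid_sum is a made-up name and the example is unrelated to the Fortran test case in the bug.

```c
/* Sums i*j over an n-by-n grid. `j` is declared outside the loops, so in C
   it would be shared among threads unless it is named in private(). */
long grid_sum(int n)
{
    long total = 0;
    int i, j;
#pragma omp parallel for private(j) reduction(+:total)
    for (i = 0; i < n; ++i)
        for (j = 0; j < n; ++j)
            total += (long)i * j;
    return total;
}
```

Built with -fopenmp the outer loop is divided among threads; without it the pragma is ignored and the loop runs serially, with the same result (grid_sum(4) is 36). This is the C counterpart of the PRIVATE(i) workaround mentioned above, which Fortran should not need.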

[Bug c/71030] Strange segmentation fault

2016-05-09 Thread formateu at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71030

Mateusz Forc  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |INVALID

Re: SafeStack proposal in GCC

2016-05-09 Thread Ian Lance Taylor
On Fri, May 6, 2016 at 10:42 PM, Rich Felker  wrote:
>
> The *context APIs are deprecated and I'm not sure they're worth
> supporting with this. It would be a good excuse to get people to stop
> using them.

The gccgo library uses them, because there is no working alternative.

In general coroutine support requires the ability to designate some
area of memory as stack space.

Ian


[Bug tree-optimization/71031] New: [7 Regression] ICE in extract_range_from_binary_expr_1, at tree-vrp.c:2535 w/ -Os

2016-05-09 Thread asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71031

Bug ID: 71031
   Summary: [7 Regression] ICE in
extract_range_from_binary_expr_1, at tree-vrp.c:2535
w/ -Os
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: asolokha at gmx dot com
  Target Milestone: ---

gcc-7.0.0-alpha20160508 snapshot ICEs when compiling the following reduced
testcase w/ -Os:

int zj;
int **yr;

void
nn (void)
{
  unsigned int od = 4;

  for (;;)
{
  int lk;

  for (lk = 0; lk < 2; ++lk)
{
  static int cm;

  zj = 0;
  if (od == 0)
return;
  ++od;
  for (cm = 0; cm < 2; ++cm)
{
  --od;
  **yr = 0;
}
}
}
}

% gcc-7.0.0-alpha20160508 -c -Os z5y81wfl.c
z5y81wfl.c: In function 'nn':
z5y81wfl.c:5:1: internal compiler error: in extract_range_from_binary_expr_1,
at tree-vrp.c:2535
 nn (void)
 ^~

[Bug c/71030] Strange segmentation fault

2016-05-09 Thread formateu at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71030

--- Comment #2 from Mateusz Forc  ---
(In reply to H.J. Lu from comment #1)
> Please provide f.i.

f.i is not generated by -save-temps; how am I supposed to get this file?

[Bug c/71013] [7 Regression] c-common.c:12810:37: error: 'LLONG_MAX' was not declared in this scope

2016-05-09 Thread dave.anglin at bell dot net
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71013

--- Comment #3 from dave.anglin at bell dot net ---
On 2016-05-09 7:29 AM, John David Anglin wrote
> LLONG_MAX is not defined in hpux11.11.  It comes from fixed limits.h:
> ./lib/gcc/hppa64-hp-hpux11.11/5.3.1/include-fixed/limits.h:# undef LLONG_MIN
> ./lib/gcc/hppa64-hp-hpux11.11/5.3.1/include-fixed/limits.h:# define LLONG_MIN 
> (-LLONG_MAX - 1LL)
>
Should have provided more details:

#if defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
/* Minimum and maximum values a `signed long long int' can hold. */
# undef LLONG_MIN
# define LLONG_MIN (-LLONG_MAX - 1LL)
# undef LLONG_MAX
# define LLONG_MAX __LONG_LONG_MAX__

/* Maximum value an `unsigned long long int' can hold.  (Minimum is 0).  */
# undef ULLONG_MAX
# define ULLONG_MAX (LLONG_MAX * 2ULL + 1ULL)
#endif

__STDC_VERSION__ is not defined when c-family/c-common.c is compiled with the
g++ driver.  It is defined when the gcc driver is used:

   else if (CPP_OPTION (pfile, lang) == CLK_STDC94)
     _cpp_define_builtin (pfile, "__STDC_VERSION__ 199409L");
   else if (CPP_OPTION (pfile, lang) == CLK_STDC11
            || CPP_OPTION (pfile, lang) == CLK_GNUC11)
     _cpp_define_builtin (pfile, "__STDC_VERSION__ 201112L");
   else if (CPP_OPTION (pfile, c99))
     _cpp_define_builtin (pfile, "__STDC_VERSION__ 199901L");

glimits.h has the same issue.
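
[Editorial sketch] The failure mode described above is that the fixed limits.h only defines LLONG_MAX under a __STDC_VERSION__ guard, which is not satisfied when the file is compiled as C++. One way to guard against that in client code is a fallback based on the compiler's predefined __LONG_LONG_MAX__ macro. This is only an illustration of the idea, assuming a GCC-compatible compiler; it is not the fix that was applied to c-common.c, and llong_limits_ok is a made-up helper.

```c
#include <limits.h>

/* Fallback for hosts whose <limits.h> hides LLONG_MAX/LLONG_MIN behind a
   __STDC_VERSION__ >= 199901L check. __LONG_LONG_MAX__ is predefined by
   GCC-compatible compilers regardless of the language standard in use. */
#ifndef LLONG_MAX
# define LLONG_MAX __LONG_LONG_MAX__
#endif
#ifndef LLONG_MIN
# define LLONG_MIN (-LLONG_MAX - 1LL)
#endif

/* Returns 1 when the (possibly fallback) limits are consistent. */
int llong_limits_ok(void)
{
    return LLONG_MAX == __LONG_LONG_MAX__
           && LLONG_MIN < 0
           && LLONG_MIN + LLONG_MAX == -1;
}
```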

[Bug c++/71029] large fold expressions compile slowly with -Wall

2016-05-09 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71029

Marc Glisse  changed:

   What|Removed |Added

   Keywords||compile-time-hog
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-05-09
 Ever confirmed|0   |1

--- Comment #1 from Marc Glisse  ---
Could be my profiling going wrong, but we seem to spend a lot of time in
mark_used, with a running time at least quadratic in 2048, so maybe we mark as
used all the previous elements of the comma chain, for each element?

Clang does not emit the thousands of get<1234> functions. I am not sure we want
to follow them there, as it would make debugging harder.

Re: [RS6000] complex long double ABI_V4 fix

2016-05-09 Thread Michael Meissner
On Fri, May 06, 2016 at 03:54:43PM +0930, Alan Modra wrote:
> Revision 235792 regressed compat/scalar-by-value-6 for powerpc-linux
> -m32 due to accidentally changing the ABI.  By another historical
> accident, complex long double is stupidly passed in gprs for -m32.

Sorry about the breakage.  Thanks for digging into it.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



[Bug c/71030] Strange segmentation fault

2016-05-09 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71030

H.J. Lu  changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
   Last reconfirmed||2016-05-09
 CC||hjl.tools at gmail dot com
 Ever confirmed|0   |1

--- Comment #1 from H.J. Lu  ---
Please provide f.i.

Re: SafeStack proposal in GCC

2016-05-09 Thread Michael Matz
Hi,

On Sat, 7 May 2016, Rich Felker wrote:

> > > * sigaltstack and swapcontext are broken too.
> > 
> > We have a prototype that supports swapcontext that we're happy to 
> > release, but it clearly requires more work before being ready to merge 
> > upstream.
> 
> The *context APIs are deprecated and I'm not sure they're worth 
> supporting with this. It would be a good excuse to get people to stop 
> using them.

How?  POSIX decided to remove the facilities without any adequate 
replacement (threads aren't).


Ciao,
Michael.


[Bug c++/70796] [DR 1030] Initialization order with braced-init-lists still broken

2016-05-09 Thread rs2740 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70796

--- Comment #2 from TC  ---
It occurred to me that one issue here is whether initialization of the
parameter object (of the constructor) is considered a "value computation [or]
side effect associated with" an initializer-clause. If not, then the current
behavior is correct - the increments are sequenced relative to each other but
not to the initialization of the parameter objects (which reads from 'i').

Re: [PATCH,rs6000] Add built-in support for new Power9 darn (deliver a random number) instruction

2016-05-09 Thread Peter Bergner
On Mon, 2016-05-09 at 12:35 -0500, Bill Schmidt wrote:
> On Mon, 2016-05-09 at 08:58 -0500, Segher Boessenkool wrote:
> > On Thu, May 05, 2016 at 10:26:01AM -0600, Kelvin Nilsen wrote:


> > Do we really want to #define short words like "darn"?  If this is already
> > set in stone, so be it.
> 
> I don't think we do, and in any case altivec.h would not be the place to
> do it.  darn is not a vector instruction.
> 
> For these, just having __builtin_darn* be the available interfaces will
> be fine.
> 

Agreed, I don't think we need fancy short names for this builtin,
which will be infrequently used.  The __builtin_darn name is enough.


Peter




[Bug c/71030] New: Strange segmentation fault

2016-05-09 Thread formateu at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71030

Bug ID: 71030
   Summary: Strange segmentation fault
   Product: gcc
   Version: 6.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: formateu at gmail dot com
  Target Milestone: ---

Created attachment 38457
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38457&action=edit
preprocessed file

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/6.1.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /build/gcc-multilib/src/gcc/configure --prefix=/usr
--libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/
--enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++ --enable-shared
--enable-threads=posix --enable-libmpx --with-system-zlib --with-isl
--enable-__cxa_atexit --disable-libunwind-exceptions --enable-clocale=gnu
--disable-libstdcxx-pch --disable-libssp --enable-gnu-unique-object
--enable-linker-build-id --enable-lto --enable-plugin
--enable-install-libiberty --with-linker-hash-style=gnu
--enable-gnu-indirect-function --enable-multilib --disable-werror
--enable-checking=release
Thread model: posix
gcc version 6.1.1 20160501 (GCC) 

Used makefile 
CC=gcc
CFLAGS= -Wall -m32 

all: main.o f.o
  $(CC) $(CFLAGS) main.o f.o -o fun

main.o: main.c 
  $(CC) $(CFLAGS) -c main.c -o main.o
f.o: f.s   
  nasm -f elf -g f.s -o f.o

make && ./fun 2 2

The program calls an Intel-syntax x86 assembly function from main. Using the
EBX register inside that function causes a segmentation fault after the
function returns. It seems that gcc is using EBX instead of EBP before the
function call. The program works properly when compiled with clang.
The bug was first noticed with gcc 5.3.0, but is still present in the latest
repository version.

main.c file : 

#include "f.h" // only void f(int); + guardian

int main()
{
  int var = 4;
  f(var);
  return 0;
}

f.s file :

  section .text
  global f
f:
  push ebp
  mov ebp, esp
  mov eax, [ebp+8]
  mov ebx, 0
begin:
  mov cl, [eax]
  mov ebx, 0
  add cl, 1
  mov [eax], cl
  mov esp, ebp
  pop ebp
  ret

Re: [PATCH], Add PowerPC ISA 3.0 min/max support

2016-05-09 Thread Michael Meissner
On Mon, May 09, 2016 at 09:31:43AM -0500, Segher Boessenkool wrote:
> On Thu, May 05, 2016 at 03:18:39PM -0400, Michael Meissner wrote:
> > At the present time, the code does not support comparisons involving >=
> > and <= unless the -ffast-math option is used. I hope eventually to support
> > generating these instructions without having -ffast-math used.
> > 
> > The underlying reason is when fast math is not used, we change the condition
> > from:
> > 
> > (ge:SI (reg:CCFP ) (const_int 0))
> > 
> > to:
> > 
> > (ior:SI (gt:SI (reg:CCFP ) (const_int 0))
> > (eq:SI (reg:CCFP ) (const_int 0)))
> > 
> > The machine independent portion of the compiler does not recognize this when
> > trying to generate conditional moves.
> > 
> > I would imagine the 'fix' is to generate GE/LE all of the time, and then
> > have a splitter that converts it to IOR of GT/EQ if it is not a conditional
> > move with ISA 3.0 instructions.
> 
> That sounds like a plan :-)

Sure, but it is currently lower priority than all of the other things that
I'm doing.  Patches by other people are welcome.
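
[Editorial sketch] The quoted rewrite of GE into IOR of GT and EQ can be checked at the C level. The thread does not spell out why the rewrite is needed; my reading, stated here as an assumption, is that with NaNs in play GE cannot simply be tested as "not less than", because an unordered comparison is neither less, greater, nor equal. The helper names below are made up for the illustration.

```c
#include <math.h>
#include <stddef.h>

/* The rewritten form of GE: greater than, or equal. */
static int gt_or_eq(double a, double b) { return (a > b) || (a == b); }

/* A tempting shortcut that is wrong for NaN: GE as "not less than". */
static int not_lt(double a, double b) { return !(a < b); }

/* Counts the sample pairs on which `candidate` disagrees with a >= b. */
static int count_mismatches(int (*candidate)(double, double))
{
    const double v[] = { -1.0, 0.0, 1.0, INFINITY, NAN };
    const size_t n = sizeof v / sizeof v[0];
    int mismatches = 0;
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j)
            if ((v[i] >= v[j]) != candidate(v[i], v[j]))
                ++mismatches;
    return mismatches;
}

int gt_or_eq_mismatches(void) { return count_mismatches(gt_or_eq); }
int not_lt_mismatches(void)   { return count_mismatches(not_lt); }
```

Compiled without -ffast-math, gt_or_eq_mismatches() should be 0, while not_lt_mismatches() is nonzero because every pair involving NaN disagrees. With -ffast-math the compiler may assume NaNs never occur, which is consistent with the plain GE form being usable only in that mode.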

> > -;; Return true if operand is MIN or MAX operator.
> > +;; Return true if operand is MIN or MAX operator.  Since this is only used to
> > +;; convert floating point MIN/MAX operations into FSEL on pre-vsx systems,
> > +;; don't include UMIN or UMAX.
> >  (define_predicate "min_max_operator"
> > -  (match_code "smin,smax,umin,umax"))
> > +  (match_code "smin,smax"))
> 
> Please name it signed_min_max_operator instead?
> 
> > --- gcc/config/rs6000/rs6000.c  
> > (.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)
> > (revision 235831)
> > +++ gcc/config/rs6000/rs6000.c  (.../gcc/config/rs6000) (working copy)
> > @@ -20534,6 +20534,12 @@ print_operand (FILE *file, rtx x, int co
> > "local dynamic TLS references");
> >return;
> >  
> > +case '@':
> > +  /* If -mpower9-minmax, use xsmaxcpdp instead of xsmaxdp.  */
> > +  if (TARGET_P9_MINMAX)
> > +   putc ('c', file);
> > +  return;
> 
> I don't think @ is very mnemonic, nor is this special enough for such
> a nice letter.

I don't care what punctuation letter is used, but it needs to be one. What do
you prefer?

> 
> From looking at how it is used, it seems you can make it part of code_attr
> minmax (and give that a better name, minmax_fp or such)?

It is used to distinguish between generating

    XSMAXDP  on power7 with -ffast-math
and XSMAXCDP on power9 with/without -ffast-math

I would prefer not to have to change this to C code, hence the use of a
punctuation print operand code. But if you insist, I can just do it with if's.

> > +  rs6000_emit_minmax (dest, (max_p) ? SMAX : SMIN, op0, op1);
> 
> Superfluous parentheses.
> 
> > +rs6000_emit_power9_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
> 
> Maybe put some "fp" in the name?  For "minmax" as well.

Sigh, ok.

> > +  if (swap_p)
> > +compare_rtx = gen_rtx_fmt_ee (code, CCFPmode, op1, op0);
> > +  else
> > +compare_rtx = gen_rtx_fmt_ee (code, CCFPmode, op0, op1);
> 
> if (swap_p)
>   std::swap (op0, op1);

I'll look into it.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



[Bug middle-end/71028] [7 regression] ICE in redirect_jump, at jump.c:1560

2016-05-09 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71028

Segher Boessenkool  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2016-05-09
   Assignee|unassigned at gcc dot gnu.org  |segher at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Segher Boessenkool  ---
Confirmed (with arm-linux-gnueabi, -O2 to reproduce).  Mine.

Re: [PATCH], Add PowerPC ISA 3.0 vector d-form addressing

2016-05-09 Thread Michael Meissner
On Mon, May 09, 2016 at 08:11:54AM -0500, Segher Boessenkool wrote:
> Hi Mike,
> 
> On Thu, May 05, 2016 at 02:05:19PM -0400, Michael Meissner wrote:
> > > > With this patch, I enable -mlra if the user did not specify either
> > > > -mlra or -mno-lra on the command line, and -mcpu=power9 or
> > > > -mpower9-dform-vector were used. I also enabled -mvsx-timode if LRA was
> > > > used, which also is a RELOAD issue, that works with LRA.
> > > 
> > > I don't like enabling LRA if the user didn't ask for it; it is a bit too
> > > surprising.  What do you do if there is -mno-lra explicitly?  You can just
> > > do the same if no-lra is implicit?
> > 
> > Ok.
> 
> You didn't however change this afaics?

The patch kept reload as the default. You have to explicitly enable LRA to get
vector d-forms by default with -mcpu=power9.

> 
> > > > --- gcc/config/rs6000/rs6000.opt (.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000) (revision 235831)
> > > > +++ gcc/config/rs6000/rs6000.opt (.../gcc/config/rs6000) (working copy)
> > > > @@ -470,8 +470,8 @@ Target RejectNegative Joined UInteger Va
> > > >  -mlong-double-  Specify size of long double (64 or 128 bits).
> > > >  
> > > >  mlra
> > > > -Target Report Var(rs6000_lra_flag) Init(0) Save
> > > > -Use LRA instead of reload.
> > > > +Target Undocumented Mask(LRA) Var(rs6000_isa_flags)
> > > > +Use the LRA register allocator instead of the reload register allocator.
> > > 
> > > It wasn't "undocumented" before?  Why the change to a mask bit btw?
> > 
> > It was always meant to be undocumented, but I changed to be similar to
> > before. I am trying to change all of the random switches that set a word
> > to be an option mask, so I made that part of the change in these next
> > patches.
> 
> I agree it should be undocumented because hopefully one day all reload
> will not exist at all anymore.  OTOH, all other archs with an -mlra
> switch do not have it hidden, so we might as well follow suit there.

I will write up some documentation.

> > I did remove setting it for -mcpu=power9.
> 
> It doesn't look like it?  Please check.
> 
> > @@ -94,6 +95,7 @@
> >  | OPTION_MASK_FPRND\
> >  | OPTION_MASK_HTM  \
> >  | OPTION_MASK_ISEL \
> > +| OPTION_MASK_LRA  \
> >  | OPTION_MASK_MFCRF\
> >  | OPTION_MASK_MFPGPR   \
> >  | OPTION_MASK_MODULO   \

That is POWERPC_MASKS, which is the mask of all option bits that COULD be set
by -mcpu= options. But none of the -mcpu= options set it any more (the
previous patch did set it in ISA_3_0_MASKS_SERVER, but it doesn't do it any
more). In retrospect, when I created the option masks, I should have used 2
separate words, one for things like -m32 that can never be changed, and the
other for all of the normal bits, and not need POWERPC_MASKS. But in general, I
always add things to POWERPC_MASKS unless it explicitly should not be.
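
To make the role of a mask like POWERPC_MASKS concrete, here is a minimal C++
sketch of how a "bits that -mcpu= may set" mask can be applied. It is not the
rs6000 back end's code: the bit values and the names MASK_*, CPU_SETTABLE,
CPU_POWER9 and apply_cpu are hypothetical and chosen only for illustration.

```cpp
#include <cstdint>

// Hypothetical option bits; the real OPTION_MASK_* values are generated
// by GCC's option machinery and are not reproduced here.
constexpr std::uint64_t MASK_ISEL   = 1u << 0;
constexpr std::uint64_t MASK_LRA    = 1u << 1;
constexpr std::uint64_t MASK_MODULO = 1u << 2;
constexpr std::uint64_t MASK_64BIT  = 1u << 3;  // stands in for a bit that -mcpu= must never touch

// Plays the role of POWERPC_MASKS: every bit a -mcpu= option COULD set.
constexpr std::uint64_t CPU_SETTABLE = MASK_ISEL | MASK_LRA | MASK_MODULO;

// Bits a hypothetical CPU entry enables; LRA is settable but not enabled.
constexpr std::uint64_t CPU_POWER9 = MASK_ISEL | MASK_MODULO;

// Apply a CPU's defaults without disturbing bits the user set explicitly
// or bits outside the settable mask.
std::uint64_t apply_cpu(std::uint64_t flags, std::uint64_t explicit_bits,
                        std::uint64_t cpu_bits)
{
    const std::uint64_t settable = CPU_SETTABLE & ~explicit_bits;
    return (flags & ~settable) | (cpu_bits & settable);
}
```

With these assumptions, apply_cpu(MASK_64BIT | MASK_LRA, MASK_LRA, CPU_POWER9)
keeps the explicitly requested LRA bit, while passing 0 for explicit_bits
lets the CPU defaults clear it. Being in the settable mask therefore only
means a CPU entry is allowed to control the bit, not that any entry sets it.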

> > > > +mpower9-dform-scalar
> > > > +Target Report Mask(P9_DFORM_SCALAR) Var(rs6000_isa_flags)
> > > > +Use/do not use scalar register+offset memory instructions added in ISA 
> > > > 3.0.
> > > > +
> > > > +mpower9-dform-vector
> > > > +Target Report Mask(P9_DFORM_VECTOR) Var(rs6000_isa_flags)
> > > > +Use/do not use vector register+offset memory instructions added in ISA 
> > > > 3.0.
> > > > +
> > > >  mpower9-dform
> > > > -Target Undocumented Mask(P9_DFORM) Var(rs6000_isa_flags)
> > > > -Use/do not use vector and scalar instructions added in ISA 3.0.
> > > > +Target Report Var(TARGET_P9_DFORM_BOTH) Init(-1) Save
> > > > +Use/do not use register+offset memory instructions added in ISA 3.0.
> > > 
> > > These should probably all be undocumented, though (they're not something
> > > users should use).
> > 
> > I will make -mpower9-dform public (which I thought it was, but evidently I
> > missed adding the documentation for GCC 6), but I will make the -scalar and
> > -vector forms private.
> 
> You think this is something users are expected to twiddle?  Okay then.

I don't expect users to normally twiddle it, but there are times, in bug
fixing and/or extreme benchmarking, when they do.

> > [gcc]
> > 2016-05-05  Michael Meissner  
> > 
> > * config/rs6000/rs6000-cpus.def (ISA_3_0_MASKS_SERVER): Use
> > -mpower9-dform-scalar instead of -mpower9-dform. Add note not to
> > include -mpower9-dform-vector until we switch over to LRA.
> 
> Thanks for the better changelog, much appreciated.  Two spaces after
> a full stop though.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 

[Bug fortran/71027] -fsanitize=address catches out of bounds access on assumed size array only with -O0

2016-05-09 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71027

Dominique d'Humieres  changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
   Last reconfirmed||2016-05-09
 Ever confirmed|0   |1

--- Comment #1 from Dominique d'Humieres  ---
IMO your expectation is invalid:

[Book15] f90/bug% gfc pr71027.f90 -Og -fdump-tree-optimized
[Book15] f90/bug% cat pr71027.f90.211t.optimized 

;; Function main (main, funcdef_no=2, decl_uid=3429, cgraph_uid=2,
symbol_order=2) (executed once)

__attribute__((externally_visible))
main (integer(kind=4) argc, character(kind=1) * * argv)
{
  static integer(kind=4) options.0[9] = {68, 1023, 0, 0, 1, 1, 0, 0, 31};

  <bb 2>:
  _gfortran_set_args (argc_2(D), argv_3(D));
  _gfortran_set_options (9, &options.0[0]);
  return 0;

}

i.e., the subroutine sub is optimized away. If you do something such as

  subroutine sub(vv)
  dimension vv(*)
  x=vv(20) ! out of bounds access
  vv(1)=x
  end subroutine

then -fsanitize=address catches the invalid address.

[Bug c++/71029] New: large fold expressions compile slowly with -Wall

2016-05-09 Thread alisdairm at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71029

Bug ID: 71029
   Summary: large fold expressions compile slowly with -Wall
   Product: gcc
   Version: 6.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: alisdairm at me dot com
  Target Milestone: ---

Created attachment 38456
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38456&action=edit
file is very slow compiled with Wall

The attached file uses a simple fold expression over a comma operator to
consume 'get' on each element of a std::array, so essentially
2047 comma operators separating empty 'sink' function calls.

On my current machine, with just g++ -std=c++1z file.cpp, this takes
around 2.5 seconds to compile.  With g++ -std=c++1z -Wall file.cpp it takes 25
seconds, and still reports no warnings.

The generated code is also quite large, regardless of warning setting,
generating an executable of around 650k.  This is less than 5k when compiled
-O3.

Running the same scenario through Clang, the unoptimized code produces a 90k
executable, and consistently gives a 1.5 second compile time, regardless of
warning level.

I suspect that Clang is getting in an early optimization around the do-nothing
function to eliminate a lot of code prior to running warning passes, but that
is entirely speculation as I have no idea how either compiler is implemented ;)


For the curious, compiling with the latest gcc 7 branch available through
MacPorts, this takes around 60% longer again although I am not sure if that
compiler is built with the same optimization settings as the released gcc 6.1
compiler.
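
For readers without access to the attachment, the construct described above
can be sketched roughly as follows. This is not the attached file: the names
sink_calls, sink, consume_impl and consume are hypothetical, and sink counts
its calls (instead of being empty) only so that the effect is observable.

```cpp
#include <array>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for the report's do-nothing 'sink' function;
// it counts calls so the fold's effect can be checked.
int sink_calls = 0;

template <typename T>
void sink(const T&) { ++sink_calls; }

template <typename T, std::size_t N, std::size_t... I>
void consume_impl(const std::array<T, N>& a, std::index_sequence<I...>)
{
    // Unary right fold over the comma operator (C++17): expands to
    // sink(get<0>(a)), (sink(get<1>(a)), (... , sink(get<N-1>(a))))
    (sink(std::get<I>(a)), ...);
}

template <typename T, std::size_t N>
void consume(const std::array<T, N>& a)
{
    consume_impl(a, std::make_index_sequence<N>{});
}
```

Calling consume on a std::array with 2048 elements yields the 2047 comma
operators mentioned above; a small array is enough to see the shape of the
expansion.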

[Bug libstdc++/66146] call_once not C++11-compliant on ppc64le

2016-05-09 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66146

--- Comment #18 from Jonathan Wakely  ---
*** Bug 71025 has been marked as a duplicate of this bug. ***

[Bug libstdc++/71025] std::call_once aborts instead of propagating an exception (AIX 6.1)

2016-05-09 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71025

Jonathan Wakely  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #2 from Jonathan Wakely  ---
Yes this is PR 66146

*** This bug has been marked as a duplicate of bug 66146 ***

[Bug fortran/71014] associate statement inside omp parallel do appears to disable default private attribute for inner loop indices

2016-05-09 Thread anlauf at gmx dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71014

Harald Anlauf  changed:

   What|Removed |Added

 CC||anlauf at gmx dot de

--- Comment #5 from Harald Anlauf  ---
(In reply to Keith Lindsay from comment #4)
> Thanks for taking a look. I've attached the output from the command
> gfortran -v -fopenmp openmp_nested_loops.f90 -o openmp_nested_loops
> on two different systems where I'm seeing the problem.

Do you still see the problem when you replace the line

!$OMP PARALLEL DO

by

!$OMP PARALLEL DO private(i,j) shared(s) default(none)

?

What you see might be a race condition, since you did not declare the
inner loop variable i as private.
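
As an illustration of what default(none) buys, here is a C++ analogue of
such a nested loop. It is a sketch under assumptions, not the reporter's
Fortran code: the names Rows, Cols and row_sums are hypothetical, and the
default data-sharing rules for loop indices differ between Fortran and C++.
In C++ an inner loop index declared outside the parallel region is shared
unless listed as private, and default(none) makes the compiler reject any
variable whose sharing was not stated.

```cpp
// Enumerators are constants, not variables, so default(none) does not
// require them to appear in a data-sharing clause.
enum { Rows = 8, Cols = 1000 };

void row_sums(long* s)
{
    int i, j;  // declared outside the loops, as Fortran loop indices would be

    // j is the index of the associated loop and is private automatically;
    // i must be listed, otherwise default(none) reports it at compile time.
#ifdef _OPENMP
#pragma omp parallel for private(i) shared(s) default(none)
#endif
    for (j = 0; j < Rows; ++j) {
        long acc = 0;  // declared inside the region, hence private
        for (i = 0; i < Cols; ++i)
            acc += i + j;
        s[j] = acc;
    }
}
```

Built without -fopenmp the pragma is skipped and the loop runs serially, so
the results are the same either way; only the missing private(i) would make
the parallel build racy.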

[Bug middle-end/71028] New: [7 regression] ICE in redirect_jump, at jump.c:1560

2016-05-09 Thread vp at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71028

Bug ID: 71028
   Summary: [7 regression] ICE in redirect_jump, at jump.c:1560
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: major
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vp at gcc dot gnu.org
  Target Milestone: ---

gcc.c-torture/compile/pr37483.c and
gcc.dg/20010822-1.c

FAILs on arm-none-eabi with:

...: internal compiler error: in redirect_jump, at jump.c:1560
 }
 ^
0x98314d redirect_jump(rtx_jump_insn*, rtx_def*, int)
../../gcc/gcc/jump.c:1560
0x10b8276 try_optimize_cfg
../../gcc/gcc/cfgcleanup.c:2900
0x10b8276 cleanup_cfg(int)
../../gcc/gcc/cfgcleanup.c:3150
0x10b9646 execute
../../gcc/gcc/cfgcleanup.c:3279


The following commit seems to cause it:
https://gcc.gnu.org/viewcvs?rev=235904&root=gcc&view=rev

[Bug lto/70955] [6/7 Regression] Wrong code generation for __builtin_ms_va_list with -flto

2016-05-09 Thread ssbssa at yahoo dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70955

--- Comment #11 from Domani Hannes  ---
I can confirm that this patch works for windows as well.

[Bug fortran/71027] New: -fsanitize=address catches out of bounds access on assumed size array only with -O0

2016-05-09 Thread zeccav at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71027

Bug ID: 71027
   Summary: -fsanitize=address catches out of bounds access on
assumed size array only with -O0
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zeccav at gmail dot com
  Target Milestone: ---

! -fsanitize=address -O0 catches out of bounds access on assumed size array
! any other optimization level, even -Og, inhibits catching
  dimension v(10)
  call sub(v)
  contains
  subroutine sub(vv)
  dimension vv(*)
  x=vv(20) ! out of bounds access
  end subroutine
  end

Re: (R5900) Implementing Vector Support

2016-05-09 Thread Richard Henderson

On 05/06/2016 09:28 PM, Woon yung Liu wrote:

Regarding multiplication of vectors, is there a way to work with a 
multiplication operation that results in something like this (the result is 
spread across these 3 registers), without re-ordering any elements:

RD: A6xB6, A4xB4, A2xB2, A0xB0

LO: A7xB7, A6xB6, A3xB3, A2xB2
HI: A5xB5, A4xB4, A1xB1, A0xB0

A0-A7 and B0-B7 are the 8 elements of two V8HI vectors, which are multiplied 
together to produce a widened multiplication result.

It looks like the vector hi/lo multiplication pattern would work with the 
values in HI and LO, but the elements don't seem to be in the order that 
GCC expects.

Assuming that it is possible to put this pattern to use, does GCC allow the 
vec_widen_smult_hi and
vec_widen_smult_lo patterns to be combined together? Like for the divmod 
(division + modulus) patterns.
The instruction described above (PMULTH) will result in calculation of both the 
hi and lo parts of the result, in one instruction. Hence combining the two 
patterns would be more efficient.


You can use this if you reshuffle the results.

Since it appears that PMULTH naturally produces even results in RD, it would 
seem to make the most sense to attempt to construct the odd results from LO+HI. 
However, I don't see anything in the TX79 ISA that's particularly helpful there.


That said,

pmulth  r0, x, y
pmflo   t1
pmfhi   t2
pcpyld  r1, t1, t2
pcpyud  r2, t2, t1

would appear to produce the results gcc expects for the hi/lo multiples.

Don't worry overmuch about initially generating two copies of the pmulth 
instruction.  We have a similar problem with the ia64 patterns.  Rely on the 
rtl CSE pass to remove the duplicate instructions.



r~
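
For reference, the element orders under discussion can be written down as a
scalar C++ model. This is only a sketch: it does not model PMULTH, pcpyld or
pcpyud, the names V8HI, V4SI and widen_mult_* are hypothetical helpers, and
it assumes a lane numbering in which "lo" means elements 0..3 and "hi" means
elements 4..7 (which half counts as high depends on the target's endianness).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V8HI = std::array<std::int16_t, 8>;  // eight 16-bit elements
using V4SI = std::array<std::int32_t, 4>;  // four 32-bit products

// Widened products of the elements first, first+step, first+2*step, ...
static V4SI widen_mult(const V8HI& a, const V8HI& b,
                       std::size_t first, std::size_t step)
{
    V4SI r{};
    for (std::size_t k = 0; k < 4; ++k) {
        const std::size_t idx = first + k * step;
        r[k] = static_cast<std::int32_t>(a[idx]) *
               static_cast<std::int32_t>(b[idx]);
    }
    return r;
}

// Contiguous halves, as a hi/lo widening multiply pair would deliver them.
V4SI widen_mult_lo(const V8HI& a, const V8HI& b) { return widen_mult(a, b, 0, 1); }
V4SI widen_mult_hi(const V8HI& a, const V8HI& b) { return widen_mult(a, b, 4, 1); }

// Even-numbered elements only, the order described for RD above.
V4SI widen_mult_even(const V8HI& a, const V8HI& b) { return widen_mult(a, b, 0, 2); }
```

Comparing widen_mult_even with widen_mult_lo and widen_mult_hi shows why a
reshuffle is needed: the even products interleave elements from both halves,
whereas the hi/lo patterns each want one contiguous half.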


Re: [PATCH] Load external function address via GOT slot

2016-05-09 Thread H.J. Lu
On Fri, Apr 22, 2016 at 6:03 AM, Uros Bizjak  wrote:
> On Fri, Apr 22, 2016 at 2:54 PM, H.J. Lu  wrote:
>> For -fno-plt, we load the external function address via the GOT slot
>> so that linker won't create a PLT entry for extern function address.
>>
>> Tested on x86-64. I also built GCC with -fno-plt.  It removes 99% PLT
>> entries.  OK for trunk?
>>
>> H.J.
>> --
>> gcc/
>>
>> PR target/pr67400
>> * config/i386/i386-protos.h (ix86_force_load_from_GOT_p): New.
>> * config/i386/i386.c (ix86_force_load_from_GOT_p): New function.
>> (ix86_legitimate_address_p): Allow UNSPEC_GOTPCREL for
>> ix86_force_load_from_GOT_p returns true.
>> (ix86_print_operand_address): Support UNSPEC_GOTPCREL if
>> ix86_force_load_from_GOT_p returns true.
>> (ix86_expand_move): Load the external function address via the
>> GOT slot if ix86_force_load_from_GOT_p returns true.
>> * config/i386/predicates.md (x86_64_immediate_operand): Return
>> false if ix86_force_load_from_GOT_p returns true.
>>
>> gcc/testsuite/
>>
>> PR target/pr67400
>> * gcc.target/i386/pr67400-1.c: New test.
>> * gcc.target/i386/pr67400-2.c: Likewise.
>> * gcc.target/i386/pr67400-3.c: Likewise.
>> * gcc.target/i386/pr67400-4.c: Likewise.
>
> Please get someone that knows this linker magic to review the
> functionality first. Maybe Jakub can help?
>

Hi Jakub,

Can you review this patch?

Thanks.

-- 
H.J.
From 3c81d37bb422f9856b373c63dfc6e19e035a7714 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Fri, 28 Aug 2015 19:14:49 -0700
Subject: [PATCH] Load external function address via GOT slot

For -fno-plt, we load the external function address via the GOT slot
so that linker won't create a PLT entry for extern function address.

gcc/

	PR target/67400
	* config/i386/i386-protos.h (ix86_force_load_from_GOT_p): New.
	* config/i386/i386.c (ix86_force_load_from_GOT_p): New function.
	(ix86_legitimate_address_p): Allow UNSPEC_GOTPCREL if
	ix86_force_load_from_GOT_p returns true.
	(ix86_print_operand_address): Support UNSPEC_GOTPCREL if
	ix86_force_load_from_GOT_p returns true.
	(ix86_expand_move): Load the external function address via the
	GOT slot if ix86_force_load_from_GOT_p returns true.
	* config/i386/predicates.md (x86_64_immediate_operand): Return
	false if ix86_force_load_from_GOT_p returns true.

gcc/testsuite/

	PR target/67400
	* gcc.target/i386/pr67400-1.c: New test.
	* gcc.target/i386/pr67400-2.c: Likewise.
	* gcc.target/i386/pr67400-3.c: Likewise.
	* gcc.target/i386/pr67400-4.c: Likewise.
---
 gcc/config/i386/i386-protos.h |  1 +
 gcc/config/i386/i386.c| 42 +++
 gcc/config/i386/predicates.md |  4 +++
 gcc/testsuite/gcc.target/i386/pr67400-1.c | 13 ++
 gcc/testsuite/gcc.target/i386/pr67400-2.c | 14 +++
 gcc/testsuite/gcc.target/i386/pr67400-3.c | 16 
 gcc/testsuite/gcc.target/i386/pr67400-4.c | 13 ++
 7 files changed, 103 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr67400-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr67400-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr67400-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr67400-4.c

diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 447f67e..99775cb 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -70,6 +70,7 @@ extern bool ix86_expand_set_or_movmem (rtx, rtx, rtx, rtx, rtx, rtx,
 extern bool constant_address_p (rtx);
 extern bool legitimate_pic_operand_p (rtx);
 extern bool legitimate_pic_address_disp_p (rtx);
+extern bool ix86_force_load_from_GOT_p (rtx);
 extern void print_reg (rtx, int, FILE*);
 extern void ix86_print_operand (FILE *, rtx, int);
 
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 05476f3..6d73651 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -14833,6 +14833,24 @@ ix86_legitimate_constant_p (machine_mode mode, rtx x)
   return true;
 }
 
+/* True if operand X should be loaded from GOT.  */
+
+bool
+ix86_force_load_from_GOT_p (rtx x)
+{
+  /* External function symbol should be loaded via the GOT slot for
+ -fno-plt.  */
+  return (!flag_plt
+	  && !flag_pic
+	  && ix86_cmodel != CM_LARGE
+	  && TARGET_64BIT
+	  && !TARGET_PECOFF
+	  && !TARGET_MACHO
+	  && GET_CODE (x) == SYMBOL_REF
+	  && SYMBOL_REF_FUNCTION_P (x)
+	  && !SYMBOL_REF_LOCAL_P (x));
+}
+
 /* Determine if it's legal to put X into the constant pool.  This
is not possible for the address of thread-local symbols, which
is checked above.  */
@@ -15213,6 +15231,10 @@ ix86_legitimate_address_p (machine_mode, rtx addr, bool strict)
 	return false;
 
 	  case UNSPEC_GOTPCREL:
+	gcc_assert (flag_pic
+			|| ix86_force_load_from_GOT_p (XVECEXP (XEXP (disp, 0), 0, 0)));
+	  

Re: [PATCH,rs6000] Add built-in support for new Power9 darn (deliver a random number) instruction

2016-05-09 Thread Bill Schmidt
On Mon, 2016-05-09 at 08:58 -0500, Segher Boessenkool wrote:
> Hi Kelvin,
> 
> On Thu, May 05, 2016 at 10:26:01AM -0600, Kelvin Nilsen wrote:
> > (UNSPEC_DARN_32): New usnpec constant.
> 
> Typo.
> 
> > ("darn_32"): New instruction.
> 
> We don't normally use quotes for insn names.
> 
> > (rs6000_builtin_mask_calculate): Add in the RS6000_BTM_MODULO and
> > RS6000_BTM_64BIT flags to the returned mask, depending on
> > configuration. 
> 
> Trailing space (many, in this changelog).
> 
> > --- gcc/config/rs6000/altivec.h (revision 235884)
> > +++ gcc/config/rs6000/altivec.h (working copy)
> > @@ -382,6 +382,11 @@
> >  #define vec_vsubuqm __builtin_vec_vsubuqm
> >  #define vec_vupkhsw __builtin_vec_vupkhsw
> >  #define vec_vupklsw __builtin_vec_vupklsw
> > +
> > +/* Non-Vector additions added in ISA 3.0. */
> > +#define darn __builtin_darn
> > +#define darn_32 __builtin_darn_32
> > +#define darn_raw __builtin_darn_raw
> >  #endif
> 
> Do we really want to #define short words like "darn"?  If this is already
> set in stone, so be it.

I don't think we do, and in any case altivec.h would not be the place to
do it.  darn is not a vector instruction.

For these, just having __builtin_darn* be the available interfaces will
be fine.

My two cents,
Bill

> 
> > +(define_insn "darn_32"
> > +  [(set (match_operand:SI 0 "register_operand" "")
> 
> The constraint should be "r" I suppose?
> 
> > +(unspec:SI [(const_int 0)] UNSPEC_DARN_32))]
> > +  "TARGET_MODULO"
> > +  {
> > + return "darn %0,0";
> > +  }
> > +  [(set_attr "type" "add")  
> 
> Trailing spaces.  "add" isn't the correct type; use "integer" if there
> is no better type.
> 
> > +   (set_attr "length" "4")])
> 
> That is the default, no need to mention it.  Most insns are implicitly
> length 4.
> 
> > +/* Miscellaneous builtins for instructions added in ISA 3.0.  These
> > +   instructions don't require either the DFP or VSX options, just the 
> > basic 
> 
> Trailing space.
> 
> > @@ -3634,6 +3639,8 @@ rs6000_builtin_mask_calculate (void)
> >   | ((rs6000_cpu == PROCESSOR_CELL) ? RS6000_BTM_CELL  : 0)
> >   | ((TARGET_P8_VECTOR) ? RS6000_BTM_P8_VECTOR : 0)
> >   | ((TARGET_P9_VECTOR) ? RS6000_BTM_P9_VECTOR : 0)
> > + | ((TARGET_MODULO)? RS6000_BTM_MODULO: 0)
> > + | ((TARGET_64BIT) ? RS6000_BTM_64BIT: 0)
> 
> Missing space?
> 
> > +  /* RS6000_BTC_SPECIAL represents no-operand operators.  */
> >gcc_assert (attr == RS6000_BTC_UNARY
> >   || attr == RS6000_BTC_BINARY
> > - || attr == RS6000_BTC_TERNARY);
> > -
> > + || attr == RS6000_BTC_TERNARY
> > + || attr == RS6000_BTC_SPECIAL);
> > +  
> 
> Why SPECIAL and not NULLARY or such?
> 
> > +  if (rs6000_overloaded_builtin_p (d->code))
> > +   {
> > + if (! (type = opaque_ftype_opaque))
> > +   type = opaque_ftype_opaque
> > + = build_function_type_list (opaque_V4SI_type_node,
> > + NULL_TREE);
> > +   }
> 
> Eww.
> 
>   if (!opaque_ftype_opaque)
> opaque_ftype_opaque = build_function_type_list (...);
>   type = opaque_ftype_opaque;
> 
> > + enum insn_code icode = d->icode;
> > + if (d->name == 0)
> > +   {
> > + if (TARGET_DEBUG_BUILTIN)
> > +   fprintf (stderr, "rs6000_builtin, bdesc_0arg[%ld] no name\n",
> > +(long unsigned)i);
> 
> unsigned is %u, not %d.  Space after cast.
> 
> Cheers,
> 
> 
> Segher
> 




Re: [ARM] mno-pic-data-is-text-relative & msingle-pic-base

2016-05-09 Thread Nathan Sidwell

Joey,

This patch will do what you intend it to do. However, I am not sure about the 
part related to VxWorks. The logic behind this patch is that 
-mno-pic-data-is-text-relative should enable -msingle-pic-base because 
otherwise it will be useless. The logic itself is orthogonal to OS. So I am not 
convinced the 'else if' shouldn't be just 'if'. It should not change VxWorks 
behaviour if VxWorks enables -msingle-pic-base explicitly. Or otherwise there 
is at least one use case that -mno-pic-data-is-text-relative can be used 
without -msingle-pic-base, which breaks the logic that this whole patch stands 
on.


VxWorks has two modes of code generation -- kernel and RTP.  RTPs don't have a 
fixed mapping between code and data (and use a special sequence to initialize 
the PIC register, using vxworks-specific relocs).  Kernel mode doesn't support 
PIC code generation -- see config/vxworks.c


So I don't think there's a problem.

nathan


RE: [ARM] mno-pic-data-is-text-relative & msingle-pic-base

2016-05-09 Thread Joey Ye
Nathan,

This patch will do what you intend it to do. However, I am not sure about the 
part related to VxWorks. The logic behind this patch is that 
-mno-pic-data-is-text-relative should enable -msingle-pic-base because 
otherwise it will be useless. The logic itself is orthogonal to OS. So I am not 
convinced the 'else if' shouldn't be just 'if'. It should not change VxWorks 
behaviour if VxWorks enables -msingle-pic-base explicitly. Or otherwise there 
is at least one use case that -mno-pic-data-is-text-relative can be used 
without -msingle-pic-base, which breaks the logic that this whole patch stands 
on.

Thanks,
Joey

> -Original Message-
> From: Nathan Sidwell [mailto:nathanmsidw...@gmail.com] On Behalf Of
> Nathan Sidwell
> Sent: 09 May 2016 15:07
> To: Richard Earnshaw; GCC Patches
> Cc: Joey Ye
> Subject: [ARM] mno-pic-data-is-text-relative & msingle-pic-base
> 
> This patch comes from an off-list conversation between Joey & me.  The
> context is with RTOSs not all singing & dancing dynamic objects and OSes.
> 
> currently, the documentation for -mno-pic-data-is-text-relative (-mno-PDITR)
> says 'Assume that each data segments are relative to text segment at load
> time.
>   Therefore, it permits addressing data using PC-relative operations.
>   This option is on by default for targets other than VxWorks RTP.'
> 
> However, if you use just this option, you still end up with a pic-register
> init sequence that presumes a fixed mapping.  That's a surprise.  Joey tells me
> its expected use is with -msingle-pic-base (-mSPB), which reserves a global
> register to point at the (single) GOT.  That's what I had expected the
> -mno-PDITR option to have implied.
> 
> Apparently there are legitimate reasons one might want the -mno-PDITR
> behaviour without -mSPB.  I don't know what those are, perhaps Joey could
> clarify?
> 
> Anyway, IMHO that is the rare case and the more common case is that one
> would want to have -mno-PDITR imply -mSPB. (The reverse probably doesn't
> apply.)
> 
> This patch does 3 things.
> 1) have -mno-PDITR imply -mSPB, unless one has explicitly provided
> -m[no-]SPB.
> 2) clarified the -m[no-]PDITR documentation.
> 3) Added some testcases -- there didn't appear to be any.
> 
> ok?
> 
> nathan



[Bug c++/71010] error: 'begin' was not declared in this scope

2016-05-09 Thread manu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71010

Manuel López-Ibáñez  changed:

   What|Removed |Added

 CC||manu at gcc dot gnu.org

--- Comment #2 from Manuel López-Ibáñez  ---
(In reply to Jonathan Wakely from comment #1)
> Furthermore, your code has undefined behaviour, it is forbidden to add your
> own functions to namespace std. The correct way to do it is to write the
> begin/end overloads in the same namespace as your type (in this case that's
> the global namespace).

I actually did not know that. Would it be possible to warn about this? I guess
libstdc++ files are system-headers and can avoid the warning.
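
The approach recommended in comment #1 can be sketched like this. The type
and names below (mylib, IntBuffer, sum_buffer) are hypothetical and are not
taken from the reporter's code, where the type lives in the global namespace;
the point is only that begin/end sit next to the type and are found by
argument-dependent lookup, with nothing added to namespace std.

```cpp
namespace mylib {  // hypothetical namespace that owns the type

struct IntBuffer {
    int data[4];
};

// Overloads in the type's own namespace; range-based for finds them by ADL.
inline int* begin(IntBuffer& b) { return b.data; }
inline int* end(IntBuffer& b) { return b.data + 4; }
inline const int* begin(const IntBuffer& b) { return b.data; }
inline const int* end(const IntBuffer& b) { return b.data + 4; }

}  // namespace mylib

int sum_buffer(const mylib::IntBuffer& b)
{
    int total = 0;
    // IntBuffer has no member begin/end, so the loop uses the free
    // functions in namespace mylib.
    for (int v : b)
        total += v;
    return total;
}
```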

Re: Bug maintenance

2016-05-09 Thread Manuel López-Ibáñez

On 08/05/16 23:13, Oleg Endo wrote:

There are nearly 10,000 still unresolved bugs in Bugzilla, almost
half of which are New, and a third Unconfirmed, so I'm sure any
effort to help reduce the number is of value and appreciated.


That's exactly what prompted me to ask.  There's such a vast number
of them, it's hard to believe that 9 year old bugs are still of
interest.


Sometimes there is.  Before randomly closing any bugs because they are
too old, one should at least have a look at them and see if they're
still an issue etc.  Often things would've been fixed along the way,
but not all of them.


There are some 10-year-old bugs that have a very clear description of what 
needs to be done to fix them, it is just that no one has had time to do it yet. 
Others don't have a clear fix, but there is a lot of info about things tried 
but failed. Losing all that info would be bad.


My humble opinion is that going through the list from old to new is not the 
most useful or efficient way to contribute to GCC (if it is the only way you 
want to contribute, then please go ahead, it is still useful). Old bugs do not 
hurt anyone except perhaps when searching for duplicates. In that case, it may 
be worth spending a few minutes checking if it is fixed already, ask the 
submitter for more info, or confirm it if UNCONFIRMED and updating the 
description so one can see clearly that it is not a duplicate.


Triaging old bugs (except for fixing them) is not the most useful: users may 
have simply forgotten all about it or not be able to reproduce it anymore or 
moved on and not care...


On the other hand, it is rather more useful to start with recent bugs, which 
are more likely to be relevant, and confirm them, ask for more info, find 
oldest duplicate with more info, or classify them under various meta-bugs.


Rather than seeing Bugzilla as a TODO list for devs, it is rather more precise 
to see it as a knowledge database about bugs.


Cheers,
Manuel.



[PATCH] Improve sse2_loadld

2016-05-09 Thread Jakub Jelinek
Hi!

I hope this pattern actually shouldn't be used for AVX512*, because
vpinsr should match instead, but just in case it doesn't, all the insns
involved are in AVX512F.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-05-09  Jakub Jelinek  

* config/i386/sse.md (sse2_loadld): Use v instead of x
constraint in alternatives 0,1,4.

--- gcc/config/i386/sse.md.jj   2016-05-09 14:15:50.0 +0200
+++ gcc/config/i386/sse.md  2016-05-09 15:08:36.034622080 +0200
@@ -13013,11 +13013,11 @@ (define_expand "sse2_loadd"
   "operands[2] = CONST0_RTX (V4SImode);")
 
 (define_insn "sse2_loadld"
-  [(set (match_operand:V4SI 0 "register_operand"   "=x,Yi,x,x,x")
+  [(set (match_operand:V4SI 0 "register_operand"   "=v,Yi,x,x,v")
(vec_merge:V4SI
  (vec_duplicate:V4SI
-   (match_operand:SI 2 "nonimmediate_operand" "m ,r ,m,x,x"))
- (match_operand:V4SI 1 "reg_or_0_operand" "C ,C ,C,0,x")
+   (match_operand:SI 2 "nonimmediate_operand" "m ,r ,m,x,v"))
+ (match_operand:V4SI 1 "reg_or_0_operand" "C ,C ,C,0,v")
  (const_int 1)))]
   "TARGET_SSE"
   "@
@@ -13028,7 +13028,7 @@ (define_insn "sse2_loadld"
vmovss\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "isa" "sse2,sse2,noavx,noavx,avx")
(set_attr "type" "ssemov")
-   (set_attr "prefix" "maybe_vex,maybe_vex,orig,orig,vex")
+   (set_attr "prefix" "maybe_vex,maybe_vex,orig,orig,maybe_evex")
(set_attr "mode" "TI,TI,V4SF,SF,SF")])
 
 ;; QI and HI modes handled by pextr patterns.

Jakub


[Bug lto/70955] [6/7 Regression] Wrong code generation for __builtin_ms_va_list with -flto

2016-05-09 Thread zenith432 at users dot sourceforge.net
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70955

--- Comment #10 from zenith432 at users dot sourceforge.net ---
(In reply to vries from comment #8)
> Created attachment 38453 [details]
> tentative patch

vries, thank you very much.  I verified and looks good.

Built GCC 6.1.0 with patch from released sources on ftp.gnu.org.
[moved the patch to the right place of course]

Built a fairly large UEFI-based project with a good number of ms_va_list. 
Checked the disassembly manually for 1 instance.  Code is right + tested it to
run ok in various scenarios I know to use __builtin_va_arg.

Bug can be changed to resolved as far as I'm concerned.  Not sure whether it
has to wait for commit, so leaving it to TPTB.

[PATCH] Improve XMM16-XMM31 handling in vpinsr*

2016-05-09 Thread Jakub Jelinek
Hi!

vpinsr{b,w} are AVX512BW, vpinsr{d,q} are AVX512DQ.
This patch makes us use v constraint instead of x in those
cases.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-05-09  Jakub Jelinek  

* config/i386/sse.md (pinsr_evex_isa): New mode attr.
(_pinsr): Add 2 alternatives with
v constraints instead of x and <pinsr_evex_isa> isa attribute.

* gcc.target/i386/avx512bw-vpinsr-1.c: New test.
* gcc.target/i386/avx512dq-vpinsr-1.c: New test.
* gcc.target/i386/avx512vl-vpinsr-1.c: New test.

--- gcc/config/i386/sse.md.jj   2016-05-09 13:31:21.0 +0200
+++ gcc/config/i386/sse.md  2016-05-09 14:15:50.241028739 +0200
@@ -12036,13 +12036,17 @@ (define_mode_attr sse2p4_1
   [(V16QI "sse4_1") (V8HI "sse2")
(V4SI "sse4_1") (V2DI "sse4_1")])
 
+(define_mode_attr pinsr_evex_isa
+  [(V16QI "avx512bw") (V8HI "avx512bw")
+   (V4SI "avx512dq") (V2DI "avx512dq")])
+
 ;; sse4_1_pinsrd must come before sse2_loadld since it is preferred.
 (define_insn "_pinsr"
-  [(set (match_operand:PINSR_MODE 0 "register_operand" "=x,x,x,x")
+  [(set (match_operand:PINSR_MODE 0 "register_operand" "=x,x,x,x,v,v")
(vec_merge:PINSR_MODE
  (vec_duplicate:PINSR_MODE
-   (match_operand: 2 "nonimmediate_operand" "r,m,r,m"))
- (match_operand:PINSR_MODE 1 "register_operand" "0,0,x,x")
+   (match_operand: 2 "nonimmediate_operand" 
"r,m,r,m,r,m"))
+ (match_operand:PINSR_MODE 1 "register_operand" "0,0,x,x,v,v")
  (match_operand:SI 3 "const_int_operand")))]
   "TARGET_SSE2
&& ((unsigned) exact_log2 (INTVAL (operands[3]))
@@ -12059,16 +12063,18 @@ (define_insn "_pinsr\t{%3, %2, %0|%0, %2, %3}";
 case 2:
+case 4:
   if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (SImode))
return "vpinsr\t{%3, %k2, %1, %0|%0, %1, %k2, %3}";
   /* FALLTHRU */
 case 3:
+case 5:
   return "vpinsr\t{%3, %2, %1, %0|%0, %1, %2, %3}";
 default:
   gcc_unreachable ();
 }
 }
-  [(set_attr "isa" "noavx,noavx,avx,avx")
+  [(set_attr "isa" "noavx,noavx,avx,avx,<pinsr_evex_isa>,<pinsr_evex_isa>")
(set_attr "type" "sselog")
(set (attr "prefix_rex")
  (if_then_else
@@ -12089,7 +12095,7 @@ (define_insn "_pinsr_vinsert_mask"
--- gcc/testsuite/gcc.target/i386/avx512bw-vpinsr-1.c.jj2016-05-09 
14:36:49.618145755 +0200
+++ gcc/testsuite/gcc.target/i386/avx512bw-vpinsr-1.c   2016-05-09 
14:49:57.830574216 +0200
@@ -0,0 +1,33 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mavx512vl -mavx512bw" } */
+
+typedef char v16qi __attribute__((vector_size (16)));
+typedef short v8hi __attribute__((vector_size (16)));
+
+v16qi
+f1 (v16qi a, char b)
+{
+  register v16qi c __asm ("xmm16") = a;
+  asm volatile ("" : "+v" (c));
+  v16qi d = c;
+  ((char *) &d)[3] = b;
+  c = d;
+  asm volatile ("" : "+v" (c));
+  return c;
+}
+
+/* { dg-final { scan-assembler "vpinsrb\[^\n\r]*xmm16" } } */
+
+v8hi
+f2 (v8hi a, short b)
+{
+  register v8hi c __asm ("xmm16") = a;
+  asm volatile ("" : "+v" (c));
+  v8hi d = c;
+  ((short *) &d)[3] = b;
+  c = d;
+  asm volatile ("" : "+v" (c));
+  return c;
+}
+
+/* { dg-final { scan-assembler "vpinsrw\[^\n\r]*xmm16" } } */
--- gcc/testsuite/gcc.target/i386/avx512dq-vpinsr-1.c.jj2016-05-09 
14:39:15.588184128 +0200
+++ gcc/testsuite/gcc.target/i386/avx512dq-vpinsr-1.c   2016-05-09 
14:48:38.0 +0200
@@ -0,0 +1,33 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mavx512vl -mavx512dq" } */
+
+typedef int v4si __attribute__((vector_size (16)));
+typedef long long v2di __attribute__((vector_size (16)));
+
+v4si
+f1 (v4si a, int b)
+{
+  register v4si c __asm ("xmm16") = a;
+  asm volatile ("" : "+v" (c));
+  v4si d = c;
+  ((int *) &d)[3] = b;
+  c = d;
+  asm volatile ("" : "+v" (c));
+  return c;
+}
+
+/* { dg-final { scan-assembler "vpinsrd\[^\n\r]*xmm16" } } */
+
+v2di
+f2 (v2di a, long long b)
+{
+  register v2di c __asm ("xmm16") = a;
+  asm volatile ("" : "+v" (c));
+  v2di d = c;
+  ((long long *) &d)[1] = b;
+  c = d;
+  asm volatile ("" : "+v" (c));
+  return c;
+}
+
+/* { dg-final { scan-assembler "vpinsrq\[^\n\r]*xmm16" } } */
--- gcc/testsuite/gcc.target/i386/avx512vl-vpinsr-1.c.jj2016-05-09 
14:41:21.195496147 +0200
+++ gcc/testsuite/gcc.target/i386/avx512vl-vpinsr-1.c   2016-05-09 
14:50:32.188114909 +0200
@@ -0,0 +1,63 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mavx512vl -mno-avx512bw -mno-avx512dq" } */
+
+typedef char v16qi __attribute__((vector_size (16)));
+typedef short v8hi __attribute__((vector_size (16)));
+typedef int v4si __attribute__((vector_size (16)));
+typedef long long v2di __attribute__((vector_size (16)));
+
+v16qi
+f1 (v16qi a, char b)
+{
+  register v16qi c __asm ("xmm16") = a;
+  asm volatile ("" : "+v" (c));
+  v16qi d = c;
+  ((char *) &d)[3] = b;
+  c = d;
+  asm volatile ("" : "+v" (c));
+  return c;
+}
+
+/* { dg-final { scan-assembler-not "vpinsrb\[^\n\r]*xmm16" } } 

[PATCH] vec_extract XMM16-XMM17 improvements

2016-05-09 Thread Jakub Jelinek
Hi!

vpextr{b,w} are in AVX512BW, so is vpsrldq, and vpextr{d,q} are in
AVX512DQ.
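
For illustration only (this is not part of the patch): a minimal sketch of
the kind of element extraction the vec_extract patterns cover, written with
GCC's generic vector extensions.  The function names extract_w3 and
extract_d1 are hypothetical.  Whether the compiler actually emits
vpextrw/vpextrd for them, and from which register, depends on the -m options
and on register allocation, so no particular assembly is claimed here.

```c
/* Illustration only, not taken from the patch or its testsuite.
   Assumes GCC (or a compatible compiler) with vector_size support.  */
typedef short v8hi __attribute__((vector_size (16)));
typedef int v4si __attribute__((vector_size (16)));

/* Hypothetical helper: read 16-bit element 3 of an 8 x short vector.
   This is the operation a vpextrw-style instruction can implement.  */
short
extract_w3 (v8hi a)
{
  return a[3];
}

/* Hypothetical helper: read 32-bit element 1 of a 4 x int vector.
   This is the operation a vpextrd-style instruction can implement.  */
int
extract_d1 (v4si a)
{
  return a[1];
}
```

The tests added by the patch additionally pin the vector to xmm16 with a
register asm variable, so that only the EVEX-encoded forms can be used.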

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-05-09  Jakub Jelinek  

* config/i386/i386.md (isa): Add x64_avx512dq, enable if
TARGET_64BIT && TARGET_AVX512DQ.
* config/i386/sse.md (*vec_extract): Add avx512bw alternatives.
(*vec_extract_zext): Add avx512bw alternative.
(*vec_extract_0, *vec_extractv4si_0_zext,
*vec_extractv2di_0_sse): Use v constraint instead of x constraint.
(*vec_extractv4si): Add avx512dq and avx512bw alternatives.
(*vec_extractv4si_zext): Add avx512dq alternative.
(*vec_extractv2di_1): Add x64_avx512dq and avx512bw alternatives,
use v instead of x constraint in other alternatives where possible.

* gcc.target/i386/avx512bw-vpextr-1.c: New test.
* gcc.target/i386/avx512dq-vpextr-1.c: New test.

--- gcc/config/i386/i386.md.jj  2016-05-09 13:33:12.0 +0200
+++ gcc/config/i386/i386.md 2016-05-09 16:32:32.219961730 +0200
@@ -796,7 +796,7 @@ (define_attr "isa" "base,x64,x64_sse4,x6
sse2,sse2_noavx,sse3,sse4,sse4_noavx,avx,noavx,
avx2,noavx2,bmi,bmi2,fma4,fma,avx512f,noavx512f,
fma_avx512f,avx512bw,noavx512bw,avx512dq,noavx512dq,
-   avx512vl,noavx512vl"
+   avx512vl,noavx512vl,x64_avx512dq"
   (const_string "base"))
 
 (define_attr "enabled" ""
@@ -807,6 +807,8 @@ (define_attr "enabled" ""
   (symbol_ref "TARGET_64BIT && TARGET_SSE4_1 && !TARGET_AVX")
 (eq_attr "isa" "x64_avx")
   (symbol_ref "TARGET_64BIT && TARGET_AVX")
+(eq_attr "isa" "x64_avx512dq")
+  (symbol_ref "TARGET_64BIT && TARGET_AVX512DQ")
 (eq_attr "isa" "nox64") (symbol_ref "!TARGET_64BIT")
 (eq_attr "isa" "sse2") (symbol_ref "TARGET_SSE2")
 (eq_attr "isa" "sse2_noavx")
--- gcc/config/i386/sse.md.jj   2016-05-09 15:08:36.0 +0200
+++ gcc/config/i386/sse.md  2016-05-09 16:43:54.213638239 +0200
@@ -13036,39 +13036,44 @@ (define_mode_iterator PEXTR_MODE12
   [(V16QI "TARGET_SSE4_1") V8HI])
 
 (define_insn "*vec_extract"
-  [(set (match_operand: 0 "register_sse4nonimm_operand" "=r,m")
+  [(set (match_operand: 0 "register_sse4nonimm_operand" "=r,m,r,m")
(vec_select:
- (match_operand:PEXTR_MODE12 1 "register_operand" "x,x")
+ (match_operand:PEXTR_MODE12 1 "register_operand" "x,x,v,v")
  (parallel
[(match_operand:SI 2 "const_0_to__operand")])))]
   "TARGET_SSE2"
   "@
%vpextr\t{%2, %1, %k0|%k0, %1, %2}
-   %vpextr\t{%2, %1, %0|%0, %1, %2}"
-  [(set_attr "isa" "*,sse4")
+   %vpextr\t{%2, %1, %0|%0, %1, %2}
+   vpextr\t{%2, %1, %k0|%k0, %1, %2}
+   vpextr\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "isa" "*,sse4,avx512bw,avx512bw")
(set_attr "type" "sselog1")
(set_attr "prefix_data16" "1")
(set (attr "prefix_extra")
  (if_then_else
-   (and (eq_attr "alternative" "0")
+   (and (eq_attr "alternative" "0,2")
(eq (const_string "mode") (const_string "V8HImode")))
(const_string "*")
(const_string "1")))
(set_attr "length_immediate" "1")
-   (set_attr "prefix" "maybe_vex")
+   (set_attr "prefix" "maybe_vex,maybe_vex,evex,evex")
(set_attr "mode" "TI")])
 
 (define_insn "*vec_extract_zext"
-  [(set (match_operand:SWI48 0 "register_operand" "=r")
+  [(set (match_operand:SWI48 0 "register_operand" "=r,r")
(zero_extend:SWI48
  (vec_select:
-   (match_operand:PEXTR_MODE12 1 "register_operand" "x")
+   (match_operand:PEXTR_MODE12 1 "register_operand" "x,v")
(parallel
  [(match_operand:SI 2
"const_0_to__operand")]]
   "TARGET_SSE2"
-  "%vpextr\t{%2, %1, %k0|%k0, %1, %2}"
-  [(set_attr "type" "sselog1")
+  "@
+   %vpextr\t{%2, %1, %k0|%k0, %1, %2}
+   vpextr\t{%2, %1, %k0|%k0, %1, %2}"
+  [(set_attr "isa" "*,avx512bw")
+   (set_attr "type" "sselog1")
(set_attr "prefix_data16" "1")
(set (attr "prefix_extra")
  (if_then_else
@@ -13089,9 +13094,9 @@ (define_insn "*vec_extract_mem"
   "#")
 
 (define_insn "*vec_extract_0"
-  [(set (match_operand:SWI48 0 "nonimmediate_operand" "=r ,r,x ,m")
+  [(set (match_operand:SWI48 0 "nonimmediate_operand" "=r ,r,v ,m")
(vec_select:SWI48
- (match_operand: 1 "nonimmediate_operand" "mYj,x,xm,x")
+ (match_operand: 1 "nonimmediate_operand" "mYj,v,vm,v")
  (parallel [(const_int 0)])))]
   "TARGET_SSE && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
   "#"
@@ -13101,7 +13106,7 @@ (define_insn_and_split "*vec_extractv4si
   [(set (match_operand:DI 0 "register_operand" "=r")
(zero_extend:DI
  (vec_select:SI
-   (match_operand:V4SI 1 "register_operand" "x")
+   (match_operand:V4SI 1 "register_operand" "v")
(parallel [(const_int 0)]]
   
