RFA: Small PATCH to add pow2p_hwi to hwint.h

2016-09-07 Thread Jason Merrill
Various places in GCC use negate, bit-and and compare to test whether
an integer is a power of 2, but I think it would be clearer for this
test to be wrapped in a function.

OK for trunk?
commit e2ca9914ce46d56775854f50c21506b220fd50b6
Author: Jason Merrill 
Date:   Wed Sep 7 16:22:32 2016 -0400

* hwint.h (pow2p_hwi): New.

diff --git a/gcc/hwint.h b/gcc/hwint.h
index 6b4d537..3d85fc3 100644
--- a/gcc/hwint.h
+++ b/gcc/hwint.h
@@ -299,4 +299,12 @@ absu_hwi (HOST_WIDE_INT x)
   return x >= 0 ? (unsigned HOST_WIDE_INT)x : -(unsigned HOST_WIDE_INT)x;
 }
 
+/* True if X is a power of two.  */
+
+inline bool
+pow2p_hwi (unsigned HOST_WIDE_INT x)
+{
+  return (x & -x) == x;
+}
+
 #endif /* ! GCC_HWINT_H */
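
A note on the edge case, as a minimal sketch rather than the committed code: the expression in the patch also evaluates to true for zero, since 0 & -0 == 0.  The helper names below (pow2p, pow2p_nonzero) and the use of std::uint64_t in place of unsigned HOST_WIDE_INT are assumptions made for illustration only.

```cpp
#include <cstdint>

// Same test as in the patch: negate, bit-and, compare.
// Note that it also returns true for x == 0.
inline bool pow2p(std::uint64_t x)
{
  return (x & -x) == x;
}

// Hypothetical stricter variant that rejects zero explicitly.
inline bool pow2p_nonzero(std::uint64_t x)
{
  return x != 0 && (x & -x) == x;
}
```

Callers that must not treat zero as a power of two would need the extra check, or a separate test before the call.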


[Bug preprocessor/53469] #pragma GCC diagnostic works, but using _Pragma doesn't for -Wunused-local-typedefs

2016-09-07 Thread spencer8 at sbcglobal dot net
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53469

j8spencer  changed:

   What|Removed |Added

 CC||spencer8 at sbcglobal dot net

--- Comment #4 from j8spencer  ---
I'm still seeing this behavior in 4.8.4.  #pragma can ignore
unused-local-typedefs, _Pragma cannot.  This is on Ubuntu 16.04.  Options are
-fPIE -O3 -NDEBUG -Wall -Wextra -Werror.  Because of -Werror, I also tried
putting in _Pragma ("GCC diagnostic warning \"-Wunused-local-typedefs\"")
before the ignored.  But regular #pragma works just fine.
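
For readers comparing the two spellings, here is a minimal sketch of what the report describes.  The macro and function names are made up for illustration; whether the _Pragma form actually suppresses the warning depends on the compiler version, which is the point of this bug.

```cpp
// Directive form: reported to work.
int with_pragma_directive()
{
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-local-typedefs"
  typedef int unused_local_t;
#pragma GCC diagnostic pop
  return 1;
}

// Operator form wrapped in macros (hypothetical names): per the report,
// some GCC versions still warn about the typedef here.
#define IGNORE_UNUSED_TYPEDEFS_BEGIN \
  _Pragma("GCC diagnostic push") \
  _Pragma("GCC diagnostic ignored \"-Wunused-local-typedefs\"")
#define IGNORE_UNUSED_TYPEDEFS_END _Pragma("GCC diagnostic pop")

int with_pragma_operator()
{
  IGNORE_UNUSED_TYPEDEFS_BEGIN
  typedef int unused_local_t;
  IGNORE_UNUSED_TYPEDEFS_END
  return 2;
}
```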

Re: [RFC][SSA] Iterator to visit SSA

2016-09-07 Thread Kugan Vivekanandarajah
Hi Richard,

On 7 September 2016 at 19:35, Richard Biener  wrote:
> On Wed, Sep 7, 2016 at 2:21 AM, Kugan Vivekanandarajah
>  wrote:
>> Hi Richard,
>>
>> On 6 September 2016 at 19:08, Richard Biener  
>> wrote:
>>> On Tue, Sep 6, 2016 at 2:24 AM, Kugan Vivekanandarajah
>>>  wrote:
>>>> Hi Richard,
>>>>
>>>> On 5 September 2016 at 17:57, Richard Biener
>>>> wrote:
>>>>> On Mon, Sep 5, 2016 at 7:26 AM, Kugan Vivekanandarajah
>>>>> wrote:
>>>>>> Hi All,
>>>>>>
>>>>>> While looking at gcc source, I noticed that we are iterating over SSA
>>>>>> variables from 0 to num_ssa_names in some cases and 1 to num_ssa_names
>>>>>> in others. It seems that variable 0 is always NULL_TREE.
>>>>>
>>>>> Yeah, that's artificial because we don't assign SSA name version 0
>>>>> (for some unknown reason).
>>>>>
>>>>>> But it can
>>>>>> confuse people who are looking for the first time. Therefore it might
>>>>>> be good to follow some consistent usage here.
>>>>>>
>>>>>> It might also be good to have a FOR_EACH_SSAVAR iterator as we do in
>>>>>> other cases. Here is an attempt to do this based on what is done in other
>>>>>> places. Bootstrapped and regression tested on x86_64-linux-gnu with no
>>>>>> new regressions. Is this OK?
>>>>>
>>>>> First of all some bikeshedding - FOR_EACH_SSA_NAME would be better
>>>>> as SSAVAR might be confused with iterating over SSA_NAME_VAR.
>>>>>
>>>>> Then, if you add an iterator why leave the name == NULL handling to
>>>>> consumers?  That looks odd.
>>>>>
>>>>> Then, SSA names are just in a vector thus why not re-use the vector
>>>>> iterator?
>>>>>
>>>>> That is,
>>>>>
>>>>> #define FOR_EACH_SSA_NAME (name) \
>>>>>   for (unsigned i = 1; SSANAMES (cfun).iterate (i, &name); ++i)
>>>>>
>>>>> would be equivalent to your patch?
>>>>>
>>>>> Please also don't add new iterators that implicitly use 'cfun' but
>>>>> always use one that passes it as explicit arg.
>>>>
>>>> I think defining FOR_EACH_SSA_NAME with vector iterator is better. But
>>>> we will not be able to skip NULL ssa_names with that.
>>>
>>> Why?  Can't you simply do
>>>
>>>   #define FOR_EACH_SSA_NAME (name) \
>>>     for (unsigned i = 1; SSANAMES (cfun).iterate (i, &name); ++i) \
>>>       if (name)
>>>
>>> ?
>>
>> Indeed.  I missed the if at the end :(.  Here is an updated patch.
>> Bootstrapped and regression tested on x86_64-linux-gnu with no new
>> regressions.
>
> Quoting myself:
>
>> As I said
>> I'd like 'cfun' to be explicit, thus
>>
>> #define FOR_EACH_SSA_NAME(I, VAR, FN) \
>>  for (I = 1; SSANAMES (FN)->iterate ((I), &(VAR)); ++(I)) \
>>   if (VAR)
>
> the patch doesn't seem to contain that FN part.  Please also put declarations
> of 'name' you need to add right before the FOR_EACH_SSA_NAME rather
> than far away.

Please find attached a patch that does this. Bootstrapped and regression
tested on x86_64-linux-gnu with no new regressions.

Is this OK?

Thanks,
Kugan
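
The macro shape Richard asks for in this thread (explicit index, explicit variable, explicit container, NULL entries skipped) can be sketched outside GCC as follows.  This is only an illustration under stated assumptions: name_table, its iterate member, FOR_EACH_NAME, make_example and count_live_names are stand-ins invented here, not GCC's vec or SSANAMES.

```cpp
#include <vector>

// Stand-in for a vector of SSA names; slot 0 is never used.
struct name_table
{
  std::vector<const char *> names;

  // Modelled on an iterate(index, &element) style interface:
  // returns false once IX is past the end.
  bool iterate(unsigned ix, const char **out) const
  {
    if (ix < names.size())
      {
        *out = names[ix];
        return true;
      }
    *out = nullptr;
    return false;
  }
};

// Start at 1 and skip NULL entries; the table is an explicit argument.
#define FOR_EACH_NAME(I, VAR, TAB) \
  for (I = 1; (TAB)->iterate((I), &(VAR)); ++(I)) \
    if (VAR)

name_table make_example()
{
  return name_table{{nullptr, "a_1", nullptr, "b_3"}};
}

unsigned count_live_names(const name_table &tab)
{
  unsigned i, count = 0;
  const char *name;   // declared right before the loop, as requested
  FOR_EACH_NAME(i, name, &tab)
    ++count;
  return count;
}
```

One design consequence of ending a macro in a bare if is that an else written directly after the loop body would bind to the macro's if, so bodies are best kept to a single statement or a braced block.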

>
> Thanks,
> Richard.
>
>> Thanks,
>> Kugan
>>>
>>>> I also added
>>>> index variable to the macro so that there won't be any conflicts if the
>>>> index variable "i" (or whatever) is also defined in the loop.
>>>>
>>>> Bootstrapped and regression tested on x86_64-linux-gnu with no new
>>>> regressions. Is this OK for trunk?
>>>>
>>>> Thanks,
>>>> Kugan
>>>>
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>> 2016-09-06  Kugan Vivekanandarajah
>>>>
>>>> * tree-ssanames.h (FOR_EACH_SSA_NAME): New.
>>>> * cfgexpand.c (update_alias_info_with_stack_vars): Use
>>>> FOR_EACH_SSA_NAME to iterate over SSA variables.
>>>> (pass_expand::execute): Likewise.
>>>> * omp-simd-clone.c (ipa_simd_modify_function_body): Likewise.
>>>> * tree-cfg.c (dump_function_to_file): Likewise.
>>>> * tree-into-ssa.c (pass_build_ssa::execute): Likewise.
>>>> (update_ssa): Likewise.
>>>> * tree-ssa-alias.c (dump_alias_info): Likewise.
>>>> * tree-ssa-ccp.c (ccp_finalize): Likewise.
>>>> * tree-ssa-coalesce.c (build_ssa_conflict_graph): Likewise.
>>>> (create_outofssa_var_map): Likewise.
>>>> (coalesce_ssa_name): Likewise.
>>>> * tree-ssa-operands.c (dump_immediate_uses): Likewise.
>>>> * tree-ssa-pre.c (compute_avail): Likewise.
>>>> * tree-ssa-sccvn.c (init_scc_vn): Likewise.
>>>> (scc_vn_restore_ssa_info): Likewise.
>>>> (free_scc_vn): Likewise.
>>>> (run_scc_vn): Likewise.
>>>> * tree-ssa-structalias.c (compute_points_to_sets): Likewise.
>>>> * tree-ssa-ter.c (new_temp_expr_table): Likewise.
>>>> * tree-ssa-copy.c (fini_copy_prop): Likewise.
>>>> * tree-ssa.c (verify_ssa): Likewise.
>>>>
>>>>>
>>>>> Thanks,
>>>>> Richard.
>>>>>
>>>>>
>>>>>> Thanks,
>>>>>> Kugan
>>>>>>
>>>>>>
>>>>>> gcc/ChangeLog:
>>>>>>
>>>>>> 2016-09-05  Kugan Vivekanandarajah
>>>>>>
>>>>>> * tree-ssanames.h

[Bug c++/77523] New: rejects valid C++14 code with initialized lambda capture

2016-09-07 Thread su at cs dot ucdavis.edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77523

Bug ID: 77523
   Summary: rejects valid C++14 code with initialized lambda
capture
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: su at cs dot ucdavis.edu
  Target Milestone: ---

The code is accepted by both Clang and MSVC (it is derived from PR 77522's
test). 


$ g++-trunk -v
Using built-in specs.
COLLECT_GCC=g++-trunk
COLLECT_LTO_WRAPPER=/usr/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/7.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-source-trunk/configure --enable-languages=c,c++,lto
--prefix=/usr/local/gcc-trunk --disable-bootstrap
Thread model: posix
gcc version 7.0.0 20160907 (experimental) [trunk revision 240025] (GCC) 
$ 
$ clang++ -c -std=c++14 small.cpp
$ 
$ g++-trunk -c -std=c++14 small.cpp
small.cpp: In function ‘void f(T)’:
small.cpp:3:18: error: cannot capture ‘f<>’ by reference
   auto g = [&a = f <>] () {};
                  ^~~~
small.cpp: In instantiation of ‘void f(T) [with T = int]’:
small.cpp:8:7:   required from here
small.cpp:3:12: error: cannot bind non-const lvalue reference of type ‘void
(*&)(int)’ to an rvalue of type ‘void (*)(int)’
   auto g = [&a = f <>] () {};
            ^
$ 


-


template < class T = int > void f (T)
{
  auto g = [&a = f <>] () {};
}

int main ()
{
  f (0);
  return 0; 
}

[Bug c++/77522] New: ICE on invalid code C++14 code: in tsubst_decl, at cp/pt.c:12447

2016-09-07 Thread su at cs dot ucdavis.edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77522

Bug ID: 77522
   Summary: ICE on invalid code C++14 code: in tsubst_decl, at
cp/pt.c:12447
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: su at cs dot ucdavis.edu
  Target Milestone: ---

It affects 4.9.x and later, which have C++14 support. 


$ g++-trunk -c -std=c++14 small.cpp
small.cpp: In instantiation of ‘f(T)::<lambda()> [with T = int]’:
small.cpp:3:18:   required from ‘struct f(T) [with T = int]::<lambda()>’
small.cpp:3:8:   required from ‘void f(T) [with T = int]’
small.cpp:8:7:   required from here
small.cpp:3:24: error: use of ‘f(T) [with T = int]::<lambda()>::__a’ before
deduction of ‘auto’
   auto g = [&a = f] () {};
                        ^
small.cpp:3:24: error: use of ‘f(T) [with T = int]::<lambda()>::__a’ before
deduction of ‘auto’
small.cpp:3:24: internal compiler error: in tsubst_decl, at cp/pt.c:12447
0x6fe7c6 tsubst_decl
../../gcc-source-trunk/gcc/cp/pt.c:12447
0x6f32de tsubst(tree_node*, tree_node*, int, tree_node*)
../../gcc-source-trunk/gcc/cp/pt.c:12907
0x6e07fa tsubst_expr(tree_node*, tree_node*, int, tree_node*, bool)
../../gcc-source-trunk/gcc/cp/pt.c:15284
0x6dfa43 tsubst_expr(tree_node*, tree_node*, int, tree_node*, bool)
../../gcc-source-trunk/gcc/cp/pt.c:15419
0x6dd120 instantiate_decl(tree_node*, int, bool)
../../gcc-source-trunk/gcc/cp/pt.c:22159
0x7266b5 instantiate_class_template_1
../../gcc-source-trunk/gcc/cp/pt.c:10346
0x7266b5 instantiate_class_template(tree_node*)
../../gcc-source-trunk/gcc/cp/pt.c:10416
0x7cbf13 complete_type(tree_node*)
../../gcc-source-trunk/gcc/cp/typeck.c:133
0x6eaf2c tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool,
bool)
../../gcc-source-trunk/gcc/cp/pt.c:17269
0x6df247 tsubst_expr(tree_node*, tree_node*, int, tree_node*, bool)
../../gcc-source-trunk/gcc/cp/pt.c:15936
0x5df520 tsubst_init
../../gcc-source-trunk/gcc/cp/pt.c:13966
0x6e3144 tsubst_expr(tree_node*, tree_node*, int, tree_node*, bool)
../../gcc-source-trunk/gcc/cp/pt.c:15334
0x6dec3b tsubst_expr(tree_node*, tree_node*, int, tree_node*, bool)
../../gcc-source-trunk/gcc/cp/pt.c:15228
0x6dfa43 tsubst_expr(tree_node*, tree_node*, int, tree_node*, bool)
../../gcc-source-trunk/gcc/cp/pt.c:15419
0x6dd120 instantiate_decl(tree_node*, int, bool)
../../gcc-source-trunk/gcc/cp/pt.c:22159
0x72a5f2 instantiate_pending_templates(int)
../../gcc-source-trunk/gcc/cp/pt.c:22278
0x76f597 c_parse_final_cleanups()
../../gcc-source-trunk/gcc/cp/decl2.c:4617
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
$


-


template < class T = int > void f (T)
{ 
  auto g = [&a = f] () {};
}

int main ()
{ 
  f (0);
  return 0;
}

Re: [PATCH GCC 9/9]Prove no-overflow in computation of LOOP_VINFO_NITERS and improve code generation

2016-09-07 Thread kugan

Hi Bin,

On 07/09/16 17:52, Bin.Cheng wrote:

On Wed, Sep 7, 2016 at 1:10 AM, kugan  wrote:

Hi Bin,


On 07/09/16 04:54, Bin Cheng wrote:


Hi,
LOOP_VINFO_NITERS is computed as LOOP_VINFO_NITERSM1 + 1, which could
overflow in the loop niters' type.  The vectorizer needs to generate more code
to compute the vectorized niters if overflow does happen.  However, for common
loops there is actually no overflow; this patch tries to prove the
no-overflow information and use it to improve code generation.  At the
moment, no-overflow information comes either from loop niter analysis, or
from the fact that the loop is known to be peeled for a non-zero number of
iterations in prologue peeling.  For the latter case, it doesn't matter whether
the original LOOP_VINFO_NITERS overflows or not, because the computation
LOOP_VINFO_NITERS - LOOP_VINFO_PEELING_FOR_ALIGNMENT cancels the overflow by
underflow.

Thanks,
bin

2016-09-01  Bin Cheng  

* tree-vect-loop.c (loop_niters_no_overflow): New func.
(vect_transform_loop): Call loop_niters_no_overflow.  Pass the
no-overflow information to vect_do_peeling_for_loop_bound and
vect_gen_vector_loop_niters.


009-prove-no_overflow-for-vect-niters-20160902.txt


diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 0d37f55..2ef1f9b 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -6610,6 +6610,38 @@ vect_loop_kill_debug_uses (struct loop *loop,
gimple *stmt)
 }
 }

+/* Given loop represented by LOOP_VINFO, return true if computation of
+   LOOP_VINFO_NITERS (= LOOP_VINFO_NITERSM1 + 1) doesn't overflow, false
+   otherwise.  */
+
+static bool
+loop_niters_no_overflow (loop_vec_info loop_vinfo)
+{
+  /* Constant case.  */
+  if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
+{
+  int cst_niters = LOOP_VINFO_INT_NITERS (loop_vinfo);



Wouldn't it truncate by assigning this to int?

Probably, now I think it's unnecessary to use int version niters here,
LOOP_VINFO_NITERS can be used directly.




+  tree cst_nitersm1 = LOOP_VINFO_NITERSM1 (loop_vinfo);
+
+  gcc_assert (TREE_CODE (cst_nitersm1) == INTEGER_CST);
+  if (wi::to_widest (cst_nitersm1) < cst_niters)



Shouldn't you do the addition and comparison in the type of the loop
index instead of widest_int to see if that overflows?

You mean the type of loop niters?  NITERS is computed from NITERSM1 +
1, I don't think we need to do it again here.


Imagine that you have LOOP_VINFO_NITERSM1 as TYPE_MAX (loop niters 
type). In this case, when you add 1, it will overflow in loop niters 
type but not when you do the computation in widest_int.


But, as you said, if NITERS is already computed in loop niters type, yes 
this compare should be sufficient.


You could do the comparison as wide_int or tree. I think, this would 
make it clearer.


Thanks,
Kugan
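
The point discussed above, that NITERSM1 + 1 can wrap in the niters type while the same sum would not wrap in a wider type, can be shown with a small sketch.  It assumes a 32-bit unsigned niters type purely for illustration; niters_no_overflow is a hypothetical name, not the function from the patch.

```cpp
#include <cstdint>

// NITERS is computed in the niters type itself, so the addition is
// modular: it wraps to 0 exactly when NITERSM1 is the type's maximum.
// Comparing the already-wrapped result against NITERSM1 detects that.
bool niters_no_overflow(std::uint32_t nitersm1)
{
  std::uint32_t niters = nitersm1 + 1u;
  return nitersm1 < niters;
}
```

This is why the comparison is sufficient once NITERS has been computed in the niters type, whereas redoing the addition in a wider integer would never observe the wrap.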



Thanks,
bin



[Bug fortran/77507] gfortran rejects keyworded calls to procedures from intrinsic modules

2016-09-07 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77507

Dominique d'Humieres  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-09-08
 Ever confirmed|0   |1

--- Comment #3 from Dominique d'Humieres  ---
Confirmed.

[Bug target/77494] -mcpu=cortex-a53 does not allow use of crc extensions

2016-09-07 Thread buzz at exotica dot org.uk
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77494

--- Comment #6 from Jools Wills  ---
Thanks for the fix - I'll mark the other bug as resolved.

[Bug target/77494] -mcpu=cortex-a53 does not allow use of crc extensions

2016-09-07 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77494

--- Comment #5 from Richard Earnshaw  ---
(In reply to Jools Wills from comment #4)
> Thanks - reported here -
> https://sourceware.org/bugzilla/show_bug.cgi?id=20567

And already fixed in GAS earlier today.

[Bug fortran/77507] gfortran rejects keyworded calls to procedures from intrinsic modules

2016-09-07 Thread sgk at troutmask dot apl.washington.edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77507

--- Comment #2 from Steve Kargl  ---
On Wed, Sep 07, 2016 at 10:12:22PM +, kargl at gcc dot gnu.org wrote:
>
> This should have been 2 separate bug reports as ieee support
> is distinct from ISO C binding support.  This patch fixes
> the latter, but cannot be committed because you've tied it
> to the former.
> 

This appears to fix both issues.

Index: gcc/fortran/intrinsic.c
===
--- gcc/fortran/intrinsic.c (revision 240029)
+++ gcc/fortran/intrinsic.c (working copy)
@@ -1239,7 +1239,8 @@ add_functions (void)
 *z = "z", *ln = "len", *ut = "unit", *han = "handler",
 *num = "number", *tm = "time", *nm = "name", *md = "mode",
 *vl = "values", *p1 = "path1", *p2 = "path2", *com = "command",
-*ca = "coarray", *sub = "sub", *dist = "distance", *failed="failed";
+*ca = "coarray", *sub = "sub", *dist = "distance", *failed="failed",
+*c_ptr_1 = "c_ptr_1", *c_ptr_2 = "c_ptr_2";

   int di, dr, dd, dl, dc, dz, ii;

@@ -2914,8 +2915,8 @@ add_functions (void)
   /* The following functions are part of ISO_C_BINDING.  */
   add_sym_2 ("c_associated", GFC_ISYM_C_ASSOCIATED, CLASS_INQUIRY, ACTUAL_NO,
 BT_LOGICAL, dl, GFC_STD_F2003, gfc_check_c_associated, NULL, NULL,
-"C_PTR_1", BT_VOID, 0, REQUIRED,
-"C_PTR_2", BT_VOID, 0, OPTIONAL);
+c_ptr_1, BT_VOID, 0, REQUIRED,
+c_ptr_2, BT_VOID, 0, OPTIONAL);
   make_from_module();

   add_sym_1 ("c_loc", GFC_ISYM_C_LOC, CLASS_INQUIRY, ACTUAL_NO,
Index: libgfortran/ieee/ieee_arithmetic.F90
===
--- libgfortran/ieee/ieee_arithmetic.F90(revision 240029)
+++ libgfortran/ieee/ieee_arithmetic.F90(working copy)
@@ -857,12 +857,12 @@ contains

   ! IEEE_VALUE

-  elemental real(kind=4) function IEEE_VALUE_4(X, C) result(res)
-implicit none
+  elemental real(kind=4) function IEEE_VALUE_4(X, CLASS) result(res)
+
 real(kind=4), intent(in) :: X
-type(IEEE_CLASS_TYPE), intent(in) :: C
+type(IEEE_CLASS_TYPE), intent(in) :: CLASS

-select case (C%hidden)
+select case (CLASS%hidden)
   case (1) ! IEEE_SIGNALING_NAN
 res = -1
 res = sqrt(res)
@@ -895,12 +895,12 @@ contains
  end select
   end function

-  elemental real(kind=8) function IEEE_VALUE_8(X, C) result(res)
-implicit none
+  elemental real(kind=8) function IEEE_VALUE_8(X, CLASS) result(res)
+
 real(kind=8), intent(in) :: X
-type(IEEE_CLASS_TYPE), intent(in) :: C
+type(IEEE_CLASS_TYPE), intent(in) :: CLASS

-select case (C%hidden)
+select case (CLASS%hidden)
   case (1) ! IEEE_SIGNALING_NAN
 res = -1
 res = sqrt(res)
@@ -934,12 +934,12 @@ contains
   end function

 #ifdef HAVE_GFC_REAL_10
-  elemental real(kind=10) function IEEE_VALUE_10(X, C) result(res)
-implicit none
+  elemental real(kind=10) function IEEE_VALUE_10(X, CLASS) result(res)
+
 real(kind=10), intent(in) :: X
-type(IEEE_CLASS_TYPE), intent(in) :: C
+type(IEEE_CLASS_TYPE), intent(in) :: CLASS

-select case (C%hidden)
+select case (CLASS%hidden)
   case (1) ! IEEE_SIGNALING_NAN
 res = -1
 res = sqrt(res)
@@ -971,15 +971,16 @@ contains
 res = 0
  end select
   end function
+
 #endif

 #ifdef HAVE_GFC_REAL_16
-  elemental real(kind=16) function IEEE_VALUE_16(X, C) result(res)
-implicit none
+  elemental real(kind=16) function IEEE_VALUE_16(X, CLASS) result(res)
+
 real(kind=16), intent(in) :: X
-type(IEEE_CLASS_TYPE), intent(in) :: C
+type(IEEE_CLASS_TYPE), intent(in) :: CLASS

-select case (C%hidden)
+select case (CLASS%hidden)
   case (1) ! IEEE_SIGNALING_NAN
 res = -1
 res = sqrt(res)

Re: Make max_align_t respect _Float128 [version 2]

2016-09-07 Thread Paul Eggert

On 09/07/2016 04:52 AM, Mark Wielaard wrote:

If valgrind believes the
memory isn't in valid memory then it will complain about an invalid access.
But if the memory is accessible but uninitialised then it will just track
the undefinedness and complain later if such a value is used.


I think the former is what happened in Gnulib fts.c before Gnulib was fixed.


valgrind also has --partial-loads-ok (which in newer versions defaults
to =yes):

Controls how Memcheck handles 32-, 64-, 128- and 256-bit naturally
aligned loads from addresses for which some bytes are addressable
and others are not.


Although this helps in some cases, it does not suffice in general since 
the problem can occur with 16-bit aligned loads. I think that is what 
happened with fts.c.



Does anybody have an example program of the above issue compiled with
gcc that produces false positives with valgrind?



Sure, attached. On Fedora 24 x86-64 (GCC 6.1.1 20160621, valgrind 
3.11.0), when I compile with "gcc -O2 flexouch.c" and run with "valgrind 
./a.out", valgrind complains "Invalid read of size 2". This is because 
GCC compiles "p->d[0] == 2 && p->d[1] == 3" into "cmpw $770, 8(%rax); 
sete %al", which loads the uninitialized byte p->d[1] simultaneously 
with the initialized byte p->d[0].


As mentioned previously, although flexouch.c does not conform to C11, 
this is arguably a defect in C11.


#include <stddef.h>
#include <stdlib.h>

struct s { struct s *next; char d[]; };

int
main (void)
{
  struct s *p = malloc (offsetof (struct s, d) + 1);
  p->d[0] = 1;
  return p->d[0] == 2 && p->d[1] == 3;
}
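
The constant in the quoted "cmpw $770" follows from the two byte comparisons: 770 is 0x0302, that is, byte 2 at the lower address and byte 3 at the higher one on a little-endian target.  A minimal sketch of the equivalence, with hypothetical function names and the little-endian load modelled with shifts so that it does not depend on the host:

```cpp
#include <cstdint>

// What the source says: two separate byte tests.
bool separate_compare(unsigned char d0, unsigned char d1)
{
  return d0 == 2 && d1 == 3;
}

// What the fused instruction does: one 16-bit little-endian load
// of both bytes, compared against 0x0302 == 770.
bool fused_compare(unsigned char d0, unsigned char d1)
{
  std::uint16_t word = static_cast<std::uint16_t>(d0 | (d1 << 8));
  return word == 770;
}
```

The two forms give the same answer, but the fused one always reads the second byte, which is what valgrind reports in the example program.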


[Bug libgcc/77519] [5/6 Regression] complex multiply excess precision handling inverted

2016-09-07 Thread jsm28 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77519

Joseph S. Myers  changed:

   What|Removed |Added

   Target Milestone|--- |5.5
Summary|[5/6/7 Regression] complex  |[5/6 Regression] complex
   |multiply excess precision   |multiply excess precision
   |handling inverted   |handling inverted

--- Comment #3 from Joseph S. Myers  ---
Fixed for GCC 7.  Not yet fixed for GCC 5 or 6.

[Bug libgcc/77519] [5/6/7 Regression] complex multiply excess precision handling inverted

2016-09-07 Thread jsm28 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77519

--- Comment #2 from Joseph S. Myers  ---
Author: jsm28
Date: Wed Sep  7 23:02:56 2016
New Revision: 240033

URL: https://gcc.gnu.org/viewcvs?rev=240033&root=gcc&view=rev
Log:
Correct libgcc complex multiply excess precision handling (PR libgcc/77519).

libgcc complex multiply is meant to eliminate excess
precision from certain internal values by forcing them to memory in
exactly those cases where the type has excess precision.  But in
https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01894.html I
accidentally inverted the logic so that values get forced to memory in
exactly the cases where it's not needed.  (This is a pessimization in
the no-excess-precision case, in principle could lead to bad results
depending on code generation in the excess-precision case.  Note: I do
not have a test demonstrating bad results.)

Bootstrapped with no regressions on x86_64-pc-linux-gnu.  Code size
went down on x86_64 as expected; old sizes:

   text    data     bss     dec     hex filename
    887       0       0     887     377 _muldc3.o
    810       0       0     810     32a _mulsc3.o
   2032       0       0    2032     7f0 _multc3.o
    983       0       0     983     3d7 _mulxc3.o

New sizes:

    847       0       0     847     34f _muldc3.o
    770       0       0     770     302 _mulsc3.o
   2032       0       0    2032     7f0 _multc3.o
    951       0       0     951     3b7 _mulxc3.o

PR libgcc/77519
* libgcc2.c (NOTRUNC): Invert settings.

Modified:
trunk/libgcc/ChangeLog
trunk/libgcc/libgcc2.c

Re: [Patch libgcc] Enable HCmode multiply and divide (mulhc3/divhc3)

2016-09-07 Thread Bernd Schmidt

On 09/07/2016 06:05 PM, James Greenhalgh wrote:


Hi,

This patch arranges for half-precision complex multiply and divide
routines to be built if __LIBGCC_HAS_HF_MODE__.  This will be true
if the target supports the _Float16 type.

OK?


Ok, but please see Joseph's patch from today: I think you probably want 
to invert the NOTRUNC case here as well.



Bernd


Re: Correct libgcc complex multiply excess precision handling

2016-09-07 Thread Bernd Schmidt

On 09/07/2016 11:48 PM, Joseph Myers wrote:

libgcc complex multiply is meant to eliminate excess
precision from certain internal values by forcing them to memory in
exactly those cases where the type has excess precision.  But in
https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01894.html I
accidentally inverted the logic so that values get forced to memory in
exactly the cases where it's not needed.  (This is a pessimization in
the no-excess-precision case, in principle could lead to bad results
depending on code generation in the excess-precision case.  Note: I do
not have a test demonstrating bad results.)


Ok.


Bernd



[Bug fortran/77507] gfortran rejects keyworded calls to procedures from intrinsic modules

2016-09-07 Thread kargl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77507

kargl at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kargl at gcc dot gnu.org

--- Comment #1 from kargl at gcc dot gnu.org ---
This should have been 2 separate bug reports as ieee support
is distinct from ISO C binding support.  This patch fixes
the latter, but cannot be committed because you've tied it
to the former.

Index: intrinsic.c
===
--- intrinsic.c (revision 240029)
+++ intrinsic.c (working copy)
@@ -1239,7 +1239,8 @@ add_functions (void)
 *z = "z", *ln = "len", *ut = "unit", *han = "handler",
 *num = "number", *tm = "time", *nm = "name", *md = "mode",
 *vl = "values", *p1 = "path1", *p2 = "path2", *com = "command",
-*ca = "coarray", *sub = "sub", *dist = "distance", *failed="failed";
+*ca = "coarray", *sub = "sub", *dist = "distance", *failed="failed",
+*c_ptr_1 = "c_ptr_1", *c_ptr_2 = "c_ptr_2";

   int di, dr, dd, dl, dc, dz, ii;

@@ -2914,8 +2915,8 @@ add_functions (void)
   /* The following functions are part of ISO_C_BINDING.  */
   add_sym_2 ("c_associated", GFC_ISYM_C_ASSOCIATED, CLASS_INQUIRY, ACTUAL_NO,
 BT_LOGICAL, dl, GFC_STD_F2003, gfc_check_c_associated, NULL, NULL,
-"C_PTR_1", BT_VOID, 0, REQUIRED,
-"C_PTR_2", BT_VOID, 0, OPTIONAL);
+c_ptr_1, BT_VOID, 0, REQUIRED,
+c_ptr_2, BT_VOID, 0, OPTIONAL);
   make_from_module();

   add_sym_1 ("c_loc", GFC_ISYM_C_LOC, CLASS_INQUIRY, ACTUAL_NO,

Correct libgcc complex multiply excess precision handling

2016-09-07 Thread Joseph Myers
libgcc complex multiply is meant to eliminate excess
precision from certain internal values by forcing them to memory in
exactly those cases where the type has excess precision.  But in
https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01894.html I
accidentally inverted the logic so that values get forced to memory in
exactly the cases where it's not needed.  (This is a pessimization in
the no-excess-precision case, in principle could lead to bad results
depending on code generation in the excess-precision case.  Note: I do
not have a test demonstrating bad results.)

Bootstrapped with no regressions on x86_64-pc-linux-gnu.  Code size
went down on x86_64 as expected; old sizes:

   text    data     bss     dec     hex filename
    887       0       0     887     377 _muldc3.o
    810       0       0     810     32a _mulsc3.o
   2032       0       0    2032     7f0 _multc3.o
    983       0       0     983     3d7 _mulxc3.o

New sizes:

    847       0       0     847     34f _muldc3.o
    770       0       0     770     302 _mulsc3.o
   2032       0       0    2032     7f0 _multc3.o
    951       0       0     951     3b7 _mulxc3.o

OK to commit?

2016-09-07  Joseph Myers  

PR libgcc/77519
* libgcc2.c (NOTRUNC): Invert settings.

Index: libgcc/libgcc2.c
===
--- libgcc/libgcc2.c(revision 240028)
+++ libgcc/libgcc2.c(working copy)
@@ -1866,25 +1866,25 @@
 # define CTYPE SCtype
 # define MODE  sc
 # define CEXT  __LIBGCC_SF_FUNC_EXT__
-# define NOTRUNC __LIBGCC_SF_EXCESS_PRECISION__
+# define NOTRUNC (!__LIBGCC_SF_EXCESS_PRECISION__)
 #elif defined(L_muldc3) || defined(L_divdc3)
 # define MTYPE DFtype
 # define CTYPE DCtype
 # define MODE  dc
 # define CEXT  __LIBGCC_DF_FUNC_EXT__
-# define NOTRUNC __LIBGCC_DF_EXCESS_PRECISION__
+# define NOTRUNC (!__LIBGCC_DF_EXCESS_PRECISION__)
 #elif defined(L_mulxc3) || defined(L_divxc3)
 # define MTYPE XFtype
 # define CTYPE XCtype
 # define MODE  xc
 # define CEXT  __LIBGCC_XF_FUNC_EXT__
-# define NOTRUNC __LIBGCC_XF_EXCESS_PRECISION__
+# define NOTRUNC (!__LIBGCC_XF_EXCESS_PRECISION__)
 #elif defined(L_multc3) || defined(L_divtc3)
 # define MTYPE TFtype
 # define CTYPE TCtype
 # define MODE  tc
 # define CEXT  __LIBGCC_TF_FUNC_EXT__
-# define NOTRUNC __LIBGCC_TF_EXCESS_PRECISION__
+# define NOTRUNC (!__LIBGCC_TF_EXCESS_PRECISION__)
 #else
 # error
 #endif

-- 
Joseph S. Myers
jos...@codesourcery.com
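
As a rough illustration of the logic being inverted, here is a sketch under loud assumptions: HAS_EXCESS_PRECISION is a made-up stand-in for the __LIBGCC_*_EXCESS_PRECISION__ macros, TRUNC is modelled with a volatile store rather than whatever libgcc2.c actually uses, and real_part_of_product is not a libgcc function.

```cpp
// Assumed stand-in: nonzero when the type is evaluated with excess precision.
#ifndef HAS_EXCESS_PRECISION
# define HAS_EXCESS_PRECISION 0
#endif

// After the fix: no truncation is needed exactly when there is
// no excess precision.
#define NOTRUNC (!HAS_EXCESS_PRECISION)

#if NOTRUNC
# define TRUNC(x) static_cast<void>(0)
#else
// Round-trip through memory to drop any excess precision.
# define TRUNC(x) do { volatile double t_ = (x); (x) = t_; } while (0)
#endif

// Real part of (a + ib) * (c + id), with the intermediate products
// truncated only when the configuration requires it.
double real_part_of_product(double a, double b, double c, double d)
{
  double ac = a * c;
  double bd = b * d;
  TRUNC(ac);
  TRUNC(bd);
  return ac - bd;
}
```

With the pre-fix definition (NOTRUNC equal to the excess-precision flag) the store would be emitted precisely where it is unnecessary, which matches the code size reduction reported above.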


[Bug c/77521] New: %qc format directive should quote non-printable characters

2016-09-07 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77521

Bug ID: 77521
   Summary: %qc format directive should quote non-printable
characters
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: msebor at gcc dot gnu.org
  Target Milestone: ---

The GCC-specific %qc format directive prints its character argument in quotes. 
One might expect the directive to quote non-printable characters similarly to
the %s directive but that's not what happens.  %qc prints the character as is. 
As a result, callers of the warning_at and error APIs that use the %qc
directive must be careful not to call it with non-printable characters (for
instance, by using the ISGRAPH() macro as done in c-family/c-format.c) and use
a different directive for those.  Those that don't might end up corrupting the
compiler stderr output as in the test case below.  Those that are careful end
up using an alternate directive (e.g., %x or %o) resulting in
inconsistent diagnostics.

This bug is to change the %qc directive in the GCC pretty printer to format
non-printable characters using some other directive than the C %c (for example,
"\x%x" as done in c-family/c-format.c).

$ cat t.c && /build/gcc-trunk/gcc/xgcc -B /build/gcc-trunk/gcc -S -Wformat t.c
void g (int foo, int bar)
{
  asm ("combine %2, %0" : "=r" (foo) : "0" (foo), "\n" (bar));
}
t.c: In function ‘g’:
t.c:3:3: error: invalid punctuation ‘
’ in constraint
   asm ("combine %2, %0" : "=r" (foo) : "0" (foo), "\n" (bar));
   ^~~
t.c:3:3: error: invalid punctuation ‘
’ in constraint


The following is a list of GCC formatted output functions with the %qc
directive:

find gcc -name "*.c" ! -path "*/testsuite/*" | xargs grep "%qc")
gcc/fortran/io.c:  const char *unexpected_element  = _("Unexpected element %qc
in format "
gcc/fortran/matchexp.c: gfc_error ("Bad character %qc in OPERATOR name at %C",
name[i]);
gcc/fortran/symbol.c: gfc_error ("Letter %qc already set in IMPLICIT
statement at %C",
gcc/fortran/symbol.c: gfc_error ("Letter %qc already has an IMPLICIT
type at %C",
gcc/c-family/c-lex.c: error_at (*loc, "stray %qc in program", (int) c);
gcc/c-family/c-format.c:" %qc in format",
gcc/c-family/c-format.c:  "use of %qs length
modifier with %qc type"
grep: gcc/cp/.#decl.c: No such file or directory
gcc/stmt.c: warning (0, "output constraint %qc for operand %d "
gcc/stmt.c: error ("input operand constraint contains %qc",
constraint[j]);
gcc/stmt.c: error ("invalid punctuation %qc in constraint",
constraint[j]);
gcc/config/mmix/mmix.c:  internal_error ("MMIX Internal: Missing %qc case
in mmix_print_operand", code);
gcc/config/avr/driver-avr.c:error ("strange device name %qs after %qs:
bad character %qc",
gcc/gcc.c:  error ("spec failure: unrecognized spec option %qc", c);
gcc/gcc.c:  fatal_error (input_location, "braced spec %qs is invalid at %qc",
orig, *p);
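
The behaviour requested in this report, printing graphic characters as is and everything else as a hex escape, could look roughly like the sketch below.  It is not the GCC pretty printer: quote_char is a hypothetical helper, std::isgraph stands in for GCC's ISGRAPH, and plain ASCII quotes replace the locale-dependent ones.

```cpp
#include <cctype>
#include <cstdio>
#include <string>

// Quote C for a diagnostic: graphic characters verbatim, anything else
// as \xNN.  The cast to unsigned char avoids sign extension for
// characters above 0x7f.
std::string quote_char(char c)
{
  unsigned char uc = static_cast<unsigned char>(c);
  char buf[8];
  if (std::isgraph(uc))
    std::snprintf(buf, sizeof buf, "'%c'", c);
  else
    std::snprintf(buf, sizeof buf, "'\\x%02x'", static_cast<unsigned>(uc));
  return buf;
}
```

With such a helper the newline in the asm constraint test case would be shown as an escape instead of breaking the diagnostic across two lines.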

[Bug c++/71710] [7 Regression] ICE on valid C++11 code with decltype and alias template: in lookup_member, at cp/search.c:1255

2016-09-07 Thread su at cs dot ucdavis.edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71710

--- Comment #3 from Zhendong Su  ---
A related, but simpler test that triggers the same ICE: 





template < typename > struct A
{
  A a;
  template < int > using B = decltype (a);
  B < 0 > b;
};

[Bug fortran/48298] [F03] User-Defined Derived-Type IO (DTIO)

2016-09-07 Thread pault at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48298

--- Comment #19 from Paul Thomas  ---
Author: pault
Date: Wed Sep  7 21:21:16 2016
New Revision: 240032

URL: https://gcc.gnu.org/viewcvs?rev=240032&root=gcc&view=rev
Log:
2016-09-07  Dominique Dhumieres  

PR fortran/48298
* gfortran.dg/assumed_rank_12.f90: Correct tree scan.
* gfortran.dg/assumed_type_2.f90: Correct tree scans.
* gfortran.dg/coarray_lib_comm_1.f90: Likewise.
* gfortran.dg/coarray_lib_this_image_2.f90: Likewise.
* gfortran.dg/coarray_lock_7.f90: Likewise.
* gfortran.dg/coarray_stat_function.f90: Likewise.
* gfortran.dg/no_arg_check_2.f90: Likewise.
* gfortran.dg/pr32921.f: Likewise.

Modified:
branches/fortran-dev/gcc/testsuite/ChangeLog.fortran-dev
branches/fortran-dev/gcc/testsuite/gfortran.dg/assumed_rank_12.f90
branches/fortran-dev/gcc/testsuite/gfortran.dg/assumed_type_2.f90
branches/fortran-dev/gcc/testsuite/gfortran.dg/coarray_lib_comm_1.f90
branches/fortran-dev/gcc/testsuite/gfortran.dg/coarray_lib_this_image_2.f90
branches/fortran-dev/gcc/testsuite/gfortran.dg/coarray_lock_7.f90
branches/fortran-dev/gcc/testsuite/gfortran.dg/coarray_stat_function.f90
branches/fortran-dev/gcc/testsuite/gfortran.dg/no_arg_check_2.f90
branches/fortran-dev/gcc/testsuite/gfortran.dg/pr32921.f

[Bug libgcc/77519] [5/6/7 Regression] complex multiply excess precision handling inverted

2016-09-07 Thread jsm28 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77519

Joseph S. Myers  changed:

   What|Removed |Added

Summary|[5/6/7 Regression] complex  |[5/6/7 Regression] complex
   |multiply / divide excess|multiply excess precision
   |precision handling inverted |handling inverted

--- Comment #1 from Joseph S. Myers  ---
Correction: only multiply is affected, not divide.

Re: Make max_align_t respect _Float128 [version 2]

2016-09-07 Thread Joseph Myers
On Wed, 7 Sep 2016, Bernd Edlinger wrote:

> Apparently the different -msse default setting made the situation worse.
> I think that will not run on a pentium4 any more.

I think that's x86_64-* defaulting to an x86_64 processor (which implies 
SSE2 support) even with -m32 (unless a --with-arch-32= option is used to 
select a different 32-bit default).

-- 
Joseph S. Myers
jos...@codesourcery.com


[Bug c/77520] wrong value for extended ASCII characters in -Wformat message

2016-09-07 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77520

Martin Sebor  changed:

   What|Removed |Added

   Keywords||diagnostic
   Severity|normal  |minor

[Bug c/77520] New: wrong value for extended ASCII characters in -Wformat message

2016-09-07 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77520

Bug ID: 77520
   Summary: wrong value for extended ASCII characters in -Wformat
message
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: msebor at gcc dot gnu.org
  Target Milestone: ---

The argument_parser::find_format_char_info function in c-family/c-format.c
passes the value of an unexpected format character whose type is char as an
argument to a %x directive.  When char is a signed type and the value is in the
extended ASCII range ('\x80' and greater) it is sign-extended and yields a very
large value on output as in the test case below.  The character should be cast
to unsigned char to avoid the sign extension.  The following test case shows
the problem:

$ cat t.c && /home/msebor/build/gcc-49905/gcc/xgcc
-B/home/msebor/build/gcc-49905/gcc -S -Wformat t.c
void f (void)
{
  __builtin_printf ("%\x80");
}
t.c: In function ‘f’:
t.c:3:23: warning: unknown conversion type character 0xff80 in format
[-Wformat=]
   __builtin_printf ("%\x80");
   ^~~~

Clang produces better output:

t.c:3:23: warning: invalid conversion specifier '\x80'
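
The sign extension described in the report can be reproduced outside GCC with a small sketch. The two helper names below are hypothetical and are not taken from c-format.c; they only contrast widening a signed char directly with casting it to unsigned char first, which is the fix the report suggests.

```c
#include <stdio.h>

/* Hypothetical helper: mimics handing a signed char to a %x directive.
   The value is widened as a signed quantity, so anything at or above
   0x80 is sign-extended.  */
static void
format_sign_extended (signed char c, char *buf, size_t len)
{
  snprintf (buf, len, "0x%x", (unsigned int) c);
}

/* Hypothetical helper: the fix suggested in the report, casting to
   unsigned char before the value is widened.  */
static void
format_fixed (signed char c, char *buf, size_t len)
{
  snprintf (buf, len, "0x%x", (unsigned int) (unsigned char) c);
}
```

Assuming a 32-bit int, calling the first helper with the byte 0x80 (that is, (signed char) -128) produces 0xffffff80, while the second produces 0x80.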

Re: [Fortran, Patch] First patch for coarray FAILED IMAGES (TS 18508)

2016-09-07 Thread Alessandro Fanfarillo
Dear all,
the attached patch supports failed images also when -fcoarray=single is used.

Built and regtested on x86_64-pc-linux-gnu.

Cheers,
Alessandro

2016-08-09 5:22 GMT-06:00 Paul Richard Thomas :
> Hi Sandro,
>
> As far as I can see, this is OK barring a couple of minor wrinkles and
> a question:
>
> For coarray_failed_images_err.f90 and coarray_image_status_err.f90 you
> have used the option -fdump-tree-original without making use of the
> tree dump.
>
> Mikael asked you to provide an executable test with -fcoarray=single.
> Is this not possible for some reason?
>
> Otherwise, this is OK for trunk.
>
> Thanks for the patch.
>
> Paul
>
> On 4 August 2016 at 05:07, Alessandro Fanfarillo
>  wrote:
>> * PING *
>>
>> 2016-07-21 13:05 GMT-06:00 Alessandro Fanfarillo :
>>> Dear Mikael and all,
>>>
>>> in attachment the new patch, built and regtested on x86_64-pc-linux-gnu.
>>>
>>> Cheers,
>>> Alessandro
>>>
>>> 2016-07-20 13:17 GMT-06:00 Mikael Morin :
 On 20/07/2016 at 11:39, Andre Vehreschild wrote:
>
> Hi Mikael,
>
>
>>> +  if(st == ST_FAIL_IMAGE)
>>> +new_st.op = EXEC_FAIL_IMAGE;
>>> +  else
>>> +gcc_unreachable();
>>
>> You can use
>> gcc_assert (st == ST_FAIL_IMAGE);
>> foo...;
>> instead of
>> if (st == ST_FAIL_IMAGE)
>> foo...;
>> else
>> gcc_unreachable ();
>
>
> Be careful, this is not 100% identical in the general case. For older
> gcc versions (gcc < 4008) gcc_assert() is mapped to nothing, esp. not to
> an abort(), so the behavior can change. But in this case everything is
> fine, because the patch is most likely not backported.
>
 Didn't know about this. The difference seems to be very subtle.
 I don't mind much anyway. The original version can stay if preferred, this
 was just a suggestion.

 By the way, if the function is inlined in its single caller, the assert or
 unreachable statement can be removed, which avoids choosing between them.
 That's another suggestion.


>>> +
>>> +  return MATCH_YES;
>>> +
>>> + syntax:
>>> +  gfc_syntax_error (st);
>>> +
>>> +  return MATCH_ERROR;
>>> +}
>>> +
>>> +match
>>> +gfc_match_fail_image (void)
>>> +{
>>> +  /* if (!gfc_notify_std (GFC_STD_F2008_TS, "FAIL IMAGE statement
>>> at %C")) */
>>> +  /*   return MATCH_ERROR; */
>>> +
>>
>> Can this be uncommented?
>>
>>> +  return fail_image_statement (ST_FAIL_IMAGE);
>>> +}
>>>
>>>  /* Match LOCK/UNLOCK statement. Syntax:
>>>   LOCK ( lock-variable [ , lock-stat-list ] )
>>> diff --git a/gcc/fortran/trans-intrinsic.c
>>> b/gcc/fortran/trans-intrinsic.c index 1aaf4e2..b2f5596 100644
>>> --- a/gcc/fortran/trans-intrinsic.c
>>> +++ b/gcc/fortran/trans-intrinsic.c
>>> @@ -1647,6 +1647,24 @@ trans_this_image (gfc_se * se, gfc_expr
>>> *expr) m, lbound));
>>>  }
>>>
>>> +static void
>>> +gfc_conv_intrinsic_image_status (gfc_se *se, gfc_expr *expr)
>>> +{
>>> +  unsigned int num_args;
>>> +  tree *args,tmp;
>>> +
>>> +  num_args = gfc_intrinsic_argument_list_length (expr);
>>> +  args = XALLOCAVEC (tree, num_args);
>>> +
>>> +  gfc_conv_intrinsic_function_args (se, expr, args, num_args);
>>> +
>>> +  if (flag_coarray == GFC_FCOARRAY_LIB)
>>> +{
>>
>> Can everything be put under the if?
>> Does it work with -fcoarray=single?
>
>
> IMO coarray=single should not generate code here, therefore putting
> everything under the if should be fine.
>
 My point was more avoiding generating code for the arguments if they are
 not used in the end.
 Regarding the -fcoarray=single case, the function returns a result, which
 can be used in an expression, so I don't think it will work without at
 least hardcoding a fixed value as result in that case.
 But even that wouldn't be enough, as the function wouldn't work
 consistently with the fail image statement.

> Sorry for the comments ...
>
 Comments are welcome here, as far as I know. ;-)

 Mikael
>
>
>
> --
> The difference between genius and stupidity is; genius has its limits.
>
> Albert Einstein
commit 13213642603b4941a2e4ea085b0bfd5cb37f
Author: Alessandro Fanfarillo 
Date:   Wed Sep 7 13:00:17 2016 -0600

Second Review of failed image patch

diff --git a/gcc/fortran/check.c b/gcc/fortran/check.c
index ff5e80b..110bec0 100644
--- a/gcc/fortran/check.c
+++ b/gcc/fortran/check.c
@@ -1217,6 +1217,82 @@ gfc_check_event_query (gfc_expr *event, gfc_expr *count, gfc_expr *stat)
   return true;
 }
 
+bool
+gfc_check_image_status (gfc_expr *image, gfc_expr 

Re: [Patch RFC] Modify excess precision logic to permit FLT_EVAL_METHOD=16

2016-09-07 Thread Joseph Myers
On Wed, 7 Sep 2016, Joseph Myers wrote:

> How about instead having more than one target macro / hook.  One would 
> indicate that excess precision is used by insn patterns (and be set only 
> for i386 and m68k).  Another would indicate the API-level excess precision 

Or, maybe there would be a single hook taking a tristate argument.

Target hooks need to provide the following information:

(a) What excess precision is implicit in the insn patterns (that is, when 
a value described in middle-end IR as having a particular mode/type may in 
fact implicitly have extra range and precision because insn patterns 
produce results with such extra range and precision, not just with the 
range and precision of the output mode).

(b) What excess precision should be added explicitly to the IR by the 
front end in "fast" mode.

(c) What excess precision should be added explicitly to the IR by the 
front end in "standard" mode.

All of these may be represented by FLT_EVAL_METHOD values.  In what 
follows, "none" means no excess precision (whether the value is 0 or 16); 
"unpredictable" means inherently unpredictable even in the absence of 
register spills (like -mfpmath=sse+387), so FLT_EVAL_METHOD == -1.  In 
principle there could be cases of predictable excess precision that have 
to map to -1 because there is no other value they could map to, but in 
practice I don't expect that to be an issue (given the TS 18661-3 values 
of FLT_EVAL_METHOD being available).
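
As a rough illustration of the vocabulary just introduced, the sketch below sorts FLT_EVAL_METHOD values into the three buckets used in this mail. The enum and function names are hypothetical, not GCC hooks; the only facts assumed are the ones stated above (0 and 16 both mean "none", -1 means "unpredictable") and that any other value describes some predictable promotion.

```c
/* Hypothetical classification of FLT_EVAL_METHOD values, following the
   terminology of this mail.  Not a GCC interface.  */
enum excess_precision_kind
{
  EXCESS_NONE,          /* 0 or 16: no excess precision */
  EXCESS_UNPREDICTABLE, /* -1: inherently unpredictable */
  EXCESS_PREDICTABLE    /* anything else: a fixed, known promotion */
};

static enum excess_precision_kind
classify_flt_eval_method (int method)
{
  if (method == 0 || method == 16)
    return EXCESS_NONE;
  if (method == -1)
    return EXCESS_UNPREDICTABLE;
  return EXCESS_PREDICTABLE;
}
```

In these terms, (a) may yield any of the three kinds, whereas (b) and (c) would never yield EXCESS_UNPREDICTABLE.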

(a) should always be "none" except for the existing x86 and m68k cases.

(b) is "none" in all existing cases, but we have the issue of ARM cases 
without direct binary16 arithmetic where it would be desirable for it to 
apply excess precision to _Float16 values.

(c) is not "none" at present in exactly those cases where 
TARGET_FLT_EVAL_METHOD is nonzero (but in the case of s390 this is really 
a target bug that should be fixed).  It might also be not "none" in future 
for ARM cases like in (b).

(a) can be "unpredictable".  (b) and (c) never can.

Rather than init_excess_precision setting flag_excess_precision, possibly 
turning "standard" into "fast", I think it should set some variable that 
describes the result of whichever of (b) and (c) is applicable - and in 
the cases where "standard" turns into "fast", it would simply happen that 
both (b) and (c) produce the same result.

The effective excess precision seen by the user is the result of applying 
first one of (b) and (c), then (a).  If (a) is not "none", this is not 
entirely predictable.  It's a broken compiler configuration if applying 
(c) yields a type on which (a) is not a no-op, except in the case where 
(a) is "unpredictable" and (c) is "none".  I'm not aware of any likely 
cases where a type would actually get promoted twice by applying those two 
operations.

Right now TARGET_FLT_EVAL_METHOD_NON_DEFAULT is used to give errors for 
-fexcess-precision=standard for languages not supporting it.  With a 
conversion to hooks, that needs to be rethought.  The point is to give an 
error if predictability was requested but cannot be achieved, so I suppose 
ideally the error should be about (a) being not "none", together with 
-fexcess-precision=standard being used.  But if the relevant back-end 
options aren't available at this point to use the hook for (a), the error 
could just be given for all targets (for those languages when that option 
is given).

Effective predictability, for __GCC_IEC_559 in flag_iso mode, means that 
(a) does nothing to any type resulting from whichever of (b) and (c) is in 
effect.

The way __LIBGCC_*_EXCESS_PRECISION__ is used is about eliminating excess 
precision from results assigned to variables - meaning it should be about 
(a) only.

That leaves the question of setting FLT_EVAL_METHOD.  It should relate to 
the effective excess precision seen by the user, the combination of 
whichever of (b) and (c) is in effect with (a).  The only problem is the 
case where that combination is most precisely described by "16", which as 
discussed is not a C11 value and may affect existing code not expecting 
such a value.  The value -1 is compatible with C11 and TS 18661-3 but 
suboptimal, while the value 0 is compatible with C11 only, not with TS 
18661-3 even when no feature test macros are defined.

We already have the option -fno-fp-int-builtin-inexact to ensure certain 
built-in functions follow TS 18661-1 semantics.  It might be reasonable to 
have a new option to enable FLT_EVAL_METHOD using new values.  However, 
I'd be inclined to think that such an option should be on by default for 
-std=gnu*, only off for strict conformance modes.  (There would be both 
__FLT_EVAL_METHOD__ and __FLT_EVAL_METHOD_C99__, say, predefined macros, 
so that <float.h> could also always use the new value if 
__STDC_WANT_IEC_60559_TYPES_EXT__ is defined.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Make max_align_t respect _Float128 [version 2]

2016-09-07 Thread Bernd Edlinger
On 09/07/16 22:04, Joseph Myers wrote:
> On Wed, 7 Sep 2016, Bernd Edlinger wrote:
>
>> interesting.  I just tried the test case from PR 77330 with _Decimal128.
>> result: _Decimal128 did *not* trap with gcc4.8.4, but it does trap with
>> gcc-7.0.0.
>
> I checked with GCC 4.3; __alignof__ (_Decimal128) was 16 back then.
> Whether particular code happens to make use of that alignment requirement
> is inherently unpredictable.
>

Oh, now I see...

Alignof(_Decimal128) was 16, but gcc4.8 did not enable -msse, and
that must have changed.  When I use gcc -m32 -msse the test case
starts to fail with gcc-4.8.4.

With gcc-7.0.0 -m32 -mno-sse fixes the test case but the alignment is
still 16, as you already said.

Apparently the different -msse default setting made the situation worse.
I think that will not run on a pentium4 any more.


Bernd.


[Bug libgcc/77519] New: [5/6/7 Regression] complex multiply / divide excess precision handling inverted

2016-09-07 Thread jsm28 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77519

Bug ID: 77519
   Summary: [5/6/7 Regression] complex multiply / divide excess
precision handling inverted
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libgcc
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jsm28 at gcc dot gnu.org
  Target Milestone: ---

libgcc complex multiply and divide are meant to eliminate excess precision from
certain internal values by forcing them to memory in exactly those cases where
the type has excess precision.  But in
https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01894.html I accidentally
inverted the logic so that values get forced to memory in exactly the cases
where it's not needed.  (This is a pessimization in the no-excess-precision
case, in principle could lead to bad results depending on code generation in
the excess-precision case.)

Re: [PATCH][v3] GIMPLE store merging pass

2016-09-07 Thread Jakub Jelinek
On Wed, Sep 07, 2016 at 10:19:11AM +0200, Richard Biener wrote:
> > If you want a 64-bit store, you'd need to merge the two, and that would be
> > even more expensive.  It is a matter of say:
> > movl $0x12345678, (%rsp)
> > movl $0x09abcdef, 4(%rsp)
> > vs.
> > movabsq $0x09abcdef12345678, %rax
> > movq %rax, (%rsp)
> > vs.
> > movl $0x09abcdef, %eax
> > salq $32, %rax
> > orq $0x12345678, %rax
> > movq %rax, (%rsp)
> 
> vs.
> 
> movq $LC0, (%rsp)

You don't want to store the address, so you'd use
movq .LC0, %rax
movq %rax, (%rsp)

> I think the important part to notice is that it should be straight forward
> for a target / the expander to split a large store from an immediate
> into any of the above but very hard to do the opposite.  Thus from a
> GIMPLE side "canonicalizing" to large stores (that are eventually
> supported and well-aligned) seems best to me.

I bet many programs assume that say 64-bit aligned store in the source is
atomic in 64-bit apps, without using __atomic_store (..., __ATOMIC_RELAXED);
So such a change would break that.

Jakub
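
For readers following the thread, the source-level picture behind the assembly alternatives above might look like the sketch below: two adjacent 32-bit stores versus one 64-bit store that leaves the same bytes in memory. The function names are hypothetical, and the byte-order probe is only there so the comparison holds on both little- and big-endian hosts. The sketch says nothing about atomicity, which is the concern raised in this mail.

```c
#include <stdint.h>
#include <string.h>

/* Two separate 32-bit stores to adjacent slots, as in the
   "movl ...; movl ..." variant.  */
static void
store_split (unsigned char *buf)
{
  uint32_t lo = 0x12345678u, hi = 0x09abcdefu;
  memcpy (buf, &lo, 4);
  memcpy (buf + 4, &hi, 4);
}

/* One merged 64-bit store.  How the halves combine into a single
   constant depends on the host byte order.  */
static void
store_merged (unsigned char *buf)
{
  uint32_t lo = 0x12345678u, hi = 0x09abcdefu;
  const uint16_t probe = 1;
  unsigned char first;
  uint64_t merged;

  memcpy (&first, &probe, 1);
  if (first == 1)
    merged = ((uint64_t) hi << 32) | lo;  /* little-endian host */
  else
    merged = ((uint64_t) lo << 32) | hi;  /* big-endian host */
  memcpy (buf, &merged, 8);
}
```

On a little-endian host the merged constant is 0x09abcdef12345678, the value used in the movabsq example.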


Re: [PATCH][v3] GIMPLE store merging pass

2016-09-07 Thread Bernhard Reutner-Fischer
On September 6, 2016 5:14:47 PM GMT+02:00, Kyrill Tkachov wrote:
>Hi all,

s/contigous/contiguous/
s/ where where/ where/

+struct merged_store_group
+{
+  HOST_WIDE_INT start;
+  HOST_WIDE_INT width;
+  unsigned char *val;
+  unsigned int align;
+  auto_vec stores;
+  /* We record the first and last original statements in the sequence because
+ because we'll need their vuse/vdef and replacement position.  */
+  gimple *last_stmt;

s/ because because/ because/

Why aren't these two HWIs unsigned, likewise in store_immediate_info and in 
most other spots in the patch?

+ fprintf (dump_file, "Afer writing ");
s/Afer /After/

/access if prohibitively slow/s/ if /is /

I'd get rid of successful_p in imm_store_chain_info::output_merged_stores.


+unsigned int
+pass_store_merging::execute (function *fun)
+{
+  basic_block bb;
+  hash_set orig_stmts;
+
+  FOR_EACH_BB_FN (bb, fun)
+{
+  gimple_stmt_iterator gsi;
+  HOST_WIDE_INT num_statements = 0;
+  /* Record the original statements so that we can keep track of
+statements emitted in this pass and not re-process new
+statements.  */
+  for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next ())
+   {
+ gimple_set_visited (gsi_stmt (gsi), false);
+ num_statements++;
+   }
+
+  if (num_statements < 2)
+   continue;

What about debug statements? ISTM you should skip those.
(Isn't visited reset before entry of a pass?)

Maybe I missed the bikeshedding about the name but I'd have used -fmerge-stores 
instead.

Thanks,
>
>The v3 of this patch addresses feedback I received on the version
>posted at [1].
>The merged store buffer is now represented as a char array that we
>splat values onto with
>native_encode_expr and native_interpret_expr. This allows us to merge
>anything that native_encode_expr
>accepts, including floating point values and short vectors. So this
>version extends the functionality
>of the previous one in that it handles floating point values as well.
>
>The first phase of the algorithm that detects the contiguous stores is
>also slightly refactored according
>to feedback to read more fluently.
>
>Richi, I experimented with merging up to MOVE_MAX bytes rather than
>word size but I got worse results on aarch64.
>MOVE_MAX there is 16 (because it has load/store register pair
>instructions) but the 128-bit immediates that we ended
>synthesising were too complex. Perhaps the TImode immediate store RTL
>expansions could be improved, but for now
>I've left the maximum merge size to be BITS_PER_WORD.
>
>I've disabled the pass for PDP-endian targets as the merging code
>proved to be quite fiddly to get right for different
>endiannesses and I didn't feel comfortable writing logic for
>BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN targets without serious
>testing capabilities. I hope that's ok (I note the bswap pass also
>doesn't try to do anything on such targets).
>
>Tested on arm, aarch64, x86_64 and on big-endian arm and aarch64.
>
>How does this version look?
>Thanks,
>Kyrill
>
>[1] https://gcc.gnu.org/ml/gcc-patches/2016-08/msg01512.html
>
>2016-09-06  Kyrylo Tkachov  
>
> PR middle-end/22141
> * Makefile.in (OBJS): Add gimple-ssa-store-merging.o.
> * common.opt (fstore-merging): New Optimization option.
> * opts.c (default_options_table): Add entry for
> OPT_ftree_store_merging.
> * params.def (PARAM_STORE_MERGING_ALLOW_UNALIGNED): Define.
> * passes.def: Insert pass_tree_store_merging.
> * tree-pass.h (make_pass_store_merging): Declare extern
> prototype.
> * gimple-ssa-store-merging.c: New file.
> * doc/invoke.texi (Optimization Options): Document
> -fstore-merging.
>
>2016-09-06  Kyrylo Tkachov  
> Jakub Jelinek  
>
> PR middle-end/22141
> * gcc.c-torture/execute/pr22141-1.c: New test.
> * gcc.c-torture/execute/pr22141-2.c: Likewise.
> * gcc.target/aarch64/ldp_stp_1.c: Adjust for -fstore-merging.
> * gcc.target/aarch64/ldp_stp_4.c: Likewise.
> * gcc.dg/store_merging_1.c: New test.
> * gcc.dg/store_merging_2.c: Likewise.
> * gcc.dg/store_merging_3.c: Likewise.
> * gcc.dg/store_merging_4.c: Likewise.
> * gcc.dg/store_merging_5.c: Likewise.
> * gcc.dg/store_merging_6.c: Likewise.
> * gcc.target/i386/pr22141.c: Likewise.
> * gcc.target/i386/pr34012.c: Add -fno-store-merging to dg-options.




Re: [PATCH] -fsanitize=thread fixes (PR sanitizer/68260)

2016-09-07 Thread Jakub Jelinek
On Wed, Sep 07, 2016 at 09:07:45AM +0200, Richard Biener wrote:
> > @@ -493,6 +504,8 @@ instrument_builtin_call (gimple_stmt_ite
> > if (!tree_fits_uhwi_p (last_arg)
> > || memmodel_base (tree_to_uhwi (last_arg)) >= MEMMODEL_LAST)
> >   return;
> > +   if (lookup_stmt_eh_lp (stmt))
> > + remove_stmt_from_eh_lp (stmt);
> 
> These changes look bogus to me -- how can the tsan instrumentation
> function _not_ throw when the original function we replace it can?

The __tsan*atomic* functions are right now declared to be nothrow, so the
patch just matches how they are declared.
While the sync-builtins.def use
#define ATTR_NOTHROWCALL_LEAF_LIST (flag_non_call_exceptions ? \
ATTR_LEAF_LIST : ATTR_NOTHROW_LEAF_LIST)
I guess I could use the same for the tsan atomics, but wonder if it will work
properly when the libtsan is built with exceptions disabled and without
-fnon-call-exceptions.  Perhaps it would, at least if it is built with
-fasynchronous-unwind-tables (which is the case for x86_64 and aarch64 and
tsan isn't supported elsewhere).
> 
> It seems you are just avoiding the ICEs for now "wrong-code".  (and
> how does this relate to -fnon-call-exceptions as both are calls?)
> 
> The instrument_expr case seems to leave the original stmt in-place
> (and thus the EH).

Those are different.  For loads and stores there is just added call that
logs in the load or store, but for atomics the atomics are done inside of
the library and the library tracks everything.

Jakub


[Bug fortran/66459] bogus warning 'w.offset' may be used uninitialized in this function [-Wmaybe-uninitialized]

2016-09-07 Thread manu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66459

Manuel López-Ibáñez  changed:

   What|Removed |Added

 CC||manu at gcc dot gnu.org

--- Comment #3 from Manuel López-Ibáñez  ---
If you do

gfortran -Wuninitialized test.f90 -fdump-tree-all-all-lineno -O1

and look at test.f90.162t.uninit1, we see:

  # .MEM_24 = PHI <.MEM_15(D)(28), .MEM_68(37)>
  # w$dim$1$stride_46 = PHI 
  # w$offset_26 = PHI 

but this code is transformed by optimization. The unoptimized SSA contains:

  [test.f90:9:0] # VUSE <.MEM_18>
  _22 = [test.f90:9:0] *m_21(D);
  [test.f90:9:0] _23 = MAX_EXPR <_22, 0>;
  [test.f90:9:0] _24 = (integer(kind=8)D.9) _23;
...
  [test.f90:9:0] _39 = _24;
...
  [test.f90:9:0] _68 = ~_39;
...
  [test.f90:9:0] wD.3400.offsetD.3387 = _68;

and the gimple generated by Fortran contains something similar:

[test.f90:9:0] D.3429 = [test.f90:9:0] *mD.3381;
[test.f90:9:0] D.3430 = MAX_EXPR ;
[test.f90:9:0] D.3402 = (integer(kind=8)D.9) D.3430;

It seems that *mD.3381 is not initialized.

(It is very strange that gfortran converts user-defined variables to lowercase.
It makes reading the dumps more difficult. It also does many unnecessary
copies, making the code harder to analyze.)

Re: [PATCH] Fix location of command line backend reported issues (PR middle-end/77475)

2016-09-07 Thread Christophe Lyon
On 6 September 2016 at 12:41, Christophe Lyon
 wrote:
> On 6 September 2016 at 12:12, Jakub Jelinek  wrote:
>> On Tue, Sep 06, 2016 at 12:07:47PM +0200, Christophe Lyon wrote:
>>> On 5 September 2016 at 19:20, Jakub Jelinek  wrote:
>>> > Hi!
>>> >
>>> > While it would be perhaps nice to pass explicit location_t in the target
>>> > option handling code, there are hundreds of error/warning/sorry calls
>>> > in lots of backends, and lots of those routines are used not just
>>> > for the process_options time (i.e. command line options), but also for
>>> > pragma GCC target and target option handling, so at least for the time being
>>> > I think it is easier to just use UNKNOWN_LOCATION for the command line
>>> > option diagnostics.
>>> >
>>> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>>> >
>>> > 2016-09-05  Jakub Jelinek  
>>> >
>>> > PR middle-end/77475
>>> > * toplev.c (process_options): Temporarily set input_location
>>> > to UNKNOWN_LOCATION around targetm.target_option.override () call.
>>> >
>>>
>>> This patch caused regressions on aarch64. The following tests now fail:
>>>   gcc.target/aarch64/arch-diagnostics-1.c  (test for errors, line 1)
>>>   gcc.target/aarch64/arch-diagnostics-1.c (test for excess errors)
>>>   gcc.target/aarch64/arch-diagnostics-2.c  (test for errors, line 1)
>>>   gcc.target/aarch64/arch-diagnostics-2.c (test for excess errors)
>>>   gcc.target/aarch64/cpu-diagnostics-1.c  (test for errors, line 1)
>>>   gcc.target/aarch64/cpu-diagnostics-1.c (test for excess errors)
>>>   gcc.target/aarch64/cpu-diagnostics-2.c  (test for errors, line 1)
>>>   gcc.target/aarch64/cpu-diagnostics-2.c (test for excess errors)
>>>   gcc.target/aarch64/cpu-diagnostics-3.c  (test for errors, line 1)
>>>   gcc.target/aarch64/cpu-diagnostics-3.c (test for excess errors)
>>>   gcc.target/aarch64/cpu-diagnostics-4.c  (test for errors, line 1)
>>>   gcc.target/aarch64/cpu-diagnostics-4.c (test for excess errors)
>>
>> Those tests need adjustments, not to expect such errors on line 1, but on
>> line 0.
>>
>> I think following untested patch should fix that:
>>
> Indeed it does, thanks!
>

I took the liberty of committing it (r240030) on your behalf with this
ChangeLog:
2016-09-07  Jakub Jelinek  

PR middle-end/77475
* gcc.target/aarch64/arch-diagnostics-1.c: Expect error on line 0.
* gcc.target/aarch64/arch-diagnostics-2.c: Likewise.
* gcc.target/aarch64/cpu-diagnostics-1.c: Likewise.
* gcc.target/aarch64/cpu-diagnostics-2.c: Likewise.
* gcc.target/aarch64/cpu-diagnostics-3.c: Likewise.
* gcc.target/aarch64/cpu-diagnostics-4.c: Likewise.

Thanks,

Christophe

> Christophe
>
>> --- gcc/testsuite/gcc.target/aarch64/arch-diagnostics-1.c.jj2012-10-23 
>> 19:54:58.0 +0200
>> +++ gcc/testsuite/gcc.target/aarch64/arch-diagnostics-1.c   2016-09-06 
>> 12:10:32.241560531 +0200
>> @@ -1,4 +1,4 @@
>> -/* { dg-error "unknown" "" {target "aarch64*-*-*" } } */
>> +/* { dg-error "unknown" "" {target "aarch64*-*-*" } 0 } */
>>  /* { dg-options "-O2 -march=dummy" } */
>>
>>  void f ()
>> --- gcc/testsuite/gcc.target/aarch64/cpu-diagnostics-2.c.jj 2016-05-18 
>> 10:59:49.0 +0200
>> +++ gcc/testsuite/gcc.target/aarch64/cpu-diagnostics-2.c2016-09-06 
>> 12:11:08.170110003 +0200
>> @@ -1,4 +1,4 @@
>> -/* { dg-error "missing" "" {target "aarch64*-*-*" } } */
>> +/* { dg-error "missing" "" {target "aarch64*-*-*" } 0 } */
>>  /* { dg-skip-if "do not override -mcpu" { *-*-* } { "-mcpu=*" } { "" } } */
>>  /* { dg-options "-O2 -mcpu=cortex-a53+no" } */
>>
>> --- gcc/testsuite/gcc.target/aarch64/cpu-diagnostics-1.c.jj 2016-05-18 
>> 10:59:49.0 +0200
>> +++ gcc/testsuite/gcc.target/aarch64/cpu-diagnostics-1.c2016-09-06 
>> 12:11:01.698191158 +0200
>> @@ -1,4 +1,4 @@
>> -/* { dg-error "unknown" "" {target "aarch64*-*-*" } } */
>> +/* { dg-error "unknown" "" {target "aarch64*-*-*" } 0 } */
>>  /* { dg-skip-if "do not override -mcpu" { *-*-* } { "-mcpu=*" } { "" } } */
>>  /* { dg-options "-O2 -mcpu=dummy" } */
>>
>> --- gcc/testsuite/gcc.target/aarch64/arch-diagnostics-2.c.jj2012-10-23 
>> 19:54:58.0 +0200
>> +++ gcc/testsuite/gcc.target/aarch64/arch-diagnostics-2.c   2016-09-06 
>> 12:10:39.737466536 +0200
>> @@ -1,4 +1,4 @@
>> -/* { dg-error "missing" "" {target "aarch64*-*-*" } } */
>> +/* { dg-error "missing" "" {target "aarch64*-*-*" } 0 } */
>>  /* { dg-options "-O2 -march=+dummy" } */
>>
>>  void f ()
>> --- gcc/testsuite/gcc.target/aarch64/cpu-diagnostics-3.c.jj 2016-05-18 
>> 10:59:49.0 +0200
>> +++ gcc/testsuite/gcc.target/aarch64/cpu-diagnostics-3.c2016-09-06 
>> 12:11:13.628041564 +0200
>> @@ -1,4 +1,4 @@
>> -/* { dg-error "invalid feature" "" {target "aarch64*-*-*" } } */
>> +/* { dg-error "invalid feature" "" {target 

[Bug middle-end/77475] unnecessary or misleading context in reporting command line problems

2016-09-07 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77475

--- Comment #4 from Christophe Lyon  ---
Author: clyon
Date: Wed Sep  7 20:18:17 2016
New Revision: 240030

URL: https://gcc.gnu.org/viewcvs?rev=240030&root=gcc&view=rev
Log:
PR middle-end/77475: Fix AArch64 testcases.

2016-09-07  Jakub Jelinek  

PR middle-end/77475
* gcc.target/aarch64/arch-diagnostics-1.c: Expect error on line 0.
* gcc.target/aarch64/arch-diagnostics-2.c: Likewise.
* gcc.target/aarch64/cpu-diagnostics-1.c: Likewise.
* gcc.target/aarch64/cpu-diagnostics-2.c: Likewise.
* gcc.target/aarch64/cpu-diagnostics-3.c: Likewise.
* gcc.target/aarch64/cpu-diagnostics-4.c: Likewise.


Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/aarch64/arch-diagnostics-1.c
trunk/gcc/testsuite/gcc.target/aarch64/arch-diagnostics-2.c
trunk/gcc/testsuite/gcc.target/aarch64/cpu-diagnostics-1.c
trunk/gcc/testsuite/gcc.target/aarch64/cpu-diagnostics-2.c
trunk/gcc/testsuite/gcc.target/aarch64/cpu-diagnostics-3.c
trunk/gcc/testsuite/gcc.target/aarch64/cpu-diagnostics-4.c

Re: Ping**2! Re: [PATCH, Fortran] Extension: AUTOMATIC/STATIC symbol attributes with -fdec-static

2016-09-07 Thread Fritz Reese
On Wed, Sep 7, 2016 at 9:31 AM, Andre Vehreschild  wrote:
> Hi Fritz,
>
> please note: I do not have official review privileges. So my vote here
> is rather advice to you and the official reviewers. Often such an
> unofficial review helps to speed things up, because the official ones
> are pointed to the nics and nacs and don't have to bother with the
> minor things.

Andre,

Thank you very much for your comments. I have only been contributing
to GNU/GCC for a year or two and appreciate the advice. I definitely
strive to keep my patches in line with the relevant standards and
style guides. Attached is a replacement for the original patch which
addresses your concerns.


> - Do I understand this correctly: AUTOMATIC and STATIC have to come last,
>   i.e., right before the :: where declaring, e.g., a variable?
Not quite, you are probably misled by this code in decl.c:

> match
> gfc_match_static (void)
> {
...
> gfc_match (" ::");

(And equivalent for gfc_match_automatic.) These are similar to
gfc_match_save, and like gfc_match_save are only called to match
variable attribute specification _statements_, such as:
> SAVE :: x
> AUTOMATIC :: y

STATIC and AUTOMATIC are matched in any order in an attribute
specification _list_, as with SAVE, through the giant switch() earlier
in decl.c/match_attr_spec(). This applies to the following:
> INTEGER, SAVE, DIMENSION(3) :: x
> INTEGER, AUTOMATIC, DIMENSION(3) :: y


> - Running:
>
>   $ contrib/check_GNU_style.sh dec_static.patch
>
>   Reports some style issues in the C code, that should be fixed before
>   commit. (Style in Fortran testcases does not matter that much.)
I was not aware of this script - thanks!


> Please change formatting in a separate patch or not at all (here!).
> This policy is to distinguish cosmetic changes from relevant ones.
Fixed. These changes are usually accidental - I try not to reformat
code that I'm not otherwise touching.


>> diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
...
>> +With @option{-fdec-static} GNU Fortran supports the explicit specification of
>> +two addition variable attributes: @code{STATIC} and @code{AUTOMATIC}. These
...
> But is it only for variables? Can't it be used for equivalences or
> other constructs, too?
Yes, good point, perhaps 'entities' is a better term here:

+With @option{-fdec-static} GNU Fortran supports the DEC extended attributes
+@code{STATIC} and @code{AUTOMATIC} to provide explicit specification of entity
+storage. [...]

>> diff --git a/gcc/fortran/invoke.texi b/gcc/fortran/invoke.texi
...
>> +@item -fdec-static
>> +@opindex @code{fdec-static}
>> +Enable STATIC and AUTOMATIC as attributes specifying storage location.
>> +STATIC is equivalent to SAVE, and locals are typically AUTOMATIC by default.
>
> Well, this description to me sounds like: "Those attributes are
> useless, because they can be substituted." This is clearly not what you
> intend. I propose to include into the description that with "this
> switch the dec-extension" is available "to explicitly specify the
> storage of entities". Then the last sentence is still a good hint for
> all fortraneers that don't know the extension.

I guess I subconsciously made them sound "useless" because I hoped
users would think twice about using the extensions and use standard
conforming constructs instead. :-)  But, maybe you are right and this
would be clearer:

+@item -fdec-static
+@opindex @code{fdec-static}
+Enable DEC-style STATIC and AUTOMATIC attributes to explicitly specify
+the storage of variables and other objects.


>> diff --git a/gcc/fortran/lang.opt b/gcc/fortran/lang.opt
...
>> +fdec-static
>> +Fortran Var(flag_dec_static)
>> +Enable STATIC and AUTOMATIC attributes.
>
> How about: Enable the dec-extension of STATIC and AUTOMATIC attributes.
> Just a proposal.
How about this, to match invoke.texi:

+fdec-static
+Fortran Var(flag_dec_static)
+Enable DEC-style STATIC and AUTOMATIC attributes.


> -  Please add some testcases where the new error messages are tested.
Yes, good idea! cf. attached for dec_static_3.f90 and
dec_static_4.f90. These tests gave me a chance to realize I should
emit some better error messages so I made a minor change there:

diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index db431dd..be8e9f7 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -7838,6 +7838,8 @@ gfc_match_automatic (void)
   switch (m)
   {
   case MATCH_NO:
+break;
+
   case MATCH_ERROR:
 return MATCH_ERROR;

@@ -7856,7 +7858,7 @@ gfc_match_automatic (void)

   if (!seen_symbol)
 {
-  gfc_error ("Expected var-list in AUTOMATIC statement at %C");
+  gfc_error ("Expected entity-list in AUTOMATIC statement at %C");
   return MATCH_ERROR;
 }

... And similar for gfc_match_static. (Nb. "entity-list" was chosen to
correspond with the "save-entity-list" descriptor used by F90 to
specify the SAVE statement.)

Andre, thanks again for your comments. I 

[ARM] PR 67591 ARM v8 Thumb IT blocks are deprecated

2016-09-07 Thread Christophe Lyon
Hi,

The attached patch is a first part to solve PR 67591: it removes
several occurrences of "IT blocks containing 32-bit Thumb
instructions are deprecated in ARMv8" messages in the
gcc/g++/libstdc++/fortran testsuites.

It does not remove them all yet. This patch only modifies the
*cmp_and, *cmp_ior, *ior_scc_scc, *ior_scc_scc_cmp,
*and_scc_scc and *and_scc_scc_cmp patterns.
Additional work is required in sub_shiftsi etc, at least.
I've started looking at these, but I decided I could already
post this self-contained patch to check if this implementation
is OK.

Regarding *cmp_and and *cmp_ior patterns, the addition of the
enabled_for_depr_it attribute is aggressive in the sense that it keeps
only the alternatives with 'l' and 'Py' constraints, while in some
cases the constraints could be relaxed. Indeed, these 2 patterns can
swap their input comparisons, meaning that any of them can be emitted
in the IT-block, and is thus subject to the ARMv8 deprecation.
The generated code is possibly suboptimal in the cases where the
operands are not swapped, since 'r' could be used.

Cross-tested on arm-none-linux-gnueabihf with -mthumb/-march=armv8-a
and --with-cpu=cortex-a57 --with-mode=thumb, showing only improvements:
http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/239850-depr-it-4/report-build-info.html

Bootstrapped OK on armv8l HW.

Is this OK?

Thanks,

Christophe
2016-09-05  Christophe Lyon  

PR target/67591
* config/arm/arm.md (*cmp_and): Add enabled_for_depr_it attribute.
(*cmp_ior): Likewise.
(*ior_scc_scc): Add alternative for enabled_for_depr_it attribute.
(*ior_scc_scc_cmp): Likewise.
(*and_scc_scc): Likewise.
(*and_scc_scc_cmp): Likewise.
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 318db75..0374bdd 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -9340,6 +9340,7 @@
   [(set_attr "conds" "set")
(set_attr "predicable" "no")
(set_attr "arch" "t2,t2,t2,t2,t2,any,any,any,any")
+   (set_attr "enabled_for_depr_it" "yes,no,no,no,no,no,no,no,no")
(set_attr_alternative "length"
   [(const_int 6)
(const_int 8)
@@ -9422,6 +9423,7 @@
   "
   [(set_attr "conds" "set")
(set_attr "arch" "t2,t2,t2,t2,t2,any,any,any,any")
+   (set_attr "enabled_for_depr_it" "yes,no,no,no,no,no,no,no,no")
(set_attr_alternative "length"
   [(const_int 6)
(const_int 8)
@@ -9444,13 +9446,13 @@
 )
 
 (define_insn_and_split "*ior_scc_scc"
-  [(set (match_operand:SI 0 "s_register_operand" "=Ts")
+  [(set (match_operand:SI 0 "s_register_operand" "=Ts,Ts")
(ior:SI (match_operator:SI 3 "arm_comparison_operator"
-[(match_operand:SI 1 "s_register_operand" "r")
- (match_operand:SI 2 "arm_add_operand" "rIL")])
+[(match_operand:SI 1 "s_register_operand" "r,l")
+ (match_operand:SI 2 "arm_add_operand" "rIL,lPy")])
(match_operator:SI 6 "arm_comparison_operator"
-[(match_operand:SI 4 "s_register_operand" "r")
- (match_operand:SI 5 "arm_add_operand" "rIL")])))
+[(match_operand:SI 4 "s_register_operand" "r,l")
+ (match_operand:SI 5 "arm_add_operand" "rIL,lPy")])))
(clobber (reg:CC CC_REGNUM))]
   "TARGET_32BIT
&& (arm_select_dominance_cc_mode (operands[3], operands[6], DOM_CC_X_OR_Y)
@@ -9469,6 +9471,7 @@
  DOM_CC_X_OR_Y),
CC_REGNUM);"
   [(set_attr "conds" "clob")
+   (set_attr "enabled_for_depr_it" "no,yes")
(set_attr "length" "16")
(set_attr "type" "multiple")]
 )
@@ -9478,13 +9481,13 @@
 (define_insn_and_split "*ior_scc_scc_cmp"
   [(set (match_operand 0 "dominant_cc_register" "")
(compare (ior:SI (match_operator:SI 3 "arm_comparison_operator"
- [(match_operand:SI 1 "s_register_operand" "r")
-  (match_operand:SI 2 "arm_add_operand" "rIL")])
+ [(match_operand:SI 1 "s_register_operand" "r,l")
+  (match_operand:SI 2 "arm_add_operand" "rIL,lPy")])
 (match_operator:SI 6 "arm_comparison_operator"
- [(match_operand:SI 4 "s_register_operand" "r")
-  (match_operand:SI 5 "arm_add_operand" "rIL")]))
+ [(match_operand:SI 4 "s_register_operand" "r,l")
+  (match_operand:SI 5 "arm_add_operand" "rIL,lPy")]))
 (const_int 0)))
-   (set (match_operand:SI 7 "s_register_operand" "=Ts")
+   (set (match_operand:SI 7 "s_register_operand" "=Ts,Ts")
(ior:SI (match_op_dup 3 [(match_dup 1) (match_dup 2)])
(match_op_dup 6 [(match_dup 4) (match_dup 5)])))]
   "TARGET_32BIT"
@@ -9499,18 +9502,19 @@
(set (match_dup 7) (ne:SI (match_dup 0) (const_int 0)))]
   ""
   [(set_attr "conds" "set")
+   

Re: Make max_align_t respect _Float128 [version 2]

2016-09-07 Thread Joseph Myers
On Wed, 7 Sep 2016, Bernd Edlinger wrote:

> interesting.  I just tried the test case from PR 77330 with _Decimal128.
> result: _Decimal128 did *not* trap with gcc4.8.4, but it does trap with
> gcc-7.0.0.

I checked with GCC 4.3; __alignof__ (_Decimal128) was 16 back then.  
Whether particular code happens to make use of that alignment requirement 
is inherently unpredictable.

> So it looks to me, the ABI has changed incompatibly recently, and I

No, it's been the same since _Decimal128 was added in GCC 4.3 (__float128 
was first supported for 32-bit x86 in 4.4, and also had alignment 16 back 
then).

> I think that we need more flexibility in that area than we have
> right now, because config/i386/i386.h is doing the ABI,
> but it is shared by linux/glibc, windows, vxworks, darwin,
> solaris, to name a few.  I can't believe that we can just
> change that ABI for the whole world at the same time.

If the system compilers don't support these types, effectively we can.  
If the system compilers do support these types (and don't follow HJ's ABI 
document which says they are 16-byte-aligned), it's up to the target OS 
maintainers to make sure we stay compatible with them.

-- 
Joseph S. Myers
jos...@codesourcery.com


[Bug fortran/56670] Allocatable-length character var causes bogus warning with -Wall

2016-09-07 Thread manu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56670

Manuel López-Ibáñez  changed:

   What|Removed |Added

 CC||manu at gcc dot gnu.org

--- Comment #4 from Manuel López-Ibáñez  ---
(In reply to Rich Townsend from comment #0)
> ...causes the following bogus warning:

If you compile with -fdump-tree-all-all-lineno and look at test.f90.004t.gimple
you'll see that gfortran generates:

  [test.f90:5:0] name_formatD.3382 = 0B;
  [test.f90:7:0] {
integer(kind=4)D.8 D.3383;

[test.f90:7:0] if (name_formatD.3382 != 0B) goto L.1D.3384; else goto
;
...
L.1D.3384:
[test.f90:7:0] if (.name_formatD.3381 == 0) goto L.2D.3385; else goto
;

If the last line is reached, then .name_format is used uninitialized. This
cannot happen here, because the first 'if' is always false. However, GCC does
not know that without the analysis done when using optimization. With -O1 there
is no warning.

(That code is quite strange, why test for != 0B after initializing it to 0B?)

> uninit_test.f90: In function ‘uninit_test’:
> uninit_test.f90:7:0: warning: ‘.name_format’ may be used uninitialized in
> this function [-Wmaybe-uninitialized]
>name_format = ''
>  ^
> 
> (Note also that the warning arises in the main program, and not in a
> function as the message suggests).

This is the output in a modern gfortran (with colors!):

test.f90:7:0:

   name_format = ''
 ^
Warning: ‘.name_format’ may be used uninitialized in this function
[-Wmaybe-uninitialized]

Re: [PATCH][AArch64] Improve legitimize_address

2016-09-07 Thread Christophe Lyon
Hi Wilco,

On 7 September 2016 at 14:43, Richard Earnshaw (lists)
 wrote:
> On 06/09/16 14:14, Wilco Dijkstra wrote:
>> Improve aarch64_legitimize_address - avoid splitting the offset if it is
>> supported.  When we do split, take the mode size into account.  BLKmode
>> falls into the unaligned case but should be treated like LDP/STP.
>> This improves codesize slightly due to fewer base address calculations:
>>
>> int f(int *p) { return p[5000] + p[7000]; }
>>
>> Now generates:
>>
>> f:
>>   add x0, x0, 16384
>>   ldr w1, [x0, 3616]
>>   ldr w0, [x0, 11616]
>>   add w0, w1, w0
>>   ret
>>
>> instead of:
>>
>> f:
>>   add x1, x0, 16384
>>   add x0, x0, 24576
>>   ldr w1, [x1, 3616]
>>   ldr w0, [x0, 3424]
>>   add w0, w1, w0
>>   ret
>>
>> OK for trunk?
>>
>> ChangeLog:
>> 2016-09-06  Wilco Dijkstra  
>>
>> gcc/
>>   * config/aarch64/aarch64.c (aarch64_legitimize_address):
>>   Avoid use of base_offset if offset already in range.
>
> OK.
>
> R.

After this patch, I've noticed a regression:
FAIL: gcc.target/aarch64/ldp_vec_64_1.c scan-assembler ldp\td[0-9]+, d[0-9]
You probably need to adjust the scan pattern.

Thanks,

Christophe


>
>> --
>>
>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> index 
>> 27bbdbad8cddc576f9ed4fd0670116bd6d318412..119ff0aecb0c9f88899fa141b2c7f9158281f9c3
>>  100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
>> @@ -5058,9 +5058,19 @@ aarch64_legitimize_address (rtx x, rtx /* orig_x  */, 
>> machine_mode mode)
>>/* For offsets aren't a multiple of the access size, the limit is
>>-256...255.  */
>>else if (offset & (GET_MODE_SIZE (mode) - 1))
>> - base_offset = (offset + 0x100) & ~0x1ff;
>> + {
>> +   base_offset = (offset + 0x100) & ~0x1ff;
>> +
>> +   /* BLKmode typically uses LDP of X-registers.  */
>> +   if (mode == BLKmode)
>> + base_offset = (offset + 512) & ~0x3ff;
>> + }
>> +  /* Small negative offsets are supported.  */
>> +  else if (IN_RANGE (offset, -256, 0))
>> + base_offset = 0;
>> +  /* Use 12-bit offset by access size.  */
>>else
>> - base_offset = offset & ~0xfff;
>> + base_offset = offset & (~0xfff * GET_MODE_SIZE (mode));
>>
>>if (base_offset != 0)
>>   {
>>
>


[Bug fortran/77504] "is used uninitialized" with allocatable string and array constructors

2016-09-07 Thread manu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77504

Manuel López-Ibáñez  changed:

   What|Removed |Added

 CC||manu at gcc dot gnu.org

--- Comment #6 from Manuel López-Ibáñez  ---
If you do

gfortran -Wuninitialized test.f90 -fdump-tree-all-all-lineno

and look at test.f90.004t.gimple

you'll see that gfortran generates gimple such as:

  [test.f90:3:0] D.3445 = (sizetype) .help_textD.3381;
  [test.f90:3:0] D.3446 = (bitsizetype) D.3445;
  [test.f90:3:0] D.3438 = D.3446 * 8;
  [test.f90:3:0] D.3439 = (sizetype) .help_textD.3381;
  [test.f90:3:0] [test.f90:3:0] help_textD.3396.dataD.3382 = 0B;

I don't know much about Fortran or gfortran, but that is clearly using an
uninitialized value .help_text.

Interestingly, none of the D.* variables above are used for anything else, so I
wonder what is the purpose of that code.

[Bug c++/33952] -Wfatal-errors truncates multi-line error messages.

2016-09-07 Thread manu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=33952

Manuel López-Ibáñez  changed:

   What|Removed |Added

 CC||manu at gcc dot gnu.org

--- Comment #6 from Manuel López-Ibáñez  ---
(In reply to Richard Biener from comment #3)
> Apart from the issue regarding that the last two errors should be notes this
> is really impossible to "fix" if -Wfatal-errors should continue to work as
> designed.  That is, the only way would be to annotate all _callers_ of
> diagnostic
> functions to eventually terminate compilation, which is a too large overhead
> really.

Well, the fix is clear. A diagnostic should be an error/warning/sorry/... plus
one or more notes. The API should allow attaching notes to diagnostics and
everything is emitted at once. Clear but not easy to implement because of how
diagnostics work right now.

The second and third error are notes now, so using -fmax-errors=2 will print
the whole message and stop before the next error.

A hackish fix would be for -Wfatal-errors to check before emitting another
diagnostic (except notes) and stop if the number of errors is already 1. That
is better than -fmax-errors=2, since the latter will continue until the next
error, printing warnings along the way.

Re: Proposal: readable and writable attributes on variables

2016-09-07 Thread Jeff Law

On 09/01/2016 09:04 AM, Martin Sebor wrote:


Understood.  I think a write-only attribute or type qualifier would
make sense.  Until/unless it's implemented I would recommend to work
around its absence by hiding access to the registers behind a read-
only and write-only functional API.
As you noted earlier, Martin, if we bake it into the typesystem, then you 
get the desired warnings when you mix-n-match types.  For that reason I 
see a type qualifier as superior to an attribute.


IIRC the national labs that were looking at the alignment attribute 
essentially came to the same conclusion -- bake it into the core of the 
typesystem and rely on the typesystem to ensure you don't lose the data.


Jeff


[Bug c++/77513] -Wzero-as-null-pointer-constant vs 0, nullptr, NULL and __null

2016-09-07 Thread manu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77513

Manuel López-Ibáñez  changed:

   What|Removed |Added

 CC||manu at gcc dot gnu.org

--- Comment #4 from Manuel López-Ibáñez  ---
(In reply to petschy from comment #2)
> Here I see the same flags, yet for these two NULLs gcc warns.

prog.cc:6:28: warning: zero as null pointer constant
[-Wzero-as-null-pointer-constant]
 , false, false };
^

because the location given is wrong, so the location in this case is not within
the system-header flags. If the location was right (under __null), it would not
warn. That is the bug.

[Bug fortran/77505] Negative character length not treated as LEN=0

2016-09-07 Thread sgk at troutmask dot apl.washington.edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77505

--- Comment #2 from Steve Kargl  ---
On Wed, Sep 07, 2016 at 07:01:40PM +, kargl at gcc dot gnu.org wrote:
> --- Comment #1 from kargl at gcc dot gnu.org ---
> This is going to require someone smarter than I to fix.
> The problem lies in trans-array.c (get_array_ctor_strlen).
> This function walks the elements of the array constructor
> looking for the string length.  It nave takes into account
> the type spec in [character(len=ii) :: ].

s/nave/never

Note, the problem may actually be in some other function
that calls get_array_ctor_strlen, because this function
never have been called!

[Bug fortran/77505] Negative character length not treated as LEN=0

2016-09-07 Thread kargl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77505

kargl at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-09-07
 CC||kargl at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from kargl at gcc dot gnu.org ---
This is going to require someone smarter than I to fix.
The problem lies in trans-array.c (get_array_ctor_strlen).
This function walks the elements of the array constructor
looking for the string length.  It nave takes into account
the type spec in [character(len=ii) :: ].

[Bug target/77494] -mcpu=cortex-a53 does not allow use of crc extensions

2016-09-07 Thread buzz at exotica dot org.uk
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77494

--- Comment #4 from Jools Wills  ---
Thanks - reported here - https://sourceware.org/bugzilla/show_bug.cgi?id=20567

[Bug tree-optimization/67971] Failure to unify conditional argument selection with conditional result selection

2016-09-07 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67971

--- Comment #3 from Andrew Pinski  ---
Note we should be able to optimize this to just
int
f1 (int cond, double x, double y)
{ 
  double z1, z2;

  z2 = __builtin_cos (cond ? x : y);
  return z2 == z2;
}


But that is 64700.

[Bug fortran/77518] New: ICE in gfc_advance_chain, at fortran/trans.c:58

2016-09-07 Thread gerhard.steinmetz.fort...@t-online.de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77518

Bug ID: 77518
   Summary: ICE in gfc_advance_chain, at fortran/trans.c:58
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gerhard.steinmetz.fort...@t-online.de
  Target Milestone: ---

Compiles with 4.8/4.9, aborts with versions 5/6/7 :


$ cat z1.f90
program p
   type t
   end type
   class(t), allocatable :: z[:]
   print *, sizeof(z)
end


$ gfortran-7-20160904 -fcoarray=single z1.f90
z1.f90:5:0:

print *, sizeof(z)

internal compiler error: in gfc_advance_chain, at fortran/trans.c:58
0x72227b gfc_advance_chain(tree_node*, int)
../../gcc/fortran/trans.c:58
0x755d5f gfc_class_vptr_get(tree_node*)
../../gcc/fortran/trans-expr.c:149
0x756778 class_vtab_field_get
../../gcc/fortran/trans-expr.c:222
0x756778 gfc_class_vtab_size_get(tree_node*)
../../gcc/fortran/trans-expr.c:255
0x77b8e9 gfc_conv_intrinsic_sizeof
../../gcc/fortran/trans-intrinsic.c:6006
0x783d6b gfc_conv_intrinsic_function(gfc_se*, gfc_expr*)
../../gcc/fortran/trans-intrinsic.c:8276
0x760f72 gfc_conv_expr(gfc_se*, gfc_expr*)
../../gcc/fortran/trans-expr.c:7602
0x767ca8 gfc_conv_expr_reference(gfc_se*, gfc_expr*)
../../gcc/fortran/trans-expr.c:7737
0x78aa06 gfc_trans_transfer(gfc_code*)
../../gcc/fortran/trans-io.c:2459
0x7227d7 trans_code
../../gcc/fortran/trans.c:1886
0x787828 build_dt
../../gcc/fortran/trans-io.c:1958
0x7227f7 trans_code
../../gcc/fortran/trans.c:1858
0x7517a8 gfc_generate_function_code(gfc_namespace*)
../../gcc/fortran/trans-decl.c:6224
0x6dd050 translate_all_program_units
../../gcc/fortran/parse.c:5911
0x6dd050 gfc_parse_file()
../../gcc/fortran/parse.c:6117
0x71f602 gfc_be_parse_file
../../gcc/fortran/f95-lang.c:198

[Bug fortran/77517] ICE in conv_intrinsic_move_alloc, at fortran/trans-intrinsic.c:9517

2016-09-07 Thread gerhard.steinmetz.fort...@t-online.de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77517

--- Comment #2 from Gerhard Steinmetz ---

For completeness, older versions below 5 give :


$ gfortran-4.9 z1.f90
z1.f90:3.20:

   call move_alloc (a, b)
1
Error: 'from' argument of 'move_alloc' intrinsic at (1) must be the same type
and kind as 'to'


$ gfortran-4.9 z2.f90
z2.f90:1.9:

program p
 1
Internal Error at (1):
Invalid expression in gfc_element_size.


$ gfortran-4.8 z2.f90
z2.f90: In function 'p':
z2.f90:1:0: internal compiler error: in gfc_typenode_for_spec, at
fortran/trans-types.c:1136
 program p
 ^
Please submit a full bug report,
with preprocessed source if appropriate.

---


Backup, now silently accepted :

program p
   class(*), allocatable :: a, b
contains
   subroutine b
   end
end

program p
   class(*), allocatable :: a, b
contains
   subroutine a
   end
end

program p
   class(*), allocatable :: a, b
contains
   function b()
   end
end

program p
   class(*), allocatable :: a, b
contains
   function a()
   end
end

[Bug fortran/77517] ICE in conv_intrinsic_move_alloc, at fortran/trans-intrinsic.c:9517

2016-09-07 Thread gerhard.steinmetz.fort...@t-online.de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77517

--- Comment #1 from Gerhard Steinmetz ---

Similar with subroutine named "a" :


$ cat z2.f90
program p
   class(*), allocatable :: a, b
   call move_alloc (a, b)
contains
   subroutine a
   end
end


$ gfortran-7-20160904 z2.f90
f951: internal compiler error: Invalid expression in gfc_element_size.
0x68953f gfc_internal_error(char const*, ...)
../../gcc/fortran/error.c:1317
0x71304b gfc_element_size(gfc_expr*)
../../gcc/fortran/target-memory.c:126
0x671517 find_intrinsic_vtab
../../gcc/fortran/class.c:2583
0x671517 gfc_find_vtab(gfc_typespec*)
../../gcc/fortran/class.c:2725
0x66703d gfc_check_move_alloc(gfc_expr*, gfc_expr*)
../../gcc/fortran/check.c:3347
0x6a4f87 gfc_intrinsic_sub_interface(gfc_code*, int)
../../gcc/fortran/intrinsic.c:4676
0x6f2972 resolve_unknown_s
../../gcc/fortran/resolve.c:3356
0x6f2972 resolve_call
../../gcc/fortran/resolve.c:3472
0x6ef8c0 gfc_resolve_code(gfc_code*, gfc_namespace*)
../../gcc/fortran/resolve.c:10733
0x6f1cb2 resolve_codes
../../gcc/fortran/resolve.c:15697
0x6f1dae gfc_resolve(gfc_namespace*)
../../gcc/fortran/resolve.c:15732
0x6dce7a resolve_all_program_units
../../gcc/fortran/parse.c:5850
0x6dce7a gfc_parse_file()
../../gcc/fortran/parse.c:6102
0x71f602 gfc_be_parse_file
../../gcc/fortran/f95-lang.c:198

[Bug fortran/77517] New: ICE in conv_intrinsic_move_alloc, at fortran/trans-intrinsic.c:9517

2016-09-07 Thread gerhard.steinmetz.fort...@t-online.de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77517

Bug ID: 77517
   Summary: ICE in conv_intrinsic_move_alloc, at
fortran/trans-intrinsic.c:9517
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gerhard.steinmetz.fort...@t-online.de
  Target Milestone: ---

Invalid code with a name conflict between variable and subroutine.
Affects versions 5, 6, 7 :


$ cat z1.f90
program p
   class(*), allocatable :: a, b
   call move_alloc (a, b)
contains
   subroutine b
   end
end


$ gfortran-7-20160904 z1.f90
z1.f90:3:0:

call move_alloc (a, b)

internal compiler error: in conv_intrinsic_move_alloc, at
fortran/trans-intrinsic.c:9517
0x77fe2e conv_intrinsic_move_alloc
../../gcc/fortran/trans-intrinsic.c:9516
0x77fe2e gfc_conv_intrinsic_subroutine(gfc_code*)
../../gcc/fortran/trans-intrinsic.c:9770
0x7229f2 trans_code
../../gcc/fortran/trans.c:1746
0x7517a8 gfc_generate_function_code(gfc_namespace*)
../../gcc/fortran/trans-decl.c:6224
0x6dd050 translate_all_program_units
../../gcc/fortran/parse.c:5911
0x6dd050 gfc_parse_file()
../../gcc/fortran/parse.c:6117
0x71f602 gfc_be_parse_file
../../gcc/fortran/f95-lang.c:198

Re: Make max_align_t respect _Float128 [version 2]

2016-09-07 Thread Bernd Edlinger
On Wed, 7 Sep 2016, Joseph Myers wrote:
 > On Wed, 7 Sep 2016, Florian Weimer wrote:
 >
 > > The existence of such a cut-off constant appears useful, but it's not clear if
 > > it should be tied to the fundamental alignment (especially, as this discussion
 > > shows, the fundamental alignment will be somewhat in flux).
 >
 > I don't think it's in flux.  I think it's very rare for a new basic type
 > to be added that has increased alignment requirements compared to the
 > existing basic types.  _Decimal128 was added in GCC 4.3, which increased
 > the requirement on 32-bit x86 (only) (32-bit x86 malloc having been buggy
 > in that regard ever since then); __float128 / _Float128 did not increase
 > the requirement relative to that introduced with _Decimal128.  (Obviously
 > if CPLEX part 2 defines vector types that might have larger alignment
 > requirements it must avoid defining them as basic types.)


Hmm...

interesting.  I just tried the test case from PR 77330 with _Decimal128.
result: _Decimal128 did *not* trap with gcc4.8.4, but it does trap with
gcc-7.0.0.

#include <stdio.h>
#include <stdlib.h>

struct s {
 _Decimal128 f1;
};

int main(void)
{
 struct s *p = malloc(sizeof(struct s));
 printf("%p\n", p);
 p->f1 = 1.234;
 return 0;
}

gcc -m32 test.c

./a.out
0x9588008
Segmentation fault (core dumped)


So it looks to me like the ABI has changed incompatibly recently, and I
believe that we need to proceed more carefully here.  While in the
future most targets will support aligned _Float128, we can
still support 4-byte aligned _Decimal128 on a per-target basis.

I think that we need more flexibility in that area than we have
right now, because config/i386/i386.h is doing the ABI,
but it is shared by linux/glibc, windows, vxworks, darwin,
solaris, to name a few.  I can't believe that we can just
change that ABI for the whole world at the same time.

See my experiment at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77330#c9
which may be a hack, but shows that it is not impossible.



Bernd.


[Bug fortran/77516] New: ICE in is_gimple_min_invariant, at gimple-expr.c:706

2016-09-07 Thread gerhard.steinmetz.fort...@t-online.de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77516

Bug ID: 77516
   Summary: ICE in is_gimple_min_invariant, at gimple-expr.c:706
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gerhard.steinmetz.fort...@t-online.de
  Target Milestone: ---

Invalid code with a zero or negative safelen value.
Affects versions 5, 6, 7 (not supported below 5) at -Os, -O1 or higher.


$ cat z1.f90
program p
   integer :: i, x
   x = 0
!$omp simd safelen(0) reduction(+:x)
   do i = 1, 8
  x = x + 1
   end do
   print *, x
end


$ gfortran-7-20160904 -O2 -fopenmp z1.f90
z1.f90:4:0:

 !$omp simd safelen(0) reduction(+:x)

internal compiler error: Segmentation fault
0xc2100f crash_signal
../../gcc/toplev.c:336
0x98a370 is_gimple_min_invariant(tree_node const*)
../../gcc/gimple-expr.c:706
0x9bc9d0 gimplify_compound_lval
../../gcc/gimplify.c:2208
0x9b394a gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*),
int)
../../gcc/gimplify.c:10495
0x9c6ab5 gimplify_modify_expr
../../gcc/gimplify.c:4824
0x9b578a gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*),
int)
../../gcc/gimplify.c:10543
0x9b8ae6 gimplify_stmt(tree_node**, gimple**)
../../gcc/gimplify.c:5805
0x9b9621 gimplify_and_add(tree_node*, gimple**)
../../gcc/gimplify.c:427
0x9b9621 gimplify_assign(tree_node*, tree_node*, gimple**)
../../gcc/gimplify.c:12062
0xb26abb lower_rec_input_clauses
../../gcc/omp-low.c:5249
0xb2b8b9 lower_omp_for
../../gcc/omp-low.c:15131
0xb1b946 lower_omp_1
../../gcc/omp-low.c:17041
0xb1b946 lower_omp
../../gcc/omp-low.c:17178
0xb1aadc lower_omp_1
../../gcc/omp-low.c:17017
0xb1aadc lower_omp
../../gcc/omp-low.c:17178
0xb1b47c lower_omp_1
../../gcc/omp-low.c:17026
0xb1b47c lower_omp
../../gcc/omp-low.c:17178
0xb2265f execute_lower_omp
../../gcc/omp-low.c:17913
0xb2265f execute
../../gcc/omp-low.c:17950

[Bug target/60918] [mips] unaligned memory copy with LWL/LWR

2016-09-07 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60918

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #3 from Andrew Pinski  ---
This code is correct because GCC assumes memcpy is used on normal memory.
If you use memcpy on non-normal memory you should not be using memcpy here.

   0: 8c04123c  lw  a0,4668(zero)
   4: 3c021234  lui v0,0x1234
   8: 24425678  addiu v0,v0,22136
   c: 8c031240  lw  v1,4672(zero)
  10: ac820000  sw  v0,0(a0)
  14: 88850000  lwl a1,0(a0)  -> first access: LWL command
  18: 00001021  move  v0,zero
  1c: 98850003  lwr a1,3(a0)  -> second access: LWR command
  20: 00000000  nop
  24: a8650000  swl a1,0(v1)  -> for store operation the same procedure
  28: 03e00008  jr  ra
  2c: b8650003  swr a1,3(v1)  -> second store command

Re: [Patch RFC] Modify excess precision logic to permit FLT_EVAL_METHOD=16

2016-09-07 Thread Joseph Myers
On Tue, 6 Sep 2016, James Greenhalgh wrote:

> In c-family/c-cppbuiltin.c I've updated cpp_iec_559_value such that we also
> allow setting __GCC_IEC_559 if FLT_EVAL_METHOD=16, and I've updated
> c_cpp_builtins to handle the new value, and use the new enum names.

I think the special cases in this patch show that the abstractions are 
wrong.

How about instead having more than one target macro / hook?  One would 
indicate that excess precision is used by insn patterns (and be set only 
for i386 and m68k).  Another would indicate the API-level excess precision 
requested by the target and might take an argument to indicate whether the 
"fast" or "standard" case is in use (in the x86 case the results would 
differ depending on the argument, in the ARM case they wouldn't).

* If the first hook returns true, excess precision is unpredictable in the 
"fast" case.  Otherwise excess precision is predictable.

* For short-circuiting excess_precision_type, maybe an internal setting 
EXCESS_PRECISION_NONE would make sense, and other settings would be turned 
into it if permitted by the results of the second hook.

(For s390, talk to the maintainers, but I think they really need to 
eliminate the bogus float_t definition in glibc and the FLT_EVAL_METHOD 
setting that goes along with it.  This is the sort of theoretical ABI 
change that should be safe in practice.)

> +   machine_mode float16_type_mode = (FLOATN_TYPE_NODE (0)
> + ? TYPE_MODE (FLOATN_TYPE_NODE (0))
> + : VOIDmode);

This is obviously wrong.  Use float16_type_node.

> +   ||mode == TYPE_MODE (float_type_node));

Missing space after ||.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [Patch libgcc] Enable HCmode multiply and divide (mulhc3/divhc3)

2016-09-07 Thread Joseph Myers
On Wed, 7 Sep 2016, James Greenhalgh wrote:

> 2016-09-07  James Greenhalgh  
> 

I'd think this should say

PR target/63250

as being part of fixing that bug (although not a fix by itself).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 3/4][Ada,DJGPP] Ada support for DJGPP

2016-09-07 Thread Andris Pavenis

On 09/05/2016 09:42 AM, Eric Botcazou wrote:

Attached output is from last test build (r239639 with DJGPP related patches
applied, last version of patches for Ada).

Very strange error, line 28 of gtype-ada.h is supposed to have a guard for
nodes containing the 'common' structure.  Can you post an excerpt of the file?

Verified that the contents of gtype-ada.h from the DJGPP build are identical to those from my Linux build of 
yesterday's trunk (except for CR LF being used as the line separator in the DJGPP build).


I'll try to find the revision from which the problem appears.

Andris





[Bug target/63346] xserver_xorg-server-1.15.1 crash on RaspberryPi when compiled with gcc-4.9

2016-09-07 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63346

Andrew Pinski  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |5.4

--- Comment #6 from Andrew Pinski  ---
Fixed then.

[Bug bootstrap/30136] bootstrap fail for 4.3-20061209

2016-09-07 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30136

Andrew Pinski  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |INVALID

--- Comment #10 from Andrew Pinski  ---
So let's close as invalid then.

Re: Make max_align_t respect _Float128 [version 2]

2016-09-07 Thread Joseph Myers
On Wed, 7 Sep 2016, Florian Weimer wrote:

> The existence of such a cut-off constant appears useful, but it's not clear if
> it should be tied to the fundamental alignment (especially, as this discussion
> shows, the fundamental alignment will be somewhat in flux).

I don't think it's in flux.  I think it's very rare for a new basic type 
to be added that has increased alignment requirements compared to the 
existing basic types.  _Decimal128 was added in GCC 4.3, which increased 
the requirement on 32-bit x86 (only) (32-bit x86 malloc having been buggy 
in that regard ever since then); __float128 / _Float128 did not increase 
the requirement relative to that introduced with _Decimal128.  (Obviously 
if CPLEX part 2 defines vector types that might have larger alignment 
requirements it must avoid defining them as basic types.)

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] [PR libcpp/71681] Fix handling header.gcc in subdirectories

2016-09-07 Thread Andris Pavenis
This patch fixes the handling of header.gcc in subdirectories when the command line option -remap is 
used. The current version finds header.gcc in directories from the GCC include directory search path but 
fails to find it in subdirectories because of a missing directory separator.


Andris

2016-09-07 Andris Pavenis  

* files.c (remap_filename): Fix handling -remap in subdirectories


From 77e02ba755fa9ea66e046ecf6dbc61c306bc2a71 Mon Sep 17 00:00:00 2001
From: Andris Pavenis 
Date: Wed, 7 Sep 2016 18:22:32 +0300
Subject: [PATCH] * files.c (remap_filename): Fix handling -remap in
 subdirectories

---
 libcpp/files.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/libcpp/files.c b/libcpp/files.c
index c8bb637..26a7330 100644
--- a/libcpp/files.c
+++ b/libcpp/files.c
@@ -1672,7 +1672,7 @@ static char *
 remap_filename (cpp_reader *pfile, _cpp_file *file)
 {
   const char *fname, *p;
-  char *new_dir;
+  char *new_dir, *p3;
   cpp_dir *dir;
   size_t index, len;
 
@@ -1701,9 +1701,15 @@ remap_filename (cpp_reader *pfile, _cpp_file *file)
 	return NULL;
 
   len = dir->len + (p - fname + 1);
-  new_dir = XNEWVEC (char, len + 1);
+  new_dir = XNEWVEC (char, len + 2);
+  p3 = new_dir + dir->len;
   memcpy (new_dir, dir->name, dir->len);
-  memcpy (new_dir + dir->len, fname, p - fname + 1);
+  if (dir->len && !IS_DIR_SEPARATOR(dir->name[dir->len - 1]))
+	{
+	  *p3++ = '/';
+	  len++;
+	}
+  memcpy (p3, fname, p - fname + 1);
   new_dir[len] = '\0';
 
   dir = make_cpp_dir (pfile, new_dir, dir->sysp);
-- 
2.7.4
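
The idea behind the fix above, inserting a separator only when the directory name does not already end in one, can be sketched in isolation. This is not the libcpp code: is_dir_separator and join_dir_prefix are hypothetical helpers, and the separator test assumes POSIX-style paths only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for libcpp's IS_DIR_SEPARATOR; POSIX-only.  */
static int
is_dir_separator (char c)
{
  return c == '/';
}

/* Return a newly allocated "DIR/PREFIX" where PREFIX is the first
   PREFIX_LEN bytes of FNAME.  A '/' is inserted only if DIR is
   non-empty and does not already end in a separator.  The caller
   frees the result.  Returns NULL on allocation failure.  */
static char *
join_dir_prefix (const char *dir, const char *fname, size_t prefix_len)
{
  assert (dir != NULL && fname != NULL);
  assert (prefix_len <= strlen (fname));

  size_t dir_len = strlen (dir);
  int need_sep = dir_len != 0 && !is_dir_separator (dir[dir_len - 1]);

  /* Room for the directory, an optional separator, the prefix, and NUL.  */
  char *out = malloc (dir_len + need_sep + prefix_len + 1);
  if (out == NULL)
    return NULL;

  char *p = out;
  memcpy (p, dir, dir_len);
  p += dir_len;
  if (need_sep)
    *p++ = '/';
  memcpy (p, fname, prefix_len);
  p[prefix_len] = '\0';
  return out;
}
```

Sizing the buffer from need_sep up front keeps the allocation and the terminator position in agreement, which is the same bookkeeping the patch does with len++.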



[Bug middle-end/77515] GCC fusing of multiply-add ["FMA"] occurring at "-O3" withOUT "-ffast-math" and withOUT "-ffp-contract=fast"

2016-09-07 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77515

--- Comment #2 from Andrew Pinski  ---
>This seems to be in violation of preservation of exactly the same results

Huh?  Where did you read that?  

Also, GCC's default is not -ffp-contract=on but:
-ffp-contract=fast


Read:
https://gcc.gnu.org/onlinedocs/gcc-6.2.0/gcc/Optimize-Options.html#index-ffp-contract-715

Where it says that :).

[Bug middle-end/77515] GCC fusing of multiply-add ["FMA"] occurring at "-O3" withOUT "-ffast-math" and withOUT "-ffp-contract=fast"

2016-09-07 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77515

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #1 from Andrew Pinski  ---
GCC enables it with -O2 not just -O3.
GCC also defaults to -ffp-contract=on.

This has been true since 2.95.3 or even earlier (most likely ever since PowerPC
support was added).

[Bug rtl-optimization/77499] [7 Regression] Regression after code-hoisting, due to combine pass failing to evaluate known value range

2016-09-07 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77499

--- Comment #10 from Segher Boessenkool  ---
(In reply to avieira from comment #9)
> > > So I don't know... Only thing I can think of is better "value-range"-like
> > > analysis for combine, but that might be too costly?

That is what nonzero_bits etc. is about.  We could do much better nowadays
with the generic DF framework.

> > So we are not really looking for combine to combine the shift stmt
> > with the xor stmt?  Because combine doesn't consider that because of
> > the multi-use.
> 
> AFAIK, combine will not combine the shift and xor because they are in
> different basic blocks. The multi-use prevents it from tracking the origin
> of r112 back to a point where it knows that its higher bits are all 0.

Yes.  Cross BB prevents combining insns; multiple use does not (but it
does of course make it less likely, and you typically get a 3->2 combination).

Is code hoisting making the code better at all here?  (At RTL level)

[committed] Move class substring_loc from c-family into gcc

2016-09-07 Thread David Malcolm
On Sat, 2016-09-03 at 10:13 +0100, Manuel López-Ibáñez wrote:
> On 02/09/16 23:55, Martin Sebor wrote:
> > > diff --git a/gcc/substring-locations.h b/gcc/substring
> > > -locations.h
> > > index f839c74..bb0de4f 100644
> > > --- a/gcc/substring-locations.h
> > > +++ b/gcc/substring-locations.h
> > > @@ -20,6 +20,73 @@ along with GCC; see the file COPYING3.  If not
> > > see
> > >   #ifndef GCC_SUBSTRING_LOCATIONS_H
> > >   #define GCC_SUBSTRING_LOCATIONS_H
> > > 
> > > > +#include <cpplib.h>
> > > +
> 
> Is this header file going to be used in the middle-end? If so, then
> it is
> suspicious that it includes cpplib.h. Otherwise, perhaps it should
> live in
> c-family/

The include of cpplib.h turned out not to be necessary, so I removed it.
I also reworded the comment for class substring_loc; here's the patch
I ended up committing.

Successfully bootstrapped on x86_64-pc-linux-gnu.

Committed to trunk as r240028.

gcc/ChangeLog:
* Makefile.in (OBJS): Add substring-locations.o.
* langhooks-def.h (class substring_loc): New forward decl.
(lhd_get_substring_location): New decl.
(LANG_HOOKS_GET_SUBSTRING_LOCATION): New macro.
(LANG_HOOKS_INITIALIZER): Add LANG_HOOKS_GET_SUBSTRING_LOCATION.
* langhooks.c (lhd_get_substring_location): New function.
* langhooks.h (class substring_loc): New forward decl.
(struct lang_hooks): Add field get_substring_location.
* substring-locations.c: New file, taking definition of
format_warning_va and format_warning_at_substring from
c-family/c-format.c, making them non-static.
* substring-locations.h (class substring_loc): Move class here
from c-family/c-common.h.  Add and rewrite comments.
(format_warning_va): New decl.
(format_warning_at_substring): New decl.
(get_source_location_for_substring): Add comment.

gcc/c-family/ChangeLog:
* c-common.c (get_cpp_ttype_from_string_type): Handle being passed
a POINTER_TYPE.
(substring_loc::get_location): Move to substring-locations.c,
keeping implementation as...
(c_get_substring_location): New function, from the above, reworked
to use accessors rather than member lookup.
* c-common.h (class substring_loc): Move to substring-locations.h,
replacing with a forward decl.
(c_get_substring_location): New decl.
* c-format.c: Include "substring-locations.h".
(format_warning_va): Move to substring-locations.c.
(format_warning_at_substring): Likewise.

gcc/c/ChangeLog:
* c-lang.c (LANG_HOOKS_GET_SUBSTRING_LOCATION): Use
c_get_substring_location for this new langhook.

gcc/testsuite/ChangeLog:
* gcc.dg/plugin/diagnostic_plugin_test_string_literals.c: Include
"substring-locations.h".
---
 gcc/Makefile.in|   1 +
 gcc/c-family/c-common.c|  23 +--
 gcc/c-family/c-common.h|  32 +---
 gcc/c-family/c-format.c| 157 +
 gcc/c/c-lang.c |   3 +
 gcc/langhooks-def.h|   8 +-
 gcc/langhooks.c|   8 +
 gcc/langhooks.h|   9 +
 gcc/substring-locations.c  | 195 +
 gcc/substring-locations.h  |  71 
 .../diagnostic_plugin_test_string_literals.c   |   1 +
 11 files changed, 312 insertions(+), 196 deletions(-)
 create mode 100644 gcc/substring-locations.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 7c18285..332c85e 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1443,6 +1443,7 @@ OBJS = \
store-motion.o \
streamer-hooks.o \
stringpool.o \
+   substring-locations.o \
target-globals.o \
targhooks.o \
timevar.o \
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 399ba97..ec1d87a 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -1122,6 +1122,9 @@ static enum cpp_ttype
 get_cpp_ttype_from_string_type (tree string_type)
 {
   gcc_assert (string_type);
+  if (TREE_CODE (string_type) == POINTER_TYPE)
+string_type = TREE_TYPE (string_type);
+
   if (TREE_CODE (string_type) != ARRAY_TYPE)
 return CPP_OTHER;
 
@@ -1148,23 +1151,23 @@ get_cpp_ttype_from_string_type (tree string_type)
 
 GTY(()) string_concat_db *g_string_concat_db;
 
-/* Attempt to determine the source location of the substring.
-   If successful, return NULL and write the source location to *OUT_LOC.
-   Otherwise return an error message.  Error messages are intended
-   for GCC developers (to help debugging) rather than for end-users.  */
+/* Implementation of LANG_HOOKS_GET_SUBSTRING_LOCATION.  */
 
 const char *
-substring_loc::get_location (location_t *out_loc) const
+c_get_substring_location (const 

[Bug tree-optimization/77514] [6/7 Regression] ICE in VN_INFO_GET, at tree-ssa-sccvn.c:406 w/ -O2 (and above)

2016-09-07 Thread trippels at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77514

Markus Trippelsdorf  changed:

   What|Removed |Added

   Keywords||ice-on-invalid-code

--- Comment #2 from Markus Trippelsdorf  ---
On the other hand: l0 += (*rs ^ (l0 &= 1));
looks like undefined behavior.
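
To illustrate why that statement is suspect: the compound assignment reads l0 for the addition while the nested "l0 &= 1" modifies it, and nothing orders the two. A sketch of two well-defined rewrites follows; the function names are hypothetical and int is used instead of char to keep the arithmetic simple. Neither rewrite is claimed to be what the testcase author intended.

```c
/* Ordering A: apply the mask to L0 first, then add.  */
static int
mask_then_add (int l0, int rs)
{
  l0 &= 1;
  l0 += rs ^ l0;
  return l0;
}

/* Ordering B: read the old L0 for the addition, then mask.  */
static int
add_to_old_value (int l0, int rs)
{
  int old = l0;
  l0 &= 1;
  return old + (rs ^ l0);
}
```

With l0 = 3 and rs = 0, ordering A yields 2 and ordering B yields 4, so the original expression has no single meaning a compiler is obliged to pick.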

[Bug tree-optimization/77514] [6/7 Regression] ICE in VN_INFO_GET, at tree-ssa-sccvn.c:406 w/ -O2 (and above)

2016-09-07 Thread trippels at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77514

Markus Trippelsdorf  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-09-07
 CC||trippels at gcc dot gnu.org
Summary|[7 Regression] ICE in   |[6/7 Regression] ICE in
   |VN_INFO_GET, at |VN_INFO_GET, at
   |tree-ssa-sccvn.c:406 w/ -O2 |tree-ssa-sccvn.c:406 w/ -O2
   |(and above) |(and above)
 Ever confirmed|0   |1

--- Comment #1 from Markus Trippelsdorf  ---
gcc-6 is also affected. 
Git blame points to r228320.

[Bug middle-end/77515] New: GCC fusing of multiply-add ["FMA"] occurring at "-O3" withOUT "-ffast-math" and withOUT "-ffp-contract=fast"

2016-09-07 Thread abe_skolnik at yahoo dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77515

Bug ID: 77515
   Summary: GCC fusing of multiply-add ["FMA"] occurring at "-O3"
withOUT "-ffast-math" and withOUT "-ffp-contract=fast"
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: abe_skolnik at yahoo dot com
  Target Milestone: ---

In GCC, on both x86_64 and AArch64, fusing of multiply-add ["FMA"] is occurring
at "-O3" withOUT "-ffast-math" and withOUT "-ffp-contract=fast".  This seems to
be in violation of preservation of exactly the same results as "-O2" and lower
without those "-f<...>" flags, since the fusing may result in higher precision
and therefore different results.

Clang/LLVM without "-f<...>" flags, on both x86_64 and AArch64, only performs
fusing at "-Ofast", not at "-O3".



fma_test.c
--
double fma(double a, double b, double c) {
  return a*b+c;
}



example console log
---
> clang_amd64 -march=haswell -O3 fma_test.c -S -o - | grep -c fmadd
0
> clang_amd64 -march=haswell -O3 fma_test.c -S -o - | egrep -c '(addsd|mulsd)'
2

> clang_amd64 -march=haswell -Ofast fma_test.c -S -o - | grep -c fmadd
1
> clang_amd64 -march=haswell -Ofast fma_test.c -S -o - | egrep -c '(addsd|mulsd)'
0

> clang_aarch64 -O3 fma_test.c -S -o - | grep -c fmadd
0
> clang_aarch64 -O3 fma_test.c -S -o - | egrep -c '(fadd|fmul)'
2

> clang_aarch64 -Ofast fma_test.c -S -o - | grep -c fmadd
1
> clang_aarch64 -Ofast fma_test.c -S -o - | egrep -c '(fadd|fmul)'
0

> gcc_aarch64 -O3 fma_test.c -S -o - | grep -c fmadd
1
> gcc_aarch64 -O3 fma_test.c -S -o - | egrep -c '(fadd|fmul)'
0

> gcc_amd64 -march=haswell -O3 fma_test.c -S -o - | grep -c fmadd
1
> gcc_amd64 -march=haswell -O3 fma_test.c -S -o - | egrep -c '(addsd|mulsd)'
0
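
The claim that fusing can change results is easy to reproduce without any FMA hardware. The sketch below uses float inputs and emulates the single-rounding behaviour in double, which is exact for this example because the product of two floats fits in a double. Both function names are hypothetical, and the emulation is an illustration only, not a general replacement for fmaf.

```c
/* a*b + c with the product rounded to float before the addition.
   The volatile temporary keeps the compiler from contracting the
   two operations into one fused instruction.  */
static float
mul_add_unfused (float a, float b, float c)
{
  volatile float prod = a * b;
  return prod + c;
}

/* Emulation of a single-rounding multiply-add for float inputs: the
   product of two floats is exact in double (24 + 24 <= 53 bits), so
   in this example only the final conversion to float rounds.  */
static float
mul_add_single_rounding (float a, float b, float c)
{
  double exact_prod = (double) a * (double) b;
  return (float) (exact_prod + (double) c);
}
```

For a = 1 + 2^-13, b = 1 - 2^-13 and c = -1, the exact product is 1 - 2^-26, which rounds to 1.0f in float. The unfused version therefore returns 0, while the single-rounding version returns -2^-26.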

[Bug tree-optimization/77514] New: [7 Regression] ICE in VN_INFO_GET, at tree-ssa-sccvn.c:406 w/ -O2 (and above)

2016-09-07 Thread asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77514

Bug ID: 77514
   Summary: [7 Regression] ICE in VN_INFO_GET, at
tree-ssa-sccvn.c:406 w/ -O2 (and above)
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: asolokha at gmx dot com
  Target Milestone: ---

gcc-7.0.0-alpha20160904 ICEs when compiling the following reduced snippet at
-O2 (-O3, -Ofast):

void
m1 (char l0, char e8, int hw)
{
  char *rs = 

 yu:
  l0 = 1;
  while (l0 != 0)
{
  l0 = -l0;
  l0 += (*rs ^ (l0 &= 1));
}
  for (;;)
{
  if (hw != 0)
goto yu;
  rs = 
}
}


% gcc-7.0.0-alpha20160904 -O2 -c nhbgsfda.c -Wall -Wextra -Wpedantic
nhbgsfda.c: In function 'm1':
nhbgsfda.c:11:10: warning: operation on 'l0' may be undefined
[-Wsequence-point]
   l0 += (*rs ^ (l0 &= 1));
  ^~
nhbgsfda.c:2:1: internal compiler error: in VN_INFO_GET, at
tree-ssa-sccvn.c:406
 m1 (char l0, char e8, int hw)
 ^~

[Bug fortran/57117] [OOP] ICE for sourced allocation of a polymorphic entity using TRANSPOSE

2016-09-07 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57117

--- Comment #14 from Dominique d'Humieres  ---
> Created attachment 39581 [details]
> Shorter version to fix the issue.
> ...

The patch fixes the ICEs, but generates wrong-code for the original test: the
output at run time is

 FAIL:  T T T T F T T T T T T T T T T

Same thing for the test in comment 5:

 ti
   1.2.3.  
 4.5.6.   
7.8.9. 
 ri
   1.2.3.  
 4.5.0.   
0.0.0. 
   1   2   3   4   0   0   
   0   0   0

Re: why do we need xtensa-config.h?

2016-09-07 Thread augustine.sterl...@gmail.com
On Wed, Sep 7, 2016 at 9:21 AM, augustine.sterl...@gmail.com
 wrote:
> Hope this helps, and I'm happy to answer more questions.

Also, one technique commonly used by people who ship software for
Xtensa is to write it such that it could compile for any variant at
all. This requires care, but is quite doable.


Re: why do we need xtensa-config.h?

2016-09-07 Thread augustine.sterl...@gmail.com
On Tue, Sep 6, 2016 at 11:55 PM, Thomas Schwinge
 wrote:
> Hi!
>
> Neither do I really know anything about Xtensa, nor do I have a lot of
> experience in these parts of GCC back ends, but:

There is a lot of background to know here. Unfortunately, I have no
familiarity with making debian packages, so I'm unfamiliar with that
side of it.

First--and perhaps most important--the current method of configuring
GCC for xtensa targets has worked well for nearly two decades. As far
as I know, it is rare to encounter problems. Because of that, the bar
to change it will probably be fairly high.

> On Tue, 6 Sep 2016 20:42:53 +0200, Oleksij Rempel  
> wrote:
>> i'm one of ath9k-htc-firmware developers. Currently i'm looking for the
>> way to provide this firmware as opensource/free package for debian. Main
>> problem seems to be the need to patch gcc xtensa-config.h to make it
>> suitable for our CPU.
>>
>> I have the following questions:
>>
>> do we really need this patch?
>> https://github.com/qca/open-ath9k-htc-firmware/blob/master/local/patches/gcc.patch
>
> That I can't tell.  ;-)

You need something like that patch, for sure.

>> Is it possible or welcome to extend gcc to be configurable without
>> patching it all the time?
>
> Yes, I would think.  The macros modified in the above patch to GCC's
> include/xtensa-config.h file look like these ought to be modifiable with
> -m* options defined by the Xtensa back end, and you'd then assign
> specific defaults to a specific CPU variant, and build GCC (or build a
> multilib) for that configuration.

Today, there are literally hundreds of variants of the xtensa cpu
actually realized and in use. Having a list of all those variants and
their defaults inside gcc would be awkward and unwieldy.

But--and here's the rub--literally tomorrow, someone could design a
hundred more that are different from all of the ones already out
there. There is literally an unlimited number of potential variants,
each with potentially brand new, never conceived instructions. (Adding
clever custom instructions is xtensa's raison d'etre.)

With the current configurability mechanism, supporting all of those
variants inside gcc (and, in fact, the rest of the gnu-toolchain) is
simply a matter of using the correct xtensa-config.h for that
particular variant. If we were to go with the "-m with defaults"
mechanism, we would need some way of adding the defaults for the new
variant to gcc.

But that is patching gcc also, and once you go there, you may as well
use the original method.

>
> This file include/xtensa-config.h is #included in
> gcc/config/xtensa/xtensa.h and libgcc/config/xtensa/crti.S,

Note that "-m" options can't change the instructions in crti.S and
lib?funcs.S, but macros can and do.
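
As a rough illustration of that difference, here is how a configuration macro can select an implementation at preprocessing time. DEMO_HAVE_HW_MUL and demo_mul32 are made-up names for this sketch; the real xtensa-config.h defines its own macros with its own meanings.

```c
#include <stdint.h>

/* Hypothetical per-variant configuration macro; the real
   xtensa-config.h uses its own names and values.  */
#ifndef DEMO_HAVE_HW_MUL
#define DEMO_HAVE_HW_MUL 0
#endif

/* Multiply two 32-bit values.  The macro selects the implementation
   at preprocessing time, so the same approach also works in
   hand-written assembly sources that are run through the
   preprocessor.  */
static uint32_t
demo_mul32 (uint32_t a, uint32_t b)
{
#if DEMO_HAVE_HW_MUL
  return a * b;			/* variant with a multiplier */
#else
  uint32_t result = 0;		/* shift-and-add fallback */
  while (b != 0)
    {
      if (b & 1)
	result += a;
      a <<= 1;
      b >>= 1;
    }
  return result;
#endif
}
```

Swapping in a different header changes which branch is compiled without touching the source that uses it, which is the property the current mechanism relies on.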



On the debian packaging side. Forgive me for my ignorance on the
topic; I don't know what the tool-flow is, or what the requirements
are. As far as I am aware, this is the first time someone has tried to
make a debian package for xtensa.

Anyway, I wouldn't expect patching gcc (or configuring it) for an
individual package is the right thing. It should probably happen as
part of some kind of "setup toolchain" step.

Typically, patching gcc for a xtensa config happens just once
immediately after designing the processor, or--if you aren't the
designer yourself--when one starts development for that variant.

Surely if someone is building this package, they would have already
built some other software for that particular xtensa target. (Perhaps
as part of a larger debian build?)

Also, this package should probably only be built when targeting this
particular xtensa variant (not xtensa generally). I don't know how one
restricts this in the debian packaging mechanism.

Hope this helps, and I'm happy to answer more questions.


[Patch libgcc] Enable HCmode multiply and divide (mulhc3/divhc3)

2016-09-07 Thread James Greenhalgh

Hi,

This patch arranges for half-precision complex multiply and divide
routines to be built if __LIBGCC_HAS_HF_MODE__.  This will be true
if the target supports the _Float16 type.

OK?

Thanks,
James

---

libgcc/

2016-09-07  James Greenhalgh  

* Makefile.in (lib2funcs): Build _mulhc3 and _divhc3.
* libgcc2.h (LIBGCC2_HAS_HF_MODE): Conditionally define.
(HFtype): Likewise.
(HCtype): Likewise.
(__divhc3): Likewise.
(__mulhc3): Likewise.
* libgcc2.c: Support _mulhc3 and _divhc3.

diff --git a/libgcc/Makefile.in b/libgcc/Makefile.in
index ba37c65..53e3ea2 100644
--- a/libgcc/Makefile.in
+++ b/libgcc/Makefile.in
@@ -414,8 +414,9 @@ lib2funcs = _muldi3 _negdi2 _lshrdi3 _ashldi3 _ashrdi3 _cmpdi2 _ucmpdi2	   \
 	_negvsi2 _negvdi2 _ctors _ffssi2 _ffsdi2 _clz _clzsi2 _clzdi2  \
 	_ctzsi2 _ctzdi2 _popcount_tab _popcountsi2 _popcountdi2	   \
 	_paritysi2 _paritydi2 _powisf2 _powidf2 _powixf2 _powitf2	   \
-	_mulsc3 _muldc3 _mulxc3 _multc3 _divsc3 _divdc3 _divxc3	   \
-	_divtc3 _bswapsi2 _bswapdi2 _clrsbsi2 _clrsbdi2
+	_mulhc3 _mulsc3 _muldc3 _mulxc3 _multc3 _divhc3 _divsc3	   \
+	_divdc3 _divxc3 _divtc3 _bswapsi2 _bswapdi2 _clrsbsi2	   \
+	_clrsbdi2
 
 # The floating-point conversion routines that involve a single-word integer.
 # XX stands for the integer mode.
diff --git a/libgcc/libgcc2.c b/libgcc/libgcc2.c
index 0a716bf..ec3b21f 100644
--- a/libgcc/libgcc2.c
+++ b/libgcc/libgcc2.c
@@ -1852,7 +1852,8 @@ NAME (TYPE x, int m)
 
 #endif
 
-#if ((defined(L_mulsc3) || defined(L_divsc3)) && LIBGCC2_HAS_SF_MODE) \
+#if((defined(L_mulhc3) || defined(L_divhc3)) && LIBGCC2_HAS_HF_MODE) \
+|| ((defined(L_mulsc3) || defined(L_divsc3)) && LIBGCC2_HAS_SF_MODE) \
 || ((defined(L_muldc3) || defined(L_divdc3)) && LIBGCC2_HAS_DF_MODE) \
 || ((defined(L_mulxc3) || defined(L_divxc3)) && LIBGCC2_HAS_XF_MODE) \
 || ((defined(L_multc3) || defined(L_divtc3)) && LIBGCC2_HAS_TF_MODE)
@@ -1861,7 +1862,13 @@ NAME (TYPE x, int m)
 #undef double
 #undef long
 
-#if defined(L_mulsc3) || defined(L_divsc3)
+#if defined(L_mulhc3) || defined(L_divhc3)
+# define MTYPE	HFtype
+# define CTYPE	HCtype
+# define MODE	hc
+# define CEXT	__LIBGCC_HF_FUNC_EXT__
+# define NOTRUNC __LIBGCC_HF_EXCESS_PRECISION__
+#elif defined(L_mulsc3) || defined(L_divsc3)
 # define MTYPE	SFtype
 # define CTYPE	SCtype
 # define MODE	sc
@@ -1922,7 +1929,7 @@ extern void *compile_type_assert[sizeof(INFINITY) == sizeof(MTYPE) ? 1 : -1];
 # define TRUNC(x)	__asm__ ("" : "=m"(x) : "m"(x))
 #endif
 
-#if defined(L_mulsc3) || defined(L_muldc3) \
+#if defined(L_mulhc3) || defined(L_mulsc3) || defined(L_muldc3) \
 || defined(L_mulxc3) || defined(L_multc3)
 
 CTYPE
@@ -1992,7 +1999,7 @@ CONCAT3(__mul,MODE,3) (MTYPE a, MTYPE b, MTYPE c, MTYPE d)
 }
 #endif /* complex multiply */
 
-#if defined(L_divsc3) || defined(L_divdc3) \
+#if defined(L_divhc3) || defined(L_divsc3) || defined(L_divdc3) \
 || defined(L_divxc3) || defined(L_divtc3)
 
 CTYPE
diff --git a/libgcc/libgcc2.h b/libgcc/libgcc2.h
index 72bb873..c46fb77 100644
--- a/libgcc/libgcc2.h
+++ b/libgcc/libgcc2.h
@@ -34,6 +34,12 @@ extern void __clear_cache (char *, char *);
 extern void __eprintf (const char *, const char *, unsigned int, const char *)
   __attribute__ ((__noreturn__));
 
+#ifdef __LIBGCC_HAS_HF_MODE__
+#define LIBGCC2_HAS_HF_MODE 1
+#else
+#define LIBGCC2_HAS_HF_MODE 0
+#endif
+
 #ifdef __LIBGCC_HAS_SF_MODE__
 #define LIBGCC2_HAS_SF_MODE 1
 #else
@@ -133,6 +139,10 @@ typedef unsigned int UTItype	__attribute__ ((mode (TI)));
 #endif
 #endif
 
+#if LIBGCC2_HAS_HF_MODE
+typedef		float HFtype	__attribute__ ((mode (HF)));
+typedef _Complex float HCtype	__attribute__ ((mode (HC)));
+#endif
 #if LIBGCC2_HAS_SF_MODE
 typedef 	float SFtype	__attribute__ ((mode (SF)));
 typedef _Complex float SCtype	__attribute__ ((mode (SC)));
@@ -424,6 +434,10 @@ extern SItype __negvsi2 (SItype);
 #endif /* COMPAT_SIMODE_TRAPPING_ARITHMETIC */
 
 #undef int
+#if LIBGCC2_HAS_HF_MODE
+extern HCtype __divhc3 (HFtype, HFtype, HFtype, HFtype);
+extern HCtype __mulhc3 (HFtype, HFtype, HFtype, HFtype);
+#endif
 #if LIBGCC2_HAS_SF_MODE
 extern DWtype __fixsfdi (SFtype);
 extern SFtype __floatdisf (DWtype);
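
For readers unfamiliar with the calling convention of these routines, the following sketch shows the four-scalar interface with the textbook product formula. It is deliberately simplified: struct demo_complex and demo_mulc are hypothetical, float stands in for the half-precision type, and the infinity/NaN recovery that the libgcc implementation performs is omitted.

```c
/* Hypothetical stand-in for a complex value; libgcc uses _Complex
   types such as HCtype instead.  */
struct demo_complex
{
  float re;
  float im;
};

/* Textbook product (a + ib) * (c + id).  Unlike the libgcc routines,
   this sketch makes no attempt to recover infinities when the naive
   formula yields NaN + iNaN.  */
static struct demo_complex
demo_mulc (float a, float b, float c, float d)
{
  struct demo_complex z;
  z.re = a * c - b * d;
  z.im = a * d + b * c;
  return z;
}
```

For example, (1 + 2i) * (3 + 4i) gives -5 + 10i.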


[Bug target/63346] xserver_xorg-server-1.15.1 crash on RaspberryPi when compiled with gcc-4.9

2016-09-07 Thread ps.report at gmx dot net
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63346

--- Comment #5 from Peter Seiderer  ---
Seems to be fixed in 5.4.0, tested with the original buildroot/xserver/dillo
testcase (with up to date buildroot) and the provided fbpict.c testcase.

[Bug c++/77513] -Wzero-as-null-pointer-constant vs 0, nullptr, NULL and __null

2016-09-07 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77513

--- Comment #3 from Jonathan Wakely  ---
(In reply to petschy from comment #0)
> For c++11 and later code, why is NULL defined as __null, rather than nullptr?

Because defining NULL as nullptr would violate the requirements of the
standard, which very intentionally says that NULL is an integral constant
expression, not nullptr.

Re: [PATCH] Detect whether target can use -fprofile-update=atomic

2016-09-07 Thread Christophe Lyon
On 7 September 2016 at 11:34, Martin Liška  wrote:
> On 09/07/2016 09:45 AM, Christophe Lyon wrote:
>> On 6 September 2016 at 15:45, Martin Liška  wrote:
>>> On 09/06/2016 03:31 PM, Jakub Jelinek wrote:
>>>> sizeof (gcov_type) talks about the host gcov type, you want instead the
>>>> target gcov type.  So
>>>> TYPE_SIZE (gcov_type_node) == 32 vs. 64 (or TYPE_SIZE_UNIT (gcov_type_node)
>>>> == 4 vs. 8).
>>>> As SImode and DImode are in fact 4*BITS_PER_UNIT and 8*BITS_PER_UNIT,
>>>> TYPE_SIZE_UNIT comparisons for 4 and 8 are most natural.
>>>> And I wouldn't add gcc_unreachable, just warn for weirdo arches always.
>>>>
>>>>   Jakub
>>>
>>> Thank you Jakub for helping me with that. I've used TYPE_SIZE_UNIT macro.
>>>
>>> Ready for trunk?
>>> Martin
>>
>> Hi Martin,
>>
>> On targets which do not support atomic profile update, your patch generates a
>> warning on gcc.dg/tree-prof/val-profiler-threads-1.c, making it fail.
>>
>> Do we need a new effective-target ?
>>
>> Christophe
>>
>
> Hi.
>
> Thanks for observation, I'm sending a patch that does that.
> Can you please test it?
>
It does work indeed, thanks.
(tested on arm* targets)

Christophe

> Thanks,
> Martin


Re: [PING] Re: [PATCH, i386] Fix some warnings/errors that appear when enabling -Wnarrowing when building gcc

2016-09-07 Thread Uros Bizjak
On Tue, Sep 6, 2016 at 8:06 PM, Eric Gallager  wrote:
> On 9/6/16, Uros Bizjak  wrote:
>> On Tue, Sep 6, 2016 at 5:33 PM, Eric Gallager  wrote:
>>> Ping? CC-ing an i386 maintainer since the patch mostly touches
>>> i386-specific files. Also, to clarify, I say "warnings/errors" because
>>> they start off as warnings in stage 1 but then become errors in stage
>>> 2. Note also that my patch leaves out the part where I modify the
>>> configure script to enable -Wnarrowing, because the rest of the code
>>> isn't quite ready for that yet.
>>
>> You are probably referring to [1]? It looks OK, modulo:
>>
>> +DEF_TUNE (X86_TUNE_QIMODE_MATH, "qimode_math", ~(0U))
>>
>> where parentheses are not needed.
>>
>>
>> Please resubmit the patch with a ChangeLog entry, as instructed in [2]
>>
>> [1] https://gcc.gnu.org/ml/gcc-patches/2016-08/msg02129.html
>> [2] https://gcc.gnu.org/contribute.html#patches
>>
>> Uros.
>>
>
>
> Okay, reattached. Here's a ChangeLog entry to put in gcc/ChangeLog:
>
> 2016-09-06  Eric Gallager  
>
> * config/i386/i386.c: Add 'U' suffix to constants to avoid
> -Wnarrowing.
> * config/i386/x86-tune.def: Likewise.
> * opts.c: Likewise.
>
>
> (Please also note that I don't have commit access.)

Thanks, committed with slightly adjusted ChangeLog:

2016-09-07  Eric Gallager  

* config/i386/i386.c: Add 'U' suffix to processor feature bits
to avoid -Wnarrowing warning.
* config/i386/x86-tune.def: Likewise for DEF_TUNE selector bitmasks.
* opts.c: Likewise for SANITIZER_OPT bitmasks.

Uros.
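
A small C sketch of what the 'U' suffix changes in such bitmask constants. The macro and function names are hypothetical and not taken from i386.c or x86-tune.def. The narrowing diagnostic itself is a C++11 matter; the C code below only shows the types and values involved, assuming a 32-bit unsigned int.

```c
#include <limits.h>

/* Hypothetical feature-bit table in the style of a tuning mask.  */
#define DEMO_FEATURE_A (1U << 0)
#define DEMO_FEATURE_Z (1U << 31)	/* 1 << 31 would overflow a 32-bit int */
#define DEMO_ALL_FEATURES (~0U)		/* unsigned all-ones; ~0 is the int -1 */

/* Nonzero if every bit in WANTED is set in MASK.  */
static int
demo_has_features (unsigned int mask, unsigned int wanted)
{
  return (mask & wanted) == wanted;
}
```

Without the suffix the constants have type int, so initializing an unsigned field from them in a braced C++ initializer is a narrowing conversion whenever the value is negative.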


[Bug target/77483] [6/7 regression] gcc.target/i386/mask-unpack.c etc. FAIL

2016-09-07 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77483

H.J. Lu  changed:

   What|Removed |Added

 CC||hjl.tools at gmail dot com

--- Comment #5 from H.J. Lu  ---
(In reply to Eric Botcazou from comment #2)
> 
> Well, this patch is a workaround for a pass that wreaks serious havoc except
> on Linux.  Feel free to come up with a better solution...

Is there a bug report?

[Bug fortran/60483] [5/6/7 Regression] No IMPLICIT type error with: ASSOCIATE( X => derived_type() ) [i.e. w/ structure constructor]

2016-09-07 Thread vehre at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60483

vehre at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 CC||vehre at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |vehre at gcc dot gnu.org

Re: [x86] Disable STV pass if -mstackrealign is enabled.

2016-09-07 Thread H.J. Lu
On Wed, Aug 31, 2016 at 12:29 PM, Uros Bizjak  wrote:
>> the new STV pass generates SSE instructions in 32-bit mode very late in the
>> pipeline and doesn't bother about realigning the stack, so it wreaks havoc on
>> OSes where you need to realign the stack, e.g. Windows, but I guess Solaris 
>> is
>> equally affected.  Therefore the attached patch disables it if -mstackrealign
>> is enabled (the option is automatically enabled on Windows and Solaris when
>> SSE support is enabled), as already done for -mpreferred-stack-boundary={2,3}
>> and -mincoming-stack-boundary={2,3}.
>>
>> Tested on x86/Windows, OK for mainline and 6 branch?
>>
>>
>> 2016-08-31  Eric Botcazou  
>>
>>* config/i386/i386.c (ix86_option_override_internal): Also disable the
>>STV pass if -mstackrealign is enabled.
>
> OK for mainline and gcc-6 branch.
>

Is there a testcase to show the problem with -mincoming-stack-boundary=
on Linux?

-- 
H.J.


PING: Re: [PATCH, LRA] Fix PR rtl-optimization 77289, LRA matching constraint problem

2016-09-07 Thread Peter Bergner
Ping this patch:

https://gcc.gnu.org/ml/gcc-patches/2016-08/msg02099.html

Peter



[Bug fortran/57117] [OOP] ICE for sourced allocation of a polymorphic entity using TRANSPOSE

2016-09-07 Thread vehre at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57117

--- Comment #13 from vehre at gcc dot gnu.org ---
Created attachment 39581
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39581&action=edit
Shorter version to fix the issue.

Hi all,

Dominique pointed out that the patches proposed by Paul conflict with my accaf
patch. I took a look and found a less intrusive way to fix this issue.

Note!!! This patch is based on my patch for pr72832 available from:

https://gcc.gnu.org/ml/fortran/2016-09/msg7.html

Bootstraps and regtests ok on x86_64-linux/F23. I haven't tested this patch in
combination with my accaf patch yet (time constraints).

- Andre

Re: Ping**2! Re: [PATCH, Fortran] Extension: AUTOMATIC/STATIC symbol attributes with -fdec-static

2016-09-07 Thread Andre Vehreschild
Hi Fritz,

please note: I do not have official review privileges. So my vote here
is rather advice to you and the official reviewers. Often such an
unofficial review helps to speed things up, because the official ones
are pointed to the nics and nacs and don't have to bother with the
minor things.

So here it comes:

- Do I understand this correctly: AUTOMATIC and STATIC have to come last,
  i.e., right before the :: where declaring, e.g., a variable?


- Running:

  $ contrib/check_GNU_style.sh dec_static.patch

  Reports some style issues in the C code, that should be fixed before
  commit. (Style in Fortran testcases does not matter that much.)


- I have deleted huge parts of the diff and just kept the parts I had a
  question/remark for:

> diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
> index b34ae86..a0cf78b 100644
> --- a/gcc/fortran/gfortran.texi
> +++ b/gcc/fortran/gfortran.texi
> @@ -2120,7 +2121,6 @@ consider @code{BACKSPACE} or @code{REWIND} to properly 
> position
>  the file before the EOF marker.  As an extension, the run-time error may
>  be disabled using -std=legacy.
>  
> -

Please change formatting in a separate patch or not at all (here!).
This policy is to distinguish cosmetic changes from relevant ones.

>  @node STRUCTURE and RECORD
>  @subsection @code{STRUCTURE} and @code{RECORD}
>  @cindex @code{STRUCTURE}
> @@ -2420,6 +2420,53 @@ here:
>@tab @code{--} @tab @code{FLOATI} @tab @code{FLOATJ} @tab @code{FLOATK}
>  @end multitable
>  
> +@node AUTOMATIC and STATIC attributes
> +@subsection @code{AUTOMATIC} and @code{STATIC} attributes
> +@cindex variable attributes
> +@cindex @code{AUTOMATIC}
> +@cindex @code{STATIC}
> +
> +With @option{-fdec-static} GNU Fortran supports the explicit specification of
> +two addition variable attributes: @code{STATIC} and @code{AUTOMATIC}. These

two additional variable ...
^^ 

But is it only for variables? Can't it be used for equivalences or
other constructs, too?

> diff --git a/gcc/fortran/invoke.texi b/gcc/fortran/invoke.texi
> index 15c131a..a5da59e 100644
> --- a/gcc/fortran/invoke.texi
> +++ b/gcc/fortran/invoke.texi
> @@ -255,6 +255,11 @@ instead where possible.
>  Enable B/I/J/K kind variants of existing integer functions (e.g. BIAND, 
> IIAND,
>  JIAND, etc...). For a complete list of intrinsics see the full documentation.
>  
> +@item -fdec-static
> +@opindex @code{fdec-static}
> +Enable STATIC and AUTOMATIC as attributes specifying storage location.
> +STATIC is equivalent to SAVE, and locals are typically AUTOMATIC by default.

Well, this description to me sounds like: "Those attributes are
useless, because they can be substituted." This is clearly not what you
intend. I propose to include into the description that with "this
switch the dec-extension" is available "to explicitly specify the
storage of entities". Then the last sentence is still a good hint for
all fortraneers that don't know the extension.

> +
>  @item -fdollar-ok
>  @opindex @code{fdollar-ok}
>  @cindex @code{$}
> diff --git a/gcc/fortran/lang.opt b/gcc/fortran/lang.opt
> index 8ec5400..260512d 100644
> --- a/gcc/fortran/lang.opt
> +++ b/gcc/fortran/lang.opt
> @@ -432,6 +432,10 @@ fdec-structure
>  Fortran
>  Enable support for DEC STRUCTURE/RECORD.
>  
> +fdec-static
> +Fortran Var(flag_dec_static)
> +Enable STATIC and AUTOMATIC attributes.

How about: Enable the dec-extension of STATIC and AUTOMATIC attributes.
Just a proposal.

> +
>  fdefault-double-8
>  Fortran Var(flag_default_double)
>  Set the default double precision kind to an 8 byte wide type.


> diff --git a/gcc/testsuite/gfortran.dg/dec_static_1.f90 
> b/gcc/testsuite/gfortran.dg/dec_static_1.f90
> new file mode 100644
> index 000..4dcfc7c
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/dec_static_1.f90

-  Please add some testcases where the new error messages are tested.

So much from my side. Btw, I haven't applied the patch and tested
whether it runs or collides with other proposed patches. That is
usually done by Dominique and I did not want to waste time doing it a second
time.

Regards,
Andre
-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


Re: [PATCH][v3] GIMPLE store merging pass

2016-09-07 Thread Bernd Schmidt



On 09/07/2016 10:19 AM, Richard Biener wrote:

On Tue, 6 Sep 2016, Jakub Jelinek wrote:



If you want a 64-bit store, you'd need to merge the two, and that would be
even more expensive.  It is a matter of say:
movl $0x12345678, (%rsp)
movl $0x09abcdef, 4(%rsp)
vs.
movabsq $0x09abcdef12345678, %rax
movq %rax, (%rsp)
vs.
movl $0x09abcdef, %eax
salq $32, %rax
orq $0x12345678, %rax
movq %rax, (%rsp)


vs.

movq $LC0, (%rsp)

?


Not the same. That moves the address of $LC0.


Bernd


[PATCH][expmed.c] PR middle-end/77426 Delete duplicate condition in synth_mult

2016-09-07 Thread Kyrill Tkachov

Hi all,

The duplicate mode check in synth_mult can just be deleted IMO. It was introduced as 
part of r139821 that was
a much larger change introducing size/speed differentiation to the RTL midend. 
So I think it's just a typo/copy-pasto.

Tested on aarch64-none-elf.
Ok?

Thanks,
Kyrill

2016-09-07  Kyrylo Tkachov  

PR middle-end/77426
* expmed.c (synth_mult): Delete duplicate mode check.
diff --git a/gcc/expmed.c b/gcc/expmed.c
index 1cedf023c8e8916d887bd3a9d9a723e3cc2354f7..a5da8836f21debcda3b834cb869348ea6cb33414 100644
--- a/gcc/expmed.c
+++ b/gcc/expmed.c
@@ -2572,7 +2572,6 @@ synth_mult (struct algorithm *alg_out, unsigned HOST_WIDE_INT t,
   entry_ptr = alg_hash_entry_ptr (hash_index);
   if (entry_ptr->t == t
   && entry_ptr->mode == mode
-  && entry_ptr->mode == mode
   && entry_ptr->speed == speed
   && entry_ptr->alg != alg_unknown)
 {


Re: [PATCH][AArch64] Improve legitimize_address

2016-09-07 Thread Richard Earnshaw (lists)
On 06/09/16 14:14, Wilco Dijkstra wrote:
> Improve aarch64_legitimize_address - avoid splitting the offset if it is
> supported.  When we do split, take the mode size into account.  BLKmode
> falls into the unaligned case but should be treated like LDP/STP.
> This improves codesize slightly due to fewer base address calculations:
> 
> int f(int *p) { return p[5000] + p[7000]; }
> 
> Now generates:
> 
> f:
>   add x0, x0, 16384
>   ldr w1, [x0, 3616]
>   ldr w0, [x0, 11616]
>   add w0, w1, w0
>   ret
> 
> instead of:
> 
> f:
>   add x1, x0, 16384
>   add x0, x0, 24576
>   ldr w1, [x1, 3616]
>   ldr w0, [x0, 3424]
>   add w0, w1, w0
>   ret
> 
> OK for trunk?
> 
> ChangeLog:
> 2016-09-06  Wilco Dijkstra  
> 
> gcc/
>   * config/aarch64/aarch64.c (aarch64_legitimize_address):
>   Avoid use of base_offset if offset already in range.

OK.

R.

> --
> 
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 
> 27bbdbad8cddc576f9ed4fd0670116bd6d318412..119ff0aecb0c9f88899fa141b2c7f9158281f9c3
>  100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -5058,9 +5058,19 @@ aarch64_legitimize_address (rtx x, rtx /* orig_x  */, 
> machine_mode mode)
>/* For offsets aren't a multiple of the access size, the limit is
>-256...255.  */
>else if (offset & (GET_MODE_SIZE (mode) - 1))
> - base_offset = (offset + 0x100) & ~0x1ff;
> + {
> +   base_offset = (offset + 0x100) & ~0x1ff;
> +
> +   /* BLKmode typically uses LDP of X-registers.  */
> +   if (mode == BLKmode)
> + base_offset = (offset + 512) & ~0x3ff;
> + }
> +  /* Small negative offsets are supported.  */
> +  else if (IN_RANGE (offset, -256, 0))
> + base_offset = 0;
> +  /* Use 12-bit offset by access size.  */
>else
> - base_offset = offset & ~0xfff;
> + base_offset = offset & (~0xfff * GET_MODE_SIZE (mode));
>  
>if (base_offset != 0)
>   {
> 
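The effect of the last hunk on the p[5000] + p[7000] example can be checked with a little arithmetic. The sketch below is only an illustration of the aligned-offset branch, not the GCC code itself: the function names are made up here, and the access size is passed in directly instead of coming from GET_MODE_SIZE.

```c
#include <assert.h>

/* Old split for aligned offsets: keep a 12-bit byte offset and move the
   rest into the base.  (Hypothetical helper name.)  */
long long
base_offset_old (long long offset)
{
  return offset & ~0xfffLL;
}

/* New split for aligned offsets: the 12-bit offset field is scaled by the
   access size, so the mask is widened by that size.  MODE_SIZE stands in
   for GET_MODE_SIZE (mode) and is assumed to be a power of two.  */
long long
base_offset_new (long long offset, long long mode_size)
{
  assert (mode_size > 0 && (mode_size & (mode_size - 1)) == 0);
  return offset & (~0xfffLL * mode_size);
}
```

With 4-byte ints the two accesses sit at byte offsets 20000 and 28000. The old mask yields two different bases (16384 and 24576), while the new one yields 16384 for both, leaving offsets 3616 and 11616 as in the assembly quoted above.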



[Bug libfortran/77393] [7 Regression] Revision r237735 changed the behavior of F0.0

2016-09-07 Thread ro at CeBiTec dot Uni-Bielefeld.DE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77393

--- Comment #14 from ro at CeBiTec dot Uni-Bielefeld.DE  ---
> --- Comment #13 from Jerry DeLisle  ---
> Author: jvdelisle
> Date: Tue Sep  6 23:22:26 2016
> New Revision: 240018
>
> URL: https://gcc.gnu.org/viewcvs?rev=240018&root=gcc&view=rev
> Log:
> 2016-09-06  Jerry DeLisle  
>
> PR libgfortran/77393
> * io/write_float.def (build_float_string): Recognize when the
> result will not fit in the user provided, star fill, and exit
> early.
>
> * gfortran.dg/fmt_f0_2.f90: Update test.
> * gfortran.dg/fmt_f0_3.f90: New test.

I've only now managed to test the patch: it passes on both
i386-pc-solaris2.12 and sparc-sun-solaris2.12.

Thanks.
Rainer

Re: [PATCH GCC 8/9]Adjust test case for CFG changes in vectorized loop

2016-09-07 Thread Jeff Law

On 09/06/2016 12:53 PM, Bin Cheng wrote:

Hi,
After CFG changes in vectorizer, the epilog loop now can be completely peeled, 
resulting in changes in the number of instructions that these tests check.  
This patch adjusts related checking strings.

Thanks,
bin


gcc/testsuite/ChangeLog
2016-09-01  Bin Cheng  

* gcc.target/i386/l_fma_float_1.c: Revise test.
* gcc.target/i386/l_fma_float_2.c: Ditto.
* gcc.target/i386/l_fma_float_3.c: Ditto.
* gcc.target/i386/l_fma_float_4.c: Ditto.
* gcc.target/i386/l_fma_float_5.c: Ditto.
* gcc.target/i386/l_fma_float_6.c: Ditto.
* gcc.target/i386/l_fma_double_1.c: Ditto.
* gcc.target/i386/l_fma_double_2.c: Ditto.
* gcc.target/i386/l_fma_double_3.c: Ditto.
* gcc.target/i386/l_fma_double_4.c: Ditto.
* gcc.target/i386/l_fma_double_5.c: Ditto.
* gcc.target/i386/l_fma_double_6.c: Ditto.

OK when prerequisites are approved.
jeff



Re: [PATCH][v3] GIMPLE store merging pass

2016-09-07 Thread Jeff Law

On 09/07/2016 02:19 AM, Richard Biener wrote:

On Tue, 6 Sep 2016, Jakub Jelinek wrote:


On Tue, Sep 06, 2016 at 04:59:23PM +0100, Kyrill Tkachov wrote:

On 06/09/16 16:32, Jakub Jelinek wrote:

On Tue, Sep 06, 2016 at 04:14:47PM +0100, Kyrill Tkachov wrote:

The v3 of this patch addresses feedback I received on the version posted at [1].
The merged store buffer is now represented as a char array that we splat values 
onto with
native_encode_expr and native_interpret_expr. This allows us to merge anything 
that native_encode_expr
accepts, including floating point values and short vectors. So this version 
extends the functionality
of the previous one in that it handles floating point values as well.
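
The byte-buffer idea described above can be sketched in plain C. This is only an analogy under stated assumptions: memcpy stands in for native_encode_expr and native_interpret_expr, and the helper names are hypothetical, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Splat two adjacent 32-bit constants into a byte buffer in host byte
   order, then read the buffer back as one 64-bit value.  */
uint64_t
merge_two_u32 (uint32_t first, uint32_t second)
{
  unsigned char buf[8];
  uint64_t merged;

  memcpy (buf, &first, sizeof first);        /* bytes for offset 0 */
  memcpy (buf + 4, &second, sizeof second);  /* bytes for offset 4 */
  memcpy (&merged, buf, sizeof merged);
  return merged;
}

/* The original form: two narrow stores to adjacent locations.  */
void
store_split (unsigned char *dst, uint32_t first, uint32_t second)
{
  memcpy (dst, &first, sizeof first);
  memcpy (dst + 4, &second, sizeof second);
}

/* The merged form: one wide store of the combined constant.  */
void
store_merged (unsigned char *dst, uint32_t first, uint32_t second)
{
  uint64_t v = merge_two_u32 (first, second);
  memcpy (dst, &v, sizeof v);
}
```

Because the combined constant is built from the same bytes, both forms leave identical memory contents on any host. On a little-endian host, merging 0x12345678 and 0x09abcdef gives 0x09abcdef12345678, the immediate used in the movabsq sequence quoted later in this thread.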

The first phase of the algorithm that detects the contiguous stores is also 
slightly refactored according
to feedback to read more fluently.

Richi, I experimented with merging up to MOVE_MAX bytes rather than word size 
but I got worse results on aarch64.
MOVE_MAX there is 16 (because it has load/store register pair instructions) but 
the 128-bit immediates that we ended
synthesising were too complex. Perhaps the TImode immediate store RTL 
expansions could be improved, but for now
I've left the maximum merge size to be BITS_PER_WORD.

At least from playing with this kind of things in the RTL PR22141 patch,
I remember storing 64-bit constants on x86_64 compared to storing 2 32-bit
constants usually isn't a win (not just for speed optimized blocks but also for
-Os).  For 64-bit store if the constant isn't signed 32-bit or unsigned
32-bit you need movabsq into some temporary register which has like 3 times 
worse
latency than normal store if I remember well, and then store it.


We could restrict the maximum width of the stores generated to 32 bits on 
x86_64.
I think this would need another parameter or target macro for the target to set.
Alternatively, is it a possibility for x86 to be a bit smarter in its DImode 
mov-immediate
expansion? For example break up the 64-bit movabsq immediate into two SImode 
immediates?


If you want a 64-bit store, you'd need to merge the two, and that would be
even more expensive.  It is a matter of say:
movl $0x12345678, (%rsp)
movl $0x09abcdef, 4(%rsp)
vs.
movabsq $0x09abcdef12345678, %rax
movq %rax, (%rsp)
vs.
movl $0x09abcdef, %eax
salq $32, %rax
orq $0x12345678, %rax
movq %rax, (%rsp)


vs.

movq $LC0, (%rsp)

?


etc.  Guess it needs to be benchmarked on contemporary CPUs, I'm pretty sure
the last sequence is the worst one.


I think the important part to notice is that it should be straightforward
for a target / the expander to split a large store from an immediate
into any of the above but very hard to do the opposite.  Thus from a
GIMPLE side "canonicalizing" to large stores (that are eventually
supported and well-aligned) seems best to me.

Agreed.





I'm aware of that. The patch already has logic to avoid emitting unaligned 
accesses
for SLOW_UNALIGNED_ACCESS targets. Beyond that the patch introduces the 
parameter
PARAM_STORE_MERGING_ALLOW_UNALIGNED that can be used by the user or target to
forbid generation of unaligned stores by the pass altogether. Beyond that I'm 
not sure
how to behave more intelligently here. Any ideas?


Dunno, the heuristics was the main problem with my patch.  Generally, I'd
say there is a difference between cold and hot blocks, in cold ones perhaps
unaligned stores are more appropriate (if supported at all and not way too
slow), while in hot ones less desirable.


Note that I repeatedly argue that if we can canonicalize sth to "larger"
then even if unaligned, the expander should be able to produce optimal
code again (it might not do, of course).
And agreed.  Furthermore, it's in line with our guiding principles WRT 
separation of the tree/SSA optimizers from target dependencies.


So let's push those decisions into the expanders/backend/target and 
canonicalize to the larger stores.


jeff




[Bug c++/77513] -Wzero-as-null-pointer-constant vs 0, nullptr, NULL and __null

2016-09-07 Thread petschy at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77513

--- Comment #2 from petschy at gmail dot com ---
I don't want to enable them. The problem is not with too little but too many
warnings.

A snippet from one of the problematic files:

{ NULL, NULL, false, false }

is preprocessed to 

 { 
# 62 "AdsPlugin.cpp" 3 4
  __null
# 62 "AdsPlugin.cpp"
  , 
# 62 "AdsPlugin.cpp" 3 4
__null
# 62 "AdsPlugin.cpp"
, false, false }
};

Here I see the same flags, yet for these two NULLs gcc warns.

Re: [PATCH GCC 7/9]Skip loops iterating only 1 time in predictive commoning

2016-09-07 Thread Jeff Law

On 09/06/2016 12:53 PM, Bin Cheng wrote:

Hi,
For loops which are bounded to iterate only 1 time (thus loop's latch doesn't 
roll), there is nothing to predictive common, this patch detects/skips these 
cases.  A test is also added in 
gcc/testsuite/gfortran.dg/vect/fast-math-mgrid-resid.f for this.

Thanks,
bin

2016-09-01  Bin Cheng  

* tree-predcom.c (tree_predictive_commoning_loop): Skip loop that only
iterates 1 time.

gcc/testsuite/ChangeLog
2016-09-01  Bin Cheng  

* gfortran.dg/vect/fast-math-mgrid-resid.f: New test string.


OK.
jeff


Re: [PATCH GCC 3/9]Support rewriting non-lcssa phis for vars live outside of vect-loop

2016-09-07 Thread Jeff Law

On 09/06/2016 12:51 PM, Bin Cheng wrote:

Hi,
The current implementation requires that variables live outside of vect-loop 
satisfy LCSSA form; this patch relaxes the restriction.  It keeps the old 
behavior for LCSSA PHI node by replacing use of live var with result of that 
PHI; for other uses of live var, it simply replaces all uses outside loop with 
the newly computed var.

Thanks,
bin

2016-09-01  Bin Cheng  

* tree-vect-loop.c (vectorizable_live_operation): Support handling
for live variable outside loop but not in lcssa form.


OK.
jeff


Re: [PATCH GCC 5/9]Put copied loop after its preheader and after the original loop's latch in basic block link list

2016-09-07 Thread Jeff Law

On 09/06/2016 12:52 PM, Bin Cheng wrote:

Hi,
This simple patch changes slpeel_tree_duplicate_loop_edge_cfg by putting copied 
loop after its new preheader and after the original loop's latch in basic 
block's linked list.  It doesn't change CFG at all, but makes the dump cfg a 
little bit easier to read.  I assume this is good for basic block reordering 
too?

Thanks,
bin

2016-09-01  Bin Cheng  

* tree-vect-loop-manip.c (slpeel_tree_duplicate_loop_to_edge_cfg): Put
duplicated loop after its preheader and after the original loop.

In theory bb reordering ought to clean this up.  But better to generate 
good code from the start when it's easy to do so.


OK.

jeff


Re: [PATCH GCC 4/9]Check niters for peeling for data access gaps in analyzer

2016-09-07 Thread Jeff Law

On 09/06/2016 12:51 PM, Bin Cheng wrote:

Hi,
This patch checks if loop has enough niters for peeling for data access gaps in 
vect_analyze_loop_2, while now this check is in vect_transform_loop stage.  The 
problem is vectorizer may vectorize loops without enough iterations and 
generate false guard on the vectorized loop.  Though the loop is successfully 
vectorized, it will never be executed, and most likely, it will be removed 
during cfg-cleanup.  Examples can be found in revised tests of this patch.

Thanks,
bin

2016-09-01  Bin Cheng  

* tree-vect-loop.c (vect_analyze_loop_2): Check and skip loop if it
has no enough iterations for LOOP_VINFO_PEELING_FOR_GAPS.

gcc/testsuite/ChangeLog
2016-09-01  Bin Cheng  

* gcc.dg/vect/vect-98.c: Refine test case.
* gcc.dg/vect/vect-strided-a-u8-i8-gap2.c: Ditto.
* gcc.dg/vect/vect-strided-u8-i8-gap2.c: Ditto.
* gcc.dg/vect/vect-strided-u8-i8-gap4.c: Ditto.


OK.
jeff


Re: Advice sought for debugging a lto1 ICE (was: Implement C _FloatN, _FloatNx types [version 6])

2016-09-07 Thread Richard Biener
On Wed, Sep 7, 2016 at 1:52 PM, Thomas Schwinge  wrote:
> Hi!
>
> I trimmed the CC list -- I'm looking for advice about debugging a lto1
> ICE.
>
> On Fri, 19 Aug 2016 11:05:59 +, Joseph Myers  
> wrote:
>> On Fri, 19 Aug 2016, Richard Biener wrote:
>> > Can you quickly verify if LTO works with the new types?  I don't see 
>> > anything
>> > that would prevent it but having new global trees and backends 
>> > initializing them
>> > might come up with surprises (see tree-streamer.c:preload_common_nodes)
>>
>> Well, the execution tests are in gcc.dg/torture, which is run with various
>> options including -flto (and I've checked the testsuite logs to confirm
>> these tests are indeed run with such options).  Is there something else
>> you think should be tested?
>
> As I noted in :
>
> As of the PR32187 commit r239625 "Implement C _FloatN, _FloatNx types", 
> nvptx
> offloading is broken, ICEs in LTO stream-in.  Probably some kind of 
> data-type
> mismatch that is not visible with Intel MIC offloading (using the same 
> data
> types) but explodes with nvptx.  I'm having a look.
>
> I know how to use "-save-temps -v" to re-run the ICEing lto1 in GDB; a
> backtrace of the ICE looks as follows:
>
> #0  fancy_abort (file=file@entry=0x10d61d0 "[...]/source-gcc/gcc/vec.h", 
> line=line@entry=727, function=function@entry=0x10d6e3a 
> <_ZZN3vecIP9tree_node7va_heap8vl_embedEixEjE12__FUNCTION__> "operator[]") at 
> [...]/source-gcc/gcc/diagnostic.c:1414
> #1  0x0058c9ef in vec::operator[] 
> (this=0x16919c0, ix=ix@entry=185) at [...]/source-gcc/gcc/vec.h:727
> #2  0x0058ca33 in vec::operator[] 
> (this=this@entry=0x1691998, ix=ix@entry=185) at 
> [...]/source-gcc/gcc/vec.h:1211

so it wants tree 185 which is (given the low number) likely one streamed by
preload_common_nodes.  This is carefully crafted to _not_ diverge by
frontend (!) it wasn't even designed to cope with global trees being present
or not dependent on target (well, because the target is always the
same! mind you!)

Now -- in theory it should deal with NULLs just fine (resulting in
error_mark_node), but it can diverge when there are additional
compound types (like vectors, complex
or array or record types) whose element types are not in the set of
global trees.
The complex _FloatN types would be such a case given they appear before their
components.  That mixes up the ordering at least.

So I suggest to add a print_tree to where it does the streamer_tree_cache_append
and compare cc1 and lto1 outcome.

The ICE above means the lto1 has fewer preloaded nodes I guess.
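
Once both compilers dump their preloaded nodes, the comparison could be as simple as the sketch below. It assumes each dump has been reduced to one name string per cache slot; the function name and that representation are hypothetical, not part of tree-streamer.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the first cache index at which the writer's and reader's preload
   lists disagree.  If one list is a strict prefix of the other, return the
   shorter length; if they are identical, return (size_t) -1.  */
size_t
first_divergence (const char *const *writer, size_t nwriter,
                  const char *const *reader, size_t nreader)
{
  size_t n = nwriter < nreader ? nwriter : nreader;

  for (size_t i = 0; i < n; i++)
    if (strcmp (writer[i], reader[i]) != 0)
      return i;

  return nwriter == nreader ? (size_t) -1 : n;
}
```

Every index after the first divergence refers to a different tree on the two sides, and an index beyond the end of the shorter list would trip the bounds check in vec::operator[].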

Richard.

> #3  0x00c73e54 in streamer_tree_cache_get_tree (cache=0x1691990, 
> ix=ix@entry=185) at [...]/source-gcc/gcc/tree-streamer.h:98
> #4  0x00c73eb9 in streamer_get_pickled_tree 
> (ib=ib@entry=0x7fffceb0, data_in=data_in@entry=0x1691930) at 
> [...]/source-gcc/gcc/tree-streamer-in.c:1112
> #5  0x008f841b in lto_input_tree_1 (ib=ib@entry=0x7fffceb0, 
> data_in=data_in@entry=0x1691930, tag=tag@entry=LTO_tree_pickle_reference, 
> hash=hash@entry=0) at [...]/source-gcc/gcc/lto-streamer-in.c:1404
> #6  0x008f8844 in lto_input_tree (ib=0x7fffceb0, 
> data_in=0x1691930) at [...]/source-gcc/gcc/lto-streamer-in.c:1444
> #7  0x00c720d2 in lto_input_ts_list_tree_pointers 
> (ib=ib@entry=0x7fffceb0, data_in=data_in@entry=0x1691930, 
> expr=expr@entry=0x76993780) at [...]/source-gcc/gcc/tree-streamer-in.c:861
> #8  0x00c7444e in streamer_read_tree_body 
> (ib=ib@entry=0x7fffceb0, data_in=data_in@entry=0x1691930, 
> expr=expr@entry=0x76993780) at 
> [...]/source-gcc/gcc/tree-streamer-in.c:1077
> #9  0x008f6428 in lto_read_tree_1 (ib=ib@entry=0x7fffceb0, 
> data_in=data_in@entry=0x1691930, expr=expr@entry=0x76993780) at 
> [...]/source-gcc/gcc/lto-streamer-in.c:1285
> #10 0x008f651b in lto_read_tree (ib=ib@entry=0x7fffceb0, 
> data_in=data_in@entry=0x1691930, tag=tag@entry=4, hash=hash@entry=4086308758) 
> at [...]/source-gcc/gcc/lto-streamer-in.c:1315
> #11 0x008f85db in lto_input_tree_1 (ib=ib@entry=0x7fffceb0, 
> data_in=data_in@entry=0x1691930, tag=tag@entry=4, hash=hash@entry=4086308758) 
> at [...]/source-gcc/gcc/lto-streamer-in.c:1427
> #12 0x008f8673 in lto_input_scc (ib=ib@entry=0x7fffceb0, 
> data_in=data_in@entry=0x1691930, len=len@entry=0x7fffceac, 
> entry_len=entry_len@entry=0x7fffcea8) at 
> [...]/source-gcc/gcc/lto-streamer-in.c:1339
> #13 0x005890f7 in lto_read_decls 
> (decl_data=decl_data@entry=0x77fc, data=data@entry=0x169d570, 
> resolutions=...) at [...]/source-gcc/gcc/lto/lto.c:1693
> #14 0x005898c8 in lto_file_finalize 
> (file_data=file_data@entry=0x77fc, 

[Bug c++/77513] -Wzero-as-null-pointer-constant vs 0, nullptr, NULL and __null

2016-09-07 Thread sch...@linux-m68k.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77513

--- Comment #1 from Andreas Schwab  ---
The "3" flag on the line marker marks the following lines as originating from a
system header where warnings are suppressed.  Use -Wsystem-headers to enable
them.

Re: [PATCH GCC 2/9]Add interface reseting original copy tables in cfg.c

2016-09-07 Thread Jeff Law

On 09/06/2016 12:50 PM, Bin Cheng wrote:

Hi,
This simple patch adds an interface for resetting the original copy tables in cfg.c.  This 
will be used in rewriting vect_do_peeling_* functions in vectorizer so that we 
don't need to release/allocate tables between prolog and epilog peeling.

Thanks,
bin

2016-09-01  Bin Cheng  

* cfg.c (reset_original_copy_tables): New func.
* cfg.h (reset_original_copy_tables): New decl.

Needs a function comment for reset_original_copy_tables.  Should be fine 
with that change.


Jeff


Re: [PATCH GCC 1/9]Delete useless code in tree-vect-loop-manip.c

2016-09-07 Thread Jeff Law

On 09/06/2016 12:49 PM, Bin Cheng wrote:

Hi,
This is a patch set generating new control flow graph for vectorized loop and 
its peeling loops.  At the moment, CFG for vectorized loop is complicated and 
sub-optimal.  Major issues are like:
A) For both prologue and vectorized loop, it generates guard/branch before 
loops checking if the following (prologue/vectorized) loop should be skipped.  
It also generates guard/branch after loops checking if the next loop 
(vectorized/epilogue) loop should be skipped.
B) Depending on how conditional set is supported by targets, it may generate 
one additional if-statement (branch) setting the niters for prologue loop.
C) In the worst cases, up to 4 branch instructions need to be executed before 
vectorized loop is entered.
D) For loops without enough niters, it checks some (niters_prologue) 
iterations with prologue loop; then checks if the rest number of iterations (niters 
- niters_prologue) is enough for vectorization; if not, it skips vectorized loop 
and continues with epilogue loop.  This is bad since vectorized loop won't be 
executed at all after all the hassle.

This patch set improves it by merging different checks thus only 2 branch 
instructions (could be further reduced in combination with loop versioning) are 
executed before vectorized loop; it does better in compile time analysis in 
order to avoid prologue/epilogue peeling if possible; it improves code 
generation in various ways (live overflow handling, generating short live 
ranges).  In terms of implementation, it tries to factor SSA updating code out 
of CFG changing code, I think this may help future work replacing slpeel_* with 
generic GIMPLE loop copier.
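
As a rough source-level picture of issue D and the merged check, consider a hand-peeled reduction. This is a sketch only: the vectorizer works on GIMPLE and its real CFG differs, VF and the function names are assumptions, and the "vector" body is just an unrolled scalar loop.

```c
#include <assert.h>
#include <stddef.h>

#define VF 4  /* assumed vectorization factor */

/* Plain scalar loop, used as the reference.  */
long
sum_scalar (const int *a, size_t n)
{
  long s = 0;
  for (size_t i = 0; i < n; i++)
    s += a[i];
  return s;
}

/* Peeled form with one up-front guard.  If fewer than VF iterations would
   remain after the prologue, skip both the prologue and the main loop and
   let the scalar epilogue do all the work.  */
long
sum_peeled (const int *a, size_t n, size_t niters_prologue)
{
  long s = 0;
  size_t i = 0;

  if (n >= niters_prologue + VF)
    {
      /* Prologue: scalar iterations, e.g. to reach an aligned address.  */
      for (; i < niters_prologue; i++)
        s += a[i];

      /* Main loop: stands in for the vectorized body, VF elements a step.  */
      for (; i + VF <= n; i += VF)
        s += (long) a[i] + a[i + 1] + a[i + 2] + a[i + 3];
    }

  /* Epilogue: whatever is left.  */
  for (; i < n; i++)
    s += a[i];

  return s;
}
```

With the guard placed before the prologue, a short loop never runs prologue iterations only to discover that the main loop has to be skipped.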

So far there are 9 patches in the set, patch [1-5] are small prerequisites for 
major change which is done by patch 6.  Patch [7-9] are small patches either 
address test case or improve code generation.  Final bootstrap and test of 
patch set ongoing on x86_64 and AArch64.  Assume no new failure or will be 
fixed, any comments on this?

This is the first patch deleting useless code in tree-vect-loop-manip.c, as 
well as fixing obvious code style issue.

Thanks,
bin

2016-09-01  Bin Cheng  

* tree-vect-loop-manip.c (slpeel_can_duplicate_loop_p): Fix code
style issue.
(vect_do_peeling_for_loop_bound, vect_do_peeling_for_alignment):
Remove useless code.
Seems obvious to me -- I can't think of any reason why we'd emit a NULL 
sequence to the loop preheader edge.


jeff



Advice sought for debugging a lto1 ICE (was: Implement C _FloatN, _FloatNx types [version 6])

2016-09-07 Thread Thomas Schwinge
Hi!

I trimmed the CC list -- I'm looking for advice about debugging a lto1
ICE.

On Fri, 19 Aug 2016 11:05:59 +, Joseph Myers  
wrote:
> On Fri, 19 Aug 2016, Richard Biener wrote:
> > Can you quickly verify if LTO works with the new types?  I don't see 
> > anything
> > that would prevent it but having new global trees and backends initializing 
> > them
> > might come up with surprises (see tree-streamer.c:preload_common_nodes)
> 
> Well, the execution tests are in gcc.dg/torture, which is run with various 
> options including -flto (and I've checked the testsuite logs to confirm 
> these tests are indeed run with such options).  Is there something else 
> you think should be tested?

As I noted in :

As of the PR32187 commit r239625 "Implement C _FloatN, _FloatNx types", 
nvptx
offloading is broken, ICEs in LTO stream-in.  Probably some kind of 
data-type
mismatch that is not visible with Intel MIC offloading (using the same data
types) but explodes with nvptx.  I'm having a look.

I know how to use "-save-temps -v" to re-run the ICEing lto1 in GDB; a
backtrace of the ICE looks as follows:

#0  fancy_abort (file=file@entry=0x10d61d0 "[...]/source-gcc/gcc/vec.h", 
line=line@entry=727, function=function@entry=0x10d6e3a 
<_ZZN3vecIP9tree_node7va_heap8vl_embedEixEjE12__FUNCTION__> "operator[]") at 
[...]/source-gcc/gcc/diagnostic.c:1414
#1  0x0058c9ef in vec::operator[] 
(this=0x16919c0, ix=ix@entry=185) at [...]/source-gcc/gcc/vec.h:727
#2  0x0058ca33 in vec::operator[] 
(this=this@entry=0x1691998, ix=ix@entry=185) at [...]/source-gcc/gcc/vec.h:1211
#3  0x00c73e54 in streamer_tree_cache_get_tree (cache=0x1691990, 
ix=ix@entry=185) at [...]/source-gcc/gcc/tree-streamer.h:98
#4  0x00c73eb9 in streamer_get_pickled_tree 
(ib=ib@entry=0x7fffceb0, data_in=data_in@entry=0x1691930) at 
[...]/source-gcc/gcc/tree-streamer-in.c:1112
#5  0x008f841b in lto_input_tree_1 (ib=ib@entry=0x7fffceb0, 
data_in=data_in@entry=0x1691930, tag=tag@entry=LTO_tree_pickle_reference, 
hash=hash@entry=0) at [...]/source-gcc/gcc/lto-streamer-in.c:1404
#6  0x008f8844 in lto_input_tree (ib=0x7fffceb0, 
data_in=0x1691930) at [...]/source-gcc/gcc/lto-streamer-in.c:1444
#7  0x00c720d2 in lto_input_ts_list_tree_pointers 
(ib=ib@entry=0x7fffceb0, data_in=data_in@entry=0x1691930, 
expr=expr@entry=0x76993780) at [...]/source-gcc/gcc/tree-streamer-in.c:861
#8  0x00c7444e in streamer_read_tree_body 
(ib=ib@entry=0x7fffceb0, data_in=data_in@entry=0x1691930, 
expr=expr@entry=0x76993780) at [...]/source-gcc/gcc/tree-streamer-in.c:1077
#9  0x008f6428 in lto_read_tree_1 (ib=ib@entry=0x7fffceb0, 
data_in=data_in@entry=0x1691930, expr=expr@entry=0x76993780) at 
[...]/source-gcc/gcc/lto-streamer-in.c:1285
#10 0x008f651b in lto_read_tree (ib=ib@entry=0x7fffceb0, 
data_in=data_in@entry=0x1691930, tag=tag@entry=4, hash=hash@entry=4086308758) 
at [...]/source-gcc/gcc/lto-streamer-in.c:1315
#11 0x008f85db in lto_input_tree_1 (ib=ib@entry=0x7fffceb0, 
data_in=data_in@entry=0x1691930, tag=tag@entry=4, hash=hash@entry=4086308758) 
at [...]/source-gcc/gcc/lto-streamer-in.c:1427
#12 0x008f8673 in lto_input_scc (ib=ib@entry=0x7fffceb0, 
data_in=data_in@entry=0x1691930, len=len@entry=0x7fffceac, 
entry_len=entry_len@entry=0x7fffcea8) at 
[...]/source-gcc/gcc/lto-streamer-in.c:1339
#13 0x005890f7 in lto_read_decls 
(decl_data=decl_data@entry=0x77fc, data=data@entry=0x169d570, 
resolutions=...) at [...]/source-gcc/gcc/lto/lto.c:1693
#14 0x005898c8 in lto_file_finalize 
(file_data=file_data@entry=0x77fc, file=file@entry=0x15eedb0) at 
[...]/source-gcc/gcc/lto/lto.c:2037
#15 0x00589928 in lto_create_files_from_ids 
(file=file@entry=0x15eedb0, file_data=file_data@entry=0x77fc, 
count=count@entry=0x7fffd054) at [...]/source-gcc/gcc/lto/lto.c:2047
#16 0x00589a7a in lto_file_read (file=0x15eedb0, 
resolution_file=resolution_file@entry=0x0, count=count@entry=0x7fffd054) at 
[...]/source-gcc/gcc/lto/lto.c:2088
#17 0x00589e84 in read_cgraph_and_symbols (nfiles=1, 
fnames=0x160e990) at [...]/source-gcc/gcc/lto/lto.c:2798
#18 0x0058a572 in lto_main () at [...]/source-gcc/gcc/lto/lto.c:3299
#19 0x00a48eff in compile_file () at 
[...]/source-gcc/gcc/toplev.c:466
#20 0x00550943 in do_compile () at 
[...]/source-gcc/gcc/toplev.c:2010
#21 toplev::main (this=this@entry=0x7fffd180, argc=argc@entry=20, 
argv=0x15daf20, argv@entry=0x7fffd288) at [...]/source-gcc/gcc/toplev.c:2144
#22 0x00552717 in main (argc=20, argv=0x7fffd288) at 
[...]/source-gcc/gcc/main.c:39

(Comparing to yesterday's r240004, the 
