Re: [PATCH] i386: Add AVX512 unaligned intrinsics

2019-07-09 Thread Uros Bizjak
On Tue, Jul 9, 2019 at 11:44 PM Sunil Pandey  wrote:
>
> __m512i _mm512_loadu_epi32( void * sa);
> __m512i _mm512_loadu_epi64( void * sa);
> void _mm512_storeu_epi32(void * d, __m512i a);
> void _mm256_storeu_epi32(void * d, __m256i a);
> void _mm_storeu_epi32(void * d, __m128i a);
> void _mm512_storeu_epi64(void * d, __m512i a);
> void _mm256_storeu_epi64(void * d, __m256i a);
> void _mm_storeu_epi64(void * d, __m128i a);
>
> Tested on x86-64.
>
> OK for trunk?
>
> --Sunil Pandey
>
>
> gcc/
>
> PR target/90980
> * config/i386/avx512fintrin.h (__v16si_u): New data type
> (__v8di_u): Likewise
> (_mm512_loadu_epi32): New.
> (_mm512_loadu_epi64): Likewise.
> (_mm512_storeu_epi32): Likewise.
> (_mm512_storeu_epi64): Likewise.
> * config/i386/avx512vlintrin.h (_mm_storeu_epi32): New.
> (_mm256_storeu_epi32): Likewise.
> (_mm_storeu_epi64): Likewise.
> (_mm256_storeu_epi64): Likewise.
>
> gcc/testsuite/
>
> PR target/90980
> * gcc.target/i386/avx512f-vmovdqu32-3.c: New test.
> * gcc.target/i386/avx512f-vmovdqu64-3.c: Likewise.
> * gcc.target/i386/pr90980-1.c: Likewise.
> * gcc.target/i386/pr90980-2.c: Likewise.

+/* Internal data types for implementing unaligned version of intrinsics.  */
+typedef int __v16si_u __attribute__ ((__vector_size__ (64),
+  __aligned__ (1)));
+typedef long long __v8di_u __attribute__ ((__vector_size__ (64),
+   __aligned__ (1)));

You should define only one generic __m512i_u type, something like:

typedef long long __m512i_u __attribute__ ((__vector_size__ (64),
__may_alias__, __aligned__ (1)));

Please see how __m256i_u is defined and used in avxintrin.h.

Uros.
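
To illustrate the suggestion above, here is a minimal sketch using GCC's generic vector extensions. The names toy_v8di_u, toy_copy_unaligned_512 and toy_roundtrip_ok are made up for this example and are not part of avx512fintrin.h. The sketch only shows that a single __may_alias__, __aligned__ (1) vector type can be used to access 64 bytes at any address, whatever the element width of the data.

```c
#include <string.h>

/* A 64-byte vector type that may alias any other type and carries no
   alignment requirement, in the style suggested for __m512i_u.
   The name is hypothetical.  */
typedef long long toy_v8di_u __attribute__ ((__vector_size__ (64),
					     __may_alias__, __aligned__ (1)));

/* Copy 64 bytes between possibly unaligned addresses through the
   unaligned type, so the compiler cannot assume 64-byte alignment.  */
void
toy_copy_unaligned_512 (void *dst, const void *src)
{
  *(toy_v8di_u *) dst = *(const toy_v8di_u *) src;
}

/* Round-trip check on deliberately misaligned byte offsets.
   Returns 1 when the 64 copied bytes match.  */
int
toy_roundtrip_ok (void)
{
  unsigned char in[64 + 8], out[64 + 8];
  for (unsigned i = 0; i < sizeof in; i++)
    in[i] = (unsigned char) (i * 7 + 1);
  memset (out, 0, sizeof out);
  toy_copy_unaligned_512 (out + 3, in + 1);
  return memcmp (out + 3, in + 1, 64) == 0;
}
```

Because the type is generic, the epi32 and epi64 variants of an unaligned load or store could both cast their pointer argument to the same type.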


[PATCH] Add hints for slim dumping if fallthrough bb of jump isn't next bb

2019-07-09 Thread Kewen.Lin
Hi all,

6: NOTE_INSN_BASIC_BLOCK 2

   12: r135:CC=cmp(r122:DI,0)
   13: pc={(r135:CC!=0)?L52:pc}
  REG_DEAD r135:CC
  REG_BR_PROB 1041558836
   31: L31:
   17: NOTE_INSN_BASIC_BLOCK 3

The above RTL sequence is from the doloop pass dump with -fdump-rtl-all-slim.  I
misread it as saying that the fallthrough BB of BB 2 is BB 3, since BB 3 is placed
right after BB 2.  Then I found a contradiction: if that were true, BB 3 would
have some uninitialized regs.

I can get the exact information with "-blocks" dumping, and even more detail
with "-details".  But I'm wondering whether it's worth giving some
information in the "-slim" dump (or more exactly, without "-blocks") to avoid
confusion, especially for newcomers like me.

This patch adds one line that hints at what the fallthrough BB is when it isn't
the block that immediately follows, for example:

6: NOTE_INSN_BASIC_BLOCK 2

   12: r135:CC=cmp(r122:DI,0)
   13: pc={(r135:CC!=0)?L52:pc}
  REG_DEAD r135:CC
  REG_BR_PROB 1041558836
;;  pc falls through to BB 10 
   31: L31:
   17: NOTE_INSN_BASIC_BLOCK 3
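
The condition behind the hint can be modelled outside GCC. The following is only a sketch under assumptions: struct toy_bb and toy_fallthru_hint are hypothetical stand-ins for basic_block and the new function, and the head uid given to BB 10 is invented since the dump does not show it.

```c
#include <stddef.h>

/* Toy stand-in for a basic block: its index, the uid of its first
   insn, and the block its fallthrough edge leads to.  All names are
   hypothetical, not GCC's.  */
struct toy_bb
{
  int index;
  int head_uid;
  const struct toy_bb *fallthru;
};

/* Return the index of the fallthrough block when that block does not
   start at NEXT_UID (the insn printed right after the jump), or -1
   when no hint is needed.  */
int
toy_fallthru_hint (const struct toy_bb *bb, int next_uid)
{
  const struct toy_bb *dest = bb->fallthru;
  if (dest != NULL && dest->head_uid != next_uid)
    return dest->index;
  return -1;
}

/* Layout modelled on the dump: BB 2 is followed by insn 31, the head of
   BB 3, but its fallthrough edge goes to BB 10.  The uid 40 for the
   head of BB 10 is made up.  */
static const struct toy_bb toy_bb10 = { 10, 40, NULL };
static const struct toy_bb toy_bb3 = { 3, 31, NULL };
static const struct toy_bb toy_bb2 = { 2, 6, &toy_bb10 };
/* Variant where the fallthrough block really is the next one.  */
static const struct toy_bb toy_bb2_adjacent = { 2, 6, &toy_bb3 };
```

With this model, toy_fallthru_hint (&toy_bb2, 31) reports BB 10, while the adjacent variant needs no hint.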

Bootstrapped and regression tested on powerpc64le-unknown-linux-gnu.

Is it a reasonable patch? If yes, is it ok for trunk?


Thanks,
Kewen

-

gcc/ChangeLog

2019-07-08  Kewen Lin  

* gcc/cfgrtl.c (print_rtl_with_bb): Check and call 
hint_if_pc_fall_through_not_next for jump insn with two successors.
(hint_if_pc_fall_through_not_next): New function.

diff --git a/gcc/cfgrtl.c b/gcc/cfgrtl.c
index a1ca5992c41..928b9b0f691 100644
--- a/gcc/cfgrtl.c
+++ b/gcc/cfgrtl.c
@@ -2164,7 +2164,26 @@ rtl_dump_bb (FILE *outf, basic_block bb, int indent, 
dump_flags_t flags)
 }

 }
-
+
+/* For dumping without specifying basic blocks option, when we see PC is one of
+   jump targets, it's easy to misunderstand the next basic block is the
+   fallthrough one, but it's not so true sometimes.  This function is to dump
+   hints for the case where basic block of next insn isn't the fall through
+   target.  */
+
+static void
+hint_if_pc_fall_through_not_next (FILE *outf, basic_block bb)
+{
+  gcc_assert (outf);
+  gcc_assert (EDGE_COUNT (bb->succs) >= 2);
+  const rtx_insn *einsn = BB_END (bb);
+  const rtx_insn *ninsn = NEXT_INSN (einsn);
+  edge e = FALLTHRU_EDGE (bb);
+  basic_block dest = e->dest;
+  if (BB_HEAD (dest) != ninsn)
+fprintf (outf, ";;  pc falls through to BB %d\n", dest->index);
+}
+
 /* Like dump_function_to_file, but for RTL.  Print out dataflow information
for the start of each basic block.  FLAGS are the TDF_* masks documented
in dumpfile.h.  */
@@ -2255,6 +2274,14 @@ print_rtl_with_bb (FILE *outf, const rtx_insn 
*rtx_first, dump_flags_t flags)
  putc ('\n', outf);
}
}
+ else if (GET_CODE (tmp_rtx) == JUMP_INSN
+  && GET_CODE (PATTERN (tmp_rtx)) == SET)
+   {
+ bb = BLOCK_FOR_INSN (tmp_rtx);
+ const_rtx src = SET_SRC (PATTERN (tmp_rtx));
+ if (bb != NULL && GET_CODE (src) == IF_THEN_ELSE)
+   hint_if_pc_fall_through_not_next (outf, bb);
+   }
}

   free (start);



[PING^1][PATCH v4 3/3] PR80791 Consider doloop cmp use in ivopts

2019-07-09 Thread Kewen.Lin
Hi all,

I'd like to gently ping the patch below:
https://gcc.gnu.org/ml/gcc-patches/2019-06/msg01225.html

The previous version for more context/background:
https://gcc.gnu.org/ml/gcc-patches/2019-06/msg01126.html

Thanks a lot in advance!


on 2019/6/20 下午8:16, Kewen.Lin wrote:
> Hi,
> 
> Sorry, the previous patch is incomplete.
> New one attached.  Sorry for the inconvenience.
> 
> on 2019/6/20 下午8:08, Kewen.Lin wrote:
>> Hi Segher,
>>
>>> On Wed, Jun 19, 2019 at 07:47:34PM +0800, Kewen.Lin wrote:
>>>> +/* Return true if count register for branch is supported.  */
>>>> +
>>>> +static bool
>>>> +rs6000_have_count_reg_decr_p ()
>>>> +{
>>>> +  return flag_branch_on_count_reg;
>>>> +}
>>>
>>> rs6000 unconditionally supports these instructions, not just when that
>>> flag is set.  If you need to look at the flag, the *caller* of this new
>>> hook should, not every implementation of the hook.  So just "return true"
>>> here?
>>
>> Good point!  Updated it as hookpod.
>>
>>>> +/* For doloop use, if the algothrim selects some candidate which invalid for
>>>
>>> "algorithm", "which is invalid".
>>
>>>> +   some cost like zero rather than original inifite cost.  The point is to
>>>
>>> "infinite"
>>>
>>
>> Thanks for catching!  I should run spelling check next time.  :)
>>
>> New version attached with comments addressed.
>>
>>
>> Thanks,
>> Kewen
>>



[og9] OpenACC assumed-size arrays with non-lexical data mappings

2019-07-09 Thread Julian Brown
Hi,

This patch provides support for implicit mapping of assumed-sized
arrays for OpenACC, in cases where those arrays have previously been
mapped using non-lexical data mappings (e.g. "#pragma acc enter data").

Previously posted here:

https://gcc.gnu.org/ml/gcc-patches/2016-08/msg02090.html

and then revised:

https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00069.html

It's not clear if this is required behaviour for OpenACC, but at least
one test program we are using relies on the semantics introduced by this
patch.

Tested with offloading to nvptx. I will apply to the
openacc-gcc-9-branch shortly.

Julian

ChangeLog

2019-07-10  Cesar Philippidis  
Thomas Schwinge  
Julian Brown  

gcc/
* gimplify.c (gimplify_adjust_omp_clauses_1): Raise error for
assumed-size arrays in map clauses for Fortran/OpenMP.
* omp-low.c (lower_omp_target): Set the size of assumed-size
Fortran arrays to one to allow use of data already mapped on
the offload device.

gcc/fortran/
* trans-openmp.c (gfc_omp_finish_clause): Change clauses mapping
assumed-size arrays to use the GOMP_MAP_FORCE_PRESENT map type.

From 2c5a7e445ebadc920730c732279732d2f9b40598 Mon Sep 17 00:00:00 2001
From: Julian Brown 
Date: Thu, 4 Jul 2019 18:14:41 -0700
Subject: [PATCH 2/3] Assumed-size arrays with non-lexical data mappings

	gcc/fortran/
	* trans-openmp.c (gfc_omp_finish_clause): Change clauses mapping
	assumed-size arrays to use the GOMP_MAP_FORCE_PRESENT map type.
	* gimplify.c (gimplify_adjust_omp_clauses_1): Raise error for
	assumed-size arrays in map clauses for Fortran/OpenMP.
	* omp-low.c (lower_omp_target): Set the size of assumed-size Fortran
	arrays to one to allow use of data already mapped on the offload device.
---
 gcc/fortran/ChangeLog.openacc |  9 +
 gcc/fortran/trans-openmp.c| 22 +-
 gcc/gimplify.c| 14 ++
 gcc/omp-low.c |  5 +
 4 files changed, 41 insertions(+), 9 deletions(-)

diff --git a/gcc/fortran/ChangeLog.openacc b/gcc/fortran/ChangeLog.openacc
index c44a5ebdb3b..beba7d94ad2 100644
--- a/gcc/fortran/ChangeLog.openacc
+++ b/gcc/fortran/ChangeLog.openacc
@@ -1,3 +1,12 @@
+2019-07-10  Julian Brown  
+
+	* trans-openmp.c (gfc_omp_finish_clause): Change clauses mapping
+	assumed-size arrays to use the GOMP_MAP_FORCE_PRESENT map type.
+	* gimplify.c (gimplify_adjust_omp_clauses_1): Raise error for
+	assumed-size arrays in map clauses for Fortran/OpenMP.
+	* omp-low.c (lower_omp_target): Set the size of assumed-size Fortran
+	arrays to one to allow use of data already mapped on the offload device.
+
 2019-07-10  Julian Brown  
 
 	* openmp.c (resolve_oacc_data_clauses): Allow polymorphic allocatable
diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index d5ae0b717df..db009130c85 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -1137,10 +1137,18 @@ gfc_omp_finish_clause (tree c, gimple_seq *pre_p)
   tree decl = OMP_CLAUSE_DECL (c);
 
   /* Assumed-size arrays can't be mapped implicitly, they have to be mapped
- explicitly using array sections.  An exception is if the array is
- mapped explicitly in an enclosing data construct for OpenACC, in which
- case we see GOMP_MAP_FORCE_PRESENT here and do not need to raise an
- error.  */
+ explicitly using array sections.  For OpenACC this restriction is lifted
+ if the array has already been mapped:
+
+   - Using a lexically-enclosing data region: in that case we see the
+ GOMP_MAP_FORCE_PRESENT mapping kind here.
+
+   - Using a non-lexical data mapping ("acc enter data").
+
+ In the latter case we change the mapping type to GOMP_MAP_FORCE_PRESENT.
+ This raises an error for OpenMP in our the caller
+ (gimplify.c:gimplify_adjust_omp_clauses_1).  OpenACC will raise a runtime
+ error if the assumed-size array is not mapped.  */
   if (OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_FORCE_PRESENT
   && TREE_CODE (decl) == PARM_DECL
   && GFC_ARRAY_TYPE_P (TREE_TYPE (decl))
@@ -1148,11 +1156,7 @@ gfc_omp_finish_clause (tree c, gimple_seq *pre_p)
   && GFC_TYPE_ARRAY_UBOUND (TREE_TYPE (decl),
 GFC_TYPE_ARRAY_RANK (TREE_TYPE (decl)) - 1)
 	 == NULL)
-{
-  error_at (OMP_CLAUSE_LOCATION (c),
-		"implicit mapping of assumed size array %qD", decl);
-  return;
-}
+OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_FORCE_PRESENT);
 
   tree c2 = NULL_TREE, c3 = NULL_TREE, c4 = NULL_TREE;
   if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_FORCE_DEVICEPTR)
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 60e04ff8353..58142c9eb90 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -10088,7 +10088,21 @@ gimplify_adjust_omp_clauses_1 (splay_tree_node n, void *data)
   *list_p = clause;
   struct gimplify_omp_ctx *ctx = gimplify_omp_ctxp;
   gimplify_omp_ctxp = ctx->outer_context;
+  gomp_map_kind kind = (code == OMP_CLAUSE_MAP)

[og9] Allow the accelerator to have more offloaded functions than the host

2019-07-09 Thread Julian Brown
Hi,

This patch was previously posted here by Cesar:

https://gcc.gnu.org/ml/gcc-patches/2017-10/msg00668.html

This patch is necessary when not all objects containing offload code
are linked into the final executable, including those in static
libraries. Re-tested with offloading to nvptx. I will apply to the
openacc-gcc-9-branch shortly.
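
The relaxed check can be sketched in isolation. This is a simplified model rather than the libgomp code: the function name toy_image_entries_ok is hypothetical, and only the comparison mirrors the change from "!=" to "<" in the patch.

```c
/* Return 1 if the device image is acceptable: it must provide at least
   as many entries as the host expects (functions plus variables).
   Extra device entries, for example from objects that were not linked
   into the host executable, are tolerated.  */
int
toy_image_entries_ok (int num_target_entries, int num_funcs, int num_vars)
{
  return !(num_target_entries < num_funcs + num_vars);
}
```

Under the old "!=" test the second case below (7 device entries against 5 host entries) would have been rejected.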

Thanks,

Julian

ChangeLog

2019-07-10  Cesar Philippidis  

libgomp/
* target.c (gomp_load_image_to_device): Allow the accelerator to
possess more offloaded functions than the host.
From 8fa310efa11254ed430d7e5dca80333a612b699e Mon Sep 17 00:00:00 2001
From: Cesar Philippidis 
Date: Sun, 7 Jul 2019 11:25:51 -0700
Subject: [PATCH 3/3] Allow the accelerator to have more offloaded functions
 than the host

	libgomp/
	* target.c (gomp_load_image_to_device): Allow the accelerator to
	possess more offloaded functions than the host.
---
 libgomp/ChangeLog.openacc | 5 +
 libgomp/target.c  | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/libgomp/ChangeLog.openacc b/libgomp/ChangeLog.openacc
index 1d88bd54cd2..00c58601336 100644
--- a/libgomp/ChangeLog.openacc
+++ b/libgomp/ChangeLog.openacc
@@ -1,3 +1,8 @@
+2019-07-10  Cesar Philippidis  
+
+	* target.c (gomp_load_image_to_device): Allow the accelerator to
+	possess more offloaded functions than the host.
+
 2019-07-10  Julian Brown  
 
 	* oacc-parallel.c (GOACC_enter_exit_data): Fix optional arguments for
diff --git a/libgomp/target.c b/libgomp/target.c
index a4ed763d507..c81e5ababb7 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -2131,7 +2131,7 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
 = devicep->load_image_func (devicep->target_id, version,
 target_data, &target_table);
 
-  if (num_target_entries != num_funcs + num_vars)
+  if (num_target_entries < num_funcs + num_vars)
 {
   gomp_mutex_unlock (&devicep->lock);
   if (is_register_lock)
-- 
2.22.0



[og9] Support Fortran 2003 class pointers in OpenACC

2019-07-09 Thread Julian Brown
This patch provides initial support for Fortran 2003 polymorphic class
pointers in OpenACC. This necessitated some rewriting of the lowering
code in gfc_trans_omp_clauses, partly reverting some of the changes
made by the earlier manual deep copy support. In the new code, I've
tried to reuse existing lowering code in the Fortran front-end where
appropriate.

The main changes can be summarised thus:

1. Polymorphic class pointers can be used in OpenACC data-mapping
clauses. Class descriptors (comprising a _data pointer and a _vptr
virtual-table pointer) are mapped using GOMP_MAP_TO_PSET, in a similar
way to the existing support for array descriptors.

2. For OpenACC, a new internal-only gomp_map_kind has been introduced
when mapping derived-type pointer components, GOMP_MAP_ATTACH_DETACH,
instead of hijacking GOMP_MAP_ALWAYS_POINTER for attach/detach
operations then rewriting it in gimplify.c. This cleans up some code
paths and hopefully self-documents better.

3. OpenACC "enter data" and "exit data" now have GOMP_MAP_POINTER and
GOMP_MAP_PSET mappings removed during gimplification. In some
circumstances, passing an array to a function/subroutine and then doing
an "enter data" on it could leave dangling references to the function's
stack, although the actual array data is defined outside the function.
In any case, the pointer/pointer-set mappings don't seem to be
necessary for OpenACC "enter data".

Tested with offloading to nvptx. I will apply shortly (to the
openacc-gcc-9-branch).

Thanks,

Julian

ChangeLog

gcc/
* gimplify.c (insert_struct_comp_map): Handle GOMP_MAP_ATTACH_DETACH.
(gimplify_scan_omp_clauses): Separate out handling of OACC_ENTER_DATA
and OACC_EXIT_DATA. Remove GOMP_MAP_POINTER and GOMP_MAP_TO_PSET
mappings, apart from those following GOMP_MAP_DECLARE_{,DE}ALLOCATE.
Handle GOMP_MAP_ATTACH_DETACH.
* tree-pretty-print.c (dump_omp_clause): Support GOMP_MAP_ATTACH_DETACH.
Print "bias" not "len" for attach/detach clause types.

include/
* gomp-constants.h (gomp_map_kind): Add GOMP_MAP_ATTACH_DETACH.

gcc/c/
* c-typeck.c (handle_omp_array_sections): Use GOMP_MAP_ATTACH_DETACH
for OpenACC attach/detach operations.

gcc/cp/
* semantics.c (handle_omp_array_sections): Likewise.
(finish_omp_clauses): Handle GOMP_MAP_ATTACH_DETACH.

gcc/fortran/
* openmp.c (resolve_oacc_data_clauses): Allow polymorphic allocatable
variables.
* trans-expr.c (gfc_conv_component_ref,
conv_parent_component_reference): Make global.
(gfc_auto_dereference_var): New function, broken out of...
(gfc_conv_variable): ...here. Call outlined function instead.
* trans-openmp.c (gfc_trans_omp_array_section): New function, broken out
of...
(gfc_trans_omp_clauses): ...here. Separate out OpenACC derived
type/polymorphic class pointer handling. Call above outlined function.
* trans.h (gfc_conv_component_ref, conv_parent_component_references,
gfc_auto_dereference_var): Add prototypes.

gcc/testsuite/
* c-c++-common/goacc/mdc-1.c: Update clause matching patterns.

libgomp/
* oacc-parallel.c (GOACC_enter_exit_data): Fix optional arguments for
changes to clause stripping in enter data/exit data directives.
* testsuite/libgomp.oacc-fortran/class-ptr-param.f95: New test.
* testsuite/libgomp.oacc-fortran/classtypes-1.f95: New test.
* testsuite/libgomp.oacc-fortran/classtypes-2.f95: New test.
* testsuite/libgomp.oacc-fortran/derivedtype-1.f95: New test.
* testsuite/libgomp.oacc-fortran/derivedtype-2.f95: New test.
* testsuite/libgomp.oacc-fortran/multidim-slice.f95: New test.
From 3c260613f2e74d6639c4dbd43b018b6640ae8454 Mon Sep 17 00:00:00 2001
From: Julian Brown 
Date: Wed, 20 Feb 2019 05:21:15 -0800
Subject: [PATCH 1/3] Support Fortran 2003 class pointers in OpenACC

	gcc/
	* gimplify.c (insert_struct_comp_map): Handle GOMP_MAP_ATTACH_DETACH.
	(gimplify_scan_omp_clauses): Separate out handling of OACC_ENTER_DATA
	and OACC_EXIT_DATA. Remove GOMP_MAP_POINTER and GOMP_MAP_TO_PSET
	mappings, apart from those following GOMP_MAP_DECLARE_{,DE}ALLOCATE.
	Handle GOMP_MAP_ATTACH_DETACH.
	* tree-pretty-print.c (dump_omp_clause): Support GOMP_MAP_ATTACH_DETACH.
	Print "bias" not "len" for attach/detach clause types.

	include/
	* gomp-constants.h (gomp_map_kind): Add GOMP_MAP_ATTACH_DETACH.

	gcc/c/
	* c-typeck.c (handle_omp_array_sections): Use GOMP_MAP_ATTACH_DETACH
	for OpenACC attach/detach operations.

	gcc/cp/
	* semantics.c (handle_omp_array_sections): Likewise.
	(finish_omp_clauses): Handle GOMP_MAP_ATTACH_DETACH.

	gcc/fortran/
	* openmp.c (resolve_oacc_data_clauses): Allow polymorphic allocatable
	variables.
	* trans-expr.c (gfc_conv_component_ref,
	conv_parent_component_reference): Make global.
	(gfc_auto_dereference_var): New f

Re: [PATCH,RFC,V3 0/5] Support for CTF in GCC

2019-07-09 Thread Jeff Law
On 7/9/19 5:25 PM, Segher Boessenkool wrote:
> On Fri, Jul 05, 2019 at 07:28:12PM +0100, Nix wrote:
>> On 5 Jul 2019, Richard Biener said:
>>
>>> On Fri, Jul 5, 2019 at 12:21 AM Indu Bhagat  wrote:
>>>> CTF, at this time, is type information for entities at global or file scope.
>>>> This can be used by online debuggers, program tracers (dynamic tracing); More
>>>> generally, it provides type introspection for C programs, with an optional
>>>> library API to allow them to get at their own types quite more easily than
>>>> DWARF. So, the umbrella usecases are - all C programs that want to introspect
>>>> their own types quickly; and applications that want to introspect other
>>>> programs's types quickly.
>>>
>>> What makes it superior to DWARF stripped down to the above feature set?
>>
>> Increased compactness.
> 
> Does CTF support something like -fasynchronous-unwind-tables?  You need
> that to have any sane debugging on many platforms.  Without it, you
> even have only partial backtraces, on most architectures/ABIs anyway.
I'd be surprised if it did since you need location information.  FWIW,
low level libraries like glibc depend on this stuff to support cancellation.

jeff


[PATCH], PowerPC, Patch #7, Split up SIGNED_34BIT and SIGNED_16BIT macros

2019-07-09 Thread Michael Meissner
This patch splits each of the macros SIGNED_16BIT_OFFSET_P and SIGNED_34BIT_OFFSET_P
into two separate macros, as you asked for previously in private mail.  The main
macros:

SIGNED_16BIT_OFFSET_P
SIGNED_34BIT_OFFSET_P

only take one argument, and that is the offset that is being tested.  The new
macros:

SIGNED_16BIT_OFFSET_EXTRA_P
SIGNED_34BIT_OFFSET_EXTRA_P

retain the two arguments that the current macros have.  They are useful when
functions validating addresses that might be split (such as the two
doubles in __ibm128) need to verify that all addresses in the range offset to
offset + extra are valid 16-bit or 34-bit offsets.  I have changed the existing
uses of these macros.
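
As a rough standalone model of the two macro families, the sketch below uses plain functions instead of the rs6000.h macros. All toy_* names are hypothetical. The treatment of EXTRA (lowering the upper bound by EXTRA, with EXTRA assumed nonnegative) is an assumption based on the old comment that EXTRA is the amount that cannot be touched at the high end of the range.

```c
#include <stdint.h>

/* Is VALUE representable as a signed BITS-bit offset?  */
int
toy_signed_offset_p (int64_t value, int bits)
{
  int64_t lo = -((int64_t) 1 << (bits - 1));
  int64_t hi = ((int64_t) 1 << (bits - 1)) - 1;
  return value >= lo && value <= hi;
}

/* Like the above, but also require VALUE + EXTRA to stay in range.
   Assumed semantics: EXTRA is nonnegative and reduces the upper bound.  */
int
toy_signed_offset_extra_p (int64_t value, int64_t extra, int bits)
{
  int64_t lo = -((int64_t) 1 << (bits - 1));
  int64_t hi = ((int64_t) 1 << (bits - 1)) - 1 - extra;
  return value >= lo && value <= hi;
}

/* One-argument forms, modelled on the simplified macros.  */
int
toy_signed_16bit_offset_p (int64_t value)
{
  return toy_signed_offset_p (value, 16);
}

int
toy_signed_34bit_offset_p (int64_t value)
{
  return toy_signed_offset_p (value, 34);
}

/* Two-argument form, modelled on the new *_EXTRA_P macros.  */
int
toy_signed_16bit_offset_extra_p (int64_t value, int64_t extra)
{
  return toy_signed_offset_extra_p (value, extra, 16);
}
```

For a value split into two 8-byte halves, a caller would pass an EXTRA of 8 so that the address of the second half is also checked.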

I have bootstrapped the compiler on a little endian power8 machine and there
were no regressions in the test suite.  Can I check this change into the trunk?

2019-07-09  Michael Meissner  

* config/rs6000/predicates.md (cint34_operand): Update
SIGNED_34BIT_OFFSET_P call.
(pcrel_address): Update SIGNED_34BIT_OFFSET_P call.
(pcrel_external_address): Update SIGNED_34BIT_OFFSET_P call.
* config/rs6000/rs6000.c (rs6000_prefixed_address): Update
SIGNED_16BIT_OFFSET_P and SIGNED_34BIT_OFFSET_P calls.
* config/rs6000/rs6000.h (SIGNED_16BIT_OFFSET_P): Remove EXTRA
argument.
(SIGNED_34BIT_OFFSET_P): Remove EXTRA argument.
(SIGNED_16BIT_OFFSET_EXTRA_P): New macro, like
SIGNED_16BIT_OFFSET_P with an EXTRA argument.
(SIGNED_34BIT_OFFSET_EXTRA_P): New macro, like
SIGNED_34BIT_OFFSET_P with an EXTRA argument.

Index: gcc/config/rs6000/predicates.md
===
--- gcc/config/rs6000/predicates.md (revision 273255)
+++ gcc/config/rs6000/predicates.md (working copy)
@@ -309,7 +309,7 @@ (define_predicate "cint34_operand"
   if (!TARGET_PREFIXED_ADDR)
 return 0;
 
-  return SIGNED_34BIT_OFFSET_P (INTVAL (op), 0);
+  return SIGNED_34BIT_OFFSET_P (INTVAL (op));
 })
 
 ;; Return 1 if op is a register that is not special.
@@ -1638,7 +1638,7 @@ (define_predicate "pcrel_address"
   rtx op0 = XEXP (op, 0);
   rtx op1 = XEXP (op, 1);
 
-  if (!CONST_INT_P (op1) || !SIGNED_34BIT_OFFSET_P (INTVAL (op1), 0))
+  if (!CONST_INT_P (op1) || !SIGNED_34BIT_OFFSET_P (INTVAL (op1)))
return false;
 
   op = op0;
@@ -1673,7 +1673,7 @@ (define_predicate "pcrel_external_addres
   rtx op0 = XEXP (op, 0);
   rtx op1 = XEXP (op, 1);
 
-  if (!CONST_INT_P (op1) || !SIGNED_34BIT_OFFSET_P (INTVAL (op1), 0))
+  if (!CONST_INT_P (op1) || !SIGNED_34BIT_OFFSET_P (INTVAL (op1)))
return false;
 
   op = op0;
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 273313)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -21551,11 +21551,11 @@ rs6000_prefixed_address (rtx addr, machi
return false;
 
   HOST_WIDE_INT value = INTVAL (op1);
-  if (!SIGNED_34BIT_OFFSET_P (value, 0))
+  if (!SIGNED_34BIT_OFFSET_P (value))
return false;
 
   /* Offset larger than 16-bits?  */
-  if (!SIGNED_16BIT_OFFSET_P (value, 0))
+  if (!SIGNED_16BIT_OFFSET_P (value))
return true;
 
   /* DQ instruction (bottom 4 bits must be 0) for vectors.  */
Index: gcc/config/rs6000/rs6000.h
===
--- gcc/config/rs6000/rs6000.h  (revision 273255)
+++ gcc/config/rs6000/rs6000.h  (working copy)
@@ -2526,16 +2526,27 @@ typedef struct GTY(()) machine_function
 #pragma GCC poison TARGET_FLOAT128 OPTION_MASK_FLOAT128 MASK_FLOAT128
 #endif
 
-/* Whether a given VALUE is a valid 16- or 34-bit signed offset.  EXTRA is the
-   amount that we can't touch at the high end of the range (typically if the
-   address is split into smaller addresses, the extra covers the addresses
-   which might be generated when the insn is split).  */
-#define SIGNED_16BIT_OFFSET_P(VALUE, EXTRA)\
-  IN_RANGE (VALUE, \
+/* Whether a given VALUE is a valid 16 or 34-bit signed offset.  */
+#define SIGNED_16BIT_OFFSET_P(VALUE)   \
+  IN_RANGE ((VALUE),   \
+   -(HOST_WIDE_INT_1 << 15),   \
+   (HOST_WIDE_INT_1 << 15) - 1)
+
+#define SIGNED_34BIT_OFFSET_P(VALUE)   \
+  IN_RANGE ((VALUE),   \
+   -(HOST_WIDE_INT_1 << 33),   \
+   (HOST_WIDE_INT_1 << 33) - 1)
+
+/* Like SIGNED_16BIT_OFFSET_P and SIGNED_34BIT_OFFSET_P, but with an extra
+   argument that gives a length to validate a range of addresses, to allow for
+   splitting insns into several insns, each of which has 

Re: [PATCH,RFC,V3 0/5] Support for CTF in GCC

2019-07-09 Thread Segher Boessenkool
On Fri, Jul 05, 2019 at 07:28:12PM +0100, Nix wrote:
> On 5 Jul 2019, Richard Biener said:
> 
> > On Fri, Jul 5, 2019 at 12:21 AM Indu Bhagat  wrote:
> >> CTF, at this time, is type information for entities at global or file scope.
> >> This can be used by online debuggers, program tracers (dynamic tracing); More
> >> generally, it provides type introspection for C programs, with an optional
> >> library API to allow them to get at their own types quite more easily than
> >> DWARF. So, the umbrella usecases are - all C programs that want to introspect
> >> their own types quickly; and applications that want to introspect other
> >> programs's types quickly.
> >
> > What makes it superior to DWARF stripped down to the above feature set?
> 
> Increased compactness.

Does CTF support something like -fasynchronous-unwind-tables?  You need
that to have any sane debugging on many platforms.  Without it, you
even have only partial backtraces, on most architectures/ABIs anyway.


Segher


Re: [PATCH,RFC,V3 0/5] Support for CTF in GCC

2019-07-09 Thread Mike Stump
On Jul 5, 2019, at 11:28 AM, Nix  wrote:
> ICTF for the entire Linux kernel is about 6MiB

Any reason not to add CTF to the next DWARF standard?  Then, we just support
the next DWARF standard.  If not, have you started talks with them to add it?

Long term, this is a better solution, as we then get more interoperability, 
more support, more tools and more goodness.

To me this is the obvious solution to the problem.

[PATCH], PowerPC, Patch #6, Create pc-relative addressing insns

2019-07-09 Thread Michael Meissner
This patch updates the basic support for pc-relative addressing that will be
added in a future machine.

It was originally proposed as patch #4:
https://gcc.gnu.org/ml/gcc-patches/2019-06/msg01866.html

Segher suggested that I split out moving create_TOC_reference back to rs6000.c
as a separate patch, which I did in this patch that has been committed:
https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00719.html

In working on the other suggestions from Segher about the second argument to
create_TOC_reference (which is where to put the HIGH value after register
allocation), I came to the conclusion it wasn't worth it to try and combine TOC
and pc-relative addressing.  There are two places in rs6000_emit_move that deal
with the creation of TOC/pc-relative addressing.

There is also one place in rs6000.md that creates TOC addressing and needed to
support pc-relative addressing as well.  It compares two __ibm128
floating point values when the -mcompat-xl switch is used; the comparison
checks the values against 0 and +infinity, so it needs to load
the two constants.

Instead of trying to combine the two addressing forms, I added separate support
for pc-relative addressing, and added extra checks in supporting TOC
addressing.

The one place where I combined TOC and pc-relative addressing is using a common
function to set the memory alias set group.  Originally it was called
get_TOC_alias_set, and I renamed it to get_data_alias_set and I changed all of
the callers.

There will be several other patches before all of the pc-relative support is
available and turned on by default.

I have bootstrapped the compiler on a little endian power8 system and there were
no regressions in the test suite.  Can I check it into the trunk?

2019-07-09  Michael Meissner  

* config/rs6000/rs6000-protos.h (get_data_alias_set): Rename
get_TOC_alias_set for use with both TOC & pc-relative addressing.
* config/rs6000/rs6000.c (create_TOC_reference): Add gcc_assert.
(rs6000_legitimize_address): Add check for not pc-relative in TOC
support.
(rs6000_legitimize_tls_address_aix): Call get_data_alias_set
instead of get_TOC_alias_set.
(use_toc_relative_ref): Add check for SYMBOL_REF and not
pc-relative to simplify callers.  Make tests easier to
understand.
(use_pc_relative_ref): New helper function.
(rs6000_emit_move): Add basic support for pc-relative addressing.
Call get_data_alias_set instead of get_TOC_alias_set.
(data_alias_set): Rename static variable.
(get_data_alias_set): Rename get_TOC_alias_set since it is now
used for both TOC and pc-relative addressing.
* config/rs6000/rs6000.md (cmp_internal2): Add support for
pc-relative addressing.  Call get_data_alias_set instead of
get_TOC_alias_set.

Index: gcc/config/rs6000/rs6000-protos.h
===
--- gcc/config/rs6000/rs6000-protos.h   (revision 273255)
+++ gcc/config/rs6000/rs6000-protos.h   (working copy)
@@ -189,7 +189,7 @@ extern void rs6000_gen_section_name (cha
 extern void output_function_profiler (FILE *, int);
 extern void output_profile_hook  (int);
 extern int rs6000_trampoline_size (void);
-extern alias_set_type get_TOC_alias_set (void);
+extern alias_set_type get_data_alias_set (void);
 extern void rs6000_emit_prologue (void);
 extern void rs6000_emit_load_toc_table (int);
 extern unsigned int rs6000_dbx_register_number (unsigned int, unsigned int);
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 273310)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -7740,6 +7740,8 @@ create_TOC_reference (rtx symbol, rtx la
 {
   rtx tocrel, tocreg, hi;
 
+  gcc_assert (TARGET_TOC && !TARGET_PCREL);
+
   if (TARGET_DEBUG_ADDR)
 {
   if (SYMBOL_REF_P (symbol))
@@ -8137,7 +8139,7 @@ rs6000_legitimize_address (rtx x, rtx ol
emit_insn (gen_macho_high (reg, x));
   return gen_rtx_LO_SUM (Pmode, reg, x);
 }
-  else if (TARGET_TOC
+  else if (TARGET_TOC && !TARGET_PCREL
   && SYMBOL_REF_P (x)
   && constant_pool_expr_p (x)
   && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (x), Pmode))
@@ -8441,7 +8443,7 @@ rs6000_legitimize_tls_address_aix (rtx a
 {
   tocref = create_TOC_reference (XEXP (sym, 0), NULL_RTX);
   mem = gen_const_mem (Pmode, tocref);
-  set_mem_alias_set (mem, get_TOC_alias_set ());
+  set_mem_alias_set (mem, get_data_alias_set ());
 }
   else
 return sym;
@@ -8459,7 +8461,7 @@ rs6000_legitimize_tls_address_aix (rtx a
   SYMBOL_REF_FLAGS (modaddr) |= SYMBOL_FLAG_LOCAL;
   tocref = create_TOC_reference (modaddr, NULL_RTX);
   rtx modmem = gen_const_mem (Pmode, tocref);
-  set_mem_alias_set (modmem, get_TOC_alias_set ());
+  set_mem_alias_set (modmem, get_dat

Fix wi::lshift

2019-07-09 Thread Marc Glisse

Hello,

because of another bug, __builtin_constant_p is ignored in some cases,
and this bug in wide_int went unnoticed. I am fixing it by making the code
match the comment above it more closely.
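
To see why the old fast-path test lets negative values through, here is a small model with 64-bit integers standing in for HOST_WIDE_INT. The toy_* function names are hypothetical; the two conditions are transcribed from the removed and added lines of the diff below, and shift is assumed to be less than 63, as the surrounding condition requires.

```c
#include <stdint.h>

/* Old test: any negative VAL passes, because a negative number is
   always <= the nonnegative bound.  */
int
toy_fast_path_old (int64_t val, unsigned shift)
{
  return val <= (int64_t) ((uint64_t) INT64_MAX >> shift);
}

/* New test: negative values are rejected explicitly.  */
int
toy_fast_path_new (int64_t val, unsigned shift)
{
  return val >= 0 && val <= (INT64_MAX >> shift);
}

/* What the fast path stores in the single result word: the low word
   shifted as an unsigned value.  */
int64_t
toy_fast_shift (int64_t val, unsigned shift)
{
  return (int64_t) ((uint64_t) val << shift);
}
```

For example, INT64_MIN shifted left by 1 in a wider precision is -2^64, which does not fit in one word, yet the old test accepts it and the single stored word is 0.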


Bootstrap+regtest together with a fix for __builtin_constant_p on 
x86_64-pc-linux-gnu.


2019-07-11  Marc Glisse  

* wide-int.h (wi::lshift): Reject negative values for the fast path.

--
Marc Glisse
Index: gcc/wide-int.h
===
--- gcc/wide-int.h	(revision 273306)
+++ gcc/wide-int.h	(working copy)
@@ -3025,22 +3025,22 @@ wi::lshift (const T1 &x, const T2 &y)
 	 handle the case where the shift value is constant and the
 	 result is a single nonnegative HWI (meaning that we don't
 	 need to worry about val[1]).  This is particularly common
 	 for converting a byte count to a bit count.
 
 	 For variable-precision integers like wide_int, handle HWI
 	 and sub-HWI integers inline.  */
   if (STATIC_CONSTANT_P (xi.precision > HOST_BITS_PER_WIDE_INT)
 	  ? (STATIC_CONSTANT_P (shift < HOST_BITS_PER_WIDE_INT - 1)
 	 && xi.len == 1
-	 && xi.val[0] <= (HOST_WIDE_INT) ((unsigned HOST_WIDE_INT)
-	  HOST_WIDE_INT_MAX >> shift))
+	 && xi.val[0] >= 0
+	 && xi.val[0] <= HOST_WIDE_INT_MAX >> shift)
 	  : precision <= HOST_BITS_PER_WIDE_INT)
 	{
 	  val[0] = xi.ulow () << shift;
 	  result.set_len (1);
 	}
   else
 	result.set_len (lshift_large (val, xi.val, xi.len,
   precision, shift));
 }
   return result;


Re: [range-ops] patch 05/04: bonus round!

2019-07-09 Thread Jeff Law
On 7/6/19 3:26 AM, Aldy Hernandez wrote:
> 
> 
> On 7/3/19 7:12 PM, Jeff Law wrote:
>> On 7/1/19 4:24 AM, Aldy Hernandez wrote:
>>> This is completely unrelated to range-ops itself, but may yield better
>>> results in value_range intersections.  It's just something I found while
>>> working on VRP, and have been dragging around on our branch.
>>>
>>> If we know the intersection of two ranges is the empty set, there is no
>>> need to conservatively add anything to the result.
>>>
>>> Tested on x86-64 Linux with --enable-languages=all.
>>>
>>> Aldy
>>>
>>> range-ops-intersect-undefined.patch
>>>
>>> commit 4f9aa7bd1066267eee92f622ff29d78534158e20
>>> Author: Aldy Hernandez 
>>> Date:   Fri Jun 28 11:34:19 2019 +0200
>>>
>>>  Do not try to further refine a VR_UNDEFINED result when intersecting
>>>
>>>  value_ranges.
>>>
>>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>>> index 01fb97cedb2..b0d78ee6871 100644
>>> --- a/gcc/ChangeLog
>>> +++ b/gcc/ChangeLog
>>> @@ -1,3 +1,9 @@
>>> +2019-07-01  Aldy Hernandez  
>>> +
>>> +* tree-vrp.c (intersect_ranges): If we know the intersection is
>>> +empty, there is no need to conservatively add anything else to
>>> +the set.
>> Do we have a test where this improves the code or at least the computed
>> ranges?
> 
> I have long since lost the testcase, but a little hacking around to dump
> all the cases where we are trying to conservatively refine a
> VR_UNDEFINED, yields tons of hits.  See attached hack.
> 
> By far, the most common is the intersection of ~[0,0] and [0,0], which
> yields VR_UNDEFINED.  We then conservatively drop to VR1, which is
> [0,0], basically pessimizing the result.
> 
> Other examples include:
> 
> unsigned char [38, +INF], unsigned char [22, 22]
> unsigned char [4, 4], unsigned char [1, 1]
> unsigned char [4, 4], unsigned char [2, 2]
> unsigned int [0, 64], unsigned int [128, 128]
> unsigned int [0, 6], unsigned int [4294967275, 4294967275]
> unsigned int ~[35, 35], unsigned int [35, 35]
> unsigned int [46, 52], unsigned int [25, 25]
> etc
> 
> Within 7 minutes of building a compiler (up to stage2), this incorrect
> refinement of VR_UNDEFINED has been triggered 29000 times.
> 
> If we know it's VR_UNDEFINED (the empty set), I think we should leave it
> empty :).
So my guess is this happens mostly as a result of the way EVRP works.
It refines ranges as it does its domwalk through the CFG via the
intersection method.

I suspect a lot of the time it's coming from stuff like this:

;;   basic block 11, loop depth 1
;;pred:   9
;;10
  # h_14 = PHI <0B(9), &f(10)>
  *h_14 = 0;

Where the dereference creates ~[0,0] that we want to intersect with
[0,0] coming from the PHI for h_14 (this happens when threading jumps).

I think creating VR_UNDEFINED for this is fine.  One might argue that
when we get into this situation what we've found is an infeasible path,
but I'm far from being confident that's always the case.
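
As a rough illustration of an empty intersection, here is a small sketch
in C. It is not the tree-vrp.c implementation: struct toy_range and
toy_intersect are hypothetical names, only closed unsigned intervals are
modelled, and anti-ranges such as ~[0,0] are left out.

```c
#include <stdbool.h>

/* Hypothetical closed interval [lo, hi]; EMPTY plays the role of
   VR_UNDEFINED in this toy model.  */
struct toy_range
{
  unsigned int lo;
  unsigned int hi;
  bool empty;
};

/* Intersect A and B.  If the bounds cross, the result is the empty
   set and is left that way rather than widened to either input.  */
static struct toy_range
toy_intersect (struct toy_range a, struct toy_range b)
{
  struct toy_range r;
  r.lo = a.lo > b.lo ? a.lo : b.lo;
  r.hi = a.hi < b.hi ? a.hi : b.hi;
  r.empty = a.empty || b.empty || r.lo > r.hi;
  return r;
}
```

With this model, intersecting [0, 64] with [128, 128] from the list quoted
above yields an empty result, while [0, 64] with [4, 4] yields [4, 4].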

The other case that's likely common is -O1 where we don't run VRP to
clean a bunch of this stuff up.  Then later the sprintf warning pass
comes along and given the crud still in the IL we trigger similar stuff.

FWIW, at least in cases like above the path isolation code will kick in
and isolate the path 9->11 where block 11' would just trap.  The
original block 11 would change into *&f = 0 as a result of the path
isolation.

I'm slightly concerned about the optimistic handling of VR_UNDEFINED in
vrp_meet and how that might interact with this patch.  But I think we'd
be in the realm of invalid/undefined code in those cases.

So OK for the trunk.

jeff



Committed (vectorizable_comparison): Swap operands only once.

2019-07-09 Thread Joern Wolfgang Rennecke
For gcc.dg/vect/vect-bool-cmp.c, vectorizable_comparison would swap the
comparison operands in fn7 once for each copy, thus all odd copies would
end up unswapped.
Regression tested on x86_64-pc-linux-gnu.
Committed as obvious.
2019-07-09  Joern Rennecke  

* tree-vect-stmts.c (vectorizable_comparison) :
Swap operands only once.

Index: tree-vect-stmts.c
===
--- tree-vect-stmts.c   (revision 273313)
+++ tree-vect-stmts.c   (working copy)
@@ -10369,7 +10369,7 @@ vectorizable_comparison (stmt_vec_info s
 
   if (!slp_node)
{
- if (swap_p)
+ if (swap_p && j == 0)
std::swap (vec_rhs1, vec_rhs2);
  vec_oprnds0.quick_push (vec_rhs1);
  vec_oprnds1.quick_push (vec_rhs2);
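
The effect can be reproduced with a toy loop in C. This is only a sketch
under the assumption that the two operands carry over from one copy to the
next; push_copies_sketch and its parameters are hypothetical and plain ints
stand in for the vector defs.

```c
/* Record the operand pair pushed for each of NCOPIES copies.
   SWAP_ONCE == 0 models the old code (swap on every copy);
   SWAP_ONCE != 0 models the fix (swap only when j == 0).  */
static void
push_copies_sketch (int rhs1, int rhs2, int ncopies, int swap_p,
                    int swap_once, int out0[], int out1[])
{
  for (int j = 0; j < ncopies; j++)
    {
      if (swap_p && (!swap_once || j == 0))
        {
          int tmp = rhs1;
          rhs1 = rhs2;
          rhs2 = tmp;
        }
      out0[j] = rhs1;
      out1[j] = rhs2;
    }
}
```

Starting from operands (1, 2) with four copies, the old behaviour records
2, 1, 2, 1 as first operands, whereas swapping only once records 2, 2, 2, 2.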


[PATCH] i386: Add AVX512 unaligned intrinsics

2019-07-09 Thread Sunil Pandey
__m512i _mm512_loadu_epi32( void * sa);
__m512i _mm512_loadu_epi64( void * sa);
void _mm512_storeu_epi32(void * d, __m512i a);
void _mm256_storeu_epi32(void * d, __m256i a);
void _mm_storeu_epi32(void * d, __m128i a);
void _mm512_storeu_epi64(void * d, __m512i a);
void _mm256_storeu_epi64(void * d, __m256i a);
void _mm_storeu_epi64(void * d, __m128i a);

Tested on x86-64.

OK for trunk?

--Sunil Pandey


gcc/

PR target/90980
* config/i386/avx512fintrin.h (__v16si_u): New data type
(__v8di_u): Likewise
(_mm512_loadu_epi32): New.
(_mm512_loadu_epi64): Likewise.
(_mm512_storeu_epi32): Likewise.
(_mm512_storeu_epi64): Likewise.
* config/i386/avx512vlintrin.h (_mm_storeu_epi32): New.
(_mm256_storeu_epi32): Likewise.
(_mm_storeu_epi64): Likewise.
(_mm256_storeu_epi64): Likewise.

gcc/testsuite/

PR target/90980
* gcc.target/i386/avx512f-vmovdqu32-3.c: New test.
* gcc.target/i386/avx512f-vmovdqu64-3.c: Likewise.
* gcc.target/i386/pr90980-1.c: Likewise.
* gcc.target/i386/pr90980-2.c: Likewise.


0001-i386-Add-AVX512-unaligned-intrinsics.patch
Description: Binary data


Re: [PATCH] Perform case-insensitive comparison when decoding register names (PR target/70320)

2019-07-09 Thread Segher Boessenkool
On Tue, Jul 09, 2019 at 10:16:31PM +0100, Jozef Lawrynowicz wrote:
> On Mon, 8 Jul 2019 16:42:15 -0500
> Segher Boessenkool  wrote:
> 
> > > Ok, yes a DEFHOOKPOD or similar sounds like a good idea, I'll look into
> > > this alternative.
> > 
> > What is that, like target macros?  But with some indirection?
> 
> Yes, it's for target macros; it looks like the "POD" in DEFHOOKPOD stands for
> "piece-of-data", i.e. the hook represents a variable rather than a function.

But it is data, not a constant, so it does not allow optimising based
on its potentially constant value?  Where "potentially" in this case
means "always" :-/


Segher


Re: [PATCH, Modula-2 (C/C++/D/F/Go/Jit)] (Register spec fn) (v2)

2019-07-09 Thread Matthias Klose
On 08.07.19 23:19, Matthias Klose wrote:
> On 14.06.19 15:09, Gaius Mulley wrote:
>>
>> Hello,
>>
>> here is version two of the patches which introduce Modula-2 into the
>> GCC trunk.  The patches include:
>>
>>   (*)  a patch to allow all front ends to register a lang spec function.
>>(included are patches for all front ends to provide an empty
>> callback function).
>>   (*)  patch diffs to allow the Modula-2 front end driver to be
>>built using GCC Makefile and friends.
>>
>> The compressed tarball includes:
>>
>>   (*)  gcc/m2  (compiler driver and lang-spec stuff for Modula-2).
>>Including the need for registering lang spec functions.
>>   (*)  gcc/testsuite/gm2  (a Modula-2 dejagnu test to ensure that
>>the gm2 driver is built and can understand --version).
>>
>> These patches have been re-written after taking on board the comments
>> found in this thread:
>>
>>https://gcc.gnu.org/ml/gcc-patches/2013-11/msg02620.html
>>
>> it is a revised patch set from:
>>
>>https://gcc.gnu.org/ml/gcc-patches/2019-06/msg00220.html
>>
>> I've run make bootstrap and run the regression tests on trunk and no
>> extra failures occur for all languages touched in the ChangeLog.
>>
>> I'm currently tracking gcc trunk and gcc-9 with gm2 (which works well
>> with amd64/arm64/i386) - these patches are currently simply for the
>> driver to minimise the patch size.  There are also > 1800 tests in a
>> dejagnu testsuite for gm2 which can be included at some future time.
> 
> I had a look at the GCC 9 version of the patches, with a build including a
> make install. Some comments:

[...]

>  - The internal tools in the gcclibdir are installed twice, with
>both vanilla names and prefixed/suffixed names.
> The installed tree:
> 
> ./usr/bin
> ./usr/bin/x86_64-linux-gnu-gm2-9
> ./usr/bin/x86_64-linux-gnu-gm2m-9
> ./usr/lib/gcc/x86_64-linux-gnu
> ./usr/lib/gcc/x86_64-linux-gnu/9
> ./usr/lib/gcc/x86_64-linux-gnu/9/cc1gm2
> ./usr/lib/gcc/x86_64-linux-gnu/9/gm2l
> ./usr/lib/gcc/x86_64-linux-gnu/9/gm2lcc
> ./usr/lib/gcc/x86_64-linux-gnu/9/gm2lgen
> ./usr/lib/gcc/x86_64-linux-gnu/9/gm2lorder
> ./usr/lib/gcc/x86_64-linux-gnu/9/x86_64-linux-gnu-cc1gm2-9
> ./usr/lib/gcc/x86_64-linux-gnu/9/x86_64-linux-gnu-gm2l-9
> ./usr/lib/gcc/x86_64-linux-gnu/9/x86_64-linux-gnu-gm2lcc-9
> ./usr/lib/gcc/x86_64-linux-gnu/9/x86_64-linux-gnu-gm2lgen-9
> ./usr/lib/gcc/x86_64-linux-gnu/9/x86_64-linux-gnu-gm2lorder-9
> ./usr/lib/gcc/x86_64-linux-gnu/9/x86_64-linux-gnu-gm2m-9

With a fresh build, configured with

 --program-suffix=-9
 --program-prefix=x86_64-linux-gnu-

the latter set of internal binaries is installed, while I would expect just the
un-pre/post-fixed tool names.


Re: [PATCH, Modula-2 (C/C++/D/F/Go/Jit)] (Register spec fn) (v2)

2019-07-09 Thread Matthias Klose
On 09.07.19 21:48, Gaius Mulley wrote:
> Matthias Klose  writes:
> 
>>>  - libpth.{a,so} is installed in the system libdir, which
>>>conflicts with the installation of the libpth packages
>>>on most distros.
>>
>> found out that a system provided libpth can be used.  Otoh if you build the
>> in-tree libpth, it shouldn't be installed, but built as a convenience
>> library, like libgo using libffi, or libgphobos using zlib.
> 
> Hi Matthias,
> 
> as far as I know Redhat doesn't support libpth-dev - therefore it was
> decided to include libpth in the gm2 tree and autodetect build/install
> it as necessary.

That's ok, but then please don't install it as a system library.  That's what
convenience libraries are for (a libpth.a built with -fPIC, which you can link
against).


Re: (C++) Remove -fdeduce-init-list?

2019-07-09 Thread Marek Polacek
On Mon, May 20, 2019 at 02:38:31PM -0400, Marek Polacek wrote:
> Back in 2011, -fdeduce-init-list was marked as deprecated:
> .
> 
> 7.5 years later, is it time to rip it out completely from the compiler?

Seeing the -frepo deprecation, any opinions re this?

Marek


Re: [C++ PATCH] PR c++/90590 Suppress warning for enumeration value not handled in switch warning

2019-07-09 Thread Jason Merrill

On 7/9/19 11:18 AM, Matthew Beliveau wrote:

This patch suppresses the warning:  "enumeration value not handled in
switch", for enumerators that are defined in system headers and use
reserved names.



+  if (decl == NULL_TREE)
+   decl = lookup_name (TREE_PURPOSE (chain));


This seems likely to find an unrelated declaration.  If we have a name 
without a decl, I think it would be better to just look at that name 
rather than try to find the corresponding decl.  For location, we can 
use the location of the type.
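
A check that looks only at the name could be sketched as follows in C.
This is an illustration, not the GCC routine: reserved_name_sketch_p is a
hypothetical helper, and it applies the C++ rule that identifiers containing
a double underscore, or beginning with an underscore followed by an
uppercase letter, are reserved to the implementation.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if NAME is reserved to the implementation
   for any use under the C++ identifier rules.  */
static bool
reserved_name_sketch_p (const char *name)
{
  /* An underscore followed by an uppercase letter.  */
  if (name[0] == '_' && isupper ((unsigned char) name[1]))
    return true;
  /* A double underscore anywhere in the identifier.  */
  return strstr (name, "__") != NULL;
}
```

Under this rule "_Foo" and "__foo" are reserved, while "foo" and "_foo"
are not.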


Jason


Re: [RFC/PATCH v2][PR89245] Check REG_CALL_DECL note during the tail-merging

2019-07-09 Thread Jeff Law
On 7/9/19 2:06 PM, Dragan Mladjenovic wrote:
> This patch prevents merging of CALL instructions that have different
> REG_CALL_DECL notes attached to them.
> 
> On most architectures this is not an important distinction. Usually
> instruction patterns for calls to different functions reference different
> SYMBOL_REF-s, so they won't match.
> On MIPS PIC calls get split into a got_load/*call_internal pair where the
> latter represents an indirect register call w/o SYMBOL_REF attached (until
> machine_reorg pass). The bugzilla issue had such two internal_call-s merged
> despite the fact that they had different register usage information
> assigned by ipa-ra.
> 
> As per comment from Richard Sandiford, this version compares reg usage for
> both call instructions instead of shallow comparing the notes. Tests updated
> accordingly.
> 
> gcc/ChangeLog:
> 
> 2019-07-09  Dragan Mladjenovic  
> 
>   * cfgcleanup.c (old_insns_match_p): Check if used hard regs set is equal
>   for both call instructions.
> 
> gcc/testsuite/ChangeLog:
> 
> 2019-07-09  Dragan Mladjenovic  
> 
>   * gcc.target/mips/cfgcleanup-jalr1.c: New test.
>   * gcc.target/mips/cfgcleanup-jalr2.c: New test.
>   * gcc.target/mips/cfgcleanup-jalr3.c: New test.
Thanks.  I've installed this on the trunk.

jeff


Re: [PATCH] Perform case-insensitive comparison when decoding register names (PR target/70320)

2019-07-09 Thread Jozef Lawrynowicz
On Mon, 8 Jul 2019 16:42:15 -0500
Segher Boessenkool  wrote:

> > Ok, yes a DEFHOOKPOD or similar sounds like a good idea, I'll look into this
> > alternative.  
> 
> What is that, like target macros?  But with some indirection?

Yes, it's for target macros; it looks like the "POD" in DEFHOOKPOD stands for
"piece-of-data", i.e. the hook represents a variable rather than a function.

So we can just have a hook like TARGET_CASE_INSENSITIVE_REGISTER_NAME set to
false by default, and then check targetm.case_insensitive_register_name before
comparing the given regname with the names defined by the backend.
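
Roughly, the comparison could look like the sketch below, written in C. It
assumes the hook value is passed in as a plain bool; decode_reg_name_sketch
and its parameters are hypothetical and this is not the actual
decode_reg_name code in varasm.c. strcasecmp comes from POSIX <strings.h>.

```c
#include <stdbool.h>
#include <string.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Return the index of ASMSPEC in NAMES, or -1 if it is not found.
   CASE_INSENSITIVE stands in for the proposed target hook value.  */
static int
decode_reg_name_sketch (const char *asmspec, const char *const names[],
                        int n_names, bool case_insensitive)
{
  for (int i = 0; i < n_names; i++)
    {
      int cmp = (case_insensitive
                 ? strcasecmp (asmspec, names[i])
                 : strcmp (asmspec, names[i]));
      if (cmp == 0)
        return i;
    }
  return -1;
}
```

With names "r0", "r1", "sp", the spelling "R1" matches index 1 only when
the flag is set, so targets that leave it false keep the current behaviour.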

> 
> Making this target-specific sounds good, thanks Jozef.
> 
> 
> Segher

Thanks,
Jozef


Re: [PATCH] Deprecate -frepo option.

2019-07-09 Thread Jason Merrill

On 7/9/19 1:48 PM, Nathan Sidwell wrote:

On 7/9/19 9:00 AM, Martin Liška wrote:

On 7/9/19 1:41 PM, Nathan Sidwell wrote:

On 7/9/19 6:39 AM, Richard Biener wrote:

On Mon, Jul 8, 2019 at 2:04 PM Martin Liška  wrote:






Same happens also for GCC7. It does 17 iterations (#define MAX_ITERATIONS 17)
and apparently 17 is not enough to resolve all symbols. And it's really slow.


Ouch.


hm, 17 is a magic number.  in C++98 it was the maximum depth of 
template instantiations that implementations needed to support.  
Portable code could not expect more.  So the worst case -frepo 
behaviour would be 17 iterations.


That's not true any more, it's been 1024 since C++11.

Has a bug been filed about this frepo problem?


I created a new one:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91125


If not, it suggests those using frepo are not compiling modern C++.


That said, I would recommend to remove it :)


In the end it's up to the C++ FE maintainers but the above clearly
doesn't look promising
(not sure if it keeps re-compiling _all_ repo-triggered templates or
just incrementally adds
them to new object files).



I'm not opposed to removing -frepo from GCC 10 but then I would start
noting it is obsolete
on the GCC 9 branch at least.


I concur.  frepo's serial reinvocation of the compiler is not 
compatible with modern C++ code bases.


Great. Then I'm sending patch that does the functionality removal.

Ready to be installed after proper testing & bootstrap?


I'd like Jason to render an opinion, and we should mark it obsolete in 
the gcc 9 branch (how much runway would that give people?)


I haven't noticed any responses to my earlier question: Are there any 
targets without COMDAT support that we still care about?


But given the observation above about the 17 limit, even if there are 
such targets it wouldn't be very useful for modern code.  And if people 
want to compile old code for old platforms, they might as well continue 
to use an old compiler.


So I'm OK with deprecating with a warning for the next GCC 9 release, to 
see if anyone complains, and removing in 10.


Jason


Re: [C++ Patch] A few additional location improvements to grokdeclarator and check_tag_decl

2019-07-09 Thread Jason Merrill

On 7/9/19 6:10 AM, Paolo Carlini wrote:

Hi,

On 08/07/19 23:44, Jason Merrill wrote:

On 6/23/19 7:58 AM, Paolo Carlini wrote:

+    error_at (smallest_type_location (get_type_quals (declspecs),
+  declspecs->locations),
How about adding a smallest_type_location overload that just takes 
declspecs?


Sure. The below has an additional location fixlet which I noticed over 
the last days, for "complex invalid for". Tested x86_64-linux, as usual.


OK, thanks.

Jason



Re: Make nonoverlapping_component_refs work with duplicated main variants

2019-07-09 Thread Bernhard Reutner-Fischer
On 9 July 2019 15:37:30 CEST, Richard Biener  wrote:
>On Tue, 9 Jul 2019, Jan Hubicka wrote:
>
>> Hi,
>> this is updated variant I am testing.
>> It documents better how function works and streamlines the checks.
>> 
>> OK assuming it passes the tests?
>> 
>> Honza
>> 
>> Index: tree-ssa-alias.c
>> ===
>> --- tree-ssa-alias.c (revision 273193)
>> +++ tree-ssa-alias.c (working copy)
>> @@ -1128,6 +1128,91 @@ aliasing_component_refs_p (tree ref1,
>>return false;
>>  }
>>  
>> +/* FIELD1 and FIELD2 are two fields of component refs.  We assume
>> +   that bases of both component refs are
>> + (*) are either equivalent or they point to different objects.
>
>are either equivalent(*) or not overlapping
>
>> +   We do not assume that FIELD1 and FIELD2 are of same type.
>
>that the containers of FIELD1 and FIELD2 are of the same type?
>
>> +
>> +   Return 0 if FIELD1 and FIELD2 satisfy (*).
>> +   This is the case when their offsets are the same.
>
>Hmm, so when the offsets are the same then the bases are equivalent?
>I think you want to say
>
> Return 0 if in case the component refs satisfy (*) we
> know FIELD1 and FIELD2 are overlapping exactly.
>
>> +   Return 1 if FIELD1 and FIELD2 are non-overlapping.
>> +
>> +   Return -1 otherwise.
>> +
>> +   Main difference between 0 and -1 is to let
>> +   nonoverlapping_component_refs_since_match_p discover the
>semnatically
>
>semantically
>
>otherwise looks good now.
>
>Thanks,
>Richard.
>
>> +   equivalent part of the access path.
>> +
>> +   Note that this function is used even with -fno-strict-aliasing
>> +   and makes use of no TBAA assumptions.  */
>> +
>> +static int
>> +nonoverlapping_component_refs_p_1 (const_tree field1, const_tree
>field2)
>> +{
>> +  /* If both fields are of the same type, we could save hard work of
>> + comparing offsets.  */
>> +  tree type1 = DECL_CONTEXT (field1);
>> +  tree type2 = DECL_CONTEXT (field2);
>> +
>> +  if (DECL_BIT_FIELD_REPRESENTATIVE (field1))
>> +field1 = DECL_BIT_FIELD_REPRESENTATIVE (field1);
>> +  if (DECL_BIT_FIELD_REPRESENTATIVE (field2))
>> +field2 = DECL_BIT_FIELD_REPRESENTATIVE (field1);

Typo: s/field1/field2/

>> +
>> +  /* ??? Bitfields can overlap at RTL level so punt on them.
>> + FIXME: RTL expansion should be fixed by adjusting the access
>path
>> + when producing MEM_ATTRs for MEMs which are wider than 
>> + the bitfields similarly as done in set_mem_attrs_minus_bitpos. 
>*/
>> +  if (DECL_BIT_FIELD (field1) && DECL_BIT_FIELD (field2))
>> +return -1;

That's a pity.
thanks,
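
The three-way result discussed in the quoted comment can be pictured with a
small sketch in C. It assumes both fields belong to the same container and
are described only by a bit position and a bit size; struct toy_field and
fields_nonoverlapping_sketch are hypothetical and far simpler than the
tree-based function in the patch.

```c
/* Hypothetical field descriptor: position and size in bits.  */
struct toy_field
{
  unsigned long bitpos;
  unsigned long bitsize;
};

/* Return 0 if F1 and F2 occupy exactly the same bits, 1 if their bit
   ranges are disjoint, and -1 if nothing can be concluded.  */
static int
fields_nonoverlapping_sketch (struct toy_field f1, struct toy_field f2)
{
  if (f1.bitpos == f2.bitpos && f1.bitsize == f2.bitsize)
    return 0;
  if (f1.bitpos + f1.bitsize <= f2.bitpos
      || f2.bitpos + f2.bitsize <= f1.bitpos)
    return 1;
  return -1;
}
```

In this model a partial overlap gives -1, which corresponds to the case
where the caller cannot continue matching the access path.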


Re: [PATCH 1/2] Come up with function_decl_type and use it in tree_function_decl.

2019-07-09 Thread Jason Merrill

On 7/9/19 6:17 AM, Marc Glisse wrote:

On Tue, 9 Jul 2019, Martin Liška wrote:


On 7/9/19 9:49 AM, Marc Glisse wrote:

On Tue, 9 Jul 2019, Marc Glisse wrote:


On Mon, 8 Jul 2019, Martin Liška wrote:

The patch apparently has DECL_IS_OPERATOR_DELETE only on the 
replaceable global deallocation functions, not all delete 
operators, contrary to DECL_IS_OPERATOR_NEW, so the name is 
misleading. On the other hand, those seem to be the ones for which 
the optimization is legal (well, not quite, the rules are in terms 
of operator new, and I am not sure how well operator delete has to 
match, but close enough).


Are you talking about this location where we set OPERATOR_NEW:
https://github.com/gcc-mirror/gcc/blob/master/gcc/cp/decl.c#L13643
?

That's the only place where we set OPERATOR_NEW flag and not 
OPERATOR_DELETE.


Yes, I think that's the place.

Again, not setting DECL_IS_OPERATOR_DELETE on local operator delete
seems misleading, but setting it would let us optimize in cases where we
are not really allowed to. Maybe just rename your macro to
DECL_IS_GLOBAL_OPERATOR_DELETE?


Hmm, I replied too fast.

Global operator delete does not seem like a good terminology, the 
ones marked in the patch would be the usual (=non-placement) 
replaceable deallocation functions.


I cannot find a requirement that operator new and operator delete 
should match. The rules to omit allocation are stated in terms of 
which operator new is called, but do not seem to care which operator 
delete is used. So allocating with the global operator new and 
deallocating with a class overload of operator delete can be removed, 
but not the reverse (not sure how they came up with such a rule...). 


Correct.  The standard just says that an implementation is allowed to 
omit a call to the replaceable ::operator new; it does not place any 
constraints on that, the conditions for such omission are left up to the 
implementation.


If the user's code uses global new and class delete for the same 
pointer, that would suggest that they're doing something odd, and we 
might as well leave it alone.  I would expect this to be very rare.



Which means we would need:


Thank you Marc for digging deep in that.



keep DECL_IS_OPERATOR_NEW for the current uses

DECL_IS_REPLACEABLE_OPERATOR_NEW (equivalent to DECL_IS_OPERATOR_NEW 
&& DECL_IS_MALLOC? not exactly but close I think) for DCE


DECL_IS_OPERATOR_DELETE (which also includes some class overloads) 
for DCE


Note that with the current version of the patch we are out of free 
bits in struct GTY(()) tree_function_decl.
Would it be possible to tweak the current patch to cover what you 
described?


If you approximate DECL_IS_REPLACEABLE_OPERATOR_NEW with 
DECL_IS_OPERATOR_NEW && DECL_IS_MALLOC, it shouldn't need more bits than 
the current patch. I think the main difference is if a user adds 
attribute malloc to his class-specific operator new, where it will 
enable DCE, but since the attribute is non-standard, we can just 
document that behavior, it might even be desirable.


Sure, it seems desirable to me.

Jason


[RFC/PATCH v2][PR89245] Check REG_CALL_DECL note during the tail-merging

2019-07-09 Thread Dragan Mladjenovic
This patch prevents merging of CALL instructions that have different
REG_CALL_DECL notes attached to them.

On most architectures this is not an important distinction. Usually
instruction patterns for calls to different functions reference different
SYMBOL_REF-s, so they won't match.
On MIPS PIC calls get split into a got_load/*call_internal pair where the
latter represents an indirect register call w/o SYMBOL_REF attached (until
machine_reorg pass). The bugzilla issue had such two internal_call-s merged
despite the fact that they had different register usage information
assigned by ipa-ra.

As per comment from Richard Sandiford, this version compares reg usage for
both call instructions instead of shallow comparing the notes. Tests updated
accordingly.
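
As a simplified picture of the check, consider the sketch below in C. It
assumes a hard register set fits in a 64-bit mask; toy_hard_reg_set,
struct toy_call and calls_may_merge_sketch are hypothetical stand-ins for
HARD_REG_SET, the call insns and the test added to old_insns_match_p.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register set: bit N set means hard register N may be
   clobbered by the call.  */
typedef uint64_t toy_hard_reg_set;

/* Hypothetical call descriptor: the insn pattern id plus the register
   usage computed for the callee (e.g. by ipa-ra).  */
struct toy_call
{
  int pattern_id;
  toy_hard_reg_set used;
};

/* Two calls are merge candidates only if their patterns match and
   they clobber exactly the same set of hard registers.  */
static bool
calls_may_merge_sketch (const struct toy_call *c1, const struct toy_call *c2)
{
  return c1->pattern_id == c2->pattern_id && c1->used == c2->used;
}
```

Two indirect calls with identical patterns but different clobber masks are
therefore kept apart, which is the situation described for MIPS PIC above.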

gcc/ChangeLog:

2019-07-09  Dragan Mladjenovic  

* cfgcleanup.c (old_insns_match_p): Check if used hard regs set is equal
for both call instructions.

gcc/testsuite/ChangeLog:

2019-07-09  Dragan Mladjenovic  

* gcc.target/mips/cfgcleanup-jalr1.c: New test.
* gcc.target/mips/cfgcleanup-jalr2.c: New test.
* gcc.target/mips/cfgcleanup-jalr3.c: New test.
---
 gcc/cfgcleanup.c |  9 +
 gcc/testsuite/gcc.target/mips/cfgcleanup-jalr1.c | 19 +++
 gcc/testsuite/gcc.target/mips/cfgcleanup-jalr2.c | 23 +++
 gcc/testsuite/gcc.target/mips/cfgcleanup-jalr3.c | 23 +++
 4 files changed, 74 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/mips/cfgcleanup-jalr1.c
 create mode 100644 gcc/testsuite/gcc.target/mips/cfgcleanup-jalr2.c
 create mode 100644 gcc/testsuite/gcc.target/mips/cfgcleanup-jalr3.c

diff --git a/gcc/cfgcleanup.c b/gcc/cfgcleanup.c
index 992912c..fca3a08 100644
--- a/gcc/cfgcleanup.c
+++ b/gcc/cfgcleanup.c
@@ -53,6 +53,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "dce.h"
 #include "dbgcnt.h"
 #include "rtl-iter.h"
+#include "regs.h"
 
 #define FORWARDER_BLOCK_P(BB) ((BB)->flags & BB_FORWARDER_BLOCK)
 
@@ -1224,6 +1225,14 @@ old_insns_match_p (int mode ATTRIBUTE_UNUSED, rtx_insn *i1, rtx_insn *i2)
}
}
}
+
+  HARD_REG_SET i1_used, i2_used;
+
+  get_call_reg_set_usage (i1, &i1_used, call_used_reg_set);
+  get_call_reg_set_usage (i2, &i2_used, call_used_reg_set);
+
+  if (!hard_reg_set_equal_p (i1_used, i2_used))
+return dir_none;
 }
 
   /* If both i1 and i2 are frame related, verify all the CFA notes
diff --git a/gcc/testsuite/gcc.target/mips/cfgcleanup-jalr1.c b/gcc/testsuite/gcc.target/mips/cfgcleanup-jalr1.c
new file mode 100644
index 000..24c1826
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/cfgcleanup-jalr1.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-mabicalls -fpic -mno-mips16 -mno-micromips" } */
+/* { dg-skip-if "needs codesize optimization" { *-*-* } { "-O0" "-O1" "-O2" "-O3" } { "" } } */
+
+extern void foo (void*);
+
+extern void bar (void*);
+
+void
+test (void* p)
+{
+   if (!p)
+   foo(p);
+   else
+   bar(p);
+}
+
+/* { dg-final { scan-assembler-not "\\\.reloc\t1f,R_MIPS_JALR,foo" } } */
+/* { dg-final { scan-assembler-not "\\\.reloc\t1f,R_MIPS_JALR,bar" } } */
diff --git a/gcc/testsuite/gcc.target/mips/cfgcleanup-jalr2.c b/gcc/testsuite/gcc.target/mips/cfgcleanup-jalr2.c
new file mode 100644
index 000..9fd75c9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/cfgcleanup-jalr2.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-mabicalls -fpic -mno-mips16 -mno-micromips" } */
+/* { dg-additional-options "-fno-inline -fipa-ra -mcompact-branches=never" } */
+/* { dg-skip-if "needs codesize optimization" { *-*-* } { "-O0" "-O1" "-O2" "-O3" } { "" } } */
+
+static int foo (void* p) { __asm__ (""::"r"(p):"$t0"); return 0; }
+
+static int bar (void* p) { return 1; }
+
+int
+test (void* p)
+{
+  int res = !p ? foo(p) : bar(p);
+
+  register int tmp __asm__("$t0") = -1;
+  __asm__ (""::"r"(tmp));
+
+  return res;
+}
+
+/* { dg-final { scan-assembler "\\\.reloc\t1f,R_MIPS_JALR,foo" } } */
+/* { dg-final { scan-assembler "\\\.reloc\t1f,R_MIPS_JALR,bar" } } */
+/* { dg-final { scan-assembler-not "\\.set\tnomacro\n\tjalr\t\\\$25" } } */
diff --git a/gcc/testsuite/gcc.target/mips/cfgcleanup-jalr3.c b/gcc/testsuite/gcc.target/mips/cfgcleanup-jalr3.c
new file mode 100644
index 000..580c6ec
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/cfgcleanup-jalr3.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-mabicalls -fpic -mno-mips16 -mno-micromips" } */
+/* { dg-additional-options "-fno-inline -fipa-ra -mcompact-branches=never" } */
+/* { dg-skip-if "needs codesize optimization" { *-*-* } { "-O0" "-O1" "-O2" "-O3" } { "" } } */
+
+static int foo (void* p) { return 0; }
+
+static int bar (void* p) { return 1; }
+
+int
+test (void* p)
+{
+  int res = !p ? foo(p) : bar(p);
+
+  register int tmp __asm__("$t0") = -1;
+  __asm__ (""::"r"(tmp));
+
+  return r

Re: Patch ping (Re: [PATCH] Fortran include line fixes and -fdec-include support)

2019-07-09 Thread Gerald Pfeifer
On Sat, 19 Jan 2019, Jakub Jelinek wrote:
>> how about the refinement below?
> LGTM.  Thanks.

The context has changed a bit since then (due to links being
added), so I had to manually re-apply the patch and committed
the following now.

Gerald

Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-9/changes.html,v
retrieving revision 1.69
diff -u -r1.69 changes.html
--- changes.html29 May 2019 22:22:01 -  1.69
+++ changes.html9 Jul 2019 19:48:01 -
@@ -740,10 +740,10 @@
   
   
   A new command-line option https://gcc.gnu.org/onlinedocs/gcc-9.1.0/gfortran/Fortran-Dialect-Options.html#index-fdec-include";>-fdec-include,
 set also
-by https://gcc.gnu.org/onlinedocs/gcc-9.1.0/gfortran/Fortran-Dialect-Options.html#index-fdec";>-fdec
 option, has been added for an extension
-for compatibility with legacy code.  With this option,
-INCLUDE directive is parsed also as a statement,
-which allows the directive to be written on multiple source lines
+by the https://gcc.gnu.org/onlinedocs/gcc-9.1.0/gfortran/Fortran-Dialect-Options.html#index-fdec";>-fdec
 option,
+has been added to increase compatibility with legacy code.  With this
+option, an INCLUDE directive is also parsed as a statement,
+which allows the directive to be spread across multiple source lines
 with line continuations.
   
   


Re: [wwwdocs] Buildstat update for 6.x

2019-07-09 Thread Gerald Pfeifer
On Wed, 27 Feb 2019, Tom G. Christensen wrote:
> Latest results for 6.x.

Applied, thank you!  Note, I had to manually apply the last
three hunks, since patch somehow did not like the format (w/o
me seeing anything obviously wrong):

> Testresults for 6.5.0:
>   hppa2.0w-hp-hpux11.11
>   hppa64-hp-hpux11.11
>   powerpc64le-unknown-linux-gnu
>   powerpc64-unknown-linux-gnu
>   x86_64-w64-mingw32

While we are at it, these patches coming in during my sabbatical
earlier this year (and sitting until now :-() do make me wonder 
whether you really wouldn't want to get write access yourself?
The offer stands. ;-)

Gerald


Re: [PATCH, Modula-2 (C/C++/D/F/Go/Jit)] (Register spec fn) (v2)

2019-07-09 Thread Gaius Mulley
Matthias Klose  writes:

>>  - libpth.{a,so} is installed in the system libdir, which
>>conflicts with the installation of the libpth packages
>>on most distros.
>
> found out that a system provided libpth can be used.  Otoh if you build the
> in-tree libpth, it shouldn't be installed, but built as a convenience library,
> like libgo using libffi, or libgphobos using zlib.

Hi Matthias,

as far as I know Redhat doesn't support libpth-dev - therefore it was
decided to include libpth in the gm2 tree and autodetect build/install
it as necessary.


regards,
Gaius


Re: [wwwdocs] Buildstat update for 7.x

2019-07-09 Thread Gerald Pfeifer
On Wed, 27 Feb 2019, Tom G. Christensen wrote:
> Latest results for 7.x.
> 
> -tgc
> 
> Testresults for 7.4.0:
>   x86_64-w64-mingw32

Thank you, applied (finally).

Gerald


Re: [PATCH] simplify-rtx.c (simplify_unary_operation_1): Change BITSIZE to PRECISION.

2019-07-09 Thread Jeff Law
On 7/8/19 1:11 AM, John Darrington wrote:
> gcc/
> * simplify-rtx.c (simplify_unary_operation_1): Change BITSIZE to PRECISION
>  in simplification of (extend ashiftrt (ashift ..)))  Otherwise the
>  gcc_assert can catch when dealing with partial int modes.
Thanks.  I edited the ChangeLog entry a bit and installed the patch on
the trunk.

jeff
> ---


Re: [patch, libfortran] Adjust block size for libgfortran for unformatted reads

2019-07-09 Thread Bernhard Reutner-Fischer
On Mon, 8 Jul 2019 09:05:04 -0700
Steve Kargl  wrote:

> On Mon, Jul 08, 2019 at 04:02:17PM +0300, Janne Blomqvist wrote:
> > 
> > Good point. If you happen to have a USB stick handy, can you try the
> > simple C benchmark program at
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91030#c38 ?
> > 
> > (the kernel will coalesce IO's by itself, so the granularity of IO
> > syscalls is not necessarily the same as the actual IO to devices.
> > Network filesystems like NFS/Lustre/GPFS may have less latitude here
> > due to coherency requirements etc.)

GFORTRAN_BUFFER_SIZE_FORMATTED sounds a bit odd, maybe
GFORTRAN_(UN)FORMATTED_BUFFER_SIZE would read more naturally.

And let me note 2 flaws in the benchmark:

>   left = N;
>   w = p;
>   t1 = walltime();
>   while (left > 0)
> {
>   if (left >= blocksize)
> to_write = blocksize;
>   else
> to_write = left;
> 
>   write (fd, w, blocksize);

1) this should write to_write, not blocksize, I assume.
2) you don't catch short writes

>   w += to_write;
>   left -= to_write;
> }

So, short of using iozone, it should probably be more like (modulo
typos):
  left = N;
  w = p;
  t1 = walltime();
  while (left > 0)
{
  if (left >= blocksize)
to_write = blocksize;
  else
to_write = left;
  while (to_write > 0) {
errno = 0;
ssize_t wrote = write (fd, w, to_write);
if (wrote < 0) {
  if (errno == EINTR) /* retry EINTR */
continue;
  left = 0; /* bail: this also ends the outer loop */
  break;
}
w += wrote;
left -= wrote;
to_write -= wrote;
  }
}
thanks,
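
For reference, the same idea as a self-contained C function. This is a
sketch rather than the benchmark code: write_in_blocks_sketch is a
hypothetical name, it retries on EINTR without advancing the pointer, and
after a short write it simply issues the next write from the new position
instead of finishing the current block first.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Write N bytes from BUF to FD in chunks of at most BLOCKSIZE bytes.
   Returns 0 on success and -1 on error, with errno describing it.  */
static int
write_in_blocks_sketch (int fd, const char *buf, size_t n, size_t blocksize)
{
  const char *w = buf;
  size_t left = n;

  if (blocksize == 0)
    {
      errno = EINVAL;
      return -1;
    }

  while (left > 0)
    {
      size_t to_write = left >= blocksize ? blocksize : left;
      ssize_t wrote = write (fd, w, to_write);

      if (wrote < 0)
        {
          if (errno == EINTR)
            continue;           /* Interrupted: retry the same chunk.  */
          return -1;            /* Any other error: give up.  */
        }

      /* A short write just advances by what was actually written.  */
      w += wrote;
      left -= (size_t) wrote;
    }

  return 0;
}
```

A caller timing the loop would wrap the call between two walltime ()
readings as in the original benchmark and check the return value.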


Re: [PING][PATCH] constrain one character optimization to one character stores (PR 90989)

2019-07-09 Thread Jeff Law
On 7/8/19 8:37 PM, Martin Sebor wrote:
> Ping: https://gcc.gnu.org/ml/gcc-patches/2019-06/msg01506.html
> 
> Jeff (et al.), do you have any outstanding questions/concerns
> about the patch?
Sorry I wasn't clear.  All my concerns are resolved.  The patch is fine
for the trunk.

jeff
> 


Re: [PATCH 3/3] change class-key of PODs to struct and others to class (PR 61339)

2019-07-09 Thread Martin Sebor

On 7/9/19 9:17 AM, Richard Sandiford wrote:

Martin Sebor  writes:

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index cfc41e1ed86..625d5b17413 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6428,7 +6428,7 @@ extern tree get_scope_of_declarator   (const cp_declarator *);
  extern void grok_special_member_properties(tree);
  extern bool grok_ctor_properties  (const_tree, const_tree);
  extern bool grok_op_properties(tree, bool);
-extern tree xref_tag   (enum tag_types, tree, tag_scope, bool, bool * = NULL);
+extern tree xref_tag   (enum tag_types, tree, tag_scope, bool);
  extern tree xref_tag_from_type(tree, tree, tag_scope);
  extern void xref_basetypes(tree, tree);
  extern tree start_enum(tree, tree, tree, tree, bool, bool *);
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 005f99a6e15..9accc3d141b 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -14119,7 +14119,7 @@ lookup_and_check_tag (enum tag_types tag_code, tree 
name,
  
  static tree

  xref_tag_1 (enum tag_types tag_code, tree name,
-tag_scope scope, bool template_header_p, bool *new_p)
+tag_scope scope, bool template_header_p)
  {
enum tree_code code;
tree context = NULL_TREE;
@@ -14151,9 +14151,6 @@ xref_tag_1 (enum tag_types tag_code, tree name,
if (t == error_mark_node)
  return error_mark_node;
  
-  /* Let the caller know this is a new type.  */

-  *new_p = t == NULL_TREE;
-
if (scope != ts_current && t && current_class_type
&& template_class_depth (current_class_type)
&& template_header_p)
@@ -14215,7 +14212,6 @@ xref_tag_1 (enum tag_types tag_code, tree name,
  scope = ts_current;
}
  t = pushtag (name, t, scope);
- *new_p = true;
}
  }
else
@@ -14267,13 +14263,11 @@ xref_tag_1 (enum tag_types tag_code, tree name,
  
  tree

  xref_tag (enum tag_types tag_code, tree name,
-  tag_scope scope, bool template_header_p, bool *new_p /* = NULL */)
+  tag_scope scope, bool template_header_p)
  {
bool dummy;
-  if (!new_p)
-new_p = &dummy;
bool subtime = timevar_cond_start (TV_NAME_LOOKUP);
-  tree ret = xref_tag_1 (tag_code, name, scope, template_header_p, new_p);
+  tree ret = xref_tag_1 (tag_code, name, scope, template_header_p);
timevar_cond_stop (TV_NAME_LOOKUP, subtime);
return ret;
  }
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 52af8c0c6d6..d16bf253058 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -28193,8 +28193,6 @@ cp_parser_template_declaration_after_parameters 
(cp_parser* parser,
   member_p,
 
/*explicit_specialization_p=*/false,
   &friend_p);
-  // maybe_warn_struct_vs_class (token->location, TREE_TYPE (decl));
-
pop_deferring_access_checks ();
  
/* If this is a member template declaration, let the front


Looks like this might have been part of 1/3.


Yes, this and a few other hunks didn't belong in this patch.
I removed them, retested the patch, and committed r273311.



OK otherwise.  Thanks again for doing this.

(I guess a lot of these tags could be removed, but that was just as true
before the patch, so it's still a strict improvement.)


Most could be removed and my own preference would have been to
remove them.  The warning has a mechanism for figuring out which
ones can go and which ones are needed and I considered
making use of it.  In the end I decided to be conservative and
keep them in case someone preferred it that way.  Making
the change now that the cleanup is done will be slightly more
involved.  I suppose we could add yet another warning to find
them: -Wredundant-tag.

Martin


Re: [PATCH, committed] PowerPC Prefixed Memory, Patch #5 (move create_TOC_reference)

2019-07-09 Thread Segher Boessenkool
On Tue, Jul 09, 2019 at 01:32:26PM -0400, Michael Meissner wrote:
> On Mon, Jul 08, 2019 at 01:53:13PM -0500, Segher Boessenkool wrote:
> > Please do; as a separate patch.  Thanks in advance.  A patch purely moving
> > it back is pre-approved.
> 
> I just committed this patch:
> 
> 2019-07-09  Michael Meissner  
> 
>   * config/rs6000/rs6000-internal.h (create_TOC_reference): Delete.
>   * config/rs6000/rs6000-logue.c (create_TOC_reference): Move
>   function from rs6000-logue.c back to rs6000.c.
>   * config/rs6000/rs6000.c (create_TOC_reference): Likewise.

Thanks!


Segher


Re: [PATCH] Deprecate -frepo option.

2019-07-09 Thread Nathan Sidwell

On 7/9/19 9:00 AM, Martin Liška wrote:

On 7/9/19 1:41 PM, Nathan Sidwell wrote:

On 7/9/19 6:39 AM, Richard Biener wrote:

On Mon, Jul 8, 2019 at 2:04 PM Martin Liška  wrote:






The same also happens with GCC 7.  It does 17 iterations (#define MAX_ITERATIONS 17) and
apparently 17 is not enough to resolve all symbols.  And it's really slow.


Ouch.


Hm, 17 is a magic number.  In C++98 it was the maximum depth of template
instantiations that implementations needed to support.  Portable code could not
expect more.  So the worst case -frepo behaviour would be 17 iterations.

That's not true any more; it's been 1024 since C++11.

Has a bug been filed about this frepo problem?


I created a new one:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91125


If not, it suggests those using frepo are not compiling modern C++.


That said, I would recommend to remove it :)


In the end it's up to the C++ FE maintainers but the above clearly
doesn't look promising (not sure if it keeps re-compiling _all_
repo-triggered templates or just incrementally adds them to new
object files).



I'm not opposed to removing -frepo from GCC 10 but then I would start
noting it is obsolete on the GCC 9 branch at least.


I concur.  frepo's serial reinvocation of the compiler is not compatible with 
modern C++ code bases.


Great. Then I'm sending a patch that removes the functionality.

Ready to be installed after proper testing & bootstrap?


I'd like Jason to render an opinion, and we should mark it obsolete in 
the gcc 9 branch (how much runway would that give people?)


nathan

--
Nathan Sidwell


Re: [C++ PATCH] PR c++/90590 Suppress warning for enumeration value not handled in switch warning

2019-07-09 Thread Marek Polacek
On Tue, Jul 09, 2019 at 11:18:53AM -0400, Matthew Beliveau wrote:
> index 000..8aa65cf0afd
> --- /dev/null
> +++ gcc/testsuite/c-c++-common/pr90590-2.c
> @@ -0,0 +1,8 @@
> +#include "pr90590-2.h"
> +
> +void
> +fn ()
> +{
> +  switch (c.b) // { dg-bogus "enumeration value" }
> +;
> +}

I suppose this test should also have
// PR c++/90590
// { dg-options -Wswitch }
because without -Wswitch the warning wouldn't trigger in any case.

Marek


Re: [PATCH, Modula-2 (C/C++/D/F/Go/Jit)] (Register spec fn) (v2)

2019-07-09 Thread Matthias Klose
On 09.07.19 17:53, Gaius Mulley wrote:
> Rainer Orth  writes:
> 
>> Hi Matthias,
>>
>>> I had a look at the GCC 9 version of the patches, with a build including a
>>> make install. Some comments:
>>>
>>>  - A parallel build (at least with -j4) isn't working. A sequential
>>>    build works fine.  I think forcing a sequential build will not
>>>    work well, increasing the build time too much.
>>
>> absolutely: I'd go as far as claiming that this is the number one
>> priority.  Otherwise build and test times are just too long for all but
>> the most dedicated testers, and forcing a sequential build would be a
>> showstopper for trunk integration.
> 
> Hi,
> 
> Many thanks for all the feedback/bugs/patches.
> I've been working through some of these.  The parallel build is now done.

It seems to work, however I see

make[11]: Entering directory '/home/packages/gcc/9/u/gcc-9-9.1.1/build/x86_64-linux-gnu/32/libgm2/libpth/pth'
make[11]: warning: jobserver unavailable: using -j1.  Add '+' to parent make rule.
/bin/bash: line 0: test: !=: unary operator expected
../../../../../../src/libgm2/libpth/pth/shtool scpp -o pth_p.h -t
../../../../../../src/libgm2/libpth/pth/pth_p.h.in -Dcpp -Cintern -M '==#==' \

not sure what's going wrong.

In both gcc/gm2 and libgm2, there are explicit calls to make, which probably
should be replaced by $(MAKE).




[PATCH, committed] PowerPC Prefixed Memory, Patch #5 (move create_TOC_reference)

2019-07-09 Thread Michael Meissner
On Mon, Jul 08, 2019 at 01:53:13PM -0500, Segher Boessenkool wrote:
> Please do; as a separate patch.  Thanks in advance.  A patch purely moving
> it back is pre-approved.

I just committed this patch:

2019-07-09  Michael Meissner  

* config/rs6000/rs6000-internal.h (create_TOC_reference): Delete.
* config/rs6000/rs6000-logue.c (create_TOC_reference): Move
function from rs6000-logue.c back to rs6000.c.
* config/rs6000/rs6000.c (create_TOC_reference): Likewise.

Index: gcc/config/rs6000/rs6000-internal.h
===
--- gcc/config/rs6000/rs6000-internal.h (revision 273308)
+++ gcc/config/rs6000/rs6000-internal.h (working copy)
@@ -92,7 +92,6 @@ extern void rs6000_emit_prologue_compone
 extern void rs6000_emit_epilogue_components (sbitmap components);
 extern void rs6000_set_handled_components (sbitmap components);
 extern rs6000_stack_t * rs6000_stack_info (void);
-extern rtx create_TOC_reference (rtx symbol, rtx largetoc_reg);
 extern rtx rs6000_got_sym (void);
 extern struct machine_function *rs6000_init_machine_status (void);
 extern bool save_reg_p (int reg);
Index: gcc/config/rs6000/rs6000-logue.c
===
--- gcc/config/rs6000/rs6000-logue.c(revision 273308)
+++ gcc/config/rs6000/rs6000-logue.c(working copy)
@@ -1406,41 +1406,6 @@ uses_TOC (void)
 }
 #endif
 
-rtx
-create_TOC_reference (rtx symbol, rtx largetoc_reg)
-{
-  rtx tocrel, tocreg, hi;
-
-  if (TARGET_DEBUG_ADDR)
-{
-  if (SYMBOL_REF_P (symbol))
-   fprintf (stderr, "\ncreate_TOC_reference, (symbol_ref %s)\n",
-XSTR (symbol, 0));
-  else
-   {
- fprintf (stderr, "\ncreate_TOC_reference, code %s:\n",
-  GET_RTX_NAME (GET_CODE (symbol)));
- debug_rtx (symbol);
-   }
-}
-
-  if (!can_create_pseudo_p ())
-df_set_regs_ever_live (TOC_REGISTER, true);
-
-  tocreg = gen_rtx_REG (Pmode, TOC_REGISTER);
-  tocrel = gen_rtx_UNSPEC (Pmode, gen_rtvec (2, symbol, tocreg), 
UNSPEC_TOCREL);
-  if (TARGET_CMODEL == CMODEL_SMALL || can_create_pseudo_p ())
-return tocrel;
-
-  hi = gen_rtx_HIGH (Pmode, copy_rtx (tocrel));
-  if (largetoc_reg != NULL)
-{
-  emit_move_insn (largetoc_reg, hi);
-  hi = largetoc_reg;
-}
-  return gen_rtx_LO_SUM (Pmode, hi, tocrel);
-}
-
 /* Issue assembly directives that create a reference to the given DWARF
FRAME_TABLE_LABEL from the current function section.  */
 void
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 273308)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -7735,6 +7735,45 @@ constant_pool_expr_p (rtx op)
  && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (base), Pmode));
 }
 
+/* Create a TOC reference for symbol_ref SYMBOL.  If LARGETOC_REG is non-null,
+   use that as the register to put the HIGH value into if register allocation
+   is already done.  */
+
+rtx
+create_TOC_reference (rtx symbol, rtx largetoc_reg)
+{
+  rtx tocrel, tocreg, hi;
+
+  if (TARGET_DEBUG_ADDR)
+{
+  if (SYMBOL_REF_P (symbol))
+   fprintf (stderr, "\ncreate_TOC_reference, (symbol_ref %s)\n",
+XSTR (symbol, 0));
+  else
+   {
+ fprintf (stderr, "\ncreate_TOC_reference, code %s:\n",
+  GET_RTX_NAME (GET_CODE (symbol)));
+ debug_rtx (symbol);
+   }
+}
+
+  if (!can_create_pseudo_p ())
+df_set_regs_ever_live (TOC_REGISTER, true);
+
+  tocreg = gen_rtx_REG (Pmode, TOC_REGISTER);
+  tocrel = gen_rtx_UNSPEC (Pmode, gen_rtvec (2, symbol, tocreg), 
UNSPEC_TOCREL);
+  if (TARGET_CMODEL == CMODEL_SMALL || can_create_pseudo_p ())
+return tocrel;
+
+  hi = gen_rtx_HIGH (Pmode, copy_rtx (tocrel));
+  if (largetoc_reg != NULL)
+{
+  emit_move_insn (largetoc_reg, hi);
+  hi = largetoc_reg;
+}
+  return gen_rtx_LO_SUM (Pmode, hi, tocrel);
+}
+
 /* These are only used to pass through from print_operand/print_operand_address
to rs6000_output_addr_const_extra over the intervening function
output_addr_const which is not target code.  */

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797



Re: [PATCH,RFC,V3 0/5] Support for CTF in GCC

2019-07-09 Thread Indu Bhagat




On 07/04/2019 03:43 AM, Richard Biener wrote:

Hmm...a GCC plugin for CTF generation at compile-time may work out for a single
compilation unit.  But I am not sure how LTO will be supported in that case.
Basically, for LTO and -gtLEVEL to work together, I need the lto-wrapper to be
aware of the presence of .ctf sections (so I think). I will need to combine the
.ctf sections from multiple compilation units into a CTF archive, which the
linker can then de-duplicate.

True.  lto-wrapper does this kind of dancing for the much more complex set of
DWARF sections already.


Even if I assume that the technical hurdle in the above paragraph is solvable
within the purview of a plugin, I fear worse problems of adoption, maintenance
and distribution in the long run, if CTF support unfortunately ever remains to
be done via a plugin for reasons unforeseen.

Going the plugin route for the short term will continue to suffer similar
problems of distribution and support.

- Is the plugin infrastructure supported on most platforms?  Also, I see that
the plugin infrastructure supports all GCC versions from 4.5 onwards.
Can someone confirm?  (We minimally want toolchain support with
GCC 4.8.5 and GCC 8 and later, for now.)

The infrastructure is quite old but you'd need new invocation hooks so this
won't help.



OK then.  I will continue to focus on my current implementation without
exploring the plugin option at this time.  Thanks for confirming.

Indu



Re: [PATCH, Modula-2 (C/C++/D/F/Go/Jit)] (Register spec fn) (v2)

2019-07-09 Thread Gaius Mulley
Matthias Klose  writes:

> And an unrelated one: You are introducing python2 as a build-dependency.
> Afaics, it's only one invocation
>
> python ../../src/gcc/gm2/tools-src/def2texi.py -uLibraries -s../../src/gcc/gm2
> -b/home/packages/gcc/9/u/gcc-9-9.1.1/build/gcc/gm2 >
> /home/packages/gcc/9/u/gcc-9-9.1.1/build/gcc/gm2/gm2-libs.texi
>
> but there might be more implicit python calls via a shebang line.
>
> Also it's python2, not ready to run with python3.  I think you should not rely
> on python2 only code anymore.

Hi Matthias,

ah yes, true - I'll convert the scripts to Python 3

regards,
Gaius


Re: [PATCH V4] PR88497 - Extend reassoc for vector bit_field_ref

2019-07-09 Thread Segher Boessenkool
On Tue, Jul 09, 2019 at 10:28:06AM +0800, Kewen.Lin wrote:
> on 2019/7/9 上午12:32, Segher Boessenkool wrote:
> > On Mon, Jul 08, 2019 at 04:07:00PM +0800, Kewen.Lin wrote:
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr88497-1.c
> >> @@ -0,0 +1,60 @@
> >> +/* { dg-do run } */
> >> +/* { dg-require-effective-target vect_double } */
> >> +/* { dg-require-effective-target powerpc_vsx_ok { target { powerpc*-*-* } } } */
> > 
> > For "dg-do run" tests, you need "powerpc_vsx_hw".  "_ok" only tests if
> > the assembler can handle VSX instructions, not whether the test system
> > can run them.  (powerpc_vsx_ok is what you need for "dg-do assemble" or
> > "dg-do link" tests.  It also tests if you can use -mvsx, but that doesn't
> > do what you might hope it does: you can use -mvsx together with a -mcpu=
> > that doesn't support VSX, for example).
> 
> Thanks, I will update it.  But sorry, I can't find "powerpc_vsx_hw", only
> "vsx_hw_available".  I guess it's the one you are referring to?

Yeah, sorry.  You can also use just "vsx_hw".

> And I happened
> to find the vect_double will force powerpc to check vsx_hw_available.

Yes :-)  So this whole line is unnecessary.


Segher


Re: [PATCH] Restrict LOOP_ALIGN to loop headers only.

2019-07-09 Thread Richard Biener
On July 9, 2019 3:10:19 PM GMT+02:00, Michael Matz  wrote:
>Hi,
>
>On Tue, 9 Jul 2019, Richard Biener wrote:
>
>> > So a "backedge" in this sense would be e->dest->index < e->src->index.
>> > No?
>> 
>> To me the following would make sense.
>
>The basic block index is not a DFS index, so no, that's not a test for 
>backedge.

I think in CFG RTL mode the BB index designates the order of the BBs in the 
object file? So this is a way to identify backwards jumps? 
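
To make the distinction concrete, here is a small sketch comparing the
two notions on a toy CFG.  Everything in it is an assumption for
illustration: the struct edge type, the four-block graph and the helper
names are made up and are not GCC's basic_block/edge data structures.
The block numbers can be read as a layout order, so the index test finds
backwards jumps in that layout, while the DFS finds edges that target a
block still on the DFS stack.

```c
#include <stdbool.h>

enum { N_BLOCKS = 4, N_EDGES = 4 };

struct edge
{
  int src, dest;
  bool dfs_back;		/* Set by mark_back_edges.  */
};

/* Toy CFG: 0 -> 2, 2 -> 1, 1 -> 2, 1 -> 3.  The loop is 2 -> 1 -> 2,
   with block 2 as its header.  */
static struct edge edges[N_EDGES] = {
  { 0, 2, false }, { 2, 1, false }, { 1, 2, false }, { 1, 3, false }
};

enum color { WHITE, GREY, BLACK };
static enum color state[N_BLOCKS];

/* Depth-first walk from BB; an edge whose destination is still GREY
   (on the DFS stack) is a DFS back edge.  */
static void
dfs (int bb)
{
  state[bb] = GREY;
  for (int i = 0; i < N_EDGES; i++)
    {
      if (edges[i].src != bb)
	continue;
      if (state[edges[i].dest] == GREY)
	edges[i].dfs_back = true;
      else if (state[edges[i].dest] == WHITE)
	dfs (edges[i].dest);
    }
  state[bb] = BLACK;
}

/* Reset all marks and classify every edge reachable from block 0.  */
static void
mark_back_edges (void)
{
  for (int i = 0; i < N_BLOCKS; i++)
    state[i] = WHITE;
  for (int i = 0; i < N_EDGES; i++)
    edges[i].dfs_back = false;
  dfs (0);
}

/* The index comparison discussed in the thread.  */
static bool
index_says_backward (const struct edge *e)
{
  return e->dest < e->src;
}
```

On this graph the two tests disagree in both directions: 2 -> 1 has a
smaller destination index but is an ordinary tree edge, while 1 -> 2 is
the DFS back edge of the loop even though its destination index is
larger.
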

Richard. 

>
>Ciao,
>Michael.



Re: [PATCH 2/3] change class-key of PODs to struct and others to class (PR 61339)

2019-07-09 Thread Martin Sebor

On 7/9/19 7:48 AM, Richard Sandiford wrote:

Martin Sebor  writes:

Hopefully with the right patch this time (thanks Jon).

On 7/8/19 4:00 PM, Martin Sebor wrote:

The attached patch changes the class-key of class definitions that
satisfy the requirements on a POD struct to 'struct', and that of
struct definitions that aren't POD to class, according to the GCC
coding convention.  The patch is also prerequisite for GCC being
able to compile cleanly with -Wmismatched-tags.

I made the changes building GCC with -Wstruct-not-pod and
-Wclass-is-pod enabled, scanning the build log for instances
of each warning, and using a script replacing the class-key
as necessary and adjusting the access of the members declared
immediately after the class-head.

Martin


PR c++/61339 - add mismatch between struct and class [-Wmismatched-tags] to 
non-bugs


...

gcc/lto/ChangeLog:

* lto-dump.c: Same.


Need to cut-&-paste the description for this one.


Done.

...


libcpp/ChangeLog:

* include/line-map.h: Change class-key from class to struct and vice
versa to match convention and avoid -Wclass-is-pod and -Wstruct-no-pod.
* mkdeps.c: Same.yyy


s/yyy// :-)


Ditto.



The changelog format is outdoing itself in usefulness here...


diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 18839a4a5ec..ca2a34afbae 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -100,7 +100,7 @@ enum symbol_partitioning_class
  
  /* Base of all entries in the symbol table.

 The symtab_node is inherited by cgraph and varpol nodes.  */
-class GTY((desc ("%h.type"), tag ("SYMTAB_SYMBOL"),
+struct GTY((desc ("%h.type"), tag ("SYMTAB_SYMBOL"),
   chain_next ("%h.next"), chain_prev ("%h.previous")))
symtab_node


Second line should get an extra space of indentation.


@@ -1673,8 +1675,10 @@ struct GTY(()) cgraph_indirect_call_info
unsigned vptr_changed : 1;
  };
  
-struct GTY((chain_next ("%h.next_caller"), chain_prev ("%h.prev_caller"),

-   for_user)) cgraph_edge {
+class GTY((chain_next ("%h.next_caller"), chain_prev ("%h.prev_caller"),
+   for_user)) cgraph_edge


Similarly one fewer space here.


Done.  My simple script handles just one space issue but not this
one.

...

diff --git a/gcc/rtl.h b/gcc/rtl.h
index 31fba823435..fc1a66416cc 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -589,7 +594,7 @@ class GTY(()) rtx_nonjump_insn : public rtx_insn
   from rtl.def.  */
  };

-class GTY(()) rtx_jump_insn : public rtx_insn
+struct GTY(()) rtx_jump_insn : public rtx_insn
  {
  public:
/* No extra fields, but adds the invariant:
@@ -533,7 +538,7 @@ is_a_helper ::test (const_rtx rt)
return rt->code == SEQUENCE;
  }

-class GTY(()) rtx_insn : public rtx_def
+struct GTY(()) rtx_insn : public rtx_def
  {
  public:
/* No extra fields, but adds the invariant:


Might as well get rid of these "public:"s too, unless you feel they
should be kept.


I think my script did that but gengtype choked on the struct when
it had no members so I had to put it back.  I'll try to remember
to reproduce it and open a bug for it.



OK with those changes (or without the last one), thanks.


Committed in r273308.

Martin


Re: [PATCH 1/2] [ARC] Fix and refurbish the interrupts.

2019-07-09 Thread claziss
Hi Jeff,

Please find attached the updated patch.

What is new:
- mailing list feedback is taken into account.
- some comments are updated.
- a new test is added.
- the ARC AUX registers used by ZOL (hardware loop) and FPX (a custom
floating point implementation) are saved before fp-register.
- the millicode optimization is not used by ISR.


Thank you,
Claudiu
From d22368681b7aab4bef4b5c32a9a472808f2c16cd Mon Sep 17 00:00:00 2001
From: Claudiu Zissulescu 
Date: Fri, 17 May 2019 14:48:17 +0300
Subject: [PATCH] [ARC] Fix and refurbish the interrupts.

When entering an interrupt, not only the call-saved registers need to
be placed on the stack but also the call-clobbered ones.  Moreover, the
ARC700 return-from-interrupt instruction needs to be rtie, the same
as for ARCv2 CPUs, while the ARC6xx family uses the j.f [ilinkX]
instruction.  Additionally, we need to save the state of the ZOL
machinery, namely the lp_count, lp_end and lp_start registers.  For
architectures which are using extension registers (i.e., HS48) we need
to save/restore them as well.

gcc/
-xx-xx  Claudiu Zissulescu  

	* config/arc/arc-protos.h (arc_output_function_epilogue): Delete
	declaration.
	(arc_compute_frame_size): Millicode is disabled when compiling
	ISR.
	(arc_return_address_register): Likewise.
	(arc_compute_function_type): Likewise.
	(arc_compute_frame_size): Likewise.
	(secondary_reload_info): Likewise.
	(arc_get_unalign): Likewise.
	(arc_can_use_return_insn): Declare.
	* config/arc/arc.c (AUX_LP_START): Define.
	(AUX_LP_END): Likewise.
	(arc_frame_info): Update gmask member to 64-bit datum.
	(GMASK_LEN): Update.
	(arc_compute_function_type): Make it static, move it forward.
	(arc_must_save_register): Update, consider the extra regs.
	(arc_compute_millicode_save_restore_regs): Update to use the 64
	bit gmask.
	(arc_compute_frame_size): Likewise.
	(arc_enter_leave_p): Likewise.
	(arc_save_callee_saves): Likewise.
	(arc_restore_callee_saves): Likewise.
	(arc_save_callee_enter): Likewise.
	(arc_restore_callee_leave): Likewise.
	(arc_save_callee_milli): Likewise.
	(arc_restore_callee_milli): Likewise.
	(arc_expand_prologue): Add new interrupt handling.
	(arc_return_address_register): Make it static, move it forward.
	(arc_expand_epilogue): Add new interrupt handling.
	(arc_get_unalign): Delete.
	(arc_epilogue_uses): Make sure we do not remove the extra
	saved/restored registers when interrupt.
	(arc_can_use_return_insn): New function.
	(push_reg): Likewise.
	(pop_reg): Likewise.
	(arc_save_callee_saves): Add ZOL and FPX aux registers saving
	procedures.
	(arc_restore_callee_saves): Likewise, but restoring.
	* config/arc/arc.md (VUNSPEC_ARC_ARC600_RTIE): Define.
	(R33_REG): Likewise.
	(R34_REG): Likewise.
	(R35_REG): Likewise.
	(R36_REG): Likewise.
	(R37_REG): Likewise.
	(R38_REG): Likewise.
	(R39_REG): Likewise.
	(R45_REG): Likewise.
	(R46_REG): Likewise.
	(R47_REG): Likewise.
	(R48_REG): Likewise.
	(R49_REG): Likewise.
	(R50_REG): Likewise.
	(R51_REG): Likewise.
	(R52_REG): Likewise.
	(R53_REG): Likewise.
	(R54_REG): Likewise.
	(R55_REG): Likewise.
	(R56_REG): Likewise.
	(R58_REG): Likewise.
	(type): Add rtie attribute.
	(in_call_delay_slot): Use RETURN_ADDR_REGNUM.
	(movsi_insn): Accept moves to lp_count.
	(rtie): Update pattern.
	(simple_return): Simplify it, don't use this pattern as a return
	from an interrupt.
	(arc600_rtie): New pattern.
	(p_return_i): Clean up.
	(return): Likewise.
	* config/arc/builtins.def (rtie): Only available for non ARC6xx
	family CPUs.
	* config/arc/predicates.md (move_src_operand): Consider lp_count
	as a register.

gcc/testsuite
-xx-xx  Claudiu Zissulescu  

	* gcc.target/arc/arc.exp (check_effective_target_accregs): New
	predicate.
	* gcc.target/arc/builtin_special.c: Update test.
	* gcc.target/arc/interrupt-1.c: Likewise.
	* gcc.target/arc/interrupt-10.c: New test.
	* gcc.target/arc/interrupt-11.c: Likewise.
	* gcc.target/arc/interrupt-12.c: Likewise.
---
 gcc/config/arc/arc-protos.h   |   7 +-
 gcc/config/arc/arc.c  | 741 +++---
 gcc/config/arc/arc.md | 139 ++--
 gcc/config/arc/builtins.def   |   2 +-
 gcc/config/arc/predicates.md  |   2 +
 gcc/testsuite/gcc.target/arc/arc.exp  |  18 +
 .../gcc.target/arc/builtin_special.c  |   2 +
 gcc/testsuite/gcc.target/arc/interrupt-1.c|   4 +-
 gcc/testsuite/gcc.target/arc/interrupt-10.c   |  36 +
 gcc/testsuite/gcc.target/arc/interrupt-11.c   |  16 +
 gcc/testsuite/gcc.target/arc/interrupt-12.c   |  16 +
 11 files changed, 628 insertions(+), 355 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arc/interrupt-10.c
 create mode 100644 gcc/testsuite/gcc.target/arc/interrupt-11.c
 create mode 100644 gcc/testsuite/gcc.target/arc/interrupt-12.c

diff --git a/gcc/config/arc/arc-protos.h b/gcc/config/arc/arc-protos.h
index f501bc30ee7..0c9f422827d 100644
--- a/gcc/config/arc/arc-protos.h
+++ b/gcc/config/arc/arc-protos.h
@@ -25,7 +25,6

Re: [PATCH, Modula-2 (C/C++/D/F/Go/Jit)] (Register spec fn) (v2)

2019-07-09 Thread Matthias Klose
On 09.07.19 15:41, Gaius Mulley wrote:
> Matthias Klose  writes:
> 
>>> the libraries ./usr/lib/x86_64-linux-gnu/lib{ulm,pim,gm2,cor,iso,min}.a
>>> are not needed the correct locations of the static libraries are:
>>>
>>> ./usr/lib/gcc/x86_64-linux-gnu/9/m2/ulm/libulm.a
>>> ./usr/lib/gcc/x86_64-linux-gnu/9/m2/min/libmin.a
>>> ./usr/lib/gcc/x86_64-linux-gnu/9/m2/cor/libcor.a
>>> ./usr/lib/gcc/x86_64-linux-gnu/9/m2/pim/libgm2.a
>>> ./usr/lib/gcc/x86_64-linux-gnu/9/m2/log/liblog.a
>>> ./usr/lib/gcc/x86_64-linux-gnu/9/m2/iso/libiso.a
>>
>> ok, then I assume that could be the place for the .so files as well (not the
>> .so.* files/links), if I don't want to install those in the system libdir.
> 
> yes sure this should work fine,

but then you'll end up with separate m2 subdirs for each multilib build?  That's
maybe not what you want.

And an unrelated one: You are introducing python2 as a build-dependency.
Afaics, it's only one invocation

python ../../src/gcc/gm2/tools-src/def2texi.py -uLibraries -s../../src/gcc/gm2
-b/home/packages/gcc/9/u/gcc-9-9.1.1/build/gcc/gm2 >
/home/packages/gcc/9/u/gcc-9-9.1.1/build/gcc/gm2/gm2-libs.texi

but there might be more implicit python calls via a shebang line.

Also it's python2, not ready to run with python3.  I think you should not rely
on python2 only code anymore.

Matthias


Re: [PATCH, Modula-2 (C/C++/D/F/Go/Jit)] (Register spec fn) (v2)

2019-07-09 Thread Gaius Mulley
Rainer Orth  writes:

> Hi Matthias,
>
>> I had a look at the GCC 9 version of the patches, with a build including a
>> make install. Some comments:
>>
>>  - A parallel build (at least with -j4) isn't working. A sequential
>>    build works fine.  I think forcing a sequential build will not
>>    work well, increasing the build time too much.
>
> absolutely: I'd go as far as claiming that this is the number one
> priority.  Otherwise build and test times are just too long for all but
> the most dedicated testers, and forcing a sequential build would be a
> showstopper for trunk integration.

Hi,

Many thanks for all the feedback/bugs/patches.
I've been working through some of these.  The parallel build is now done.

> The same holds for the current requirement of a non-bootstrap build.  At
> least that's what I saw initially: it may be that it works sequentially,
> but haven't tried since the build time was way too long already.
>
>>  - libgm2 multilib builds are not working.  //32/libgm2
>>is configured, but not built.
>
> True, but the fix is a simple one-liner:
>
> --- ../../../m2/dist/gcc-versionno/libgm2/Makefile.am 2019-06-06 
> 15:17:19.634469354 +
> +++ libgm2/Makefile.am2019-07-09 00:41:23.214142811 +
> @@ -97,3 +97,5 @@
>  
>  # Subdir rules rely on $(FLAGS_TO_PASS)
>  FLAGS_TO_PASS = $(AM_MAKEFLAGS)
> +
> +include $(top_srcdir)/../multilib.am

thanks - applied to the master and 9.1.0 branch of gm2.

> This allowed me to build both 32 and 64-bit gm2 libs on
> i386-pc-solaris2.11 and get the testresults I reported earlier, which
> are identical for -m32 and -m64.
>
> Here are a couple of other issues I saw:
>
> * There are many many warnings during the build in the gcc/gm2 code.
>
> * The mc output is far too verbose right now: this isn't of interest to
>   anyone but gm2 developers ;-)

added --quiet to all invocations of mc on master - will apply to 9.1.0
soon.

> * Running make check-gm2 in gcc produces gm2 testsuite output directly
>   in gcc/testsuite.  This needs to go into a testsuite/gm2 subdir (or
>   gm2 once the testsuite is parallelized: it is far too large to only
>   run sequentially).
>
> * Many tests FAIL like this:
>
> ESC[01mESC[Kxgm2:ESC[mESC[K ESC[01;31mESC[Kfatal error: ESC[mESC[Kcannot 
> execute �<80><98>ESC[01mESC[Kgm2lESC[mESC[K�<80><99>: execvp: No such file or 
> directory
> compilation terminated.
> compiler exited with status 1
> FAIL: gm2/calling-c/datatypes/unbounded/run/pass/m.mod compilation,  -g 
>
>   For one, I didn't have gm2l anywhere in my tree.  Besides, the tests
>   absolutely need to be run with -fno-diagnostics-show-caret
>   -fno-diagnostics-show-line-numbers -fdiagnostics-color=never

applied to master and 9.1.0.

>   This problem seems to account for the vast majority of failing tests
>   right now:
>
>6820 xgm2: fatal error: cannot execute ‘gm2l’: execvp: No such file or 
> directory
>   6 xgm2: fatal error: no input files
>
>   gm2l and a couple of other tools are built by gm2/Make-lang.in's
>   gm2.all.build rule, which seems not to be referenced anywhere.
>   Even after manually building them, they stay in stage1/gm2 and need a
>   make gm2l to be copied into gcc/gm2.  This all needs to work without
>   such manual steps or without installing gm2 first.

I've rewritten some of the top-level targets to build these tools
(applied/pushed on master); will apply to 9.1.0.

> * There are a couple of broken testcase names in gm2.sum, e.g.
>
> PASS: 
> /vol/gcc/src/hg/trunk/solaris/gcc/testsuite/gm2/pim/options/optimize/run/pass/addition.mod
>  compilation, -g 
> {compiler=/var/gcc/gcc-10.0.0-20190708/11.5-gcc-gas-gm2-no-bootstrap-j1/gcc/xgm2
>  -B/var/gcc/gcc-10.0.0-20190708/11.5-gcc-gas-gm2-no-bootstrap-j1/gcc 
> -I/var/gcc/gcc-10.0.0-20190708/11.5-gcc-gas-gm2-no-bootstrap-j1/i386-pc-solaris2.11/./libgm2/libpim:/vol/gcc/src/hg/trunk/solaris/gcc/testsuite/../gm2/gm2-libs
>  
> -I/var/gcc/gcc-10.0.0-20190708/11.5-gcc-gas-gm2-no-bootstrap-j1/i386-pc-solaris2.11/./libgm2/libiso:/vol/gcc/src/hg/trunk/solaris/gcc/testsuite/../gm2/gm2-libs-iso
>  
> -I/vol/gcc/src/hg/trunk/solaris/gcc/testsuite/gm2/pim/options/optimize/run/pass
>  -fpim 
> -L/var/gcc/gcc-10.0.0-20190708/11.5-gcc-gas-gm2-no-bootstrap-j1/i386-pc-solaris2.11/./libgm2/libpim/.libs
>  
> -L/var/gcc/gcc-10.0.0-20190708/11.5-gcc-gas-gm2-no-bootstrap-j1/i386-pc-solaris2.11/./libgm2/libiso/.libs}
>
>   Names are required to be unique, and must not contain absolute
>   pathnames to allow for comparing different test results.  All this
>   stuff in braces above should go.

thanks for the heads up about this - I'll rename them.

> With the missing gm2l worked around as above, my i386-pc-solaris2.11
> testresults are way better now:
>
> === gm2 Summary for unix ===
>
> # of expected passes11186
> # of unexpected failures24
> # of unresolved testcases   12

these results are great for solaris.  On the amd64/GNU/Linux I get 12
failures (long.

[C++ PATCH] PR c++/90590 Suppress warning for enumeration value not handled in switch warning

2019-07-09 Thread Matthew Beliveau
This patch suppresses the warning:  "enumeration value not handled in
switch", for enumerators that are defined in system headers and use
reserved names.
Bootstrapped/regtested on x86_64-linux, ok for trunk?
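
As a rough illustration of what "reserved names" means here, the sketch
below checks the C standard's rule that identifiers starting with an
underscore followed by an uppercase letter or a second underscore are
always reserved for the implementation.  The helper name looks_reserved
is made up for this example; the patch itself calls
name_reserved_for_implementation_p, whose exact behaviour is not shown
in this mail, so treat the sketch as an assumption about the intent
rather than a copy of that function.

```c
#include <stdbool.h>

/* Hypothetical helper: return true if NAME begins with an underscore
   followed by either a second underscore or an uppercase letter,
   i.e. it is reserved for the implementation in every scope.  */
static bool
looks_reserved (const char *name)
{
  if (name[0] != '_')
    return false;
  return name[1] == '_' || (name[1] >= 'A' && name[1] <= 'Z');
}
```

Under this rule the enumerators _A, _B and _C used in the new tests
count as reserved, while a name such as _a would not.
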

2019-07-08  Matthew Beliveau  
	
PR c++/90590
	* c-warn.c (c_do_switch_warnings): Suppress warning for enumerators
	with reserved names that are in a system header.

	* c-c++-common/pr90590-1.c: New test.
	* c-c++-common/pr90590-1.h: New test.
	* c-c++-common/pr90590-2.c: New test.
	* c-c++-common/pr90590-2.h: New test.

diff --git gcc/c-family/c-warn.c gcc/c-family/c-warn.c
index b5d09e761d7..56ad23dd29c 100644
--- gcc/c-family/c-warn.c
+++ gcc/c-family/c-warn.c
@@ -34,6 +34,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gcc-rich-location.h"
 #include "gimplify.h"
 #include "c-family/c-indentation.h"
+#include "c-family/c-spellcheck.h"
 #include "calls.h"
 #include "stor-layout.h"
 
@@ -1592,8 +1593,12 @@ c_do_switch_warnings (splay_tree cases, location_t switch_location,
   for (chain = TYPE_VALUES (type); chain; chain = TREE_CHAIN (chain))
 {
   tree value = TREE_VALUE (chain);
+  tree decl = NULL_TREE;
   if (TREE_CODE (value) == CONST_DECL)
-	value = DECL_INITIAL (value);
+	{
+	  decl = value;
+	  value = DECL_INITIAL (value);
+	}
   node = splay_tree_lookup (cases, (splay_tree_key) value);
   if (node)
 	{
@@ -1628,6 +1633,19 @@ c_do_switch_warnings (splay_tree cases, location_t switch_location,
   if (cond && tree_int_cst_compare (cond, value))
 	continue;
 
+  /* If the enumerator is defined in a system header and uses a reserved
+	 name, then we continue to avoid throwing a warning.  */
+  if (decl == NULL_TREE)
+	decl = lookup_name (TREE_PURPOSE (chain));
+  if (decl && TREE_CODE (decl) == CONST_DECL)
+	{
+	  const char *name = IDENTIFIER_POINTER (DECL_NAME (decl));
+	  location_t loc = DECL_SOURCE_LOCATION (decl);
+	  if (in_system_header_at (loc)
+	  && name_reserved_for_implementation_p (name))
+	continue;
+	}
+
   /* If there is a default_node, the only relevant option is
 	 Wswitch-enum.  Otherwise, if both are enabled then we prefer
 	 to warn using -Wswitch because -Wswitch is enabled by -Wall
diff --git gcc/testsuite/c-c++-common/pr90590-1.c gcc/testsuite/c-c++-common/pr90590-1.c
new file mode 100644
index 00000000000..4997a3082d5
--- /dev/null
+++ gcc/testsuite/c-c++-common/pr90590-1.c
@@ -0,0 +1,15 @@
+// PR c++/90590
+// { dg-options -Wswitch }
+#include "pr90590-1.h"
+
+void
+g ()
+{
+  enum E e = _A;
+  switch (e) // { dg-bogus "enumeration value '_C' not handled in switch" }
+{
+case _A:
+case _B:
+  break;
+}
+}
diff --git gcc/testsuite/c-c++-common/pr90590-1.h gcc/testsuite/c-c++-common/pr90590-1.h
new file mode 100644
index 00000000000..22f1a7d5d52
--- /dev/null
+++ gcc/testsuite/c-c++-common/pr90590-1.h
@@ -0,0 +1,2 @@
+#pragma GCC system_header
+enum E { _A, _B, _C };
diff --git gcc/testsuite/c-c++-common/pr90590-2.c gcc/testsuite/c-c++-common/pr90590-2.c
new file mode 100644
index 00000000000..8aa65cf0afd
--- /dev/null
+++ gcc/testsuite/c-c++-common/pr90590-2.c
@@ -0,0 +1,8 @@
+#include "pr90590-2.h"
+
+void
+fn ()
+{
+  switch (c.b) // { dg-bogus "enumeration value" }
+;
+}
diff --git gcc/testsuite/c-c++-common/pr90590-2.h gcc/testsuite/c-c++-common/pr90590-2.h
new file mode 100644
index 00000000000..e4f8635576f
--- /dev/null
+++ gcc/testsuite/c-c++-common/pr90590-2.h
@@ -0,0 +1,4 @@
+#pragma GCC system_header
+struct {
+  enum { _A } b;
+} c;


Re: [PATCH 3/3] change class-key of PODs to struct and others to class (PR 61339)

2019-07-09 Thread Richard Sandiford
Martin Sebor  writes:
> diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
> index cfc41e1ed86..625d5b17413 100644
> --- a/gcc/cp/cp-tree.h
> +++ b/gcc/cp/cp-tree.h
> @@ -6428,7 +6428,7 @@ extern tree get_scope_of_declarator (const 
> cp_declarator *);
>  extern void grok_special_member_properties   (tree);
>  extern bool grok_ctor_properties (const_tree, const_tree);
>  extern bool grok_op_properties   (tree, bool);
> -extern tree xref_tag (enum tag_types, tree, 
> tag_scope, bool, bool * = NULL);
> +extern tree xref_tag (enum tag_types, tree, 
> tag_scope, bool);
>  extern tree xref_tag_from_type   (tree, tree, tag_scope);
>  extern void xref_basetypes   (tree, tree);
>  extern tree start_enum   (tree, tree, tree, 
> tree, bool, bool *);
> diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
> index 005f99a6e15..9accc3d141b 100644
> --- a/gcc/cp/decl.c
> +++ b/gcc/cp/decl.c
> @@ -14119,7 +14119,7 @@ lookup_and_check_tag (enum tag_types tag_code, tree 
> name,
>  
>  static tree
>  xref_tag_1 (enum tag_types tag_code, tree name,
> -tag_scope scope, bool template_header_p, bool *new_p)
> +tag_scope scope, bool template_header_p)
>  {
>enum tree_code code;
>tree context = NULL_TREE;
> @@ -14151,9 +14151,6 @@ xref_tag_1 (enum tag_types tag_code, tree name,
>if (t == error_mark_node)
>  return error_mark_node;
>  
> -  /* Let the caller know this is a new type.  */
> -  *new_p = t == NULL_TREE;
> -
>if (scope != ts_current && t && current_class_type
>&& template_class_depth (current_class_type)
>&& template_header_p)
> @@ -14215,7 +14212,6 @@ xref_tag_1 (enum tag_types tag_code, tree name,
> scope = ts_current;
>   }
> t = pushtag (name, t, scope);
> -   *new_p = true;
>   }
>  }
>else
> @@ -14267,13 +14263,11 @@ xref_tag_1 (enum tag_types tag_code, tree name,
>  
>  tree
>  xref_tag (enum tag_types tag_code, tree name,
> -  tag_scope scope, bool template_header_p, bool *new_p /* = NULL */)
> +  tag_scope scope, bool template_header_p)
>  {
>bool dummy;
> -  if (!new_p)
> -new_p = &dummy;
>bool subtime = timevar_cond_start (TV_NAME_LOOKUP);
> -  tree ret = xref_tag_1 (tag_code, name, scope, template_header_p, new_p);
> +  tree ret = xref_tag_1 (tag_code, name, scope, template_header_p);
>timevar_cond_stop (TV_NAME_LOOKUP, subtime);
>return ret;
>  }
> diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
> index 52af8c0c6d6..d16bf253058 100644
> --- a/gcc/cp/parser.c
> +++ b/gcc/cp/parser.c
> @@ -28193,8 +28193,6 @@ cp_parser_template_declaration_after_parameters 
> (cp_parser* parser,
>  member_p,
> 
> /*explicit_specialization_p=*/false,
>  &friend_p);
> -  // maybe_warn_struct_vs_class (token->location, TREE_TYPE (decl));
> -
>pop_deferring_access_checks ();
>  
>/* If this is a member template declaration, let the front

Looks like this might have been part of 1/3.

OK otherwise.  Thanks again for doing this.

(I guess a lot of these tags could be removed, but that was just as true
before the patch, so it's still a strict improvement.)

Richard


[PATCH] gdbhooks.py: dump-fn, dot-fn: cast ret values of fopen/fclose

2019-07-09 Thread Vladislav Ivanishin
Hi,

Without the patch, I see these error messages with gdb 8.3:

(gdb) Python Exception  'fclose@@GLIBC_2.2.5' has
unknown return type; cast the call to its declared return type:
(gdb) Error occurred in Python: 'fclose@@GLIBC_2.2.5' has unknown
return type; cast the call to its declared return type

One doesn't have to use python to reproduce that: start debugging cc1
and issue

(gdb) call fopen ("", "")

This actually looks like a GDB bug: looking at the DWARF of cc1 (built
with either -g or -ggdb3) with either dwarfdump or readelf, I see that
there is info about the return type (for fopen it's FILE *, and `ptype
FILE` in gdb gives the full struct).

Tom, you contributed the {dot,dump}-fn functions.  Do they still work
for you without the patch?  (And if so, do you happen to have debuginfo
for libc installed on your machine?)

I think the patch itself is obvious (as a workaround).  I've only
tested it with the version of GDB I have (8.3, which is the latest
release), but I expect it to work for older versions as well.

(Comparisons of gdb.Value's returned from parse_and_eval, like fp == 0
and their conversion to python strings in "%s" % fp work automagically.)

* gdbhooks.py (DumpFn.invoke): Add explicit casts of return values of
fopen and fclose to their respective types.
(DotFn.invoke): Ditto.
---
 gcc/gdbhooks.py | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/gcc/gdbhooks.py b/gcc/gdbhooks.py
index 15f9738aeee..c68bffb4d1a 100644
--- a/gcc/gdbhooks.py
+++ b/gcc/gdbhooks.py
@@ -773,18 +773,17 @@ class DumpFn(gdb.Command):
 f.close()
 
 # Open file
-fp = gdb.parse_and_eval("fopen (\"%s\", \"w\")" % filename)
+fp = gdb.parse_and_eval("(FILE *) fopen (\"%s\", \"w\")" % filename)
 if fp == 0:
 print ("Could not open file: %s" % filename)
 return
-fp = "(FILE *)%u" % fp
 
 # Dump function to file
 _ = gdb.parse_and_eval("dump_function_to_file (%s, %s, %u)" %
(func, fp, flags))
 
 # Close file
-ret = gdb.parse_and_eval("fclose (%s)" % fp)
+ret = gdb.parse_and_eval("(int) fclose (%s)" % fp)
 if ret != 0:
 print ("Could not close file: %s" % filename)
 return
@@ -843,11 +842,10 @@ class DotFn(gdb.Command):
 
 # Close and reopen temp file to get C FILE*
 f.close()
-fp = gdb.parse_and_eval("fopen (\"%s\", \"w\")" % filename)
+fp = gdb.parse_and_eval("(FILE *) fopen (\"%s\", \"w\")" % filename)
 if fp == 0:
 print("Cannot open temp file")
 return
-fp = "(FILE *)%u" % fp
 
 # Write graph to temp file
 _ = gdb.parse_and_eval("start_graph_dump (%s, \"\")" % fp)
@@ -856,7 +854,7 @@ class DotFn(gdb.Command):
 _ = gdb.parse_and_eval("end_graph_dump (%s)" % fp)
 
 # Close temp file
-ret = gdb.parse_and_eval("fclose (%s)" % fp)
+ret = gdb.parse_and_eval("(int) fclose (%s)" % fp)
 if ret != 0:
 print("Could not close temp file: %s" % filename)
 return
-- 
2.22.0

-- 
Vlad


Re: [PATCH 2/3] change class-key of PODs to struct and others to class (PR 61339)

2019-07-09 Thread Richard Sandiford
Martin Sebor  writes:
> Hopefully with the right patch this time (thanks Jon).
>
> On 7/8/19 4:00 PM, Martin Sebor wrote:
>> The attached patch changes the class-key of class definitions that
>> satisfy the requirements on a POD struct to 'struct', and that of
>> struct definitions that aren't POD to class, according to the GCC
>> coding convention.  The patch is also prerequisite for GCC being
>> able to compile cleanly with -Wmismatched-tags.
>> 
>> I made the changes building GCC with -Wstruct-not-pod and
>> -Wclass-is-pod enabled, scanning the build log for instances
>> of each warning, and using a script replacing the class-key
>> as necessary and adjusting the access of the members declared
>> immediately after the class-head.
>> 
>> Martin
>
> PR c++/61339 - add mismatch between struct and class [-Wmismatched-tags] to 
> non-bugs
>
> gcc/c/ChangeLog:
>
>   * c-decl.c: Change class-key from class to struct and vice versa
>   to match convention and avoid -Wclass-is-pod and -Wstruct-no-pod.
>   * gimple-parser.c: Same.
>
> gcc/c-family/ChangeLog:
>
>   * c-format.c (check_argument_type): Change class-key from class to
>   struct and vice versa to match convention and avoid -Wclass-is-pod
>   and -Wstruct-no-pod.
>   * c-pretty-print.h: Same.
>
> gcc/cp/ChangeLog:
>
>   * constexpr.c (cxx_eval_call_expression): Change class-key from class
>   to struct and vice versa to match convention and avoid -Wclass-is-pod
>   and -Wstruct-no-pod.
>   * constraint.cc (get_concept_definition): Same.
>   * cp-tree.h: Same.
>   * cxx-pretty-print.h: Same.
>   * error.c: Same.
>   * logic.cc (term_list::replace): Same.
>   * name-lookup.c (find_local_binding): Same.
>   * pt.c (tsubst_binary_right_fold): Same.
>   * search.c (field_accessor_p): Same.
>   * semantics.c (expand_or_defer_fn): Same.
>
> gcc/lto/ChangeLog:
>
>   * lto-dump.c: Same.

Need to cut-&-paste the description for this one.

> gcc/ChangeLog:
>
>   * align.h: Change class-key from class to struct and vice versa
>   to match convention and avoid -Wclass-is-pod and -Wstruct-no-pod.
>   * alloc-pool.h: Same.
>   * asan.c (shadow_mem_size): Same.
>   * auto-profile.c: Same.
>   * basic-block.h: Same.
>   * bitmap.h: Same.
>   * cfgexpand.c (set_rtl): Same.
>   (expand_one_stack_var_at): Same.
>   * cfghooks.h: Same.
>   * cfgloop.h: Same.
>   * cgraph.h: Same.
>   * config/i386/i386.h: Same.
>   * df-problems.c (df_print_bb_index): Same.
>   * df-scan.c: Same.
>   * df.h (df_single_use): Same.
>   * diagnostic-show-locus.c (layout::print_annotation_line): Same.
>   (layout::annotation_line_showed_range_p): Same.
>   (get_printed_columns): Same.
>   (correction::ensure_terminated): Same.
>   (line_corrections::~line_corrections): Same.
>   * dojump.h: Same.
>   * dse.c: Same.
>   * dump-context.h: Same.
>   * dumpfile.h: Same.
>   * dwarf2out.c: Same.
>   * edit-context.c: Same.
>   * fibonacci_heap.c (test_union_of_equal_heaps): Same.
>   * flags.h: Same.
>   * function.c (assign_stack_local): Same.
>   * function.h: Same.
>   * gcc.c: Same.
>   * gcov.c (block_info::block_info): Same.
>   * genattrtab.c: Same.
>   * genextract.c: Same.
>   * genmatch.c (comparison_code_p): Same.
>   (id_base::id_base): Same.
>   (decision_tree::print): Same.
>   * genoutput.c: Same.
>   * genpreds.c (write_one_predicate_function): Same.
>   * genrecog.c (validate_pattern): Same.
>   (find_operand_positions): Same.
>   (optimize_subroutine_group): Same.
>   (merge_pattern_transition::merge_pattern_transition): Same.
>   (merge_pattern_info::merge_pattern_info): Same.
>   (merge_state_result::merge_state_result): Same.
>   (merge_into_state): Same.
>   * gensupport.c: Same.
>   * gensupport.h: Same.
>   * ggc-common.c (init_ggc_heuristics): Same.
>   * ggc-tests.c (test_union): Same.
>   * gimple-loop-interchange.cc (dump_induction): Same.
>   * gimple-loop-versioning.cc: Same.
>   * gimple-match.h (gimple_match_cond::any_else): Same.
>   * gimple-ssa-backprop.c: Same.
>   * gimple-ssa-sprintf.c: Same.
>   * gimple-ssa-store-merging.c (store_operand_info::store_operand_info): 
> Same.
>   (store_immediate_info::store_immediate_info): Same.
>   (merged_store_group::apply_stores): Same.
>   (get_location_for_stmts): Same.
>   * gimple-ssa-strength-reduction.c: Same.
>   * gimple-ssa-warn-alloca.c: Same.
>   * gimple-ssa-warn-restrict.c (pass_wrestrict::execute): Same.
>   * godump.c (go_type_decl): Same.
>   * hash-map-tests.c (test_map_of_strings_to_int): Same.
>   * hash-map.h: Same.
>   * hash-set-tests.c (test_set_of_strings): Same.
>   * hsa-brig.c: Same.
>   * hsa-common.h: Same.
>   * hsa-gen.c (transformable_swi

Re: [PATCH, Modula-2 (C/C++/D/F/Go/Jit)] (Register spec fn) (v2)

2019-07-09 Thread Gaius Mulley
Matthias Klose  writes:

>> the libraries ./usr/lib/x86_64-linux-gnu/lib{ulm,pim,gm2,cor,iso,min}.a
>> are not needed the correct locations of the static libraries are:
>> 
>> ./usr/lib/gcc/x86_64-linux-gnu/9/m2/ulm/libulm.a
>> ./usr/lib/gcc/x86_64-linux-gnu/9/m2/min/libmin.a
>> ./usr/lib/gcc/x86_64-linux-gnu/9/m2/cor/libcor.a
>> ./usr/lib/gcc/x86_64-linux-gnu/9/m2/pim/libgm2.a
>> ./usr/lib/gcc/x86_64-linux-gnu/9/m2/log/liblog.a
>> ./usr/lib/gcc/x86_64-linux-gnu/9/m2/iso/libiso.a
>
> ok, then I assume that could be the place for the .so files as well (not the
> .so.* files/links), if I don't want to install those in the system libdir.

Yes, sure, this should work fine.


regards,
Gaius


Re: Make nonoverlapping_component_refs work with duplicated main variants

2019-07-09 Thread Richard Biener
On Tue, 9 Jul 2019, Jan Hubicka wrote:

> Hi,
> this is updated variant I am testing.
> It documents better how function works and streamlines the checks.
> 
> OK assuming it passes the tests?
> 
> Honza
> 
> Index: tree-ssa-alias.c
> ===
> --- tree-ssa-alias.c  (revision 273193)
> +++ tree-ssa-alias.c  (working copy)
> @@ -1128,6 +1128,91 @@ aliasing_component_refs_p (tree ref1,
>return false;
>  }
>  
> +/* FIELD1 and FIELD2 are two fields of component refs.  We assume
> +   that bases of both component refs are
> + (*) are either equivalent or they point to different objects.

are either equivalent(*) or not overlapping

> +   We do not assume that FIELD1 and FIELD2 are of same type.

that the containers of FIELD1 and FIELD2 are of the same type?

> +
> +   Return 0 if FIELD1 and FIELD2 satisfy (*).
> +   This is the case when their offsets are the same.

Hmm, so when the offsets are the same then the bases are equivalent?
I think you want to say

 Return 0 if in case the component refs satisfy (*) we
 know FIELD1 and FIELD2 are overlapping exactly.

> +   Return 1 if FIELD1 and FIELD2 are non-overlapping.
> +
> +   Return -1 otherwise.
> +
> +   Main difference between 0 and -1 is to let
> +   nonoverlapping_component_refs_since_match_p discover the semnatically

semantically

otherwise looks good now.

Thanks,
Richard.

> +   equivalent part of the access path.
> +
> +   Note that this function is used even with -fno-strict-aliasing
> +   and makes use of no TBAA assumptions.  */
> +
> +static int
> +nonoverlapping_component_refs_p_1 (const_tree field1, const_tree field2)
> +{
> +  /* If both fields are of the same type, we could save hard work of
> + comparing offsets.  */
> +  tree type1 = DECL_CONTEXT (field1);
> +  tree type2 = DECL_CONTEXT (field2);
> +
> +  if (DECL_BIT_FIELD_REPRESENTATIVE (field1))
> +field1 = DECL_BIT_FIELD_REPRESENTATIVE (field1);
> +  if (DECL_BIT_FIELD_REPRESENTATIVE (field2))
> +field2 = DECL_BIT_FIELD_REPRESENTATIVE (field1);
> +
> +  /* ??? Bitfields can overlap at RTL level so punt on them.
> + FIXME: RTL expansion should be fixed by adjusting the access path
> + when producing MEM_ATTRs for MEMs which are wider than 
> + the bitfields similarly as done in set_mem_attrs_minus_bitpos.  */
> +  if (DECL_BIT_FIELD (field1) && DECL_BIT_FIELD (field2))
> +return -1;
> +
> +  /* Assume that different FIELD_DECLs never overlap within a RECORD_TYPE.  
> */
> +  if (type1 == type2 && TREE_CODE (type1) == RECORD_TYPE)
> +return field1 != field2;
> +
> +  /* In common case the offsets and bit offsets will be the same.
> + However if frontends do not agree on the alignment, they may be
> + different even if they actually represent same address.
> + Try the common case first and if that fails calcualte the
> + actual bit offset.  */
> +  if (tree_int_cst_equal (DECL_FIELD_OFFSET (field1),
> +   DECL_FIELD_OFFSET (field2))
> +  && tree_int_cst_equal (DECL_FIELD_BIT_OFFSET (field1),
> +  DECL_FIELD_BIT_OFFSET (field2)))
> +return 0;
> +
> +  /* Note that it may be possible to use component_ref_field_offset
> + which would provide offsets as trees. However constructing and folding
> + trees is expensive and does not seem to be worth the compile time
> + cost.  */
> +
> +  poly_uint64 offset1, offset2;
> +  poly_uint64 bit_offset1, bit_offset2;
> +
> +  if (poly_int_tree_p (DECL_FIELD_OFFSET (field1), &offset1)
> +  && poly_int_tree_p (DECL_FIELD_OFFSET (field2), &offset2)
> +  && poly_int_tree_p (DECL_FIELD_BIT_OFFSET (field1), &bit_offset1)
> +  && poly_int_tree_p (DECL_FIELD_BIT_OFFSET (field2), &bit_offset2))
> +{
> +  offset1 = (offset1 << LOG2_BITS_PER_UNIT) + bit_offset1;
> +  offset2 = (offset2 << LOG2_BITS_PER_UNIT) + bit_offset2;
> +
> +  if (known_eq (offset1, offset2))
> + return 0;
> +
> +  poly_uint64 size1, size2;
> +
> +  if (poly_int_tree_p (DECL_SIZE (field1), &size1)
> +   && poly_int_tree_p (DECL_SIZE (field2), &size2)
> +   && !ranges_maybe_overlap_p (offset1, size1, offset2, size2))
> + return 1;
> +}
> +  /* Resort to slower overlap checking by looking for matching types in
> + the middle of access path.  */
> +  return -1;
> +}
> +
>  /* Try to disambiguate REF1 and REF2 under the assumption that MATCH1 and
> MATCH2 either point to the same address or are disjoint.
> MATCH1 and MATCH2 are assumed to be ref in the access path of REF1 and 
> REF2
> @@ -1224,6 +1309,7 @@ nonoverlapping_component_refs_since_matc
>   case the return value will precisely be false.  */
>while (true)
>  {
> +  bool seen_noncomponent_ref_p = false;
>do
>   {
> if (component_refs1.is_empty ())
> @@ -1233,6 +1319,8 @@ nonoverlapping_component_refs_since_matc
> return 0;
>   }
>  

Re: Make nonoverlapping_component_refs work with duplicated main variants

2019-07-09 Thread Jan Hubicka
Hi,
this is updated variant I am testing.
It documents better how function works and streamlines the checks.

OK assuming it passes the tests?

Honza
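
The three-way result documented in the patch below (0 for exactly overlapping
fields, 1 for non-overlapping fields, -1 for "cannot tell") can be illustrated
outside of GCC with plain integers.  This is only a sketch under stated
assumptions: offsets and sizes are already reduced to bit counts, the names
struct field_pos and compare_field_positions are hypothetical, arithmetic
overflow is ignored, and the same-RECORD_TYPE and bitfield checks of the real
function are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a field: position and size in bits, plus
   flags saying whether each value is a known compile-time constant.  */
struct field_pos
{
  uint64_t bit_offset;
  uint64_t bit_size;
  bool offset_known;
  bool size_known;
};

/* True if [pos1, pos1 + size1) and [pos2, pos2 + size2) share a bit.  */
static bool
ranges_overlap (uint64_t pos1, uint64_t size1, uint64_t pos2, uint64_t size2)
{
  return pos1 < pos2 + size2 && pos2 < pos1 + size1;
}

/* Return 0 if the fields start at the same bit, 1 if they are known not
   to overlap, and -1 if nothing can be concluded.  */
int
compare_field_positions (const struct field_pos *f1,
			 const struct field_pos *f2)
{
  assert (f1 != NULL && f2 != NULL);

  if (!f1->offset_known || !f2->offset_known)
    return -1;
  if (f1->bit_offset == f2->bit_offset)
    return 0;
  if (f1->size_known && f2->size_known
      && !ranges_overlap (f1->bit_offset, f1->bit_size,
			  f2->bit_offset, f2->bit_size))
    return 1;
  return -1;
}
```

In this sketch a partial overlap falls through to -1 rather than 0, which is
what lets a caller keep walking the access path instead of treating the two
fields as the same location.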

Index: tree-ssa-alias.c
===
--- tree-ssa-alias.c(revision 273193)
+++ tree-ssa-alias.c(working copy)
@@ -1128,6 +1128,91 @@ aliasing_component_refs_p (tree ref1,
   return false;
 }
 
+/* FIELD1 and FIELD2 are two fields of component refs.  We assume
+   that bases of both component refs are
+ (*) are either equivalent or they point to different objects.
+   We do not assume that FIELD1 and FIELD2 are of same type.
+
+   Return 0 if FIELD1 and FIELD2 satisfy (*).
+   This is the case when their offsets are the same.
+
+   Return 1 if FIELD1 and FIELD2 are non-overlapping.
+
+   Return -1 otherwise.
+
+   Main difference between 0 and -1 is to let
+   nonoverlapping_component_refs_since_match_p discover the semnatically
+   equivalent part of the access path.
+
+   Note that this function is used even with -fno-strict-aliasing
+   and makes use of no TBAA assumptions.  */
+
+static int
+nonoverlapping_component_refs_p_1 (const_tree field1, const_tree field2)
+{
+  /* If both fields are of the same type, we could save hard work of
+ comparing offsets.  */
+  tree type1 = DECL_CONTEXT (field1);
+  tree type2 = DECL_CONTEXT (field2);
+
+  if (DECL_BIT_FIELD_REPRESENTATIVE (field1))
+field1 = DECL_BIT_FIELD_REPRESENTATIVE (field1);
+  if (DECL_BIT_FIELD_REPRESENTATIVE (field2))
+field2 = DECL_BIT_FIELD_REPRESENTATIVE (field2);
+
+  /* ??? Bitfields can overlap at RTL level so punt on them.
+ FIXME: RTL expansion should be fixed by adjusting the access path
+ when producing MEM_ATTRs for MEMs which are wider than 
+ the bitfields similarly as done in set_mem_attrs_minus_bitpos.  */
+  if (DECL_BIT_FIELD (field1) && DECL_BIT_FIELD (field2))
+return -1;
+
+  /* Assume that different FIELD_DECLs never overlap within a RECORD_TYPE.  */
+  if (type1 == type2 && TREE_CODE (type1) == RECORD_TYPE)
+return field1 != field2;
+
+  /* In common case the offsets and bit offsets will be the same.
+ However if frontends do not agree on the alignment, they may be
+ different even if they actually represent same address.
+ Try the common case first and if that fails calculate the
+ actual bit offset.  */
+  if (tree_int_cst_equal (DECL_FIELD_OFFSET (field1),
+ DECL_FIELD_OFFSET (field2))
+  && tree_int_cst_equal (DECL_FIELD_BIT_OFFSET (field1),
+DECL_FIELD_BIT_OFFSET (field2)))
+return 0;
+
+  /* Note that it may be possible to use component_ref_field_offset
+ which would provide offsets as trees. However constructing and folding
+ trees is expensive and does not seem to be worth the compile time
+ cost.  */
+
+  poly_uint64 offset1, offset2;
+  poly_uint64 bit_offset1, bit_offset2;
+
+  if (poly_int_tree_p (DECL_FIELD_OFFSET (field1), &offset1)
+  && poly_int_tree_p (DECL_FIELD_OFFSET (field2), &offset2)
+  && poly_int_tree_p (DECL_FIELD_BIT_OFFSET (field1), &bit_offset1)
+  && poly_int_tree_p (DECL_FIELD_BIT_OFFSET (field2), &bit_offset2))
+{
+  offset1 = (offset1 << LOG2_BITS_PER_UNIT) + bit_offset1;
+  offset2 = (offset2 << LOG2_BITS_PER_UNIT) + bit_offset2;
+
+  if (known_eq (offset1, offset2))
+   return 0;
+
+  poly_uint64 size1, size2;
+
+  if (poly_int_tree_p (DECL_SIZE (field1), &size1)
+ && poly_int_tree_p (DECL_SIZE (field2), &size2)
+ && !ranges_maybe_overlap_p (offset1, size1, offset2, size2))
+   return 1;
+}
+  /* Resort to slower overlap checking by looking for matching types in
+ the middle of access path.  */
+  return -1;
+}
+
 /* Try to disambiguate REF1 and REF2 under the assumption that MATCH1 and
MATCH2 either point to the same address or are disjoint.
MATCH1 and MATCH2 are assumed to be ref in the access path of REF1 and REF2
@@ -1224,6 +1309,7 @@ nonoverlapping_component_refs_since_matc
  case the return value will precisely be false.  */
   while (true)
 {
+  bool seen_noncomponent_ref_p = false;
   do
{
  if (component_refs1.is_empty ())
@@ -1233,6 +1319,8 @@ nonoverlapping_component_refs_since_matc
  return 0;
}
  ref1 = component_refs1.pop ();
+ if (TREE_CODE (ref1) != COMPONENT_REF)
+   seen_noncomponent_ref_p = true;
}
   while (!RECORD_OR_UNION_TYPE_P (TREE_TYPE (TREE_OPERAND (ref1, 0))));
 
@@ -1245,17 +1333,15 @@ nonoverlapping_component_refs_since_matc
  return 0;
}
  ref2 = component_refs2.pop ();
+ if (TREE_CODE (ref2) != COMPONENT_REF)
+   seen_noncomponent_ref_p = true;
}
   while (!RECORD_OR_UNION_TYPE_P (TREE_TYPE (TREE_OPERAND (ref2, 0))));
 
-  /* Beware of BIT_FIELD_REF.  */
-  if (TREE_CODE (ref1) != COMPONENT_R

Re: Make nonoverlapping_component_refs work with duplicated main variants

2019-07-09 Thread Richard Biener
On Tue, 9 Jul 2019, Jan Hubicka wrote:

> > > tree_int_cst_equal will return false if offsets are not INTEGER_CST.
> > > I was not sure if I can safely use operand_equal_p.  What happens for
> > > fields with variable offsets when I inline two copies of same function
> > > which takes size as parameter and make the size different? Will I get
> > > here proper SSA name so operand_equal_p will work?
> > 
> > No, you get a DECL, but yes, I think operand_equal_p will work.
> > Consider two _same_ variable sizes, you'll not see that you
> > have to return zero then?  But yes, in case you have types
> > globbed to the canonical type (but not FIELD_DECLs) then
> > you'll get false !operand_equal_p as well.
> > 
> > The question is really what is desired here.  If you want/need precision
> > for non-constant offsets then you have to look at the COMPONENT_REF
> > trees because the relevant offset (SSA name) is only there
> > (in TREE_OPERAND (component_ref, 2)).
> > 
> > If you want to give up for non-constants and can do that without
> > correctness issue then fine (but Ada probably would like to have
> > it - so also never forget to include Ada in testing here ;))
> 
> I would like to have precision here. so perhaps as incremental change I
> can
>  1) reorganize callers to pass refs rather than just field_decls
>  2) check if TREE_OPERAND (component_ref, 2) is non-NULL in both case
> a) if so do operand_equal_p on them and return 0 on match
> b) if there is no match see if I have same canonical types and
>return 1 then
> c) return -1 otherwise

makes sense

>  3) continue with parsing FIELD_DECLS we work on now.
> > 
> > Oh, OK ... a bit more explaining commentary might be nice
> > (at the top of the function - basically what the input
> > constraints to the FIELD_DECLs are).
> 
> OK, will try to improve comments (though i tried to be relatively
> thorough).
> 
> Honza
> > 
> > Btw, the offsets in FIELD_DECLs are relative to DECL_CONTEXT so
> > comparing when DECL_CONTEXT are not related at all doesn't make
> > any sense.  Well, unless we know _those_ are at the same offset,
> > so - the constraint for the FIELD_DECLs we compare is that
> > the containing structure type object instances live at the same
> > address?
> > 
> > Richard.
> 

-- 
Richard Biener 
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)

[PATCH] Fix PR91114

2019-07-09 Thread Richard Biener


The following fixes PR91114.  It's not really the most desirable
solution but hey.

Bootstrapped / tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2019-07-09  Richard Biener  

PR tree-optimization/91114
* tree-vect-data-refs.c (vect_analyze_data_refs): Failure to
find a vector type isn't fatal.

* gcc.dg/vect/pr91114.c: New testcase.

Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   (revision 273294)
+++ gcc/tree-vect-data-refs.c   (working copy)
@@ -4360,6 +4360,8 @@ vect_analyze_data_refs (vec_info *vinfo,
  STMT_VINFO_VECTORIZABLE (stmt_info) = false;
  continue;
}
+ if (fatal)
+   *fatal = false;
  return opt_result::failure_at (stmt_info->stmt,
 "not vectorized:"
 " no vectype for stmt: %G"
Index: gcc/testsuite/gcc.dg/vect/pr91114.c
===
--- gcc/testsuite/gcc.dg/vect/pr91114.c (nonexistent)
+++ gcc/testsuite/gcc.dg/vect/pr91114.c (working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fopenmp-simd" } */
+
+void
+ne (double *zu)
+{
+  int h3;
+
+#pragma omp simd simdlen (4)
+  for (h3 = 0; h3 < 4; ++h3)
+zu[h3] = 0;
+}


Re: [PATCH, Modula-2 (C/C++/D/F/Go/Jit)] (Register spec fn) (v2)

2019-07-09 Thread Matthias Klose
On 09.07.19 14:02, Gaius Mulley wrote:
> Matthias Klose  writes:
> 
>>>  - There are three letter libraries with pretty generic
>>>names installed into the system libdir: log, iso, cor,
>>>min, ulm. At least for log, you have a file conflict
>>>with another library.  Shouldn't these libraries named
>>>    more specific, like libgm2log?
> 
>>> The installed tree:
>>
>>> ./usr/lib/gcc/x86_64-linux-gnu/9/m2/ulm/libulm.a
>>> ./usr/lib/x86_64-linux-gnu/libulm.a
>>
>> and all static libraries are installed twice, not just libulm.a. What is the
>> correct location?
>>
>> Matthias
> 
> Hi Matthias,
> 
> the libraries ./usr/lib/x86_64-linux-gnu/lib{ulm,pim,gm2,cor,iso,min}.a
> are not needed the correct locations of the static libraries are:
> 
> ./usr/lib/gcc/x86_64-linux-gnu/9/m2/ulm/libulm.a
> ./usr/lib/gcc/x86_64-linux-gnu/9/m2/min/libmin.a
> ./usr/lib/gcc/x86_64-linux-gnu/9/m2/cor/libcor.a
> ./usr/lib/gcc/x86_64-linux-gnu/9/m2/pim/libgm2.a
> ./usr/lib/gcc/x86_64-linux-gnu/9/m2/log/liblog.a
> ./usr/lib/gcc/x86_64-linux-gnu/9/m2/iso/libiso.a

ok, then I assume that could be the place for the .so files as well (not the
.so.* files/links), if I don't want to install those in the system libdir.


Re: [PATCH] Restrict LOOP_ALIGN to loop headers only.

2019-07-09 Thread Michael Matz
Hi,

On Tue, 9 Jul 2019, Richard Biener wrote:

> > So a "backedge" in this sense would be e->dest->index < e->src->index.
> > No?
> 
> To me the following would make sense.

The basic block index is not a DFS index, so no, that's not a test for 
backedge.
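
A minimal sketch of the distinction, assuming a tiny adjacency-matrix CFG
(struct cfg, mark_back_edges and example_cfg are hypothetical names, not
GCC's own data structures): with respect to a depth-first walk, an edge is a
back edge when its destination is still on the DFS stack at the time the edge
is examined.  The example graph has a loop whose header is block 2 and whose
latch is block 1, so block numbers do not follow the walk order.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_BLOCKS 8

enum color { WHITE, GREY, BLACK };

/* Hypothetical miniature CFG: edge[s][d] says there is an edge s->d,
   back[s][d] is filled in by mark_back_edges.  */
struct cfg
{
  int n_blocks;
  bool edge[MAX_BLOCKS][MAX_BLOCKS];
  bool back[MAX_BLOCKS][MAX_BLOCKS];
};

/* Visit SRC; an edge to a GREY block (still on the DFS stack) is a
   back edge, an edge to a WHITE block is followed recursively.  */
static void
dfs_visit (struct cfg *g, enum color *color, int src)
{
  color[src] = GREY;
  for (int dest = 0; dest < g->n_blocks; dest++)
    {
      if (!g->edge[src][dest])
	continue;
      if (color[dest] == GREY)
	g->back[src][dest] = true;
      else if (color[dest] == WHITE)
	dfs_visit (g, color, dest);
    }
  color[src] = BLACK;
}

/* Mark all back edges reachable from ENTRY.  */
void
mark_back_edges (struct cfg *g, int entry)
{
  enum color color[MAX_BLOCKS];

  assert (entry >= 0 && entry < g->n_blocks);
  for (int i = 0; i < MAX_BLOCKS; i++)
    color[i] = WHITE;
  memset (g->back, 0, sizeof g->back);
  dfs_visit (g, color, entry);
}

/* Blocks: 0 = entry, 2 = loop header, 1 = loop latch, 3 = exit.
   Edges: 0->2, 2->1, 1->2, 1->3.  */
struct cfg
example_cfg (void)
{
  struct cfg g;

  memset (&g, 0, sizeof g);
  g.n_blocks = 4;
  g.edge[0][2] = true;
  g.edge[2][1] = true;
  g.edge[1][2] = true;
  g.edge[1][3] = true;
  return g;
}
```

In this graph the comparison e->dest->index < e->src->index would pick the
edge 2->1, while the depth-first walk marks 1->2 as the back edge.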


Ciao,
Michael.


Re: Make nonoverlapping_component_refs work with duplicated main variants

2019-07-09 Thread Jan Hubicka
> > tree_int_cst_equal will return false if offsets are not INTEGER_CST.
> > I was not sure if I can safely use operand_equal_p.  What happens for
> > fields with variable offsets when I inline two copies of same function
> > which takes size as parameter and make the size different? Will I get
> > here proper SSA name so operand_equal_p will work?
> 
> No, you get a DECL, but yes, I think operand_equal_p will work.
> Consider two _same_ variable sizes, you'll not see that you
> have to return zero then?  But yes, in case you have types
> globbed to the canonical type (but not FIELD_DECLs) then
> you'll get false !operand_equal_p as well.
> 
> The question is really what is desired here.  If you want/need precision
> for non-constant offsets then you have to look at the COMPONENT_REF
> trees because the relevant offset (SSA name) is only there
> (in TREE_OPERAND (component_ref, 2)).
> 
> If you want to give up for non-constants and can do that without
> correctness issue then fine (but Ada probably would like to have
> it - so also never forget to include Ada in testing here ;))

I would like to have precision here, so perhaps as an incremental change I
can
 1) reorganize callers to pass refs rather than just field_decls
 2) check if TREE_OPERAND (component_ref, 2) is non-NULL in both cases
    a) if so, do operand_equal_p on them and return 0 on match
    b) if there is no match, see if I have the same canonical types and
       return 1 then
    c) return -1 otherwise
 3) continue with parsing the FIELD_DECLs we work on now.
> 
> Oh, OK ... a bit more explaining commentary might be nice
> (at the top of the function - basically what the input
> constraints to the FIELD_DECLs are).

OK, I will try to improve the comments (though I tried to be relatively
thorough).

Honza
> 
> Btw, the offsets in FIELD_DECLs are relative to DECL_CONTEXT so
> comparing when DECL_CONTEXT are not related at all doesn't make
> any sense.  Well, unless we know _those_ are at the same offset,
> so - the constraint for the FIELD_DECLs we compare is that
> the containing structure type object instances live at the same
> address?
> 
> Richard.


[PATCH v3 2/5] or1k: Fix issues with msoft-div

2019-07-09 Thread Stafford Horne
Fixes bad assembly logic with software divide as reported by Richard Selvaggi.
Also, add a basic test to verify the soft math works when enabled.
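
For readers unfamiliar with the routine being patched, the shift-and-subtract
scheme described in the lib1funcs.S comments ("shift Y left until Y >= X",
then "shift Y back to the right again, subtracting from X") can be sketched
in C roughly as follows.  This is an illustration, not a translation of the
assembly: the name soft_udiv is hypothetical, and leaving the remainder equal
to the dividend on division by zero is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C rendering of an unsigned shift-and-subtract divide.
   Returns X / Y and stores X % Y in *REM; division by zero returns 0.  */
uint32_t
soft_udiv (uint32_t x, uint32_t y, uint32_t *rem)
{
  uint32_t quot = 0;	/* initial quotient */
  uint32_t mask = 1;	/* quotient bit matching the current shift of y */

  assert (rem != NULL);
  *rem = x;		/* initial remainder */
  if (y == 0)
    return 0;

  /* Shift y left until y >= x, stopping early if y has its msb set.  */
  while (!(y & UINT32_C (0x80000000)) && y < x)
    {
      y <<= 1;
      mask <<= 1;
    }

  /* Shift y back to the right again, subtracting from x.  */
  while (mask != 0)
    {
      if (x >= y)
	{
	  x -= y;
	  quot |= mask;
	}
      y >>= 1;
      mask >>= 1;
    }

  *rem = x;
  return quot;
}
```

The first loop has two exit conditions, which correspond to the two flag
tests in the assembly loop (msb set, and y < x); the l.bnf to l.bf change in
the patch concerns the branch taken on the second of them.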

gcc/testsuite/ChangeLog:

PR target/90362
* gcc.target/or1k/div-mul-3.c: New test.

libgcc/ChangeLog:

PR target/90362
* config/or1k/lib1funcs.S (__udivsi3): Change l.sfeqi
to l.sfeq and l.sfltsi to l.sflts equivalents as the immediate
instructions are not available on every processor.  Change a
l.bnf to l.bf to fix logic issue.
---
 gcc/testsuite/gcc.target/or1k/div-mul-3.c | 31 +++++++++++++++++++++++++++++++
 libgcc/config/or1k/lib1funcs.S            |  6 +++---
 2 files changed, 34 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/or1k/div-mul-3.c

diff --git a/gcc/testsuite/gcc.target/or1k/div-mul-3.c b/gcc/testsuite/gcc.target/or1k/div-mul-3.c
new file mode 100644
index 00000000000..2c4f91b7e98
--- /dev/null
+++ b/gcc/testsuite/gcc.target/or1k/div-mul-3.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -msoft-div -msoft-mul" } */
+
+struct testcase {
+  int a;
+  int b;
+  int c;
+  int expected;
+};
+
+struct testcase tests[] = {
+  {2, 200, 3, 133},
+  {3, 300, 3, 300},
+  {2, 500, 3, 333},
+  {4, 250, 3, 333},
+  {0, 0, 0, 0}
+};
+
+int calc (int a, int b, int c) {
+  return a * b / c;
+}
+
+int main () {
+  int fail = 0;
+  struct testcase *tc;
+
+  for (int i = 0; (tc = &tests[i], tc->c); i++)
+fail |= (calc (tc->a, tc->b, tc->c) != tc->expected);
+
+  return fail;
+}
diff --git a/libgcc/config/or1k/lib1funcs.S b/libgcc/config/or1k/lib1funcs.S
index d2103923486..6d058977229 100644
--- a/libgcc/config/or1k/lib1funcs.S
+++ b/libgcc/config/or1k/lib1funcs.S
@@ -68,18 +68,18 @@ __udivmodsi3_internal:
   is not clobbered by this routine, and use that as to
   save a return address without creating a stack frame.  */
 
-   l.sfeqi r4, 0   /* division by zero; return 0.  */
+   l.sfeq  r4, r0  /* division by zero; return 0.  */
l.ori   r11, r0, 0  /* initial quotient */
l.bf9f
 l.ori  r12, r3, 0  /* initial remainder */
 
/* Given X/Y, shift Y left until Y >= X.  */
l.ori   r6, r0, 1   /* mask = 1 */
-1: l.sfltsir4, 0   /* y has msb set */
+1: l.sflts r4, r0  /* y has msb set */
l.bf2f
 l.sfltur4, r12 /* y < x */
l.add   r4, r4, r4  /* y <<= 1 */
-   l.bnf   1b
+   l.bf1b
 l.add  r6, r6, r6  /* mask <<= 1 */
 
/* Shift Y back to the right again, subtracting from X.  */
-- 
2.21.0



[PATCH v3 3/5] or1k: Add mrori option, fix option docs

2019-07-09 Thread Stafford Horne
gcc/ChangeLog:

* config.gcc (or1k*-*-*): Add mrori and mror to validation.
* doc/invoke.texi (OpenRISC Options): Add mrori option, rewrite all
documentation to be more clear.
* config/or1k/elf.opt (mboard=, mnewlib): Rewrite documentation to be
more clear.
* config/or1k/or1k.opt (mrori): New option.
(mhard-div, msoft-div, mhard-mul, msoft-mul, mcmov, mror, msext,
msfimm, mshftimm): Rewrite documentation to be more clear.
* config/or1k/predicates.md (ror_reg_or_u6_operand): New predicate.
* config/or1k/or1k.md (insn_support): Add ror and rori.
(enabled): Add conditions for ror and rori.
(rotrsi3): Replace condition for shftimm with ror and rori.

gcc/testsuite/ChangeLog:

* gcc.target/or1k/ror-4.c: New file.
* gcc.target/or1k/ror-5.c: New file.
* gcc.target/or1k/shftimm-1.c: Update test from rotate to shift
as the shftimm option no longer controls rotate.
---
Changes since v2:
 - Fix issue with ror predicate pointed out by Segher.
 - Added ror-5.c test to confirm/fix ICE.
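
As an illustration of the source pattern involved (this is not the content of
ror-4.c or ror-5.c, which is not shown here), a rotate right is commonly
written in C as below.  The function names are hypothetical.  GCC usually
recognizes this idiom as a rotate; per the rotrsi3 change in this patch, the
register-amount alternative (l.ror) is tied to -mror and the immediate
alternative (l.rori) to -mrori.

```c
#include <stdint.h>

/* Rotate x right by n bits.  Masking keeps both shift counts in the
   range 0..31, so n == 0 does not cause an undefined shift by 32.  */
static inline uint32_t
rotr32 (uint32_t x, unsigned int n)
{
  n &= 31;
  return (x >> n) | (x << ((32 - n) & 31));
}

/* Rotate amount only known at run time (register operand).  */
uint32_t
rotr_by_reg (uint32_t x, unsigned int n)
{
  return rotr32 (x, n);
}

/* Rotate amount is a compile-time constant (immediate operand).  */
uint32_t
rotr_by_imm (uint32_t x)
{
  return rotr32 (x, 8);
}
```

Comparing the assembly produced with and without -mror/-mrori for these two
functions is one way to check which alternative is selected.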

 gcc/config.gcc|  1 +
 gcc/config/or1k/elf.opt   |  6 +--
 gcc/config/or1k/or1k.md   | 14 --
 gcc/config/or1k/or1k.opt  | 56 +--
 gcc/config/or1k/predicates.md |  7 +++
 gcc/doc/invoke.texi   | 56 +--
 gcc/testsuite/gcc.target/or1k/ror-4.c |  8 
 gcc/testsuite/gcc.target/or1k/ror-5.c |  9 
 gcc/testsuite/gcc.target/or1k/shftimm-1.c |  8 ++--
 9 files changed, 104 insertions(+), 61 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/or1k/ror-4.c
 create mode 100644 gcc/testsuite/gcc.target/or1k/ror-5.c

diff --git a/gcc/config.gcc b/gcc/config.gcc
index c281c418b28..aeab8b4544e 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -2578,6 +2578,7 @@ or1k*-*-*)
for or1k_multilib in ${or1k_multilibs}; do
case ${or1k_multilib} in
mcmov | msext | msfimm | \
+   mror | mrori | \
mhard-div | mhard-mul | \
msoft-div | msoft-mul )

TM_MULTILIB_CONFIG="${TM_MULTILIB_CONFIG},${or1k_multilib}"
diff --git a/gcc/config/or1k/elf.opt b/gcc/config/or1k/elf.opt
index 641b6ddd4be..2d4d1875d02 100644
--- a/gcc/config/or1k/elf.opt
+++ b/gcc/config/or1k/elf.opt
@@ -25,9 +25,9 @@
 
 mboard=
 Target RejectNegative Joined
-Configure board specific runtime.
+Configure the newlib board specific runtime.  The default is or1ksim.
 
 mnewlib
 Target RejectNegative
-For compatibility, it's always newlib for elf now.
-
+This option is ignored; it is provided for compatibility purposes only.  This
+used to select linker and preprocessor options for use with newlib.
diff --git a/gcc/config/or1k/or1k.md b/gcc/config/or1k/or1k.md
index 757d899c442..0faa0fa4c47 100644
--- a/gcc/config/or1k/or1k.md
+++ b/gcc/config/or1k/or1k.md
@@ -63,7 +63,7 @@
   "alu,st,ld,control,multi"
   (const_string "alu"))
 
-(define_attr "insn_support" "class1,sext,sfimm,shftimm" (const_string "class1"))
+(define_attr "insn_support" "class1,sext,sfimm,shftimm,ror,rori" (const_string "class1"))
 
 (define_attr "enabled" ""
   (cond [(eq_attr "insn_support" "class1") (const_int 1)
@@ -72,7 +72,11 @@
 (and (eq_attr "insn_support" "sfimm")
  (ne (symbol_ref "TARGET_SFIMM") (const_int 0))) (const_int 1)
 (and (eq_attr "insn_support" "shftimm")
- (ne (symbol_ref "TARGET_SHFTIMM") (const_int 0))) (const_int 1)]
+ (ne (symbol_ref "TARGET_SHFTIMM") (const_int 0))) (const_int 1)
+(and (eq_attr "insn_support" "ror")
+ (ne (symbol_ref "TARGET_ROR") (const_int 0))) (const_int 1)
+(and (eq_attr "insn_support" "rori")
+ (ne (symbol_ref "TARGET_RORI") (const_int 0))) (const_int 1)]
(const_int 0)))
 
 ;; Describe a user's asm statement.
@@ -178,12 +182,12 @@
 (define_insn "rotrsi3"
   [(set (match_operand:SI 0 "register_operand" "=r,r")
(rotatert:SI (match_operand:SI 1 "register_operand"  "r,r")
- (match_operand:SI 2 "reg_or_u6_operand" "r,n")))]
-  "TARGET_ROR"
+(match_operand:SI 2 "ror_reg_or_u6_operand" "r,n")))]
+  "TARGET_ROR || TARGET_RORI"
   "@
l.ror\t%0, %1, %2
l.rori\t%0, %1, %2"
-  [(set_attr "insn_support" "*,shftimm")])
+  [(set_attr "insn_support" "ror,rori")])
 
 (define_insn "andsi3"
   [(set (match_operand:SI 0 "register_operand" "=r,r")
diff --git a/gcc/config/or1k/or1k.opt b/gcc/config/or1k/or1k.opt
index 7bdbd842dd4..c2f64c5dd45 100644
--- a/gcc/config/or1k/or1k.opt
+++ b/gcc/config/or1k/or1k.opt
@@ -21,47 +21,55 @@
 ; See the GCC internals manual (options.texi) for a description of
 ; this file's format.
 
-; Please try to keep this file in ASCII collating order.
-
 mhard-div
 Target RejectNegative InverseMask(SOFT_DIV)
-Use hardware divide ins

[PATCH v3 5/5] or1k: only force reg for immediates

2019-07-09 Thread Stafford Horne
The force_reg in or1k_expand_compare is hard coded for SImode, which is fine as
this used to only be used on SI expands.  However, with FP support this will
cause issues.  In general we should only force the right hand operand to a
register if it's an immediate.  This patch adds a condition to check for that.

gcc/ChangeLog:

* config/or1k/or1k.c (or1k_expand_compare): Check for int before
force_reg.
---
 gcc/config/or1k/or1k.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/gcc/config/or1k/or1k.c b/gcc/config/or1k/or1k.c
index 1eea84f47e0..f8eed4a7797 100644
--- a/gcc/config/or1k/or1k.c
+++ b/gcc/config/or1k/or1k.c
@@ -1448,13 +1448,15 @@ void
 or1k_expand_compare (rtx *operands)
 {
   rtx sr_f = gen_rtx_REG (BImode, SR_F_REGNUM);
+  rtx righthand_op = XEXP (operands[0], 1);
   rtx_code cmp_code = GET_CODE (operands[0]);
   bool flag_check_ne = true;
 
-  /* The RTL may receive an immediate in argument 1 of the compare, this is not
- supported unless we have l.sf*i instructions, force them into registers.  */
-  if (!TARGET_SFIMM)
-XEXP (operands[0], 1) = force_reg (SImode, XEXP (operands[0], 1));
+  /* Integer RTL may receive an immediate in argument 1 of the compare, this is
+ not supported unless we have l.sf*i instructions, force them into
+ registers.  */
+  if (!TARGET_SFIMM && CONST_INT_P (righthand_op))
+XEXP (operands[0], 1) = force_reg (SImode, righthand_op);
 
   /* Normalize comparison operators to ones OpenRISC support.  */
   switch (cmp_code)
-- 
2.21.0



[PATCH v3 4/5] or1k: Initial support for FPU

2019-07-09 Thread Stafford Horne
This adds support for OpenRISC hardware floating point instructions.
This is enabled with the -mhard-float option.

Double-precision floating point operations work using register pairing as
specified in: https://openrisc.io/proposals/orfpx64a32.  This has just been
added in the OpenRISC architecture specification 1.3.
This is enabled with the -mdouble-float option.

Not all architectures support unordered comparisons so an option,
-munordered-float is added.
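
To make the term concrete: a comparison is unordered when at least one operand
is a NaN.  A small C sketch using the standard <math.h> macros follows.  The
function names are hypothetical, and which or1k instructions these compile to
with or without -munordered-float is not shown here.

```c
#include <math.h>

/* Nonzero if a and b cannot be ordered, i.e. at least one is a NaN.  */
int
cmp_unordered (double a, double b)
{
  return isunordered (a, b);
}

/* Quiet "less than": evaluates to 0 when either operand is a NaN,
   without raising the invalid floating-point exception.  */
int
cmp_less_quiet (double a, double b)
{
  return isless (a, b);
}
```

For example, cmp_unordered (NAN, 1.0) is nonzero, while cmp_less_quiet (NAN,
2.0) is zero.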

Currently OpenRISC does not support sf/df or df/sf conversions, but this has
also just been added in architecture specification 1.3.

gcc/ChangeLog:

* config.gcc (or1k*-*-*): Add mhard-float, mdouble-float, msoft-float
and munordered-float validations.
* config/or1k/constraints.md (d): New register constraint.
* config/or1k/predicates.md (fp_comparison_operator): New.
* config/or1k/or1k.c (or1k_print_operand): Add support for printing 'd'
operands.
(or1k_expand_compare): Normalize unordered comparisons.
* config/or1k/or1k.h (reg_class): Define DOUBLE_REGS.
(REG_CLASS_NAMES): Add "DOUBLE_REGS".
(REG_CLASS_CONTENTS): Add contents for DOUBLE_REGS.
* config/or1k/or1k.md (type): Add fpu.
(fpu): New instruction reservation.
(F, f, fr, fi, FI, FOP, fop): New.
(3): New ALU instruction definition.
(float2): New conversion instruction definition.
(fix_trunc2): New conversion instruction definition.
(fpcmpcc): New code iterator.
(*sf_fp_insn): New instruction definition.
(cstore4): New expand definition.
(cbranch4): New expand definition.
* config/or1k/or1k.opt (msoft-float, mhard-float, mdouble-float,
munordered-float): New options.
* doc/invoke.texi: Document msoft-float, mhard-float, mdouble-float and
munordered-float.
---
Changes since v2:
 - Fix wrong order in double reg mask.

 gcc/config.gcc |   1 +
 gcc/config/or1k/constraints.md |   4 ++
 gcc/config/or1k/or1k.c |  38 ++-
 gcc/config/or1k/or1k.h |   3 +
 gcc/config/or1k/or1k.md| 111 -
 gcc/config/or1k/or1k.opt   |  22 +++
 gcc/config/or1k/predicates.md  |   5 ++
 gcc/doc/invoke.texi|  21 +++
 8 files changed, 201 insertions(+), 4 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index aeab8b4544e..1678109131f 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -2579,6 +2579,7 @@ or1k*-*-*)
case ${or1k_multilib} in
mcmov | msext | msfimm | \
mror | mrori | \
+   mhard-float | mdouble-float | munordered-float | msoft-float | \
mhard-div | mhard-mul | \
msoft-div | msoft-mul )

TM_MULTILIB_CONFIG="${TM_MULTILIB_CONFIG},${or1k_multilib}"
diff --git a/gcc/config/or1k/constraints.md b/gcc/config/or1k/constraints.md
index 93da8c058c6..8cac7eb5329 100644
--- a/gcc/config/or1k/constraints.md
+++ b/gcc/config/or1k/constraints.md
@@ -24,6 +24,7 @@
 
 ; We use:
 ;  c - sibcall registers
+;  d - double pair base registers (excludes r0, r30 and r31 which overflow)
 ;  I - constant signed 16-bit
 ;  K - constant unsigned 16-bit
 ;  M - constant signed 16-bit shifted left 16-bits (l.movhi)
@@ -32,6 +33,9 @@
 (define_register_constraint "c" "SIBCALL_REGS"
   "Registers which can hold a sibling call address")
 
+(define_register_constraint "d" "DOUBLE_REGS"
+  "Registers which can be used for double reg pairs.")
+
 ;; Immediates
 (define_constraint "I"
   "A signed 16-bit immediate in the range -32768 to 32767."
diff --git a/gcc/config/or1k/or1k.c b/gcc/config/or1k/or1k.c
index 54c9e804ea5..1eea84f47e0 100644
--- a/gcc/config/or1k/or1k.c
+++ b/gcc/config/or1k/or1k.c
@@ -1226,6 +1226,19 @@ or1k_print_operand (FILE *file, rtx x, int code)
output_operand_lossage ("invalid %%H value");
   break;
 
+case 'd':
+  if (REG_P (x))
+   {
+ if (GET_MODE (x) == DFmode || GET_MODE (x) == DImode)
+   fprintf (file, "%s,%s", reg_names[REGNO (operand)],
+   reg_names[REGNO (operand) + 1]);
+ else
+   fprintf (file, "%s", reg_names[REGNO (operand)]);
+   }
+  else
+   output_operand_lossage ("invalid %%d value");
+  break;
+
 case 'h':
   print_reloc (file, x, 0, RKIND_HI);
   break;
@@ -1435,21 +1448,42 @@ void
 or1k_expand_compare (rtx *operands)
 {
   rtx sr_f = gen_rtx_REG (BImode, SR_F_REGNUM);
+  rtx_code cmp_code = GET_CODE (operands[0]);
+  bool flag_check_ne = true;
 
   /* The RTL may receive an immediate in argument 1 of the compare, this is not
  supported unless we have l.sf*i instructions, force them into registers.  */
   if (!TARGET_SFIMM)
 XEXP (operands[0], 1) = force_reg (SImode, XEXP (operands[0], 1));
 
+  /* Normalize comparison operators to ones OpenRISC support.  */
+  switch (c

[PATCH v3 1/5] or1k: Fix code quality for volatile memory loads

2019-07-09 Thread Stafford Horne
Volatile memory does not match the memory_operand predicate.  This
causes extra extend/mask instructions when reading
from volatile memory.  On OpenRISC loading volatile memory can be
treated the same as regular memory loads which supports combined
sign/zero extends.  Fixing this eliminates the need for extra
extend/mask instructions.
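
A minimal C sketch of the kind of access affected is shown below.  It is not
one of the new tests, and the variable and function names are hypothetical.
The expectation, assuming the extending load forms (such as l.lbz and l.lhs)
are selected, is that each function needs only the load and no separate mask
or extend instruction afterwards.

```c
#include <stdint.h>

/* Volatile objects narrower than a register.  */
static volatile uint8_t g_flag = 1;
static volatile int16_t g_delta = -2;

/* Zero-extending read of a volatile byte.  */
uint32_t
load_zext (void)
{
  return g_flag;
}

/* Sign-extending read of a volatile half-word.  */
int32_t
load_sext (void)
{
  return g_delta;
}
```

Inspecting the assembly for these two functions before and after the patch is
a quick way to see whether the extra instruction has gone away.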

This also adds a test provided by Richard Selvaggi which uncovered the
issue while we were looking into another issue.

gcc/ChangeLog:

PR target/90363
* config/or1k/or1k.md (zero_extendsi2): Update predicate.
(extendsi2): Update predicate.
* gcc/config/or1k/predicates.md (volatile_mem_operand): New.
(reg_or_mem_operand): New.

gcc/testsuite/ChangeLog:

PR target/90363
* gcc.target/or1k/swap-1.c: New test.
* gcc.target/or1k/swap-2.c: New test.
---
Changes since v2:
 - Fix comment format issue, pointed out by Segher

 gcc/config/or1k/or1k.md|  6 +--
 gcc/config/or1k/predicates.md  | 18 +++
 gcc/testsuite/gcc.target/or1k/swap-1.c | 70 ++
 gcc/testsuite/gcc.target/or1k/swap-2.c | 47 +
 4 files changed, 138 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/or1k/swap-1.c
 create mode 100644 gcc/testsuite/gcc.target/or1k/swap-2.c

diff --git a/gcc/config/or1k/or1k.md b/gcc/config/or1k/or1k.md
index 2dad51cd46b..757d899c442 100644
--- a/gcc/config/or1k/or1k.md
+++ b/gcc/config/or1k/or1k.md
@@ -328,11 +328,11 @@
 ;; Sign Extending
 ;; -
 
-;; Zero extension can always be done with AND and an extending load.
+;; Zero extension can always be done with AND or an extending load.
 
 (define_insn "zero_extendsi2"
   [(set (match_operand:SI 0 "register_operand" "=r,r")
-   (zero_extend:SI (match_operand:I12 1 "nonimmediate_operand" "r,m")))]
+   (zero_extend:SI (match_operand:I12 1 "reg_or_mem_operand" "r,m")))]
   ""
   "@
l.andi\t%0, %1, 
@@ -344,7 +344,7 @@
 
 (define_insn "extendsi2"
   [(set (match_operand:SI 0 "register_operand"  "=r,r")
-   (sign_extend:SI (match_operand:I12 1 "nonimmediate_operand"  "r,m")))]
+   (sign_extend:SI (match_operand:I12 1 "reg_or_mem_operand"  "r,m")))]
   "TARGET_SEXT"
   "@
l.exts\t%0, %1
diff --git a/gcc/config/or1k/predicates.md b/gcc/config/or1k/predicates.md
index 879236bca49..dad1c5d4be3 100644
--- a/gcc/config/or1k/predicates.md
+++ b/gcc/config/or1k/predicates.md
@@ -82,3 +82,21 @@
 
 (define_predicate "equality_comparison_operator"
   (match_code "ne,eq"))
+
+;; Borrowed from rs6000
+;; Return true if the operand is in volatile memory.  Note that during the
+;; RTL generation phase, memory_operand does not return TRUE for volatile
+;; memory references.  So this function allows us to recognize volatile
+;; references where it's safe.
+(define_predicate "volatile_mem_operand"
+  (and (match_code "mem")
+   (match_test "MEM_VOLATILE_P (op)")
+   (if_then_else (match_test "reload_completed")
+(match_operand 0 "memory_operand")
+(match_test "memory_address_p (mode, XEXP (op, 0))"
+
+;; Return true if the operand is a register or memory; including volatile
+;; memory.
+(define_predicate "reg_or_mem_operand"
+  (ior (match_operand 0 "nonimmediate_operand")
+   (match_operand 0 "volatile_mem_operand")))
diff --git a/gcc/testsuite/gcc.target/or1k/swap-1.c b/gcc/testsuite/gcc.target/or1k/swap-1.c
new file mode 100644
index 000..4c179d1e430
--- /dev/null
+++ b/gcc/testsuite/gcc.target/or1k/swap-1.c
@@ -0,0 +1,70 @@
+/* { dg-do run } */
+/* { dg-options "-Os -mhard-mul -msoft-div -msoft-float" } */
+
+/* Notes:
+
+   This test failed on or1k GCC 7.2.0, and passes on or1k GCC 5.3.0
+   as well as the or1k port released in GCC 9.1.
+
+   The main program is organized as a loop structure so gcc does not
+   optimize-away the calls to swap_1().  Compiling with -O2 is still smart
+   enough to optimize-away the calls, but using -Os does not.
+   The bad code is only generated when compiled with -Os.
+
+   When the bad code is generated all code is okay except for the very last
+   instruction (a 'l.addc' in the l.jr delay slot).
+   Up to that point in execution, r11 and r12 contain the correct (expected)
+   values, but the execution of the final "l.addc" corrupts r11.
+
+   This test is added to ensure this does not come back.  */
+
+#include <stdint.h>
+
+volatile static uint8_t g_doswap = 1;
+
+uint64_t swap_1 (uint64_t u64) {
+  uint32_t u64_lo, u64_hi, u64_tmp;
+
+  u64_lo = u64 & 0x;
+  u64_hi = u64 >> 32;
+
+  if (g_doswap)
+{
+  u64_tmp = u64_lo;
+  u64_lo  = u64_hi;
+  u64_hi  = u64_tmp;
+}
+
+  u64 = u64_lo;
+  u64 += ((uint64_t) u64_hi << 32);
+
+  return u64;
+}
+
+int main () {
+  int ret;
+  int iter;
+  uint64_t  aa[2];   // inputs to swap function
+  uint64_t  ee[2];   // expected outp

[PATCH v3 0/5] OpenRISC updates for 10 (fpu, fixes)

2019-07-09 Thread Stafford Horne
Hello,

New since v2:
 - Fix comment formatting pointed out by Segher in volatile patch
 - Fix issue and add test for rotrsi3 options pointed out by Segher
 - Fix issue with reg mask for doubles being backwards, pointed out by Segher
   and Richard.

New since v1:
 - Changed 64-bit FPU operations to use explicit register pairs as per spec
   revision suggested by Richard Henderson.
 - Added patch for new -mrori option
 - Added patch for msoft-div fix from other series (no changes)
 - Fixed volatile spelling pointed out by Bernhard Reutner-Fischer

This is a set of patches to bring FPU support to the OpenRISC backend.  The
backend also adds support for 64-bit floating point operations on 32-bit cores
using register pairs, see orfpx64a32 [0].

The binutils patches are already upstream.

The toolchain has been tested using the gcc and binutils testsuites as well as
floating point test suites running on sim and an fpga soft core, or1k_marocchino
[1].

I have also included a few fixes to PRs:

 - 90362 or1k: Soft divide does not work correctly
 - 90363 or1k: Extra mask insn after load from memory

This whole patch series can be found on my github repo [2] as well.

If all is OK, I plan to commit these to master (gcc 10).  Then I will backport
the PR fixes to the GCC 9 branch; I will ask for guidance when I start the
backporting.

-Stafford

[0] https://openrisc.io/proposals/orfpx64a32
[1] https://github.com/openrisc/or1k_marocchino
[2] g...@github.com:stffrdhrn/gcc.git or1k-fpu-3



Stafford Horne (5):
  or1k: Fix code quality for volatile memory loads
  or1k: Fix issues with msoft-div
  or1k: Add mrori option, fix option docs
  or1k: Initial support for FPU
  or1k: only force reg for immediates

 gcc/config.gcc|   2 +
 gcc/config/or1k/constraints.md|   4 +
 gcc/config/or1k/elf.opt   |   6 +-
 gcc/config/or1k/or1k.c|  50 +++--
 gcc/config/or1k/or1k.h|   3 +
 gcc/config/or1k/or1k.md   | 131 --
 gcc/config/or1k/or1k.opt  |  78 +
 gcc/config/or1k/predicates.md |  30 +
 gcc/doc/invoke.texi   |  77 -
 gcc/testsuite/gcc.target/or1k/div-mul-3.c |  31 +
 gcc/testsuite/gcc.target/or1k/ror-4.c |   8 ++
 gcc/testsuite/gcc.target/or1k/ror-5.c |   9 ++
 gcc/testsuite/gcc.target/or1k/shftimm-1.c |   8 +-
 gcc/testsuite/gcc.target/or1k/swap-1.c|  70 
 gcc/testsuite/gcc.target/or1k/swap-2.c|  47 
 libgcc/config/or1k/lib1funcs.S|   6 +-
 16 files changed, 484 insertions(+), 76 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/or1k/div-mul-3.c
 create mode 100644 gcc/testsuite/gcc.target/or1k/ror-4.c
 create mode 100644 gcc/testsuite/gcc.target/or1k/ror-5.c
 create mode 100644 gcc/testsuite/gcc.target/or1k/swap-1.c
 create mode 100644 gcc/testsuite/gcc.target/or1k/swap-2.c

-- 
2.21.0



Re: [PATCH] Deprecate -frepo option.

2019-07-09 Thread Martin Liška
On 7/9/19 1:41 PM, Nathan Sidwell wrote:
> On 7/9/19 6:39 AM, Richard Biener wrote:
>> On Mon, Jul 8, 2019 at 2:04 PM Martin Liška  wrote:
>>>
> 
>>>
>>> Same happens also for GCC7. It does 17 iterations (#define MAX_ITERATIONS 17) and
>>> apparently 17 is not enough to resolve all symbols. And it's really slow.
>>
>> Ouch.
> 
> hm, 17 is a magic number.  in C++98 it was the maximum depth of template 
> instantiations that implementations needed to support.  Portable code could 
> not expect more.  So the worst case -frepo behaviour would be 17 iterations.
> 
> That's not true any more, it's been 1024 since C++11.
> 
> Has a bug been filed about this frepo problem? 

I created a new one:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91125

> If not, it suggests those using frepo are not compiling modern C++.
> 
>>> That said, I would recommend to remove it :)
>>
>> In the end it's up to the C++ FE maintainers but the above clearly
>> doesn't look promising
>> (not sure if it keeps re-compiling _all_ repo-triggered templates or
>> just incrementally adds
>> them to new object files).
> 
>> I'm not opposed to removing -frepo from GCC 10 but then I would start
>> noting it is obsolete
>> on the GCC 9 branch at least.
> 
> I concur.  frepo's serial reinvocation of the compiler is not compatible with 
> modern C++ code bases.

Great. Then I'm sending a patch that removes the functionality.

Ready to be installed after proper testing & bootstrap?

Martin

> 
> nathan
> 

From 06a298c3381c204b6ed6cf97b05940ebb8abcbde Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Tue, 9 Jul 2019 14:45:05 +0200
Subject: [PATCH] Remove support for repo files (PR c++/91125).

gcc/ChangeLog:

2019-07-09  Martin Liska  

	PR c++/91125
	* Makefile.in: Remove tlink.o.
	* collect2.c (do_link): New function isolated
	from do_tlink.
	(main): Use.
	* collect2.h (do_tlink): Remove declaration of do_tlink.
	* doc/extend.texi: Remove documentation of -frepo.
	* doc/invoke.texi: Likewise.
	* doc/sourcebuild.texi: Remove cleanup-repo-files.
	* tlink.c: Remove.

gcc/c-family/ChangeLog:

2019-07-09  Martin Liska  

	PR c++/91125
	* c-common.c: Remove definition of flag_use_repository.
	* c-common.h: Likewise.
	* c-opts.c (c_common_handle_option):
	Do not handle OPT_frepo option.
	* c.opt: Mark the option with Deprecated.

gcc/cp/ChangeLog:

2019-07-09  Martin Liska  

	PR c++/91125
	* Make-lang.in: Remove repo.o.
	* config-lang.in: Likewise.
	* cp-tree.h (init_repo): Remove declarations
	of repo-related functions.
	(repo_emit_p): Likewise.
	(repo_export_class_p): Likewise.
	(finish_repo): Likewise.
	* decl2.c (import_export_class): Always
	set -1 value.
	(mark_needed): Remove -frepo from comment.
	(import_export_decl): Similarly here.
	(c_parse_final_cleanups): Remove call of finish_repo.
	* lex.c (cxx_init): Remove call to init_repo.
	* optimize.c (can_alias_cdtor): Remove dead condition.
	* pt.c (push_template_decl_real): Update comment.
	(instantiate_decl): Remove dead code used for -frepo.
	* repo.c: Remove.

gcc/testsuite/ChangeLog:

2019-07-09  Martin Liska  

	PR c++/91125
	* g++.dg/parse/repo1.C: Remove.
	* g++.dg/rtti/repo1.C: Remove.
	* g++.dg/template/repo1.C: Remove.
	* g++.dg/template/repo10.C: Remove.
	* g++.dg/template/repo11.C: Remove.
	* g++.dg/template/repo2.C: Remove.
	* g++.dg/template/repo3.C: Remove.
	* g++.dg/template/repo4.C: Remove.
	* g++.dg/template/repo5.C: Remove.
	* g++.dg/template/repo6.C: Remove.
	* g++.dg/template/repo7.C: Remove.
	* g++.dg/template/repo8.C: Remove.
	* g++.dg/template/repo9.C: Remove.
	* g++.old-deja/g++.pt/instantiate4.C: Remove.
	* g++.old-deja/g++.pt/instantiate6.C: Remove.
	* g++.old-deja/g++.pt/repo1.C: Remove.
	* g++.old-deja/g++.pt/repo2.C: Remove.
	* g++.old-deja/g++.pt/repo3.C: Remove.
	* g++.old-deja/g++.pt/repo4.C: Remove.
	* lib/g++.exp: Remove removal of repo files.
	* lib/gcc-dg.exp: Likewise.
	* lib/obj-c++.exp: Likewise.
---
 gcc/Makefile.in   |   2 +-
 gcc/c-family/c-common.c   |   5 -
 gcc/c-family/c-common.h   |   5 -
 gcc/c-family/c-opts.c |   6 -
 gcc/c-family/c.opt|   4 +-
 gcc/collect2.c|  36 +-
 gcc/collect2.h|   4 +-
 gcc/cp/Make-lang.in   |   2 +-
 gcc/cp/config-lang.in |   2 +-
 gcc/cp/cp-tree.h  |   6 -
 gcc/cp/decl2.c|  37 +-
 gcc/cp/lex.c  |   2 -
 gcc/cp/optimize.c |   3 -
 gcc/cp/pt.c   |  18 +-
 gcc/cp/repo.c | 374 
 gcc/doc/extend.texi   |  25 -
 gcc/doc/invoke.texi   |   8 +-
 gcc/doc/sourcebuild.texi  |   3 -
 gcc/testsuite/g++.dg/parse/repo1.C|  10 -
 gcc/tes

Re: [patch 1/2][aarch64]: redefine aes patterns

2019-07-09 Thread Kyrill Tkachov

Hi Sylvia,

On 7/8/19 5:59 PM, Sylvia Taylor wrote:

Hi James,

I forgot to mention that. Yes, please do commit it on my behalf.


I've committed this on your behalf with r273304.

Thanks,

Kyrill



Cheers,
Syl


Re: Make nonoverlapping_component_refs work with duplicated main variants

2019-07-09 Thread Richard Biener
On Tue, 9 Jul 2019, Jan Hubicka wrote:

> > For consistency yes I guess but IIRC they cannot really appear in 
> > FIELD_DECLs.
> 
> OK, I thought that if I put SVE into structures, we may end up with
> these.
> > > +  /* Different fields of the same record type cannot overlap.
> > > +  ??? Bitfields can overlap at RTL level so punt on them.  */
> > > +  if (DECL_BIT_FIELD (field1) && DECL_BIT_FIELD (field2))
> > > + return 0;
> > > +
> > 
> > don't you need the DECL_BIT_FIELD_REPRESENTATIVE check here as well?
> > I'd do
> > 
> > if (DECL_BIT_FIELD_REPRESENTATIVE (field1))
> >   field1 = DECL_BIT_FIELD_REPRESENTATIVE (field1);
> > if (DECL_BIT_FIELD_REPRESENTATIVE (field2))
> >   field2 = DECL_BIT_FIELD_REPRESENTATIVE (field2);
> > 
> > thus use the representative for the overlap check.  It might
> > be the case that we can improve here and if we do this
> > can do the DECL_BIT_FIELD check after this (hoping the
> > representative doesn't have it set).
> 
> OK.
> > 
> > > +  if (tree_int_cst_equal (DECL_FIELD_OFFSET (field1),
> > > +   DECL_FIELD_OFFSET (field2))
> > > +   && tree_int_cst_equal (DECL_FIELD_BIT_OFFSET (field1),
> > > +  DECL_FIELD_BIT_OFFSET (field2)))
> > > + return 0;
> > 
> > In gimple_compare_field_offset this was fast-pathed for
> > DECL_OFFSET_ALIGN (f1) == DECL_OFFSET_ALIGN (f2) so I suggest to
> > do that here as well.  Note that DECL_FIELD_OFFSET can be
> > a non-constant which means you cannot use tree_int_cst_equal
> > unconditionally here but you have to use operand_equal_p.
> 
> tree_int_cst_equal will return false if offsets are not INTEGER_CST.
> I was not sure if I can safely use operand_equal_p.  What happens for
> fields with variable offsets when I inline two copies of same function
> which takes size as parameter and make the size different? Will I get
> here proper SSA name so operand_equal_p will work?

No, you get a DECL, but yes, I think operand_equal_p will work.
Consider two _same_ variable sizes, you'll not see that you
have to return zero then?  But yes, in case you have types
globbed to the canonical type (but not FIELD_DECLs) then
you'll get false !operand_equal_p as well.

The question is really what is desired here.  If you want/need precision
for non-constant offsets then you have to look at the COMPONENT_REF
trees because the relevant offset (SSA name) is only there
(in TREE_OPERAND (component_ref, 2)).

If you want to give up for non-constants and can do that without
correctness issue then fine (but Ada probably would like to have
it - so also never forget to include Ada in testing here ;))

> If so, I still see no point for fast-path for DECL_OFFSET_ALIGN. In many
> cases BIT_OFFSET will be just 0, so even if offset alignments are
> different we are likely going to hit this fast path avoiding parsing
> trees later.

Ok.

> > 
> > > +  /* Note that it may be possible to use component_ref_field_offset
> > > +  which would provide offsets as trees. However constructing and folding
> > > +  trees is expensive and does not seem to be worth the compile time
> > > +  cost.  */
> > > +
> > > +  poly_uint64 offset1, offset2;
> > > +  poly_uint64 bit_offset1, bit_offset2;
> > > +  poly_uint64 size1, size2;
> > 
> > I think you need poly_offset_int here since you convert to bits below.
> > 
> > The gimple_compare_field_offset checking way looks cheaper btw, so
> > I wonder why you don't simply call it but replicate things here?
> > When do we expect to have partially overlapping field decls?  Even
> > when considering canonical type merging?
> 
> Because the types I am comparing may not have same canonical types.
> 
> nonoverlapping_component_refs_since_match_p is called when we prove that
> base pointers are the same (even with -fno-strict-aliasing).  In such
> cases the access paths may be based on completely different types. The
> point of nonoverlapping_component_refs_since_match_p is to match them as
> far as possible when they are semantically equivalent in hope to get
> non-overlapping refs in the last step.

Oh, OK ... a bit more explaining commentary might be nice
(at the top of the function - basically what the input
constraints to the FIELD_DECLs are).

Btw, the offsets in FIELD_DECLs are relative to DECL_CONTEXT so
comparing when DECL_CONTEXT are not related at all doesn't make
any sense.  Well, unless we know _those_ are at the same offset,
so - the constraint for the FIELD_DECLs we compare is that
the containing structure type object instances live at the same
address?

Richard.


Re: Make nonoverlapping_component_refs work with duplicated main variants

2019-07-09 Thread Jan Hubicka
> For consistency yes I guess but IIRC they cannot really appear in 
> FIELD_DECLs.

OK, I thought that if I put SVE into structures, we may end up with
these.
> > +  /* Different fields of the same record type cannot overlap.
> > +??? Bitfields can overlap at RTL level so punt on them.  */
> > +  if (DECL_BIT_FIELD (field1) && DECL_BIT_FIELD (field2))
> > +   return 0;
> > +
> 
> don't you need the DECL_BIT_FIELD_REPRESENTATIVE check here as well?
> I'd do
> 
> if (DECL_BIT_FIELD_REPRESENTATIVE (field1))
>   field1 = DECL_BIT_FIELD_REPRESENTATIVE (field1);
> if (DECL_BIT_FIELD_REPRESENTATIVE (field2))
>   field2 = DECL_BIT_FIELD_REPRESENTATIVE (field2);
> 
> thus use the representative for the overlap check.  It might
> be the case that we can improve here and if we do this
> can do the DECL_BIT_FIELD check after this (hoping the
> representative doesn't have it set).

OK.
> 
> > +  if (tree_int_cst_equal (DECL_FIELD_OFFSET (field1),
> > + DECL_FIELD_OFFSET (field2))
> > + && tree_int_cst_equal (DECL_FIELD_BIT_OFFSET (field1),
> > +DECL_FIELD_BIT_OFFSET (field2)))
> > +   return 0;
> 
> In gimple_compare_field_offset this was fast-pathed for
> DECL_OFFSET_ALIGN (f1) == DECL_OFFSET_ALIGN (f2) so I suggest to
> do that here as well.  Note that DECL_FIELD_OFFSET can be
> a non-constant which means you cannot use tree_int_cst_equal
> unconditionally here but you have to use operand_equal_p.

tree_int_cst_equal will return false if offsets are not INTEGER_CST.
I was not sure if I can safely use operand_equal_p.  What happens for
fields with variable offsets when I inline two copies of same function
which takes size as parameter and make the size different? Will I get
here proper SSA name so operand_equal_p will work?

If so, I still see no point for fast-path for DECL_OFFSET_ALIGN. In many
cases BIT_OFFSET will be just 0, so even if offset alignments are
different we are likely going to hit this fast path avoiding parsing
trees later.
> 
> > +  /* Note that it may be possible to use component_ref_field_offset
> > +which would provide offsets as trees. However constructing and folding
> > +trees is expensive and does not seem to be worth the compile time
> > +cost.  */
> > +
> > +  poly_uint64 offset1, offset2;
> > +  poly_uint64 bit_offset1, bit_offset2;
> > +  poly_uint64 size1, size2;
> 
> I think you need poly_offset_int here since you convert to bits below.
> 
> The gimple_compare_field_offset checking way looks cheaper btw, so
> I wonder why you don't simply call it but replicate things here?
> When do we expect to have partially overlapping field decls?  Even
> when considering canonical type merging?

Because the types I am comparing may not have same canonical types.

nonoverlapping_component_refs_since_match_p is called when we prove that
base pointers are the same (even with -fno-strict-aliasing).  In such
cases the access paths may be based on completely different types. The
point of nonoverlapping_component_refs_since_match_p is to match them as
far as possible when they are semantically equivalent in hope to get
non-overlapping refs in the last step.

This is stronger than the get_ref_base_and_extent based check in the
presence of non-constant ARRAY_REFs.
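
The underlying overlap test can be illustrated in plain C.  This is only a
sketch of comparing (bit offset, bit size) ranges for fields whose containing
objects start at the same address; struct field_range, ranges_overlap and
FIELD_RANGE are hypothetical names, and GCC itself works on FIELD_DECL trees
and poly_int values rather than on offsetof.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Position of a field inside its containing object, in bits.  */
struct field_range
{
  uint64_t bit_offset;
  uint64_t bit_size;
};

/* True if the half-open ranges [offset, offset + size) intersect.  */
bool
ranges_overlap (struct field_range a, struct field_range b)
{
  return a.bit_offset < b.bit_offset + b.bit_size
         && b.bit_offset < a.bit_offset + a.bit_size;
}

/* Two unrelated record types that may be used to access the same address.  */
struct S { int a; int b; char c[8]; };
struct T { int x; short y; short z; char w[8]; };

/* Build the range of MEMBER within TYPE from offsetof and sizeof.  */
#define FIELD_RANGE(type, member)                         \
  ((struct field_range) { offsetof (type, member) * 8u,   \
                          sizeof (((type *) 0)->member) * 8u })
```

On an ABI with 4-byte int and 2-byte short, FIELD_RANGE (struct S, b) and
FIELD_RANGE (struct T, z) overlap, while FIELD_RANGE (struct S, a) and
FIELD_RANGE (struct T, y) do not; the comparison is only meaningful because
both records are assumed to start at the same address.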

Honza


Re: [PATCH, Modula-2 (C/C++/D/F/Go/Jit)] (Register spec fn) (v2)

2019-07-09 Thread Rainer Orth
Hi Matthias,

> I had a look at the GCC 9 version of the patches, with a build including a
> make install. Some comments:
>
>  - A parallel build (at least with -j4) isn't working. A sequential
>    build works fine.  I think forcing a sequential build will not
>    work well, increasing the build time too much.

absolutely: I'd go as far as claiming that this is the number one
priority.  Otherwise build and test times are just too long for all but
the most dedicated testers, and forcing a sequential build would be a
showstopper for trunk integration.

The same holds for the current requirement of a non-bootstrap build.  At
least that's what I saw initially: it may be that it works sequentially,
but haven't tried since the build time was way too long already.

>  - libgm2 multilib builds are not working.  //32/libgm2
>is configured, but not built.

True, but the fix is a simple one-liner:

--- ../../../m2/dist/gcc-versionno/libgm2/Makefile.am	2019-06-06 15:17:19.634469354 +
+++ libgm2/Makefile.am	2019-07-09 00:41:23.214142811 +
@@ -97,3 +97,5 @@
 
 # Subdir rules rely on $(FLAGS_TO_PASS)
 FLAGS_TO_PASS = $(AM_MAKEFLAGS)
+
+include $(top_srcdir)/../multilib.am

This allowed me to build both 32 and 64-bit gm2 libs on
i386-pc-solaris2.11 and get the testresults I reported earlier, which
are identical for -m32 and -m64.

Here are a couple of other issues I saw:

* There are many many warnings during the build in the gcc/gm2 code.

* The mc output is far too verbose right now: this isn't of interest to
  anyone but gm2 developers ;-)

* Running make check-gm2 in gcc produces gm2 testsuite output directly
  in gcc/testsuite.  This needs to go into a testsuite/gm2 subdir (or
  gm2 once the testsuite is parallelized: it is far too large to only
  run sequentially).

* Many tests FAIL like this:

ESC[01mESC[Kxgm2:ESC[mESC[K ESC[01;31mESC[Kfatal error: ESC[mESC[Kcannot execute ‘ESC[01mESC[Kgm2lESC[mESC[K’: execvp: No such file or directory
compilation terminated.
compiler exited with status 1
FAIL: gm2/calling-c/datatypes/unbounded/run/pass/m.mod compilation,  -g 

  For one, I didn't have gm2l anywhere in my tree.  Besides, the tests
  absolutely need to be run with -fno-diagnostics-show-caret
  -fno-diagnostics-show-line-numbers -fdiagnostics-color=never

  This problem seems to account for the vast majority of failing tests
  right now:

   6820 xgm2: fatal error: cannot execute ‘gm2l’: execvp: No such file or directory
      6 xgm2: fatal error: no input files

  gm2l and a couple of other tools are built by gm2/Make-lang.in's
  gm2.all.build rule, which seems not to be referenced anywhere.
  Even after manually building them, they stay in stage1/gm2 and need a
  make gm2l to be copied into gcc/gm2.  This all needs to work without
  such manual steps or without installing gm2 first.

* There are a couple of broken testcase names in gm2.sum, e.g.

PASS: 
/vol/gcc/src/hg/trunk/solaris/gcc/testsuite/gm2/pim/options/optimize/run/pass/addition.mod
 compilation, -g 
{compiler=/var/gcc/gcc-10.0.0-20190708/11.5-gcc-gas-gm2-no-bootstrap-j1/gcc/xgm2
 -B/var/gcc/gcc-10.0.0-20190708/11.5-gcc-gas-gm2-no-bootstrap-j1/gcc 
-I/var/gcc/gcc-10.0.0-20190708/11.5-gcc-gas-gm2-no-bootstrap-j1/i386-pc-solaris2.11/./libgm2/libpim:/vol/gcc/src/hg/trunk/solaris/gcc/testsuite/../gm2/gm2-libs
 
-I/var/gcc/gcc-10.0.0-20190708/11.5-gcc-gas-gm2-no-bootstrap-j1/i386-pc-solaris2.11/./libgm2/libiso:/vol/gcc/src/hg/trunk/solaris/gcc/testsuite/../gm2/gm2-libs-iso
 
-I/vol/gcc/src/hg/trunk/solaris/gcc/testsuite/gm2/pim/options/optimize/run/pass 
-fpim 
-L/var/gcc/gcc-10.0.0-20190708/11.5-gcc-gas-gm2-no-bootstrap-j1/i386-pc-solaris2.11/./libgm2/libpim/.libs
 
-L/var/gcc/gcc-10.0.0-20190708/11.5-gcc-gas-gm2-no-bootstrap-j1/i386-pc-solaris2.11/./libgm2/libiso/.libs}

  Names are required to be unique, and must not contain absolute
  pathnames to allow for comparing different test results.  All this
  stuff in braces above should go.

With the missing gm2l worked around as above, my i386-pc-solaris2.11
testresults are way better now:

=== gm2 Summary for unix ===

# of expected passes		11186
# of unexpected failures	24
# of unresolved testcases	12

=== gm2 Summary for unix/-m64 ===

# of expected passes		10976
# of unexpected failures	156
# of unresolved testcases	90

=== gm2 Summary ===

# of expected passes		22162
# of unexpected failures	180
# of unresolved testcases	102

However, you may want to reconsider if really the whole gm2 testsuite
needs to be torture-tested, i.e. run at -g/-O/-O -g/-Os/-O3
-fomit-frame-pointer/-O3 -fomit-frame-pointer -finline-functions.  This
seems pretty excessive to me.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH, RISC-V] Fix ambiguous mode of some compare insn

2019-07-09 Thread Katsuhiro Suzuki

Hello Jim,

On 2019/07/08 13:41, Jim Wilson wrote:

On Sat, Jul 6, 2019 at 1:26 AM Katsuhiro Suzuki  wrote:

This patch fixes ambiguous mode of some compare insns of RISC-V.
Only sge, slt and sle are using  but other compare insns use
. It seems first group mode settings are ambiguous.


Richard Sandiford submitted a patch that fixes every port, and has the
same RISC-V specific changes plus a few more.  I approved Richard's
patch.

Jim



I understand, thanks!

Best Regards,
Katsuhiro Suzuki


[PATCH] Properly valueize according to availability in all cases

2019-07-09 Thread Richard Biener


We're relying on this for correctness; placing an assert here triggers
easily, so we're not handling things fully correctly in all paths
leading up to here.  So simply valueize appropriately in
vn_nary_build_or_lookup_1, which should make it possible to use
value-numbers where that is more natural.

I've also taken the liberty to introduce a nicer API to
gimple_resimplifyN now that we indirect through gimple_match_op
anyways.
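
To show the shape of the API change outside of GCC, here is a toy sketch
of the same pattern: arity-specific helpers become file-local and a single
member function dispatches on the operand count.  toy_match_op and the
toy_resimplifyN helpers are made-up names working on plain ints, not the
real gimple_match_op, and the default case simply returns false where the
patch uses gcc_unreachable:

```cpp
// Toy stand-in for gimple_match_op: a fixed-capacity list of int operands.
struct toy_match_op
{
  static const unsigned int MAX_NUM_OPS = 2;
  unsigned int num_ops;
  int ops[MAX_NUM_OPS];

  // Single entry point; callers no longer pick the arity-specific helper.
  bool resimplify ();
};

// File-local helpers, one per operand count.
static bool
toy_resimplify1 (toy_match_op *)
{
  return false;  // nothing to canonicalize for a single operand
}

static bool
toy_resimplify2 (toy_match_op *op)
{
  // Canonicalize operand order and report whether anything changed.
  if (op->ops[0] <= op->ops[1])
    return false;
  int tmp = op->ops[0];
  op->ops[0] = op->ops[1];
  op->ops[1] = tmp;
  return true;
}

bool
toy_match_op::resimplify ()
{
  switch (num_ops)
    {
    case 1:
      return toy_resimplify1 (this);
    case 2:
      return toy_resimplify2 (this);
    default:
      return false;
    }
}
```

A caller then writes op.resimplify () regardless of how many operands op
currently holds.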

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2019-07-09  Richard Biener  

* gimple-match.h (gimple_match_op::resimplify): New.
(gimple_resimplify1, gimple_resimplify2, gimple_resimplify3,
gimple_resimplify4, gimple_resimplify5): Remove.
* gimple-match-head.c (gimple_resimplify1, gimple_resimplify2,
gimple_resimplify3, gimple_resimplify4, gimple_resimplify5):
Make static.
(gimple_match_op::resimplify): New.
* tree-ssa-sccvn.c (vn_nary_build_or_lookup_1): Valueize
according to availability.  Use gimple_match_op::resimplify. 

Index: gcc/gimple-match-head.c
===
--- gcc/gimple-match-head.c (revision 273294)
+++ gcc/gimple-match-head.c (working copy)
@@ -57,6 +57,16 @@ static bool gimple_simplify (gimple_matc
 code_helper, tree, tree, tree, tree, tree);
 static bool gimple_simplify (gimple_match_op *, gimple_seq *, tree (*)(tree),
 code_helper, tree, tree, tree, tree, tree, tree);
+static bool gimple_resimplify1 (gimple_seq *, gimple_match_op *,
+   tree (*)(tree));
+static bool gimple_resimplify2 (gimple_seq *, gimple_match_op *,
+   tree (*)(tree));
+static bool gimple_resimplify3 (gimple_seq *, gimple_match_op *,
+   tree (*)(tree));
+static bool gimple_resimplify4 (gimple_seq *, gimple_match_op *,
+   tree (*)(tree));
+static bool gimple_resimplify5 (gimple_seq *, gimple_match_op *,
+   tree (*)(tree));
 
 const unsigned int gimple_match_op::MAX_NUM_OPS;
 
@@ -173,7 +183,7 @@ maybe_resimplify_conditional_op (gimple_
RES_OP with a simplified and/or canonicalized result and
returns whether any change was made.  */
 
-bool
+static bool
 gimple_resimplify1 (gimple_seq *seq, gimple_match_op *res_op,
tree (*valueize)(tree))
 {
@@ -233,7 +243,7 @@ gimple_resimplify1 (gimple_seq *seq, gim
RES_OP with a simplified and/or canonicalized result and
returns whether any change was made.  */
 
-bool
+static bool
 gimple_resimplify2 (gimple_seq *seq, gimple_match_op *res_op,
tree (*valueize)(tree))
 {
@@ -305,7 +315,7 @@ gimple_resimplify2 (gimple_seq *seq, gim
RES_OP with a simplified and/or canonicalized result and
returns whether any change was made.  */
 
-bool
+static bool
 gimple_resimplify3 (gimple_seq *seq, gimple_match_op *res_op,
tree (*valueize)(tree))
 {
@@ -376,7 +386,7 @@ gimple_resimplify3 (gimple_seq *seq, gim
RES_OP with a simplified and/or canonicalized result and
returns whether any change was made.  */
 
-bool
+static bool
 gimple_resimplify4 (gimple_seq *seq, gimple_match_op *res_op,
tree (*valueize)(tree))
 {
@@ -417,7 +427,7 @@ gimple_resimplify4 (gimple_seq *seq, gim
RES_OP with a simplified and/or canonicalized result and
returns whether any change was made.  */
 
-bool
+static bool
 gimple_resimplify5 (gimple_seq *seq, gimple_match_op *res_op,
tree (*valueize)(tree))
 {
@@ -439,6 +449,30 @@ gimple_resimplify5 (gimple_seq *seq, gim
   return false;
 }
 
+/* Match and simplify the toplevel valueized operation THIS.
+   Replaces THIS with a simplified and/or canonicalized result and
+   returns whether any change was made.  */
+
+bool
+gimple_match_op::resimplify (gimple_seq *seq, tree (*valueize)(tree))
+{
+  switch (num_ops)
+{
+case 1:
+  return gimple_resimplify1 (seq, this, valueize);
+case 2:
+  return gimple_resimplify2 (seq, this, valueize);
+case 3:
+  return gimple_resimplify3 (seq, this, valueize);
+case 4:
+  return gimple_resimplify4 (seq, this, valueize);
+case 5:
+  return gimple_resimplify5 (seq, this, valueize);
+default:
+  gcc_unreachable ();
+}
+}
+
 /* If in GIMPLE the operation described by RES_OP should be single-rhs,
build a GENERIC tree for that expression and update RES_OP accordingly.  */
 
Index: gcc/gimple-match.h
===
--- gcc/gimple-match.h  (revision 273294)
+++ gcc/gimple-match.h  (working copy)
@@ -105,6 +105,8 @@ struct gimple_match_op
 
   tree op_or_null (unsigned int) const;
 
+  bool resimplify (gimple_seq *, tree (*)(tree));
+
   /* The maximum value of NUM_OPS.  */
   static const unsigned int MAX_NUM_OPS = 5;
 
@@ -331,1

Re: Make nonoverlapping_component_refs work with duplicated main variants

2019-07-09 Thread Richard Biener
On Tue, 9 Jul 2019, Jan Hubicka wrote:

> Hi,
> this is an updated patch.  I based the logic on gimple_compare_field_offset but
> dropped the bits about PLACEHOLDER_EXPR since my understanding is that to get the
> comparison done I would then need to compare operand #2 of COMPONENT_REF, which
> I don't.
> 
> I also wrote the range checks using polyint since I believe that most code is
> supposed to be updated this way (shall we update gimple_compare_field_offset?

For consistency yes I guess but IIRC they cannot really appear in 
FIELD_DECLs.

> It is used in canonical type merging and considering fields different may lead
> to wrong code I would say, but I do not know enough about SVE to construct
> testcase).
> 
> I updated documentation of return values which I also find somewhat confusing
> since the 1/0 meanings in nonoverlapping_* are reversed compared to
> aliasing_component_refs.  The main entry is called refs_may_alias so I am
> considering renaming all the functions to *_may_alias and making them return 0
> for disambiguation, 1 for non-disambiguation and use -1 to keep the decision
> whether to work harder.
> 
> However for now I am sticking with the nonoverlapping reversed semantics.
> 
> 
> There are no changes in tramp3d stats (there are no unmerged types)
> Here are stats on cc1plus build:
> 
> Alias oracle query stats:
>   refs_may_alias_p: 43435216 disambiguations, 51989307 queries
>   ref_maybe_used_by_call_p: 60843 disambiguations, 44040532 queries
>   call_may_clobber_ref_p: 6051 disambiguations, 9115 queries
>   nonoverlapping_component_refs_p: 0 disambiguations, 2535 queries
>   nonoverlapping_component_refs_since_match_p: 12297 disambiguations, 40458 must overlaps, 53243 queries
>   aliasing_component_refs_p: 70892 disambiguations, 374789 queries
>   TBAA oracle: 12680127 disambiguations 32179405 queries
>10383370 are in alias set 0
>5747729 queries asked about the same object
>148 queries asked about the same alias set
>0 access volatile
>2482808 are dependent in the DAG
>885223 are aritificially in conflict with void *
> 
> PTA query stats:
>   pt_solution_includes: 562382 disambiguations, 7467931 queries
>   pt_solutions_intersect: 412873 disambiguations, 7818955 queries
> 
> It seems that nonoverlapping_component_refs_since_match_p is pretty good at
> making decisions and there are only a few cases where it returns -1, which is good.
> So I guess teaching it about ARRAY_REFs is the logical next step.
> 
> If I would like to benchmark alias oracle compile time, what is
> reasonable way to do that?
> 
> Bootstrapped/regtested x86_64-linux, OK?
> 
> Honza
> 
>   * tree-ssa-alias.c (nonoverlapping_component_refs_p_1): Break out
>   from ...; work also on duplicated types.
>   (nonoverlapping_component_refs_since_match): ... here
>   (ncr_type_uid): Break out from ...
>   (ncr_compar): ... here; look for TYPE_UID of canonical type if
>   available.
>   (nonoverlapping_component_refs_p): Use same_type_for_tbaa to match
>   the types and nonoverlapping_component_refs_p_1 to disambiguate.
>   * g++.dg/lto/alias-3_0.C: New file.
>   * g++.dg/lto/alias-3_1.c: New file.
> 
> Index: tree-ssa-alias.c
> ===
> --- tree-ssa-alias.c  (revision 273193)
> +++ tree-ssa-alias.c  (working copy)
> @@ -1128,6 +1128,90 @@ aliasing_component_refs_p (tree ref1,
>return false;
>  }
>  
> +/* FIELD1 and FIELD2 are two component refs whose bases either do
> +   not overlap at all or their addresses are the same.
> +
> +   Return 1 if FIELD1 and FIELD2 are non-overlapping
> +
> +   Return 0 if FIELD1 and FIELD2 are possibly overlapping but in case of
> +   overlap their addresses are the same.
> +
> +   Return -1 otherwise.
> +
> +   Main difference between 0 and -1 is to let
> +   nonoverlapping_component_refs_since_match_p discover the semantically
> +   equivalent part of the access path.  */
> +
> +static int
> +nonoverlapping_component_refs_p_1 (const_tree field1, const_tree field2)
> +{
> +  /* If both fields are of the same type, we could save hard work of
> + comparing offsets.
> + ??? We cannot simply use the type of operand #0 of the refs here
> + as the Fortran compiler smuggles type punning into COMPONENT_REFs
> + for common blocks instead of using unions like everyone else.  */
> +  tree type1 = DECL_CONTEXT (field1);
> +  tree type2 = DECL_CONTEXT (field2);
> +

drop the vertical space

> +  if (type1 == type2 && TREE_CODE (type1) == RECORD_TYPE)
> +{
> +  if (field1 == field2)
> + return 0;
> +  /* A field and its representative need to be considered the
> +  same.  */
> +  if (DECL_BIT_FIELD_REPRESENTATIVE (field1) == field2
> +   || DECL_BIT_FIELD_REPRESENTATIVE (field2) == field1)
> + return 0;
> +  /* Different fields of the same record type cannot overlap.
> +  ??? 

Re: [PATCH, Modula-2 (C/C++/D/F/Go/Jit)] (Register spec fn) (v2)

2019-07-09 Thread Gaius Mulley
Matthias Klose  writes:

>>  - There are three letter libraries with pretty generic
>>names installed into the system libdir: log, iso, cor,
>>min, ulm. At least for log, you have a file conflict
>>with another library.  Shouldn't these libraries be named
>>more specifically, like libgm2log?

>> The installed tree:
>
>> ./usr/lib/gcc/x86_64-linux-gnu/9/m2/ulm/libulm.a
>> ./usr/lib/x86_64-linux-gnu/libulm.a
>
> and all static libraries are installed twice, not just libulm.a. What is the
> correct location?
>
> Matthias

Hi Matthias,

the libraries ./usr/lib/x86_64-linux-gnu/lib{ulm,pim,gm2,cor,iso,min}.a
are not needed; the correct locations of the static libraries are:

./usr/lib/gcc/x86_64-linux-gnu/9/m2/ulm/libulm.a
./usr/lib/gcc/x86_64-linux-gnu/9/m2/min/libmin.a
./usr/lib/gcc/x86_64-linux-gnu/9/m2/cor/libcor.a
./usr/lib/gcc/x86_64-linux-gnu/9/m2/pim/libgm2.a
./usr/lib/gcc/x86_64-linux-gnu/9/m2/log/liblog.a
./usr/lib/gcc/x86_64-linux-gnu/9/m2/iso/libiso.a


regards,
Gaius


Re: Make nonoverlapping_component_refs work with duplicated main variants

2019-07-09 Thread Jan Hubicka
Hi,
this is an updated patch.  I based the logic on gimple_compare_field_offset but
dropped the bits about PLACEHOLDER_EXPR since my understanding is that to get the
comparison done I would then need to compare operand #2 of COMPONENT_REF, which
I don't.

I also wrote the range checks using polyint since I believe that most code is
supposed to be updated this way (shall we update gimple_compare_field_offset?
It is used in canonical type merging and considering fields different may lead
to wrong code I would say, but I do not know enough about SVE to construct
testcase).

I updated documentation of return values which I also find somewhat confusing
since the 1/0 meanings in nonoverlapping_* are reversed compared to
aliasing_component_refs.  The main entry is called refs_may_alias so I am
considering renaming all the functions to *_may_alias and making them return 0
for disambiguation, 1 for non-disambiguation and use -1 to keep the decision
whether to work harder.

However for now I am sticking with the nonoverlapping reversed semantics.
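
To make the two conventions concrete, here is a small sketch of how an
answer in the current nonoverlapping_* style would map to the *_may_alias
style considered above.  The helper name to_may_alias_convention is
hypothetical and exists only for illustration:

```cpp
// Convert a nonoverlapping_*-style answer (1 = disambiguated, 0 = not
// disambiguated, -1 = undecided) into the proposed *_may_alias-style
// answer (0 = disambiguated, 1 = not disambiguated, -1 = undecided).
int
to_may_alias_convention (int nonoverlapping_result)
{
  if (nonoverlapping_result < 0)
    return -1;
  return nonoverlapping_result ? 0 : 1;
}
```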


There are no changes in tramp3d stats (there are no unmerged types)
Here are stats on cc1plus build:

Alias oracle query stats:
  refs_may_alias_p: 43435216 disambiguations, 51989307 queries
  ref_maybe_used_by_call_p: 60843 disambiguations, 44040532 queries
  call_may_clobber_ref_p: 6051 disambiguations, 9115 queries
  nonoverlapping_component_refs_p: 0 disambiguations, 2535 queries
  nonoverlapping_component_refs_since_match_p: 12297 disambiguations, 40458 must overlaps, 53243 queries
  aliasing_component_refs_p: 70892 disambiguations, 374789 queries
  TBAA oracle: 12680127 disambiguations 32179405 queries
   10383370 are in alias set 0
   5747729 queries asked about the same object
   148 queries asked about the same alias set
   0 access volatile
   2482808 are dependent in the DAG
   885223 are aritificially in conflict with void *

PTA query stats:
  pt_solution_includes: 562382 disambiguations, 7467931 queries
  pt_solutions_intersect: 412873 disambiguations, 7818955 queries

It seems that nonoverlapping_component_refs_since_match_p is pretty good at
making decisions and there are only a few cases where it returns -1, which is good.
So I guess teaching it about ARRAY_REFs is the logical next step.

If I would like to benchmark alias oracle compile time, what is
reasonable way to do that?

Bootstrapped/regtested x86_64-linux, OK?

Honza

* tree-ssa-alias.c (nonoverlapping_component_refs_p_1): Break out
from ...; work also on duplicated types.
(nonoverlapping_component_refs_since_match): ... here
(ncr_type_uid): Break out from ...
(ncr_compar): ... here; look for TYPE_UID of canonical type if
available.
(nonoverlapping_component_refs_p): Use same_type_for_tbaa to match
the types and nonoverlapping_component_refs_p_1 to disambiguate.
* g++.dg/lto/alias-3_0.C: New file.
* g++.dg/lto/alias-3_1.c: New file.

Index: tree-ssa-alias.c
===
--- tree-ssa-alias.c(revision 273193)
+++ tree-ssa-alias.c(working copy)
@@ -1128,6 +1128,90 @@ aliasing_component_refs_p (tree ref1,
   return false;
 }
 
+/* FIELD1 and FIELD2 are two component refs whose bases either do
+   not overlap at all or their addresses are the same.
+
+   Return 1 if FIELD1 and FIELD2 are non-overlapping
+
+   Return 0 if FIELD1 and FIELD2 are possibly overlapping but in case of
+   overlap their addresses are the same.
+
+   Return -1 otherwise.
+
+   Main difference between 0 and -1 is to let
+   nonoverlapping_component_refs_since_match_p discover the semantically
+   equivalent part of the access path.  */
+
+static int
+nonoverlapping_component_refs_p_1 (const_tree field1, const_tree field2)
+{
+  /* If both fields are of the same type, we could save hard work of
+ comparing offsets.
+ ??? We cannot simply use the type of operand #0 of the refs here
+ as the Fortran compiler smuggles type punning into COMPONENT_REFs
+ for common blocks instead of using unions like everyone else.  */
+  tree type1 = DECL_CONTEXT (field1);
+  tree type2 = DECL_CONTEXT (field2);
+
+  if (type1 == type2 && TREE_CODE (type1) == RECORD_TYPE)
+{
+  if (field1 == field2)
+   return 0;
+  /* A field and its representative need to be considered the
+same.  */
+  if (DECL_BIT_FIELD_REPRESENTATIVE (field1) == field2
+ || DECL_BIT_FIELD_REPRESENTATIVE (field2) == field1)
+   return 0;
+  /* Different fields of the same record type cannot overlap.
+??? Bitfields can overlap at RTL level so punt on them.  */
+  if (DECL_BIT_FIELD (field1) && DECL_BIT_FIELD (field2))
+   return 0;
+  /* Assume that different FIELD_DECLs never overlap in a RECORD_TYPE.  */
+  return 1;
+}
+  else 
+{
+  /* Different fields of the same record type cannot overlap.
+???

Re: [PATCH] Deprecate -frepo option.

2019-07-09 Thread Nathan Sidwell

On 7/9/19 6:39 AM, Richard Biener wrote:

On Mon, Jul 8, 2019 at 2:04 PM Martin Liška  wrote:






The same also happens for GCC 7. It does 17 iterations (#define MAX_ITERATIONS 17) and
apparently 17 is not enough to resolve all symbols. And it's really slow.


Ouch.


hm, 17 is a magic number.  in C++98 it was the maximum depth of template 
instantiations that implementations needed to support.  Portable code 
could not expect more.  So the worst case -frepo behaviour would be 17 
iterations.


That's not true any more, it's been 1024 since C++11.
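
As a small illustration of the nesting limit mentioned above (this is
about template instantiation depth only, not about -frepo itself), the
following sketch nests instantiations well beyond 17 levels.  The depth
template is a made-up example; the code is plain C++98 and relies only on
the compiler supporting more than the C++98 minimum of 17:

```cpp
// Instantiating depth<N> requires depth<N - 1>, and so on down to the
// explicit specialization depth<0>, so the nesting grows with N.
template <int N>
struct depth
{
  static const int value = depth<N - 1>::value + 1;
};

template <>
struct depth<0>
{
  static const int value = 0;
};

// Around the C++98 minimum of 17 nested instantiations.
static const int shallow = depth<17>::value;

// Far beyond the C++98 minimum, but well within the 1024 of C++11.
static const int deep = depth<100>::value;
```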

Has a bug been filed about this frepo problem?  If not, it suggests those 
using frepo are not compiling modern C++.



That said, I would recommend to remove it :)


In the end it's up to the C++ FE maintainers but the above clearly
doesn't look promising
(not sure if it keeps re-compiling _all_ repo-triggered templates or
just incrementally adds
them to new object files).



I'm not opposed to removing -frepo from GCC 10 but then I would start
noting it is obsolete
on the GCC 9 branch at least.


I concur.  frepo's serial reinvocation of the compiler is not compatible 
with modern C++ code bases.


nathan

--
Nathan Sidwell


Re: [PATCH, Modula-2 (C/C++/D/F/Go/Jit)] (Register spec fn) (v2)

2019-07-09 Thread Gaius Mulley
Rainer Orth  writes:

> here are some initial issues.  I'll reply to Matthias' mail to expand on
> other problems he's raised.
>
> * First, the build broke like this:
>
> /vol/gcc/src/hg/trunk/solaris/gcc/gm2/mc-boot/GRTint.c:57:30: error: 'time' 
> redeclared as different kind of symbol
>57 | typedef enum {input, output, time} VectorType;
>   |  ^~~~
> In file included from /usr/include/time.h:12,
>  from /usr/include/sys/time.h:448,
>  from /usr/include/sys/select.h:27,
>  from /usr/include/sys/types.h:665,
>  from /usr/include/stdlib.h:22,
>  from 
> /vol/gcc/src/hg/trunk/solaris/gcc/gm2/mc-boot/Glibc.h:15,
>  from 
> /vol/gcc/src/hg/trunk/solaris/gcc/gm2/mc-boot/GRTint.c:42:
> /usr/include/iso/time_iso.h:96:15: note: previous declaration of 'time' was 
> here
>96 | extern time_t time(time_t *);
>   |   ^~~~

Hi Rainer,

thanks for the bug report.  Now fixed in the git repo -
the mc bootstrap tool now avoids 'time'.  Also fixed Make-lang.in
to allow parallel builds.
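
For readers who have not run into this class of error, a minimal sketch of
the clash and of the rename workaround follows.  Only the enum and the
vtime name come from this thread; the helper is_time_vector is
hypothetical and is there just to use the enum:

```cpp
#include <time.h>  /* declares: time_t time (time_t *); */

/* Enumerators live in the enclosing scope, so an enumerator called
   'time' collides with the library function of the same name:

     typedef enum {input, output, time} VectorType;

   Renaming the enumerator avoids the collision.  */
typedef enum {input, output, vtime} VectorType;

/* Hypothetical helper, for illustration only.  */
static int
is_time_vector (VectorType v)
{
  return v == vtime;
}
```

Any generated identifier that matches a libc symbol pulled in through the
included headers can hit the same problem on some target.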

>   I've worked around this by renaming the enum value to vtime.  This
>   problem is likely to occur on other targets as well.
>
> * Building gm2.info failed with the makeinfo I happened to have
>   installed:
>
> makeinfo --split-size=500 --split-size=500 
> -I/var/gcc/gcc-10.0.0-20190708/11.5-gcc-gas-gm2-no-bootstrap-j1/gcc/gm2 -o 
> /var/gcc/gcc-10.0.0-20190708/11.5-gcc-gas-gm2-no-bootstrap-j1/gcc/gm2/gm2.info
>  /vol/gcc/src/hg/trunk/solaris/gcc/gm2/gm2.texi
> /vol/gcc/src/hg/trunk/solaris/gcc/gm2/gm2.texi:3070: `Prerequisites' has no 
> Up field (perhaps incorrect sectioning?).
> makeinfo: Removing output file 
> `/var/gcc/gcc-10.0.0-20190708/11.5-gcc-gas-gm2-no-bootstrap-j1/gcc/gm2/gm2.info'
>  due to errors; use --force to preserve.
> make[2]: *** [/vol/gcc/src/hg/trunk/solaris/gcc/gm2/Make-lang.in:234: 
> /var/gcc/gcc-10.0.0-20190708/11.5-gcc-gas-gm2-no-bootstrap-j1/gcc/gm2/gm2.info]
>  Error 1
>
>   This is from texinfo 4.13, newer than the required minimum of 4.7.
>   Even with makeinfo 6.1, there are a couple of warnings:
>
> /vol/gcc/src/hg/trunk/solaris/gcc/gm2/gm2.texi:82: warning: multiple @menu
> /vol/gcc/src/hg/trunk/solaris/gcc/gm2/gm2.texi:581: warning: multiple @menu
> /var/gcc/gcc-10.0.0-20190708/11.5-gcc-gas-gm2-no-bootstrap-j1/gcc/gm2/gm2-libs.texi:6043:
>  warning: multiple @menu
> /vol/gcc/src/hg/trunk/solaris/gcc/gm2/gm2.texi:3070: warning:
> unreferenced node `Prerequisites'

will look into gm2.texi

> Other than that, a sequential (only!) multilibbed build succeeded, and I
> even managed to get some testsuite results which aren't too bad, again
> for both multilibs:
>
> === gm2 Summary for unix ===
>
> # of expected passes		7800
> # of unexpected failures	1729
> # of unresolved testcases	1705
>
> === gm2 Summary for unix/-m64 ===
>
> # of expected passes		7800
> # of unexpected failures	1729
> # of unresolved testcases	1705
>
> === gm2 Summary ===
>
> # of expected passes		15600
> # of unexpected failures	3458
> # of unresolved testcases	3410
>
>   Rainer


regards,
Gaius


[OG9] Improve diagnostics for unmappable types

2019-07-09 Thread Andrew Stubbs

I've backported Jakub's patch to openacc-gcc-9-branch.

Andrew

On 08/07/2019 23:10, Jakub Jelinek wrote:

On Thu, Jul 04, 2019 at 12:44:32PM +0100, Andrew Stubbs wrote:

On 03/07/2019 18:58, Jason Merrill wrote:

OK, thanks.


Committed.


This broke following testcase.
error_mark_node type isn't really incomplete, it is erroneous, doesn't have
TYPE_MAIN_DECL and we should have diagnosed it earlier, so it makes no sense
to emit extra explanation messages.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
committed to trunk.

2019-07-08  Jakub Jelinek  

PR c++/91110
* decl2.c (cp_omp_mappable_type_1): Don't emit any note for
error_mark_node type.

* g++.dg/gomp/pr91110.C: New test.

--- gcc/cp/decl2.c.jj   2019-07-04 23:39:02.579106113 +0200
+++ gcc/cp/decl2.c  2019-07-08 13:22:52.552898230 +0200
@@ -1416,7 +1416,7 @@ cp_omp_mappable_type_1 (tree type, bool
/* Mappable type has to be complete.  */
if (type == error_mark_node || !COMPLETE_TYPE_P (type))
  {
-  if (notes)
+  if (notes && type != error_mark_node)
{
  tree decl = TYPE_MAIN_DECL (type);
  inform ((decl ? DECL_SOURCE_LOCATION (decl) : input_location),
--- gcc/testsuite/g++.dg/gomp/pr91110.C.jj  2019-07-08 13:29:43.803163534 +0200
+++ gcc/testsuite/g++.dg/gomp/pr91110.C 2019-07-08 13:29:17.550593456 +0200
@@ -0,0 +1,11 @@
+// PR c++/91110
+// { dg-do compile }
+
+void
+foo ()
+{
+  X b[2];  // { dg-error "'X' was not declared in this scope" }
+  b[0] = 1;// { dg-error "'b' was not declared in this scope" }
+  #pragma omp target map(to: b) // { dg-error "'b' does not have a mappable type in 'map' clause" }
+  ;
+}


Jakub





Re: [PATCH] Deprecate -frepo option.

2019-07-09 Thread Richard Biener
On Mon, Jul 8, 2019 at 2:04 PM Martin Liška  wrote:
>
> On 6/21/19 4:28 PM, Richard Biener wrote:
> > On Fri, Jun 21, 2019 at 4:13 PM Jakub Jelinek  wrote:
> >>
> >> On Fri, Jun 21, 2019 at 04:04:00PM +0200, Martin Liška wrote:
> >>> On 6/21/19 1:58 PM, Jakub Jelinek wrote:
>  On Fri, Jun 21, 2019 at 01:52:09PM +0200, Martin Liška wrote:
> > On 6/21/19 1:47 PM, Jonathan Wakely wrote:
> >> On Fri, 21 Jun 2019 at 11:40, Martin Liška wrote:
> >>> Yes, I would be fine to deprecate that for GCC 10.1
> >>
> >> Would it be appropriate to issue a warning in GCC 10.x if the option 
> >> is used?
> >
> > Sure. With the patch attached one will see:
> >
> > $ gcc -frepo /tmp/main.cc -c
> > gcc: warning: switch ‘-frepo’ is no longer supported
> >
> > I'm sending patch that also removes -frepo tests from test-suite.
> > I've been testing the patch.
> 
>  IMHO for just deprecation of an option you don't want to remove it from 
>  the
>  testsuite, just match the warning it will generate in those tests, and
>  I'm not convinced you want to remove it from the documentation (rather 
>  than
>  just saying in the documentation that the option is deprecated and might 
>  be
>  removed in a later GCC version).
> >>>
> >>> Agree with you. I'm sending updated version of the patch.
> >>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >>
> >> I'm also not convinced about the Deprecated flag, seems like that is a flag
> >> that we use for options that have been already removed.
> >> So, instead there should be some proper warning in the C++ FE for it,
> >> or just Warn.
> >
> > In principle -frepo is a nice idea - does it live up to its promises?  That 
> > is,
> > does it actually work, for example when throwing it on the libstdc++
> > testsuite or a larger C++ project?
>
> I've just tested tramp3d, and it does not survive linking:
>
> g++ tramp3d-v4.o
> collect: recompiling tramp3d-v4.cpp
> collect: relinking
> collect: recompiling tramp3d-v4.cpp
> collect: relinking
> collect: recompiling tramp3d-v4.cpp
> collect: relinking
> collect: recompiling tramp3d-v4.cpp
> collect: relinking
> collect: recompiling tramp3d-v4.cpp
> collect: relinking
> collect: recompiling tramp3d-v4.cpp
> collect: relinking
> collect: recompiling tramp3d-v4.cpp
> collect: relinking
> collect: recompiling tramp3d-v4.cpp
> collect: relinking
> collect: recompiling tramp3d-v4.cpp
> collect: relinking
> collect: recompiling tramp3d-v4.cpp
> collect: relinking
> collect: recompiling tramp3d-v4.cpp
> collect: relinking
> collect: recompiling tramp3d-v4.cpp
> collect: relinking
> collect: recompiling tramp3d-v4.cpp
> collect: relinking
> collect: recompiling tramp3d-v4.cpp
> collect: relinking
> collect: recompiling tramp3d-v4.cpp
> collect: relinking
> collect: recompiling tramp3d-v4.cpp
> collect: relinking
> collect: recompiling tramp3d-v4.cpp
> collect: relinking
> /usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld: 
> /tmp/ccEeuyj7.ltrans0.ltrans.o: in function 
> `RefCountedBlockPtr, 
> ViewEngine<3, IndexFunction UniformRectilinearTag, CartesianTag, 3> >::PositionsFunctor> > >, false, 
> RefBlockController, 
> ViewEngine<3, IndexFunction UniformRectilinearTag, CartesianTag, 3> >::PositionsFunctor> > > > 
> >::RefCountedBlockPtr(RefCountedBlockPtr double, Full>, ViewEngine<3, IndexFunction UniformRectilinearTag, CartesianTag, 3> >::PositionsFunctor> > >, false, 
> RefBlockController, 
> ViewEngine<3, IndexFunction UniformRectilinearTag, CartesianTag, 3> >::PositionsFunctor> > > > > const&)':
> :(.text+0x4181b): undefined reference to 
> `RefCountedPtr Full>, ViewEngine<3, IndexFunction UniformRectilinearTag, CartesianTag, 3> >::PositionsFunctor> > > > 
> >::RefCountedPtr(RefCountedPtr Vector<3, double, Full>, ViewEngine<3, IndexFunction double, UniformRectilinearTag, CartesianTag, 3> >::PositionsFunctor> > > > > 
> const&)'
> /usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld: 
> /tmp/ccEeuyj7.ltrans0.ltrans.o: in function `std::_Vector_base, 
> Interval<3> >, std::allocator, Interval<3> > > 
> >::_Vector_impl::~_Vector_impl()':
> :(.text+0xc1890): undefined reference to 
> `std::allocator, Interval<3> > >::~allocator()'
> /usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld: 
> /tmp/ccEeuyj7.ltrans0.ltrans.o: in function `std::_Vector_base, 
> Interval<3> >, std::allocator, Interval<3> > > 
> >::_Vector_base()':
> :(.text+0xc18aa): undefined reference to 
> `std::_Vector_base, Interval<3> >, 
> std::allocator, Interval<3> > > >::_Vector_impl::_Vector_impl()'
> /usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld: 
> /tmp/ccEeuyj7.ltrans0.ltrans.o: in function `std::vector, 
> std::allocator > >::_S_use_relocate()':
> :(.text+0xc496f): undefined reference to `std::vector, 
> std::allocator > >::_S_nothrow_relocate(std::integral_constant true>)'
> /usr/lib64/

Re: [PATCH] Restrict LOOP_ALIGN to loop headers only.

2019-07-09 Thread Richard Biener
On Tue, Jul 9, 2019 at 12:23 PM Richard Biener
 wrote:
>
> On Tue, Jul 9, 2019 at 12:22 PM Richard Biener
>  wrote:
> >
> > On Tue, Jul 9, 2019 at 11:56 AM Jan Hubicka  wrote:
> > >
> > > > Hi.
> > > >
> > > > > I'm suggesting to restrict LOOP_ALIGN to only loop headers. Those are the
> > > > > basic blocks for which it makes the most sense. I see quite some binary
> > > > > size reductions on SPEC2006 and SPEC2017. Speed numbers are also
> > > > > slightly positive.
> > > >
> > > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> > > >
> > > > Ready to be installed?
> > > The original idea of distinction between jump alignment and loop
> > > alignment was that they have two basic meanings:
> > >  1) jump alignment is there to avoid jumping just to the end of decode
> > >  window (if the window is aligned) so CPU will get stuck after reaching
> > > >  the jump and also to possibly reduce code cache pollution by populating
> > >  by code that is not executed
> > >  2) loop alignment aims to fit loop in as few cache windows as possible
> > >
> > > Now if you have loop laid in a way that header of loop is not first
> > > basic block, 2) IMO still apply.  I.e.
> > >
> > > jump loop
> > > :loopback
> > > loop body
> > > :loop
> > > if cond jump to loopback
> > >
> > > So dropping loop alignment for those does not seem to make much sense
> > > > from high level.  We may want to have different alignment for loops
> > > starting by header and loops starting in the middle, but I still liked
> > > more your patch which did bundles for loops.
> > >
> > > modern x86 chips are not very good testing targets on it.  I guess
> > > generic changes to alignment needs to be tested on other chips too.
> > >
> > > Honza
> > > > Thanks,
> > > > Martin
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > 2019-07-09  Martin Liska  
> > > >
> > > >   * final.c (compute_alignments): Apply the LOOP_ALIGN only
> > > >   to basic blocks that all loop headers.
> > > > ---
> > > >  gcc/final.c | 1 +
> > > >  1 file changed, 1 insertion(+)
> > > >
> > > >
> > >
> > > > diff --git a/gcc/final.c b/gcc/final.c
> > > > index fefc4874b24..ce2678da988 100644
> > > > --- a/gcc/final.c
> > > > +++ b/gcc/final.c
> > > > @@ -739,6 +739,7 @@ compute_alignments (void)
> > > >if (has_fallthru
> > > > && !(single_succ_p (bb)
> > > >  && single_succ (bb) == EXIT_BLOCK_PTR_FOR_FN (cfun))
> > > > +   && bb->loop_father->header == bb
> >
> > I agree that the above is the wrong condition - but I'm not sure we
> > only end up using LOOP_ALIGN for blocks reached by a DFS_BACK
> > edge.  Note that DFS_BACK would have to be applied considering
> > the current CFG layout, simply doing mark_dfs_back_edges doesn't
> > work (we're in CFG layout mode here, no?).
>
> So a "backedge" in this sense would be e->dest->index < e->src->index.
> No?

To me the following would make sense.

Index: gcc/final.c
===
--- gcc/final.c (revision 273294)
+++ gcc/final.c (working copy)
@@ -669,6 +669,7 @@ compute_alignments (void)
 {
   rtx_insn *label = BB_HEAD (bb);
   bool has_fallthru = 0;
+  bool has_backedge = 0;
   edge e;
   edge_iterator ei;

@@ -693,6 +694,8 @@ compute_alignments (void)
has_fallthru = 1, fallthru_count += e->count ();
  else
branch_count += e->count ();
+ if (e->src->index > bb->index)
+   has_backedge = 1;
}
   if (dump_file)
{
@@ -736,7 +739,7 @@ compute_alignments (void)
}
   /* In case block is frequent and reached mostly by non-fallthru edge,
 align it.  It is most likely a first block of loop.  */
-  if (has_fallthru
+  if (has_backedge
  && !(single_succ_p (bb)
   && single_succ (bb) == EXIT_BLOCK_PTR_FOR_FN (cfun))
  && optimize_bb_for_speed_p (bb)
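
To make the has_backedge test above concrete, here is a minimal standalone
sketch of the same comparison. It is not GCC code: toy_block, toy_edge and
has_layout_backedge are hypothetical stand-ins, and it assumes that a block's
index reflects its position in the current layout, which is exactly the
assumption questioned earlier in this thread.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for GCC's basic_block and edge; only the fields
// needed to illustrate the index comparison are modelled.
struct toy_block;

struct toy_edge
{
  const toy_block *src;   // block the edge leaves
  const toy_block *dest;  // block the edge enters
};

struct toy_block
{
  int index;                    // assumed to match the layout order
  std::vector<toy_edge> preds;  // incoming edges
};

// Return true if some incoming edge originates from a block that is laid
// out after BB, i.e. the edge jumps backwards in the layout.
static bool
has_layout_backedge (const toy_block &bb)
{
  for (const toy_edge &e : bb.preds)
    if (e.src->index > bb.index)
      return true;
  return false;
}
```

Applied to Honza's example (entry jumps to :loop, :loopback falls through to
:loop, :loop jumps back to :loopback), only :loopback has an incoming edge
from a later block, so under this sketch it is the first block of the loop in
layout order that would be considered for LOOP_ALIGN.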


Re: [PATCH] Restrict LOOP_ALIGN to loop headers only.

2019-07-09 Thread Richard Biener
On Tue, Jul 9, 2019 at 12:22 PM Richard Biener
 wrote:
>
> On Tue, Jul 9, 2019 at 11:56 AM Jan Hubicka  wrote:
> >
> > > Hi.
> > >
> > > I'm suggesting to restrict LOOP_ALIGN to only loop headers. Those are the
> > > basic blocks for which it makes the most sense. I see quite some binary
> > > size reductions on SPEC2006 and SPEC2017. Speed numbers are also slightly
> > > positive.
> > >
> > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> > >
> > > Ready to be installed?
> > The original idea of distinction between jump alignment and loop
> > alignment was that they have two basic meanings:
> >  1) jump alignment is there to avoid jumping just to the end of decode
> >  window (if the window is aligned) so CPU will get stuck after reaching
> >  the jump and also to possibly reduce code cache pollution by populating
> >  by code that is not executed
> >  2) loop alignment aims to fit loop in as few cache windows as possible
> >
> > Now if you have loop laid in a way that header of loop is not first
> > basic block, 2) IMO still apply.  I.e.
> >
> > jump loop
> > :loopback
> > loop body
> > :loop
> > if cond jump to loopback
> >
> > So dropping loop alignment for those does not seem to make much sense
> > from high level.  We may want to have different alignment for loops
> > starting by header and loops starting in the middle, but I still liked
> > more your patch which did bundles for loops.
> >
> > modern x86 chips are not very good testing targets on it.  I guess
> > generic changes to alignment needs to be tested on other chips too.
> >
> > Honza
> > > Thanks,
> > > Martin
> > >
> > > gcc/ChangeLog:
> > >
> > > 2019-07-09  Martin Liska  
> > >
> > >   * final.c (compute_alignments): Apply the LOOP_ALIGN only
> > >   to basic blocks that all loop headers.
> > > ---
> > >  gcc/final.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > >
> > >
> >
> > > diff --git a/gcc/final.c b/gcc/final.c
> > > index fefc4874b24..ce2678da988 100644
> > > --- a/gcc/final.c
> > > +++ b/gcc/final.c
> > > @@ -739,6 +739,7 @@ compute_alignments (void)
> > >if (has_fallthru
> > > && !(single_succ_p (bb)
> > >  && single_succ (bb) == EXIT_BLOCK_PTR_FOR_FN (cfun))
> > > +   && bb->loop_father->header == bb
>
> I agree that the above is the wrong condition - but I'm not sure we
> only end up using LOOP_ALIGN for blocks reached by a DFS_BACK
> edge.  Note that DFS_BACK would have to be applied considering
> the current CFG layout, simply doing mark_dfs_back_edges doesn't
> work (we're in CFG layout mode here, no?).

So a "backedge" in this sense would be e->dest->index < e->src->index.
No?

> Eventually the code
> counting branches effectively already does this though.
>
> The odd thing is that we apply LOOP_ALIGN only to blocks that
> have a fallthru incoming edge.  I don't see Honza's example
> above having one.
>
> > > && optimize_bb_for_speed_p (bb)
> > > && branch_count + fallthru_count > count_threshold
> > > && (branch_count
> > >
> >
> >
> >


Re: [PATCH] Restrict LOOP_ALIGN to loop headers only.

2019-07-09 Thread Richard Biener
On Tue, Jul 9, 2019 at 11:56 AM Jan Hubicka  wrote:
>
> > Hi.
> >
> > I'm suggesting to restrict LOOP_ALIGN to only loop headers. Those are the
> > basic blocks for which it makes the most sense. I see quite some binary
> > size reductions on SPEC2006 and SPEC2017. Speed numbers are also slightly
> > positive.
> >
> > Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >
> > Ready to be installed?
> The original idea of distinction between jump alignment and loop
> alignment was that they have two basic meanings:
>  1) jump alignment is there to avoid jumping just to the end of decode
>  window (if the window is aligned) so CPU will get stuck after reaching
>  the jump and also to possibly reduce code cache pollution by populating
>  by code that is not executed
>  2) loop alignment aims to fit loop in as few cache windows as possible
>
> Now if you have loop laid in a way that header of loop is not first
> basic block, 2) IMO still apply.  I.e.
>
> jump loop
> :loopback
> loop body
> :loop
> if cond jump to loopback
>
> So dropping loop alignment for those does not seem to make much sense
> from high level.  We may want to have different alignment for loops
> starting by header and loops starting in the middle, but I still liked
> more your patch which did bundles for loops.
>
> modern x86 chips are not very good testing targets on it.  I guess
> generic changes to alignment needs to be tested on other chips too.
>
> Honza
> > Thanks,
> > Martin
> >
> > gcc/ChangeLog:
> >
> > 2019-07-09  Martin Liska  
> >
> >   * final.c (compute_alignments): Apply the LOOP_ALIGN only
> >   to basic blocks that all loop headers.
> > ---
> >  gcc/final.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> >
>
> > diff --git a/gcc/final.c b/gcc/final.c
> > index fefc4874b24..ce2678da988 100644
> > --- a/gcc/final.c
> > +++ b/gcc/final.c
> > @@ -739,6 +739,7 @@ compute_alignments (void)
> >if (has_fallthru
> > && !(single_succ_p (bb)
> >  && single_succ (bb) == EXIT_BLOCK_PTR_FOR_FN (cfun))
> > +   && bb->loop_father->header == bb

I agree that the above is the wrong condition - but I'm not sure we
only end up using LOOP_ALIGN for blocks reached by a DFS_BACK
edge.  Note that DFS_BACK would have to be applied considering
the current CFG layout, simply doing mark_dfs_back_edges doesn't
work (we're in CFG layout mode here, no?).  Eventually the code
counting branches effectively already does this though.

The odd thing is that we apply LOOP_ALIGN only to blocks that
have a fallthru incoming edge.  I don't see Honza's example
above having one.

> > && optimize_bb_for_speed_p (bb)
> > && branch_count + fallthru_count > count_threshold
> > && (branch_count
> >
>
>
>


Re: [PATCH 1/2] Come up with function_decl_type and use it in tree_function_decl.

2019-07-09 Thread Marc Glisse

On Tue, 9 Jul 2019, Martin Liška wrote:


On 7/9/19 9:49 AM, Marc Glisse wrote:

On Tue, 9 Jul 2019, Marc Glisse wrote:


On Mon, 8 Jul 2019, Martin Liška wrote:


The patch apparently has DECL_IS_OPERATOR_DELETE only on the replaceable global 
deallocation functions, not all delete operators, contrary to 
DECL_IS_OPERATOR_NEW, so the name is misleading. On the other hand, those seem 
to be the ones for which the optimization is legal (well, not quite, the rules 
are in terms of operator new, and I am not sure how well operator delete has to 
match, but close enough).


Are you talking about this location where we set OPERATOR_NEW:
https://github.com/gcc-mirror/gcc/blob/master/gcc/cp/decl.c#L13643
?

That's the only place where we set OPERATOR_NEW flag and not OPERATOR_DELETE.


Yes, I think that's the place.

Again, not setting DECL_IS_OPERATOR_DELETE on local operator delete
seems misleading, but setting it would let us optimize in cases where we
are not really allowed to. Maybe just rename your macro to
DECL_IS_GLOBAL_OPERATOR_DELETE?


Hmm, I replied too fast.

Global operator delete does not seem like a good terminology, the ones marked 
in the patch would be the usual (=non-placement) replaceable deallocation 
functions.

I cannot find a requirement that operator new and operator delete should match. 
The rules to omit allocation are stated in terms of which operator new is 
called, but do not seem to care which operator delete is used. So allocating 
with the global operator new and deallocating with a class overload of operator 
delete can be removed, but not the reverse (not sure how they came up with such 
a rule...). Which means we would need:


Thank you Marc for digging deep into that.



keep DECL_IS_OPERATOR_NEW for the current uses

DECL_IS_REPLACEABLE_OPERATOR_NEW (equivalent to DECL_IS_OPERATOR_NEW && 
DECL_IS_MALLOC? not exactly but close I think) for DCE

DECL_IS_OPERATOR_DELETE (which also includes some class overloads) for DCE


Note that with the current version of the patch we are out of free bits in 
struct GTY(()) tree_function_decl.
Would it be possible to tweak the current patch to cover what you described?


If you approximate DECL_IS_REPLACEABLE_OPERATOR_NEW with 
DECL_IS_OPERATOR_NEW && DECL_IS_MALLOC, it shouldn't need more bits than 
the current patch. I think the main difference is if a user adds attribute 
malloc to his class-specific operator new, where it will enable DCE, but 
since the attribute is non-standard, we can just document that behavior, 
it might even be desirable.
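
As a rough illustration of the approximation suggested above, the following
standalone sketch models the three properties as plain booleans. It is not the
patch: toy_fndecl, approx_replaceable_operator_new and may_elide_pair are
hypothetical names, and the real flags live in tree_function_decl.

```cpp
#include <cassert>

// Hypothetical flag set; each member models one of the macros discussed.
struct toy_fndecl
{
  bool is_operator_new;     // models DECL_IS_OPERATOR_NEW
  bool is_malloc;           // models DECL_IS_MALLOC (attribute malloc)
  bool is_operator_delete;  // models DECL_IS_OPERATOR_DELETE
};

// Approximate "replaceable operator new" without spending an extra bit.
static bool
approx_replaceable_operator_new (const toy_fndecl &d)
{
  return d.is_operator_new && d.is_malloc;
}

// A new/delete pair is a DCE candidate when the allocation side looks
// replaceable; the deallocation side may be any operator delete.
static bool
may_elide_pair (const toy_fndecl &alloc, const toy_fndecl &dealloc)
{
  return approx_replaceable_operator_new (alloc)
         && dealloc.is_operator_delete;
}
```

Under this sketch a class-specific operator new only becomes a candidate when
the user has added attribute malloc to it, which is the corner case mentioned
above that would need documenting.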


I am not very confident about anything I said in this thread, for all I 
know I may be misguiding you, please make sure someone else who 
understands the C++ standard approves your final patch.


--
Marc Glisse


Re: [C++ Patch] A few additional location improvements to grokdeclarator and check_tag_decl

2019-07-09 Thread Paolo Carlini

Hi,

On 08/07/19 23:44, Jason Merrill wrote:

On 6/23/19 7:58 AM, Paolo Carlini wrote:

+    error_at (smallest_type_location (get_type_quals (declspecs),
+  declspecs->locations),
How about adding a smallest_type_location overload that just takes 
declspecs?


Sure. The below has an additional location fixlet which I noticed over 
the last days, for "complex invalid for". Tested x86_64-linux, as usual.


Thanks, Paolo.



/cp
2019-07-09  Paolo Carlini  

* decl.c (get_type_quals,
smallest_type_location (const cp_decl_specifier_seq*)): New.
(check_tag_decl): Use smallest_type_location in error_at about
multiple types in one declaration.
(grokdeclarator): Use locations[ds_complex] in error_at about
complex invalid; use locations[ds_storage_class] in error_at
about static cdtor; use id_loc in error_at about flexible
array member in union; use get_type_quals.

/testsuite
2019-07-09  Paolo Carlini  

* g++.dg/diagnostic/complex-invalid-1.C: New.
* g++.dg/diagnostic/static-cdtor-1.C: Likewise.
* g++.dg/cpp1z/has-unique-obj-representations2.C: Test location
too.
* g++.dg/other/anon-union3.C: Adjust expected location.
* g++.dg/parse/error8.C: Likewise.
Index: cp/decl.c
===
--- cp/decl.c   (revision 273227)
+++ cp/decl.c   (working copy)
@@ -100,6 +100,7 @@ static tree build_cp_library_fn (tree, enum tree_c
 static void store_parm_decls (tree);
 static void initialize_local_var (tree, tree);
 static void expand_static_init (tree, tree);
+static location_t smallest_type_location (const cp_decl_specifier_seq*);
 
 /* The following symbols are subsumed in the cp_global_trees array, and
listed here individually for documentation purposes.
@@ -4802,6 +4803,24 @@ warn_misplaced_attr_for_class_type (location_t loc
class_type, class_key_or_enum_as_string (class_type));
 }
 
+/* Returns the cv-qualifiers that apply to the type specified
+   by the DECLSPECS.  */
+
+static int
+get_type_quals (const cp_decl_specifier_seq *declspecs)
+{
+  int type_quals = TYPE_UNQUALIFIED;
+
+  if (decl_spec_seq_has_spec_p (declspecs, ds_const))
+type_quals |= TYPE_QUAL_CONST;
+  if (decl_spec_seq_has_spec_p (declspecs, ds_volatile))
+type_quals |= TYPE_QUAL_VOLATILE;
+  if (decl_spec_seq_has_spec_p (declspecs, ds_restrict))
+type_quals |= TYPE_QUAL_RESTRICT;
+
+  return type_quals;
+}
+
 /* Make sure that a declaration with no declarator is well-formed, i.e.
just declares a tagged type or anonymous union.
 
@@ -4821,7 +4840,8 @@ check_tag_decl (cp_decl_specifier_seq *declspecs,
   bool error_p = false;
 
   if (declspecs->multiple_types_p)
-error ("multiple types in one declaration");
+error_at (smallest_type_location (declspecs),
+ "multiple types in one declaration");
   else if (declspecs->redefined_builtin_type)
 {
   if (!in_system_header_at (input_location))
@@ -10142,6 +10162,13 @@ smallest_type_location (int type_quals, const loca
   return min_location (loc, locations[ds_type_spec]);
 }
 
+static location_t
+smallest_type_location (const cp_decl_specifier_seq *declspecs)
+{
+  int type_quals = get_type_quals (declspecs);
+  return smallest_type_location (type_quals, declspecs->locations);
+}
+
 /* Check that it's OK to declare a function with the indicated TYPE
and TYPE_QUALS.  SFK indicates the kind of special function (if any)
that this function is.  OPTYPE is the type given in a conversion
@@ -10407,7 +10434,7 @@ grokdeclarator (const cp_declarator *declarator,
  a member function.  */
   cp_ref_qualifier rqual = REF_QUAL_NONE;
   /* cv-qualifiers that apply to the type specified by the DECLSPECS.  */
-  int type_quals = TYPE_UNQUALIFIED;
+  int type_quals = get_type_quals (declspecs);
   tree raises = NULL_TREE;
   int template_count = 0;
   tree returned_attrs = NULL_TREE;
@@ -10454,13 +10481,6 @@ grokdeclarator (const cp_declarator *declarator,
   if (concept_p)
 constexpr_p = true;
 
-  if (decl_spec_seq_has_spec_p (declspecs, ds_const))
-type_quals |= TYPE_QUAL_CONST;
-  if (decl_spec_seq_has_spec_p (declspecs, ds_volatile))
-type_quals |= TYPE_QUAL_VOLATILE;
-  if (decl_spec_seq_has_spec_p (declspecs, ds_restrict))
-type_quals |= TYPE_QUAL_RESTRICT;
-
   if (decl_context == FUNCDEF)
 funcdef_flag = true, decl_context = NORMAL;
   else if (decl_context == MEMFUNCDEF)
@@ -10999,7 +11019,8 @@ grokdeclarator (const cp_declarator *declarator,
   if (decl_spec_seq_has_spec_p (declspecs, ds_complex))
 {
   if (TREE_CODE (type) != INTEGER_TYPE && TREE_CODE (type) != REAL_TYPE)
-   error ("complex invalid for %qs", name);
+   error_at (declspecs->locations[ds_complex],
+ "complex invalid for %qs", name);
   /* If a modifier is specified, the resulting complex is the complex
 form of TYPE.

Re: [PATCH] Restrict LOOP_ALIGN to loop headers only.

2019-07-09 Thread Martin Liška
On 7/9/19 11:56 AM, Jan Hubicka wrote:
>> Hi.
>>
>> I'm suggesting to restrict LOOP_ALIGN to only loop headers. Those are the
>> basic blocks for which it makes the most sense. I see quite some binary
>> size reductions on SPEC2006 and SPEC2017. Speed numbers are also slightly
>> positive.
>>
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>> Ready to be installed?
> The original idea of distinction between jump alignment and loop
> alignment was that they have two basic meanings:
>  1) jump alignment is there to avoid jumping just to the end of decode
>  window (if the window is aligned) so CPU will get stuck after reaching
>  the jump and also to possibly reduce code cache pollution by populating
>  by code that is not executed
>  2) loop alignment aims to fit loop in as few cache windows as possible
> 
> Now if you have loop laid in a way that header of loop is not first
> basic block, 2) IMO still apply.  I.e.
> 
>   jump loop
> :loopback
>   loop body
> :loop
>   if cond jump to loopback
> 
> So dropping loop alignment for those does not seem to make much sense
> from high level.  We may want to have different alignment for loops
> starting by header and loops starting in the middle,

That's quite a complicated condition; I would not introduce a new alignment.

> but I still liked
> more your patch which did bundles for loops.

The patch caused regressions for quite a few benchmarks and has its own
problems (need of a recent GAS, not doing a bundle for bundles that
can't fit in a single bundle window). For those reasons, I decided
not to work on it any longer.

Martin

> 
> modern x86 chips are not very good testing targets on it.  I guess
> generic changes to alignment needs to be tested on other chips too.
> 
> Honza
>> Thanks,
>> Martin
>>
>> gcc/ChangeLog:
>>
>> 2019-07-09  Martin Liska  
>>
>>  * final.c (compute_alignments): Apply the LOOP_ALIGN only
>>  to basic blocks that all loop headers.
>> ---
>>  gcc/final.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>>
> 
>> diff --git a/gcc/final.c b/gcc/final.c
>> index fefc4874b24..ce2678da988 100644
>> --- a/gcc/final.c
>> +++ b/gcc/final.c
>> @@ -739,6 +739,7 @@ compute_alignments (void)
>>if (has_fallthru
>>&& !(single_succ_p (bb)
>> && single_succ (bb) == EXIT_BLOCK_PTR_FOR_FN (cfun))
>> +  && bb->loop_father->header == bb
>>&& optimize_bb_for_speed_p (bb)
>>&& branch_count + fallthru_count > count_threshold
>>&& (branch_count
>>
> 
> 
> 



Re: [patch] Small improvements to coverage info (3/n)

2019-07-09 Thread Eric Botcazou
> 2019-07-08  Eric Botcazou  
> 
>   * emit-rtl.c (set_insn_locations): New function moved from...
>   * function.c (set_insn_locations): ...here.
>   * ira-emit.c (emit_moves): Propagate location of the first instruction
>   to the inserted move instructions.
>   * reg-stack.c (compensate_edge): Set the location if the sequence is
>   inserted on the edge.
>   * rtl.h (set_insn_locations): Declare.

Jeff privately pointed out that the emit_moves change is not stable wrt debug 
insns so I have tested and installed the following fix, modelled on what was 
done for emit_to_new_bb_before.


* ira-emit.c (emit_moves): Skip DEBUG_INSNs when setting the location.

-- 
Eric Botcazou
Index: ira-emit.c
===
--- ira-emit.c	(revision 273247)
+++ ira-emit.c	(working copy)
@@ -997,27 +997,30 @@ emit_moves (void)
   basic_block bb;
   edge_iterator ei;
   edge e;
-  rtx_insn *insns, *tmp;
+  rtx_insn *insns, *tmp, *next;
 
   FOR_EACH_BB_FN (bb, cfun)
 {
   if (at_bb_start[bb->index] != NULL)
 	{
 	  at_bb_start[bb->index] = modify_move_list (at_bb_start[bb->index]);
-	  insns = emit_move_list (at_bb_start[bb->index],
-  REG_FREQ_FROM_BB (bb));
+	  insns
+	= emit_move_list (at_bb_start[bb->index], REG_FREQ_FROM_BB (bb));
 	  tmp = BB_HEAD (bb);
 	  if (LABEL_P (tmp))
 	tmp = NEXT_INSN (tmp);
 	  if (NOTE_INSN_BASIC_BLOCK_P (tmp))
 	tmp = NEXT_INSN (tmp);
-	  /* Propagate the location of the current first instruction to the
-	 moves so that they don't inherit a random location.  */
-	  if (tmp != NULL_RTX && INSN_P (tmp))
-	set_insn_locations (insns, INSN_LOCATION (tmp));
+	  /* Make sure to put the location of TMP or a subsequent instruction
+	 to avoid inheriting the location of the previous instruction.  */
+	  next = tmp;
+	  while (next && !NONDEBUG_INSN_P (next))
+	next = NEXT_INSN (next);
+	  if (next)
+	set_insn_locations (insns, INSN_LOCATION (next));
 	  if (tmp == BB_HEAD (bb))
 	emit_insn_before (insns, tmp);
-	  else if (tmp != NULL_RTX)
+	  else if (tmp)
 	emit_insn_after (insns, PREV_INSN (tmp));
 	  else
 	emit_insn_after (insns, get_last_insn ());
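
For readers who do not have the RTL chain in their head, the idea of the fix
can be sketched in isolation as follows. This is only an illustration:
toy_insn and first_nondebug are hypothetical, and the single is_debug flag
stands in for "anything that is not NONDEBUG_INSN_P", which in GCC also covers
notes and other non-instructions.

```cpp
#include <cassert>

// Hypothetical stand-in for an rtx_insn in a singly linked chain.
struct toy_insn
{
  bool is_debug;   // true for insns that exist only with -g
  int location;    // models INSN_LOCATION
  toy_insn *next;  // models NEXT_INSN
};

// Walk forward from INSN and return the first insn that is not a debug
// insn, or a null pointer if the chain ends first.
static const toy_insn *
first_nondebug (const toy_insn *insn)
{
  while (insn && insn->is_debug)
    insn = insn->next;
  return insn;
}
```

Because debug insns are skipped, the location picked for the inserted moves is
the same whether or not debug info is being generated, which is the stability
property the original change was missing.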


Re: [patch, c++ openmp] Improve diagnostics for unmappable types

2019-07-09 Thread Andrew Stubbs

On 08/07/2019 23:10, Jakub Jelinek wrote:

This broke following testcase.
error_mark_node type isn't really incomplete, it is erroneous, doesn't have
TYPE_MAIN_DECL and we should have diagnosed it earlier, so it makes no sense
to emit extra explanation messages.


Apologies. Did I miss something in the regression tests? _Atomic-5 seems 
fine?


Andrew


Re: [PATCH] Restrict LOOP_ALIGN to loop headers only.

2019-07-09 Thread Jan Hubicka
> Hi.
> 
> I'm suggesting to restrict LOOP_ALIGN to only loop headers. Those are the
> basic blocks for which it makes the most sense. I see quite some binary
> size reductions on SPEC2006 and SPEC2017. Speed numbers are also slightly
> positive.
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?
The original idea of distinction between jump alignment and loop
alignment was that they have two basic meanings:
 1) jump alignment is there to avoid jumping just to the end of decode
 window (if the window is aligned) so CPU will get stuck after reaching
 the jump and also to possibly reduce code cache pollution by populating
 by code that is not executed
 2) loop alignment aims to fit loop in as few cache windows as possible

Now if you have a loop laid out in a way that the header of the loop is not
the first basic block, 2) IMO still applies.  I.e.

jump loop
:loopback
loop body
:loop
if cond jump to loopback

So dropping loop alignment for those does not seem to make much sense
from high level.  We may want to have different alignment for loops
starting by header and loops starting in the middle, but I still liked
more your patch which did bundles for loops.

Modern x86 chips are not very good testing targets for it.  I guess
generic changes to alignment need to be tested on other chips too.

Honza
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
> 2019-07-09  Martin Liska  
> 
>   * final.c (compute_alignments): Apply the LOOP_ALIGN only
>   to basic blocks that are loop headers.
> ---
>  gcc/final.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> 

> diff --git a/gcc/final.c b/gcc/final.c
> index fefc4874b24..ce2678da988 100644
> --- a/gcc/final.c
> +++ b/gcc/final.c
> @@ -739,6 +739,7 @@ compute_alignments (void)
>if (has_fallthru
> && !(single_succ_p (bb)
>  && single_succ (bb) == EXIT_BLOCK_PTR_FOR_FN (cfun))
> +   && bb->loop_father->header == bb
> && optimize_bb_for_speed_p (bb)
> && branch_count + fallthru_count > count_threshold
> && (branch_count
> 





Re: [range-ops] patch 01/04: types for VR_UNDEFINED and VR_VARYING

2019-07-09 Thread Richard Biener
On Tue, Jul 9, 2019 at 9:28 AM Aldy Hernandez  wrote:
>
>
>
> On 7/4/19 6:33 AM, Richard Biener wrote:
> > On Wed, Jul 3, 2019 at 2:17 PM Aldy Hernandez  wrote:
> >>
> >> On 7/3/19 7:08 AM, Richard Biener wrote:
> >>> On Wed, Jul 3, 2019 at 11:19 AM Aldy Hernandez  wrote:
> >
> >> How about we keep VARYING and UNDEFINED typeless until right before we
> >> call into the ranger.  At which point, we can populate min/max
> >> because we have the tree_code and the type handy.  So right before we
> >> call into the ranger do:
> >>
> >>  if (varying_p ())
> >>foo->set_varying(TYPE);
> >>
> >> This would avoid the type cache, and keep the ranger happy.
> >
> > you cannot do set_varying on the static const range but instead you'd do
> >
> >value_range tem (*foo);
> >if (varying_p ())
> > tem->set_full_range (TYPE);
> >
> > which I think we already do in some places.  Thus my question _where_
> > you actually need this.
>
> Basically, everywhere.  By having a type for varying/undefined, we don't
> have to special case anything.  Sure, we could for example, special case
> the invert operation for undefined / varying.  And we could special case
> everything dealing with ranges to handle varying and undefined, but why?
>   We could also pass a type argument everywhere, but that's just ugly.
> However, I do understand your objection to the type cache.
>
> How about the attached approach?  Set the type for varying/undefined
> when we know it, while avoiding touching the CONST varying.  Then right
> before calling the ranger, pass down a new varying node with min/max for
> any varyings that were still typeless until that point.
>
> I have taken care of never adding a set_varying() that was not already
> there.  Would this keep the const happy?
>
> Technically we don't need to set varying/undef types for every instance
> in VRP, but we need it at least for the code that will be shared with
> range-ops (extract_range_from_multiplicative_op, union, intersect, etc).
>   I just figured if we have the information, might as well set it for
> consistency.
>
> If you like this approach, I can rebase the other patches that depend on
> this one.

OK, so I went and checked what you do for class irange which has
a type but no kind member (but constructors with a kind).  It also
uses wide_int members for storage.  For a pure integer constant
range representation this represents somewhat odd choices;  I'd
have elided the m_type member completely here, it seems fully
redundant.  Only range operations need to be carried out in a
specific type (what I was suggesting above).  Even the precision
encoded in the wide_int members is redundant then (I'd have
expected widest_int here and trailing-wide-ints for optimizing
storage).

Then class range_operator looks a bit strange to me (looking
just at the header).  Ugh, so it is all virtual because
you have one instance per tree code.  What an odd choice.
Why didn't you simply go with passing tree_code  (and type!)
to fold_range/op_range?  The API also seems to be oddly
constrained to binary ops.  Anyway, the way you build
the operator table requires an awful lot of global C++ ctor
invocations, something we generally try to avoid.  But I'm getting
into too many details here.

So - to answer your question above, I'd like you to pass down
a type to operations.  Because that's what is fundamentally
required - a range doesn't have a "type" and the current
value_range_base doesn't fall into the trap of needing one.

Richard.

>
> Aldy


Re: [PATCH, Modula-2 (C/C++/D/F/Go/Jit)] (Register spec fn) (v2)

2019-07-09 Thread Matthias Klose
On 08.07.19 23:19, Matthias Klose wrote:
> On 14.06.19 15:09, Gaius Mulley wrote:
>>
>> Hello,
>>
>> here is version two of the patches which introduce Modula-2 into the
>> GCC trunk.  The patches include:
>>
>>   (*)  a patch to allow all front ends to register a lang spec function.
>>(included are patches for all front ends to provide an empty
>> callback function).
>>   (*)  patch diffs to allow the Modula-2 front end driver to be
>>built using GCC Makefile and friends.
>>
>> The compressed tarball includes:
>>
>>   (*)  gcc/m2  (compiler driver and lang-spec stuff for Modula-2).
>>Including the need for registering lang spec functions.
>>   (*)  gcc/testsuite/gm2  (a Modula-2 dejagnu test to ensure that
>>the gm2 driver is built and understands --version).
>>
>> These patches have been re-written after taking on board the comments
>> found in this thread:
>>
>>https://gcc.gnu.org/ml/gcc-patches/2013-11/msg02620.html
>>
>> it is a revised patch set from:
>>
>>https://gcc.gnu.org/ml/gcc-patches/2019-06/msg00220.html
>>
>> I've run make bootstrap and run the regression tests on trunk and no
>> extra failures occur for all languages touched in the ChangeLog.
>>
>> I'm currently tracking gcc trunk and gcc-9 with gm2 (which works well
>> with amd64/arm64/i386) - these patches are currently simply for the
>> driver to minimise the patch size.  There are also > 1800 tests in a
>> dejagnu testsuite for gm2 which can be included at some future time.
> 
> I had a look at the GCC 9 version of the patches, with a build including a
> make install. Some comments:
> 
>  - A parallel build (at least with -j4) isn't working. A sequential
>build works fine.  I think forcing a sequential build will not
>work well, increasing the build time too much.
> 
>  - libgm2 multilib builds are not working.  //32/libgm2
>is configured, but not built.
> 
>  - The internal tools in the gcclibdir are installed twice, with
>both vanilla names and prefixed/suffixed names.
> 
>  - libgm2/configure.a has a libtool version 14:0:0, however all
>shared libraries are installed with soversion 0.
> 
>  - no manual page for gm2m.
> 
>  - libpth.{a,so} is installed in the system libdir, which
>conflicts with the installation of the libpth packages
>on most distros.
> 
>  - There are three letter libraries with pretty generic
>names installed into the system libdir: log, iso, cor,
>min, ulm. At least for log, you have a file conflict
>with another library.  Shouldn't these libraries be named
>more specifically, like libgm2log?
> 
> Matthias
> 
> The installed tree:

> ./usr/lib/gcc/x86_64-linux-gnu/9/m2/ulm/libulm.a
> ./usr/lib/x86_64-linux-gnu/libulm.a

and all static libraries are installed twice, not just libulm.a. What is the
correct location?

Matthias


Re: [PATCH] Improve scan_operand_equal_p

2019-07-09 Thread Richard Biener
On Tue, Jul 9, 2019 at 12:13 AM Jakub Jelinek  wrote:
>
> Hi!
>
> The 4 testcases below weren't vectorized, because while
> tree-vect-data-refs.c now allows more forms of simd lane access,
> scan_operand_equal_p didn't allow combining them together.
>
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
> committed to trunk.
>
> 2019-07-08  Jakub Jelinek  
>
> * tree-vect-stmts.c (scan_operand_equal_p): Look through MEM_REF
> with SSA_NAME address of POINTER_PLUS_EXPR.  Handle MULT_EXPR
> and casts in offset when different, both through gimple stmts
> and through trees.  Rewritten using loops to minimize code duplication
> for each operand.
>
> * g++.dg/vect/simd-6.cc: Replace xfail with target x86.
> * g++.dg/vect/simd-9.cc: Likewise.
>
> * testsuite/libgomp.c++/scan-13.C: Replace xfail with target x86.
> * testsuite/libgomp.c++/scan-16.C: Likewise.
>
> --- gcc/tree-vect-stmts.c.jj2019-07-04 09:24:28.595303590 +0200
> +++ gcc/tree-vect-stmts.c   2019-07-08 20:59:52.376285636 +0200
> @@ -6334,30 +6334,88 @@ get_group_alias_ptr_type (stmt_vec_info
>  static bool
>  scan_operand_equal_p (tree ref1, tree ref2)
>  {
> -  machine_mode mode1, mode2;
> -  poly_int64 bitsize1, bitsize2, bitpos1, bitpos2;
> -  tree offset1, offset2;
> -  int unsignedp1, unsignedp2, reversep1, reversep2;
> -  int volatilep1 = 0, volatilep2 = 0;
> -  tree base1 = get_inner_reference (ref1, &bitsize1, &bitpos1, &offset1,
> -   &mode1, &unsignedp1, &reversep1,
> -   &volatilep1);
> -  tree base2 = get_inner_reference (ref2, &bitsize2, &bitpos2, &offset2,
> -   &mode2, &unsignedp2, &reversep2,
> -   &volatilep2);
> -  if (reversep1 || reversep2 || volatilep1 || volatilep2)
> -return false;
> -  if (!operand_equal_p (base1, base2, 0))
> -return false;
> -  if (maybe_ne (bitpos1, 0) || maybe_ne (bitpos2, 0))
> -return false;
> -  if (maybe_ne (bitsize1, bitsize2))
> +  tree ref[2] = { ref1, ref2 };
> +  poly_int64 bitsize[2], bitpos[2];
> +  tree offset[2], base[2];
> +  for (int i = 0; i < 2; ++i)
> +{
> +  machine_mode mode;
> +  int unsignedp, reversep, volatilep = 0;
> +  base[i] = get_inner_reference (ref[i], &bitsize[i], &bitpos[i],
> +&offset[i], &mode, &unsignedp,
> +&reversep, &volatilep);
> +  if (reversep || volatilep || maybe_ne (bitpos[i], 0))
> +   return false;
> +  if (TREE_CODE (base[i]) == MEM_REF
> + && offset[i] == NULL_TREE
> + && TREE_CODE (TREE_OPERAND (base[i], 0)) == SSA_NAME)
> +   {
> + gimple *def_stmt = SSA_NAME_DEF_STMT (TREE_OPERAND (base[i], 0));
> + if (is_gimple_assign (def_stmt)
> + && gimple_assign_rhs_code (def_stmt) == POINTER_PLUS_EXPR
> + && TREE_CODE (gimple_assign_rhs1 (def_stmt)) == ADDR_EXPR
> + && TREE_CODE (gimple_assign_rhs2 (def_stmt)) == SSA_NAME)
> +   {
> + if (maybe_ne (mem_ref_offset (base[i]), 0))
> +   return false;
> + base[i] = TREE_OPERAND (gimple_assign_rhs1 (def_stmt), 0);
> + offset[i] = gimple_assign_rhs2 (def_stmt);
> +   }
> +   }
> +}
> +
> +  if (!operand_equal_p (base[0], base[1], 0))
>  return false;
> -  if (offset1 != offset2
> -  && (!offset1
> - || !offset2
> - || !operand_equal_p (offset1, offset2, 0)))
> +  if (maybe_ne (bitsize[0], bitsize[1]))
>  return false;
> +  if (offset[0] != offset[1])
> +{
> +  if (!offset[0] || !offset[1])
> +   return false;
> +  if (!operand_equal_p (offset[0], offset[1], 0))
> +   {
> + tree step[2];
> + for (int i = 0; i < 2; ++i)
> +   {
> + step[i] = integer_one_node;
> + if (TREE_CODE (offset[i]) == SSA_NAME)
> +   {
> + gimple *def_stmt = SSA_NAME_DEF_STMT (offset[i]);
> + if (is_gimple_assign (def_stmt)
> + && gimple_assign_rhs_code (def_stmt) == MULT_EXPR
> + && (TREE_CODE (gimple_assign_rhs2 (def_stmt))
> + == INTEGER_CST))
> +   {
> + step[i] = gimple_assign_rhs2 (def_stmt);
> + offset[i] = gimple_assign_rhs1 (def_stmt);
> +   }
> +   }
> + else if (TREE_CODE (offset[i]) == MULT_EXPR)
> +   {
> + step[i] = TREE_OPERAND (offset[i], 1);
> + offset[i] = TREE_OPERAND (offset[i], 0);
> +   }
> + tree rhs1 = NULL_TREE;
> + if (TREE_CODE (offset[i]) == SSA_NAME)
> +   {
> + gimple *def_stmt = SSA_NAME_DEF_STMT (offset[i]);
> + if (gimple_assign_

Re: [patch 2/2][arm]: redefine aes patterns

2019-07-09 Thread Kyrill Tkachov

Hi Sylvia,

On 7/5/19 12:32 PM, Sylvia Taylor wrote:

Greetings,

This patch removes the arch-common aese/aesmc and aesd/aesimc fusions
(i.e. aes fusion) implemented in the scheduling phase through the
aarch_crypto_can_dual_issue function. The reason is undesired behaviour
observed in cases such as:
- when register allocation goes bad (e.g. extra movs)
- aes operations with xor and zeroed keys among interleaved operations

A more stable version should be provided by instead doing the aes fusion
during the combine pass. As such, new combine patterns have been added to
enable this.

The second change is that the aese and aesd patterns have been rewritten
to encapsulate a xor operation. The purpose is to avoid the need for
additional combine patterns for cases like the ones below:

For AESE (though it also applies to AESD as both have a xor operation):

    data = data ^ key;
    data = vaeseq_u8(data, zero);
    ---
    veor    q1, q0, q1
    aese.8  q1, q9

Should mean and generate the same as:

    data = vaeseq_u8(data, key);
    ---
    aese.8   q1, q0
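
The equivalence above holds because AESE XORs its two operands before applying the byte substitution and row shift. Below is a minimal portable sketch of that property, not the real instruction: `toy_round` is a stand-in for the SubBytes/ShiftRows step (it does not use the AES tables), and the names `block`, `toy_round`, `xor_block`, `model_aese` and `aese_forms_agree` are hypothetical, not part of GCC or the Arm intrinsics.

```c
#include <stdint.h>
#include <string.h>

typedef struct { uint8_t b[16]; } block;

/* Stand-in for the SubBytes/ShiftRows step.  This is NOT the real AES
   transformation, only some fixed function of the 16 input bytes.  */
static block toy_round (block x)
{
  block r;
  for (int i = 0; i < 16; i++)
    r.b[i] = (uint8_t) (x.b[(i * 5) % 16] * 7u + 3u);
  return r;
}

/* Byte-wise XOR of two blocks (the "veor" in the message).  */
static block xor_block (block a, block k)
{
  for (int i = 0; i < 16; i++)
    a.b[i] ^= k.b[i];
  return a;
}

/* Model of AESE: XOR the operands first, then apply the round step.  */
static block model_aese (block data, block key)
{
  return toy_round (xor_block (data, key));
}

/* Returns 1 when "xor, then aese with a zero key" and "aese with the
   key" produce the same block in this model.  */
static int aese_forms_agree (block data, block key)
{
  block zero;
  memset (&zero, 0, sizeof zero);
  block a = model_aese (xor_block (data, key), zero);
  block b = model_aese (data, key);
  return memcmp (a.b, b.b, sizeof a.b) == 0;
}
```

In this model the two forms agree for any round function, which is why folding the xor into the pattern lets combine match both sequences with one definition.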

Bootstrapped and tested on arm-none-linux-gnueabihf.


Thanks! This looks much cleaner now.

Committed on your behalf with r273296.

Kyrill


Cheers,
Syl

gcc/ChangeLog:

2019-07-05  Sylvia Taylor  

    * config/arm/crypto.md:
    (crypto_): Redefine aese/aesd pattern with xor.
    (crypto_): Remove attribute enabled for aesmc.
    (crypto_): Split CRYPTO_BINARY into 2 patterns.
    (*aarch32_crypto_aese_fused, *aarch32_crypto_aesd_fused): New.
    * config/arm/arm.c
    (aarch_macro_fusion_pair_p): Remove aes/aesmc fusion check.
    * config/arm/aarch-common-protos.h
    (aarch_crypto_can_dual_issue): Remove.
    * config/arm/aarch-common.c
    (aarch_crypto_can_dual_issue): Likewise.
    * config/arm/exynos-m1.md: Remove aese/aesmc fusion.
    * config/arm/cortex-a53.md: Likewise.
    * config/arm/cortex-a57.md: Likewise.
    * config/arm/iterators.md:
    (CRYPTO_BINARY): Redefine.
    (CRYPTO_UNARY): Removed.
    (CRYPTO_AES, CRYPTO_AESMC): New.

gcc/testsuite/ChangeLog:

2019-07-05  Sylvia Taylor  

    * gcc.target/arm/aes-fuse-1.c: New.
    * gcc.target/arm/aes-fuse-2.c: New.
    * gcc.target/arm/aes_xor_combine.c: New.


Re: [PATCH][ARM][testsuite] Fix address of sg stubs in CMSE tests

2019-07-09 Thread Kyrill Tkachov

Hi Christophe,

On 7/2/19 3:41 PM, Christophe Lyon wrote:

Hi,

While running the GCC testsuite with an armv8-m target, I noticed that
a few tests were causing the BFD linker to crash. I opened PR
ld/24709 for this [1], but fixing it properly is tricky and not worth
the headache.

I "fixed" the linker so that it emits a useful error message instead
of crashing, and on the GCC side the "fix" is simply to avoid placing
the sg stubs section too far from the destination.

This is what this patch does, by replacing
--section-start,.gnu.sgstubs=0x2040
with
--section-start,.gnu.sgstubs=0x0040

OK?



Ok.

Thanks,

Kyrill




Thanks,

Christophe


Re: [PATCH, Modula-2 (C/C++/D/F/Go/Jit)] (Register spec fn) (v2)

2019-07-09 Thread Rainer Orth
Hi Gaius,

>> it rather depends upon what you want, if you want the latest complete
>> gm2 grafting onto the svn gcc trunk then these two scripts will create a
>> patched tree and also rebuild gm2.
>
> that's my goal: I'd like to see if gm2 and libgm2 build on Solaris and
> pass at least a reasonable number of tests.
>
>> [however please be careful with the scripts - they do assume that
>> everything will be built in $HOME/GM2 - read and adapt as necessary].
>
> Ok, I will give it a try.

here are some initial issues.  I'll reply to Matthias' mail to expand on
other problems he's raised.

* First, the build broke like this:

/vol/gcc/src/hg/trunk/solaris/gcc/gm2/mc-boot/GRTint.c:57:30: error: 'time' redeclared as different kind of symbol
   57 | typedef enum {input, output, time} VectorType;
      |                              ^~~~
In file included from /usr/include/time.h:12,
                 from /usr/include/sys/time.h:448,
                 from /usr/include/sys/select.h:27,
                 from /usr/include/sys/types.h:665,
                 from /usr/include/stdlib.h:22,
                 from /vol/gcc/src/hg/trunk/solaris/gcc/gm2/mc-boot/Glibc.h:15,
                 from /vol/gcc/src/hg/trunk/solaris/gcc/gm2/mc-boot/GRTint.c:42:
/usr/include/iso/time_iso.h:96:15: note: previous declaration of 'time' was here
   96 | extern time_t time(time_t *);
      |               ^~~~

  I've worked around this by renaming the enum value to vtime.  This
  problem is likely to occur on other targets as well.

* Building gm2.info failed with the makeinfo I happened to have
  installed:

makeinfo --split-size=500 --split-size=500 
-I/var/gcc/gcc-10.0.0-20190708/11.5-gcc-gas-gm2-no-bootstrap-j1/gcc/gm2 -o 
/var/gcc/gcc-10.0.0-20190708/11.5-gcc-gas-gm2-no-bootstrap-j1/gcc/gm2/gm2.info 
/vol/gcc/src/hg/trunk/solaris/gcc/gm2/gm2.texi
/vol/gcc/src/hg/trunk/solaris/gcc/gm2/gm2.texi:3070: `Prerequisites' has no Up 
field (perhaps incorrect sectioning?).
makeinfo: Removing output file 
`/var/gcc/gcc-10.0.0-20190708/11.5-gcc-gas-gm2-no-bootstrap-j1/gcc/gm2/gm2.info'
 due to errors; use --force to preserve.
make[2]: *** [/vol/gcc/src/hg/trunk/solaris/gcc/gm2/Make-lang.in:234: 
/var/gcc/gcc-10.0.0-20190708/11.5-gcc-gas-gm2-no-bootstrap-j1/gcc/gm2/gm2.info] 
Error 1

  This is from texinfo 4.13, newer than the required minimum of 4.7.
  Even with makeinfo 6.1, there are a couple of warnings:

/vol/gcc/src/hg/trunk/solaris/gcc/gm2/gm2.texi:82: warning: multiple @menu
/vol/gcc/src/hg/trunk/solaris/gcc/gm2/gm2.texi:581: warning: multiple @menu
/var/gcc/gcc-10.0.0-20190708/11.5-gcc-gas-gm2-no-bootstrap-j1/gcc/gm2/gm2-libs.texi:6043:
 warning: multiple @menu
/vol/gcc/src/hg/trunk/solaris/gcc/gm2/gm2.texi:3070: warning: unreferenced node 
`Prerequisites'
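
For the first issue above (the 'time' enumerator clash), the workaround can be sketched as follows. This is a small standalone illustration, not the generated mc-boot source: only the enum and the vtime rename come from the report, while `vector_type_name` and `touch_clock` are hypothetical helpers added to show that both the enumerator and time() stay usable.

```c
#include <time.h>

/* Naming the third enumerator 'time' would collide with the time()
   declaration that <time.h> brings in, so it is called vtime here.  */
typedef enum { input, output, vtime } VectorType;

/* Hypothetical helper: map a vector type back to a printable name.  */
static const char *vector_type_name (VectorType t)
{
  switch (t)
    {
    case input:
      return "input";
    case output:
      return "output";
    case vtime:
      return "time";
    }
  return "unknown";
}

/* Hypothetical helper: time() is still callable in the same
   translation unit once the enumerator has been renamed.  */
static VectorType touch_clock (void)
{
  (void) time (NULL);
  return vtime;
}
```

Since enumerators live in the ordinary identifier namespace in C, any libc function name pulled in by a system header can clash the same way, which is why the problem is likely to show up on other targets.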

Other than that, a sequential (only!) multilibbed build succeeded, and I
even managed to get some testsuite results which aren't too bad, again
for both multilibs:

=== gm2 Summary for unix ===

# of expected passes            7800
# of unexpected failures        1729
# of unresolved testcases       1705

=== gm2 Summary for unix/-m64 ===

# of expected passes            7800
# of unexpected failures        1729
# of unresolved testcases       1705

=== gm2 Summary ===

# of expected passes            15600
# of unexpected failures        3458
# of unresolved testcases       3410

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] Support __builtin_expect_with_probability for analysis of # of loop iterations.

2019-07-09 Thread Martin Liška
On 7/4/19 6:02 PM, Jan Hubicka wrote:
> perhaps we want to also document that builtin-expect can be used this way?
> It would also be nice to have a testcase.

Good idea! I'm going to install the following patch.

Martin
From 6b59938eff83600ae237409de027040b7904f66d Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Tue, 9 Jul 2019 11:10:20 +0200
Subject: [PATCH] Document and test __builtin_expect_with_probability.

gcc/ChangeLog:

2019-07-09  Martin Liska  

	* doc/extend.texi: Document influence on loop
	optimizers.

gcc/testsuite/ChangeLog:

2019-07-09  Martin Liska  

	* gcc.dg/predict-17.c: Test loop optimizer assumption
	about loop iterations.
---
 gcc/doc/extend.texi   | 5 ++++-
 gcc/testsuite/gcc.dg/predict-17.c | 4 +++-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index f2619e12f93..061607411eb 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -13045,8 +13045,11 @@ when testing pointer or floating-point values.
 For the purposes of branch prediction optimizations, the probability that
 a @code{__builtin_expect} expression is @code{true} is controlled by GCC's
 @code{builtin-expect-probability} parameter, which defaults to 90%.  
+
 You can also use @code{__builtin_expect_with_probability} to explicitly 
-assign a probability value to individual expressions.
+assign a probability value to individual expressions.  If the built-in
+is used in a loop construct, the provided probability will influence
+the expected number of iterations made by loop optimizations.
 @end deftypefn
 
 @deftypefn {Built-in Function} long __builtin_expect_with_probability
diff --git a/gcc/testsuite/gcc.dg/predict-17.c b/gcc/testsuite/gcc.dg/predict-17.c
index 5069aa47c8c..45b618a942c 100644
--- a/gcc/testsuite/gcc.dg/predict-17.c
+++ b/gcc/testsuite/gcc.dg/predict-17.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
+/* { dg-options "-O2 -fdump-tree-profile_estimate-details" } */
 
 extern int global;
 
@@ -11,3 +11,5 @@ void foo (int base)
 
 /* { dg-final { scan-tree-dump "first match heuristics: 5.00%" "profile_estimate"} } */
 /* { dg-final { scan-tree-dump "__builtin_expect_with_probability heuristics of edge .*->.*: 5.00%" "profile_estimate"} } */
+/* { dg-final { scan-tree-dump "is probably executed at most 19" "profile_estimate"} } */
+
-- 
2.22.0
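
As a rough illustration of the documented behaviour, here is a minimal sketch of the built-in used in a loop condition. It is not the predict-17.c testcase: `LOOP_CONTINUES` and `sum_until_zero` are hypothetical names, the 0.95 probability is an arbitrary choice, and the macro falls back to the plain condition on compilers assumed not to provide the built-in.

```c
#include <stddef.h>

/* Use the built-in where it is known to exist (GCC 9 and later);
   elsewhere fall back to the bare condition.  */
#if defined (__GNUC__) && !defined (__clang__) && __GNUC__ >= 9
# define LOOP_CONTINUES(cond, prob) \
  __builtin_expect_with_probability ((cond), 1, (prob))
#else
# define LOOP_CONTINUES(cond, prob) (cond)
#endif

/* Sum a zero-terminated array.  The hint states that each evaluation
   of the loop condition is true with probability 0.95, which the loop
   optimizers may use when estimating the number of iterations.  */
static long sum_until_zero (const int *v)
{
  long sum = 0;
  for (size_t i = 0; LOOP_CONTINUES (v[i] != 0, 0.95); i++)
    sum += v[i];
  return sum;
}
```

The hint only affects the profile estimate; the computed result is the same with or without it.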



Re: [PATCH v2] S/390: Improve storing asan frame_pc

2019-07-09 Thread Andreas Krebbel
On 02.07.19 17:34, Ilya Leoshkevich wrote:
...
> 2019-06-28  Ilya Leoshkevich  
> 
>   * asan.c (asan_emit_stack_protection): Provide an alignment
>   hint.
>   * config/s390/s390.h (CODE_LABEL_BOUNDARY): Specify that s390
>   requires code labels to be aligned on a 2-byte boundary.
>   * defaults.h (CODE_LABEL_BOUNDARY): New macro.
>   * doc/tm.texi: Document CODE_LABEL_BOUNDARY.
>   * doc/tm.texi.in: Likewise.

S/390 parts are ok. Thanks!

Andreas



Re: [PATCH 1/2] Come up with function_decl_type and use it in tree_function_decl.

2019-07-09 Thread Martin Liška
On 7/9/19 9:49 AM, Marc Glisse wrote:
> On Tue, 9 Jul 2019, Marc Glisse wrote:
> 
>> On Mon, 8 Jul 2019, Martin Liška wrote:
>>
>>>> The patch apparently has DECL_IS_OPERATOR_DELETE only on the replaceable
>>>> global deallocation functions, not all delete operators, contrary to
>>>> DECL_IS_OPERATOR_NEW, so the name is misleading. On the other hand, those
>>>> seem to be the ones for which the optimization is legal (well, not quite,
>>>> the rules are in terms of operator new, and I am not sure how well
>>>> operator delete has to match, but close enough).
>>>
>>> Are you talking about this location where we set OPERATOR_NEW:
>>> https://github.com/gcc-mirror/gcc/blob/master/gcc/cp/decl.c#L13643
>>> ?
>>>
>>> That's the only place where we set OPERATOR_NEW flag and not 
>>> OPERATOR_DELETE.
>>
>> Yes, I think that's the place.
>>
>> Again, not setting DECL_IS_OPERATOR_DELETE on local operator delete
>> seems misleading, but setting it would let us optimize in cases where we
>> are not really allowed to. Maybe just rename your macro to
>> DECL_IS_GLOBAL_OPERATOR_DELETE?
> 
> Hmm, I replied too fast.
> 
> Global operator delete does not seem like a good terminology, the ones marked 
> in the patch would be the usual (=non-placement) replaceable deallocation 
> functions.
> 
> I cannot find a requirement that operator new and operator delete should 
> match. The rules to omit allocation are stated in terms of which operator new 
> is called, but do not seem to care which operator delete is used. So 
> allocating with the global operator new and deallocating with a class 
> overload of operator delete can be removed, but not the reverse (not sure how 
> they came up with such a rule...). Which means we would need:

Thank you Marc for digging deep into that.

> 
> keep DECL_IS_OPERATOR_NEW for the current uses
> 
> DECL_IS_REPLACEABLE_OPERATOR_NEW (equivalent to DECL_IS_OPERATOR_NEW && 
> DECL_IS_MALLOC? not exactly but close I think) for DCE
> 
> DECL_IS_OPERATOR_DELETE (which also includes some class overloads) for DCE

Note that with the current version of the patch we are out of free bits in 
struct GTY(()) tree_function_decl.
Would it be possible to tweak the current patch to cover what you described?

> 
> Maybe we can ignore the class-specific operator delete if it simplifies 
> things.

I would like to make it as simple as possible, yes :P

Martin

> 
> Sorry for the messy comments, the messy rules don't help...
> 



Re: [PATCH] Re-instantiate access-path based analysis during VN

2019-07-09 Thread Richard Biener
On Mon, 8 Jul 2019, Richard Biener wrote:

> 
> This re-instantiates the patch I had to revert earlier, doing it in
> a safer way.  We record the original ref so we can do an additional
> disambiguation during vn_reference_lookup_3.
> 
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.

And this is what I committed.

Bootstrapped / tested on x86_64-unknown-linux-gnu.

Richard.

2019-07-09  Richard Biener  

* tree-ssa-sccvn.c (struct vn_walk_cb_data): Add orig_ref member.
(vn_reference_lookup_3): If the main ref has no access path recorded
but orig_ref has, use it to do access-path based disambiguation.
(vn_reference_lookup_pieces): Adjust.
(vn_reference_lookup): Pass down original ref if we valueized.

* gcc.dg/tree-ssa/alias-access-path-1.c: Scan fre1 dump.
* gcc.dg/tree-ssa/alias-access-path-2.c: Likewise.
* gcc.dg/tree-ssa/alias-access-path-8.c: Likewise.

Index: gcc/tree-ssa-sccvn.c
===================================================================
--- gcc/tree-ssa-sccvn.c (revision 273234)
+++ gcc/tree-ssa-sccvn.c (working copy)
@@ -1670,15 +1671,18 @@ struct pd_data
 
 struct vn_walk_cb_data
 {
-  vn_walk_cb_data (vn_reference_t vr_, tree *last_vuse_ptr_,
+  vn_walk_cb_data (vn_reference_t vr_, tree orig_ref_, tree *last_vuse_ptr_,
   vn_lookup_kind vn_walk_kind_, bool tbaa_p_)
-: vr (vr_), last_vuse_ptr (last_vuse_ptr_), vn_walk_kind (vn_walk_kind_),
-  tbaa_p (tbaa_p_), known_ranges (NULL)
-   {}
+: vr (vr_), last_vuse_ptr (last_vuse_ptr_),
+  vn_walk_kind (vn_walk_kind_), tbaa_p (tbaa_p_), known_ranges (NULL)
+   {
+ ao_ref_init (&orig_ref, orig_ref_);
+   }
   ~vn_walk_cb_data ();
   void *push_partial_def (const pd_data& pd, tree, HOST_WIDE_INT);
 
   vn_reference_t vr;
+  ao_ref orig_ref;
   tree *last_vuse_ptr;
   vn_lookup_kind vn_walk_kind;
   bool tbaa_p;
@@ -2246,6 +2298,28 @@ vn_reference_lookup_3 (ao_ref *ref, tree
  lhs_ref_ok = true;
}
 
+  /* Besides valueizing the LHS we can also use access-path based
+ disambiguation on the original non-valueized ref.  */
+  if (!ref->ref
+ && lhs_ref_ok
+ && data->orig_ref.ref)
+   {
+ /* We want to use the non-valueized LHS for this, but avoid redundant
+work.  */
+ ao_ref *lref = &lhs_ref;
+ ao_ref lref_alt;
+ if (valueized_anything)
+   {
+ ao_ref_init (&lref_alt, lhs);
+ lref = &lref_alt;
+   }
+ if (!refs_may_alias_p_1 (&data->orig_ref, lref, data->tbaa_p))
+   {
+ *disambiguate_only = true;
+ return NULL;
+   }
+   }
+
   /* If we reach a clobbering statement try to skip it and see if
  we find a VN result with exactly the same value as the
 possible clobber.  In this case we can ignore the clobber
@@ -2763,6 +2857,9 @@ vn_reference_lookup_3 (ao_ref *ref, tree
 
   /* Do not update last seen VUSE after translating.  */
   data->last_vuse_ptr = NULL;
+  /* Invalidate the original access path since it now contains
+ the wrong base.  */
+  data->orig_ref.ref = NULL_TREE;
 
   /* Keep looking for the adjusted *REF / VR pair.  */
   return NULL;
@@ -2923,6 +3020,9 @@ vn_reference_lookup_3 (ao_ref *ref, tree
 
   /* Do not update last seen VUSE after translating.  */
   data->last_vuse_ptr = NULL;
+  /* Invalidate the original access path since it now contains
+ the wrong base.  */
+  data->orig_ref.ref = NULL_TREE;
 
   /* Keep looking for the adjusted *REF / VR pair.  */
   return NULL;
@@ -2983,7 +3083,7 @@ vn_reference_lookup_pieces (tree vuse, a
 {
   ao_ref r;
   unsigned limit = PARAM_VALUE (PARAM_SCCVN_MAX_ALIAS_QUERIES_PER_ACCESS);
-  vn_walk_cb_data data (&vr1, NULL, kind, true);
+  vn_walk_cb_data data (&vr1, NULL_TREE, NULL, kind, true);
   if (ao_ref_init_from_vn_reference (&r, set, type, vr1.operands))
*vnresult =
  (vn_reference_t)walk_non_aliased_vuses (&r, vr1.vuse, true,
@@ -3040,7 +3140,8 @@ vn_reference_lookup (tree op, tree vuse,
  || !ao_ref_init_from_vn_reference (&r, vr1.set, vr1.type,
 vr1.operands))
ao_ref_init (&r, op);
-  vn_walk_cb_data data (&vr1, last_vuse_ptr, kind, tbaa_p);
+  vn_walk_cb_data data (&vr1, r.ref ? NULL_TREE : op,
+   last_vuse_ptr, kind, tbaa_p);
   wvnresult =
(vn_reference_t)walk_non_aliased_vuses (&r, vr1.vuse, tbaa_p,
vn_reference_lookup_2,
Index: gcc/testsuite/gcc.dg/tree-ssa/alias-access-path-1.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/alias-access-path-1.c (revision 273234)
+++ gcc/testsuite/gcc.dg/tree-ssa/alias-access-path-1.c (working copy)
@@ -1,5 +1,5

[Ada] Missing escape of the double quote in JSON output

2019-07-09 Thread Pierre-Marie de Rodat
In Ada, the name of operators contains a pair of double quotes, which
need to be properly escaped when the name appears in the JSON output of
-gnatR.

The change also ensures that formal parameters are not listed in the
layout information, since this information is not back-annotated for
them.

Tested on x86_64-pc-linux-gnu, committed on trunk
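
The escaping rule described above can be sketched in C as follows. This is only an illustration of the idea, not a translation of repinfo.adb: `write_name` and its parameters are hypothetical, and, like the patch, it escapes nothing but the double quote.

```c
#include <stddef.h>

/* Copy NAME into OUT, putting a backslash in front of every double
   quote when TO_JSON is nonzero.  Returns the number of characters
   written (excluding the terminating NUL), or -1 if OUT is too small.  */
static int write_name (const char *name, int to_json,
                       char *out, size_t out_size)
{
  size_t n = 0;

  if (out_size == 0)
    return -1;

  for (const char *p = name; *p != '\0'; p++)
    {
      if (*p == '"' && to_json)
        {
          /* Keep room for the backslash and the final NUL.  */
          if (n + 1 >= out_size)
            return -1;
          out[n++] = '\\';
        }
      if (n + 1 >= out_size)
        return -1;
      out[n++] = *p;
    }

  out[n] = '\0';
  return (int) n;
}
```

With JSON output enabled, the operator name `"+"` comes out as `\"+\"`; in the default mode it is copied unchanged.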

2019-07-09  Eric Botcazou  

gcc/ada/

* repinfo.adb (List_Entities): Disregard formals altogether.
(List_Name): Properly escape the double quote in the JSON
output.

--- gcc/ada/repinfo.adb
+++ gcc/ada/repinfo.adb
@@ -525,9 +525,6 @@ package body Repinfo is
 
   List_Entities (E, Bytes_Big_Endian, True);
 
-   elsif Is_Formal (E) and then In_Subprogram then
-  null;
-
elsif Ekind_In (E, E_Entry,
   E_Entry_Family,
   E_Subprogram_Type)
@@ -560,12 +557,10 @@ package body Repinfo is
  List_Type_Info (E);
   end if;
 
-   elsif Ekind_In (E, E_Variable, E_Constant) then
-  if List_Representation_Info >= 2 then
- List_Object_Info (E);
-  end if;
+   --  Note that formals are not annotated so we skip them here
 
-   elsif Ekind (E) = E_Loop_Parameter or else Is_Formal (E) then
+   elsif Ekind_In (E, E_Variable, E_Constant, E_Loop_Parameter)
+   then
   if List_Representation_Info >= 2 then
  List_Object_Info (E);
   end if;
@@ -899,6 +894,8 @@ package body Repinfo is
---
 
procedure List_Name (Ent : Entity_Id) is
+  C : Character;
+
begin
   --  List the qualified name recursively, except
   --  at compilation unit level in default mode.
@@ -914,7 +911,16 @@ package body Repinfo is
 
   Get_Unqualified_Decoded_Name_String (Chars (Ent));
   Set_Casing (Unit_Casing);
-  Write_Str (Name_Buffer (1 .. Name_Len));
+
+  --  The name of operators needs to be properly escaped for JSON
+
+  for J in 1 .. Name_Len loop
+ C := Name_Buffer (J);
+ if C = '"' and then List_Representation_Info_To_JSON then
+Write_Char ('\');
+ end if;
+ Write_Char (C);
+  end loop;
end List_Name;
 
-


