GCC 8.0.0 Status Report (2018-01-15), Trunk in Regression and Documentation fixes only mode

2018-01-15 Thread Richard Biener

Status
======

GCC 8 is now in regression and documentation fixes stage, as if
trunk were a release branch.

We're still in pretty bad shape regression-wise.  Please also take
the opportunity to check the state of your favorite host/target
combination to make sure building and testing works appropriately.


Quality Data
============

Priority          #   Change from last report
--------        ---   -----------------------
P1               36   +  27
P2              133   -   1
P3               57   -  51
P4              158   -   3
P5               27
--------        ---   -----------------------
Total P1-P3     226   -  25
Total           411   -  28


Previous Report
===============

https://gcc.gnu.org/ml/gcc/2018-01/msg00033.html


[x86,avx] Fix __builtin_cpu_supports for icelake and cannonlake isa

2018-01-15 Thread Koval, Julia
Hi,
This patch fixes subj. Ok for trunk?

gcc/
* config/i386/i386.c (F_AVX512VBMI2, F_GFNI, F_VPCLMULQDQ, F_AVX512VNNI,
F_AVX512BITALG): New.

gcc/testsuite/
* gcc.target/i386/builtin_target.c (check_intel_cpu_model): Add 
cannonlake.
(check_features): Add avx512vbmi2, gfni, vpclmulqdq, avx512vnni,
avx512bitalg.

libgcc/
* config/i386/cpuinfo.c (get_available_features): Add
FEATURE_AVX512VBMI2, FEATURE_GFNI, FEATURE_VPCLMULQDQ,
FEATURE_AVX512VNNI, FEATURE_AVX512BITALG.
* config/i386/cpuinfo.h (processor_features): Add FEATURE_AVX512VBMI2,
FEATURE_GFNI, FEATURE_VPCLMULQDQ, FEATURE_AVX512VNNI,
FEATURE_AVX512BITALG.


0001-new-isa-builtin_cpu-test.patch
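For readers unfamiliar with the builtin this patch extends, here is a minimal sketch of querying one of the newly listed features at run time. It assumes an x86 target and a GCC new enough to accept the "gfni" feature name (GCC 8 or later, per this patch). On any other configuration the sketch reports "unavailable" rather than guessing. The helper names cpu_has_gfni and gfni_status_text are illustrative and are not part of the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 if the running CPU reports GFNI, 0 if it does not, and -1 when
   this compiler/target cannot perform the check (assumption: only GCC >= 8
   on x86 is known to accept the "gfni" feature name).  */
int cpu_has_gfni(void)
{
#if (defined(__x86_64__) || defined(__i386__)) \
    && defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 8
  __builtin_cpu_init();
  return __builtin_cpu_supports("gfni") ? 1 : 0;
#else
  return -1;
#endif
}

/* Maps the status code above to a printable string.  */
const char *gfni_status_text(int status)
{
  assert(status >= -1 && status <= 1);
  if (status == 1)
    return "supported";
  if (status == 0)
    return "not supported";
  return "check unavailable";
}
```

A caller might print the result with printf("gfni: %s\n", gfni_status_text(cpu_has_gfni())).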


Re: [PATCH 0/5] x86: CVE-2017-5715, aka Spectre

2018-01-15 Thread Richard Biener
On Sun, Jan 14, 2018 at 4:08 PM, H.J. Lu  wrote:
> Now my patch set has been checked into trunk.  Here is a patch set
> to move struct ix86_frame to machine_function on GCC 7, which is
> needed to backport the patch set to GCC 7:
>
> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01239.html
> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01240.html
> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01241.html
>
> OK for gcc-7-branch?

Yes, backporting is ok - please watch for possible fallout on trunk and make
sure to adjust the backport accordingly.  I plan to do GCC 7.3 RC1 on
Wednesday now with the final release about a week later if no issue shows
up.

Thanks for all your work!
Richard.

> Thanks.
>
>
> --
> H.J.


Re: [PATCH] Introduce -fwrapp and make -fno-strict-overflow imply it (PR middle-end/82694)

2018-01-15 Thread Richard Biener
On Fri, 12 Jan 2018, Jakub Jelinek wrote:

> Hi!
> 
> Apparently Linux kernel contains various UB code that has been worked around
> through -fno-strict-overflow in 7.x and before, but when
> POINTER_TYPE_OVERFLOW_UNDEFINED has been removed it now fails to boot.
> 
> The following patch follows the comments in the PR, essentially reverts
> Bin's removal of that, except that it is now controlled by a separate option
> and is included in TYPE_OVERFLOW_{WRAPS,UNDEFINED} macros.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

This is ok with the name of the option/flag changed as suggested
by Martin.

Thanks,
Richard.

> 2018-01-12  Jakub Jelinek  
> 
>   PR middle-end/82694
>   * common.opt (fstrict-overflow): No longer an alias.
>   (fwrapp): New option.
>   * tree.h (TYPE_OVERFLOW_WRAPS, TYPE_OVERFLOW_UNDEFINED): Define
>   also for pointer types based on flag_wrapp.
>   * opts.c (common_handle_option) <case OPT_fstrict_overflow>: Set
>   opts->x_flag_wrap[pv] to !value, clear opts->x_flag_trapv if
>   opts->x_flag_wrapv got set.
>   * fold-const.c (fold_comparison, fold_binary_loc): Revert 2017-08-01
>   changes, just use TYPE_OVERFLOW_UNDEFINED on pointer type instead of
>   POINTER_TYPE_OVERFLOW_UNDEFINED.
>   * match.pd: Likewise in address comparison pattern.
>   * doc/invoke.texi: Document -fwrapv and -fstrict-overflow.
> 
>   * gcc.dg/no-strict-overflow-7.c: Revert 2017-08-01 changes.
>   * gcc.dg/tree-ssa/pr81388-1.c: Likewise.
> 
> --- gcc/common.opt.jj 2018-01-03 10:19:54.936533922 +0100
> +++ gcc/common.opt 2018-01-12 14:53:28.254485349 +0100
> @@ -2411,8 +2411,8 @@ Common Report Var(flag_strict_aliasing)
>  Assume strict aliasing rules apply.
>  
>  fstrict-overflow
> -Common NegativeAlias Alias(fwrapv)
> -Treat signed overflow as undefined.  Negated as -fwrapv.
> +Common Report
> +Treat signed overflow as undefined.  Negated as -fwrapv -fwrapp.
>  
>  fsync-libcalls
>  Common Report Var(flag_sync_libcalls) Init(1)
> @@ -2860,6 +2860,10 @@ fwhole-program
>  Common Report Var(flag_whole_program) Init(0)
>  Perform whole program optimizations.
>  
> +fwrapp
> +Common Report Var(flag_wrapp) Optimization
> +Assume pointer overflow wraps around.
> +
>  fwrapv
>  Common Report Var(flag_wrapv) Optimization
>  Assume signed arithmetic overflow wraps around.
> --- gcc/tree.h.jj 2018-01-11 18:58:50.993392760 +0100
> +++ gcc/tree.h 2018-01-12 15:04:14.480526788 +0100
> @@ -829,13 +829,16 @@ extern void omp_clause_range_check_faile
>  /* Same as TYPE_UNSIGNED but converted to SIGNOP.  */
>  #define TYPE_SIGN(NODE) ((signop) TYPE_UNSIGNED (NODE))
>  
> -/* True if overflow wraps around for the given integral type.  That
> +/* True if overflow wraps around for the given integral or pointer type.  That
> is, TYPE_MAX + 1 == TYPE_MIN.  */
>  #define TYPE_OVERFLOW_WRAPS(TYPE) \
> -  (ANY_INTEGRAL_TYPE_CHECK(TYPE)->base.u.bits.unsigned_flag || flag_wrapv)
> +  (POINTER_TYPE_P (TYPE) \
> +   ? flag_wrapp  \
> +   : (ANY_INTEGRAL_TYPE_CHECK(TYPE)->base.u.bits.unsigned_flag   \
> +  || flag_wrapv))
>  
> -/* True if overflow is undefined for the given integral type.  We may
> -   optimize on the assumption that values in the type never overflow.
> +/* True if overflow is undefined for the given integral or pointer type.
> +   We may optimize on the assumption that values in the type never overflow.
>  
> IMPORTANT NOTE: Any optimization based on TYPE_OVERFLOW_UNDEFINED
> must issue a warning based on warn_strict_overflow.  In some cases
> @@ -843,8 +846,10 @@ extern void omp_clause_range_check_faile
> other cases it will be appropriate to simply set a flag and let the
> caller decide whether a warning is appropriate or not.  */
>  #define TYPE_OVERFLOW_UNDEFINED(TYPE)\
> -  (!ANY_INTEGRAL_TYPE_CHECK(TYPE)->base.u.bits.unsigned_flag \
> -   && !flag_wrapv && !flag_trapv)
> +  (POINTER_TYPE_P (TYPE) \
> +   ? !flag_wrapp \
> +   : (!ANY_INTEGRAL_TYPE_CHECK(TYPE)->base.u.bits.unsigned_flag  \
> +  && !flag_wrapv && !flag_trapv))
>  
>  /* True if overflow for the given integral type should issue a
> trap.  */
> --- gcc/opts.c.jj 2018-01-03 10:19:56.142534113 +0100
> +++ gcc/opts.c 2018-01-12 14:55:06.670494955 +0100
> @@ -2465,6 +2465,13 @@ common_handle_option (struct gcc_options
>   opts->x_flag_wrapv = 0;
>break;
>  
> +case OPT_fstrict_overflow:
> +  opts->x_flag_wrapv = !value;
> +  opts->x_flag_wrapp = !value;
> +  if (!value)
> + opts->x_flag_trapv = 0;
> +  break;
> +
>  case OPT_fipa_icf:
>opts->x_flag_ipa_icf_functions = value;
>opts->x_flag_ipa_icf_variables = value;
> --- gcc/fold-const.c.jj   2018-01-04 22:08:04.394684734 +01
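The interaction between the three flags in the quoted tree.h and opts.c hunks can be easier to follow as a small standalone model. The sketch below is not GCC code: the type_kind enum, the flags struct and the function names are hypothetical stand-ins for the tree type checks and the flag_wrapv, flag_wrapp and flag_trapv globals, and it only mirrors the logic visible in the hunks above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for GCC's type classification and option flags.  */
enum type_kind { KIND_SIGNED_INT, KIND_UNSIGNED_INT, KIND_POINTER };
struct flags { bool wrapv, wrapp, trapv; };

/* Models TYPE_OVERFLOW_WRAPS as rewritten in the tree.h hunk.  */
bool type_overflow_wraps(enum type_kind k, const struct flags *f)
{
  assert(f != NULL);
  if (k == KIND_POINTER)
    return f->wrapp;
  return k == KIND_UNSIGNED_INT || f->wrapv;
}

/* Models TYPE_OVERFLOW_UNDEFINED as rewritten in the tree.h hunk.  */
bool type_overflow_undefined(enum type_kind k, const struct flags *f)
{
  assert(f != NULL);
  if (k == KIND_POINTER)
    return !f->wrapp;
  return k != KIND_UNSIGNED_INT && !f->wrapv && !f->trapv;
}

/* Models the new OPT_fstrict_overflow case in the opts.c hunk:
   value is true for -fstrict-overflow, false for -fno-strict-overflow.  */
struct flags apply_strict_overflow(struct flags f, bool value)
{
  f.wrapv = !value;
  f.wrapp = !value;
  if (!value)
    f.trapv = false;
  return f;
}
```

Under this model, -fno-strict-overflow makes both signed integer and pointer overflow wrap, which is the behaviour the kernel relied on in 7.x according to the description above.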

Re: [PATCH] Introduce -fwrapp and make -fno-strict-overflow imply it (PR middle-end/82694)

2018-01-15 Thread Richard Biener
On Fri, 12 Jan 2018, Marc Glisse wrote:

> On Fri, 12 Jan 2018, Jakub Jelinek wrote:
> 
> > Apparently Linux kernel contains various UB code that has been worked around
> > through -fno-strict-overflow in 7.x and before, but when
> > POINTER_TYPE_OVERFLOW_UNDEFINED has been removed it now fails to boot.
> > 
> > The following patch follows the comments in the PR, essentially reverts
> > Bin's removal of that, except that it is now controlled by a separate option
> > and is included in TYPE_OVERFLOW_{WRAPS,UNDEFINED} macros.
> 
> I am pretty sure there are other patterns in match.pd that need protection
> now, with pointer_diff.
> 
> (for op (simple_comparison)
>  (simplify
>   (op (pointer_diff@3 @0 @2) (pointer_diff @1 @2))
>   (if (!TYPE_OVERFLOW_SANITIZED (TREE_TYPE (@2)))
>    (op @0 @1))))
> 
> This is ready for sanitizers but not for wrapping pointers. And there are a
> few more like it.
> 
> 
> There were discussions at some point of implementing -fwrapp in the front-end,
> generating (unsigned)q-(unsigned)p or (unsigned)p<(unsigned)q for pointer
> operations. It has the advantage that the middle-end doesn't need to know
> about those variants, but it might have some fallout (and I am not sure what
> to do when the middle-end creates new pointer operations), and I can
> understand that in stage 3 you are more interested in an approach that looks
> like a reversal to a former known-ok state.

Yes, I think reversal to previous behavior is the only reasonable
thing at the moment.

Richard.
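The front-end alternative Marc describes, rewriting pointer operations as unsigned arithmetic, can be imitated by hand in user code. The sketch below shows that idea in source form only. It assumes the common implementation-defined behaviour that converting a pointer to uintptr_t yields its linear address, and the function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* q - p computed in unsigned arithmetic, so wrap-around is well defined.  */
uintptr_t wrapping_ptr_diff(const void *q, const void *p)
{
  return (uintptr_t) q - (uintptr_t) p;
}

/* p < q compared on the unsigned address values.  */
int wrapping_ptr_less(const void *p, const void *q)
{
  return (uintptr_t) p < (uintptr_t) q;
}

/* Returns 1 if base + len would wrap around the address space.  Written on
   pointers directly (base + len < base), this test relies on pointer
   overflow and may be optimized away when that overflow is undefined.  */
int range_wraps(const void *base, size_t len)
{
  assert(base != NULL);
  uintptr_t b = (uintptr_t) base;
  return b + len < b;
}
```

Code written this way does not depend on how the compiler treats pointer overflow, at the cost of being more verbose than plain pointer comparisons.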


Re: [PATCH][Middle-end][version 2]2nd patch of PR78809 and PR83026

2018-01-15 Thread Richard Biener
On Fri, 12 Jan 2018, Jeff Law wrote:

> On 12/21/2017 02:25 PM, Qing Zhao wrote:
> > Hi, 
> > 
> > I updated my patch based on all your comments. 
> > 
> > the major changes are the following:
> > 
> > 1. replace the candidate calls with __builtin_str(n)cmp_eq instead of
> > __builtin_memcmp_eq; in builtins.c, when expanding the new
> > __builtin_str(n)cmp_eq call, expand them first as __builtin_memcmp_eq;
> > if that does not succeed, change the call back to __builtin_str(n)cmp.
> > 2. replace the call to “get_range_strlen” with “compute_objsize”.
> Please read the big comment before compute_objsize.  If you are going to
> use it to influence code generation or optimization, then you're most
> likely doing something wrong.
> 
> compute_objsize can return an estimate in some circumstances.
> 
> 
> > 3. add the missing case for equality checking with zero;
> > 4. adjust the new testing case for PR83026; add a new testing case for
> > the missing case added in 3.
> > 5. update “uhwi” to “shwi” where needed;
> > 6. some other minor format changes.
> > 
> > the changes are retested on x86 and aarch64, bootstrapped and regression
> > tested. no issue.
> > 
> > Okay for trunk?
> > 
> > thanks.
> > 
> > Qing
> > 
> > Please see the updated patch:
> > 
> > gcc/ChangeLog:
> > 
> > +2017-12-21  Qing Zhao  
> > +
> > +   PR middle-end/78809
> > +   PR middle-end/83026
> > +   * builtins.c (expand_builtin): Add the handling of BUILT_IN_STRCMP_EQ
> > +   and BUILT_IN_STRNCMP_EQ.
> > +   * builtins.def: Add new builtins BUILT_IN_STRCMP_EQ and
> > +   BUILT_IN_STRNCMP_EQ.
> > +   * tree-ssa-strlen.c (compute_string_length): New function.
> > +   (handle_builtin_string_cmp): New function to handle calls to
> > +   string compare functions.
> > +   (strlen_optimize_stmt): Add handling to builtin string compare
> > +   calls. 
> > +   * tree.c (build_common_builtin_nodes): Add new defines of
> > +   BUILT_IN_STRNCMP_EQ and BUILT_IN_STRCMP_EQ.
> > +
> > gcc/testsuite/ChangeLog
> > 
> > +2017-12-21  Qing Zhao   
> > +
> > +   PR middle-end/78809
> > +   * gcc.dg/strcmpopt_2.c: New testcase.
> > +   * gcc.dg/strcmpopt_3.c: New testcase.
> > +
> > +   PR middle-end/83026
> > +   * gcc.dg/strcmpopt_3.c: New testcase.
> What I don't like here is the introduction of STRCMP_EQ and STRNCMP_EQ.
> ISTM that if you're going to introduce those new builtins, then you have
> to audit all the code that runs between their introduction into the IL
> and when you expand them to ensure they're handled properly.
> 
> All you're really doing is carrying along a status bit about what
> tree-ssa-strlen did.  So you could possibly store that status bit somewhere.
> 
> The concern with both is that something later invalidates the analysis
> you've done.  I'm having a hard time coming up with a case where this
> could happen, but I'm definitely concerned about this possibility.
> Though I feel it's more likely to happen if we store a status bit vs
> using STRCMP_EQ STRNCMP_EQ.
> 
> [ For example, we have two calls with the same arguments, but one feeds
> an equality test, the other does not.  If we store a status bit that one
> could be transformed, but then we end up CSE-ing the two calls, the
> status bit would be incorrect because one of the calls did not feed an
> equality test.  Hmmm. ]
> 
> 
> 
> 
> > diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
> > index 94f20ef..57563ef 100644
> > --- a/gcc/tree-ssa-strlen.c
> > +++ b/gcc/tree-ssa-strlen.c
> > @@ -2540,6 +2540,216 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
> >return false;
> >  }
> >  
> > +/* Given an index to the strinfo vector, compute the string length for the
> > +   corresponding string. Return -1 when unknown.  */
> > + 
> > +static HOST_WIDE_INT 
> > +compute_string_length (int idx)
> > +{
> > +  HOST_WIDE_INT string_leni = -1; 
> > +  gcc_assert (idx != 0);
> > +
> > +  if (idx < 0)
> > +string_leni = ~idx;
> So it seems to me you should just
>   return ~idx;
> 
> Then appropriately simplify the rest of the code.
> 
> > +
> > +/* Handle a call to strcmp or strncmp. When the result is ONLY used to do 
> > +   equality test against zero:
> > +
> > +   A. When both arguments are constant strings and it's a strcmp:
> > +  * if the length of the strings are NOT equal, we can safely fold the call
> > +to a non-zero value.
> > +  * otherwise, do nothing now.
> I'm guessing your comment needs a bit of work.  If both arguments are
> constant strings, then we can just use the host str[n]cmp to fold the
> str[n]cmp to a constant.  Presumably this is handled earlier :-)
> 
> So what I'm guessing is you're really referring to the case where the
> lengths are known constants, even if the contents of the strings
> themselves are not.  In that case if its an equality comparison, then
> you can fold to a constant.  Otherwise we do nothing.  So I think the
> comment needs updating here.
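The two ideas under review here can be illustrated outside the compiler. This is only a sketch and not the patch's implementation. fold_streq_by_length models the "known but different lengths" case, using -1 for an unknown length as the quoted compute_string_length comment does. streq_via_memcmp models expanding an equality-only strcmp as a memcmp, under the assumption that the buffer is known to be at least as large as the literal including its terminating NUL. All names are hypothetical.

```c
#include <assert.h>
#include <string.h>

enum fold_result { FOLD_UNKNOWN, FOLD_NOT_EQUAL };

/* If both string lengths are known (>= 0) and differ, the strings cannot
   compare equal, so an equality test on strcmp folds to "not equal".
   A length of -1 means unknown, in which case nothing can be folded.  */
enum fold_result fold_streq_by_length(long len_a, long len_b)
{
  if (len_a >= 0 && len_b >= 0 && len_a != len_b)
    return FOLD_NOT_EQUAL;
  return FOLD_UNKNOWN;
}

/* Equality-only replacement for strcmp (buf, lit) == 0.  Comparing
   strlen (lit) + 1 bytes includes the terminating NUL, so the result matches
   strcmp as long as buf really has that many readable bytes.  */
int streq_via_memcmp(const char *buf, size_t buf_size, const char *lit)
{
  size_t n = strlen(lit) + 1;
  assert(buf_size >= n);  /* the precondition the optimization must prove */
  return memcmp(buf, lit, n) == 0;
}
```

The assert in streq_via_memcmp corresponds to the size information the pass needs. Jeff's point above is that an estimate of that size, such as compute_objsize can return, is not enough to justify the transformation.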
> 
> 

Re: [patch, fortran] Change ABI for F2008 - minloc/maxloc BACK argument

2018-01-15 Thread Janne Blomqvist
On Sun, Jan 14, 2018 at 12:58 PM, Thomas Koenig  wrote:
> Hello world,
>
> here is the latest take on the min/maxloc ABI change for BACK.
> This version now passes BACK as a GFC_LOGICAL_4 by value in all cases.
> I did this by using the existing %VAL mechanism. I also added
> another test case which crashed during one stage of development.
>
> So, OK for trunk?

Index: gcc/fortran/trans-intrinsic.c
===================================================================
--- gcc/fortran/trans-intrinsic.c(Revision 256605)
+++ gcc/fortran/trans-intrinsic.c(Arbeitskopie)
@@ -4562,6 +4562,16 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_exp
   tree pos;
   int n;

+  actual = expr->value.function.actual;
+
+  /* The last argument, BOUND, is passed by value. Ensure that
+ by setting its name to %VAL. */


Here, s/BOUND/BACK/ I presume?

Also, it seems in the library some of the back arguments are by value,
but some are still passed as pointers. Based on some quick grepping of
the patch they seem to come from  m4/iforeach.m4  (6 lines in total).

With these fixes, Ok for trunk.
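As a reminder of what the BACK argument changes, here is a scalar sketch of MINLOC on a rank-1 integer array with the flag passed by value as a 32-bit integer. It is not libgfortran's implementation. The function name and signature are assumptions made for illustration, and only the Fortran semantics are taken as given: first minimum without BACK, last minimum with BACK, and 0 for a zero-sized array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the 1-based position of the minimum of array[0..n-1], or 0 when
   n is 0.  With back == 0 the first minimum wins; otherwise the last.
   The flag is passed by value, mirroring the ABI choice discussed above.  */
size_t minloc_i4(const int32_t *array, size_t n, int32_t back)
{
  assert(array != NULL || n == 0);
  if (n == 0)
    return 0;

  size_t pos = 0;
  for (size_t i = 1; i < n; i++)
    {
      /* A strictly smaller element always wins; an equal one only wins
         when searching from the back.  */
      if (array[i] < array[pos] || (back && array[i] == array[pos]))
        pos = i;
    }
  return pos + 1;
}
```

For the array {3, 1, 2, 1} this returns 2 without BACK and 4 with it.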


-- 
Janne Blomqvist


[PATCH] PR83804, LTO memory consumption

2018-01-15 Thread Richard Biener

This axes the trivial part of trees made possible by early LTO debug
to shrink WPA memory use.  That's mainly TYPE_DECLs and BINFOs.

Shaves off a bit of the 5% regression we've seen there.

LTO bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2018-01-11  Richard Biener  

PR lto/83804
* tree.c (free_lang_data_in_type): Always unlink TYPE_DECLs
from TYPE_FIELDS.  Free TYPE_BINFO if not used by devirtualization.
Reset type names to their identifier if their TYPE_DECL doesn't
have linkage (and thus is used for ODR and devirt).
(save_debug_info_for_decl): Remove.
(save_debug_info_for_type): Likewise.
(add_tree_to_fld_list): Adjust.
* tree-pretty-print.c (dump_generic_node): Make dumping of
type names more robust.

Index: gcc/tree.c
===================================================================
--- gcc/tree.c  (revision 256562)
+++ gcc/tree.c  (working copy)
@@ -5127,15 +5127,10 @@ free_lang_data_in_type (tree type)
   TREE_PURPOSE (p) = NULL;
   else if (RECORD_OR_UNION_TYPE_P (type))
 {
-  /* Remove members that are not FIELD_DECLs (and maybe
-TYPE_DECLs) from the field list of an aggregate.  These occur
-in C++.  */
+  /* Remove members that are not FIELD_DECLs from the field list
+of an aggregate.  These occur in C++.  */
   for (tree *prev = &TYPE_FIELDS (type), member; (member = *prev);)
-   if (TREE_CODE (member) == FIELD_DECL
-   || (TREE_CODE (member) == TYPE_DECL
-   && !DECL_IGNORED_P (member)
-   && debug_info_level > DINFO_LEVEL_TERSE
-   && !is_redundant_typedef (member)))
+   if (TREE_CODE (member) == FIELD_DECL)
  prev = &DECL_CHAIN (member);
else
  *prev = DECL_CHAIN (member);
@@ -5149,15 +5144,9 @@ free_lang_data_in_type (tree type)
{
  free_lang_data_in_binfo (TYPE_BINFO (type));
  /* We need to preserve link to bases and virtual table for all
-polymorphic types to make devirtualization machinery working.
-Debug output cares only about bases, but output also
-virtual table pointers so merging of -fdevirtualize and
--fno-devirtualize units is easier.  */
- if ((!BINFO_VTABLE (TYPE_BINFO (type))
-  || !flag_devirtualize)
- && ((!BINFO_N_BASE_BINFOS (TYPE_BINFO (type))
-  && !BINFO_VTABLE (TYPE_BINFO (type)))
- || debug_info_level != DINFO_LEVEL_NONE))
+polymorphic types to make devirtualization machinery working.  */
+ if (!BINFO_VTABLE (TYPE_BINFO (type))
+ || !flag_devirtualize)
TYPE_BINFO (type) = NULL;
}
 }
@@ -5185,6 +5174,11 @@ free_lang_data_in_type (tree type)
   while (ctx && TREE_CODE (ctx) == BLOCK);
   TYPE_CONTEXT (type) = ctx;
 }
+
+  /* Drop TYPE_DECLs in TYPE_NAME in favor of the identifier in the
+ TYPE_DECL if the type doesn't have linkage.  */
+  if (! type_with_linkage_p (type))
+TYPE_NAME (type) = TYPE_IDENTIFIER (type);
 }
 
 
@@ -5407,34 +5401,6 @@ struct free_lang_data_d
 };
 
 
-/* Save all language fields needed to generate proper debug information
-   for DECL.  This saves most fields cleared out by free_lang_data_in_decl.  */
-
-static void
-save_debug_info_for_decl (tree t)
-{
-  /*struct saved_debug_info_d *sdi;*/
-
-  gcc_assert (debug_info_level > DINFO_LEVEL_TERSE && t && DECL_P (t));
-
-  /* FIXME.  Partial implementation for saving debug info removed.  */
-}
-
-
-/* Save all language fields needed to generate proper debug information
-   for TYPE.  This saves most fields cleared out by free_lang_data_in_type.  */
-
-static void
-save_debug_info_for_type (tree t)
-{
-  /*struct saved_debug_info_d *sdi;*/
-
-  gcc_assert (debug_info_level > DINFO_LEVEL_TERSE && t && TYPE_P (t));
-
-  /* FIXME.  Partial implementation for saving debug info removed.  */
-}
-
-
 /* Add type or decl T to one of the list of tree nodes that need their
language data removed.  The lists are held inside FLD.  */
 
@@ -5442,17 +5408,9 @@ static void
 add_tree_to_fld_list (tree t, struct free_lang_data_d *fld)
 {
   if (DECL_P (t))
-{
-  fld->decls.safe_push (t);
-  if (debug_info_level > DINFO_LEVEL_TERSE)
-   save_debug_info_for_decl (t);
-}
+fld->decls.safe_push (t);
   else if (TYPE_P (t))
-{
-  fld->types.safe_push (t);
-  if (debug_info_level > DINFO_LEVEL_TERSE)
-   save_debug_info_for_type (t);
-}
+fld->types.safe_push (t);
   else
 gcc_unreachable ();
 }
Index: gcc/tree-pretty-print.c
===================================================================
--- gcc/tree-pretty-print.c (revision 256562)
+++ gcc/tree-pretty-print.c (working copy)
@@ -1412,8 +1412,8 @@ dump_generic_node (pretty_printer *pp, t
  pp_space (pp);
  pp_left_paren (pp);
  pp_string 
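The TYPE_FIELDS loop in the tree.c hunk above uses a pointer-to-pointer walk to unlink list members in place. A standalone sketch of the same idiom on a plain singly linked list follows. The struct, the enum and the function name are hypothetical and stand in for tree nodes, TREE_CODE and DECL_CHAIN.

```c
#include <assert.h>
#include <stddef.h>

enum member_kind { FIELD, TYPEDEF, METHOD };

struct member
{
  enum member_kind kind;
  struct member *next;
};

/* Unlinks every member that is not a FIELD and returns how many were
   removed.  prev always points at the link that leads to the current
   member, so removing a node is a single store with no special case
   for the list head.  */
size_t keep_only_fields(struct member **head)
{
  assert(head != NULL);
  size_t removed = 0;
  struct member *m;

  for (struct member **prev = head; (m = *prev) != NULL;)
    {
      if (m->kind == FIELD)
        prev = &m->next;        /* keep: advance to the next link */
      else
        {
          *prev = m->next;      /* drop: bypass the node */
          removed++;
        }
    }
  return removed;
}
```

After the patch, the quoted loop keeps only FIELD_DECLs, which corresponds to the single kind test in this sketch.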

Re: Add support for masked load/store_lanes

2018-01-15 Thread Christophe Lyon
On 13 January 2018 at 16:50, Jeff Law  wrote:
> On 01/12/2018 09:28 AM, Richard Sandiford wrote:
>>
>> Here's the patch with the updated docs.  Does this version look OK?
>>
>> Thanks,
>> Richard
>>
>>
>> 2018-01-12  Richard Sandiford  
>>   Alan Hayward  
>>   David Sherwood  
>>
>> gcc/
>>   * doc/md.texi (vec_mask_load_lanes@var{m}@var{n}): Document.
>>   (vec_mask_store_lanes@var{m}@var{n}): Likewise.
>>   * optabs.def (vec_mask_load_lanes_optab): New optab.
>>   (vec_mask_store_lanes_optab): Likewise.
>>   * internal-fn.def (MASK_LOAD_LANES): New internal function.
>>   (MASK_STORE_LANES): Likewise.
>>   * internal-fn.c (mask_load_lanes_direct): New macro.
>>   (mask_store_lanes_direct): Likewise.
>>   (expand_mask_load_optab_fn): Handle masked operations.
>>   (expand_mask_load_lanes_optab_fn): New macro.
>>   (expand_mask_store_optab_fn): Handle masked operations.
>>   (expand_mask_store_lanes_optab_fn): New macro.
>>   (direct_mask_load_lanes_optab_supported_p): Likewise.
>>   (direct_mask_store_lanes_optab_supported_p): Likewise.
>>   * tree-vectorizer.h (vect_store_lanes_supported): Take a masked_p
>>   parameter.
>>   (vect_load_lanes_supported): Likewise.
>>   * tree-vect-data-refs.c (strip_conversion): New function.
>>   (can_group_stmts_p): Likewise.
>>   (vect_analyze_data_ref_accesses): Use it instead of checking
>>   for a pair of assignments.
>>   (vect_store_lanes_supported): Take a masked_p parameter.
>>   (vect_load_lanes_supported): Likewise.
>>   * tree-vect-loop.c (vect_analyze_loop_2): Update calls to
>>   vect_store_lanes_supported and vect_load_lanes_supported.
>>   * tree-vect-slp.c (vect_analyze_slp_instance): Likewise.
>>   * tree-vect-stmts.c (get_group_load_store_type): Take a masked_p
>>   parameter.  Don't allow gaps for masked accesses.
>>   Use vect_get_store_rhs.  Update calls to vect_store_lanes_supported
>>   and vect_load_lanes_supported.
>>   (get_load_store_type): Take a masked_p parameter and update
>>   call to get_group_load_store_type.
>>   (vectorizable_store): Update call to get_load_store_type.
>>   Handle IFN_MASK_STORE_LANES.
>>   (vectorizable_load): Update call to get_load_store_type.
>>   Handle IFN_MASK_LOAD_LANES.
>>
>> gcc/testsuite/
>>   * gcc.dg/vect/vect-ooo-group-1.c: New test.
>>   * gcc.target/aarch64/sve/mask_struct_load_1.c: Likewise.
>>   * gcc.target/aarch64/sve/mask_struct_load_1_run.c: Likewise.
>>   * gcc.target/aarch64/sve/mask_struct_load_2.c: Likewise.
>>   * gcc.target/aarch64/sve/mask_struct_load_2_run.c: Likewise.
>>   * gcc.target/aarch64/sve/mask_struct_load_3.c: Likewise.
>>   * gcc.target/aarch64/sve/mask_struct_load_3_run.c: Likewise.
>>   * gcc.target/aarch64/sve/mask_struct_load_4.c: Likewise.
>>   * gcc.target/aarch64/sve/mask_struct_load_5.c: Likewise.
>>   * gcc.target/aarch64/sve/mask_struct_load_6.c: Likewise.
>>   * gcc.target/aarch64/sve/mask_struct_load_7.c: Likewise.
>>   * gcc.target/aarch64/sve/mask_struct_load_8.c: Likewise.
>>   * gcc.target/aarch64/sve/mask_struct_store_1.c: Likewise.
>>   * gcc.target/aarch64/sve/mask_struct_store_1_run.c: Likewise.
>>   * gcc.target/aarch64/sve/mask_struct_store_2.c: Likewise.
>>   * gcc.target/aarch64/sve/mask_struct_store_2_run.c: Likewise.
>>   * gcc.target/aarch64/sve/mask_struct_store_3.c: Likewise.
>>   * gcc.target/aarch64/sve/mask_struct_store_3_run.c: Likewise.
>>   * gcc.target/aarch64/sve/mask_struct_store_4.c: Likewise.
> OK.  I guess in retrospect I should have made the assumption that the
> docs were slightly off and reviewed the rest in that light.
>
> Sorry for making this wait.
>
>
Hi Richard,

I've noticed that this commit (r256620) causes new failures, see:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83845


> Jeff
>
>


Re: [PATCH] C/C++: Add -Waddress-of-packed-member

2018-01-15 Thread Jakub Jelinek
On Sun, Jan 14, 2018 at 06:29:54AM -0800, H.J. Lu wrote:
> +   if (TREE_CODE (field) == FIELD_DECL && DECL_PACKED (field))
> + {
> +   tree field_type = TREE_TYPE (field);
> +   unsigned int type_align = TYPE_ALIGN (field_type);
> +   tree context = DECL_CONTEXT (field);
> +   unsigned int record_align = TYPE_ALIGN (context);
> +   if ((record_align % type_align) != 0)
> + return context;
> +   type_align /= BITS_PER_UNIT;
> +   unsigned HOST_WIDE_INT field_off
> +  = (tree_to_uhwi (DECL_FIELD_OFFSET (field))
> + + (tree_to_uhwi (DECL_FIELD_BIT_OFFSET (field))
> +/ BITS_PER_UNIT));

This has the same bug I've just created PR83844 for, you can't assume
DECL_FIELD_OFFSET is INTEGER_CST that fits into UHWI, and also we have
byte_position wrapper that should be used to compute the offset from
DECL_FIELD_*OFFSET.

Jakub
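For context on what the warning is meant to catch, the sketch below shows a packed record whose int member sits at an odd offset, together with an alignment test in the spirit of the quoted hunk. It relies on the GCC/Clang packed attribute and C11 _Alignof. The quoted fragment ends before showing how field_off is used, so the offset test here is an assumption, and the helper name is illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* With the packed attribute there is no padding after tag, so value
   starts at byte offset 1.  */
struct __attribute__((packed)) packed_rec
{
  char tag;
  int value;
};

/* Returns 1 if a member at byte offset OFFSET inside a record aligned to
   REC_ALIGN bytes is guaranteed to be aligned to TYPE_ALIGN bytes.  */
int member_is_aligned(size_t rec_align, size_t offset, size_t type_align)
{
  assert(type_align != 0);
  return rec_align % type_align == 0 && offset % type_align == 0;
}
```

Taking the address of value in such a record yields an int pointer that may not be suitably aligned, which is the situation -Waddress-of-packed-member is intended to diagnose.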


Re: [PATCH v2, rs6000] Add -msafe-indirect-jumps option and implement safe bctr / bctrl

2018-01-15 Thread Richard Biener
On Sun, Jan 14, 2018 at 5:53 AM, Bill Schmidt
 wrote:
> Hi,
>
> [This patch supersedes and extends
> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01135.html.
> There was a small error in the assembly code produced by that patch (bad
> memory on my account of how to spell "crset eq").  I've also increased the
> function provided; see below.]
>
> This patch adds a new option for the compiler to produce only "safe" indirect
> jumps, in the sense that these jumps are deliberately mispredicted to inhibit
> speculative execution.  For now, this option is undocumented; this may change
> at some future date.  It is intended eventually for the linker to also honor
> this flag when creating PLT stubs, for example.
>
> In addition to the new option, I've included changes to indirect calls for
> the ELFv2 ABI when the option is specified.  In place of bctrl, we generate
> a "crset eq" followed by a beqctrl-.  Using the CR0.eq bit is safe since CR0
> is volatile over the call.
>
> I've also added code to replace uses of bctr when the new option is specified,
> with the sequence
>
> crset 4x[CRb]+2
> beqctr- CRb
> b .
>
> where CRb is an available condition register field.  This applies to all
> subtargets, and in particular is not restricted to ELFv2.  The use cases
> covered here are computed gotos and switch statements.
>
> NOT yet covered by this patch: indirect calls for ELFv1.  That will come 
> later.
>
> Please let me know if there is a better way to represent the crset without
> an unspec.  For the indirect jump, I don't see a way around it due to the
> expected form of indirect jumps in cfganal.c.
>
> Bootstrapped and tested on powerpc64-linux-gnu and powerpc64le-linux-gnu with
> no regressions.  Is this okay for trunk?

As this sounds Spectre related feel free to backport this to the GCC 7 branch
as well (even if you don't hit the Wednesday deadline for RC1 of GCC 7.3).

Thanks,
Richard.

> Thanks,
> Bill
>
>
> [gcc]
>
> 2018-01-13  Bill Schmidt  
>
> * config/rs6000/rs6000.c (rs6000_opt_vars): Add entry for
> -msafe-indirect-jumps.
> * config/rs6000/rs6000.md (UNSPEC_CRSET_EQ): New UNSPEC enum.
> (UNSPEC_COMP_GOTO_CR): Likewise.
> (*call_indirect_elfv2): Disable for -msafe-indirect-jumps.
> (*call_indirect_elfv2_safe): New define_insn.
> (*call_value_indirect_elfv2): Disable for
> -msafe-indirect-jumps.
> (*call_value_indirect_elfv2_safe): New define_insn.
> (indirect_jump): Emit different RTL for -msafe-indirect-jumps.
> (*indirect_jump): Disable for -msafe-indirect-jumps.
> (*indirect_jump_safe): New define_insn.
> (*set_cr_eq): New define_insn.
> (tablejump): Emit different RTL for -msafe-indirect-jumps.
> (tablejumpsi): Disable for -msafe-indirect-jumps.
> (tablejumpsi_safe): New define_expand.
> (tablejumpdi): Disable for -msafe-indirect-jumps.
> (tablejumpdi_safe): New define_expand.
> (*tablejump_internal1): Disable for -msafe-indirect-jumps.
> (*tablejump_internal1_safe): New define_insn.
> * config/rs6000/rs6000.opt (msafe-indirect-jumps): New option.
>
> [gcc/testsuite]
>
> 2018-01-13  Bill Schmidt  
>
> * gcc.target/powerpc/safe-indirect-jump-1.c: New file.
> * gcc.target/powerpc/safe-indirect-jump-2.c: New file.
> * gcc.target/powerpc/safe-indirect-jump-3.c: New file.
>
>
> Index: gcc/config/rs6000/rs6000.c
> ===================================================================
> --- gcc/config/rs6000/rs6000.c  (revision 256364)
> +++ gcc/config/rs6000/rs6000.c  (working copy)
> @@ -36726,6 +36726,9 @@ static struct rs6000_opt_var const rs6000_opt_vars
>{ "sched-epilog",
>  offsetof (struct gcc_options, x_TARGET_SCHED_PROLOG),
>  offsetof (struct cl_target_option, x_TARGET_SCHED_PROLOG), },
> +  { "safe-indirect-jumps",
> +offsetof (struct gcc_options, x_rs6000_safe_indirect_jumps),
> +offsetof (struct cl_target_option, x_rs6000_safe_indirect_jumps), },
>  };
>
>  /* Inner function to handle attribute((target("..."))) and #pragma GCC target
> Index: gcc/config/rs6000/rs6000.md
> ===================================================================
> --- gcc/config/rs6000/rs6000.md (revision 256364)
> +++ gcc/config/rs6000/rs6000.md (working copy)
> @@ -150,6 +150,8 @@
> UNSPEC_SIGNBIT
> UNSPEC_SF_FROM_SI
> UNSPEC_SI_FROM_SF
> +   UNSPEC_CRSET_EQ
> +   UNSPEC_COMP_GOTO_CR
>])
>
>  ;;
> @@ -11222,11 +11224,22 @@
>  (match_operand 1 "" "g,g"))
> (set (reg:P TOC_REGNUM) (unspec:P [(match_operand:P 2 "const_int_operand" "n,n")] UNSPEC_TOCSLOT))
> (clobber (reg:P LR_REGNO))]
> -  "DEFAULT_ABI == ABI_ELFv2"
> +  "DEFAULT_ABI == ABI_ELFv2 && !rs6000_safe_indirect_jumps"
>"b%T0l\; 2,%2(1)"
>[(set_attr "type" "jmpreg")
> (set_attr "length" "8")])
>
> +;; Variant with deliberate misprediction.
> +(define_insn "*call_indirect_elfv2_

Re: [PATCH, PR82428] Add __builtin_goacc_{gang,worker,vector}_{id,size}

2018-01-15 Thread Tom de Vries

On 01/07/2018 02:17 PM, Tom de Vries wrote:
> On 01/06/2018 12:36 PM, Jakub Jelinek wrote:
>> On Sat, Jan 06, 2018 at 09:21:59AM +0100, Tom de Vries wrote:
>>> this patch adds the following builtins in C/C++:
>>> - __builtin_goacc_gang_id
>>> - __builtin_goacc_worker_id
>>> - __builtin_goacc_vector_id
>>> - __builtin_goacc_gang_size
>>> - __builtin_goacc_worker_size
>>> - __builtin_goacc_vector_size
>>
>> I wonder if it wouldn't be better to have just 2 builtins instead of 6,
>> with one argument (required to be constant) - the kind of parallelism
>> you're interested in, to avoid the inflation of the builtins.
>
> Like so:
> - __builtin_goacc_id
> - __builtin_goacc_size
> ?


Hi,

I understand this is more of a stage1 patch and has no priority atm, but 
if the concept as such is acceptable can we at least settle on the name 
and interface?


Thanks,
- Tom


Re: Add support for fully-predicated loops

2018-01-15 Thread Christophe Lyon
Hi Richard,


On 7 January 2018 at 18:08, James Greenhalgh  wrote:
> On Mon, Dec 18, 2017 at 07:40:00PM +, Jeff Law wrote:
>> On 11/17/2017 07:56 AM, Richard Sandiford wrote:
>> > This patch adds support for using a single fully-predicated loop instead
>> > of a vector loop and a scalar tail.  An SVE WHILELO instruction generates
>> > the predicate for each iteration of the loop, given the current scalar
>> > iv value and the loop bound.  This operation is wrapped up in a new
>> > internal function called WHILE_ULT.  E.g.:
>> >
>> >WHILE_ULT (0, 3, { 0, 0, 0, 0}) -> { 1, 1, 1, 0 }
>> >WHILE_ULT (UINT_MAX - 1, UINT_MAX, { 0, 0, 0, 0 }) -> { 1, 0, 0, 0 }
>> >
>> > The third WHILE_ULT argument is needed to make the operation
>> > unambiguous: without it, WHILE_ULT (0, 3) for one vector type would
>> > seem equivalent to WHILE_ULT (0, 3) for another, even if the types have
>> > different numbers of elements.
>> >
>> > Note that the patch uses "mask" and "fully-masked" instead of
>> > "predicate" and "fully-predicated", to follow existing GCC terminology.
>> >
>> > This patch just handles the simple cases, punting for things like
>> > reductions and live-out values.  Later patches remove most of these
>> > restrictions.
>> >
>> > Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
>> > and powerpc64le-linux-gnu.  OK to install?
>> >
>> > Richard
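The WHILE_ULT examples quoted above can be reproduced with a scalar model, which also shows how one fully-masked loop replaces a vector loop plus scalar tail. This is a sketch only: VF, while_ult and masked_add are illustrative names, the mask is modelled as a byte array, and the loop assumes n is small enough that i += VF cannot wrap.

```c
#include <assert.h>
#include <stddef.h>

#define VF 4 /* lanes per vector; an illustrative value */

/* Lane i of the mask is 1 iff start + i < end, evaluated without
   wrap-around, so once a lane is inactive all later lanes are too.  */
void while_ult(unsigned start, unsigned end, unsigned char mask[VF])
{
  unsigned remaining = end > start ? end - start : 0;
  for (unsigned lane = 0; lane < VF; lane++)
    mask[lane] = lane < remaining;
}

/* out[i] = a[i] + b[i] for i < n, processed VF lanes at a time.  The final
   iteration simply runs with some lanes masked off.  */
void masked_add(const int *a, const int *b, int *out, unsigned n)
{
  assert(a != NULL && b != NULL && out != NULL);
  for (unsigned i = 0; i < n; i += VF)
    {
      unsigned char mask[VF];
      while_ult(i, n, mask);
      for (unsigned lane = 0; lane < VF; lane++)
        if (mask[lane])
          out[i + lane] = a[i + lane] + b[i + lane];
    }
}
```

With VF equal to 4, while_ult(0, 3, mask) yields {1, 1, 1, 0} and while_ult(UINT_MAX - 1, UINT_MAX, mask) yields {1, 0, 0, 0}, matching the two examples in the message.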
>> >
>> >
>> > 2017-11-17  Richard Sandiford  
>> > Alan Hayward  
>> > David Sherwood  
>> >
>> > gcc/
>> > * optabs.def (while_ult_optab): New optab.
>> > * doc/md.texi (while_ult@var{m}@var{n}): Document.
>> > * internal-fn.def (WHILE_ULT): New internal function.
>> > * internal-fn.h (direct_internal_fn_supported_p): New override
>> > that takes two types as argument.
>> > * internal-fn.c (while_direct): New macro.
>> > (expand_while_optab_fn): New function.
>> > (convert_optab_supported_p): Likewise.
>> > (direct_while_optab_supported_p): New macro.
>> > * wide-int.h (wi::udiv_ceil): New function.
>> > * tree-vectorizer.h (rgroup_masks): New structure.
>> > (vec_loop_masks): New typedef.
>> > (_loop_vec_info): Add masks, mask_compare_type, can_fully_mask_p
>> > and fully_masked_p.
>> > (LOOP_VINFO_CAN_FULLY_MASK_P, LOOP_VINFO_FULLY_MASKED_P)
>> > (LOOP_VINFO_MASKS, LOOP_VINFO_MASK_COMPARE_TYPE): New macros.
>> > (vect_max_vf): New function.
>> > (slpeel_make_loop_iterate_ntimes): Delete.
>> > (vect_set_loop_condition, vect_get_loop_mask_type, vect_gen_while)
>> > (vect_halve_mask_nunits, vect_double_mask_nunits): Declare.
>> > (vect_record_loop_mask, vect_get_loop_mask): Likewise.
>> > * tree-vect-loop-manip.c: Include tree-ssa-loop-niter.h,
>> > internal-fn.h, stor-layout.h and optabs-query.h.
>> > (vect_set_loop_mask): New function.
>> > (add_preheader_seq): Likewise.
>> > (add_header_seq): Likewise.
>> > (vect_maybe_permute_loop_masks): Likewise.
>> > (vect_set_loop_masks_directly): Likewise.
>> > (vect_set_loop_condition_masked): Likewise.
>> > (vect_set_loop_condition_unmasked): New function, split out from
>> > slpeel_make_loop_iterate_ntimes.
>> > (slpeel_make_loop_iterate_ntimes): Rename to..
>> > (vect_set_loop_condition): ...this.  Use vect_set_loop_condition_masked
>> > for fully-masked loops and vect_set_loop_condition_unmasked otherwise.
>> > (vect_do_peeling): Update call accordingly.
>> > (vect_gen_vector_loop_niters): Use VF as the step for fully-masked
>> > loops.
>> > * tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize
>> > mask_compare_type, can_fully_mask_p and fully_masked_p.
>> > (release_vec_loop_masks): New function.
>> > (_loop_vec_info): Use it to free the loop masks.
>> > (can_produce_all_loop_masks_p): New function.
>> > (vect_get_max_nscalars_per_iter): Likewise.
>> > (vect_verify_full_masking): Likewise.
>> > (vect_analyze_loop_2): Save LOOP_VINFO_CAN_FULLY_MASK_P around
>> > retries, and free the mask rgroups before retrying.  Check loop-wide
>> > reasons for disallowing fully-masked loops.  Make the final decision
>> > about whether to use a fully-masked loop or not.
>> > (vect_estimate_min_profitable_iters): Do not assume that peeling
>> > for the number of iterations will be needed for fully-masked loops.
>> > (vectorizable_reduction): Disable fully-masked loops.
>> > (vectorizable_live_operation): Likewise.
>> > (vect_halve_mask_nunits): New function.
>> > (vect_double_mask_nunits): Likewise.
>> > (vect_record_loop_mask): Likewise.
>> > (vect_get_loop_mask): Likewise.
>> > (vect_transform_loop): Handle the case in which the final loop
>> > iteration might handle a partial vector.  Call vect_set_loop_condition
>> > instead of slpeel_make_loop_iterate_ntimes.
>> > * tree-vect-stmts.c: Include tree-ssa-loop-niter.h and gimple-fold.h.
>> >  
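
The WHILE_ULT behaviour described in the quoted patch can be modelled in
scalar C.  This is only a sketch: the name while_ult_model and the nlanes
parameter are assumptions made for illustration (nlanes stands in for the
vector type that the third argument supplies), and nothing here is taken
from the patch itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of WHILE_ULT (START, END, mask): lane I of MASK is set
   when START + I is below END, computed without wrap-around.
   Hypothetical helper, not a GCC function.  */
static void
while_ult_model (uint32_t start, uint32_t end, bool *mask, size_t nlanes)
{
  assert (mask != NULL);
  for (size_t i = 0; i < nlanes; ++i)
    /* Widen to 64 bits so that START + I cannot wrap.  */
    mask[i] = (uint64_t) start + i < (uint64_t) end;
}
```

With four lanes, the two calls quoted in the patch description give
{ 1, 1, 1, 0 } and { 1, 0, 0, 0 } respectively; the second case is why the
comparison must not wrap.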

Re: Add support for reductions in fully-masked loops

2018-01-15 Thread Christophe Lyon
Hi Richard,


On 7 January 2018 at 21:35, James Greenhalgh  wrote:
> On Wed, Dec 13, 2017 at 04:34:34PM +, Jeff Law wrote:
>> On 11/17/2017 07:59 AM, Richard Sandiford wrote:
>> > This patch removes the restriction that fully-masked loops cannot
>> > have reductions.  The key thing here is to make sure that the
>> > reduction accumulator doesn't include any values associated with
>> > inactive lanes; the patch adds a bunch of conditional binary
>> > operations for doing that.
>> >
>> > Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
>> > and powerpc64le-linux-gnu.
>> >
>> > Richard
>> >
>> >
>> > 2017-11-17  Richard Sandiford  
>> > Alan Hayward  
>> > David Sherwood  
>> >
>> > gcc/
>> > * doc/md.texi (cond_add@var{mode}, cond_sub@var{mode})
>> > (cond_and@var{mode}, cond_ior@var{mode}, cond_xor@var{mode})
>> > (cond_smin@var{mode}, cond_smax@var{mode}, cond_umin@var{mode})
>> > (cond_umax@var{mode}): Document.
>> > * optabs.def (cond_add_optab, cond_sub_optab, cond_and_optab)
>> > (cond_ior_optab, cond_xor_optab, cond_smin_optab, cond_smax_optab)
>> > (cond_umin_optab, cond_umax_optab): New optabs.
>> > * internal-fn.def (COND_ADD, COND_SUB, COND_SMIN, COND_SMAX)
>> > (COND_UMIN, COND_UMAX, COND_AND, COND_IOR, COND_XOR): New internal
>> > functions.
>> > * internal-fn.h (get_conditional_internal_fn): Declare.
>> > * internal-fn.c (cond_binary_direct): New macro.
>> > (expand_cond_binary_optab_fn): Likewise.
>> > (direct_cond_binary_optab_supported_p): Likewise.
>> > (get_conditional_internal_fn): New function.
>> > * tree-vect-loop.c (vectorizable_reduction): Handle fully-masked loops.
>> > Cope with reduction statements that are vectorized as calls rather
>> > than assignments.
>> > * config/aarch64/aarch64-sve.md (cond_): New insns.
>> > * config/aarch64/iterators.md (UNSPEC_COND_ADD, UNSPEC_COND_SUB)
>> > (UNSPEC_COND_SMAX, UNSPEC_COND_UMAX, UNSPEC_COND_SMIN)
>> > (UNSPEC_COND_UMIN, UNSPEC_COND_AND, UNSPEC_COND_ORR)
>> > (UNSPEC_COND_EOR): New unspecs.
>> > (optab): Add mappings for them.
>> > (SVE_COND_INT_OP, SVE_COND_FP_OP): New int iterators.
>> > (sve_int_op, sve_fp_op): New int attributes.
>> >
>> > gcc/testsuite/
>> > * gcc.dg/vect/pr60482.c: Remove XFAIL for variable-length vectors.
>> > * gcc.target/aarch64/sve_reduc_1.c: Expect the loop operations
>> > to be predicated.
>> > * gcc.target/aarch64/sve_slp_5.c: Check for a fully-masked loop.
>> > * gcc.target/aarch64/sve_slp_7.c: Likewise.
>> > * gcc.target/aarch64/sve_reduc_5.c: New test.
>> > * gcc.target/aarch64/sve_slp_13.c: Likewise.
>> > * gcc.target/aarch64/sve_slp_13_run.c: Likewise.
>> I didn't walk through the aarch64 specific bits here.  The generic bits
>> are OK.
>
> As are the AArch64 bits.
>

As of r256626, I've noticed that a new test says XPASS on aarch64-none-elf with
-mabi=ilp32:
XPASS: gcc.target/aarch64/sve/reduc_5.c -march=armv8.2-a+sve
scan-assembler-times \\tsub\\t 8

Not sure if I should file a PR for this?

Christophe


> OK.
>
> James
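
The requirement that the reduction accumulator must not pick up values
from inactive lanes can be illustrated with a scalar C model.  This is a
sketch under stated assumptions: LANES, cond_add_model and masked_sum are
hypothetical names, and the model assumes that an inactive lane simply
keeps its previous accumulator value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LANES 4  /* Assumed vector length, for illustration only.  */

/* Model of a conditional add: active lanes accumulate, inactive lanes
   keep the old accumulator value.  */
static void
cond_add_model (const bool *mask, int *acc, const int *val)
{
  for (size_t i = 0; i < LANES; ++i)
    if (mask[i])
      acc[i] += val[i];
}

/* Sum N ints with a fully-masked loop and no scalar tail.  */
static int
masked_sum (const int *a, size_t n)
{
  assert (a != NULL || n == 0);
  int acc[LANES] = { 0 };
  for (size_t base = 0; base < n; base += LANES)
    {
      bool mask[LANES];
      int val[LANES];
      for (size_t i = 0; i < LANES; ++i)
        {
          mask[i] = base + i < n;
          /* Inactive lanes get a junk value to show that it never
             reaches the sum.  */
          val[i] = mask[i] ? a[base + i] : -12345;
        }
      cond_add_model (mask, acc, val);
    }
  int sum = 0;
  for (size_t i = 0; i < LANES; ++i)
    sum += acc[i];
  return sum;
}
```

An unconditional add in place of cond_add_model would fold the junk
values of the final, partial iteration into the result.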


Re: [PATCH, rs6000] Support for gimple folding of mergeh, mergel intrinsics

2018-01-15 Thread Richard Biener
On Fri, Jan 12, 2018 at 10:22 PM, Will Schmidt
 wrote:
> Hi,
>   Add support for gimple folding of the mergeh, mergel intrinsics.
> Since the merge low and merge high variants are almost identical, a
> new helper function has been added so that code can be shared.
>
> This also adds define_insn for xxmrghw, xxmrglw instructions, allowing us
> to generate xxmrglw instead of vmrglw after folding.  A few whitespace
> fixes have been made to the existing vmrg?w defines.
>
> The changes introduced here affect the existing target testcases
> gcc.target/powerpc/builtins-1-be.c and builtins-1-le.c, such that
> a number of the scan-assembler tests would fail due to instruction counts
> changing.  Since the purpose of that test is to primarily ensure those
> intrinsics are accepted by the compiler, I have disabled gimple-folding for
> the existing tests that count instructions, and created new variants of those
> tests with folding enabled and a higher optimization level, that do not count
> instructions.
>
> Regtests are currently running across assorted power systems.
> OK for trunk, pending successful results?
>
> Thanks,
> -Will
>
> [gcc]
>
> 2018-01-12  Will Schmidt  
>
> * config/rs6000/rs6000.c (rs6000_gimple_builtin): Add gimple folding
> support for merge[hl].  (fold_mergehl_helper): New helper function.
> * config/rs6000/altivec.md (altivec_xxmrghw_direct): New.
> (altivec_xxmrglw_direct): New.
>
> [testsuite]
>
> 2018-01-12  Will Schmidt  
>
> * gcc.target/powerpc/fold-vec-mergehl-char.c: New.
> * gcc.target/powerpc/fold-vec-mergehl-double.c: New.
> * gcc.target/powerpc/fold-vec-mergehl-float.c: New.
> * gcc.target/powerpc/fold-vec-mergehl-int.c: New.
> * gcc.target/powerpc/fold-vec-mergehl-longlong.c: New.
> * gcc.target/powerpc/fold-vec-mergehl-pixel.c: New.
> * gcc.target/powerpc/fold-vec-mergehl-short.c: New.
> * gcc.target/powerpc/builtins-1-be.c: Disable gimple-folding.
> * gcc.target/powerpc/builtins-1-le.c: Disable gimple-folding.
> * gcc.target/powerpc/builtins-1-be-folded.c: New.
> * gcc.target/powerpc/builtins-1-le-folded.c: New.
> * gcc.target/powerpc/builtins-1.fold.h: New.
>
> diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
> index 733d920..65d4548 100644
> --- a/gcc/config/rs6000/altivec.md
> +++ b/gcc/config/rs6000/altivec.md
> @@ -1101,10 +1101,20 @@
>else
>  return "vmrglw %0,%2,%1";
>  }
>[(set_attr "type" "vecperm")])
>
> +
> +(define_insn "altivec_xxmrghw_direct"
> +  [(set (match_operand:V4SI 0 "register_operand" "=v")
> +   (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
> + (match_operand:V4SI 2 "register_operand" "v")]
> +UNSPEC_VMRGH_DIRECT))]
> +  "TARGET_P8_VECTOR"
> +  "xxmrghw %x0,%x1,%x2"
> +  [(set_attr "type" "vecperm")])
> +
>  (define_insn "altivec_vmrghw_direct"
>[(set (match_operand:V4SI 0 "register_operand" "=v")
>  (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
>(match_operand:V4SI 2 "register_operand" "v")]
>   UNSPEC_VMRGH_DIRECT))]
> @@ -1185,12 +1195,12 @@
>[(set_attr "type" "vecperm")])
>
>  (define_insn "altivec_vmrglb_direct"
>[(set (match_operand:V16QI 0 "register_operand" "=v")
>  (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v")
> -  (match_operand:V16QI 2 "register_operand" "v")]
> -  UNSPEC_VMRGL_DIRECT))]
> +  (match_operand:V16QI 2 "register_operand" "v")]
> + UNSPEC_VMRGL_DIRECT))]
>"TARGET_ALTIVEC"
>"vmrglb %0,%1,%2"
>[(set_attr "type" "vecperm")])
>
>  (define_expand "altivec_vmrglh"
> @@ -1242,11 +1252,11 @@
>
>  (define_insn "altivec_vmrglh_direct"
>[(set (match_operand:V8HI 0 "register_operand" "=v")
>  (unspec:V8HI [(match_operand:V8HI 1 "register_operand" "v")
>   (match_operand:V8HI 2 "register_operand" "v")]
> - UNSPEC_VMRGL_DIRECT))]
> +UNSPEC_VMRGL_DIRECT))]
>"TARGET_ALTIVEC"
>"vmrglh %0,%1,%2"
>[(set_attr "type" "vecperm")])
>
>  (define_expand "altivec_vmrglw"
> @@ -1290,10 +1300,19 @@
>else
>  return "vmrghw %0,%2,%1";
>  }
>[(set_attr "type" "vecperm")])
>
> +(define_insn "altivec_xxmrglw_direct"
> +  [(set (match_operand:V4SI 0 "register_operand" "=v")
> +   (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
> + (match_operand:V4SI 2 "register_operand" "v")]
> +UNSPEC_VMRGL_DIRECT))]
> +  "TARGET_P8_VECTOR"
> +  "xxmrglw %x0,%x1,%x2"
> +  [(set_attr "type" "vecperm")])
> +
>  (define_insn "altivec_vmrglw_direct"
>[(set (match_operand:V4SI 0 "register_operand" "=v")
>  (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
>   (match_operand:V4SI 2 "reg
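
The patch description says that merge high and merge low are almost
identical and can share a helper.  A scalar C model shows why: the two
differ only in where they start reading.  This is a sketch, not the
patch's GIMPLE helper; merge_model, merge_high_model and merge_low_model
are hypothetical names, and elements are assumed to be numbered in array
order.

```c
#include <assert.h>
#include <stddef.h>

/* Interleave N/2 elements of A and B, starting at element OFFSET of
   each input.  OFFSET is 0 for merge high and N/2 for merge low.  */
static void
merge_model (const int *a, const int *b, int *out, size_t n, size_t offset)
{
  assert (n % 2 == 0 && (offset == 0 || offset == n / 2));
  for (size_t i = 0; i < n / 2; ++i)
    {
      out[2 * i] = a[offset + i];
      out[2 * i + 1] = b[offset + i];
    }
}

/* Merge high: interleave the first halves of A and B.  */
static void
merge_high_model (const int *a, const int *b, int *out, size_t n)
{
  merge_model (a, b, out, n, 0);
}

/* Merge low: interleave the second halves of A and B.  */
static void
merge_low_model (const int *a, const int *b, int *out, size_t n)
{
  merge_model (a, b, out, n, n / 2);
}
```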

Re: [PATCH, PR82428] Add __builtin_goacc_{gang,worker,vector}_{id,size}

2018-01-15 Thread Jakub Jelinek
On Mon, Jan 15, 2018 at 10:49:52AM +0100, Tom de Vries wrote:
> On 01/07/2018 02:17 PM, Tom de Vries wrote:
> > On 01/06/2018 12:36 PM, Jakub Jelinek wrote:
> > > On Sat, Jan 06, 2018 at 09:21:59AM +0100, Tom de Vries wrote:
> > > > this patch adds the following builtins in C/C++:
> > > > - __builtin_goacc_gang_id
> > > > - __builtin_goacc_worker_id
> > > > - __builtin_goacc_vector_id
> > > > - __builtin_goacc_gang_size
> > > > - __builtin_goacc_worker_size
> > > > - __builtin_goacc_vector_size
> > > 
> > > I wonder if it wouldn't be better to have just 2 builtins instead of 6,
> > > with one argument (required to be constant) - the kind of parallelism
> > > you're interested in, to avoid the inflation of the builtins.
> > > 
> > 
> > Like so:
> > - __builtin_goacc_id
> > - __builtin_goacc_size
> > ?
> 
> Hi,
> 
> I understand this is more of a stage1 patch and has no priority atm, but if
> the concept as such is acceptable can we at least settle on the name and
> interface?

Does OpenACC have some term for the 3 dimensions/kinds of parallelism?
Is there some enum describing those 3 already?

Jakub


[patch,avr,committed] Add tests for PR83801.

2018-01-15 Thread Georg-Johann Lay

Added the following avr specific test cases for the already fixed PR83801.

Johann


PR c/83801
PR c/83729
* gcc.target/avr/torture/pr83729.c: New test.
* gcc.target/avr/torture/pr83801.c: New test.

Index: gcc.target/avr/torture/pr83729.c
===
--- gcc.target/avr/torture/pr83729.c(nonexistent)
+++ gcc.target/avr/torture/pr83729.c(working copy)
@@ -0,0 +1,17 @@
+/* { dg-options { "-std=gnu99" } } */
+/* { dg-do run { target { ! avr_tiny } } } */
+
+__attribute((noinline,noclone))
+char to_ascii (unsigned i)
+{
+static const char __memx code_tab[] = "0123456789";
+return code_tab[i];
+}
+
+int main()
+{
+  if (to_ascii (2) != '2')
+__builtin_abort();
+
+  return 0;
+}
Index: gcc.target/avr/torture/pr83801.c
===
--- gcc.target/avr/torture/pr83801.c(nonexistent)
+++ gcc.target/avr/torture/pr83801.c(working copy)
@@ -0,0 +1,17 @@
+/* { dg-options { "-std=gnu99" } } */
+/* { dg-do run { target { ! avr_tiny } } } */
+
+__attribute((noinline,noclone))
+char to_ascii (unsigned i)
+{
+static const char __flash code_tab[] = "0123456789";
+return code_tab[i];
+}
+
+int main()
+{
+  if (to_ascii (2) != '2')
+__builtin_abort();
+
+  return 0;
+}


Re: Improve canonicalisation of TARGET_MEM_REFs

2018-01-15 Thread Richard Biener
On Tue, Jan 9, 2018 at 3:39 PM, Richard Sandiford
 wrote:
> Richard Biener  writes:
>> On Tue, Nov 7, 2017 at 7:04 PM, Richard Sandiford
>>  wrote:
>>> Richard Biener  writes:
 On Fri, Nov 3, 2017 at 5:32 PM, Richard Sandiford
  wrote:
> A general TARGET_MEM_REF is:
>
> BASE + STEP * INDEX + INDEX2 + OFFSET
>
> After classifying the address in this way, the code that builds
> TARGET_MEM_REFs tries to simplify the address until it's valid
> for the current target and for the mode of memory being addressed.
> It does this in a fixed order:
>
> (1) add SYMBOL to BASE
> (2) add INDEX * STEP to the base, if STEP != 1
> (3) add OFFSET to INDEX or BASE (reverted if unsuccessful)
> (4) add INDEX to BASE
> (5) add OFFSET to BASE
>
> So suppose we had an address:
>
> &symbol + offset + index * 8   (e.g. "a[i + 1]" for a global "a")
>
> on a target that only allows an index or an offset, not both.  Following
> the steps above, we'd first create:
>
> tmp = symbol
> tmp2 = tmp + index * 8
>
> Then if the given offset value was valid for the mode being addressed,
> we'd create:
>
> MEM[base:tmp2, offset:offset]
>
> while if it was invalid we'd create:
>
> tmp3 = tmp2 + offset
> MEM[base:tmp3, offset:0]
>
> The problem is that this could happen if ivopts had decided to use
> a scaled index for an address that happens to have a constant base.
> The old procedure failed to give an indexed TARGET_MEM_REF in that case,
> and adding the offset last prevented later passes from being able to
> fold the index back in.
>
> The patch avoids this by skipping (2) if BASE + INDEX * STEP
> is a legitimate address and if OFFSET is stopping the address
> being valid.
>
> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64-linux-gnu.
> OK to install?
>
> Richard
>
>
> 2017-10-31  Richard Sandiford  
> Alan Hayward  
> David Sherwood  
>
> gcc/
> * tree-ssa-address.c (keep_index_p): New function.
> (create_mem_ref): Use it.  Only split out the INDEX * STEP
> component if that is invalid even with the symbol and offset
> removed.
>
> Index: gcc/tree-ssa-address.c
> ===
> --- gcc/tree-ssa-address.c  2017-11-03 12:15:44.097060121 +
> +++ gcc/tree-ssa-address.c  2017-11-03 12:21:18.060359821 +
> @@ -746,6 +746,20 @@ gimplify_mem_ref_parts (gimple_stmt_iter
>  true, GSI_SAME_STMT);
>  }
>
> +/* Return true if the STEP in PARTS gives a valid BASE + INDEX * STEP
> +   address for type TYPE and if the offset is making it appear invalid.  */
> +
> +static bool
> +keep_index_p (tree type, mem_address parts)

 mem_ref_valid_without_offset_p (...)

 ?
>>>
>>> OK.
>>>
> +{
> +  if (!parts.base)
> +return false;
> +
> +  gcc_assert (!parts.symbol);
> +  parts.offset = NULL_TREE;
> +  return valid_mem_ref_p (TYPE_MODE (type), TYPE_ADDR_SPACE (type), &parts);
> +}
> +
>  /* Creates and returns a TARGET_MEM_REF for address ADDR.  If necessary
> computations are emitted in front of GSI.  TYPE is the mode
> of created memory reference. IV_CAND is the selected iv candidate in ADDR,
> @@ -809,7 +823,8 @@ create_mem_ref (gimple_stmt_iterator *gs

 Which means all of the following would be more naturally written as

>   into:
> index' = index << step;
> [... + index' + ,,,].  */
> -  if (parts.step && !integer_onep (parts.step))
> +  bool scaled_p = (parts.step && !integer_onep (parts.step));
> +  if (scaled_p && !keep_index_p (type, parts))
>  {

   if (mem_ref_valid_without_offset_p (...))
{
  ...
  return create_mem_ref_raw (...);
}
>>>
>>> Is this inside the test for a scale:
>>>
>>>   if (parts.step && !integer_onep (parts.step))
>>> {
>>>   if (mem_ref_valid_without_offset_p (...))
>>> {
>>>   tree tmp = parts.offset;
>>>   if (parts.base)
>>> {
>>>   tmp = fold_build_pointer_plus (parts.base, tmp);
>>>   tmp = force_gimple_operand_gsi_1 (gsi, tmp,
>>> is_gimple_mem_ref_addr,
>>> NULL_TREE, true,
>>> GSI_SAME_STMT);
>>> }
>>>   parts.base = tmp;
>>>   parts.offset = NULL_TREE;
>>>   mem_ref = create_mem_ref_raw (type, alias_ptr_type, &parts, true);
>>>   gcc_assert (mem_ref);
>>>   ret
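
The decision made by keep_index_p in the quoted patch can be modelled
with a toy target that accepts either an index or an offset but not both,
as in the example from the patch description.  This is a sketch under
stated assumptions: struct addr_parts, toy_valid_p and keep_index_model
are hypothetical stand-ins for mem_address and valid_mem_ref_p, not GCC
code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified address: BASE + INDEX * STEP + OFFSET.  */
struct addr_parts
{
  bool has_base;
  bool has_index;
  long step;
  long offset;
};

/* Toy target rule: BASE + INDEX * STEP and BASE + OFFSET are both
   valid, but an index and a nonzero offset together are not.  */
static bool
toy_valid_p (const struct addr_parts *parts)
{
  assert (parts != NULL);
  return !(parts->has_index && parts->offset != 0);
}

/* Return true if the scaled index should be kept, i.e. if the address
   becomes valid once the offset is removed.  PARTS is passed by value,
   so clearing the offset does not affect the caller's copy.  */
static bool
keep_index_model (struct addr_parts parts)
{
  if (!parts.has_base)
    return false;
  parts.offset = 0;
  return toy_valid_p (&parts);
}
```

For an address like a[i + 1] with 8-byte elements, the full address is
invalid on the toy target but keep_index_model returns true, so the
offset rather than the scaled index is the part to fold into the base.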

Re: [PATCH] document -Wclass-memaccess suppression by casting (PR 81327)

2018-01-15 Thread Florian Weimer
* Martin Sebor:

> +the virtual table.  Modifying the representation of such objects may violate
   ^vtable pointer?

The vtable itself is not corrupted, I assume.



Re: [PATCH] lto, testsuite: Fix ICE in -Wodr (PR lto/83121)

2018-01-15 Thread Richard Biener
On Mon, Jan 8, 2018 at 8:36 PM, David Malcolm  wrote:
> On Sat, 2018-01-06 at 08:44 +0100, Richard Biener wrote:
>> On January 5, 2018 11:55:11 PM GMT+01:00, David Malcolm wrote:
>> > On Fri, 2018-01-05 at 10:36 +0100, Richard Biener wrote:
>> > > On Thu, Jan 4, 2018 at 10:52 PM, David Malcolm wrote:
>> > > > PR lto/83121 reports an ICE deep inside the linemap code when
>> > > > -Wodr
>> > > > reports on a type mismatch.
>> > > >
>> > > > The root cause is that the warning can access the
>> > > > DECL_SOURCE_LOCATION
>> > > > of a streamed-in decl before the lto_location_cache has been
>> > > > applied.
>> > > >
>> > > > lto_location_cache::input_location stores
>> > > > RESERVED_LOCATION_COUNT
>> > > > (==2)
>> > > > as a poison value until the cache is applied:
>> > > > 250   /* Keep value RESERVED_LOCATION_COUNT in *loc as
>> > > > linemap
>> > > > lookups will
>> > > > 251  ICE on it.  */
>> > > >
>> > > > The fix is relatively simple: apply the cache before reading
>> > > > the
>> > > > DECL_SOURCE_LOCATION.
>> > > >
>> > > > (I wonder if we should instead have a INVALID_LOCATION value to
>> > > > handle
>> > > > this case more explicitly?  e.g. 0x?  or reserve 2 in
>> > > > libcpp for
>> > > > that purpose, and have the non-reserved locations start at
>> > > > 3?  Either
>> > > > would be more invasive, though)
>> > > >
>> > > > Triggering the ICE was fiddly: it seems to be affected by many
>> > > > things,
>> > > > including the order of files, and (I think) by filenames.  My
>> > > > theory is
>> > > > that it's affected by the ordering of the tree nodes in the LTO
>> > > > stream:
>> > > > for the ICE to occur, the types in question need to be compared
>> > > > before
>> > > > some other operation flushes the lto_location_cache.  This
>> > > > ordering
>> > > > is affected by the hash-based ordering in DFS in lto-streamer-
>> > > > out.c, which
>> > > > might explain why r255066 seemed to trigger the bug; the only
>> > > > relevant
>> > > > change to LTO there seemed to be:
>> > > >   * lto-streamer-out.c (hash_tree): Hash TYPE_EMPTY_P and
>> > > > DECL_PADDING_P.
>> > > > If so, then the bug was presumably already present, but hidden.
>> > > >
>> > > > The patch also adds regression test coverage for the ICE, which
>> > > > is
>> > > > more
>> > > > involved - as far as I can tell, we don't have an existing way
>> > > > to
>> > > > verify
>> > > > diagnostics emitted during link-time optimization.
>> > > >
>> > > > Hence the patch adds some machinery to lib/lto.exp to support
>> > > > two
>> > > > new
>> > > > directives: dg-lto-warning and dg-lto-message, corresponding to
>> > > > dg-warning and dg-message respectively, where the diagnostics
>> > > > are
>> > > > expected to be emitted at link-time.
>> > > >
>> > > > The test case includes examples of LTO warnings and notes in
>> > > > both
>> > > > the
>> > > > primary and secondary source files
>> > > >
>> > > > Doing so required reusing the logic from DejaGnu for handling
>> > > > diagnostics.
>> > > > Unfortunately the pertinent code is a 50 line loop within a
>> > > > ~200
>> > > > line Tcl
>> > > > function in dg.exp (dg-test), so I had to copy it from DejaGnu,
>> > > > making
>> > > > various changes as necessary (see
>> > > > lto_handle_diagnostics_for_file
>> > > > in the
>> > > > patch; for example the LTO version supports multiple source
>> > > > files,
>> > > > identifying which source file emitted a diagnostic).
>> > > >
>> > > > For non-LTO diagnostics we currently ignore surplus "note"
>> > > > diagnostics.
>> > > > This patch updates lto_prune_warns to follow this behavior
>> > > > (since
>> > > > otherwise we'd need numerous dg-lto-message directives for the
>> > > > motivating
>> > > > test case).
>> > > >
>> > > > The patch adds these PASS results to g++.sum:
>> > > >
>> > > > PASS: g++.dg/lto/pr83121 cp_lto_pr83121_0.o assemble, -O0 -flto
>> > > > PASS: g++.dg/lto/pr83121 cp_lto_pr83121_1.o assemble, -O0 -flto
>> > > > PASS: g++.dg/lto/pr83121  (test for LTO warnings, pr83121_0.C
>> > > > line
>> > > > 6)
>> > > > PASS: g++.dg/lto/pr83121  (test for LTO warnings, pr83121_0.C
>> > > > line
>> > > > 8)
>> > > > PASS: g++.dg/lto/pr83121  (test for LTO warnings, pr83121_1.C
>> > > > line
>> > > > 2)
>> > > > PASS: g++.dg/lto/pr83121  (test for LTO warnings, pr83121_1.C
>> > > > line
>> > > > 3)
>> > > > PASS: g++.dg/lto/pr83121 cp_lto_pr83121_0.o-cp_lto_pr83121_1.o
>> > > > link, -O0 -flto
>> > > >
>> > > > The output for dg-lto-message above refers to "warnings",
>> > > > rather
>> > > > than
>> > > > "messages" but that's the same as for the non-LTO case, where
>> > > > dg-
>> > > > message
>> > > > also refers to "warnings".
>> > > >
>> > > > Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu.
>> > > >
>> > > > OK for trunk?
>> > >
>> > > Hmm, but we do this in warn_odr already?  How's that not enough?
>> > >
>> > > At least it seems the place you add this isn't ideal (not 

Re: Make ivopts handle calls to internal functions

2018-01-15 Thread Christophe Lyon
Hi,


On 13 January 2018 at 16:34, Jeff Law  wrote:
> On 01/09/2018 08:23 AM, Richard Sandiford wrote:
>> Richard Biener  writes:
>>> On Mon, Nov 20, 2017 at 12:31 PM, Bin.Cheng  wrote:
 On Fri, Nov 17, 2017 at 3:03 PM, Richard Sandiford
  wrote:
> ivopts previously treated pointer arguments to internal functions
> like IFN_MASK_LOAD and IFN_MASK_STORE as normal gimple values.
> This patch makes it treat them as addresses instead.  This makes
> a significant difference to the code quality for SVE loops,
> since we can then use loads and stores with scaled indices.
 Thanks for working on this.  This can be extended to other internal
 functions which are eventually expanded into memory references.  I believe
 (at least) both x86 and AArch64 have such a requirement.
>>>
>>> In addition to Bin's comments I only have a single one (the rest of the
>>> middle-end
>>> changes look OK).  The alias type of MEM_REFs and TARGET_MEM_REFs
>>> in ADDR_EXPR context is meaningless so you don't need to jump through hoops
>>> to get at it or preserve it in any way, likewise for CLIQUE/BASE if it
>>> were present.
>>
>> Ah, OK.
>>
>>> Maybe you can simplify code with this.
>>
>> In the end it didn't really simplify the code, since internal-fn.c
>> uses the address to build a (TARGET_)MEM_REF, and the alias information
>> of that ref needs to be correct, since it gets carried across to the
>> MEM rtx.  But it does mean that the alias_ptr_type check in the previous:
>>
>>   if (TREE_CODE (mem) == TARGET_MEM_REF
>> && types_compatible_p (TREE_TYPE (mem), type)
>> && alias_ptr_type == TREE_TYPE (TMR_OFFSET (mem))
>> && integer_zerop (TMR_OFFSET (mem)))
>>   return mem;
>>
>> made no sense: we should simply replace the TMR_OFFSET if it has
>> the wrong type.
>>
>>> As you're introducing &TARGET_MEM_REF as a valid construct (it wasn't
>>> before) you'll run into missing / misguided foldings eventually.  So
>>> be prepared to fix up fallout.
>>
>> OK :-) I haven't hit any new places yet, but like you say, I'll be on
>> the lookout.
>>
>> Is the version below OK?  Tested on aarch64-linux-gnu, x86_64-linux-gnu
>> and powerpc64le-linux-gnu.
>>
>> Richard
>>
>>
>> 2018-01-09  Richard Sandiford  
>>   Alan Hayward  
>>   David Sherwood  
>>
>> gcc/
>>   * expr.c (expand_expr_addr_expr_1): Handle ADDR_EXPRs of
>>   TARGET_MEM_REFs.
>>   * gimple-expr.h (is_gimple_addressable): Likewise.
>>   * gimple-expr.c (is_gimple_address): Likewise.
>>   * internal-fn.c (expand_call_mem_ref): New function.
>>   (expand_mask_load_optab_fn): Use it.
>>   (expand_mask_store_optab_fn): Likewise.
> OK.
> jeff


I've reported that the updated tests fail on aarch64-none-elf -mabi=ilp32:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83848

Christophe


Re: Allow the number of iterations to be smaller than VF

2018-01-15 Thread Christophe Lyon
On 7 January 2018 at 21:51, James Greenhalgh  wrote:
> On Mon, Nov 20, 2017 at 12:12:38AM +, Jeff Law wrote:
>> On 11/17/2017 08:11 AM, Richard Sandiford wrote:
>> > Fully-masked loops can be profitable even if the iteration
>> > count is smaller than the vectorisation factor.  In this case
>> > we're effectively doing a complete unroll followed by SLP.
>> >
>> > The documentation for min-vect-loop-bound says that the
>> > default value is 0, but actually the default and minimum
>> > were 1.  We need it to be 0 for this case since the parameter
>> > counts a whole number of vector iterations.
>> >
>> > Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
>> > and powerpc64le-linux-gnu.  OK to install?
>> >
>> > Richard
>> >
>> >
>> > 2017-11-17  Richard Sandiford  
>> > Alan Hayward  
>> > David Sherwood  
>> >
>> > gcc/
>> > * doc/sourcebuild.texi (vect_fully_masked): Document.
>> > * params.def (PARAM_MIN_VECT_LOOP_BOUND): Change minimum and
>> > default value to 0.
>> > * tree-vect-loop.c (vect_analyze_loop_costing): New function,
>> > split out from...
>> > (vect_analyze_loop_2): ...here. Don't check the vectorization
>> > factor against the number of loop iterations if the loop is
>> > fully-masked.
>> >
>> > gcc/testsuite/
>> > * lib/target-supports.exp (check_effective_target_vect_fully_masked):
>> > New proc.
>> > * gcc.dg/vect/slp-3.c: Expect all loops to be vectorized if
>> > vect_fully_masked.
>> > * gcc.target/aarch64/sve_loop_add_4.c: New test.
>> > * gcc.target/aarch64/sve_loop_add_4_run.c: Likewise.
>> > * gcc.target/aarch64/sve_loop_add_5.c: Likewise.
>> > * gcc.target/aarch64/sve_loop_add_5_run.c: Likewise.
>> > * gcc.target/aarch64/sve_miniloop_1.c: Likewise.
>> > * gcc.target/aarch64/sve_miniloop_2.c: Likewise.
>> OK.
>> Jeff
>
> The AArch64 tests are OK.
>

I've reported the failures on aarch64-none-elf -mabi=ilp32 in:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83849

Christophe

> James
>
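
The point about min-vect-loop-bound counting whole vector iterations can
be made concrete with a little arithmetic in C.  This is only a sketch:
both function names are hypothetical, and vf stands for the vectorisation
factor.

```c
#include <assert.h>

/* Whole vector iterations available to a loop that relies on a scalar
   tail for the remainder.  */
static unsigned long
full_vector_iterations (unsigned long niters, unsigned long vf)
{
  assert (vf != 0);
  return niters / vf;
}

/* Vector iterations of a fully-masked loop: the last one may handle a
   partial vector.  */
static unsigned long
masked_vector_iterations (unsigned long niters, unsigned long vf)
{
  assert (vf != 0);
  return niters / vf + (niters % vf != 0);
}
```

With 3 scalar iterations and a VF of 4 there are 0 whole vector
iterations, yet a fully-masked loop covers all the work in 1 partial
iteration, which is the case a minimum bound of 1 would reject.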


Re: Allow single-element interleaving for non-power-of-2 strides

2018-01-15 Thread Christophe Lyon
On 7 January 2018 at 21:55, James Greenhalgh  wrote:
> On Fri, Nov 17, 2017 at 06:40:13PM +, Jeff Law wrote:
>> On 11/17/2017 08:33 AM, Richard Sandiford wrote:
>> > This allows LD3 to be used for isolated a[i * 3] accesses, in a similar
>> > way to the current a[i * 2] and a[i * 4] for LD2 and LD4 respectively.
>> > Given the problems with the cost model underestimating the cost of
>> > elementwise accesses, the patch continues to reject the VMAT_ELEMENTWISE
>> > cases that are currently rejected.
>> >
>> > Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
>> > and powerpc64le-linux-gnu.  OK to install?
>> >
>> > Richard
>> >
>> >
>> > 2017-11-17  Richard Sandiford  
>> > Alan Hayward  
>> > David Sherwood  
>> >
>> > gcc/
>> > * tree-vect-data-refs.c (vect_analyze_group_access_1): Allow
>> > single-element interleaving even if the size is not a power of 2.
>> > * tree-vect-stmts.c (get_load_store_type): Disallow elementwise
>> > accesses for single-element interleaving if the group size is
>> > not a power of 2.
>> >
>> > gcc/testsuite/
>> > * gcc.target/aarch64/sve_struct_vect_18.c: New test.
>> > * gcc.target/aarch64/sve_struct_vect_18_run.c: Likewise.
>> > * gcc.target/aarch64/sve_struct_vect_19.c: Likewise.
>> > * gcc.target/aarch64/sve_struct_vect_19_run.c: Likewise.
>> OK.
>> jeff
>
> The AArch64 tests are OK.
>

Hi,

After this commit (r256634), I have reported regressions on armeb in:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83851

Christophe

> Thanks,
> James
>
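
For reference, an "isolated a[i * 3] access" in the sense of the quoted
patch is a loop of the following shape, where only one element of each
group of three is used.  This is a sketch: the function name is
hypothetical, and whether the vectorizer actually uses LD3 for it depends
on the target and options.

```c
#include <assert.h>
#include <stddef.h>

/* Sum every third element of A, reading N elements in total.  A must
   hold at least 3 * (N - 1) + 1 elements when N is nonzero.  */
static int
sum_every_third (const int *a, size_t n)
{
  assert (a != NULL || n == 0);
  int sum = 0;
  for (size_t i = 0; i < n; ++i)
    sum += a[i * 3];
  return sum;
}
```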


Re: [PATCH, rs6000] Support for gimple folding of mergeh, mergel intrinsics

2018-01-15 Thread Richard Sandiford
Richard Biener  writes:
> On Fri, Jan 12, 2018 at 10:22 PM, Will Schmidt
>  wrote:
>> Hi,
>>   Add support for gimple folding of the mergeh, mergel intrinsics.
>> Since the merge low and merge high variants are almost identical, a
>> new helper function has been added so that code can be shared.
>>
>> This also adds define_insn for xxmrghw, xxmrglw instructions, allowing us
>> to generate xxmrglw instead of vmrglw after folding.  A few whitespace
>> fixes have been made to the existing vmrg?w defines.
>>
>> The changes introduced here affect the existing target testcases
>> gcc.target/powerpc/builtins-1-be.c and builtins-1-le.c, such that
>> a number of the scan-assembler tests would fail due to instruction counts
>> changing.  Since the purpose of that test is to primarily ensure those
>> intrinsics are accepted by the compiler, I have disabled gimple-folding for
>> the existing tests that count instructions, and created new variants of those
>> tests with folding enabled and a higher optimization level, that do not count
>> instructions.
>>
>> Regtests are currently running across assorted power systems.
>> OK for trunk, pending successful results?
>>
>> Thanks,
>> -Will
>>
>> [gcc]
>>
>> 2018-01-12  Will Schmidt  
>>
>> * config/rs6000/rs6000.c (rs6000_gimple_builtin): Add gimple folding
>> support for merge[hl].  (fold_mergehl_helper): New helper function.
>> * config/rs6000/altivec.md (altivec_xxmrghw_direct): New.
>> (altivec_xxmrglw_direct): New.
>>
>> [testsuite]
>>
>> 2018-01-12  Will Schmidt  
>>
>> * gcc.target/powerpc/fold-vec-mergehl-char.c: New.
>> * gcc.target/powerpc/fold-vec-mergehl-double.c: New.
>> * gcc.target/powerpc/fold-vec-mergehl-float.c: New.
>> * gcc.target/powerpc/fold-vec-mergehl-int.c: New.
>> * gcc.target/powerpc/fold-vec-mergehl-longlong.c: New.
>> * gcc.target/powerpc/fold-vec-mergehl-pixel.c: New.
>> * gcc.target/powerpc/fold-vec-mergehl-short.c: New.
>> * gcc.target/powerpc/builtins-1-be.c: Disable gimple-folding.
>> * gcc.target/powerpc/builtins-1-le.c: Disable gimple-folding.
>> * gcc.target/powerpc/builtins-1-be-folded.c: New.
>> * gcc.target/powerpc/builtins-1-le-folded.c: New.
>> * gcc.target/powerpc/builtins-1.fold.h: New.
>>
>> diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
>> index 733d920..65d4548 100644
>> --- a/gcc/config/rs6000/altivec.md
>> +++ b/gcc/config/rs6000/altivec.md
>> @@ -1101,10 +1101,20 @@
>>else
>>  return "vmrglw %0,%2,%1";
>>  }
>>[(set_attr "type" "vecperm")])
>>
>> +
>> +(define_insn "altivec_xxmrghw_direct"
>> +  [(set (match_operand:V4SI 0 "register_operand" "=v")
>> +   (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
>> + (match_operand:V4SI 2 "register_operand" "v")]
>> +UNSPEC_VMRGH_DIRECT))]
>> +  "TARGET_P8_VECTOR"
>> +  "xxmrghw %x0,%x1,%x2"
>> +  [(set_attr "type" "vecperm")])
>> +
>>  (define_insn "altivec_vmrghw_direct"
>>[(set (match_operand:V4SI 0 "register_operand" "=v")
>>  (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
>>(match_operand:V4SI 2 "register_operand" "v")]
>>   UNSPEC_VMRGH_DIRECT))]
>> @@ -1185,12 +1195,12 @@
>>[(set_attr "type" "vecperm")])
>>
>>  (define_insn "altivec_vmrglb_direct"
>>[(set (match_operand:V16QI 0 "register_operand" "=v")
>>  (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v")
>> -  (match_operand:V16QI 2 "register_operand" "v")]
>> -  UNSPEC_VMRGL_DIRECT))]
>> +  (match_operand:V16QI 2 "register_operand" "v")]
>> + UNSPEC_VMRGL_DIRECT))]
>>"TARGET_ALTIVEC"
>>"vmrglb %0,%1,%2"
>>[(set_attr "type" "vecperm")])
>>
>>  (define_expand "altivec_vmrglh"
>> @@ -1242,11 +1252,11 @@
>>
>>  (define_insn "altivec_vmrglh_direct"
>>[(set (match_operand:V8HI 0 "register_operand" "=v")
>>  (unspec:V8HI [(match_operand:V8HI 1 "register_operand" "v")
>>   (match_operand:V8HI 2 "register_operand" "v")]
>> - UNSPEC_VMRGL_DIRECT))]
>> +UNSPEC_VMRGL_DIRECT))]
>>"TARGET_ALTIVEC"
>>"vmrglh %0,%1,%2"
>>[(set_attr "type" "vecperm")])
>>
>>  (define_expand "altivec_vmrglw"
>> @@ -1290,10 +1300,19 @@
>>else
>>  return "vmrghw %0,%2,%1";
>>  }
>>[(set_attr "type" "vecperm")])
>>
>> +(define_insn "altivec_xxmrglw_direct"
>> +  [(set (match_operand:V4SI 0 "register_operand" "=v")
>> +   (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
>> + (match_operand:V4SI 2 "register_operand" "v")]
>> +UNSPEC_VMRGL_DIRECT))]
>> +  "TARGET_P8_VECTOR"
>> +  "xxmrglw %x0,%x1,%x2"
>> +  [(set_attr "type" "vecperm")])
>> +
>>  (define_insn "altivec_vmrglw_direct"
>>[(set (match_operand:V4SI 

Re: [PATCH, PR82428] Add __builtin_goacc_{gang,worker,vector}_{id,size}

2018-01-15 Thread Tom de Vries

On 01/15/2018 11:05 AM, Jakub Jelinek wrote:

On Mon, Jan 15, 2018 at 10:49:52AM +0100, Tom de Vries wrote:

On 01/07/2018 02:17 PM, Tom de Vries wrote:

On 01/06/2018 12:36 PM, Jakub Jelinek wrote:

On Sat, Jan 06, 2018 at 09:21:59AM +0100, Tom de Vries wrote:

this patch adds the following builtins in C/C++:
- __builtin_goacc_gang_id
- __builtin_goacc_worker_id
- __builtin_goacc_vector_id
- __builtin_goacc_gang_size
- __builtin_goacc_worker_size
- __builtin_goacc_vector_size


I wonder if it wouldn't be better to have just 2 builtins instead of 6,
with one argument (required to be constant) - the kind of parallelism
you're interested in, to avoid the inflation of the builtins.



Like so:
- __builtin_goacc_id
- __builtin_goacc_size
?


Hi,

I understand this is more of a stage1 patch and has no priority atm, but if
the concept as such is acceptable can we at least settle on the name and
interface?


Does OpenACC have some term for the 3 dimensions/kinds of parallelism?


openacc spec: "OpenACC exposes these three levels of parallelism via 
gang, worker and vector parallelism."


So, maybe we abbreviate to: 'parlevel' or 'par_level'?


Is there some enum describing those 3 already?


There's no enum type in openacc.h or gomp-constants.h.

There's an enumeration of int constants from gomp-constants.h:
...
#define GOMP_DIM_GANG   0
#define GOMP_DIM_WORKER 1
#define GOMP_DIM_VECTOR 2
...
which I'm currently using as argument.

Given the amount of trouble that having an enum type as argument for 
acc_on_device has given us, I'm not sure that we want an enum type as 
argument for these builtins. [ See the comments and kludge related to 
c++ in the https://gcc.gnu.org/ml/gcc-patches/2017-12/msg01529.html 
patch for PR82391 Fold acc_on_device with const arg. ]


Thanks,
- Tom


Re: [PATCH, PR82428] Add __builtin_goacc_{gang,worker,vector}_{id,size}

2018-01-15 Thread Jakub Jelinek
On Mon, Jan 15, 2018 at 11:39:28AM +0100, Tom de Vries wrote:
> > Does OpenACC have some term for the 3 dimensions/kinds of parallelism?
> 
> openacc spec: "OpenACC exposes these three levels of parallelism via gang,
> worker and vector parallelism."
> 
> So, maybe we abbreviate to: 'parlevel' or 'par_level'?
> 
> > Is there some enum describing those 3 already?
> 
> There's no enum type in openacc.h or gomp-constants.h.
> 
> There's an enumeration of int constants from gomp-constants.h:
> ...
> #define GOMP_DIM_GANG   0
> #define GOMP_DIM_WORKER 1
> #define GOMP_DIM_VECTOR 2
> ...
> which I'm currently using as argument.
> 
> Given the amount of trouble that having an enum type as argument for
> acc_on_device has given us, I'm not sure that we want an enum type as
> argument for these builtins. [ See the comments and kludge related to c++ in
> the https://gcc.gnu.org/ml/gcc-patches/2017-12/msg01529.html patch for
> PR82391 Fold acc_on_device with const arg. ]

Sure, the argument to the builtin should be just int.  I'm talking about
what would users use and what would be documented in extend.texi.
It can be just number of course.  parlevel is fine for me.

Jakub


Re: [v3 PATCH] Make optional conditionally trivially_{copy,move}_{constructible,assignable}

2018-01-15 Thread Jonathan Wakely

On 14/01/18 01:09 +0200, Ville Voutilainen wrote:

On 8 January 2018 at 15:36, Jonathan Wakely  wrote:

On 25/12/17 23:59 +0200, Ville Voutilainen wrote:


In the midst of the holiday season, the king and ruler of all elves, otherwise
known as The Elf, was told by little elves that users are complaining how
stlstl and libc++ make optional's copy and move operations conditionally
trivial, but libstdc++ doesn't. This made The Elf fairly angry, and he spoke
"this will not stand".

Tested on Linux-PPC64. The change is an ABI break due to changing
optional to a trivially copyable type. It's perhaps
better to get that ABI break in now rather than later.



Agreed, but a few comments and questions below.


New patch attached. I made _M_reset and _M_destruct noexcept, and
added a comment about the protected
inheritance in the code. Please double-check the whitespace department.


Thanks, OK for trunk.




Re: [PATCH, PR82428] Add __builtin_goacc_{gang,worker,vector}_{id,size}

2018-01-15 Thread Tom de Vries

On 01/15/2018 11:44 AM, Jakub Jelinek wrote:

On Mon, Jan 15, 2018 at 11:39:28AM +0100, Tom de Vries wrote:

Does OpenACC have some term for the 3 dimensions/kinds of parallelism?


openacc spec: "OpenACC exposes these three levels of parallelism via gang,
worker and vector parallelism."

So, maybe we abbreviate to: 'parlevel' or 'par_level'?


Is there some enum describing those 3 already?


There's no enum type in openacc.h or gomp-constants.h.

There's an enumeration of int constants from gomp-constants.h:
...
#define GOMP_DIM_GANG   0
#define GOMP_DIM_WORKER 1
#define GOMP_DIM_VECTOR 2
...
which I'm currently using as argument.

Given the amount of trouble that having an enum type as argument for
acc_on_device has given us, I'm not sure that we want an enum type as
argument for these builtins. [ See the comments and kludge related to c++ in
the https://gcc.gnu.org/ml/gcc-patches/2017-12/msg01529.html patch for
PR82391 Fold acc_on_device with const arg. ]


Sure, the argument to the builtin should be just int.


Right.


 I'm talking about
what would users use and what would be documented in extend.texi.


And that is precisely my concern. Say that we define:
...
enum goacc_parlevel_t { gang = 0, worker = 1, vector = 2 };
...
and define as interface:
...
int __builtin_goacc_parlevel_id (enum goacc_parlevel_t);
...
but define the builtin with int argument.

Then we run into the same trouble as for acc_on_device: in c++ the 
__builtin_goacc_parlevel_id (enum goacc_parlevel_t) does not map onto 
the __builtin_goacc_parlevel_id (int), and we need to employ a kludge to 
make it so (the current kludge for acc_on_device is the inline function 
in openacc.h. The patch mentioned above replaces that kludge with 
another one: testing for name and signature of the function).



It can be just number of course.  parlevel is fine for me.



So, in summary, I propose as interface:
- int __builtin_goacc_parlevel_id (int);
- int __builtin_goacc_parlevel_size (int);
with arguments 0, 1, and 2 meaning gang, worker and vector.

Thanks,
- Tom
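
To make the proposed shape concrete, here is a minimal C sketch of how code might use an int-argument interface of this kind. The GOMP_DIM_* values are the ones quoted from gomp-constants.h above. Everything else (mock_parlevel_id, mock_parlevel_size, the sample ids and sizes, and linear_thread_index) is a hypothetical stand-in invented for illustration, since the builtins discussed here are only a proposal at this point.

```c
#include <stddef.h>

/* Values quoted in the thread from gomp-constants.h. */
#define GOMP_DIM_GANG   0
#define GOMP_DIM_WORKER 1
#define GOMP_DIM_VECTOR 2

/* Hypothetical sample data standing in for what an accelerator would
   report; these numbers are made up for the example.  */
static const int mock_ids[3]   = { 3, 1, 7 };
static const int mock_sizes[3] = { 8, 4, 32 };

/* Hypothetical stand-in for the proposed
   int __builtin_goacc_parlevel_id (int).  Returns -1 for an
   argument outside 0..2.  */
static int mock_parlevel_id(int parlevel)
{
    if (parlevel < GOMP_DIM_GANG || parlevel > GOMP_DIM_VECTOR)
        return -1;
    return mock_ids[parlevel];
}

/* Hypothetical stand-in for the proposed
   int __builtin_goacc_parlevel_size (int).  */
static int mock_parlevel_size(int parlevel)
{
    if (parlevel < GOMP_DIM_GANG || parlevel > GOMP_DIM_VECTOR)
        return -1;
    return mock_sizes[parlevel];
}

/* One possible use: flatten (gang, worker, vector) into a single index.  */
static int linear_thread_index(void)
{
    int gang   = mock_parlevel_id(GOMP_DIM_GANG);
    int worker = mock_parlevel_id(GOMP_DIM_WORKER);
    int vector = mock_parlevel_id(GOMP_DIM_VECTOR);
    int wsize  = mock_parlevel_size(GOMP_DIM_WORKER);
    int vsize  = mock_parlevel_size(GOMP_DIM_VECTOR);

    return (gang * wsize + worker) * vsize + vector;
}
```

Because the argument is a plain int, the same declaration works unchanged from C and C++, which is the point of avoiding an enum parameter.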


[patch,avr,committed] Adjust tests to AVR_TINY

2018-01-15 Thread Georg-Johann Lay

This obvious patch adds more handling of AVR_TINY, mostly by
applying the "! avr_tiny" target filter or by defaulting to the
generic address space if __flash is not available.

Committed as https://gcc.gnu.org/r256690

Johann

* gcc.target/avr/progmem.h (pgm_read_char): Handle AVR_TINY.
* gcc.target/avr/pr52472.c: Add "! avr_tiny" target filter.
* gcc.target/avr/pr71627.c: Same.
* gcc.target/avr/torture/addr-space-1-0.c: Same.
* gcc.target/avr/torture/addr-space-1-1.c: Same.
* gcc.target/avr/torture/addr-space-1-x.c: Same.
* gcc.target/avr/torture/addr-space-2-0.c: Same.
* gcc.target/avr/torture/addr-space-2-1.c: Same.
* gcc.target/avr/torture/addr-space-2-x.c: Same.
* gcc.target/avr/torture/sat-hr-plus-minus.c: Same.
* gcc.target/avr/torture/sat-k-plus-minus.c: Same.
* gcc.target/avr/torture/sat-llk-plus-minus.c: Same.
* gcc.target/avr/torture/sat-r-plus-minus.c: Same.
* gcc.target/avr/torture/sat-uhr-plus-minus.c: Same.
* gcc.target/avr/torture/sat-uk-plus-minus.c: Same.
* gcc.target/avr/torture/sat-ullk-plus-minus.c: Same.
* gcc.target/avr/torture/sat-ur-plus-minus.c: Same.
* gcc.target/avr/torture/pr61055.c: Same.
* gcc.target/avr/torture/builtins-3-absfx.c: Only use __flash if
available.
* gcc.target/avr/torture/int24-mul.c: Same.
* gcc.target/avr/torture/pr51782-1.c: Same.
* gcc.target/avr/torture/pr61443.c: Same.
* gcc.target/avr/torture/builtins-2.c: Factor out addr-space stuff...
* gcc.target/avr/torture/builtins-2-flash.c: ...to this new test.
Index: gcc.target/avr/pr52472.c
===
--- gcc.target/avr/pr52472.c	(revision 256686)
+++ gcc.target/avr/pr52472.c	(working copy)
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do compile { target { ! avr_tiny } } } */
 /* { dg-options "-Os -g -Wno-pointer-to-int-cast" } */
 
 /* This testcase exposes PR52472. expand_debug_expr mistakenly
Index: gcc.target/avr/pr71627.c
===
--- gcc.target/avr/pr71627.c	(revision 256686)
+++ gcc.target/avr/pr71627.c	(working copy)
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do compile { target { ! avr_tiny } } } */
 /* { dg-options "-O1" } */
 
 
Index: gcc.target/avr/progmem.h
===
--- gcc.target/avr/progmem.h	(revision 256686)
+++ gcc.target/avr/progmem.h	(working copy)
@@ -13,6 +13,10 @@
 __asm__ ("lpm %0, %a1"  \
  : "=r" (__result) : "z" (__addr16));   \
 __result; }))
+#elif defined (__AVR_TINY__)
+/* PR71948 auto-adds 0x4000 as needed, hance just a plain read. */
+#define pgm_read_char(addr) \
+  (*(addr))
 #else
 #define pgm_read_char(addr) \
 (__extension__({\
Index: gcc.target/avr/torture/addr-space-1-0.c
===
--- gcc.target/avr/torture/addr-space-1-0.c	(revision 256686)
+++ gcc.target/avr/torture/addr-space-1-0.c	(working copy)
@@ -1,5 +1,5 @@
 /* { dg-options "-std=gnu99" } */
-/* { dg-do run } */
+/* { dg-do run { target { ! avr_tiny } } } */
 
 #define __as __flash
 
Index: gcc.target/avr/torture/addr-space-1-1.c
===
--- gcc.target/avr/torture/addr-space-1-1.c	(revision 256686)
+++ gcc.target/avr/torture/addr-space-1-1.c	(working copy)
@@ -1,5 +1,5 @@
 /* { dg-options "-std=gnu99 -Tavr51-flash1.x" } */
-/* { dg-do run } */
+/* { dg-do run { target { ! avr_tiny } } } */
 
 #define __as __flash1
 
Index: gcc.target/avr/torture/addr-space-1-x.c
===
--- gcc.target/avr/torture/addr-space-1-x.c	(revision 256686)
+++ gcc.target/avr/torture/addr-space-1-x.c	(working copy)
@@ -1,5 +1,5 @@
 /* { dg-options "-std=gnu99" } */
-/* { dg-do run } */
+/* { dg-do run { target { ! avr_tiny } } } */
 
 #define __as __memx
 
Index: gcc.target/avr/torture/addr-space-2-0.c
===
--- gcc.target/avr/torture/addr-space-2-0.c	(revision 256686)
+++ gcc.target/avr/torture/addr-space-2-0.c	(working copy)
@@ -1,5 +1,5 @@
 /* { dg-options "-std=gnu99" } */
-/* { dg-do run } */
+/* { dg-do run { target { ! avr_tiny } } } */
 
 #define __as __flash
 
Index: gcc.target/avr/torture/addr-space-2-1.c
===
--- gcc.target/avr/torture/addr-space-2-1.c	(revision 256686)
+++ gcc.target/avr/torture/addr-space-2-1.c	(working copy)
@@ -1,5 +1,5 @@
 /* { dg-options "-std=gnu99 -Tavr51-flash1.x" } */
-/* { dg-do run } */
+/* { dg-do run { target { ! avr_tiny } } } */
 
 #define __as __flash1
 

[PATCH][arm] PR target/83687: Fix invalid combination of VSUB + VABS into VABD

2018-01-15 Thread Kyrill Tkachov

Hi all,

In this wrong-code bug we combine a VSUB.I8 and a VABS.S8
into a VABD.S8 instruction. This combination is not valid
for integer operands because in the VABD instruction the semantics
are that the difference is computed in notionally infinite precision
and the absolute difference is computed on that, whereas for a
VSUB.I8 + VABS.S8 sequence the VSUB operation will perform any
wrapping that's needed for the 8-bit signed type before the VABS
gets its hands on it.

This leads to the wrong-code in the PR where the expected
sequence from the intrinsics:
VSUB + VABS of two vectors {-100, -100, -100...}, {100, 100, 100...}
gives a result of {56, 56, 56...} (-100 - 100)

but GCC optimises it into a single
VABD of {-100, -100, -100...}, {100, 100, 100...}
which produces a result of {200, 200, 200...}
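
The difference between the two sequences can be reproduced per lane in plain C. This is only a scalar model of the semantics described above, not NEON code; the helper names (wrap_s8, sub_then_abs_s8, abd_s8) are made up for the illustration, and abd_s8 returns the lane value read as an unsigned 8-bit number.

```c
#include <stdint.h>

/* Wrap an int into the range of a signed 8-bit lane (-128..127),
   modelling two's-complement truncation to 8 bits.  */
static int wrap_s8(int v)
{
    v = ((v % 256) + 256) % 256;      /* now in 0..255 */
    return v >= 128 ? v - 256 : v;
}

/* Scalar model of one lane of VSUB.I8 followed by VABS.S8:
   the subtraction wraps to 8 bits before the absolute value is taken.  */
static int sub_then_abs_s8(int8_t a, int8_t b)
{
    int diff = wrap_s8((int)a - (int)b);
    return wrap_s8(diff < 0 ? -diff : diff);
}

/* Scalar model of one lane of VABD.S8: the difference is formed
   without wrapping, and its absolute value (0..255) is the result.  */
static int abd_s8(int8_t a, int8_t b)
{
    int diff = (int)a - (int)b;
    return diff < 0 ? -diff : diff;
}
```

For the lane values from the PR, sub_then_abs_s8(-100, 100) yields 56 while abd_s8(-100, 100) yields 200; for inputs whose difference fits in 8 bits the two agree.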

The transformation is still valid for floating-point operands,
which is why it was added in the first place I believe (r178817)
but this patch disables it for integer operands.
The HFmode variants though only exist for TARGET_NEON_FP16INST, so
this patch adds the appropriate guards to the new mode iterator.

Bootstrapped and tested on arm-none-linux-gnueabihf.

Committing to trunk.

Thanks,
Kyrill

2018-01-15  Kyrylo Tkachov  

PR target/83687
* config/arm/iterators.md (VF): New mode iterator.
* config/arm/neon.md (neon_vabd_2): Use the above.
Remove integer-related logic from pattern.
(neon_vabd_3): Likewise.

2018-01-15  Kyrylo Tkachov  

PR target/83687
* gcc.target/arm/neon-combine-sub-abs-into-vabd.c: Delete integer
tests.
* gcc.target/arm/pr83687.c: New test.
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 5772aa99cc92de66ef4438b76632e86325a96ef2..0b2d42399d22ba89a976e39bef6182d31173c1ef 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -119,6 +119,10 @@ (define_mode_iterator VN [V8HI V4SI V2DI])
 ;; All supported vector modes (except singleton DImode).
 (define_mode_iterator VDQ [V8QI V16QI V4HI V8HI V2SI V4SI V4HF V8HF V2SF V4SF V2DI])
 
+;; All supported floating-point vector modes (except V2DF).
+(define_mode_iterator VF [(V4HF "TARGET_NEON_FP16INST")
+			   (V8HF "TARGET_NEON_FP16INST") V2SF V4SF])
+
 ;; All supported vector modes (except those with 64-bit integer elements).
 (define_mode_iterator VDQW [V8QI V16QI V4HI V8HI V2SI V4SI V2SF V4SF])
 
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 59fb6435da8abfe46254558e8646cd4606acb4fa..6a6f5d737715e4100adee8fb7de1d6211da3d85c 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -6706,28 +6706,22 @@ (define_expand "vec_pack_trunc_"
 })
 
 (define_insn "neon_vabd_2"
- [(set (match_operand:VDQ 0 "s_register_operand" "=w")
-   (abs:VDQ (minus:VDQ (match_operand:VDQ 1 "s_register_operand" "w")
-   (match_operand:VDQ 2 "s_register_operand" "w"]
- "TARGET_NEON && (! || flag_unsafe_math_optimizations)"
+ [(set (match_operand:VF 0 "s_register_operand" "=w")
+   (abs:VF (minus:VF (match_operand:VF 1 "s_register_operand" "w")
+			 (match_operand:VF 2 "s_register_operand" "w"]
+ "TARGET_NEON && flag_unsafe_math_optimizations"
  "vabd. %0, %1, %2"
- [(set (attr "type")
-   (if_then_else (ne (symbol_ref "") (const_int 0))
- (const_string "neon_fp_abd_s")
- (const_string "neon_abd")))]
+ [(set_attr "type" "neon_fp_abd_s")]
 )
 
 (define_insn "neon_vabd_3"
- [(set (match_operand:VDQ 0 "s_register_operand" "=w")
-   (abs:VDQ (unspec:VDQ [(match_operand:VDQ 1 "s_register_operand" "w")
- (match_operand:VDQ 2 "s_register_operand" "w")]
- UNSPEC_VSUB)))]
- "TARGET_NEON && (! || flag_unsafe_math_optimizations)"
+ [(set (match_operand:VF 0 "s_register_operand" "=w")
+   (abs:VF (unspec:VF [(match_operand:VF 1 "s_register_operand" "w")
+			(match_operand:VF 2 "s_register_operand" "w")]
+		UNSPEC_VSUB)))]
+ "TARGET_NEON && flag_unsafe_math_optimizations"
  "vabd. %0, %1, %2"
- [(set (attr "type")
-   (if_then_else (ne (symbol_ref "") (const_int 0))
- (const_string "neon_fp_abd_s")
- (const_string "neon_abd")))]
+ [(set_attr "type" "neon_fp_abd_s")]
 )
 
 ;; Copy from core-to-neon regs, then extend, not vice-versa
diff --git a/gcc/testsuite/gcc.target/arm/neon-combine-sub-abs-into-vabd.c b/gcc/testsuite/gcc.target/arm/neon-combine-sub-abs-into-vabd.c
index fe3d78b308cde0338300785cf7cb6ca77a831e3d..784714f0e87d8cd1216af948c61cdb87319e02cd 100644
--- a/gcc/testsuite/gcc.target/arm/neon-combine-sub-abs-into-vabd.c
+++ b/gcc/testsuite/gcc.target/arm/neon-combine-sub-abs-into-vabd.c
@@ -12,31 +12,3 @@ float32x2_t f_sub_abs_to_vabd_32(float32x2_t val1, float32x2_t val2)
   return res;
 }
 /* { dg-final { scan-assembler "vabd\.f32" } }*/
-
-#include 
-int8x8_t sub_abs_to_vabd_8(int8x8_t val1, int8x8_t val2)
-{
-  int8x8_t sres = vsub_s8(val1, val2);
-  int8x8_t res = vabs_s8 

Re: [PATCH, PR82428] Add __builtin_goacc_{gang,worker,vector}_{id,size}

2018-01-15 Thread Jakub Jelinek
On Mon, Jan 15, 2018 at 12:12:10PM +0100, Tom de Vries wrote:
> > It can be just number of course.  parlevel is fine for me.
> > 
> 
> So, in summary, I propose as interface:
> - int __builtin_goacc_parlevel_id (int);
> - int __builtin_goacc_parlevel_size (int);
> with arguments 0, 1, and 2 meaning gang, worker and vector.

LGTM.

Jakub


[PATCH] PR82964: Fix 128-bit immediate ICEs

2018-01-15 Thread Wilco Dijkstra
This fixes PR82964 which reports ICEs for some CONST_WIDE_INT immediates.
It turns out decimal floating point CONST_DOUBLE get changed into
CONST_WIDE_INT without checking the constraint on the operand, which 
results in failures.  Avoid this by only allowing SF/DF/TF mode floating
point constants in aarch64_legitimate_constant_p.  A similar issue can
occur with 128-bit immediates which may be emitted even when disallowed
in aarch64_legitimate_constant_p, and the constraints in movti_aarch64
don't match.  Fix this with a new constraint and allowing valid immediates
in aarch64_legitimate_constant_p.

Rather than allowing all 128-bit immediates and expanding in up to 8
MOV/MOVK instructions, limit them to 4 instructions and use a literal
load for other cases.  Improve the pr79041-2.c test to use a literal and
skip it for -fpic.
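
As a rough illustration of the 4-instruction limit, the C sketch below counts instructions with a deliberately simplified cost model: one MOV/MOVK per non-zero 16-bit chunk of each 64-bit half. The real aarch64_internal_mov_immediate also knows about MOVN and bitmask immediates, so its counts can be lower; model_mov_insns and model_mov128_is_simple are hypothetical names used only here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified model: one instruction per non-zero 16-bit chunk,
   with a minimum of one (moving zero still needs a MOV).
   Assumption: MOVN and bitmask-immediate forms are ignored.  */
static int model_mov_insns(uint64_t value)
{
    int count = 0;

    for (int shift = 0; shift < 64; shift += 16)
        if ((value >> shift) & 0xFFFF)
            count++;

    return count ? count : 1;
}

/* Mirror of the check in the patch: a 128-bit immediate is expanded
   inline only if both halves together need at most 4 instructions;
   otherwise a literal load would be used.  */
static bool model_mov128_is_simple(uint64_t lo, uint64_t hi)
{
    return model_mov_insns(lo) + model_mov_insns(hi) <= 4;
}
```

Under this model a value such as lo = 0x0000111100002222, hi = 0x3333 stays inline (3 instructions), while lo = 0x1111222233334444, hi = 1 would need 5 and falls back to the literal pool.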

This fixes all reported failures. OK for commit?


ChangeLog:
2018-01-15  Wilco Dijkstra  
Richard Sandiford  

gcc/
PR target/82964
* config/aarch64/aarch64.md (movti_aarch64): Use Uti constraint.
* config/aarch64/aarch64.c (aarch64_mov128_immediate): New function.
(aarch64_legitimate_constant_p): Just support CONST_DOUBLE 
SF/DF/TF mode to avoid creating illegal CONST_WIDE_INT immediates.
Call aarch64_mov128_immediate for CONST_WIDE_INT.
* config/aarch64/aarch64-protos.h (aarch64_mov128_immediate): Add 
declaration.
* config/aarch64/constraints.md (aarch64_movti_operand): Limit 
immediates.
* config/aarch64/predicates.md (Uti): Add new constraint.

gcc/testsuite/
PR target/79041
PR target/82964
* gcc.target/aarch64/pr79041-2.c: improve test, disable with fpic.
--

diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 
15c3b46ebef8f305f960e60a8b4e85d8be07e8c7..bc93f4c5753b47c05c144f4a80ba8034603d3736
 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -431,6 +431,8 @@ void aarch64_split_128bit_move (rtx, rtx);
 
 bool aarch64_split_128bit_move_p (rtx, rtx);
 
+bool aarch64_mov128_immediate (rtx);
+
 void aarch64_split_simd_combine (rtx, rtx, rtx);
 
 void aarch64_split_simd_move (rtx, rtx);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
7b50ab43dbc075e6b6d4541c3fb71e5cc872c88b..e6cdbe74356e395c887082cea66a468b51b2ff47
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1996,6 +1996,23 @@ aarch64_internal_mov_immediate (rtx dest, rtx imm, bool 
generate,
   return num_insns;
 }
 
+/* Return whether imm is a 128-bit immediate which is simple enough to
+   expand inline.  */
+bool
+aarch64_mov128_immediate (rtx imm)
+{
+  if (GET_CODE (imm) == CONST_INT)
+return true;
+
+  gcc_assert (CONST_WIDE_INT_NUNITS (imm) == 2);
+
+  rtx lo = GEN_INT (CONST_WIDE_INT_ELT (imm, 0));
+  rtx hi = GEN_INT (CONST_WIDE_INT_ELT (imm, 1));
+
+  return aarch64_internal_mov_immediate (NULL_RTX, lo, false, DImode)
++ aarch64_internal_mov_immediate (NULL_RTX, hi, false, DImode) <= 4;
+}
+
 /* Add DELTA to REGNUM in mode MODE.  SCRATCHREG can be used to hold a
temporary value if necessary.  FRAME_RELATED_P should be true if
the RTX_FRAME_RELATED flag should be set and CFA adjustments added
@@ -10650,7 +10667,10 @@ static bool
 aarch64_legitimate_constant_p (machine_mode mode, rtx x)
 {
   /* Support CSE and rematerialization of common constants.  */
-  if (CONST_INT_P (x) || CONST_DOUBLE_P (x) || GET_CODE (x) == CONST_VECTOR)
+  if (CONST_INT_P (x)
+  || (CONST_DOUBLE_P (x)
+ && (mode == SFmode || mode == DFmode || mode == TFmode))
+  || GET_CODE (x) == CONST_VECTOR)
 return true;
 
   /* Do not allow vector struct mode constants.  We could support
@@ -10658,9 +10678,9 @@ aarch64_legitimate_constant_p (machine_mode mode, rtx x)
   if (aarch64_vect_struct_mode_p (mode))
 return false;
 
-  /* Do not allow wide int constants - this requires support in movti.  */
+  /* Only allow simple 128-bit immediates.  */
   if (CONST_WIDE_INT_P (x))
-return false;
+return aarch64_mov128_immediate (x);
 
   /* Do not allow const (plus (anchor_symbol, const_int)).  */
   if (GET_CODE (x) == CONST)
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 
d14b57b0ef7f4eeca40bfdcaf3ebb02a1031cb99..382953e6ec42ae4475d66143be1e25d22e48571f
 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1023,9 +1023,9 @@ (define_expand "movti"
 
 (define_insn "*movti_aarch64"
   [(set (match_operand:TI 0
-"nonimmediate_operand"  "=r, w,r,w,r,m,m,w,m")
+"nonimmediate_operand"  "=   r,w, r,w,r,m,m,w,m")
(match_operand:TI 1
-"aarch64_movti_operand" " rn,r,w,w,m,r,Z,m,w"))]
+"aarch64_movti_operand" " rUti,r, w,w,m,r,Z,m,w"))]
   "(register_operand (operands[0], TImode)
 || aarch64_reg_or_zero (operands[1], TImode))"
   "@
diff --git a/gcc/

[PATCH] Fix PR83850

2018-01-15 Thread Richard Biener

The following fixes a typo to make gcc.target/i386/pr80846-1.c PASS again.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2018-01-15  Richard Biener  

PR middle-end/83850
* expmed.c (extract_bit_field_1): Fix typo.

Index: gcc/expmed.c
===
--- gcc/expmed.c(revision 256687)
+++ gcc/expmed.c(working copy)
@@ -1631,7 +1631,7 @@ extract_bit_field_1 (rtx str_rtx, poly_u
   if (VECTOR_MODE_P (GET_MODE (op0))
   && !MEM_P (op0)
   && VECTOR_MODE_P (tmode)
-  && known_eq (bitsize, GET_MODE_SIZE (tmode))
+  && known_eq (bitsize, GET_MODE_BITSIZE (tmode))
   && maybe_gt (GET_MODE_SIZE (GET_MODE (op0)), GET_MODE_SIZE (tmode)))
 {
   machine_mode new_mode = GET_MODE (op0);
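
The typo is a bytes-versus-bits mix-up: bitsize is measured in bits, so it has to be compared against the mode's size in bits. A tiny C model, with hypothetical helper names and a plain byte count standing in for a machine mode, shows why the old condition could not match for a 16-byte vector mode:

```c
#include <stdbool.h>

/* Hypothetical model: a mode is described only by its size in bytes.  */
static unsigned model_mode_size(unsigned mode_bytes)
{
    return mode_bytes;          /* size in bytes */
}

static unsigned model_mode_bitsize(unsigned mode_bytes)
{
    return mode_bytes * 8;      /* size in bits */
}

/* Old condition: compares a bit count with a byte count.  */
static bool matches_before_fix(unsigned bitsize, unsigned mode_bytes)
{
    return bitsize == model_mode_size(mode_bytes);
}

/* Fixed condition: both sides are in bits.  */
static bool matches_after_fix(unsigned bitsize, unsigned mode_bytes)
{
    return bitsize == model_mode_bitsize(mode_bytes);
}
```

Extracting 128 bits into a 16-byte mode matches only with the fixed comparison.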


Re: [PATCH 0/5] x86: CVE-2017-5715, aka Spectre

2018-01-15 Thread H.J. Lu
On Mon, Jan 15, 2018 at 12:31 AM, Richard Biener
 wrote:
> On Sun, Jan 14, 2018 at 4:08 PM, H.J. Lu  wrote:
>> Now my patch set has been checked into trunk.  Here is a patch set
>> to move struct ix86_frame to machine_function on GCC 7, which is
>> needed to backport the patch set to GCC 7:
>>
>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01239.html
>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01240.html
>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01241.html
>>
>> OK for gcc-7-branch?
>
> Yes, backporting is ok - please watch for possible fallout on trunk and make
> sure to adjust the backport accordingly.  I plan to do GCC 7.3 RC1 on
> Wednesday now with the final release about a week later if no issue shows
> up.
>

Backport is blocked by

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83838

There are many test failures due to lack of comdat support in linker on Solaris.
I can limit these tests to Linux.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83839

Bootstrap failed on Darwin due to lack of ".set" directive in assembler.  I
uploaded a patch:

https://gcc.gnu.org/bugzilla/attachment.cgi?id=43124

There is no confirmation on it.  Also there may be test failures on Darwin
due to difference in assembly output.

-- 
H.J.


[committed] Missing vect_double in gcc.dg/vect/pr79920.c (PR83836)

2018-01-15 Thread Richard Sandiford
Tested on aarch64-linux-gnu and x86_64-linux-gnu, also spot-tested
on sparc-sun-solaris2.11.  Installed as obvious.

Richard


2018-01-15  Richard Sandiford  

gcc/testsuite/
PR testsuite/79920
* gcc.dg/vect/pr79920.c: Restrict reduction test to vect_double.

Index: gcc/testsuite/gcc.dg/vect/pr79920.c
===
--- gcc/testsuite/gcc.dg/vect/pr79920.c 2018-01-13 18:01:15.294116882 +
+++ gcc/testsuite/gcc.dg/vect/pr79920.c 2018-01-15 12:38:14.908597619 +
@@ -41,5 +41,5 @@ int main()
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times {using an in-order \(fold-left\) 
reduction} 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times {using an in-order \(fold-left\) 
reduction} 1 "vect" { target vect_double } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { target { 
vect_double && { vect_perm && vect_hw_misalign } } } } } */


[PATCH] use shasum instead of sha512sum on FreeBSD and DragonFly

2018-01-15 Thread Jonathan Wakely

boru on Freenode's #gcc channel pointed out that
contrib/download_prerequisites should use shasum for FreeBSD, not
sha512sum (which comes from GNU coreutils on GNU/Linux).  I checked
FreeBSD 11.0 and 10.2 and neither has sha512sum, nor does DragonFly
4.2, another FreeBSD derivative.

OK for trunk?


diff --git a/contrib/download_prerequisites b/contrib/download_prerequisites
index ae0b5ffeb32..b50f47cda79 100755
--- a/contrib/download_prerequisites
+++ b/contrib/download_prerequisites
@@ -47,7 +47,7 @@ force=0
 OS=$(uname)
 
 case $OS in
-  "Darwin")
+  "Darwin"|"FreeBSD"|"DragonFly")
 chksum='shasum -a 512 --check'
   ;;
   *)


Re: [PATCH v2, rs6000] Add -msafe-indirect-jumps option and implement safe bctr / bctrl

2018-01-15 Thread Bill Schmidt
On Jan 15, 2018, at 3:46 AM, Richard Biener  wrote:
> 
> On Sun, Jan 14, 2018 at 5:53 AM, Bill Schmidt
>  wrote:
>> Hi,
>> 
>> [This patch supersedes and extends
>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01135.html.
>> There was a small error in the assembly code produced by that patch (bad
>> memory on my account of how to spell "crset eq").  I've also increased the
>> function provided; see below.]
>> 
>> This patch adds a new option for the compiler to produce only "safe" indirect
>> jumps, in the sense that these jumps are deliberately mispredicted to inhibit
>> speculative execution.  For now, this option is undocumented; this may change
>> at some future date.  It is intended eventually for the linker to also honor
>> this flag when creating PLT stubs, for example.
>> 
>> In addition to the new option, I've included changes to indirect calls for
>> the ELFv2 ABI when the option is specified.  In place of bctrl, we generate
>> a "crset eq" followed by a beqctrl-.  Using the CR0.eq bit is safe since CR0
>> is volatile over the call.
>> 
>> I've also added code to replace uses of bctr when the new option is 
>> specified,
>> with the sequence
>> 
>>crset 4x[CRb]+2
>>beqctr- CRb
>>b .
>> 
>> where CRb is an available condition register field.  This applies to all
>> subtargets, and in particular is not restricted to ELFv2.  The use cases
>> covered here are computed gotos and switch statements.
>> 
>> NOT yet covered by this patch: indirect calls for ELFv1.  That will come 
>> later.
>> 
>> Please let me know if there is a better way to represent the crset without
>> an unspec.  For the indirect jump, I don't see a way around it due to the
>> expected form of indirect jumps in cfganal.c.
>> 
>> Bootstrapped and tested on powerpc64-linux-gnu and powerpc64le-linux-gnu with
>> no regressions.  Is this okay for trunk?
> 
> As this sounds Spectre related feel free to backport this to the GCC 7 branch
> as well (even if you don't hit the Wednesday deadline for RC1 of GCC 7.3).

Thanks, Richard -- I hadn't seen this deadline announced, but I will do my best
to get this completed/committed by then.  Thanks for the heads-up!

Bill

> 
> Thanks,
> Richard.
> 
>> Thanks,
>> Bill
>> 
>> 
>> [gcc]
>> 
>> 2018-01-13  Bill Schmidt  
>> 
>>* config/rs6000/rs6000.c (rs6000_opt_vars): Add entry for
>>-msafe-indirect-jumps.
>>* config/rs6000/rs6000.md (UNSPEC_CRSET_EQ): New UNSPEC enum.
>>(UNSPEC_COMP_GOTO_CR): Likewise.
>>(*call_indirect_elfv2): Disable for -msafe-indirect-jumps.
>>(*call_indirect_elfv2_safe): New define_insn.
>>(*call_value_indirect_elfv2): Disable for
>>-msafe-indirect-jumps.
>>(*call_value_indirect_elfv2_safe): New define_insn.
>>(indirect_jump): Emit different RTL for -msafe-indirect-jumps.
>>(*indirect_jump): Disable for -msafe-indirect-jumps.
>>(*indirect_jump_safe): New define_insn.
>>(*set_cr_eq): New define_insn.
>>(tablejump): Emit different RTL for -msafe-indirect-jumps.
>>(tablejumpsi): Disable for -msafe-indirect-jumps.
>>(tablejumpsi_safe): New define_expand.
>>(tablejumpdi): Disable for -msafe-indirect-jumps.
>>(tablejumpdi_safe): New define_expand.
>>(*tablejump_internal1): Disable for -msafe-indirect-jumps.
>>(*tablejump_internal1_safe): New define_insn.
>>* config/rs6000/rs6000.opt (msafe-indirect-jumps): New option.
>> 
>> [gcc/testsuite]
>> 
>> 2018-01-13  Bill Schmidt  
>> 
>>* gcc.target/powerpc/safe-indirect-jump-1.c: New file.
>>* gcc.target/powerpc/safe-indirect-jump-2.c: New file.
>>* gcc.target/powerpc/safe-indirect-jump-3.c: New file.
>> 
>> 
>> Index: gcc/config/rs6000/rs6000.c
>> ===
>> --- gcc/config/rs6000/rs6000.c  (revision 256364)
>> +++ gcc/config/rs6000/rs6000.c  (working copy)
>> @@ -36726,6 +36726,9 @@ static struct rs6000_opt_var const rs6000_opt_vars
>>   { "sched-epilog",
>> offsetof (struct gcc_options, x_TARGET_SCHED_PROLOG),
>> offsetof (struct cl_target_option, x_TARGET_SCHED_PROLOG), },
>> +  { "safe-indirect-jumps",
>> +offsetof (struct gcc_options, x_rs6000_safe_indirect_jumps),
>> +offsetof (struct cl_target_option, x_rs6000_safe_indirect_jumps), },
>> };
>> 
>> /* Inner function to handle attribute((target("..."))) and #pragma GCC target
>> Index: gcc/config/rs6000/rs6000.md
>> ===
>> --- gcc/config/rs6000/rs6000.md (revision 256364)
>> +++ gcc/config/rs6000/rs6000.md (working copy)
>> @@ -150,6 +150,8 @@
>>UNSPEC_SIGNBIT
>>UNSPEC_SF_FROM_SI
>>UNSPEC_SI_FROM_SF
>> +   UNSPEC_CRSET_EQ
>> +   UNSPEC_COMP_GOTO_CR
>>   ])
>> 
>> ;;
>> @@ -11222,11 +11224,22 @@
>> (match_operand 1 "" "g,g"))
>>(set (reg:P TOC_REGNUM) (unspec:P [(match_operand:P 2 "const_int_operand" 
>> "n,n"

[PATCH][i386] Fix PR83546 - missing RDRND for -march=silvermont

2018-01-15 Thread Peryt, Sebastian
Hi,

This patch re-enables RDRND for Silvermont. It got lost in r206178, as pointed
out in the PR.
Bootstrapped and tested.

2018-01-15  Sebastian Peryt  

gcc/

PR target/83546
* config/i386/i386.c (ix86_option_override_internal): Add PTA_RDRND
to PTA_SILVERMONT.

2018-01-15  Sebastian Peryt  

gcc/testsuite/

PR target/83546
* gcc.target/i386/pr83546.c: New test.

Is it ok for trunk?

Sebastian


0001-PR83546.patch
Description: 0001-PR83546.patch


[PATCH][ARM] Use uxtb rN, rM, ror #8 to implement zero_extract on armv6.

2018-01-15 Thread Roger Sayle

I was hoping I could ask an ARM backend maintainer to look over the
following patch.

I was examining the code generated for the following C snippet on a
raspberry pi,

static inline int popcount_lut8(unsigned *buf, int n)
{
  int cnt=0;
  unsigned int i;
  do {
    i = *buf;
    cnt += lut[i&255];
    cnt += lut[i>>8&255];
    cnt += lut[i>>16&255];
    cnt += lut[i>>24];
    buf++;
  } while(--n);
  return cnt;
}

and was surprised to see the following instruction sequence generated by the
compiler:

   mov    r5, r2, lsr #8
   uxtb   r5, r5

This sequence can be performed by a single ARM instruction:

   uxtb   r5, r2, ror #8

The attached patch allows GCC's combine pass to take advantage of the ARM's
uxtb with rotate functionality to implement the above zero_extract, and
likewise to use the sxtb with rotate to implement sign_extract.  ARM's uxtb
and sxtb can only be used with rotates of 0, 8, 16 and 24, and of these only
the 8 and 16 are useful [ror #0 is a nop, and extends with ror #24 can be
implemented using regular shifts], so the approach here is to add the six
missing but useful instructions as 6 different define_insn in arm.md, rather
than try to be clever with new predicates.
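
The equivalence the patch relies on can be illustrated in plain C.  This is
only a sketch for the reader, not part of the patch: the helper names
(ror32, uxtb_ror, sxtb_ror, uxth_ror) are made up here, and they model the
extend-with-rotate instructions on 32-bit values rather than invoking them.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value right by n bits (n is 0, 8, 16 or 24 here).  */
static uint32_t ror32(uint32_t x, unsigned n)
{
  n &= 31;
  return n ? (x >> n) | (x << (32 - n)) : x;
}

/* Model of "uxtb rd, rm, ror #n": rotate, then zero-extend the low byte.  */
static uint32_t uxtb_ror(uint32_t x, unsigned n)
{
  return ror32(x, n) & 0xffu;
}

/* Model of "sxtb rd, rm, ror #n": rotate, then sign-extend the low byte.  */
static int32_t sxtb_ror(uint32_t x, unsigned n)
{
  uint32_t b = ror32(x, n) & 0xffu;
  return (int32_t)(b ^ 0x80u) - 0x80;
}

/* Model of "uxth rd, rm, ror #n": rotate, then zero-extend the low halfword.  */
static uint32_t uxth_ror(uint32_t x, unsigned n)
{
  return ror32(x, n) & 0xffffu;
}

/* The shift-then-extend forms from the test case agree with the
   rotate-then-extend forms for rotates of 8 and 16.  */
static void check_equivalence(uint32_t x)
{
  assert(uxtb_ror(x, 8) == ((x >> 8) & 0xffu));
  assert(uxtb_ror(x, 16) == ((x >> 16) & 0xffu));
  assert(uxth_ror(x, 8) == ((x >> 8) & 0xffffu));
  assert(sxtb_ror(x, 8) == (int32_t)(int8_t)(uint8_t)(x >> 8));
}
```

The rotate never brings in bits that the extend keeps, because an 8-bit
field at bit 8 or 16 (or a 16-bit field at bit 8) lies entirely inside the
original word.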

Alas, later ARM hardware has advanced bit field instructions, and earlier
ARM cores didn't support extend-with-rotate, so this appears to only benefit
armv6 era CPUs.

The following patch has been minimally tested by building cc1 of a
cross-compiler and confirming the desired instructions appear in the
assembly output for the test case.  Alas, my minimal Raspberry Pi hardware
is unlikely to be able to bootstrap gcc or run the testsuite, so I'm hoping
an ARM expert can check (and confirm) whether this change is safe and
suitable.  [Thanks in advance and apologies for any inconvenience].


2018-01-14  Roger Sayle  

* config/arm/arm.md (*arm_zeroextractsi2_8_8,
*arm_signextractsi2_8_8,
*arm_zeroextractsi2_8_16, *arm_signextractsi2_8_16,
*arm_zeroextractsi2_16_8, *arm_signextractsi2_16_8): New.

2018-01-14  Roger Sayle  

* gcc.target/arm/extend-ror.c: New test.


Cheers,
Roger
--
Roger Sayle, PhD.
NextMove Software Limited
Innovation Centre (Unit 23), Cambridge Science Park, Cambridge, CB4 0EY




arm_zext.log
Description: Binary data


arm_zext.patch
Description: Binary data

/* { dg-do compile } */
/* { dg-options "-O -march=armv6" } */
/* { dg-prune-output "switch .* conflicts with" } */

unsigned int zeroextractsi2_8_8(unsigned int x)
{
  return (unsigned char)(x>>8);
}

unsigned int zeroextractsi2_8_16(unsigned int x)
{
  return (unsigned char)(x>>16);
}

unsigned int signextractsi2_8_8(unsigned int x)
{
  return (int)(signed char)(x>>8);
}

unsigned int signextractsi2_8_16(unsigned int x)
{
  return (int)(signed char)(x>>16);
}

unsigned int zeroextractsi2_16_8(unsigned int x)
{
  return (unsigned short)(x>>8);
}

unsigned int signextractsi2_16_8(unsigned int x)
{
  return (int)(short)(x>>8);
}

/* { dg-final { scan-assembler-times ", ror #8" 4 } } */
/* { dg-final { scan-assembler-times ", ror #16" 2 } } */


[PATCH] PR libstdc++/83830 Define std::has_unique_object_representations_v

2018-01-15 Thread Jonathan Wakely

Add this missing C++17 variable template.

PR libstdc++/83830
* include/std/type_traits (has_unique_object_representations_v): Add
variable template.
* testsuite/20_util/has_unique_object_representations/value.cc: Check
variable template.

Tested powerpc64le-linux, committed to trunk.

commit 37f21e406cb84c041adb1537079ced465a79e4be
Author: Jonathan Wakely 
Date:   Mon Jan 15 13:23:53 2018 +

PR libstdc++/83830 Define std::has_unique_object_representations_v

PR libstdc++/83830
* include/std/type_traits (has_unique_object_representations_v): Add
variable template.
* testsuite/20_util/has_unique_object_representations/value.cc: Check
variable template.

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 43ea68e6c6b..711d6c50dd1 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -2903,6 +2903,10 @@ template 
   remove_cv_t>
   )>
 { };
+
+  template<typename _Tp>
+inline constexpr bool has_unique_object_representations_v
+  = has_unique_object_representations<_Tp>::value;
 #endif
 #undef _GLIBCXX_HAVE_BUILTIN_HAS_UNIQ_OBJ_REP
 
diff --git a/libstdc++-v3/testsuite/20_util/has_unique_object_representations/value.cc b/libstdc++-v3/testsuite/20_util/has_unique_object_representations/value.cc
index c2a5873ee69..7ac97cf0ba4 100644
--- a/libstdc++-v3/testsuite/20_util/has_unique_object_representations/value.cc
+++ b/libstdc++-v3/testsuite/20_util/has_unique_object_representations/value.cc
@@ -108,3 +108,17 @@ void test01()
   static_assert(test_category(false), "");
 }
+
+void
+test02()
+{
+  using std::has_unique_object_representations;
+  using std::has_unique_object_representations_v;
+
+  static_assert(has_unique_object_representations_v
+   == has_unique_object_representations::value);
+  static_assert(has_unique_object_representations_v
+   == has_unique_object_representations::value);
+  static_assert(has_unique_object_representations_v
+   == has_unique_object_representations::value);
+}


Re: [PATCH] C/C++: Add -Waddress-of-packed-member

2018-01-15 Thread H.J. Lu
On Mon, Jan 15, 2018 at 1:42 AM, Jakub Jelinek  wrote:
> On Sun, Jan 14, 2018 at 06:29:54AM -0800, H.J. Lu wrote:
>> +   if (TREE_CODE (field) == FIELD_DECL && DECL_PACKED (field))
>> + {
>> +   tree field_type = TREE_TYPE (field);
>> +   unsigned int type_align = TYPE_ALIGN (field_type);
>> +   tree context = DECL_CONTEXT (field);
>> +   unsigned int record_align = TYPE_ALIGN (context);
>> +   if ((record_align % type_align) != 0)
>> + return context;
>> +   type_align /= BITS_PER_UNIT;
>> +   unsigned HOST_WIDE_INT field_off
>> +  = (tree_to_uhwi (DECL_FIELD_OFFSET (field))
>> + + (tree_to_uhwi (DECL_FIELD_BIT_OFFSET (field))
>> +/ BITS_PER_UNIT));
>
> This has the same bug I've just created PR83844 for, you can't assume
> DECL_FIELD_OFFSET is INTEGER_CST that fits into UHWI, and also we have
> byte_position wrapper that should be used to compute the offset from
> DECL_FIELD_*OFFSET.

Here is the updated patch to use byte_position wrapper.  OK for trunk?


-- 
H.J.
From 2a26ed809f7af5f52a24367bfa0b29898ac7fa87 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Fri, 12 Jan 2018 21:12:05 -0800
Subject: [PATCH] C/C++: Add -Waddress-of-packed-member

When the address of a packed member of a struct or union is taken, it may
result in an unaligned pointer value.  This patch adds
-Waddress-of-packed-member to warn about it:

$ cat x.i
struct pair_t
{
  char c;
  int i;
} __attribute__ ((packed));

extern struct pair_t p;
int *addr = &p.i;
$ gcc -O2 -S x.i
x.i:8:13: warning: initialization of 'int *' from address of packed member of 'struct pair_t' may result in an unaligned pointer value [-Waddress-of-packed-member]
 int *addr = &p.i;
 ^
$

This warning is enabled by default.
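
Why the pointer can be unaligned is easy to see from the layout.  The sketch
below is illustrative only and is not taken from the patch or its tests: the
helper names member_i_is_misaligned and load_i are made up, and it assumes a
GCC-compatible compiler (for the packed attribute) on a target where int has
an alignment greater than 1.

```c
#include <stddef.h>
#include <string.h>

struct pair_t
{
  char c;
  int i;
} __attribute__ ((packed));

/* Nonzero when member i sits at an offset that is not a multiple of the
   alignment of int, i.e. when &p.i need not be a suitably aligned int *.  */
static int member_i_is_misaligned(void)
{
  return offsetof(struct pair_t, i) % _Alignof(int) != 0;
}

/* Read the packed member by copying its bytes, without ever forming an
   int * that points into the packed struct.  */
static int load_i(const struct pair_t *p)
{
  int v;
  memcpy(&v, (const char *) p + offsetof(struct pair_t, i), sizeof v);
  return v;
}
```

Accessing p.i directly stays fine, since the compiler knows the member is
packed; the problem only arises once the address escapes as a plain int *.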

gcc/

	PR c/51628
	* doc/invoke.texi: Document -Wno-address-of-packed-member.

gcc/c-family/

	PR c/51628
	* c-common.h (warn_for_address_of_packed_member): New.
	* c-warn.c (warn_for_address_of_packed_member): New function.
	* c.opt: Add -Wno-address-of-packed-member.

gcc/c/

	PR c/51628
	* c-typeck.c (convert_for_assignment): Call
	warn_for_address_of_packed_member.  Issue a warning if address
	of packed member is taken.

gcc/cp/

	PR c/51628
	* call.c (convert_for_arg_passing): Call
	warn_for_address_of_packed_member.  Issue a warning if address
	of packed member is taken.
	* typeck.c (convert_for_assignment): Likewise.

gcc/testsuite/

	PR c/51628
	* c-c++-common/pr51628-1.c: New tests.
	* c-c++-common/pr51628-2.c: Likewise.
	* c-c++-common/pr51628-3.c: Likewise.
	* c-c++-common/pr51628-4.c: Likewise.
	* c-c++-common/pr51628-5.c: Likewise.
	* c-c++-common/pr51628-6.c: Likewise.
	* c-c++-common/pr51628-7.c: Likewise.
	* c-c++-common/pr51628-8.c: Likewise.
	* c-c++-common/pr51628-9.c: Likewise.
	* gcc.dg/pr51628-10.c: Likewise.
	* gcc.dg/pr51628-11.c: Likewise.
	* c-c++-common/ubsan/align-10.c: Add -Wno-address-of-packed-member.
	* c-c++-common/ubsan/align-2.c: Likewise.
	* c-c++-common/ubsan/align-4.c: Likewise.
	* c-c++-common/ubsan/align-6.c: Likewise.
	* c-c++-common/ubsan/align-7.c: Likewise.
	* c-c++-common/ubsan/align-8.c: Likewise.
	* g++.dg/ubsan/align-2.C: Likewise.
---
 gcc/c-family/c-common.h |  1 +
 gcc/c-family/c-warn.c   | 56 +
 gcc/c-family/c.opt  |  4 +++
 gcc/c/c-typeck.c| 40 -
 gcc/cp/call.c   |  8 +
 gcc/cp/typeck.c | 41 +
 gcc/doc/invoke.texi | 11 --
 gcc/testsuite/c-c++-common/pr51628-1.c  | 29 +++
 gcc/testsuite/c-c++-common/pr51628-2.c  | 29 +++
 gcc/testsuite/c-c++-common/pr51628-3.c  | 35 ++
 gcc/testsuite/c-c++-common/pr51628-4.c  | 35 ++
 gcc/testsuite/c-c++-common/pr51628-5.c  | 35 ++
 gcc/testsuite/c-c++-common/pr51628-6.c  | 35 ++
 gcc/testsuite/c-c++-common/pr51628-7.c  | 29 +++
 gcc/testsuite/c-c++-common/pr51628-8.c  | 36 +++
 gcc/testsuite/c-c++-common/pr51628-9.c  | 36 +++
 gcc/testsuite/c-c++-common/ubsan/align-10.c |  2 +-
 gcc/testsuite/c-c++-common/ubsan/align-2.c  |  2 +-
 gcc/testsuite/c-c++-common/ubsan/align-4.c  |  2 +-
 gcc/testsuite/c-c++-common/ubsan/align-6.c  |  2 +-
 gcc/testsuite/c-c++-common/ubsan/align-7.c  |  2 +-
 gcc/testsuite/c-c++-common/ubsan/align-8.c  |  2 +-
 gcc/testsuite/g++.dg/ubsan/align-2.C|  2 +-
 gcc/testsuite/gcc.dg/pr51628-10.c   | 23 
 gcc/testsuite/gcc.dg/pr51628-11.c   | 26 ++
 25 files changed, 513 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/pr51628-1.c
 create mode 100644 gcc/testsuite/c-c++-common/pr51628-2.c
 create mode 100644 gcc/testsuite/c-c

Re: [PATCH, rs6000] Support for gimple folding of mergeh, mergel intrinsics

2018-01-15 Thread Segher Boessenkool
Hi Will,

On Fri, Jan 12, 2018 at 03:22:06PM -0600, Will Schmidt wrote:
>   Add support for gimple folding of the mergeh, mergel intrinsics.
> Since the merge low and merge high variants are almost identical, a
> new helper function has been added so that code can be shared.
> 
> This also adds define_insn for xxmrghw, xxmrglw instructions, allowing us
> to generate xxmrglw instead of vmrglw after folding.  A few whitespace
> fixes have been made to the existing vmrg?w defines.
> 
> The changes introduced here affect the existing target testcases
> gcc.target/powerpc/builtins-1-be.c and builtins-1-le.c, such that
> a number of the scan-assembler tests would fail due to instruction counts
> changing.  Since the purpose of that test is to primarily ensure those
> intrinsics are accepted by the compiler, I have disabled gimple-folding for
> the existing tests that count instructions, and created new variants of those
> tests with folding enabled and a higher optimization level, that do not count
> instructions.

> 2018-01-12  Will Schmidt  
> 
>   * config/rs6000/rs6000.c: (rs6000_gimple_builtin) Add gimple folding
>   support for merge[hl].

New line here.

>   (fold_mergehl_helper): New helper function.
>   * config/rs6000/altivec.md (altivec_xxmrghw_direct): New.
>   (altivec_xxmrglw_direct): New.

> +(define_insn "altivec_xxmrghw_direct"
> +  [(set (match_operand:V4SI 0 "register_operand" "=v")
> + (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
> +   (match_operand:V4SI 2 "register_operand" "v")]
> +  UNSPEC_VMRGH_DIRECT))]
> +  "TARGET_P8_VECTOR"
> +  "xxmrghw %x0,%x1,%x2"
> +  [(set_attr "type" "vecperm")])
> +
>  (define_insn "altivec_vmrghw_direct"
>[(set (match_operand:V4SI 0 "register_operand" "=v")
>  (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
>(match_operand:V4SI 2 "register_operand" "v")]
>   UNSPEC_VMRGH_DIRECT))]

How do these two differ?  The xx variant can write all 64 VSR registers,
it needs different constraints (wa?).  Can the two patterns be merged?
It doesn't need the TARGET_P8_VECTOR condition then: the constraints
will handle that.  And actually it is a v2.06 insn (p7)?

>  (define_insn "altivec_vmrglb_direct"
>[(set (match_operand:V16QI 0 "register_operand" "=v")
>  (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v")

This line should start with a tab as well?

> -(match_operand:V16QI 2 "register_operand" "v")]
> -  UNSPEC_VMRGL_DIRECT))]
> +(match_operand:V16QI 2 "register_operand" "v")]
> +   UNSPEC_VMRGL_DIRECT))]
>"TARGET_ALTIVEC"
>"vmrglb %0,%1,%2"
>[(set_attr "type" "vecperm")])


>  (define_insn "altivec_vmrglh_direct"
>[(set (match_operand:V8HI 0 "register_operand" "=v")
>  (unspec:V8HI [(match_operand:V8HI 1 "register_operand" "v")

Same here.

> (match_operand:V8HI 2 "register_operand" "v")]
> - UNSPEC_VMRGL_DIRECT))]
> +  UNSPEC_VMRGL_DIRECT))]
>"TARGET_ALTIVEC"
>"vmrglh %0,%1,%2"
>[(set_attr "type" "vecperm")])


> +(define_insn "altivec_xxmrglw_direct"
> +  [(set (match_operand:V4SI 0 "register_operand" "=v")
> + (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
> +   (match_operand:V4SI 2 "register_operand" "v")]
> +  UNSPEC_VMRGL_DIRECT))]
> +  "TARGET_P8_VECTOR"
> +  "xxmrglw %x0,%x1,%x2"
> +  [(set_attr "type" "vecperm")])

Exactly analogous to mrghw comments.

> +/* Helper function to handle the vector merge[hl] built-ins.  The
> + implementation difference between h and l versions for this code are in
> + the values used when building of the permute vector for high word versus
> + low word merge.  The variance is keyed off the use_high parameter.  */

The continuation lines should be indented by three spaces, so that the
text lines up.

> +static void
> +fold_mergehl_helper (gimple_stmt_iterator *gsi, gimple *stmt, int use_high)
> +{
> +  tree arg0 = gimple_call_arg (stmt, 0);
> +  tree arg1 = gimple_call_arg (stmt, 1);
> +  tree lhs = gimple_call_lhs (stmt);
> +  tree lhs_type = TREE_TYPE (lhs);
> +  tree lhs_type_type = TREE_TYPE (lhs_type);
> +  gimple *g;
> +  int n_elts = TYPE_VECTOR_SUBPARTS (lhs_type);
> +  vec<constructor_elt, va_gc> *ctor_elts = NULL;
> +  int midpoint = n_elts / 2;
> +  int offset = 0;
> +  if (use_high == 1)
> +offset = midpoint;
> +  for (int i = 0; i < midpoint; i++)
> +{
> +  tree tmp1 = build_int_cst (lhs_type_type, offset + i);
> +  tree tmp2 = build_int_cst (lhs_type_type, offset + n_elts + i);
> +  CONSTRUCTOR_APPEND_ELT (ctor_elts, NULL_TREE, tmp1);
> +  CONSTRUCTOR_APPEND_ELT (ctor_elts, NULL_TREE, tmp2);
> +}
> +  tree permute = create_tmp_reg_or_ssa_name (lhs_type);
> +  g = gimple_build_assign (permute, build_constructor (lhs_type, ctor_elts));
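
For readers following the index arithmetic in the quoted loop, here is a
standalone C sketch of the same computation.  It is not GCC code: the names
build_merge_selector and apply_permute are hypothetical, plain arrays stand
in for vectors, and the selector convention (indices below n_elts pick from
the first input, the rest from the second) is an assumption made for the
illustration.

```c
#include <stddef.h>

/* Fill sel[0 .. n_elts-1] with interleaving indices, mirroring the loop in
   the quoted helper: pairs (offset + i, offset + n_elts + i).  */
static void build_merge_selector(unsigned n_elts, int use_high, unsigned *sel)
{
  unsigned midpoint = n_elts / 2;
  unsigned offset = use_high ? midpoint : 0;

  for (unsigned i = 0; i < midpoint; i++)
    {
      sel[2 * i] = offset + i;
      sel[2 * i + 1] = offset + n_elts + i;
    }
}

/* Apply a selector to two n_elts-element inputs: an index below n_elts
   picks from a, anything else picks from b.  */
static void apply_permute(const int *a, const int *b, const unsigned *sel,
                          unsigned n_elts, int *out)
{
  for (unsigned i = 0; i < n_elts; i++)
    out[i] = sel[i] < n_elts ? a[sel[i]] : b[sel[i] - n_elts];
}
```

With n_elts = 4 the two selectors are {0, 4, 1, 5} and {2, 6, 3, 7}, so the
only difference between the variants is the starting offset.  Which of them
corresponds to "high" in the ISA sense depends on element numbering and
endianness, which this sketch does not model.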

Don't group gather loads (PR83847)

2018-01-15 Thread Richard Sandiford
In the testcase we were trying to group two gather loads, even though
that isn't supported.  Fixed by explicitly disallowing grouping of
gathers and scatters.

This problem didn't show up on SVE because there we convert to
IFN_GATHER_LOAD/IFN_SCATTER_STORE pattern statements, which fail
the can_group_stmts_p check.
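
As background, the difference between the two access shapes can be sketched
in C.  This is only an illustration with made-up names (gather_sum,
grouped_sums, struct pair); it says nothing about how the vectorizer
represents either case internally.

```c
#include <stddef.h>

/* Gather-style access: every element is loaded through its own index, so
   the addresses touched by consecutive iterations are unrelated.  */
static long gather_sum(const int *base, const size_t *idx, size_t n)
{
  long sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += base[idx[i]];
  return sum;
}

struct pair { int a; int b; };

/* Grouped-style access: the two loads in each iteration are adjacent in
   memory and advance by a fixed stride.  */
static void grouped_sums(const struct pair *p, size_t n, long *sa, long *sb)
{
  *sa = 0;
  *sb = 0;
  for (size_t i = 0; i < n; i++)
    {
      *sa += p[i].a;
      *sb += p[i].b;
    }
}
```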

Tested on x86_64-linux-gnu.  OK to install?

Richard


2018-01-15  Richard Sandiford  

gcc/
* tree-vect-data-refs.c (vect_analyze_data_ref_accesses): Don't
group gathers and scatters.

gcc/testsuite/
* gcc.dg/torture/pr83847.c: New test.

Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   2018-01-13 18:02:00.948360274 +
+++ gcc/tree-vect-data-refs.c   2018-01-15 12:22:47.066621712 +
@@ -2923,7 +2923,8 @@ vect_analyze_data_ref_accesses (vec_info
   data_reference_p dra = datarefs_copy[i];
   stmt_vec_info stmtinfo_a = vinfo_for_stmt (DR_STMT (dra));
   stmt_vec_info lastinfo = NULL;
-  if (! STMT_VINFO_VECTORIZABLE (stmtinfo_a))
+  if (!STMT_VINFO_VECTORIZABLE (stmtinfo_a)
+ || STMT_VINFO_GATHER_SCATTER_P (stmtinfo_a))
{
  ++i;
  continue;
@@ -2932,7 +2933,8 @@ vect_analyze_data_ref_accesses (vec_info
{
  data_reference_p drb = datarefs_copy[i];
  stmt_vec_info stmtinfo_b = vinfo_for_stmt (DR_STMT (drb));
- if (! STMT_VINFO_VECTORIZABLE (stmtinfo_b))
+ if (!STMT_VINFO_VECTORIZABLE (stmtinfo_b)
+ || STMT_VINFO_GATHER_SCATTER_P (stmtinfo_b))
break;
 
  /* ???  Imperfect sorting (non-compatible types, non-modulo
Index: gcc/testsuite/gcc.dg/torture/pr83847.c
===
--- /dev/null   2018-01-12 06:40:27.684409621 +
+++ gcc/testsuite/gcc.dg/torture/pr83847.c  2018-01-15 12:22:47.064621805 +
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=bdver4" { target i?86-*-* x86_64-*-* } } */
+
+typedef struct {
+  struct {
+int a;
+int b;
+  } c;
+} * d;
+typedef struct {
+  unsigned e;
+  d f[];
+} g;
+g h;
+d *k;
+int i(int j) {
+  if (j) {
+*k = *h.f;
+return 1;
+  }
+  return 0;
+}
+int l;
+int m;
+int n;
+d o;
+void p() {
+  for (; i(l); l++) {
+n += o->c.a;
+m += o->c.b;
+  }
+}


Re: [PATCH] Fix PR83435

2018-01-15 Thread Szabolcs Nagy
On 11/01/18 13:41, Richard Biener wrote:
> 2018-01-11  Richard Biener  
> 
>   PR tree-optimization/83435
>   * graphite.c (canonicalize_loop_form): Ignore fake loop exit edges.
>   * graphite-scop-detection.c (scop_detection::get_sese): Likewise.
>   * tree-vrp.c (add_assert_info): Drop TREE_OVERFLOW if they appear.
> 
>   * gcc.dg/graphite/pr83435.c: New testcase.

this test case fails on baremetal targets for me with

xgcc: error: unrecognized command line option '-pthread'


> Index: gcc/testsuite/gcc.dg/graphite/pr83435.c
> ===
> --- gcc/testsuite/gcc.dg/graphite/pr83435.c   (nonexistent)
> +++ gcc/testsuite/gcc.dg/graphite/pr83435.c   (working copy)
> @@ -0,0 +1,25 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -ftree-parallelize-loops=2 -floop-parallelize-all" } */
> +
> +int yj, ax;
> +
> +void
> +gf (signed char mp)
> +{
> +  int *dh = &yj;
> +
> +  for (;;)
> +{
> +  signed char sb;
> +
> +  for (sb = 0; sb < 1; sb -= 8)
> + {
> + }
> +
> +  mp &= mp <= sb;
> +  if (mp == 0)
> + dh = &ax;
> +  mp = 0;
> +  *dh = 0;
> +}
> +}
> 



Re: [PATCH][i386] Fix PR83546 - missing RDRND for -march=silvermont

2018-01-15 Thread Uros Bizjak
On Mon, Jan 15, 2018 at 2:50 PM, Peryt, Sebastian
 wrote:
> Hi,
>
> This patch re-enables RDRND for Silvermont. It got lost in r206178, as pointed
> out in the PR.
> Bootstrapped and tested.
>
> 2018-01-15  Sebastian Peryt  
>
> gcc/
>
> PR target/83546
> * config/i386/i386.c (ix86_option_override_internal): Add PTA_RDRND
> to PTA_SILVERMONT.
>
> 2018-01-15  Sebastian Peryt  
>
> gcc/testsuite/
>
> PR target/83546
> * gcc.target/i386/pr83546.c: New test.
>
> Is it ok for trunk?

OK.

Thanks,
Uros.


Re: [PATCH v2, rs6000] Add -msafe-indirect-jumps option and implement safe bctr / bctrl

2018-01-15 Thread Segher Boessenkool
Hi!

On Sat, Jan 13, 2018 at 10:53:57PM -0600, Bill Schmidt wrote:
> This patch adds a new option for the compiler to produce only "safe" indirect
> jumps, in the sense that these jumps are deliberately mispredicted to inhibit
> speculative execution.  For now, this option is undocumented; this may change
> at some future date.  It is intended eventually for the linker to also honor
> this flag when creating PLT stubs, for example.

I think we settled on calling the option -mmispredict-indirect-jumps;
please let me know if you still agree with that.  Or have thought of a
better name :-)

> In addition to the new option, I've included changes to indirect calls for
> the ELFv2 ABI when the option is specified.  In place of bctrl, we generate
> a "crset eq" followed by a beqctrl-.  Using the CR0.eq bit is safe since CR0
> is volatile over the call.

And CR0 is unused by the call; compare to CR1 (on older ABIs) for example.

> I've also added code to replace uses of bctr when the new option is specified,
> with the sequence
> 
>   crset 4x[CRb]+2
>   beqctr- CRb
>   b .
> 
> where CRb is an available condition register field.  This applies to all
> subtargets, and in particular is not restricted to ELFv2.  The use cases
> covered here are computed gotos and switch statements.
> 
> NOT yet covered by this patch: indirect calls for ELFv1.  That will come 
> later.

Would be nice to have it for all ABIs, even.
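
For context, the two source-level constructs named in the quoted text look
like this in C.  The sketch is illustrative only: the function names are
made up, the computed goto uses the GNU "labels as values" extension, and
whether a given switch is actually lowered to a jump table and an indirect
branch depends on the target and the optimization settings.

```c
/* A dense switch; a compiler may implement this with a jump table and an
   indirect branch.  */
static int dispatch_switch(int op, int x)
{
  switch (op)
    {
    case 0: return x + 1;
    case 1: return x - 1;
    case 2: return x * 2;
    case 3: return x / 2;
    case 4: return -x;
    default: return x;
    }
}

/* A computed goto (GNU extension): the branch target is read from a table
   of label addresses at run time.  */
static int dispatch_goto(int op, int x)
{
  static void *const targets[] = { &&do_inc, &&do_dec, &&do_dbl };

  if (op < 0 || op > 2)
    return x;
  goto *targets[op];

do_inc:
  return x + 1;
do_dec:
  return x - 1;
do_dbl:
  return x * 2;
}
```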

> Please let me know if there is a better way to represent the crset without
> an unspec.

See the various patterns using cr%q.  I'm not sure they can generate creqv
(i.e. crset) currently, but that could be added (like crnot is there already,
for example).  If you don't use unspec (or maybe unspec_volatile) it can be
optimised away though.

Maybe it is best not to put the crset into its own insn, just make it part
of the bigger pattern, with an appropriate clobber?

> For the indirect jump, I don't see a way around it due to the
> expected form of indirect jumps in cfganal.c.

I'm not sure what you are getting at here, could you explain a bit?

>  (define_expand "indirect_jump"
> -  [(set (pc) (match_operand 0 "register_operand"))])
> +  [(set (pc) (match_operand 0 "register_operand"))]
> + ""
> +{
> +  /* We need to reserve a CR when forcing a mispredicted jump.  */
> +  if (rs6000_safe_indirect_jumps) {
> +rtx ccreg = gen_reg_rtx (CCmode);
> +emit_insn (gen_rtx_SET (ccreg,
> + gen_rtx_UNSPEC (CCmode,
> + gen_rtvec (1, const0_rtx),
> + UNSPEC_CRSET_EQ)));
> +rtvec v = rtvec_alloc (2);
> +RTVEC_ELT (v, 0) = operands[0];
> +RTVEC_ELT (v, 1) = ccreg;
> +emit_jump_insn (gen_rtx_SET (pc_rtx,
> +  gen_rtx_UNSPEC (Pmode, v,
> +  UNSPEC_COMP_GOTO_CR)));
> +DONE;
> +  }
> +})
>  
>  (define_insn "*indirect_jump"
>[(set (pc)
>   (match_operand:P 0 "register_operand" "c,*l"))]
> -  ""
> +  "!rs6000_safe_indirect_jumps"
>"b%T0"
>[(set_attr "type" "jmpreg")])
>  
> +(define_insn "*indirect_jump_safe"
> +  [(set (pc)
> + (unspec:P [(match_operand:P 0 "register_operand" "c,*l")
> +(match_operand:CC 1 "cc_reg_operand" "y,y")]
> +UNSPEC_COMP_GOTO_CR))]
> +  "rs6000_safe_indirect_jumps"
> +  "beq%T0- %1\;b ."
> +  [(set_attr "type" "jmpreg")
> +   (set_attr "length" "8")])
> +
> +(define_insn "*set_cr_eq"
> +  [(set (match_operand:CC 0 "cc_reg_operand" "=y")
> + (unspec:CC [(const_int 0)] UNSPEC_CRSET_EQ))]
> +  "rs6000_safe_indirect_jumps"
> +  "crset %E0"
> +  [(set_attr "type" "cr_logical")])

So merge this latter insn into the previous, making the CC a clobber?
Like (not tested):

+(define_insn "indirect_jump_mispredict"
+  [(set (pc)
+   (match_operand:P 0 "register_operand" "c,*l"))
+   (clobber (match_operand:CC 1 "cc_reg_operand" "y,y"))]
+  "rs6000_safe_indirect_jumps"
+  "crset %E1\;beq%T0- %1\;b ."
+  [(set_attr "type" "jmpreg")
+   (set_attr "length" "12")])

and then change the indirect_jump pattern to simply select between the normal
and mispredict patterns?

> +(define_expand "tablejumpsi_safe"

And then similar for tablejump.


Segher


[PATCH] i386: Don't use ASM_OUTPUT_DEF for TARGET_MACHO

2018-01-15 Thread H.J. Lu
ASM_OUTPUT_DEF isn't defined for TARGET_MACHO.  Use ASM_OUTPUT_LABEL to
generate the __x86_return_thunk label, instead of the set directive.
Update testcase to remove the __x86_return_thunk label check.  Since
-fno-pic is ignored on Darwin, update testcases to scan for "push"
only on Linux.

Tested with a cross compiler to x86_64-apple-darwin10.4.0.  OK for
trunk?

H.J.
---
gcc/

PR target/83839
* config/i386/i386.c (output_indirect_thunk_function): Use
ASM_OUTPUT_LABEL, instead of ASM_OUTPUT_DEF, for TARGET_MACHO
for __x86_return_thunk.

gcc/testsuite/

PR target/83839
* gcc.target/i386/indirect-thunk-1.c: Scan for "push" only on
Linux.
* gcc.target/i386/indirect-thunk-2.c: Likewise.
* gcc.target/i386/indirect-thunk-3.c: Likewise.
* gcc.target/i386/indirect-thunk-4.c: Likewise.
* gcc.target/i386/indirect-thunk-7.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-1.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-2.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-5.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-6.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-7.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-1.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-2.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-3.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-4.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-7.c: Likewise.
* gcc.target/i386/indirect-thunk-register-1.c: Likewise.
* gcc.target/i386/indirect-thunk-register-3.c: Likewise.
* gcc.target/i386/indirect-thunk-register-4.c: Likewise.
* gcc.target/i386/ret-thunk-10.c: Likewise.
* gcc.target/i386/ret-thunk-11.c: Likewise.
* gcc.target/i386/ret-thunk-12.c: Likewise.
* gcc.target/i386/ret-thunk-13.c: Likewise.
* gcc.target/i386/ret-thunk-14.c: Likewise.
* gcc.target/i386/ret-thunk-15.c: Likewise.
* gcc.target/i386/ret-thunk-9.c: Don't check the
__x86_return_thunk label.
Scan for "push" only for Linux.
---
 gcc/config/i386/i386.c  | 3 ++-
 gcc/testsuite/gcc.target/i386/indirect-thunk-1.c| 2 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-2.c| 2 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-3.c| 2 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-4.c| 2 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-7.c| 2 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-attr-1.c   | 2 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-attr-2.c   | 2 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-attr-3.c   | 2 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-attr-4.c   | 2 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-attr-5.c   | 2 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-attr-6.c   | 2 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-attr-7.c   | 2 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-1.c| 2 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-2.c| 2 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-extern-1.c | 2 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-extern-2.c | 2 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-extern-3.c | 2 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-extern-4.c | 2 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-extern-7.c | 2 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-inline-1.c | 2 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-inline-2.c | 2 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-inline-3.c | 2 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-inline-4.c | 2 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-inline-7.c | 2 +-
 gcc/testsuite/gcc.target/i386/ret-thunk-10.c| 2 +-
 gcc/testsuite/gcc.target/i386/ret-thunk-11.c| 2 +-
 gcc/testsuite/gcc.target/i386/ret-thunk-13.c| 2 +-
 gcc/testsuite/gcc.target/i386/ret-thunk-14.c| 2 +-
 gcc/testsuite/gcc.target/i386/ret-thunk-15.c| 2 +-
 gcc/testsuite/gcc.target/i386/ret-thunk-9.c | 3 +--
 31 files changed, 32 insertions(+), 32 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 5e4f845a1bd..bfb31db8752 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -10970,7 +10970,6 @@ output_indirect_thunk_function (bool need_bnd_p, int regno)
   char alias[32];
 
   indirect_thunk_name (alias, regno, need_bnd_p, true);
-  ASM_OUTPUT_DEF (asm_out_file, alias, name);
 #if TARGET_MACHO
   if (TARGET_MACHO)
{
@@ -10979,8 +10978,10 @@ output_indirect_thunk_function (bool need_bnd_p, int regno)
  fputs ("\n\t.private_extern\t", asm_out_file);
  assemble_name (asm_out_file, alias);
  putc ('\n', asm_out_file);
+ ASM_OUTPUT_LABEL (asm_out_file, alias);
}
 #else
+  ASM_OUTPUT_DEF (asm_out_file, alias, 

Re: [PATCH 0/5] x86: CVE-2017-5715, aka Spectre

2018-01-15 Thread H.J. Lu
On Mon, Jan 15, 2018 at 3:38 AM, H.J. Lu  wrote:
> On Mon, Jan 15, 2018 at 12:31 AM, Richard Biener
>  wrote:
>> On Sun, Jan 14, 2018 at 4:08 PM, H.J. Lu  wrote:
>>> Now my patch set has been checked into trunk.  Here is a patch set
>>> to move struct ix86_frame to machine_function on GCC 7, which is
>>> needed to backport the patch set to GCC 7:
>>>
>>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01239.html
>>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01240.html
>>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01241.html
>>>
>>> OK for gcc-7-branch?
>>
>> Yes, backporting is ok - please watch for possible fallout on trunk and make
>> sure to adjust the backport accordingly.  I plan to do GCC 7.3 RC1 on
>> Wednesday now with the final release about a week later if no issue shows
>> up.
>>
>
> Backport is blocked by
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83838
>
> There are many test failures due to lack of comdat support in linker on 
> Solaris.
> I can limit these tests to Linux.

These are testcase issues and shouldn't block backport to GCC 7.

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83839
>
> Bootstrap failed on Darwin due to lack of ".set" directive in assembler.  I
> uploaded a patch:
>
> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43124
>
> There is no confirmation on it.  Also there may be test failures on Darwin
> due to difference in assembly output.

I posted a patch for Darwin build:

https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01347.html

This needs to be checked into trunk before I can start backport to GCC 7.

-- 
H.J.


Re: [PATCH 3/5] x86: Add -mindirect-branch-register

2018-01-15 Thread Uros Bizjak
On Mon, Jan 15, 2018 at 4:05 AM, H.J. Lu  wrote:
> On Sun, Jan 14, 2018 at 1:23 PM, H.J. Lu  wrote:
>> On Sun, Jan 14, 2018 at 10:52 AM, Uros Bizjak  wrote:
>>> On Sun, Jan 14, 2018 at 7:08 PM, H.J. Lu  wrote:
>>>> On Sun, Jan 14, 2018 at 9:51 AM, Uros Bizjak  wrote:
>>>>> -  (ior (and (not (match_test "TARGET_X32"))
>>>>> +  (ior (and (not (match_test "TARGET_X32
>>>>> +  || ix86_indirect_branch_thunk_register"))
>>>>>  (match_operand 0 "sibcall_memory_operand"))
>>>>> -   (and (match_test "TARGET_X32 && Pmode == DImode")
>>>>> +   (and (match_test "TARGET_X32 && Pmode == DImode
>>>>> + && !ix86_indirect_branch_thunk_register")
>>>>>  (match_operand 0 "GOT_memory_operand"
>>>>>
>>>>> Is this patch just trying to disable the predicate when
>>>>> ix86_indirect_branch_thunk_register is set? Because this is what this
>>>>> convoluted logic does.
>>>>
>>>> Yes, we want to disable all indirect branch via memory with
>>>> -mindirect-branch-register, just like -mx32.   We could do
>>>>
>>>> #define TARGET_INDIRECT_BRANCH_REGISTER \
>>>>  (TARGET_X32 ||  ix86_indirect_branch_thunk_register)
>>>
>>> Index: predicates.md
>>> ===
>>> --- predicates.md   (revision 25)
>>> +++ predicates.md   (working copy)
>>> @@ -710,11 +710,10 @@
>>>(ior (match_test "constant_call_address_operand
>>>  (op, mode == VOIDmode ? mode : Pmode)")
>>> (match_operand 0 "call_register_no_elim_operand")
>>> -   (ior (and (not (match_test "TARGET_X32
>>> -  || ix86_indirect_branch_thunk_register"))
>>> +   (and (not (match_test "ix86_indirect_branch_thunk_register"))
>>> +   (ior (and (not (match_test "TARGET_X32")))
>>>  (match_operand 0 "memory_operand"))
>>> -   (and (match_test "TARGET_X32 && Pmode == DImode
>>> - && !ix86_indirect_branch_thunk_register")
>>> +(and (match_test "TARGET_X32 && Pmode == DImode")
>>>  (match_operand 0 "GOT_memory_operand")
>>>
>>> or something like that.
>>>
>>
>> I am testing this patch.  OK for trunk if there is no regression?
>>
>
> Here is the updated patch.  Tested on i686 and x86-64.  OK for
> trunk?

There are two of the same issues in constraints.md

On further inspection, there are several new
ix86_indirect_branch_thunk_register conditions sprinkled around
predicates.md. The one in indirect_branch_operand is understandable
(but should be written as:

  (ior (match_operand 0 "register_operand")
       (and (not (match_test "ix86_indirect_branch_thunk_register"))
            (not (match_test "TARGET_X32"))
            (match_operand 0 "memory_operand")))
)

but the ones in GOT_memory_operand and GOT32_symbol_operand should
*not* be there, since these are simple pattern matches. Now we have a
situation where e.g. the call_got_x32 and sibcall_got_32 patterns never
match, and should be disabled with
ix86_indirect_branch_thunk_register. Please move the
ix86_indirect_branch_thunk_register conditions out of these two
predicates.
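
As a sanity check on the factoring suggested in the predicates.md hunk
quoted earlier in this message, the two forms of the memory-operand part of
the predicate can be compared exhaustively.  This is a boolean model only,
with made-up function names; the five flags stand in for TARGET_X32,
ix86_indirect_branch_thunk_register, Pmode == DImode and the two operand
matches, and the rest of the predicate is left out.

```c
#include <stdbool.h>

/* The form in the patch under review: the thunk-register test is folded
   into both match_test strings.  */
static bool pred_original(bool x32, bool thunk_reg, bool dimode,
                          bool mem, bool got)
{
  return (!(x32 || thunk_reg) && mem)
         || ((x32 && dimode && !thunk_reg) && got);
}

/* The suggested form: test the thunk-register flag once, outside.  */
static bool pred_factored(bool x32, bool thunk_reg, bool dimode,
                          bool mem, bool got)
{
  return !thunk_reg && ((!x32 && mem) || ((x32 && dimode) && got));
}

/* Count the input combinations on which the two forms disagree.  */
static int count_mismatches(void)
{
  int mismatches = 0;

  for (int bits = 0; bits < 32; bits++)
    {
      bool x32 = bits & 1, thunk_reg = bits & 2, dimode = bits & 4;
      bool mem = bits & 8, got = bits & 16;

      if (pred_original(x32, thunk_reg, dimode, mem, got)
          != pred_factored(x32, thunk_reg, dimode, mem, got))
        mismatches++;
    }
  return mismatches;
}
```

Under this model the two forms agree on all 32 combinations, so the
rewrite changes readability rather than behaviour.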

Uros.


Re: [PATCH, rs6000] Executable tests for -msafe-indirect-jumps

2018-01-15 Thread Segher Boessenkool
Hi!

On Sun, Jan 14, 2018 at 11:34:06AM -0600, Bill Schmidt wrote:
> It was pointed out off-list that I should add some executable tests for
> the new -msafe-indirect-jumps implementation.  This patch adds three
> such tests to demonstrate correct behavior.
> 
> Tested on powerpc64-linux-gnu and powerpc64le-linux-gnu.  Are these tests
> okay for trunk after the other patch is approved?

These look fine, so sure.  One nit:

> --- gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-4.c   (nonexistent)
> +++ gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-4.c   (working copy)
> @@ -0,0 +1,25 @@
> +/* { dg-do run { target { powerpc64le-*-* } } } */
> +/* { dg-additional-options "-msafe-indirect-jumps" } */

You could as well run all these tests on powerpc*-*-* as far as I see?
Or does that -m error if there is no "safe" implementation for the current
target?


Segher


Re: [PATCH, rs6000] Executable tests for -msafe-indirect-jumps

2018-01-15 Thread Bill Schmidt
On Jan 15, 2018, at 11:05 AM, Segher Boessenkool  
wrote:
> 
> Hi!
> 
> On Sun, Jan 14, 2018 at 11:34:06AM -0600, Bill Schmidt wrote:
>> It was pointed out off-list that I should add some executable tests for
>> the new -msafe-indirect-jumps implementation.  This patch adds three
>> such tests to demonstrate correct behavior.
>> 
>> Tested on powerpc64-linux-gnu and powerpc64le-linux-gnu.  Are these tests
>> okay for trunk after the other patch is approved?
> 
> These look fine, so sure.  One nit:
> 
>> --- gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-4.c  (nonexistent)
>> +++ gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-4.c  (working copy)
>> @@ -0,0 +1,25 @@
>> +/* { dg-do run { target { powerpc64le-*-* } } } */
>> +/* { dg-additional-options "-msafe-indirect-jumps" } */
> 
> You could as well run all these tests on powerpc*-*-* as far as I see?
> Or does that -m error if there is no "safe" implementation for the current
> target?

Ah, yes, certainly.  The compile-only tests can't, but the execution ones can.
I'll fix that in the next revision.  Thanks!

Bill
> 
> 
> Segher
> 



Re: [PATCH 2/2] Fix unstable sort

2018-01-15 Thread Cory Fields
Thanks!

Cory

On Jan 15, 2018 1:06 AM, "Jeff Law"  wrote:

> On 01/12/2018 01:58 PM, li...@coryfields.com wrote:
> > From: Cory Fields 
> >
> > 2018-01-12  Cory Fields  
> >* tree-ira.c (allocno_hard_regs_compare): stabilize sort
> Thanks.  I fixed the ChangeLog entry and installed the patch on the trunk.
>
> jeff
>


Re: [PATCH, rs6000] Support for gimple folding of mergeh, mergel intrinsics

2018-01-15 Thread Will Schmidt
On Mon, 2018-01-15 at 10:24 +, Richard Sandiford wrote:
> >> +  for (int i = 0; i < midpoint; i++)
> >> +{
> >> +  tree tmp1 = build_int_cst (lhs_type_type, offset + i);
> >> +  tree tmp2 = build_int_cst (lhs_type_type, offset + n_elts + i);
> >> +  CONSTRUCTOR_APPEND_ELT (ctor_elts, NULL_TREE, tmp1);
> >> +  CONSTRUCTOR_APPEND_ELT (ctor_elts, NULL_TREE, tmp2);
> >> +}
> >> +  tree permute = create_tmp_reg_or_ssa_name (lhs_type);
> >> +  g = gimple_build_assign (permute, build_constructor (lhs_type, ctor_elts));
> >
> > I think this is no longer canonical GIMPLE (Richard?)
> 
> FWIW, although the recent patches added the option of using wider
> permute vectors if the permute vector is constant, it's still OK to
> use the original style of permute vectors if that's known to be valid.
> In this case it is because we know the indices won't wrap, given the
> size of the input vectors.

Ok.

> > and given it is also a constant you shouldn't emit a CONSTRUCTOR here
> > but directly construct the appropriate VECTOR_CST.  So it looks like
> > the mergel/h intrinsics interleave the low or high part of two
> > vectors? 

Right, it is an interleaving of the two vectors.  The size and contents
vary depending on the type, and though I briefly considered building up
an if/else table, this approach was far simpler (and less error prone for
me) to code up.
i.e. (int, mergel)   (permute) D.2885 = {0, 4, 1, 5};
(long long, mergel)  (permute) D.2876 = {1, 3};
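
As a rough illustration of the index pattern produced by the quoted loop,
here is a minimal standalone sketch in plain C.  It is not the patch's
code: the function name merge_permute_indices is hypothetical, and
because the excerpt above does not show how "offset" is chosen, it is
simply taken as a parameter.

```c
#include <stddef.h>

/* Hypothetical helper, not from the patch.  Writes the permute indices
   that interleave two N_ELTS-element vectors, starting at OFFSET, into
   OUT.  Indices 0 .. n_elts-1 select from the first input vector and
   n_elts .. 2*n_elts-1 from the second.  Returns the number of indices
   written, which is 2 * (n_elts / 2).  */
size_t
merge_permute_indices (int n_elts, int offset, int *out)
{
  int midpoint = n_elts / 2;
  size_t n = 0;

  for (int i = 0; i < midpoint; i++)
    {
      out[n++] = offset + i;          /* element from the first vector */
      out[n++] = offset + n_elts + i; /* matching element from the second */
    }
  return n;
}
```

With n_elts = 4 and offset = 0 this yields {0, 4, 1, 5}, and with
n_elts = 2 and offset = 1 it yields {1, 3}, matching the two dumps above.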

Thanks
-Will




Re: [PATCH, rs6000] Support for gimple folding of mergeh, mergel intrinsics

2018-01-15 Thread Will Schmidt
On Mon, 2018-01-15 at 09:08 -0600, Segher Boessenkool wrote:
> Hi Will,
> 
> On Fri, Jan 12, 2018 at 03:22:06PM -0600, Will Schmidt wrote:
> >   Add support for gimple folding of the mergeh, mergel intrinsics.
> > Since the merge low and merge high variants are almost identical, a
> > new helper function has been added so that code can be shared.
> > 
> > This also adds define_insn for xxmrghw, xxmrglw instructions, allowing us
> > to generate xxmrglw instead of vmrglw after folding.  A few whitespace
> > fixes have been made to the existing vmrg?w defines.
> > 
> > The changes introduced here affect the existing target testcases
> > gcc.target/powerpc/builtins-1-be.c and builtins-1-le.c, such that
> > a number of the scan-assembler tests would fail due to instruction counts
> > changing.  Since the purpose of that test is to primarily ensure those
> > intrinsics are accepted by the compiler, I have disabled gimple-folding for
> > the existing tests that count instructions, and created new variants of those
> > tests with folding enabled and a higher optimization level, that do not count
> > instructions.
> 
> > 2018-01-12  Will Schmidt  
> > 
> > * config/rs6000/rs6000.c: (rs6000_gimple_builtin) Add gimple folding
> > support for merge[hl].
> 
> New line here.
> 
> > (fold_mergehl_helper): New helper function.
> > * config/rs6000/altivec.md (altivec_xxmrghw_direct): New.
> > (altivec_xxmrglw_direct): New.
> 
> > +(define_insn "altivec_xxmrghw_direct"
> > +  [(set (match_operand:V4SI 0 "register_operand" "=v")
> > +   (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
> > + (match_operand:V4SI 2 "register_operand" "v")]
> > +UNSPEC_VMRGH_DIRECT))]
> > +  "TARGET_P8_VECTOR"
> > +  "xxmrghw %x0,%x1,%x2"
> > +  [(set_attr "type" "vecperm")])
> > +
> >  (define_insn "altivec_vmrghw_direct"
> >[(set (match_operand:V4SI 0 "register_operand" "=v")
> >  (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
> >(match_operand:V4SI 2 "register_operand" "v")]
> >   UNSPEC_VMRGH_DIRECT))]
> 
> How do these two differ?  The xx variant can write all 64 VSR registers,
> it needs different constraints (wa?).  Can the two patterns be merged?
> It doesn't need the TARGET_P8_VECTOR condition then: the constraints
> will handle that.  And actually it is a v2.06 insn (p7)?


They differ in.. 
  TARGET_P8_VECTOR, versus TARGET_ALTIVEC
  xxmrghw %x0,%x1,%x2,  versus vmrghw %0,%1,%2

Not clear to me if they can be merged.   I'm weak in my grasp of the
constraints.  I can dig into that, (and would accept additional hints
too :-)  )

xxmrghw does show up in my book as V2.06. 


In full, for reference:


(define_insn "altivec_xxmrghw_direct"
  [(set (match_operand:V4SI 0 "register_operand" "=v")
(unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
  (match_operand:V4SI 2 "register_operand" "v")]
 UNSPEC_VMRGH_DIRECT))]
  "TARGET_P8_VECTOR"
  "xxmrghw %x0,%x1,%x2"
  [(set_attr "type" "vecperm")])

(define_insn "altivec_vmrghw_direct"
  [(set (match_operand:V4SI 0 "register_operand" "=v")
(unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
  (match_operand:V4SI 2 "register_operand" "v")]
 UNSPEC_VMRGH_DIRECT))]
  "TARGET_ALTIVEC"
  "vmrghw %0,%1,%2"
  [(set_attr "type" "vecperm")])



> 
> >  (define_insn "altivec_vmrglb_direct"
> >[(set (match_operand:V16QI 0 "register_operand" "=v")
> >  (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v")
> 
> This line should start with a tab as well?
> 
> > -  (match_operand:V16QI 2 "register_operand" "v")]
> > -  UNSPEC_VMRGL_DIRECT))]
> > +  (match_operand:V16QI 2 "register_operand" "v")]
> > + UNSPEC_VMRGL_DIRECT))]
> >"TARGET_ALTIVEC"
> >"vmrglb %0,%1,%2"
> >[(set_attr "type" "vecperm")])
> 
> 
> >  (define_insn "altivec_vmrglh_direct"
> >[(set (match_operand:V8HI 0 "register_operand" "=v")
> >  (unspec:V8HI [(match_operand:V8HI 1 "register_operand" "v")
> 
> Same here.
> 
> >   (match_operand:V8HI 2 "register_operand" "v")]
> > - UNSPEC_VMRGL_DIRECT))]
> > +UNSPEC_VMRGL_DIRECT))]
> >"TARGET_ALTIVEC"
> >"vmrglh %0,%1,%2"
> >[(set_attr "type" "vecperm")])
> 
> 
> > +(define_insn "altivec_xxmrglw_direct"
> > +  [(set (match_operand:V4SI 0 "register_operand" "=v")
> > +   (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
> > + (match_operand:V4SI 2 "register_operand" "v")]
> > +UNSPEC_VMRGL_DIRECT))]
> > +  "TARGET_P8_VECTOR"
> > +  "xxmrglw %x0,%x1,%x2"
> > +  [(set_attr "type" "vecperm")])
> 
> Exactly analogous to mrghw comments.
> 
> > +/* Helper function to handle the vector merge[hl] built-ins.  The
> > + implementation 

Re: [PATCH, rs6000] Support for gimple folding of mergeh, mergel intrinsics

2018-01-15 Thread Segher Boessenkool
On Mon, Jan 15, 2018 at 11:29:50AM -0600, Will Schmidt wrote:
> > How do these two differ?  The xx variant can write all 64 VSR registers,
> > it needs different constraints (wa?).  Can the two patterns be merged?
> > It doesn't need the TARGET_P8_VECTOR condition then: the constraints
> > will handle that.  And actually it is a v2.06 insn (p7)?
> 
> 
> They differ in.. 
>   TARGET_P8_VECTOR, versus TARGET_ALTIVEC
>   xxmrghw %x0,%x1,%x2,  versus vmrghw %0,%1,%2
> 
> Not clear to me if they can be merged.   I'm weak in my grasp of the
> constraints.  I can dig into that, (and would accept additional hints
> too :-)  )
> 
> xxmrghw does show up in my book as V2.06. 
> 
> 
> In full, for reference:
> 
> 
> (define_insn "altivec_xxmrghw_direct"
>   [(set (match_operand:V4SI 0 "register_operand" "=v")
>   (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
> (match_operand:V4SI 2 "register_operand" "v")]
>UNSPEC_VMRGH_DIRECT))]
>   "TARGET_P8_VECTOR"
>   "xxmrghw %x0,%x1,%x2"
>   [(set_attr "type" "vecperm")])
> 
> (define_insn "altivec_vmrghw_direct"
>   [(set (match_operand:V4SI 0 "register_operand" "=v")
> (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
>   (match_operand:V4SI 2 "register_operand" "v")]
>  UNSPEC_VMRGH_DIRECT))]
>   "TARGET_ALTIVEC"
>   "vmrghw %0,%1,%2"
>   [(set_attr "type" "vecperm")])

Something like


(define_insn "altivec_vmrghw_direct"
  [(set (match_operand:V4SI 0 "register_operand" "=v,wa")
(unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v,wa")
  (match_operand:V4SI 2 "register_operand" "v,wa")]
 UNSPEC_VMRGH_DIRECT))]
  "TARGET_ALTIVEC"
  "*
   vmrghw %0,%1,%2
   xxmrghw %x0,%x1,%x2"
  [(set_attr "type" "vecperm")])


should work (but maybe the "vmrghw" name should be changed then, drop
the "v"?  Not sure what we do elsewhere.  Fine to postpone that, too).

(The "wa" constraint is not active unless there is VSX support, which
is exactly ISA 2.06 and up).


Segher


Re: [PATCH v2, rs6000] Add -msafe-indirect-jumps option and implement safe bctr / bctrl

2018-01-15 Thread Bill Schmidt
Hi Segher,

Thanks for the quick review!

> On Jan 15, 2018, at 10:38 AM, Segher Boessenkool wrote:
> 
> Hi!
> 
> On Sat, Jan 13, 2018 at 10:53:57PM -0600, Bill Schmidt wrote:
>> This patch adds a new option for the compiler to produce only "safe" indirect
>> jumps, in the sense that these jumps are deliberately mispredicted to inhibit
>> speculative execution.  For now, this option is undocumented; this may change
>> at some future date.  It is intended eventually for the linker to also honor
>> this flag when creating PLT stubs, for example.
> 
> I think we settled on calling the option -mmispredict-indirect-jumps;
> please let me know if you still agree with that.  Or have thought of a
> better name :-)

Looks like we are now looking at -m[no-]speculate-indirect-jumps with
default to true.  LLVM folks are in agreement too.
> 
>> In addition to the new option, I've included changes to indirect calls for
>> the ELFv2 ABI when the option is specified.  In place of bctrl, we generate
>> a "crset eq" followed by a beqctrl-.  Using the CR0.eq bit is safe since CR0
>> is volatile over the call.
> 
> And CR0 is unused by the call; compare to CR1 (on older ABIs) for example.
> 
>> I've also added code to replace uses of bctr when the new option is specified,
>> with the sequence
>> 
>>  crset 4x[CRb]+2
>>  beqctr- CRb
>>  b .
>> 
>> where CRb is an available condition register field.  This applies to all
>> subtargets, and in particular is not restricted to ELFv2.  The use cases
>> covered here are computed gotos and switch statements.
>> 
>> NOT yet covered by this patch: indirect calls for ELFv1.  That will come later.
> 
> Would be nice to have it for all ABIs, even.

Yeah, that's the plan.  Everything that uses bctr[l].  My wording was too
loose.
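
For context, the computed-goto use case mentioned above looks roughly
like the following at the source level.  This is a hypothetical example,
not one of the patch's tests; it relies on the GNU C labels-as-values
extension, and the function name dispatch is made up for illustration.
The "goto *" is an indirect jump, which is the kind of jump the new
option is meant to affect.

```c
#include <stddef.h>

/* Hypothetical example, not from the patch.  Dispatches on OP through a
   table of label addresses (GNU C labels-as-values), so the "goto *"
   below is an indirect jump through a register.  */
int
dispatch (int op, int x)
{
  static void *const targets[] = { &&do_inc, &&do_dec, &&do_neg };

  /* Out-of-range opcodes leave X unchanged.  */
  if (op < 0 || op >= (int) (sizeof targets / sizeof targets[0]))
    return x;

  goto *targets[op];

do_inc:
  return x + 1;
do_dec:
  return x - 1;
do_neg:
  return -x;
}
```

A dense switch statement can end up as the same kind of jump when the
compiler chooses to implement it with a jump table.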
> 
>> Please let me know if there is a better way to represent the crset without
>> an unspec.
> 
> See the various patterns using cr%q.  I'm not sure they can generate creqv
> (i.e. crset) currently, but that could be added (like crnot is there already,
> for example).  If you don't use unspec (or maybe unspec_volatile) it can be
> optimised away though.
> 
> Maybe it is best not to put the crset into its own insn, just make it part
> of the bigger pattern, with an appropriate clobber?
> 
>> For the indirect jump, I don't see a way around it due to the
>> expected form of indirect jumps in cfganal.c.
> 
> I'm not sure what you are getting at here, could you explain a bit?

An indirect jump is expected to be of the form (set (pc) (...other stuff));
otherwise it might get missed and blocks that are only reachable from
an indirect jump can get deleted.  I found this out when I tried to do
something involving a parallel that was not very bright; that doesn't
actually prevent anything if you are doing things right.  And the
clobber solution you suggest should be just fine.

tldr:  Ignore my lunatic ravings. ;-)

Thanks,
Bill
> 
>> (define_expand "indirect_jump"
>> -  [(set (pc) (match_operand 0 "register_operand"))])
>> +  [(set (pc) (match_operand 0 "register_operand"))]
>> + ""
>> +{
>> +  /* We need to reserve a CR when forcing a mispredicted jump.  */
>> +  if (rs6000_safe_indirect_jumps) {
>> +rtx ccreg = gen_reg_rtx (CCmode);
>> +emit_insn (gen_rtx_SET (ccreg,
>> +gen_rtx_UNSPEC (CCmode,
>> +gen_rtvec (1, const0_rtx),
>> +UNSPEC_CRSET_EQ)));
>> +rtvec v = rtvec_alloc (2);
>> +RTVEC_ELT (v, 0) = operands[0];
>> +RTVEC_ELT (v, 1) = ccreg;
>> +emit_jump_insn (gen_rtx_SET (pc_rtx,
>> + gen_rtx_UNSPEC (Pmode, v,
>> + UNSPEC_COMP_GOTO_CR)));
>> +DONE;
>> +  }
>> +})
>> 
>> (define_insn "*indirect_jump"
>>   [(set (pc)
>>  (match_operand:P 0 "register_operand" "c,*l"))]
>> -  ""
>> +  "!rs6000_safe_indirect_jumps"
>>   "b%T0"
>>   [(set_attr "type" "jmpreg")])
>> 
>> +(define_insn "*indirect_jump_safe"
>> +  [(set (pc)
>> +(unspec:P [(match_operand:P 0 "register_operand" "c,*l")
>> +   (match_operand:CC 1 "cc_reg_operand" "y,y")]
>> +   UNSPEC_COMP_GOTO_CR))]
>> +  "rs6000_safe_indirect_jumps"
>> +  "beq%T0- %1\;b ."
>> +  [(set_attr "type" "jmpreg")
>> +   (set_attr "length" "8")])
>> +
>> +(define_insn "*set_cr_eq"
>> +  [(set (match_operand:CC 0 "cc_reg_operand" "=y")
>> +(unspec:CC [(const_int 0)] UNSPEC_CRSET_EQ))]
>> +  "rs6000_safe_indirect_jumps"
>> +  "crset %E0"
>> +  [(set_attr "type" "cr_logical")])
> 
> So merge this latter insn into the previous, making the CC a clobber?
> Like (not tested):
> 
> +(define_insn "indirect_jump_mispredict"
> +  [(set (pc)
> + (match_operand:P 0 "register_operand" "c,*l")
> +   (clobber (match_operand:CC 1 "cc_reg_operand" "y,y"))]
> +  "rs6000_safe_indirect_jumps"
> +  "crset %E1\;beq%T0- %1\;b ."
> +  [(set_attr "type" "jmpreg")
> +   (set_attr "length" "12"

Re: Fwd: [PATCH] i386: Don't use ASM_OUTPUT_DEF for TARGET_MACHO

2018-01-15 Thread Jan Hubicka
> Hi Jan,
> 
> Can you review this patch?  This blocks the GCC 7 backport.
> 
> Thanks.
> 
> H.J.
> 
> 
> -- Forwarded message --
> From: H.J. Lu 
> Date: Mon, Jan 15, 2018 at 8:45 AM
> Subject: [PATCH] i386: Don't use ASM_OUTPUT_DEF for TARGET_MACHO
> To: gcc-patches@gcc.gnu.org
> Cc: Uros Bizjak 
> 
> 
> ASM_OUTPUT_DEF isn't defined for TARGET_MACHO.  Use ASM_OUTPUT_LABEL to
> generate the __x86_return_thunk label, instead of the set directive.
> Update testcase to remove the __x86_return_thunk label check.  Since
> -fno-pic is ignored on Darwin, update testcases to scan for "push"
> only on Linux.
> 
> Tested with a cross compiler to x86_64-apple-darwin10.4.0.  OK for
> trunk?
> 
> H.J.
> ---
> gcc/
> 
> PR target/83839
> * config/i386/i386.c (output_indirect_thunk_function): Use
> ASM_OUTPUT_LABEL, instead of ASM_OUTPUT_DEF, for TARGET_MACHO
> for  __x86.return_thunk.

Hmm, we really ought to merge it with the way normal thunks are output from
middle-end next stage1.

OK
Honza
> 
> gcc/testsuite/
> 
> PR target/83839
> * gcc.target/i386/indirect-thunk-1.c: Scan for "push" only on
> Linux.
> * gcc.target/i386/indirect-thunk-2.c: Likewise.
> * gcc.target/i386/indirect-thunk-3.c: Likewise.
> * gcc.target/i386/indirect-thunk-4.c: Likewise.
> * gcc.target/i386/indirect-thunk-7.c: Likewise.
> * gcc.target/i386/indirect-thunk-attr-1.c: Likewise.
> * gcc.target/i386/indirect-thunk-attr-2.c: Likewise.
> * gcc.target/i386/indirect-thunk-attr-5.c: Likewise.
> * gcc.target/i386/indirect-thunk-attr-6.c: Likewise.
> * gcc.target/i386/indirect-thunk-attr-7.c: Likewise.
> * gcc.target/i386/indirect-thunk-extern-1.c: Likewise.
> * gcc.target/i386/indirect-thunk-extern-2.c: Likewise.
> * gcc.target/i386/indirect-thunk-extern-3.c: Likewise.
> * gcc.target/i386/indirect-thunk-extern-4.c: Likewise.
> * gcc.target/i386/indirect-thunk-extern-7.c: Likewise.
> * gcc.target/i386/indirect-thunk-register-1.c: Likewise.
> * gcc.target/i386/indirect-thunk-register-3.c: Likewise.
> * gcc.target/i386/indirect-thunk-register-4.c: Likewise.
> * gcc.target/i386/ret-thunk-10.c: Likewise.
> * gcc.target/i386/ret-thunk-11.c: Likewise.
> * gcc.target/i386/ret-thunk-12.c: Likewise.
> * gcc.target/i386/ret-thunk-13.c: Likewise.
> * gcc.target/i386/ret-thunk-14.c: Likewise.
> * gcc.target/i386/ret-thunk-15.c: Likewise.
> * gcc.target/i386/ret-thunk-9.c: Don't check the
> __x86_return_thunk label.
> Scan for "push" only for Linux.
> ---
>  gcc/config/i386/i386.c  | 3 ++-
>  gcc/testsuite/gcc.target/i386/indirect-thunk-1.c| 2 +-
>  gcc/testsuite/gcc.target/i386/indirect-thunk-2.c| 2 +-
>  gcc/testsuite/gcc.target/i386/indirect-thunk-3.c| 2 +-
>  gcc/testsuite/gcc.target/i386/indirect-thunk-4.c| 2 +-
>  gcc/testsuite/gcc.target/i386/indirect-thunk-7.c| 2 +-
>  gcc/testsuite/gcc.target/i386/indirect-thunk-attr-1.c   | 2 +-
>  gcc/testsuite/gcc.target/i386/indirect-thunk-attr-2.c   | 2 +-
>  gcc/testsuite/gcc.target/i386/indirect-thunk-attr-3.c   | 2 +-
>  gcc/testsuite/gcc.target/i386/indirect-thunk-attr-4.c   | 2 +-
>  gcc/testsuite/gcc.target/i386/indirect-thunk-attr-5.c   | 2 +-
>  gcc/testsuite/gcc.target/i386/indirect-thunk-attr-6.c   | 2 +-
>  gcc/testsuite/gcc.target/i386/indirect-thunk-attr-7.c   | 2 +-
>  gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-1.c| 2 +-
>  gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-2.c| 2 +-
>  gcc/testsuite/gcc.target/i386/indirect-thunk-extern-1.c | 2 +-
>  gcc/testsuite/gcc.target/i386/indirect-thunk-extern-2.c | 2 +-
>  gcc/testsuite/gcc.target/i386/indirect-thunk-extern-3.c | 2 +-
>  gcc/testsuite/gcc.target/i386/indirect-thunk-extern-4.c | 2 +-
>  gcc/testsuite/gcc.target/i386/indirect-thunk-extern-7.c | 2 +-
>  gcc/testsuite/gcc.target/i386/indirect-thunk-inline-1.c | 2 +-
>  gcc/testsuite/gcc.target/i386/indirect-thunk-inline-2.c | 2 +-
>  gcc/testsuite/gcc.target/i386/indirect-thunk-inline-3.c | 2 +-
>  gcc/testsuite/gcc.target/i386/indirect-thunk-inline-4.c | 2 +-
>  gcc/testsuite/gcc.target/i386/indirect-thunk-inline-7.c | 2 +-
>  gcc/testsuite/gcc.target/i386/ret-thunk-10.c| 2 +-
>  gcc/testsuite/gcc.target/i386/ret-thunk-11.c| 2 +-
>  gcc/testsuite/gcc.target/i386/ret-thunk-13.c| 2 +-
>  gcc/testsuite/gcc.target/i386/ret-thunk-14.c| 2 +-
>  gcc/testsuite/gcc.target/i386/ret-thunk-15.c| 2 +-
>  gcc/testsuite/gcc.target/i386/ret-thunk-9.c | 3 +--
>  31 files changed, 32 insertions(+), 32 deletions(-)
> 
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 5e4f845a1bd..bfb31db8752 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -

Re: [PATCH v2, rs6000] Add -msafe-indirect-jumps option and implement safe bctr / bctrl

2018-01-15 Thread Segher Boessenkool
On Mon, Jan 15, 2018 at 11:54:41AM -0600, Bill Schmidt wrote:
> > I think we settled on calling the option -mmispredict-indirect-jumps;
> > please let me know if you still agree with that.  Or have thought of a
> > better name :-)
> 
> Looks like we are now looking at -m[no-]speculate-indirect-jumps with
> default to true.  LLVM folks are in agreement too.

A fine name!

> >> For the indirect jump, I don't see a way around it due to the
> >> expected form of indirect jumps in cfganal.c.
> > 
> > I'm not sure what you are getting at here, could you explain a bit?
> 
> An indirect jump is expected to be of the form (set (pc) (...other stuff));
> otherwise it might get missed and blocks that are only reachable from
> an indirect jump can get deleted.  I found this out when I tried to do
> something involving a parallel that was not very bright; that doesn't
> actually prevent anything if you are doing things right.  And the
> clobber solution you suggest should be just fine.

(rtlanal.c:computed_jump_p).  Yeah, PARALLELs work just fine.


Segher


[PATCH 4/4] i386: Rewrite indirect_branch_operand logic

2018-01-15 Thread H.J. Lu
* config/i386/predicates.md (indirect_branch_operand): Rewrite
ix86_indirect_branch_register logic.
---
 gcc/config/i386/predicates.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index a502657f9e3..2f2393b9e3e 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -665,8 +665,8 @@
 ;; Test for a valid operand for indirect branch.
 (define_predicate "indirect_branch_operand"
   (ior (match_operand 0 "register_operand")
-   (and (not (match_test "TARGET_X32
- || ix86_indirect_branch_register"))
+   (and (not (match_test "ix86_indirect_branch_register"))
+   (not (match_test "TARGET_X32"))
(match_operand 0 "memory_operand"
 
 ;; Return true if OP is a memory operands that can be used in sibcalls.
-- 
2.14.3



[PATCH 2/4] x86: Rewrite ix86_indirect_branch_register logic

2018-01-15 Thread H.J. Lu
Rewrite ix86_indirect_branch_register logic with

(and (not (match_test "ix86_indirect_branch_register"))
 (original condition before r256662))
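
The rewrite only factors the ix86_indirect_branch_register test out of
both memory arms.  A small self-check in C, with the predicates reduced
to plain booleans, suggests that the old and new forms accept the same
operands.  The function names are hypothetical, and the two leading arms
of the ior (constant address and register), which the patch leaves
unchanged, are omitted.

```c
#include <stdbool.h>

/* Memory arms before the rewrite:
   (ior (and (not "TARGET_X32 || ibr") memory_operand)
        (and "TARGET_X32 && Pmode == DImode && !ibr" GOT_memory_operand))  */
static bool
old_form (bool x32, bool dimode, bool ibr, bool mem, bool got)
{
  return (!(x32 || ibr) && mem) || (x32 && dimode && !ibr && got);
}

/* Memory arms after the rewrite:
   (and (not ibr)
        (ior (and (not "TARGET_X32") memory_operand)
             (and "TARGET_X32 && Pmode == DImode" GOT_memory_operand)))  */
static bool
new_form (bool x32, bool dimode, bool ibr, bool mem, bool got)
{
  return !ibr && ((!x32 && mem) || (x32 && dimode && got));
}

/* Return true if both forms agree on all 32 input combinations.  */
bool
forms_agree (void)
{
  for (unsigned bits = 0; bits < 32; bits++)
    {
      bool x32 = bits & 1, dimode = bits & 2, ibr = bits & 4;
      bool mem = bits & 8, got = bits & 16;

      if (old_form (x32, dimode, ibr, mem, got)
          != new_form (x32, dimode, ibr, mem, got))
        return false;
    }
  return true;
}
```

This only checks the boolean structure; it says nothing about how the
real predicates behave on actual RTL operands.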

* config/i386/predicates.md (constant_call_address_operand):
Rewrite ix86_indirect_branch_register logic.
(sibcall_insn_operand): Likewise.
---
 gcc/config/i386/predicates.md | 22 ++
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 64f13a01326..6ec7ff2e784 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -710,24 +710,22 @@
   (ior (match_test "constant_call_address_operand
 (op, mode == VOIDmode ? mode : Pmode)")
(match_operand 0 "call_register_no_elim_operand")
-   (ior (and (not (match_test "TARGET_X32
-  || ix86_indirect_branch_register"))
-(match_operand 0 "memory_operand"))
-   (and (match_test "TARGET_X32 && Pmode == DImode
- && !ix86_indirect_branch_register")
-(match_operand 0 "GOT_memory_operand")
+   (and (not (match_test "ix86_indirect_branch_register"))
+   (ior (and (not (match_test "TARGET_X32"))
+ (match_operand 0 "memory_operand"))
+(and (match_test "TARGET_X32 && Pmode == DImode")
+ (match_operand 0 "GOT_memory_operand"))
 
 ;; Similarly, but for tail calls, in which we cannot allow memory references.
 (define_special_predicate "sibcall_insn_operand"
   (ior (match_test "constant_call_address_operand
 (op, mode == VOIDmode ? mode : Pmode)")
(match_operand 0 "register_no_elim_operand")
-   (ior (and (not (match_test "TARGET_X32
-  || ix86_indirect_branch_register"))
-(match_operand 0 "sibcall_memory_operand"))
-   (and (match_test "TARGET_X32 && Pmode == DImode
- && !ix86_indirect_branch_register")
-(match_operand 0 "GOT_memory_operand")
+   (and (not (match_test "ix86_indirect_branch_register"))
+   (ior (and (not (match_test "TARGET_X32"))
+ (match_operand 0 "sibcall_memory_operand"))
+(and (match_test "TARGET_X32 && Pmode == DImode")
+ (match_operand 0 "GOT_memory_operand"))
 
 ;; Return true if OP is a 32-bit GOT symbol operand.
 (define_predicate "GOT32_symbol_operand"
-- 
2.14.3



[PATCH 3/4] Don't check ix86_indirect_branch_register for GOT operand

2018-01-15 Thread H.J. Lu
Since GOT_memory_operand and GOT32_symbol_operand are simple pattern
matches, don't check ix86_indirect_branch_register here.  If needed,
-mindirect-branch= will convert indirect branch via GOT slot to a call
and return thunk.

* config/i386/constraints.md (Bs): Update
ix86_indirect_branch_register check.  Don't check
ix86_indirect_branch_register with GOT_memory_operand.
(Bw): Likewise.
* config/i386/predicates.md (GOT_memory_operand): Don't check
ix86_indirect_branch_register here.
(GOT32_symbol_operand): Likewise.
---
 gcc/config/i386/constraints.md | 14 ++
 gcc/config/i386/predicates.md  |  6 ++
 2 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md
index d6072b9bcd9..664e906b311 100644
--- a/gcc/config/i386/constraints.md
+++ b/gcc/config/i386/constraints.md
@@ -225,20 +225,18 @@
 
 (define_constraint "Bs"
   "@internal Sibcall memory operand."
-  (ior (and (not (match_test "TARGET_X32
- || ix86_indirect_branch_register"))
+  (ior (and (not (match_test "ix86_indirect_branch_register"))
+   (not (match_test "TARGET_X32"))
(match_operand 0 "sibcall_memory_operand"))
-   (and (match_test "TARGET_X32 && Pmode == DImode
-&& !ix86_indirect_branch_register")
+   (and (match_test "TARGET_X32 && Pmode == DImode")
(match_operand 0 "GOT_memory_operand"
 
 (define_constraint "Bw"
   "@internal Call memory operand."
-  (ior (and (not (match_test "TARGET_X32
- || ix86_indirect_branch_register"))
+  (ior (and (not (match_test "ix86_indirect_branch_register"))
+   (not (match_test "TARGET_X32"))
(match_operand 0 "memory_operand"))
-   (and (match_test "TARGET_X32 && Pmode == DImode
-&& !ix86_indirect_branch_register")
+   (and (match_test "TARGET_X32 && Pmode == DImode")
(match_operand 0 "GOT_memory_operand"
 
 (define_constraint "Bz"
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 6ec7ff2e784..a502657f9e3 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -695,8 +695,7 @@
 
 ;; Return true if OP is a GOT memory operand.
 (define_predicate "GOT_memory_operand"
-  (and (match_test "!ix86_indirect_branch_register")
-   (match_operand 0 "memory_operand"))
+  (match_operand 0 "memory_operand")
 {
   op = XEXP (op, 0);
   return (GET_CODE (op) == CONST
@@ -729,8 +728,7 @@
 
 ;; Return true if OP is a 32-bit GOT symbol operand.
 (define_predicate "GOT32_symbol_operand"
-  (match_test "!ix86_indirect_branch_register
-  && GET_CODE (op) == CONST
+  (match_test "GET_CODE (op) == CONST
&& GET_CODE (XEXP (op, 0)) == UNSPEC
&& XINT (XEXP (op, 0), 1) == UNSPEC_GOT"))
 
-- 
2.14.3



[PATCH 1/4] i386: Rename to ix86_indirect_branch_register

2018-01-15 Thread H.J. Lu
Rename the variable for -mindirect-branch-register to
ix86_indirect_branch_register to match the command-line option name.

* config/i386/constraints.md (Bs): Replace
ix86_indirect_branch_thunk_register with
ix86_indirect_branch_register.
(Bw): Likewise.
* config/i386/i386.md (indirect_jump): Likewise.
(tablejump): Likewise.
(*sibcall_memory): Likewise.
(*sibcall_value_memory): Likewise.
Peepholes of indirect call and jump via memory: Likewise.
* config/i386/i386.opt: Likewise.
* config/i386/predicates.md (indirect_branch_operand): Likewise.
(GOT_memory_operand): Likewise.
(call_insn_operand): Likewise.
(sibcall_insn_operand): Likewise.
(GOT32_symbol_operand): Likewise.
---
 gcc/config/i386/constraints.md |  8 
 gcc/config/i386/i386.md| 18 +-
 gcc/config/i386/i386.opt   |  2 +-
 gcc/config/i386/predicates.md  | 14 +++---
 4 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md
index 5592c43073e..d6072b9bcd9 100644
--- a/gcc/config/i386/constraints.md
+++ b/gcc/config/i386/constraints.md
@@ -226,19 +226,19 @@
 (define_constraint "Bs"
   "@internal Sibcall memory operand."
   (ior (and (not (match_test "TARGET_X32
- || ix86_indirect_branch_thunk_register"))
+ || ix86_indirect_branch_register"))
(match_operand 0 "sibcall_memory_operand"))
(and (match_test "TARGET_X32 && Pmode == DImode
-&& !ix86_indirect_branch_thunk_register")
+&& !ix86_indirect_branch_register")
(match_operand 0 "GOT_memory_operand"
 
 (define_constraint "Bw"
   "@internal Call memory operand."
   (ior (and (not (match_test "TARGET_X32
- || ix86_indirect_branch_thunk_register"))
+ || ix86_indirect_branch_register"))
(match_operand 0 "memory_operand"))
(and (match_test "TARGET_X32 && Pmode == DImode
-&& !ix86_indirect_branch_thunk_register")
+&& !ix86_indirect_branch_register")
(match_operand 0 "GOT_memory_operand"
 
 (define_constraint "Bz"
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index fff73fe18e0..5cd3ec093cd 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -12311,7 +12311,7 @@
   [(set (pc) (match_operand 0 "indirect_branch_operand"))]
   ""
 {
-  if (TARGET_X32 || ix86_indirect_branch_thunk_register)
+  if (TARGET_X32 || ix86_indirect_branch_register)
 operands[0] = convert_memory_address (word_mode, operands[0]);
   cfun->machine->has_local_indirect_jump = true;
 })
@@ -12365,7 +12365,7 @@
 OPTAB_DIRECT);
 }
 
-  if (TARGET_X32 || ix86_indirect_branch_thunk_register)
+  if (TARGET_X32 || ix86_indirect_branch_register)
 operands[0] = convert_memory_address (word_mode, operands[0]);
   cfun->machine->has_local_indirect_jump = true;
 })
@@ -12614,7 +12614,7 @@
   [(call (mem:QI (match_operand:W 0 "memory_operand" "m"))
 (match_operand 1))
(unspec [(const_int 0)] UNSPEC_PEEPSIB)]
-  "!TARGET_X32 && !ix86_indirect_branch_thunk_register"
+  "!TARGET_X32 && !ix86_indirect_branch_register"
   "* return ix86_output_call_insn (insn, operands[0]);"
   [(set_attr "type" "call")])
 
@@ -12624,7 +12624,7 @@
(call (mem:QI (match_dup 0))
 (match_operand 3))]
   "!TARGET_X32
-   && !ix86_indirect_branch_thunk_register
+   && !ix86_indirect_branch_register
&& SIBLING_CALL_P (peep2_next_insn (1))
&& !reg_mentioned_p (operands[0],
CALL_INSN_FUNCTION_USAGE (peep2_next_insn (1)))"
@@ -12639,7 +12639,7 @@
(call (mem:QI (match_dup 0))
 (match_operand 3))]
   "!TARGET_X32
-   && !ix86_indirect_branch_thunk_register
+   && !ix86_indirect_branch_register
&& SIBLING_CALL_P (peep2_next_insn (2))
&& !reg_mentioned_p (operands[0],
CALL_INSN_FUNCTION_USAGE (peep2_next_insn (2)))"
@@ -12737,7 +12737,7 @@
 (match_operand:W 1 "memory_operand"))
(set (pc) (match_dup 0))]
   "!TARGET_X32
-   && !ix86_indirect_branch_thunk_register
+   && !ix86_indirect_branch_register
&& peep2_reg_dead_p (2, operands[0])"
   [(set (pc) (match_dup 1))])
 
@@ -12819,7 +12819,7 @@
(call (mem:QI (match_operand:W 1 "memory_operand" "m"))
  (match_operand 2)))
(unspec [(const_int 0)] UNSPEC_PEEPSIB)]
-  "!TARGET_X32 && !ix86_indirect_branch_thunk_register"
+  "!TARGET_X32 && !ix86_indirect_branch_register"
   "* return ix86_output_call_insn (insn, operands[1]);"
   [(set_attr "type" "callv")])
 
@@ -12830,7 +12830,7 @@
(call (mem:QI (match_dup 0))
 (match_operand 3)))]
   "!TARGET_X32
-   && !ix86_indirect_branch_thunk_register
+   && !i

Re: [PATCH 3/5] x86: Add -mindirect-branch-register

2018-01-15 Thread H.J. Lu
On Mon, Jan 15, 2018 at 8:54 AM, Uros Bizjak  wrote:
> On Mon, Jan 15, 2018 at 4:05 AM, H.J. Lu  wrote:
>> On Sun, Jan 14, 2018 at 1:23 PM, H.J. Lu  wrote:
>>> On Sun, Jan 14, 2018 at 10:52 AM, Uros Bizjak  wrote:
 On Sun, Jan 14, 2018 at 7:08 PM, H.J. Lu  wrote:
> On Sun, Jan 14, 2018 at 9:51 AM, Uros Bizjak  wrote:
>> -  (ior (and (not (match_test "TARGET_X32"))
>> +  (ior (and (not (match_test "TARGET_X32
>> +  || ix86_indirect_branch_thunk_register"))
>>  (match_operand 0 "sibcall_memory_operand"))
>> -   (and (match_test "TARGET_X32 && Pmode == DImode")
>> +   (and (match_test "TARGET_X32 && Pmode == DImode
>> + && !ix86_indirect_branch_thunk_register")
>>  (match_operand 0 "GOT_memory_operand"
>>
>> Is this patch just trying to disable the predicate when
>> ix86_indirect_branch_thunk_register is set? Because this is what this
>> convoluted logic does.
>
> Yes, we want to disable all indirect branch via memory with
> -mindirect-branch-register, just like -mx32.   We could do
>
> #define TARGET_INDIRECT_BRANCH_REGISTER \
>  (TARGET_X32 ||  ix86_indirect_branch_thunk_register)

 Index: predicates.md
 ===
 --- predicates.md   (revision 25)
 +++ predicates.md   (working copy)
 @@ -710,11 +710,10 @@
(ior (match_test "constant_call_address_operand
  (op, mode == VOIDmode ? mode : Pmode)")
 (match_operand 0 "call_register_no_elim_operand")
 -   (ior (and (not (match_test "TARGET_X32
 -  || 
 ix86_indirect_branch_thunk_register"))
 +   (and (not (match_test "ix86_indirect_branch_thunk_register"))
 +   (ior (and (not (match_test "TARGET_X32")))
  (match_operand 0 "memory_operand"))
 -   (and (match_test "TARGET_X32 && Pmode == DImode
 - && !ix86_indirect_branch_thunk_register")
 +(and (match_test "TARGET_X32 && Pmode == DImode")
  (match_operand 0 "GOT_memory_operand")

 or something like that.

>>>
>>> I am testing this patch.  OK for trunk if there is no regression?
>>>
>>
>> Here is the updated patch.  Tested on i686 and x86-64.  OK for
>> trunk?
>
> There are two of the same issues in constraints.md
>
> On further inspection, there are several new
> ix86_indirect_branch_thunk_register conditions sprinkled around
> predicates.md. The one in indirect_branch_operand is understandable
> (but should be written as:
>
>   (ior (match_operand 0 "register_operand")
>(and (not (match_test "ix86_indirect_thunk_register"))
> (not (match_test "TARGET_X32"))
> (match_operand 0 "memory_operand"
> )
>
> but the ones in GOT_memory_operand and GOT32_symbol_operand should
> *not* be there, since these are simple pattern matches. Now we have a
> situation where e.g. call_got_x32 and sibcall_got_32 patterns never
> match, and should be disabled with
> ix86_indirect_branch_thunk_register. Please move
> ix86_indirect_branch_thunk_register conditions out of these two
> predicates.

I broke them into 4 smaller patches:

https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01361.html
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01360.html
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01362.html
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01359.html


-- 
H.J.
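
The predicate rewrite quoted above pulls the ix86_indirect_branch_thunk_register test out of both arms of the ior. As a rough check that the two forms accept the same operands, here is a minimal C++ sketch that models each match_test and match_operand as a plain boolean and compares the forms over all 32 input combinations. The function names are hypothetical and only stand in for the RTL predicates; this is not code from i386.c or predicates.md.

```cpp
// Form used in the patch under review: the thunk-register flag is tested
// inside both arms.  All names here are illustrative assumptions.
bool original_form(bool x32, bool thunk_reg, bool pmode_di,
                   bool mem_operand, bool got_operand)
{
  return (!(x32 || thunk_reg) && mem_operand)
         || ((x32 && pmode_di && !thunk_reg) && got_operand);
}

// Form suggested in the review: test the thunk-register flag once, outside.
bool suggested_form(bool x32, bool thunk_reg, bool pmode_di,
                    bool mem_operand, bool got_operand)
{
  return !thunk_reg
         && ((!x32 && mem_operand) || ((x32 && pmode_di) && got_operand));
}

// Returns true when both forms agree on every combination of inputs.
bool forms_agree()
{
  for (int bits = 0; bits < 32; ++bits)
    {
      bool x32 = (bits & 1) != 0;
      bool thunk_reg = (bits & 2) != 0;
      bool pmode_di = (bits & 4) != 0;
      bool mem_operand = (bits & 8) != 0;
      bool got_operand = (bits & 16) != 0;
      if (original_form(x32, thunk_reg, pmode_di, mem_operand, got_operand)
          != suggested_form(x32, thunk_reg, pmode_di, mem_operand, got_operand))
        return false;
    }
  return true;
}
```

Under this boolean model the two forms are equivalent, so the suggested layout changes readability only, not which operands match.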


Re: Fwd: [PATCH] i386: Don't use ASM_OUTPUT_DEF for TARGET_MACHO

2018-01-15 Thread H.J. Lu
On Mon, Jan 15, 2018 at 10:00 AM, Jan Hubicka  wrote:
>> Hi Jan,
>>
>> Can you review this patch?  This blocks the GCC 7 backport.
>>
>> Thanks.
>>
>> H.J.
>>
>>
>> -- Forwarded message --
>> From: H.J. Lu 
>> Date: Mon, Jan 15, 2018 at 8:45 AM
>> Subject: [PATCH] i386: Don't use ASM_OUTPUT_DEF for TARGET_MACHO
>> To: gcc-patches@gcc.gnu.org
>> Cc: Uros Bizjak 
>>
>>
>> ASM_OUTPUT_DEF isn't defined for TARGET_MACHO.  Use ASM_OUTPUT_LABEL to
>> generate the __x86_return_thunk label, instead of the set directive.
>> Update testcase to remove the __x86_return_thunk label check.  Since
>> -fno-pic is ignored on Darwin, update testcases to scan for "push"
>> only on Linux.
>>
>> Tested with a cross compiler to x86_64-apple-darwin10.4.0.  OK for
>> trunk?
>>
>> H.J.
>> ---
>> gcc/
>>
>> PR target/83839
>> * config/i386/i386.c (output_indirect_thunk_function): Use
>> ASM_OUTPUT_LABEL, instead of ASM_OUTPUT_DEF, for TARGET_MACHO
>> for  __x86.return_thunk.
>
> Hmm, we really ought to merge it with the way normal thunks are output from
> middle-end next stage1.
>
> OK
> Honza

I checked it in and opened:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83868

Thanks.

-- 
H.J.


Re: [PATCH v4] Ability to remap file names in __FILE__, etc (PR other/70268)

2018-01-15 Thread Joseph Myers
On Sat, 13 Jan 2018, Boris Kolpackov wrote:

> Joseph Myers  writes:
> 
> > Contrary to a previous review, you should *not* be removing RejectNegative 
> > from -fdebug-prefix-map=, and should be including it on both the new 
> > options. [...]
> > 
> > The patch is OK with that fixed.
> 
> Thanks for finding the time to review this at the last minute. Below is
> the new revision of the patch. The changes compared to the previous
> revision are:
> 
> 1. Rebase to the latest trunk.
> 
> 2. Add RejectNegative back for -fdebug-prefix-map as well as for the two
>new options.
> 
> 3. Use strrchr() instead of strchr() when looking for '=' in the option
>value. This change is suggested here[1] and the reasoning is that we
>(as project authors) can control paths inside our projects (not to
>have any '=') but not where the users choose to build them.
> 
>While strictly speaking this is a backwards-incompatible change, the
>old semantics in the presence of '=' is a broken build, so this is
>probably ok (but if not, let me know and I will revert this change).
> 
>I've also added this rationale to add_prefix_map() and a note in the
>ChangeLog.

This version is OK.

-- 
Joseph S. Myers
jos...@codesourcery.com
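
Point 3 in the quoted message is about where an OLD=NEW option value is split. The sketch below shows the difference between splitting at the first '=' (strchr) and at the last '=' (strrchr). The helper names are hypothetical, and this is not the add_prefix_map() code from the patch, only an illustration of the behaviour described.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical helper: split "OLD=NEW" at the last '=', so an '=' that
// appears inside OLD (for example in the user's build directory) stays
// part of OLD.
std::pair<std::string, std::string> split_at_last_equals(const char *arg)
{
  const char *eq = std::strrchr(arg, '=');
  if (eq == nullptr)
    return std::make_pair(std::string(arg), std::string());
  std::size_t old_len = static_cast<std::size_t>(eq - arg);
  return std::make_pair(std::string(arg, old_len), std::string(eq + 1));
}

// Same split at the first '=', shown for comparison with the old behaviour.
std::pair<std::string, std::string> split_at_first_equals(const char *arg)
{
  const char *eq = std::strchr(arg, '=');
  if (eq == nullptr)
    return std::make_pair(std::string(arg), std::string());
  std::size_t old_len = static_cast<std::size_t>(eq - arg);
  return std::make_pair(std::string(arg, old_len), std::string(eq + 1));
}
```

With an argument such as "/home/a=b/src=/usr/src", the last-'=' split keeps "/home/a=b/src" as the old prefix, while the first-'=' split cuts it at "/home/a". The trade-off is that the new prefix can then no longer contain '=', which matches the reasoning in the message that project authors control that side.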


[PATCH] rs6000: Delete "delayed_cr" insn type

2018-01-15 Thread Segher Boessenkool
"delayed_cr" is just "cr_logical" with the second source operand not
equal to the destination operand.  This patch changes it to be
expressed as type "cr_logical", with a new boolean attribute
"cr_logical_3op" added.  This simplifies code.

Tested on powerpc64-linux {-m32,-m64}; I'll commit this later today.


Segher


2018-01-15  Segher Boessenkool  

* config/rs6000/rs6000.md (define_attr "type"): Remove delayed_cr.
(define_attr "cr_logical_3op"): New.
(cceq_ior_compare): Adjust.
(cceq_ior_compare_complement): Adjust.
(*cceq_rev_compare): Adjust.
* config/rs6000/rs6000.c (rs6000_adjust_cost): Adjust.
(is_cracked_insn): Adjust.
(insn_must_be_first_in_group): Adjust.
* config/rs6000/40x.md: Adjust.
* config/rs6000/440.md: Adjust.
* config/rs6000/476.md: Adjust.
* config/rs6000/601.md: Adjust.
* config/rs6000/603.md: Adjust.
* config/rs6000/6xx.md: Adjust.
* config/rs6000/7450.md: Adjust.
* config/rs6000/7xx.md: Adjust.
* config/rs6000/8540.md: Adjust.
* config/rs6000/cell.md: Adjust.
* config/rs6000/e300c2c3.md: Adjust.
* config/rs6000/e500mc.md: Adjust.
* config/rs6000/e500mc64.md: Adjust.
* config/rs6000/e5500.md: Adjust.
* config/rs6000/e6500.md: Adjust.
* config/rs6000/mpc.md: Adjust.
* config/rs6000/power4.md: Adjust.
* config/rs6000/power5.md: Adjust.
* config/rs6000/power6.md: Adjust.
* config/rs6000/power7.md: Adjust.
* config/rs6000/power8.md: Adjust.
* config/rs6000/power9.md: Adjust.
* config/rs6000/rs64.md: Adjust.
* config/rs6000/titan.md: Adjust.

---
 gcc/config/rs6000/40x.md  |  2 +-
 gcc/config/rs6000/440.md  |  2 +-
 gcc/config/rs6000/476.md  |  2 +-
 gcc/config/rs6000/601.md  |  2 +-
 gcc/config/rs6000/603.md  |  2 +-
 gcc/config/rs6000/6xx.md  |  4 ++--
 gcc/config/rs6000/7450.md |  2 +-
 gcc/config/rs6000/7xx.md  |  2 +-
 gcc/config/rs6000/8540.md |  2 +-
 gcc/config/rs6000/cell.md |  2 +-
 gcc/config/rs6000/e300c2c3.md |  2 +-
 gcc/config/rs6000/e500mc.md   |  2 +-
 gcc/config/rs6000/e500mc64.md |  2 +-
 gcc/config/rs6000/e5500.md|  2 +-
 gcc/config/rs6000/e6500.md|  2 +-
 gcc/config/rs6000/mpc.md  |  2 +-
 gcc/config/rs6000/power4.md   |  4 +++-
 gcc/config/rs6000/power5.md   |  4 +++-
 gcc/config/rs6000/power6.md   |  5 -
 gcc/config/rs6000/power7.md   |  7 +--
 gcc/config/rs6000/power8.md   |  2 +-
 gcc/config/rs6000/power9.md   |  2 +-
 gcc/config/rs6000/rs6000.c|  6 ++
 gcc/config/rs6000/rs6000.md   | 14 ++
 gcc/config/rs6000/rs64.md |  2 +-
 gcc/config/rs6000/titan.md|  2 +-
 26 files changed, 40 insertions(+), 42 deletions(-)

diff --git a/gcc/config/rs6000/40x.md b/gcc/config/rs6000/40x.md
index 67df59d..5a36bd2 100644
--- a/gcc/config/rs6000/40x.md
+++ b/gcc/config/rs6000/40x.md
@@ -114,7 +114,7 @@ (define_insn_reservation "ppc403-jmpreg" 1
   "bpu_40x")
 
 (define_insn_reservation "ppc403-cr" 2
-  (and (eq_attr "type" "cr_logical,delayed_cr")
+  (and (eq_attr "type" "cr_logical")
(eq_attr "cpu" "ppc403,ppc405"))
   "bpu_40x")
 
diff --git a/gcc/config/rs6000/440.md b/gcc/config/rs6000/440.md
index d78ee8d..fb5c372 100644
--- a/gcc/config/rs6000/440.md
+++ b/gcc/config/rs6000/440.md
@@ -95,7 +95,7 @@ (define_insn_reservation "ppc440-branch" 1
   "ppc440_issue,ppc440_i_pipe")
 
 (define_insn_reservation "ppc440-compare" 2
-  (and (ior (eq_attr "type" "cmp,cr_logical,delayed_cr,mfcr")
+  (and (ior (eq_attr "type" "cmp,cr_logical,mfcr")
(and (eq_attr "type" "add,logical,shift,exts")
 (eq_attr "dot" "yes")))
(eq_attr "cpu" "ppc440"))
diff --git a/gcc/config/rs6000/476.md b/gcc/config/rs6000/476.md
index 9727a91..3ee92b8 100644
--- a/gcc/config/rs6000/476.md
+++ b/gcc/config/rs6000/476.md
@@ -71,7 +71,7 @@ (define_insn_reservation "ppc476-simple-integer" 1
ppc476_i_pipe|ppc476_lj_pipe")
 
 (define_insn_reservation "ppc476-complex-integer" 1
-  (and (eq_attr "type" "cmp,cr_logical,delayed_cr,cntlz,isel,isync,sync,trap,popcnt")
+  (and (eq_attr "type" "cmp,cr_logical,cntlz,isel,isync,sync,trap,popcnt")
(eq_attr "cpu" "ppc476"))
   "ppc476_issue,\
ppc476_i_pipe")
diff --git a/gcc/config/rs6000/601.md b/gcc/config/rs6000/601.md
index d92a518a..0e386e3 100644
--- a/gcc/config/rs6000/601.md
+++ b/gcc/config/rs6000/601.md
@@ -116,7 +116,7 @@ (define_insn_reservation "ppc601-mtcr" 4
   "iu_ppc601,bpu_ppc601")
 
 (define_insn_reservation "ppc601-crlogical" 4
-  (and (eq_attr "type" "cr_logical,delayed_cr")
+  (and (eq_attr "type" "cr_logical")
(eq_attr "cpu" "ppc601"))
   "bpu_ppc601")
 
diff --git a/gcc/config/rs6000/603.md b/gcc/config/rs6000/603.md
index 2167642..b27c31c 100644
--- a/gcc/config/rs6000/603.md
+++ b/gcc/config/rs6000/603.md
@@ -126,7 +126,7 @@ (define_insn_reservatio

Re: [patch, fortran] Change ABI for F2008 - minloc/maxloc BACK argument

2018-01-15 Thread Thomas Koenig

Hi Janne,


Here, s/BOUND/BACK/ I presume?


Yes.


Also, it seems in the library some of the back arguments are by value,
but some are still passed as pointers. Based on some quick grepping of
the patch they seem to come from  m4/iforeach.m4  (6 lines in total).

With these fixes, Ok for trunk.


Also fixed, committed as r256705.

Thanks a lot for the thorough review!

Regards

Thomas


Go patch committed: keep variables captured by defer alive

2018-01-15 Thread Ian Lance Taylor
Local variables captured by the deferred closure need to be live until
the function finishes, especially when the deferred function runs.
Function::build, for functions that have a defer, wraps the function
body in a try block.  So the backend sees the local variables only
live in the try block, without knowing that they are needed also in
the finally block where we invoke the deferred function.  This patch
to the Go frontend by Cherry Zhang fixes this by creating top-level
declarations for non-escaping address-taken locals when there is a
defer.  Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.
Committed to mainline.
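
A loose C++ analogy of the lifetime issue, not the Go frontend's code: a variable captured by reference in a deferred callback must be declared outside the try block so that it is still alive when the cleanup runs. The function name and the use of std::function are assumptions made for the illustration.

```cpp
#include <functional>
#include <vector>

// Runs a body inside a try block, then runs the deferred callbacks
// afterwards, in reverse order, the way a "finally" section would.
int run_with_deferred()
{
  // Declared at function level (the analogue of a top-level declaration):
  // it outlives the try block, so the deferred callback can still use it.
  int total = 0;
  std::vector<std::function<void()>> deferred;

  try
    {
      // The callback captures 'total' by reference.  Had 'total' been
      // declared inside this block, the reference would dangle below.
      deferred.push_back([&total] { total += 1; });
      total = 41;
    }
  catch (...)
    {
    }

  // "finally" part: run the deferred callbacks after the body.
  for (auto it = deferred.rbegin(); it != deferred.rend(); ++it)
    (*it)();
  return total;
}
```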

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 256655)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-4aa531c1765bba52848c6d71b9f57b593063d3ba
+afac7d7bed07ebe3add1784aaa9547c4d660d0ed
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/gogo.cc
===
--- gcc/go/gofrontend/gogo.cc   (revision 256593)
+++ gcc/go/gofrontend/gogo.cc   (working copy)
@@ -5568,6 +5568,7 @@ Function::build(Gogo* gogo, Named_object
   // initial values.
   std::vector vars;
   std::vector var_inits;
+  std::vector var_decls_stmts;
   for (Bindings::const_definitions_iterator p =
 this->block_->bindings()->begin_definitions();
p != this->block_->bindings()->end_definitions();
@@ -5642,6 +5643,24 @@ Function::build(Gogo* gogo, Named_object
   vars.push_back(bvar);
   var_inits.push_back(init);
}
+  else if (this->defer_stack_ != NULL
+   && (*p)->is_variable()
+   && (*p)->var_value()->is_non_escaping_address_taken()
+   && !(*p)->var_value()->is_in_heap())
+{
+  // Local variable captured by deferred closure needs to be live
+  // until the end of the function. We create a top-level
+  // declaration for it.
+  // TODO: we don't need to do this if the variable is not captured
+  // by the defer closure. There is no easy way to check it here,
+  // so we do this for all address-taken variables for now.
+  Variable* var = (*p)->var_value();
+  Temporary_statement* ts =
+Statement::make_temporary(var->type(), NULL, var->location());
+  ts->set_is_address_taken();
+  var->set_toplevel_decl(ts);
+  var_decls_stmts.push_back(ts);
+}
 }
   if (!gogo->backend()->function_set_parameters(this->fndecl_, param_vars))
 {
@@ -5661,7 +5680,7 @@ Function::build(Gogo* gogo, Named_object
 {
   // Declare variables if necessary.
   Bblock* var_decls = NULL;
-
+  std::vector var_decls_bstmt_list;
   Bstatement* defer_init = NULL;
   if (!vars.empty() || this->defer_stack_ != NULL)
{
@@ -5675,6 +5694,14 @@ Function::build(Gogo* gogo, Named_object
  Translate_context dcontext(gogo, named_function, this->block_,
  var_decls);
   defer_init = this->defer_stack_->get_backend(&dcontext);
+  var_decls_bstmt_list.push_back(defer_init);
+  for (std::vector::iterator p = var_decls_stmts.begin();
+   p != var_decls_stmts.end();
+   ++p)
+{
+  Bstatement* bstmt = (*p)->get_backend(&dcontext);
+  var_decls_bstmt_list.push_back(bstmt);
+}
}
}
 
@@ -5693,8 +5720,6 @@ Function::build(Gogo* gogo, Named_object
   var_inits[i]);
   init.push_back(init_stmt);
}
-  if (defer_init != NULL)
-   init.push_back(defer_init);
   Bstatement* var_init = gogo->backend()->statement_list(init);
 
   // Initialize all variables before executing this code block.
@@ -5722,8 +5747,8 @@ Function::build(Gogo* gogo, Named_object
   // we built one.
   if (var_decls != NULL)
 {
-  std::vector code_stmt_list(1, code_stmt);
-  gogo->backend()->block_add_statements(var_decls, code_stmt_list);
+  var_decls_bstmt_list.push_back(code_stmt);
+  gogo->backend()->block_add_statements(var_decls, var_decls_bstmt_list);
   code_stmt = gogo->backend()->block_statement(var_decls);
 }
 


Go patch committed: Reclaim memory of escape analysis Nodes

2018-01-15 Thread Ian Lance Taylor
This patch by Cherry Zhang fixes the Go frontend to reclaim the memory
of escape analysis Nodes before kicking off the backend, as they are
not needed in get_backend.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 256706)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-afac7d7bed07ebe3add1784aaa9547c4d660d0ed
+ff851e1190923f8612004c6c214a7c202471b0ba
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/escape.cc
===
--- gcc/go/gofrontend/escape.cc (revision 256593)
+++ gcc/go/gofrontend/escape.cc (working copy)
@@ -445,6 +445,17 @@ Node::state(Escape_context* context, Nam
   return this->state_;
 }
 
+Node::~Node()
+{
+  if (this->state_ != NULL)
+{
+  if (this->expr() == NULL || this->expr()->var_expression() == NULL)
+// Var expression Node is excluded since it shares state with the
+// underlying var Node.
+delete this->state_;
+}
+}
+
 int
 Node::encoding()
 {
@@ -552,6 +563,7 @@ Node::is_sink() const
 std::map Node::objects;
 std::map Node::expressions;
 std::map Node::statements;
+std::vector Node::indirects;
 
 // Make a object node or return a cached node for this object.
 
@@ -601,6 +613,7 @@ Node*
 Node::make_indirect_node(Node* child)
 {
   Node* n = new Node(child);
+  Node::indirects.push_back(n);
   return n;
 }
 
@@ -3335,3 +3348,39 @@ Gogo::tag_function(Escape_context* conte
   Escape_analysis_tag eat(context);
   eat.tag(fn);
 }
+
+// Reclaim memory of escape analysis Nodes.
+
+void
+Gogo::reclaim_escape_nodes()
+{
+  Node::reclaim_nodes();
+}
+
+void
+Node::reclaim_nodes()
+{
+  for (std::map::iterator p = Node::objects.begin();
+   p != Node::objects.end();
+   ++p)
+delete p->second;
+  Node::objects.clear();
+
+  for (std::map::iterator p = Node::expressions.begin();
+   p != Node::expressions.end();
+   ++p)
+delete p->second;
+  Node::expressions.clear();
+
+  for (std::map::iterator p = Node::statements.begin();
+   p != Node::statements.end();
+   ++p)
+delete p->second;
+  Node::statements.clear();
+
+  for (std::vector::iterator p = Node::indirects.begin();
+   p != Node::indirects.end();
+   ++p)
+delete *p;
+  Node::indirects.clear();
+}
Index: gcc/go/gofrontend/escape.h
===
--- gcc/go/gofrontend/escape.h  (revision 256593)
+++ gcc/go/gofrontend/escape.h  (working copy)
@@ -191,6 +191,8 @@ class Node
   child_(n)
   {}
 
+  ~Node();
+
   // Return this node's type.
   Type*
   type() const;
@@ -296,6 +298,10 @@ class Node
   static int
   note_inout_flows(int e, int index, Level level);
 
+  // Reclaim nodes.
+  static void
+  reclaim_nodes();
+
  private:
   // The classification of this Node.
   Node_classification classification_;
@@ -326,6 +332,10 @@ class Node
   static std::map objects;
   static std::map expressions;
   static std::map statements;
+
+  // Collection of all NODE_INDIRECT Nodes, used for reclaiming memory. This
+  // is not a cache -- each make_indirect_node will make a fresh Node.
+  static std::vector indirects;
 };
 
 // The amount of bits used for the escapement encoding.
Index: gcc/go/gofrontend/go.cc
===
--- gcc/go/gofrontend/go.cc (revision 256593)
+++ gcc/go/gofrontend/go.cc (working copy)
@@ -167,6 +167,9 @@ go_parse_input_files(const char** filena
   // Flatten the parse tree.
   ::gogo->flatten();
 
+  // Reclaim memory of escape analysis Nodes.
+  ::gogo->reclaim_escape_nodes();
+
   // Dump ast, use filename[0] as the base name
   ::gogo->dump_ast(filenames[0]);
 }
Index: gcc/go/gofrontend/gogo.h
===
--- gcc/go/gofrontend/gogo.h(revision 256593)
+++ gcc/go/gofrontend/gogo.h(working copy)
@@ -682,6 +682,10 @@ class Gogo
   void
   tag_function(Escape_context*, Named_object*);
 
+  // Reclaim memory of escape analysis Nodes.
+  void
+  reclaim_escape_nodes();
+
   // Do all exports.
   void
   do_exports();


gcc-patches@gcc.gnu.org

2018-01-15 Thread Jonathan Wakely

The chi_squared_distribution::param(const param&) function should also
update the parameters of the gamma_distribution member.

PR libstdc++/83833
* include/bits/random.h (chi_squared_distribution::param): Update
gamma distribution parameter.
* testsuite/26_numerics/random/chi_squared_distribution/83833.cc: New
test.

Tested powerpc64le-linux, committed to trunk.

(I know this is not a regression, but it's a small fix, and I also
plan to apply it on the branches).
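
A minimal sketch of the behaviour being fixed, using only the public std::chi_squared_distribution interface: a distribution reconfigured through param() should behave like one constructed with the same parameters. The helper name is hypothetical, and the result depends on the standard library in use; with a libstdc++ that lacks this fix the function may return false.

```cpp
#include <random>

// Returns true when a distribution constructed with n degrees of freedom
// and a default-constructed one reconfigured through param() produce the
// same first sample from identically seeded engines.
bool param_matches_constructor(double n)
{
  std::default_random_engine r1, r2;
  using chi = std::chi_squared_distribution<double>;
  chi::param_type p(n);
  chi d1(p);     // parameters set at construction
  chi d2;        // default parameters first ...
  d2.param(p);   // ... then replaced through param()
  return d1(r1) == d2(r2);
}
```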


commit 23aebfd022adb918bbe7e1c12efc00020a4299f6
Author: Jonathan Wakely 
Date:   Mon Jan 15 17:16:07 2018 +

PR libstdc++/83833 fix chi_squared_distribution::param(const param&)

PR libstdc++/83833
* include/bits/random.h (chi_squared_distribution::param): Update
gamma distribution parameter.
* testsuite/26_numerics/random/chi_squared_distribution/83833.cc: New
test.

diff --git a/libstdc++-v3/include/bits/random.h b/libstdc++-v3/include/bits/random.h
index 655ee1df6d6..f812bbf18b1 100644
--- a/libstdc++-v3/include/bits/random.h
+++ b/libstdc++-v3/include/bits/random.h
@@ -2643,7 +2643,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*/
   void
   param(const param_type& __param)
-  { _M_param = __param; }
+  {
+   _M_param = __param;
+   typedef typename std::gamma_distribution::param_type
+ param_type;
+   _M_gd.param(param_type{__param.n() / 2});
+  }
 
   /**
* @brief Returns the greatest lower bound value of the distribution.
diff --git a/libstdc++-v3/testsuite/26_numerics/random/chi_squared_distribution/83833.cc b/libstdc++-v3/testsuite/26_numerics/random/chi_squared_distribution/83833.cc
new file mode 100644
index 000..01667635b41
--- /dev/null
+++ b/libstdc++-v3/testsuite/26_numerics/random/chi_squared_distribution/83833.cc
@@ -0,0 +1,39 @@
+// Copyright (C) 2018 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-do run { target c++11 } }
+
+#include <random>
+#include <testsuite_hooks.h>
+
+void
+test01()
+{
+  std::default_random_engine r1, r2;
+  using chi = std::chi_squared_distribution;
+  chi::param_type p(5);
+  chi d1(p);
+  chi d2;
+  d2.param(p);
+  VERIFY( d1(r1) == d2(r2) ); // PR libstdc++/83833
+}
+
+int
+main()
+{
+  test01();
+}


Re: [PATCH 3/5] x86: Add -mindirect-branch-register

2018-01-15 Thread Uros Bizjak
On Mon, Jan 15, 2018 at 7:11 PM, H.J. Lu  wrote:
> On Mon, Jan 15, 2018 at 8:54 AM, Uros Bizjak  wrote:
>> On Mon, Jan 15, 2018 at 4:05 AM, H.J. Lu  wrote:
>>> On Sun, Jan 14, 2018 at 1:23 PM, H.J. Lu  wrote:
 On Sun, Jan 14, 2018 at 10:52 AM, Uros Bizjak  wrote:
> On Sun, Jan 14, 2018 at 7:08 PM, H.J. Lu  wrote:
>> On Sun, Jan 14, 2018 at 9:51 AM, Uros Bizjak  wrote:
>>> -  (ior (and (not (match_test "TARGET_X32"))
>>> +  (ior (and (not (match_test "TARGET_X32
>>> +  || ix86_indirect_branch_thunk_register"))
>>>  (match_operand 0 "sibcall_memory_operand"))
>>> -   (and (match_test "TARGET_X32 && Pmode == DImode")
>>> +   (and (match_test "TARGET_X32 && Pmode == DImode
>>> + && !ix86_indirect_branch_thunk_register")
>>>  (match_operand 0 "GOT_memory_operand"
>>>
>>> Is this patch just trying to disable the predicate when
>>> ix86_indirect_branch_thunk_register is set? Because this is what this
>>> convoluted logic does.
>>
>> Yes, we want to disable all indirect branch via memory with
>> -mindirect-branch-register, just like -mx32.   We could do
>>
>> #define TARGET_INDIRECT_BRANCH_REGISTER \
>>  (TARGET_X32 ||  ix86_indirect_branch_thunk_register)
>
> Index: predicates.md
> ===
> --- predicates.md   (revision 25)
> +++ predicates.md   (working copy)
> @@ -710,11 +710,10 @@
>(ior (match_test "constant_call_address_operand
>  (op, mode == VOIDmode ? mode : Pmode)")
> (match_operand 0 "call_register_no_elim_operand")
> -   (ior (and (not (match_test "TARGET_X32
> -  || 
> ix86_indirect_branch_thunk_register"))
> +   (and (not (match_test "ix86_indirect_branch_thunk_register"))
> +   (ior (and (not (match_test "TARGET_X32")))
>  (match_operand 0 "memory_operand"))
> -   (and (match_test "TARGET_X32 && Pmode == DImode
> - && !ix86_indirect_branch_thunk_register")
> +(and (match_test "TARGET_X32 && Pmode == DImode")
>  (match_operand 0 "GOT_memory_operand")
>
> or something like that.
>

 I am testing this patch.  OK for trunk if there is no regression?

>>>
>>> Here is the updated patch.  Tested on i686 and x86-64.  OK for
>>> trunk?
>>
>> There are two of the same issues in constraints.md
>>
>> On further inspection, there are several new
>> ix86_indirect_branch_thunk_register conditions sprinkled around
>> predicates.md. The one in indirect_branch_operand is understandable
>> (but should be written as:
>>
>>   (ior (match_operand 0 "register_operand")
>>(and (not (match_test "ix86_indirect_thunk_register"))
>> (not (match_test "TARGET_X32"))
>> (match_operand 0 "memory_operand"
>> )
>>
>> but the ones in GOT_memory_operand and GOT32_symbol_operand should
>> *not* be there, since these are simple pattern matches. Now we have a
>> situation where e.g. call_got_x32 and sibcall_got_32 patterns never
>> match, and should be disabled with
>> ix86_indirect_branch_thunk_register. Please move
>> ix86_indirect_branch_thunk_register conditions out of these two
>> predicates.
>
> I broke them into 4 smaller patches:
>
> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01361.html
> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01360.html
> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01362.html
> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01359.html

Hopefully, OK.

Thanks,
Uros.


Re: [PATCH] use shasum instead of sha512sum on FreeBSD and DragonFly

2018-01-15 Thread Andreas Tobler

On 15.01.18 13:59, Jonathan Wakely wrote:

boru on Freenode's #gcc channel pointed out that
contrib/download_prerequisites should use shasum for FreeBSD, not
sha512sum (which comes from GNU coreutils on GNU/Linux).  I checked
FreeBSD 11.0 and 10.2 and neither has sha512sum, nor does DragonFly
4.2, another FreeBSD derivative.

OK for trunk?


Works here on FreeBSD, thanks.
Andreas



[PATCH] Drop unused parameter of insert_save()

2018-01-15 Thread A. Skrobov
The only caller passes `before_p=1`, and I cannot imagine a use case
for it with `before_p=0`

2018-01-15  Artyom Skrobov tyomi...@gmail.com

* caller-save.c: Drop unused parameter of insert_save()

---
 gcc/caller-save.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/caller-save.c b/gcc/caller-save.c
index df1c9691e0c2..db1ab2caee9a 100644
--- a/gcc/caller-save.c
+++ b/gcc/caller-save.c
@@ -88,7 +88,7 @@ static void mark_set_regs (rtx, const_rtx, void *);
 static void mark_referenced_regs (rtx *, refmarker_fn *mark, void *mark_arg);
 static refmarker_fn mark_reg_as_referenced;
 static refmarker_fn replace_reg_with_saved_mem;
-static int insert_save (struct insn_chain *, int, int, HARD_REG_SET *,
+static int insert_save (struct insn_chain *, int, HARD_REG_SET *,
  machine_mode *);
 static int insert_restore (struct insn_chain *, int, int, int,
machine_mode *);
@@ -861,7 +861,7 @@ save_call_clobbered_regs (void)

   for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
  if (TEST_HARD_REG_BIT (hard_regs_to_save, regno))
-  regno += insert_save (chain, 1, regno, &hard_regs_to_save, save_mode);
+  regno += insert_save (chain, regno, &hard_regs_to_save, save_mode);

   /* Must recompute n_regs_saved.  */
   n_regs_saved = 0;
@@ -1252,7 +1252,7 @@ insert_restore (struct insn_chain *chain, int before_p, int regno,
 /* Like insert_restore above, but save registers instead.  */

 static int
-insert_save (struct insn_chain *chain, int before_p, int regno,
+insert_save (struct insn_chain *chain, int regno,
  HARD_REG_SET *to_save, machine_mode *save_mode)
 {
   int i;
@@ -1314,7 +1314,7 @@ insert_save (struct insn_chain *chain, int before_p, int regno,

   pat = gen_rtx_SET (mem, gen_rtx_REG (GET_MODE (mem), regno));
   code = reg_save_code (regno, GET_MODE (mem));
-  new_chain = insert_one_insn (chain, before_p, code, pat);
+  new_chain = insert_one_insn (chain, 1, code, pat);

   /* Set hard_regs_saved and dead_or_set for all the registers we saved.  */
   for (k = 0; k < numregs; k++)


[PATCH] Fix warn_if_not_align ICE (PR c/83844)

2018-01-15 Thread Jakub Jelinek
Hi!

As the testcase shows, handle_warn_if_not_align ICEs on
fields with types with warn_if_not_align attribute in variable length
structures.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2018-01-15  Jakub Jelinek  

PR c/83844
* stor-layout.c (handle_warn_if_not_align): Use byte_position and
multiple_of_p instead of unchecked tree_to_uhwi and UHWI check.

* gcc.dg/pr83844.c: New test.

--- gcc/stor-layout.c.jj2018-01-14 17:16:55.590836141 +0100
+++ gcc/stor-layout.c   2018-01-15 10:31:42.403874423 +0100
@@ -1150,11 +1150,9 @@ handle_warn_if_not_align (tree field, un
 warning (opt_w, "alignment %u of %qT is less than %u",
 record_align, context, warn_if_not_align);
 
-  unsigned HOST_WIDE_INT off
-= (tree_to_uhwi (DECL_FIELD_OFFSET (field))
-   + tree_to_uhwi (DECL_FIELD_BIT_OFFSET (field)) / BITS_PER_UNIT);
-  if ((off % warn_if_not_align) != 0)
-warning (opt_w, "%q+D offset %wu in %qT isn't aligned to %u",
+  tree off = byte_position (field);
+  if (!multiple_of_p (TREE_TYPE (off), off, size_int (warn_if_not_align)))
+warning (opt_w, "%q+D offset %E in %qT isn't aligned to %u",
 field, off, context, warn_if_not_align);
 }
 
--- gcc/testsuite/gcc.dg/pr83844.c.jj   2018-01-15 10:36:34.986655853 +0100
+++ gcc/testsuite/gcc.dg/pr83844.c  2018-01-15 10:36:02.120669000 +0100
@@ -0,0 +1,28 @@
+/* PR c/83844 */
+/* { dg-do compile } */
+/* { dg-options "-O0 -Wall" } */
+
+typedef unsigned long long __u64 __attribute__((aligned(4),warn_if_not_aligned(8)));
+void bar (void *, void *);
+
+void
+foo (int n)
+{
+  struct A
+  {
+int i1;
+int i2;
+int i3[n];
+__u64 x;   /* { dg-warning "in 'struct A' isn't aligned to 8" } */
+  } __attribute__((aligned (8)));
+  struct B
+  {
+int i1;
+int i2;
+long long i3[n];
+__u64 x;
+  } __attribute__((aligned (8)));
+  struct A a;
+  struct B b;
+  bar (&a, &b);
+}

Jakub


[PATCH] Fix store-merging for ~ of bswap (PR tree-optimization/83843)

2018-01-15 Thread Jakub Jelinek
Hi!

When using the bswap pass infrastructure, BIT_NOT_EXPRs aren't allowed in
the middle, but due to the way process_store handles those it can appear
around the value, which is something output_merged_store didn't handle.

Fixed thusly, where we handle not just the case when the bswap (or nop)
value needs inversion as whole, but also cases where only a few portions of
it need xoring with some mask.
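
To illustrate the "xor with some mask" idea on the two-byte case from the testcase, here is a small sketch that compares byte-wise stores (each byte optionally inverted) with a single merged 16-bit value XORed with a mask. The function names and the big-endian byte order are assumptions chosen for the illustration; this is not the output_merged_store implementation.

```cpp
#include <array>
#include <cstdint>

using Bytes = std::array<std::uint8_t, 2>;

// Byte-wise stores as written in the source: buf[0] gets the high byte of
// v, buf[1] the low byte, each optionally bitwise-inverted.
Bytes store_bytewise(unsigned v, bool invert_hi, bool invert_lo)
{
  std::uint8_t hi = static_cast<std::uint8_t>(v >> 8);
  std::uint8_t lo = static_cast<std::uint8_t>(v);
  if (invert_hi)
    hi = static_cast<std::uint8_t>(~hi);
  if (invert_lo)
    lo = static_cast<std::uint8_t>(~lo);
  Bytes out = {hi, lo};
  return out;
}

// Merged form: one 16-bit value XORed with a mask, then laid out with the
// high byte first.  A mask of 0xFFFF inverts the whole value; 0x00FF or
// 0xFF00 inverts only one of the two bytes.
Bytes store_merged(unsigned v, std::uint16_t xor_mask)
{
  std::uint16_t merged = static_cast<std::uint16_t>((v & 0xFFFFu) ^ xor_mask);
  Bytes out = {static_cast<std::uint8_t>(merged >> 8),
               static_cast<std::uint8_t>(merged & 0xFFu)};
  return out;
}
```

The three functions in the testcase correspond to masks 0xFFFF (foo), 0x00FF (bar) and 0xFF00 (baz) in this model.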

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-01-15  Jakub Jelinek  

PR tree-optimization/83843
* gimple-ssa-store-merging.c
(imm_store_chain_info::output_merged_store): Handle bit_not_p on
store_immediate_info for bswap/nop orig_stores.

* gcc.dg/store_merging_18.c: New test.

--- gcc/gimple-ssa-store-merging.c.jj   2018-01-04 00:43:17.629703230 +0100
+++ gcc/gimple-ssa-store-merging.c  2018-01-15 12:29:14.105789381 +0100
@@ -3619,6 +3619,15 @@ imm_store_chain_info::output_merged_stor
  gimple_seq_add_stmt_without_update (&seq, stmt);
  src = gimple_assign_lhs (stmt);
}
+ inv_op = invert_op (split_store, 2, int_type, xor_mask);
+ if (inv_op != NOP_EXPR)
+   {
+ stmt = gimple_build_assign (make_ssa_name (int_type),
+ inv_op, src, xor_mask);
+ gimple_set_location (stmt, loc);
+ gimple_seq_add_stmt_without_update (&seq, stmt);
+ src = gimple_assign_lhs (stmt);
+   }
  break;
default:
  src = ops[0];
--- gcc/testsuite/gcc.dg/store_merging_18.c.jj  2018-01-15 12:43:49.607227365 +0100
+++ gcc/testsuite/gcc.dg/store_merging_18.c 2018-01-15 12:43:24.882245004 +0100
@@ -0,0 +1,51 @@
+/* PR tree-optimization/83843 */
+/* { dg-do run } */
+/* { dg-options "-O2 -fdump-tree-store-merging" } */
+/* { dg-final { scan-tree-dump-times "Merging successful" 3 "store-merging" { target store_merge } } } */
+
+__attribute__((noipa)) void
+foo (unsigned char *buf, unsigned char *tab)
+{
+  unsigned v = tab[1] ^ (tab[0] << 8);
+  buf[0] = ~(v >> 8);
+  buf[1] = ~v;
+}
+
+__attribute__((noipa)) void
+bar (unsigned char *buf, unsigned char *tab)
+{
+  unsigned v = tab[1] ^ (tab[0] << 8);
+  buf[0] = (v >> 8);
+  buf[1] = ~v;
+}
+
+__attribute__((noipa)) void
+baz (unsigned char *buf, unsigned char *tab)
+{
+  unsigned v = tab[1] ^ (tab[0] << 8);
+  buf[0] = ~(v >> 8);
+  buf[1] = v;
+}
+
+int
+main ()
+{
+  volatile unsigned char l1 = 0;
+  volatile unsigned char l2 = 1;
+  unsigned char buf[2];
+  unsigned char tab[2] = { l1 + 1, l2 * 2 };
+  foo (buf, tab);
+  if (buf[0] != (unsigned char) ~1 || buf[1] != (unsigned char) ~2)
+__builtin_abort ();
+  buf[0] = l1 + 7;
+  buf[1] = l2 * 8;
+  bar (buf, tab);
+  if (buf[0] != 1 || buf[1] != (unsigned char) ~2)
+__builtin_abort ();
+  buf[0] = l1 + 9;
+  buf[1] = l2 * 10;
+  baz (buf, tab);
+  if (buf[0] != (unsigned char) ~1 || buf[1] != 2)
+__builtin_abort ();
+  return 0;
+}

Jakub


[C++ PATCH] Fix ICE in member_vec_dedup (PR c++/83825)

2018-01-15 Thread Jakub Jelinek
Hi!

As the testcase shows, calls to member_vec_dedup and qsort are just guarded
by the vector being non-NULL, which doesn't mean it must be non-empty,
so we can't do (*member_vec)[0] on it.  Fixed by the second hunk, the
rest is just a small cleanup to use the vec.h methods.
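
The non-NULL versus non-empty distinction can be sketched with a std::vector stand-in. The function below is hypothetical and much simpler than member_vec_dedup; it only shows why the early return is needed before element 0 is read.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a dedup pass over a sorted vector.  A non-null
// pointer does not imply a non-empty vector, so the length is checked
// before (*v)[0] is read.
void dedup_sorted(std::vector<int> *v)
{
  if (v == nullptr)
    return;
  std::size_t len = v->size();
  if (len == 0)
    return;   // without this guard, (*v)[0] below would be out of bounds

  int current = (*v)[0];
  std::size_t store = 1;
  for (std::size_t ix = 1; ix < len; ++ix)
    if ((*v)[ix] != current)
      {
        current = (*v)[ix];
        (*v)[store++] = current;
      }
  v->resize(store);
}
```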

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-01-15  Jakub Jelinek  

PR c++/83825
* name-lookup.c (member_vec_dedup): Return early if len is 0.
(resort_type_member_vec, set_class_bindings,
insert_late_enum_def_bindings): Use vec qsort method instead of
calling qsort directly.

* g++.dg/template/pr83825.C: New test.

--- gcc/cp/name-lookup.c.jj 2018-01-03 13:16:38.537848205 +0100
+++ gcc/cp/name-lookup.c2018-01-15 14:07:09.494576944 +0100
@@ -1520,8 +1520,7 @@ resort_type_member_vec (void *obj, void
 {
   resort_data.new_value = new_value;
   resort_data.cookie = cookie;
-  qsort (member_vec->address (), member_vec->length (),
-sizeof (tree), resort_member_name_cmp);
+  member_vec->qsort (resort_member_name_cmp);
 }
 }
 
@@ -1597,6 +1596,9 @@ member_vec_dedup (vec *memb
   unsigned len = member_vec->length ();
   unsigned store = 0;
 
+  if (!len)
+return;
+
   tree current = (*member_vec)[0], name = OVL_NAME (current);
   tree next = NULL_TREE, next_name = NULL_TREE;
   for (unsigned jx, ix = 0; ix < len;
@@ -1712,8 +1714,7 @@ set_class_bindings (tree klass, unsigned
   if (member_vec)
 {
   CLASSTYPE_MEMBER_VEC (klass) = member_vec;
-  qsort (member_vec->address (), member_vec->length (),
-sizeof (tree), member_name_cmp);
+  member_vec->qsort (member_name_cmp);
   member_vec_dedup (member_vec);
 }
 }
@@ -1741,8 +1742,7 @@ insert_late_enum_def_bindings (tree klas
   else
member_vec_append_class_fields (member_vec, klass);
   CLASSTYPE_MEMBER_VEC (klass) = member_vec;
-  qsort (member_vec->address (), member_vec->length (),
-sizeof (tree), member_name_cmp);
+  member_vec->qsort (member_name_cmp);
   member_vec_dedup (member_vec);
 }
 }
--- gcc/testsuite/g++.dg/template/pr83825.C.jj  2018-01-15 14:16:55.289432205 +0100
+++ gcc/testsuite/g++.dg/template/pr83825.C 2018-01-15 14:14:03.172490348 +0100
@@ -0,0 +1,13 @@
+// PR c++/83825
+// { dg-do compile }
+
+template 
+class A {};// { dg-error "shadows template parameter" }
+
+template 
+class B
+{ 
+  void foo () { A  a; }
+};
+
+template void B <0>::foo ();

Jakub


[committed] Fix OpenMP atomic expansion (PR middle-end/83837)

2018-01-15 Thread Jakub Jelinek
Hi!

As the patch shows, expand_omp_atomic* was relying on the
gimple_omp_atomic_load_rhs () pointer being a pointer to the type we want to
atomically load.  That doesn't work too well, because pointer conversions
are useless in GIMPLE and so we can end up with a pointer to a different
type like void.

Fixed by ignoring the addr type and instead finding the type from the
reg we want to load into.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk,
should fix a couple of libgomp fortran tests.

2018-01-15  Jakub Jelinek  

PR middle-end/83837
* omp-expand.c (expand_omp_atomic_pipeline): Use loaded_val
type rather than type addr's type points to.
(expand_omp_atomic_mutex): Likewise.
(expand_omp_atomic): Likewise.

--- gcc/omp-expand.c.jj 2018-01-03 10:19:54.483533850 +0100
+++ gcc/omp-expand.c 2018-01-15 16:07:27.626734592 +0100
@@ -6283,7 +6283,7 @@ expand_omp_atomic_pipeline (basic_block
int index)
 {
   tree loadedi, storedi, initial, new_storedi, old_vali;
-  tree type, itype, cmpxchg, iaddr;
+  tree type, itype, cmpxchg, iaddr, atype;
   gimple_stmt_iterator si;
   basic_block loop_header = single_succ (load_bb);
   gimple *phi, *stmt;
@@ -6297,7 +6297,8 @@ expand_omp_atomic_pipeline (basic_block
   cmpxchg = builtin_decl_explicit (fncode);
   if (cmpxchg == NULL_TREE)
 return false;
-  type = TYPE_MAIN_VARIANT (TREE_TYPE (TREE_TYPE (addr)));
+  type = TYPE_MAIN_VARIANT (TREE_TYPE (loaded_val));
+  atype = type;
   itype = TREE_TYPE (TREE_TYPE (cmpxchg));
 
   if (!can_compare_and_swap_p (TYPE_MODE (itype), true)
@@ -6317,6 +6318,7 @@ expand_omp_atomic_pipeline (basic_block
 
   iaddr = create_tmp_reg (build_pointer_type_for_mode (itype, ptr_mode,
   true));
+  atype = itype;
   iaddr_val
= force_gimple_operand_gsi (&si,
fold_convert (TREE_TYPE (iaddr), addr),
@@ -6337,13 +6339,17 @@ expand_omp_atomic_pipeline (basic_block
   tree loaddecl = builtin_decl_explicit (fncode);
   if (loaddecl)
 initial
-  = fold_convert (TREE_TYPE (TREE_TYPE (iaddr)),
+  = fold_convert (atype,
  build_call_expr (loaddecl, 2, iaddr,
   build_int_cst (NULL_TREE,
  MEMMODEL_RELAXED)));
   else
-initial = build2 (MEM_REF, TREE_TYPE (TREE_TYPE (iaddr)), iaddr,
- build_int_cst (TREE_TYPE (iaddr), 0));
+{
+  tree off
+   = build_int_cst (build_pointer_type_for_mode (atype, ptr_mode,
+ true), 0);
+  initial = build2 (MEM_REF, atype, iaddr, off);
+}
 
   initial
 = force_gimple_operand_gsi (&si, initial, true, NULL_TREE, true,
@@ -6495,15 +6501,20 @@ expand_omp_atomic_mutex (basic_block loa
   t = build_call_expr (t, 0);
   force_gimple_operand_gsi (&si, t, true, NULL_TREE, true, GSI_SAME_STMT);
 
-  stmt = gimple_build_assign (loaded_val, build_simple_mem_ref (addr));
+  tree mem = build_simple_mem_ref (addr);
+  TREE_TYPE (mem) = TREE_TYPE (loaded_val);
+  TREE_OPERAND (mem, 1)
+= fold_convert (build_pointer_type_for_mode (TREE_TYPE (mem), ptr_mode,
+true),
+   TREE_OPERAND (mem, 1));
+  stmt = gimple_build_assign (loaded_val, mem);
   gsi_insert_before (&si, stmt, GSI_SAME_STMT);
   gsi_remove (&si, true);
 
   si = gsi_last_nondebug_bb (store_bb);
   gcc_assert (gimple_code (gsi_stmt (si)) == GIMPLE_OMP_ATOMIC_STORE);
 
-  stmt = gimple_build_assign (build_simple_mem_ref (unshare_expr (addr)),
- stored_val);
+  stmt = gimple_build_assign (unshare_expr (mem), stored_val);
   gsi_insert_before (&si, stmt, GSI_SAME_STMT);
 
   t = builtin_decl_explicit (BUILT_IN_GOMP_ATOMIC_END);
@@ -6532,7 +6543,7 @@ expand_omp_atomic (struct omp_region *re
   tree loaded_val = gimple_omp_atomic_load_lhs (load);
   tree addr = gimple_omp_atomic_load_rhs (load);
   tree stored_val = gimple_omp_atomic_store_val (store);
-  tree type = TYPE_MAIN_VARIANT (TREE_TYPE (TREE_TYPE (addr)));
+  tree type = TYPE_MAIN_VARIANT (TREE_TYPE (loaded_val));
   HOST_WIDE_INT index;
 
   /* Make sure the type is one of the supported sizes.  */

Jakub


[committed] xfail two assertions due to bug 74762 (PR 83869)

2018-01-15 Thread Martin Sebor

The c-c++-common/attr-nonstring-3.c test has run afoul of c++
bug 74762 (missing uninitialized warning (C++, parenthesized
expr, TREE_NO_WARNING)).  Until that bug is fixed, I've
committed r256709 and xfailed the two assertions that started
failing after r256683, as a result of gating middle-end warnings
on the no-warning bit being clear.

Martin


[PATCH] Preserve CROSSING_JUMP_P in peephole2 (PR rtl-optimization/83213)

2018-01-15 Thread Jakub Jelinek
Hi!

On the testcase in the PR (too large and creduce not making sufficient
progress) we ICE because i386.md:
;; Combining simple memory jump instruction

(define_peephole2
  [(set (match_operand:W 0 "register_operand")
(match_operand:W 1 "memory_operand"))
   (set (pc) (match_dup 0))]
  "!TARGET_X32
   && !ix86_indirect_branch_thunk_register
   && peep2_reg_dead_p (2, operands[0])"
  [(set (pc) (match_dup 1))])

peephole2 triggers on a CROSSING_JUMP_P jump, but nothing actually
copies that bit over from the old to the new JUMP_INSN.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2018-01-15  Jakub Jelinek  

PR rtl-optimization/83213
* recog.c (peep2_attempt): Copy over CROSSING_JUMP_P from peepinsn
to last if both are JUMP_INSNs.

--- gcc/recog.c.jj  2018-01-09 08:58:14.594002069 +0100
+++ gcc/recog.c 2018-01-15 16:37:13.279196178 +0100
@@ -3446,6 +3446,8 @@ peep2_attempt (basic_block bb, rtx_insn
   last = emit_insn_after_setloc (attempt,
 peep2_insn_data[i].insn,
 INSN_LOCATION (peepinsn));
+  if (JUMP_P (peepinsn) && JUMP_P (last))
+CROSSING_JUMP_P (last) = CROSSING_JUMP_P (peepinsn);
   before_try = PREV_INSN (insn);
   delete_insn_chain (insn, peep2_insn_data[i].insn, false);
 

Jakub


[C++ PATCH] Fix checking ICE in pt.c (PR c++/83817)

2018-01-15 Thread Jakub Jelinek
Hi!

function in this case can be either a CALL_EXPR or AGGR_INIT_EXPR.
CALL_FROM_THUNK_P macro is defined in tree.h and so knows just about the
generic CALL_EXPR, and the C++ FE adds AGGR_INIT_FROM_THUNK_P macro, which
is defined the same (protected_flag) except for the checking, one requires
a CALL_EXPR, another one AGGR_INIT_EXPR.  So, this spot seemed to do the
right thing actually when doing release checking, just in non-release
checking it would ICE if function is AGGR_INIT_EXPR.  The
AGGR_INIT_FROM_THUNK_P flag is later used to set CALL_FROM_THUNK_P when we
generate the CALL_EXPR.
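
As a rough sketch of why the two macros behave the same in release checking but differ otherwise, consider two accessors for the same bit that assert different node codes.  Everything here (struct node, node_check, the *_NODE_FROM_THUNK_P macros) is a hypothetical miniature written for this note, not the real tree.h or cp-tree.h definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of a tree node; the names only echo the ones
   discussed above.  */
enum node_code { CALL_NODE, AGGR_INIT_NODE, OTHER_NODE };

struct node
{
  enum node_code code;
  bool protected_flag;
};

/* Return N after asserting that it has code CODE, in the spirit of a
   checking-enabled build.  */
static struct node *
node_check (struct node *n, enum node_code code)
{
  assert (n->code == code);
  return n;
}

/* Two accessors for the same bit; they differ only in the code they
   insist on.  */
#define CALL_NODE_FROM_THUNK_P(N) \
  (node_check ((N), CALL_NODE)->protected_flag)
#define AGGR_INIT_NODE_FROM_THUNK_P(N) \
  (node_check ((N), AGGR_INIT_NODE)->protected_flag)

/* Set the flag through whichever accessor matches the node's code.  */
void
mark_from_thunk (struct node *function)
{
  if (function->code == CALL_NODE)
    CALL_NODE_FROM_THUNK_P (function) = true;
  else
    AGGR_INIT_NODE_FROM_THUNK_P (function) = true;
}
```

With the assertion compiled out, both macros store to the same field, which mirrors why only checking builds noticed the mismatch.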

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-01-15  Jakub Jelinek  

PR c++/83817
* pt.c (tsubst_copy_and_build) : If function
is AGGR_INIT_EXPR rather than CALL_EXPR, set AGGR_INIT_FROM_THUNK_P
instead of CALL_FROM_THUNK_P.

* g++.dg/cpp1y/pr83817.C: New test.

--- gcc/cp/pt.c.jj  2018-01-11 18:58:48.365391793 +0100
+++ gcc/cp/pt.c 2018-01-15 18:32:44.433150762 +0100
@@ -17819,7 +17819,10 @@ tsubst_copy_and_build (tree t,
CALL_EXPR_REVERSE_ARGS (function) = rev;
if (thk)
  {
-   CALL_FROM_THUNK_P (function) = true;
+   if (TREE_CODE (function) == CALL_EXPR)
+ CALL_FROM_THUNK_P (function) = true;
+   else
+ AGGR_INIT_FROM_THUNK_P (function) = true;
/* The thunk location is not interesting.  */
SET_EXPR_LOCATION (function, UNKNOWN_LOCATION);
  }
--- gcc/testsuite/g++.dg/cpp1y/pr83817.C.jj 2018-01-15 18:34:37.494143930 +0100
+++ gcc/testsuite/g++.dg/cpp1y/pr83817.C 2018-01-15 18:34:05.212145878 +0100
@@ -0,0 +1,17 @@
+// PR c++/83817
+// { dg-do compile { target c++14 } }
+
+struct A;
+struct B { template  using C = A; };
+struct D : B { struct F { typedef C E; }; };
+struct G {
+  struct I { I (D, A &); } h;
+  D::F::E &k ();
+  D j;
+  G (G &&) : h (j, k ()) {}
+};
+struct N { G l; };
+typedef N (*M)(N &);
+struct H { const char *o; M s; };
+N foo (N &);
+H r { "", [](auto &x) { return foo (x); }};

Jakub


[PATCH] Bump minimum value for max-sched-ready-insns param to 1 (PR rtl-optimization/86620)

2018-01-15 Thread Jakub Jelinek
Hi!

This param allows a minimum of 0, which doesn't make much sense.
On the i386/pr83620.c test (when used with the =0 value) we ICE
because ix86_adjust_priority which has code to prevent moving of likely
spilled hard regs doesn't have a chance to do anything, since we don't
consider any other insns as ready.

This patch bumps the minimum to 1, so that there is at least something
considered.
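
The effect of a higher minimum can be sketched with a small range check.  The struct param_info layout and set_param function below are invented for illustration and are not the params.def machinery; treating a maximum of 0 as "no upper bound" is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical parameter descriptor.  In this sketch a max of 0 is
   assumed to mean "no upper bound".  */
struct param_info
{
  const char *name;
  int value;
  int min;
  int max;
};

/* Store VALUE in P if it is within range; otherwise report the problem
   and leave the previous value untouched.  */
bool
set_param (struct param_info *p, int value)
{
  assert (p != NULL);

  if (value < p->min)
    {
      fprintf (stderr, "minimum value of parameter '%s' is %d\n",
               p->name, p->min);
      return false;
    }
  if (p->max > 0 && value > p->max)
    {
      fprintf (stderr, "maximum value of parameter '%s' is %d\n",
               p->name, p->max);
      return false;
    }
  p->value = value;
  return true;
}
```

With a descriptor of { "max-sched-ready-insns", 100, 1, 0 }, a request for 0 is rejected up front, so later code never sees an empty ready list limit.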

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-01-15  Jakub Jelinek  

PR rtl-optimization/86620
* params.def (max-sched-ready-insns): Bump minimum value to 1.

* gcc.dg/pr64935-2.c: Use --param=max-sched-ready-insns=1
instead of --param=max-sched-ready-insns=0.
* gcc.target/i386/pr83620.c: New test.
* gcc.dg/pr83620.c: New test.

--- gcc/params.def.jj   2018-01-14 17:16:57.471836055 +0100
+++ gcc/params.def  2018-01-15 18:53:24.122124325 +0100
@@ -744,7 +744,7 @@ DEFPARAM (PARAM_MAX_FIELDS_FOR_FIELD_SEN
 DEFPARAM(PARAM_MAX_SCHED_READY_INSNS,
 "max-sched-ready-insns",
 "The maximum number of instructions ready to be issued to be considered by the scheduler during the first scheduling pass.",
-100, 0, 0)
+100, 1, 0)
 
 /* This is the maximum number of active local stores RTL DSE will consider.  */
 DEFPARAM (PARAM_MAX_DSE_ACTIVE_LOCAL_STORES,
--- gcc/testsuite/gcc.dg/pr64935-2.c.jj 2017-06-19 08:27:46.126467108 +0200
+++ gcc/testsuite/gcc.dg/pr64935-2.c 2018-01-15 18:52:23.987124863 +0100
@@ -1,6 +1,6 @@
 /* PR rtl-optimization/64935 */
 /* { dg-do compile } */
-/* { dg-options "-O -fschedule-insns --param=max-sched-ready-insns=0 -fcompare-debug" } */
+/* { dg-options "-O -fschedule-insns --param=max-sched-ready-insns=1 -fcompare-debug" } */
 /* { dg-require-effective-target scheduling } */
 /* { dg-xfail-if "" { powerpc-ibm-aix* } } */
 
--- gcc/testsuite/gcc.target/i386/pr83620.c.jj  2018-01-15 18:53:43.267124153 +0100
+++ gcc/testsuite/gcc.target/i386/pr83620.c 2018-01-15 19:17:31.053208498 +0100
@@ -0,0 +1,15 @@
+/* PR rtl-optimization/86620 */
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O2 -flive-range-shrinkage --param=max-sched-ready-insns=1 -Wno-psabi -mno-avx" } */
+
+typedef unsigned __int128 V __attribute__ ((vector_size (64)));
+
+V u, v;
+
+V
+foo (char c, short d, int e, long f, __int128 g)
+{
+  f >>= c & 63;
+  v = (V){f} == u;
+  return e + g + v;
+}
--- gcc/testsuite/gcc.dg/pr83620.c.jj   2018-01-15 19:16:31.953190203 +0100
+++ gcc/testsuite/gcc.dg/pr83620.c  2018-01-15 19:16:16.499185414 +0100
@@ -0,0 +1,9 @@
+/* PR rtl-optimization/86620 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -flive-range-shrinkage --param=max-sched-ready-insns=0" } */
+/* { dg-error "minimum value of parameter 'max-sched-ready-insns' is 1" "" { target *-*-* } 0 } */
+
+void
+foo (void)
+{
+}

Jakub


Re: [PATCH] rs6000: Wrap diff of immediates in const (PR83629)

2018-01-15 Thread Segher Boessenkool
On Wed, Jan 10, 2018 at 02:55:07PM +, Segher Boessenkool wrote:
> In various of our 32-bit load_toc patterns we take the difference of
> two immediates (labels) as a term to something bigger; but this isn't
> canonical RTL, it needs to be wrapped in CONST.
> 
> This fixes it.  Tested on powerpc64-linux {-m32,-m64}.  Committing.

I backported this (and its followup tweak, adding ilp32 to the new
testcase) to the GCC 7 branch.


Segher


> 2018-01-10  Segher Boessenkool  
> 
>   PR target/83629
>   * config/rs6000/rs6000.md (load_toc_v4_PIC_2, load_toc_v4_PIC_3b,
>   load_toc_v4_PIC_3c): Wrap const term in CONST RTL.
> 
> testsuite/
>   PR target/83629
>   * gcc.target/powerpc/pr83629.c: New testcase.


[PATCH v3, rs6000] Add -mspeculate-indirect-jumps option and implement non-speculating bctr / bctrl

2018-01-15 Thread Bill Schmidt
Hi,

This patch supersedes v2
(https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01204.html)
and fixes the problems noted in its review.  It also adds the test cases from
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01261.html and adjusts them
according to the results of the review.

More functionality is to be provided in a future patch: sibling calls for all
ABIs, and indirect calls for non-ELFv2 ABIs.  I'm getting close on that, but I
think it's better to keep that separate at this point.

Bootstrapped and tested on powerpc64-linux-gnu and powerpc64le-linux-gnu with no
regressions.  Is this okay for trunk?

Thanks,
Bill


[gcc]

2018-01-15  Bill Schmidt  

* config/rs6000/rs6000.c (rs6000_opt_vars): Add entry for
-mspeculate-indirect-jumps.
* config/rs6000/rs6000.md (*call_indirect_elfv2): Disable
for -mno-speculate-indirect-jumps.
(*call_indirect_elfv2_nospec): New define_insn.
(*call_value_indirect_elfv2): Disable for
-mno-speculate-indirect-jumps.
(*call_value_indirect_elfv2_nospec): New define_insn.
(indirect_jump): Emit different RTL for
-mno-speculate-indirect-jumps.
(*indirect_jump): Disable for
-mno-speculate-indirect-jumps.
(*indirect_jump_nospec): New define_insn.
(tablejump): Emit different RTL for
-mno-speculate-indirect-jumps.
(tablejumpsi): Disable for -mno-speculate-indirect-jumps.
(tablejumpsi_nospec): New define_expand.
(tablejumpdi): Disable for -mno-speculate-indirect-jumps.
(tablejumpdi_nospec): New define_expand.
(*tablejump_internal1): Disable for
-mno-speculate-indirect-jumps.
(*tablejump_internal1_nospec): New define_insn.
* config/rs6000/rs6000.opt (mspeculate-indirect-jumps): New
option.

[gcc/testsuite]

2018-01-15  Bill Schmidt  

* gcc.target/powerpc/safe-indirect-jump-1.c: New file.
* gcc.target/powerpc/safe-indirect-jump-2.c: New file.
* gcc.target/powerpc/safe-indirect-jump-3.c: New file.
* gcc.target/powerpc/safe-indirect-jump-4.c: New file.
* gcc.target/powerpc/safe-indirect-jump-5.c: New file.
* gcc.target/powerpc/safe-indirect-jump-6.c: New file.


Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 256364)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -36726,6 +36726,9 @@ static struct rs6000_opt_var const rs6000_opt_vars
   { "sched-epilog",
 offsetof (struct gcc_options, x_TARGET_SCHED_PROLOG),
 offsetof (struct cl_target_option, x_TARGET_SCHED_PROLOG), },
+  { "speculate-indirect-jumps",
+offsetof (struct gcc_options, x_rs6000_speculate_indirect_jumps),
+offsetof (struct cl_target_option, x_rs6000_speculate_indirect_jumps), },
 };
 
 /* Inner function to handle attribute((target("..."))) and #pragma GCC target
Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 256364)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -11222,11 +11222,22 @@
 (match_operand 1 "" "g,g"))
(set (reg:P TOC_REGNUM) (unspec:P [(match_operand:P 2 "const_int_operand" "n,n")] UNSPEC_TOCSLOT))
(clobber (reg:P LR_REGNO))]
-  "DEFAULT_ABI == ABI_ELFv2"
+  "DEFAULT_ABI == ABI_ELFv2 && rs6000_speculate_indirect_jumps"
   "b%T0l\; 2,%2(1)"
   [(set_attr "type" "jmpreg")
(set_attr "length" "8")])
 
+;; Variant with deliberate misprediction.
+(define_insn "*call_indirect_elfv2_nospec"
+  [(call (mem:SI (match_operand:P 0 "register_operand" "c,*l"))
+(match_operand 1 "" "g,g"))
+   (set (reg:P TOC_REGNUM) (unspec:P [(match_operand:P 2 "const_int_operand" "n,n")] UNSPEC_TOCSLOT))
+   (clobber (reg:P LR_REGNO))]
+  "DEFAULT_ABI == ABI_ELFv2 && !rs6000_speculate_indirect_jumps"
+  "crset eq\;beq%T0l-\; 2,%2(1)"
+  [(set_attr "type" "jmpreg")
+   (set_attr "length" "12")])
+
 (define_insn "*call_value_indirect_elfv2"
   [(set (match_operand 0 "" "")
(call (mem:SI (match_operand:P 1 "register_operand" "c,*l"))
@@ -11233,11 +11244,22 @@
  (match_operand 2 "" "g,g")))
(set (reg:P TOC_REGNUM) (unspec:P [(match_operand:P 3 "const_int_operand" "n,n")] UNSPEC_TOCSLOT))
(clobber (reg:P LR_REGNO))]
-  "DEFAULT_ABI == ABI_ELFv2"
+  "DEFAULT_ABI == ABI_ELFv2 && rs6000_speculate_indirect_jumps"
   "b%T1l\; 2,%3(1)"
   [(set_attr "type" "jmpreg")
(set_attr "length" "8")])
 
+; Variant with deliberate misprediction.
+(define_insn "*call_value_indirect_elfv2_nospec"
+  [(set (match_operand 0 "" "")
+   (call (mem:SI (match_operand:P 1 "register_operand" "c,*l"))
+ (match_operand 2 "" "g,g")))
+   (set (reg:P TOC_REGNUM) (unspec:P [(match_operand:P 3 "const_int_operand" "n,n")] UNSPEC_TOCSLOT))
+   (clobber (reg:P LR_REGNO))]
+  "DEFAULT_ABI == ABI_ELFv2 && !rs6000_speculate_indirect_jumps"
+  "crset 

[PATCH 0/5] GCC 7: x86: CVE-2017-5715, aka Spectre

2018-01-15 Thread H.J. Lu
This set of patches for GCC 7, backported from trunk, mitigates variant
#2 of the speculative execution vulnerabilities on x86 processors
identified by CVE-2017-5715, aka Spectre.  They convert indirect branches
and function returns to call and return thunks to avoid speculative
execution via indirect call, jmp and ret.
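
For readers who want to see what kind of source construct is affected, here is a small sketch of an indirect call.  The handler_fn, twice and dispatch names are made up for this example; the call through fn is an indirect branch, which is the kind of site the options in this series are meant to route through a thunk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type for this example.  */
typedef int (*handler_fn) (int);

/* A trivial target for the indirect call.  */
int
twice (int x)
{
  return 2 * x;
}

/* The call through FN is an indirect branch.  When the function is not
   inlined and the target is not known at compile time, this is the sort
   of call that -mindirect-branch=thunk is expected to convert into a
   call to a thunk.  */
int
dispatch (handler_fn fn, int arg)
{
  assert (fn != NULL);
  return fn (arg);
}
```

Compiling such a file on x86 with -O2 -mindirect-branch=thunk -mfunction-return=thunk and inspecting the assembly should show whether the indirect call and the returns were replaced; the exact output depends on the compiler version and is not reproduced here.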

Tested on Linux/i686 and Linux/x86-64.  Known issues:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83839

There are many test failures on Solaris due to lack of comdat support
in the Solaris linker.


H.J. Lu (5):
  x86: Add -mindirect-branch=
  x86: Add -mfunction-return=
  x86: Add -mindirect-branch-register
  x86: Add 'V' register operand modifier
  x86: Disallow -mindirect-branch=/-mfunction-return= with
-mcmodel=large

 gcc/config/i386/constraints.md |   6 +-
 gcc/config/i386/i386-opts.h|  13 +
 gcc/config/i386/i386-protos.h  |   2 +
 gcc/config/i386/i386.c | 823 -
 gcc/config/i386/i386.h |  10 +
 gcc/config/i386/i386.md|  69 +-
 gcc/config/i386/i386.opt   |  28 +
 gcc/config/i386/predicates.md  |  21 +-
 gcc/doc/extend.texi|  22 +
 gcc/doc/invoke.texi|  41 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-1.c   |  20 +
 gcc/testsuite/gcc.target/i386/indirect-thunk-10.c  |   7 +
 gcc/testsuite/gcc.target/i386/indirect-thunk-2.c   |  20 +
 gcc/testsuite/gcc.target/i386/indirect-thunk-3.c   |  21 +
 gcc/testsuite/gcc.target/i386/indirect-thunk-4.c   |  21 +
 gcc/testsuite/gcc.target/i386/indirect-thunk-5.c   |  17 +
 gcc/testsuite/gcc.target/i386/indirect-thunk-6.c   |  18 +
 gcc/testsuite/gcc.target/i386/indirect-thunk-7.c   |  44 ++
 gcc/testsuite/gcc.target/i386/indirect-thunk-8.c   |   7 +
 gcc/testsuite/gcc.target/i386/indirect-thunk-9.c   |   7 +
 .../gcc.target/i386/indirect-thunk-attr-1.c|  23 +
 .../gcc.target/i386/indirect-thunk-attr-10.c   |   9 +
 .../gcc.target/i386/indirect-thunk-attr-11.c   |   9 +
 .../gcc.target/i386/indirect-thunk-attr-2.c|  21 +
 .../gcc.target/i386/indirect-thunk-attr-3.c|  23 +
 .../gcc.target/i386/indirect-thunk-attr-4.c|  22 +
 .../gcc.target/i386/indirect-thunk-attr-5.c|  22 +
 .../gcc.target/i386/indirect-thunk-attr-6.c|  21 +
 .../gcc.target/i386/indirect-thunk-attr-7.c|  44 ++
 .../gcc.target/i386/indirect-thunk-attr-8.c|  42 ++
 .../gcc.target/i386/indirect-thunk-attr-9.c|   9 +
 .../gcc.target/i386/indirect-thunk-bnd-1.c |  20 +
 .../gcc.target/i386/indirect-thunk-bnd-2.c |  21 +
 .../gcc.target/i386/indirect-thunk-bnd-3.c |  19 +
 .../gcc.target/i386/indirect-thunk-bnd-4.c |  20 +
 .../gcc.target/i386/indirect-thunk-extern-1.c  |  19 +
 .../gcc.target/i386/indirect-thunk-extern-2.c  |  19 +
 .../gcc.target/i386/indirect-thunk-extern-3.c  |  20 +
 .../gcc.target/i386/indirect-thunk-extern-4.c  |  20 +
 .../gcc.target/i386/indirect-thunk-extern-5.c  |  16 +
 .../gcc.target/i386/indirect-thunk-extern-6.c  |  17 +
 .../gcc.target/i386/indirect-thunk-extern-7.c  |  43 ++
 .../gcc.target/i386/indirect-thunk-inline-1.c  |  20 +
 .../gcc.target/i386/indirect-thunk-inline-2.c  |  20 +
 .../gcc.target/i386/indirect-thunk-inline-3.c  |  21 +
 .../gcc.target/i386/indirect-thunk-inline-4.c  |  21 +
 .../gcc.target/i386/indirect-thunk-inline-5.c  |  17 +
 .../gcc.target/i386/indirect-thunk-inline-6.c  |  18 +
 .../gcc.target/i386/indirect-thunk-inline-7.c  |  44 ++
 .../gcc.target/i386/indirect-thunk-register-1.c|  22 +
 .../gcc.target/i386/indirect-thunk-register-2.c|  20 +
 .../gcc.target/i386/indirect-thunk-register-3.c|  19 +
 .../gcc.target/i386/indirect-thunk-register-4.c|  13 +
 gcc/testsuite/gcc.target/i386/ret-thunk-1.c|  13 +
 gcc/testsuite/gcc.target/i386/ret-thunk-10.c   |  23 +
 gcc/testsuite/gcc.target/i386/ret-thunk-11.c   |  23 +
 gcc/testsuite/gcc.target/i386/ret-thunk-12.c   |  22 +
 gcc/testsuite/gcc.target/i386/ret-thunk-13.c   |  22 +
 gcc/testsuite/gcc.target/i386/ret-thunk-14.c   |  22 +
 gcc/testsuite/gcc.target/i386/ret-thunk-15.c   |  22 +
 gcc/testsuite/gcc.target/i386/ret-thunk-16.c   |  18 +
 gcc/testsuite/gcc.target/i386/ret-thunk-17.c   |   7 +
 gcc/testsuite/gcc.target/i386/ret-thunk-18.c   |   8 +
 gcc/testsuite/gcc.target/i386/ret-thunk-19.c   |   8 +
 gcc/testsuite/gcc.target/i386/ret-thunk-2.c|  13 +
 gcc/testsuite/gcc.target/i386/ret-thunk-20.c   |   9 +
 gcc/testsuite/gcc.target/i386/ret-thunk-21.c   |   9 +
 gcc/testsuite/gcc.target/i386/ret-thunk-3.c|  12 +
 gcc/testsuite/gcc.target/i386/ret-thunk-4.c|  12 +
 gcc/testsuite/gcc.target/i386/ret-thunk-5.c|  15 +
 gcc/testsuite/gcc.

[PATCH 5/5] GCC 7: x86: Disallow -mindirect-branch=/-mfunction-return= with -mcmodel=large

2018-01-15 Thread H.J. Lu
Since the thunk function may not be reachable in large code model,
-mcmodel=large is incompatible with -mindirect-branch=thunk,
-mindirect-branch=thunk-extern, -mfunction-return=thunk and
-mfunction-return=thunk-extern.  Issue an error when they are used with
-mcmodel=large.

gcc/

Backport from mainline
* config/i386/i386.c (ix86_set_indirect_branch_type): Disallow
-mcmodel=large with -mindirect-branch=thunk,
-mindirect-branch=thunk-extern, -mfunction-return=thunk and
-mfunction-return=thunk-extern.
* doc/invoke.texi: Document -mcmodel=large is incompatible with
-mindirect-branch=thunk, -mindirect-branch=thunk-extern,
-mfunction-return=thunk and -mfunction-return=thunk-extern.

gcc/testsuite/

Backport from mainline
* gcc.target/i386/indirect-thunk-10.c: New test.
* gcc.target/i386/indirect-thunk-8.c: Likewise.
* gcc.target/i386/indirect-thunk-9.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-10.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-11.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-9.c: Likewise.
* gcc.target/i386/ret-thunk-17.c: Likewise.
* gcc.target/i386/ret-thunk-18.c: Likewise.
* gcc.target/i386/ret-thunk-19.c: Likewise.
* gcc.target/i386/ret-thunk-20.c: Likewise.
* gcc.target/i386/ret-thunk-21.c: Likewise.
---
 gcc/config/i386/i386.c | 26 ++
 gcc/doc/invoke.texi| 11 +
 gcc/testsuite/gcc.target/i386/indirect-thunk-10.c  |  7 ++
 gcc/testsuite/gcc.target/i386/indirect-thunk-8.c   |  7 ++
 gcc/testsuite/gcc.target/i386/indirect-thunk-9.c   |  7 ++
 .../gcc.target/i386/indirect-thunk-attr-10.c   |  9 
 .../gcc.target/i386/indirect-thunk-attr-11.c   |  9 
 .../gcc.target/i386/indirect-thunk-attr-9.c|  9 
 gcc/testsuite/gcc.target/i386/ret-thunk-17.c   |  7 ++
 gcc/testsuite/gcc.target/i386/ret-thunk-18.c   |  8 +++
 gcc/testsuite/gcc.target/i386/ret-thunk-19.c   |  8 +++
 gcc/testsuite/gcc.target/i386/ret-thunk-20.c   |  9 
 gcc/testsuite/gcc.target/i386/ret-thunk-21.c   |  9 
 13 files changed, 126 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-10.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-8.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-9.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-attr-10.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-attr-11.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-attr-9.c
 create mode 100644 gcc/testsuite/gcc.target/i386/ret-thunk-17.c
 create mode 100644 gcc/testsuite/gcc.target/i386/ret-thunk-18.c
 create mode 100644 gcc/testsuite/gcc.target/i386/ret-thunk-19.c
 create mode 100644 gcc/testsuite/gcc.target/i386/ret-thunk-20.c
 create mode 100644 gcc/testsuite/gcc.target/i386/ret-thunk-21.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 1bbdd0cc3f8..e758387dacd 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -7187,6 +7187,19 @@ ix86_set_indirect_branch_type (tree fndecl)
}
   else
cfun->machine->indirect_branch_type = ix86_indirect_branch;
+
+  /* -mcmodel=large is not compatible with -mindirect-branch=thunk
+nor -mindirect-branch=thunk-extern.  */
+  if ((ix86_cmodel == CM_LARGE || ix86_cmodel == CM_LARGE_PIC)
+ && ((cfun->machine->indirect_branch_type
+  == indirect_branch_thunk_extern)
+ || (cfun->machine->indirect_branch_type
+ == indirect_branch_thunk)))
+   error ("%<-mindirect-branch=%s%> and %<-mcmodel=large%> are not "
+  "compatible",
+  ((cfun->machine->indirect_branch_type
+== indirect_branch_thunk_extern)
+   ? "thunk-extern" : "thunk"));
 }
 
   if (cfun->machine->function_return_type == indirect_branch_unset)
@@ -7212,6 +7225,19 @@ ix86_set_indirect_branch_type (tree fndecl)
}
   else
cfun->machine->function_return_type = ix86_function_return;
+
+  /* -mcmodel=large is not compatible with -mfunction-return=thunk
+nor -mfunction-return=thunk-extern.  */
+  if ((ix86_cmodel == CM_LARGE || ix86_cmodel == CM_LARGE_PIC)
+ && ((cfun->machine->function_return_type
+  == indirect_branch_thunk_extern)
+ || (cfun->machine->function_return_type
+ == indirect_branch_thunk)))
+   error ("%<-mfunction-return=%s%> and %<-mcmodel=large%> are not "
+  "compatible",
+  ((cfun->machine->function_return_type
+== indirect_branch_thunk_extern)
+   ? "thunk-extern" : "thunk"));
 }
 }
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 1e572b1f9a2..6f3c3

[PATCH 4/5] GCC 7: x86: Add 'V' register operand modifier

2018-01-15 Thread H.J. Lu
Add 'V', a special modifier which prints the name of the full integer
register without '%'.  For

extern void (*func_p) (void);

void
foo (void)
{
  asm ("call __x86_indirect_thunk_%V0" : : "a" (func_p));
}

it generates:

foo:
        movq    func_p(%rip), %rax
        call    __x86_indirect_thunk_rax
        ret

gcc/

Backport from mainline
* config/i386/i386.c (print_reg): Print the name of the full
integer register without '%'.
(ix86_print_operand): Handle 'V'.
 * doc/extend.texi: Document 'V' modifier.

gcc/testsuite/

Backport from mainline
* gcc.target/i386/indirect-thunk-register-4.c: New test.
---
 gcc/config/i386/i386.c| 13 -
 gcc/doc/extend.texi   |  3 +++
 gcc/testsuite/gcc.target/i386/indirect-thunk-register-4.c | 13 +
 3 files changed, 28 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-register-4.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 8fb89027d97..1bbdd0cc3f8 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -17941,6 +17941,7 @@ put_condition_code (enum rtx_code code, machine_mode mode, bool reverse,
If CODE is 'h', pretend the reg is the 'high' byte register.
If CODE is 'y', print "st(0)" instead of "st", if the reg is stack op.
If CODE is 'd', duplicate the operand for AVX instruction.
+   If CODE is 'V', print naked full integer register name without %.
  */
 
 void
@@ -17951,7 +17952,7 @@ print_reg (rtx x, int code, FILE *file)
   unsigned int regno;
   bool duplicated;
 
-  if (ASSEMBLER_DIALECT == ASM_ATT)
+  if (ASSEMBLER_DIALECT == ASM_ATT && code != 'V')
 putc ('%', file);
 
   if (x == pc_rtx)
@@ -17999,6 +18000,14 @@ print_reg (rtx x, int code, FILE *file)
   return;
 }
 
+  if (code == 'V')
+{
+  if (GENERAL_REGNO_P (regno))
+   msize = GET_MODE_SIZE (word_mode);
+  else
+   error ("'V' modifier on non-integer register");
+}
+
   duplicated = code == 'd' && TARGET_AVX;
 
   switch (msize)
@@ -18118,6 +18127,7 @@ print_reg (rtx x, int code, FILE *file)
& -- print some in-use local-dynamic symbol name.
H -- print a memory address offset by 8; used for sse high-parts
Y -- print condition for XOP pcom* instruction.
+   V -- print naked full integer register name without %.
+ -- print a branch hint as 'cs' or 'ds' prefix
; -- print a semicolon (after prefixes due to bug in older gas).
~ -- print "i" if TARGET_AVX2, "f" otherwise.
@@ -18342,6 +18352,7 @@ ix86_print_operand (FILE *file, rtx x, int code)
case 'X':
case 'P':
case 'p':
+   case 'V':
  break;
 
case 's':
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 46e0a3623a6..9db9e0e27e9 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -8778,6 +8778,9 @@ The table below shows the list of supported modifiers and their effects.
 @tab @code{2}
 @end multitable
 
+@code{V} is a special modifier which prints the name of the full integer
+register without @code{%}.
+
 @anchor{x86floatingpointasmoperands}
 @subsubsection x86 Floating-Point @code{asm} Operands
 
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-register-4.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-register-4.c
new file mode 100644
index 000..f0cd9b75be8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-register-4.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mindirect-branch=keep -fno-pic" } */
+
+extern void (*func_p) (void);
+
+void
+foo (void)
+{
+  asm("call __x86_indirect_thunk_%V0" : : "a" (func_p));
+}
+
+/* { dg-final { scan-assembler "call\[ \t\]*__x86_indirect_thunk_eax" { target ia32 } } } */
+/* { dg-final { scan-assembler "call\[ \t\]*__x86_indirect_thunk_rax" { target { ! ia32 } } } } */
-- 
2.14.3



[PATCH 3/5] GCC 7: x86: Add -mindirect-branch-register

2018-01-15 Thread H.J. Lu
Add -mindirect-branch-register to force indirect branch via register.
This is implemented by disabling patterns of indirect branch via memory,
similar to TARGET_X32.

-mindirect-branch= and -mfunction-return= tests are updated with
-mno-indirect-branch-register to avoid false test failures when
-mindirect-branch-register is added to RUNTESTFLAGS for "make check".

gcc/

Backport from mainline
* config/i386/constraints.md (Bs): Disallow memory operand for
-mindirect-branch-register.
(Bw): Likewise.
* config/i386/predicates.md (indirect_branch_operand): Likewise.
(GOT_memory_operand): Likewise.
(call_insn_operand): Likewise.
(sibcall_insn_operand): Likewise.
(GOT32_symbol_operand): Likewise.
* config/i386/i386.md (indirect_jump): Call convert_memory_address
for -mindirect-branch-register.
(tablejump): Likewise.
(*sibcall_memory): Likewise.
(*sibcall_value_memory): Likewise.
Disallow peepholes of indirect call and jump via memory for
-mindirect-branch-register.
(*call_pop): Replace m with Bw.
(*call_value_pop): Likewise.
(*sibcall_pop_memory): Replace m with Bs.
* config/i386/i386.opt (mindirect-branch-register): New option.
* doc/invoke.texi: Document -mindirect-branch-register option.

gcc/testsuite/

Backport from mainline
* gcc.target/i386/indirect-thunk-1.c (dg-options): Add
-mno-indirect-branch-register.
* gcc.target/i386/indirect-thunk-2.c: Likewise.
* gcc.target/i386/indirect-thunk-3.c: Likewise.
* gcc.target/i386/indirect-thunk-4.c: Likewise.
* gcc.target/i386/indirect-thunk-5.c: Likewise.
* gcc.target/i386/indirect-thunk-6.c: Likewise.
* gcc.target/i386/indirect-thunk-7.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-1.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-2.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-3.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-4.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-5.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-6.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-7.c: Likewise.
* gcc.target/i386/indirect-thunk-bnd-1.c: Likewise.
* gcc.target/i386/indirect-thunk-bnd-2.c: Likewise.
* gcc.target/i386/indirect-thunk-bnd-3.c: Likewise.
* gcc.target/i386/indirect-thunk-bnd-4.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-1.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-2.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-3.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-4.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-5.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-6.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-7.c: Likewise.
* gcc.target/i386/indirect-thunk-inline-1.c: Likewise.
* gcc.target/i386/indirect-thunk-inline-2.c: Likewise.
* gcc.target/i386/indirect-thunk-inline-3.c: Likewise.
* gcc.target/i386/indirect-thunk-inline-4.c: Likewise.
* gcc.target/i386/indirect-thunk-inline-5.c: Likewise.
* gcc.target/i386/indirect-thunk-inline-6.c: Likewise.
* gcc.target/i386/indirect-thunk-inline-7.c: Likewise.
* gcc.target/i386/ret-thunk-10.c: Likewise.
* gcc.target/i386/ret-thunk-11.c: Likewise.
* gcc.target/i386/ret-thunk-12.c: Likewise.
* gcc.target/i386/ret-thunk-13.c: Likewise.
* gcc.target/i386/ret-thunk-14.c: Likewise.
* gcc.target/i386/ret-thunk-15.c: Likewise.
* gcc.target/i386/ret-thunk-9.c: Likewise.
* gcc.target/i386/indirect-thunk-register-1.c: New test.
* gcc.target/i386/indirect-thunk-register-2.c: Likewise.
* gcc.target/i386/indirect-thunk-register-3.c: Likewise.

i386: Rename to ix86_indirect_branch_register

Rename the variable for -mindirect-branch-register to
ix86_indirect_branch_register to match the command-line option name.

Backport from mainline
* config/i386/constraints.md (Bs): Replace
ix86_indirect_branch_thunk_register with
ix86_indirect_branch_register.
(Bw): Likewise.
* config/i386/i386.md (indirect_jump): Likewise.
(tablejump): Likewise.
(*sibcall_memory): Likewise.
(*sibcall_value_memory): Likewise.
Peepholes of indirect call and jump via memory: Likewise.
* config/i386/i386.opt: Likewise.
* config/i386/predicates.md (indirect_branch_operand): Likewise.
(GOT_memory_operand): Likewise.
(call_insn_operand): Likewise.
(sibcall_insn_operand): Likewise.
(GOT32_symbol_operand): Likewise.

x86: Rewrite ix86_indirect_branch_register logic

Rewrite ix86_indirect_branch_register logic with

(and (not (match_test "ix86_indirect_branch_register"))

[PATCH 2/5] GCC 7: x86: Add -mfunction-return=

2018-01-15 Thread H.J. Lu
Add -mfunction-return= option to convert function return to call and
return thunks.  The default is 'keep', which keeps function return
unmodified.  'thunk' converts function return to call and return thunk.
'thunk-inline' converts function return to inlined call and return thunk.
'thunk-extern' converts function return to external call and return
thunk provided in a separate object file.  You can control this behavior
for a specific function by using the function attribute function_return.

Function return thunk is the same as memory thunk for -mindirect-branch=
where the return address is at the top of the stack:

__x86_return_thunk:
call L2
L1:
pause
lfence
jmp L1
L2:
lea 8(%rsp), %rsp|lea 4(%esp), %esp
ret

and function return becomes

jmp __x86_return_thunk

-mindirect-branch= tests are updated with -mfunction-return=keep to
avoid false test failures when -mfunction-return=thunk is added to
RUNTESTFLAGS for "make check".

gcc/

Backport from mainline
* config/i386/i386-protos.h (ix86_output_function_return): New.
* config/i386/i386.c (ix86_set_indirect_branch_type): Also
set function_return_type.
(indirect_thunk_name): Add ret_p to indicate thunk for function
return.
(output_indirect_thunk_function): Pass false to
indirect_thunk_name.
(ix86_output_indirect_branch_via_reg): Likewise.
(ix86_output_indirect_branch_via_push): Likewise.
(output_indirect_thunk_function): Create alias for function
return thunk if regno < 0.
(ix86_output_function_return): New function.
(ix86_handle_fndecl_attribute): Handle function_return.
(ix86_attribute_table): Add function_return.
* config/i386/i386.h (machine_function): Add
function_return_type.
* config/i386/i386.md (simple_return_internal): Use
ix86_output_function_return.
(simple_return_internal_long): Likewise.
* config/i386/i386.opt (mfunction-return=): New option.
(indirect_branch): Mention -mfunction-return=.
* doc/extend.texi: Document function_return function attribute.
* doc/invoke.texi: Document -mfunction-return= option.

gcc/testsuite/

Backport from mainline
* gcc.target/i386/indirect-thunk-1.c (dg-options): Add
-mfunction-return=keep.
* gcc.target/i386/indirect-thunk-2.c: Likewise.
* gcc.target/i386/indirect-thunk-3.c: Likewise.
* gcc.target/i386/indirect-thunk-4.c: Likewise.
* gcc.target/i386/indirect-thunk-5.c: Likewise.
* gcc.target/i386/indirect-thunk-6.c: Likewise.
* gcc.target/i386/indirect-thunk-7.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-1.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-2.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-3.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-4.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-5.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-6.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-7.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-8.c: Likewise.
* gcc.target/i386/indirect-thunk-bnd-1.c: Likewise.
* gcc.target/i386/indirect-thunk-bnd-2.c: Likewise.
* gcc.target/i386/indirect-thunk-bnd-3.c: Likewise.
* gcc.target/i386/indirect-thunk-bnd-4.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-1.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-2.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-3.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-4.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-5.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-6.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-7.c: Likewise.
* gcc.target/i386/indirect-thunk-inline-1.c: Likewise.
* gcc.target/i386/indirect-thunk-inline-2.c: Likewise.
* gcc.target/i386/indirect-thunk-inline-3.c: Likewise.
* gcc.target/i386/indirect-thunk-inline-4.c: Likewise.
* gcc.target/i386/indirect-thunk-inline-5.c: Likewise.
* gcc.target/i386/indirect-thunk-inline-6.c: Likewise.
* gcc.target/i386/indirect-thunk-inline-7.c: Likewise.
* gcc.target/i386/ret-thunk-1.c: New test.
* gcc.target/i386/ret-thunk-10.c: Likewise.
* gcc.target/i386/ret-thunk-11.c: Likewise.
* gcc.target/i386/ret-thunk-12.c: Likewise.
* gcc.target/i386/ret-thunk-13.c: Likewise.
* gcc.target/i386/ret-thunk-14.c: Likewise.
* gcc.target/i386/ret-thunk-15.c: Likewise.
* gcc.target/i386/ret-thunk-16.c: Likewise.
* gcc.target/i386/ret-thunk-2.c: Likewise.
* gcc.target/i386/ret-thunk-3.c: Likewise.
* gcc.target/i386/ret-thunk-4.c: Likewise.
* gcc.target/i386/ret-thunk-5.c: Likewise.
* gcc.target/i386/ret-thunk-6.c:

[PATCH 1/5] GCC 7: x86: Add -mindirect-branch=

2018-01-15 Thread H.J. Lu
Add -mindirect-branch= option to convert indirect call and jump to call
and return thunks.  The default is 'keep', which keeps indirect call and
jump unmodified.  'thunk' converts indirect call and jump to call and
return thunk.  'thunk-inline' converts indirect call and jump to inlined
call and return thunk.  'thunk-extern' converts indirect call and jump to
external call and return thunk provided in a separate object file.  You
can control this behavior for a specific function by using the function
attribute indirect_branch.

2 kinds of thunks are generated.  Memory thunk where the function address
is at the top of the stack:

__x86_indirect_thunk:
call L2
L1:
pause
lfence
jmp L1
L2:
lea 8(%rsp), %rsp|lea 4(%esp), %esp
ret

Indirect jmp via memory, "jmp mem", is converted to

push memory
jmp __x86_indirect_thunk

Indirect call via memory, "call mem", is converted to

jmp L2
L1:
push [mem]
jmp __x86_indirect_thunk
L2:
call L1

Register thunk where the function address is in a register, reg:

__x86_indirect_thunk_reg:
call L2
L1:
pause
lfence
jmp L1
L2:
movq %reg, (%rsp)|movl %reg, (%esp)
ret

where reg is one of (r|e)ax, (r|e)dx, (r|e)cx, (r|e)bx, (r|e)si, (r|e)di,
(r|e)bp, r8, r9, r10, r11, r12, r13, r14 and r15.

Indirect jmp via register, "jmp reg", is converted to

jmp __x86_indirect_thunk_reg

Indirect call via register, "call reg", is converted to

call __x86_indirect_thunk_reg
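
For readers who want a source-level picture, here is a minimal sketch of code
that normally produces an indirect call.  It is not taken from the patch, and
binop_fn, handlers and dispatch are hypothetical names.  Whether a given
compiler actually emits an indirect call here depends on optimization level;
this only shows the construct that -mindirect-branch= is meant to cover:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handler type and handlers for this illustration.  */
typedef int (*binop_fn) (int, int);

static int add (int a, int b) { return a + b; }
static int sub (int a, int b) { return a - b; }

/* The selected handler is only known at run time.  */
static const binop_fn handlers[] = { add, sub };

/* The call through handlers[op] is an indirect call.  According to the
   description above, -mindirect-branch=thunk would route such a call
   through a call-and-return thunk instead of emitting it directly.  */
int
dispatch (size_t op, int a, int b)
{
  assert (op < sizeof handlers / sizeof handlers[0]);
  return handlers[op] (a, b);
}
```

Calling dispatch (0, 5, 3) adds the operands and dispatch (1, 5, 3) subtracts
them; the option does not change the result, only how the branch is emitted.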

gcc/

Backport from mainline
* config/i386/i386-opts.h (indirect_branch): New.
* config/i386/i386-protos.h (ix86_output_indirect_jmp): Likewise.
* config/i386/i386.c (ix86_using_red_zone): Disallow red-zone
with local indirect jump when converting indirect call and jump.
(ix86_set_indirect_branch_type): New.
(ix86_set_current_function): Call ix86_set_indirect_branch_type.
(indirectlabelno): New.
(indirect_thunk_needed): Likewise.
(indirect_thunk_bnd_needed): Likewise.
(indirect_thunks_used): Likewise.
(indirect_thunks_bnd_used): Likewise.
(INDIRECT_LABEL): Likewise.
(indirect_thunk_name): Likewise.
(output_indirect_thunk): Likewise.
(output_indirect_thunk_function): Likewise.
(ix86_output_indirect_branch_via_reg): Likewise.
(ix86_output_indirect_branch_via_push): Likewise.
(ix86_output_indirect_branch): Likewise.
(ix86_output_indirect_jmp): Likewise.
(ix86_code_end): Call output_indirect_thunk_function if needed.
(ix86_output_call_insn): Call ix86_output_indirect_branch if
needed.
(ix86_handle_fndecl_attribute): Handle indirect_branch.
(ix86_attribute_table): Add indirect_branch.
* config/i386/i386.h (machine_function): Add indirect_branch_type
and has_local_indirect_jump.
* config/i386/i386.md (indirect_jump): Set has_local_indirect_jump
to true.
(tablejump): Likewise.
(*indirect_jump): Use ix86_output_indirect_jmp.
(*tablejump_1): Likewise.
(simple_return_indirect_internal): Likewise.
* config/i386/i386.opt (mindirect-branch=): New option.
(indirect_branch): New.
(keep): Likewise.
(thunk): Likewise.
(thunk-inline): Likewise.
(thunk-extern): Likewise.
* doc/extend.texi: Document indirect_branch function attribute.
* doc/invoke.texi: Document -mindirect-branch= option.

gcc/testsuite/

Backport from mainline
* gcc.target/i386/indirect-thunk-1.c: New test.
* gcc.target/i386/indirect-thunk-2.c: Likewise.
* gcc.target/i386/indirect-thunk-3.c: Likewise.
* gcc.target/i386/indirect-thunk-4.c: Likewise.
* gcc.target/i386/indirect-thunk-5.c: Likewise.
* gcc.target/i386/indirect-thunk-6.c: Likewise.
* gcc.target/i386/indirect-thunk-7.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-1.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-2.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-3.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-4.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-5.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-6.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-7.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-8.c: Likewise.
* gcc.target/i386/indirect-thunk-bnd-1.c: Likewise.
* gcc.target/i386/indirect-thunk-bnd-2.c: Likewise.
* gcc.target/i386/indirect-thunk-bnd-3.c: Likewise.
* gcc.target/i386/indirect-thunk-bnd-4.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-1.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-2.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-3.c: Likewise.
* gcc.target/i386/indirect-

GCC 7 backport [PATCH 0/5] x86: CVE-2017-5715, aka Spectre

2018-01-15 Thread H.J. Lu
On Mon, Jan 15, 2018 at 8:53 AM, H.J. Lu  wrote:
> On Mon, Jan 15, 2018 at 3:38 AM, H.J. Lu  wrote:
>> On Mon, Jan 15, 2018 at 12:31 AM, Richard Biener
>>  wrote:
>>> On Sun, Jan 14, 2018 at 4:08 PM, H.J. Lu  wrote:
>>>> Now my patch set has been checked into trunk.  Here is a patch set
>>>> to move struct ix86_frame to machine_function on GCC 7, which is
>>>> needed to backport the patch set to GCC 7:
>>>>
>>>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01239.html
>>>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01240.html
>>>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01241.html
>>>>
>>>> OK for gcc-7-branch?
>>>
>>> Yes, backporting is ok - please watch for possible fallout on trunk and make
>>> sure to adjust the backport accordingly.  I plan to do GCC 7.3 RC1 on
>>> Wednesday now with the final release about a week later if no issue shows
>>> up.
>>>
>>
>> Backport is blocked by
>>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83838
>>
>> There are many test failures due to lack of comdat support in linker on
>> Solaris.  I can limit these tests to Linux.
>
> These are testcase issues and shouldn't block backport to GCC 7.
>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83839
>>
>> Bootstrap failed on Darwin due to lack of ".set" directive in assembler.  I
>> uploaded a patch:
>>
>> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43124
>>
>> There is no confirmation on it.  Also there may be test failures on Darwin
>> due to difference in assembly output.
>
> I posted a patch for Darwin build:
>
> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01347.html
>
> This needs to be checked into trunk before I can start backport to GCC 7.

Darwin build has been fixed.  I believe that Solaris issue should be
addressed by Solaris linker.

I backported all Spectre related patches to GCC 7:

https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01389.html
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01388.html
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01386.html
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01387.html
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01384.html
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01385.html

OK for gcc-7-branch?

Thanks.

-- 
H.J.


Fortran, committed: ICE on CLASS(*) function result (pr 82257)

2018-01-15 Thread Louis Krupp
Fixed in revision 256720.


Re: [C++ PATCH] Fix checking ICE in pt.c (PR c++/83817)

2018-01-15 Thread Jason Merrill
OK.

On Mon, Jan 15, 2018 at 4:58 PM, Jakub Jelinek  wrote:
> Hi!
>
> function in this case can be either a CALL_EXPR or AGGR_INIT_EXPR.
> CALL_FROM_THUNK_P macro is defined in tree.h and so knows just about the
> generic CALL_EXPR, and the C++ FE adds AGGR_INIT_FROM_THUNK_P macro, which
> is defined the same (protected_flag) except for the checking, one requires
> a CALL_EXPR, another one AGGR_INIT_EXPR.  So, this spot seemed to do the
> right thing actually when doing release checking, just in non-release
> checking it would ICE if function is AGGR_INIT_EXPR.  From the
> AGGR_INIT_FROM_THUNK_P flag we later on set CALL_FROM_THUNK_P when we later
> generate the CALL_EXPR.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2018-01-15  Jakub Jelinek  
>
> PR c++/83817
> * pt.c (tsubst_copy_and_build) : If function
> is AGGR_INIT_EXPR rather than CALL_EXPR, set AGGR_INIT_FROM_THUNK_P
> instead of CALL_FROM_THUNK_P.
>
> * g++.dg/cpp1y/pr83817.C: New test.
>
> --- gcc/cp/pt.c.jj  2018-01-11 18:58:48.365391793 +0100
> +++ gcc/cp/pt.c 2018-01-15 18:32:44.433150762 +0100
> @@ -17819,7 +17819,10 @@ tsubst_copy_and_build (tree t,
> CALL_EXPR_REVERSE_ARGS (function) = rev;
> if (thk)
>   {
> -   CALL_FROM_THUNK_P (function) = true;
> +   if (TREE_CODE (function) == CALL_EXPR)
> + CALL_FROM_THUNK_P (function) = true;
> +   else
> + AGGR_INIT_FROM_THUNK_P (function) = true;
> /* The thunk location is not interesting.  */
> SET_EXPR_LOCATION (function, UNKNOWN_LOCATION);
>   }
> --- gcc/testsuite/g++.dg/cpp1y/pr83817.C.jj 2018-01-15 18:34:37.494143930 +0100
> +++ gcc/testsuite/g++.dg/cpp1y/pr83817.C 2018-01-15 18:34:05.212145878 +0100
> @@ -0,0 +1,17 @@
> +// PR c++/83817
> +// { dg-do compile { target c++14 } }
> +
> +struct A;
> +struct B { template  using C = A; };
> +struct D : B { struct F { typedef C E; }; };
> +struct G {
> +  struct I { I (D, A &); } h;
> +  D::F::E &k ();
> +  D j;
> +  G (G &&) : h (j, k ()) {}
> +};
> +struct N { G l; };
> +typedef N (*M)(N &);
> +struct H { const char *o; M s; };
> +N foo (N &);
> +H r { "", [](auto &x) { return foo (x); }};
>
> Jakub
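
The point about the two macros sharing one flag but differing in their checks
can be sketched in plain C.  This is a hypothetical miniature, not GCC's real
tree representation: struct node, the node codes and the accessor functions
are all invented for illustration, and assert stands in for the tree checking
that non-release builds perform.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical node codes standing in for CALL_EXPR and AGGR_INIT_EXPR.  */
enum node_code { CALL_NODE, AGGR_INIT_NODE };

struct node
{
  enum node_code code;
  bool protected_flag;	/* Shared storage for both "from thunk" bits.  */
};

/* Each accessor verifies the node code before handing out the flag.
   With NDEBUG defined the check vanishes and both accessors behave the
   same, which mirrors why the old code only failed with checking on.  */
static bool *
call_from_thunk_p (struct node *n)
{
  assert (n->code == CALL_NODE);
  return &n->protected_flag;
}

static bool *
aggr_init_from_thunk_p (struct node *n)
{
  assert (n->code == AGGR_INIT_NODE);
  return &n->protected_flag;
}

/* Pick the accessor that matches the node code, as the fix does.  */
void
mark_from_thunk (struct node *n)
{
  if (n->code == CALL_NODE)
    *call_from_thunk_p (n) = true;
  else
    *aggr_init_from_thunk_p (n) = true;
}
```

Using the wrong accessor on an AGGR_INIT_NODE would trip the assertion in a
checking build while silently working in a release build, which matches the
behavior described in the quoted message.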


Re: [C++ PATCH] Fix ICE in member_vec_dedup (PR c++/83825)

2018-01-15 Thread Jason Merrill
OK.

On Mon, Jan 15, 2018 at 4:46 PM, Jakub Jelinek  wrote:
> Hi!
>
> As the testcase shows, calls to member_vec_dedup and qsort are just guarded
> by the vector being non-NULL, which doesn't mean it must be non-empty,
> so we can't do (*member_vec)[0] on it.  Fixed by the second hunk, the
> rest is just a small cleanup to use the vec.h methods.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2018-01-15  Jakub Jelinek  
>
> PR c++/83825
> * name-lookup.c (member_vec_dedup): Return early if len is 0.
> (resort_type_member_vec, set_class_bindings,
> insert_late_enum_def_bindings): Use vec qsort method instead of
> calling qsort directly.
>
> * g++.dg/template/pr83825.C: New test.
>
> --- gcc/cp/name-lookup.c.jj 2018-01-03 13:16:38.537848205 +0100
> +++ gcc/cp/name-lookup.c 2018-01-15 14:07:09.494576944 +0100
> @@ -1520,8 +1520,7 @@ resort_type_member_vec (void *obj, void
>  {
>resort_data.new_value = new_value;
>resort_data.cookie = cookie;
> -  qsort (member_vec->address (), member_vec->length (),
> -sizeof (tree), resort_member_name_cmp);
> +  member_vec->qsort (resort_member_name_cmp);
>  }
>  }
>
> @@ -1597,6 +1596,9 @@ member_vec_dedup (vec *memb
>unsigned len = member_vec->length ();
>unsigned store = 0;
>
> +  if (!len)
> +return;
> +
>tree current = (*member_vec)[0], name = OVL_NAME (current);
>tree next = NULL_TREE, next_name = NULL_TREE;
>for (unsigned jx, ix = 0; ix < len;
> @@ -1712,8 +1714,7 @@ set_class_bindings (tree klass, unsigned
>if (member_vec)
>  {
>CLASSTYPE_MEMBER_VEC (klass) = member_vec;
> -  qsort (member_vec->address (), member_vec->length (),
> -sizeof (tree), member_name_cmp);
> +  member_vec->qsort (member_name_cmp);
>member_vec_dedup (member_vec);
>  }
>  }
> @@ -1741,8 +1742,7 @@ insert_late_enum_def_bindings (tree klas
>else
> member_vec_append_class_fields (member_vec, klass);
>CLASSTYPE_MEMBER_VEC (klass) = member_vec;
> -  qsort (member_vec->address (), member_vec->length (),
> -sizeof (tree), member_name_cmp);
> +  member_vec->qsort (member_name_cmp);
>member_vec_dedup (member_vec);
>  }
>  }
> --- gcc/testsuite/g++.dg/template/pr83825.C.jj 2018-01-15 14:16:55.289432205 +0100
> +++ gcc/testsuite/g++.dg/template/pr83825.C 2018-01-15 14:14:03.172490348 +0100
> @@ -0,0 +1,13 @@
> +// PR c++/83825
> +// { dg-do compile }
> +
> +template 
> +class A {};// { dg-error "shadows template parameter" }
> +
> +template 
> +class B
> +{
> +  void foo () { A  a; }
> +};
> +
> +template void B <0>::foo ();
>
> Jakub
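
The empty-vector hazard described in the quoted message can be shown with a
small stand-alone sketch.  It is not the GCC function: dedup_sorted is a
hypothetical name and it works on a plain int array rather than a vec of
trees.  It only illustrates why the length must be checked before element 0
is read.

```c
#include <assert.h>
#include <stddef.h>

/* Remove adjacent duplicates from a sorted array in place and return
   the new length.  Hypothetical stand-in for the pattern above.  */
size_t
dedup_sorted (int *v, size_t len)
{
  /* The caller only guarantees a non-NULL vector ...  */
  assert (v != NULL);

  /* ... which may still be empty, so v[0] must not be read before
     this check.  */
  if (len == 0)
    return 0;

  int current = v[0];
  size_t store = 1;
  for (size_t ix = 1; ix < len; ix++)
    if (v[ix] != current)
      {
	current = v[ix];
	v[store++] = current;
      }
  return store;
}
```

Without the early return, the read of v[0] on an empty array would be out of
bounds, which is the same class of bug the second hunk of the patch fixes.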


Re: [PATCH] handle multiple flexible array members (PR 83588)

2018-01-15 Thread Jason Merrill
OK.

On Sun, Jan 14, 2018 at 6:47 PM, Martin Sebor  wrote:
> The attached patch fixes PR c++/83588 - struct with two flexible
> arrays causes an internal compiler error.  The ICE is caused by
> the same assertion in varasm.c that has led to other similar
> reports in the past:
>
>   /* Given a non-empty initialization, this field had better
>  be last.  Given a flexible array member, the next field
>  on the chain is a TYPE_DECL of the enclosing struct.  */
>   const_tree next = DECL_CHAIN (local->field);
>   gcc_assert (!fieldsize || !next || TREE_CODE (next) != FIELD_DECL);
>
> The fix is simply to also detect when a class defines more than
> one flexible array member and treat the subsequent array as any
> other member, and reject such class definitions to make sure they
> never reach the assertion above.
>
> Martin
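
For context, a minimal sketch of the well-formed case in C: a struct with a
single flexible array member, which has to be the last member.  The names
struct packet and packet_new are hypothetical and not from the patch or the
PR; the sketch only shows the layout rule that the assertion in varasm.c
relies on.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A flexible array member must be the last member of its struct, so a
   struct can have at most one of them.  */
struct packet
{
  size_t len;
  unsigned char data[];	/* Flexible array member.  */
};

/* Allocate a packet with room for LEN payload bytes and copy SRC in.
   Returns NULL if the allocation fails.  */
struct packet *
packet_new (const unsigned char *src, size_t len)
{
  assert (src != NULL || len == 0);

  struct packet *p = malloc (sizeof *p + len);
  if (p == NULL)
    return NULL;

  p->len = len;
  if (len != 0)
    memcpy (p->data, src, len);
  return p;
}
```

A second flexible array after data would no longer be last, which is the kind
of definition the patch rejects so that it never reaches the assertion.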


[PATCH] RISC-V: Increase mult/div cost if not implemented in hardware.

2018-01-15 Thread Jim Wilson
This increases the cost of multiply and divide when they are not implemented
in hardware.  This makes it more likely that a multiply by a constant gets
replaced by a sequence of shifts and adds, which is faster than a call to a
libgcc routine.  The divide cost change doesn't do anything useful at present,
but is added for consistency.
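
As a small worked example of the transformation this cost change is meant to
encourage (an illustration only; mul10_shift_add and mul10_self_check are
hypothetical names and the constant 10 is chosen arbitrarily), a multiply by
10 can be written as two shifts and one add, since 10 * x = 8 * x + 2 * x:

```c
#include <assert.h>
#include <stdint.h>

/* Multiply by the constant 10 using only shifts and adds:
   10 * x == (x << 3) + (x << 1).  Unsigned arithmetic wraps, so the
   identity also holds modulo 2^32.  */
uint32_t
mul10_shift_add (uint32_t x)
{
  return (x << 3) + (x << 1);
}

/* Check the identity against an ordinary multiply for a range of
   inputs.  */
void
mul10_self_check (void)
{
  for (uint32_t x = 0; x < 1000; x++)
    assert (mul10_shift_add (x) == 10u * x);
}
```

On a target without a hardware multiplier, the shift-and-add form avoids a
call into libgcc; the higher MULT cost is what lets the compiler prefer it.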

This was tested with a make check-gcc.  There were no regressions.

Committed.

Jim

2018-01-15  Andrew Waterman  
gcc/
* config/riscv/riscv.c (riscv_rtx_costs) : Increase cost if
!TARGET_MUL.
: Increase cost if !TARGET_DIV.
---
 gcc/config/riscv/riscv.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index d260c0ebae1..19a01e0825a 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -1615,6 +1615,9 @@ riscv_rtx_costs (rtx x, machine_mode mode, int outer_code, int opno ATTRIBUTE_UN
 case MULT:
   if (float_mode_p)
*total = tune_info->fp_mul[mode == DFmode];
+  else if (!TARGET_MUL)
+   /* Estimate the cost of a library call.  */
+   *total = COSTS_N_INSNS (speed ? 32 : 6);
   else if (GET_MODE_SIZE (mode) > UNITS_PER_WORD)
*total = 3 * tune_info->int_mul[0] + COSTS_N_INSNS (2);
   else if (!speed)
@@ -1635,7 +1638,10 @@ riscv_rtx_costs (rtx x, machine_mode mode, int outer_code, int opno ATTRIBUTE_UN
 
 case UDIV:
 case UMOD:
-  if (speed)
+  if (!TARGET_DIV)
+   /* Estimate the cost of a library call.  */
+   *total = COSTS_N_INSNS (speed ? 32 : 6);
+  else if (speed)
*total = tune_info->int_div[mode == DImode];
   else
*total = COSTS_N_INSNS (1);
-- 
2.14.1



Re: [PATCH] handle multiple flexible array members (PR 83588)

2018-01-15 Thread Martin Sebor

On 01/15/2018 07:10 PM, Jason Merrill wrote:
> OK.

Thanks.  I keep forgetting to get approval to backport these
simple bug fixes.  Is this one okay for the 7 and 6 branches?

Martin

> On Sun, Jan 14, 2018 at 6:47 PM, Martin Sebor  wrote:
>> The attached patch fixes PR c++/83588 - struct with two flexible
>> arrays causes an internal compiler error.  The ICE is caused by
>> the same assertion in varasm.c that has led to other similar
>> reports in the past:
>>
>>   /* Given a non-empty initialization, this field had better
>>  be last.  Given a flexible array member, the next field
>>  on the chain is a TYPE_DECL of the enclosing struct.  */
>>   const_tree next = DECL_CHAIN (local->field);
>>   gcc_assert (!fieldsize || !next || TREE_CODE (next) != FIELD_DECL);
>>
>> The fix is simply to also detect when a class defines more than
>> one flexible array member and treat the subsequent array as any
>> other member, and reject such class definitions to make sure they
>> never reach the assertion above.
>>
>> Martin




[PATCH, doc] NDS32: Add -mext-perf -mext-perf2 and -mext-string in the documentation

2018-01-15 Thread Chung-Ju Wu

Hi, all,

In this patch of nds32 port:
https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01585.html

We add new options for NDS32 target.
So we need to update documentation as well.
The patch is attached and the plaintext ChangeLog is as follows:

gcc/ChangeLog

* doc/invoke.texi (NDS32 Options): Add -mext-perf, -mext-perf2 and
-mext-string options.


Committed as https://gcc.gnu.org/r256564

I will also cover these changes in the release notes at
htdocs/gcc-8/changes.html with another wwwdoc patch soon.

Best regards,
jasonwucj
From 8b225b1b3f3b425ea180ba4a932e4201a95903c0 Mon Sep 17 00:00:00 2001
From: Chung-Ju Wu 
Date: Fri, 10 Nov 2017 11:42:00 +0800
Subject: [PATCH] Add new nds32 options "-mext-perf", "-mext-perf2", and
 "-mext-string" in the documentation.

---
 gcc/doc/invoke.texi | 24 +---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c443c66..a9449a8 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -941,7 +941,9 @@ Objective-C and Objective-C++ Dialects}.
 @gccoptlist{-mbig-endian  -mlittle-endian @gol
 -mreduced-regs  -mfull-regs @gol
 -mcmov  -mno-cmov @gol
--mperf-ext  -mno-perf-ext @gol
+-mext-perf  -mno-ext-perf @gol
+-mext-perf2  -mno-ext-perf2 @gol
+-mext-string  -mno-ext-string @gol
 -mv3push  -mno-v3push @gol
 -m16bit  -mno-16bit @gol
 -misr-vector-size=@var{num} @gol
@@ -21304,14 +21306,30 @@ Generate conditional move instructions.
 @opindex mno-cmov
 Do not generate conditional move instructions.
 
-@item -mperf-ext
+@item -mext-perf
 @opindex mperf-ext
 Generate performance extension instructions.
 
-@item -mno-perf-ext
+@item -mno-ext-perf
 @opindex mno-perf-ext
 Do not generate performance extension instructions.
 
+@item -mext-perf2
+@opindex mperf-ext
+Generate performance extension 2 instructions.
+
+@item -mno-ext-perf2
+@opindex mno-perf-ext
+Do not generate performance extension 2 instructions.
+
+@item -mext-string
+@opindex mperf-ext
+Generate string extension instructions.
+
+@item -mno-ext-string
+@opindex mno-perf-ext
+Do not generate string extension instructions.
+
 @item -mv3push
 @opindex mv3push
 Generate v3 push25/pop25 instructions.
-- 
1.8.3.1



Re: [PATCH] handle multiple flexible array members (PR 83588)

2018-01-15 Thread Jason Merrill
On Mon, Jan 15, 2018 at 10:05 PM, Martin Sebor  wrote:
> On 01/15/2018 07:10 PM, Jason Merrill wrote:
>>
>> OK.
>
> Thanks.  I keep forgetting to get approval to backport these
> simple bug fixes.  Is this one okay for the 7 and 6 branches?

Yes.

Jason


Re: [PATCH][arm] XFAIL advsimd-intrinsics/vld1x2.c

2018-01-15 Thread Kugan Vivekanandarajah
Hi Kyrill,

Sorry for the breakage and thanks for fixing the testcase.

Thanks,
Kugan

On 12 January 2018 at 02:33, Kyrill Tkachov wrote:

> Hi all,
>
> This recently added test fails on arm. We haven't implemented these
> intrinsics for arm (any volunteers?) so for now let's XFAIL these on
> that target.  Also, the float64 versions of these intrinsics are not
> supposed to be available on arm so this patch slightly adjusts the
> test to not include them for aarch32.  In any case the entire test is
> XFAILed on arm, so this doesn't have any noticeable effect.
>
> The same number of tests (PASS) still occur on aarch64 but now they
> appear as XFAIL rather than FAIL on arm.
>
> Ok for trunk? (from an aarch64 perspective).
>
> Thanks,
> Kyrill
>
> 2018-01-11  Kyrylo Tkachov  
>
> * gcc.target/aarch64/advsimd-intrinsics/vld1x2.c: Make float64
> tests specific to aarch64.  XFAIL test on arm.
>


Re: GCC 7 backport [PATCH 0/5] x86: CVE-2017-5715, aka Spectre

2018-01-15 Thread Jan Hubicka
> On Mon, Jan 15, 2018 at 8:53 AM, H.J. Lu  wrote:
> > On Mon, Jan 15, 2018 at 3:38 AM, H.J. Lu  wrote:
> >> On Mon, Jan 15, 2018 at 12:31 AM, Richard Biener
> >>  wrote:
> >>> On Sun, Jan 14, 2018 at 4:08 PM, H.J. Lu  wrote:
> >>>> Now my patch set has been checked into trunk.  Here is a patch set
> >>>> to move struct ix86_frame to machine_function on GCC 7, which is
> >>>> needed to backport the patch set to GCC 7:
> >>>>
> >>>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01239.html
> >>>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01240.html
> >>>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01241.html
> >>>>
> >>>> OK for gcc-7-branch?
> >>>
> >>> Yes, backporting is ok - please watch for possible fallout on trunk and 
> >>> make
> >>> sure to adjust the backport accordingly.  I plan to do GCC 7.3 RC1 on
> >>> Wednesday now with the final release about a week later if no issue shows
> >>> up.
> >>>
> >>
> >> Backport is blocked by
> >>
> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83838
> >>
> >> There are many test failures due to lack of comdat support in linker on 
> >> Solaris.
> >> I can limit these tests to Linux.
> >
> > These are testcase issues and shouldn't block backport to GCC 7.
> >
> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83839
> >>
> >> Bootstrap failed on Darwin due to lack of ".set" directive in assembler.
> >> I uploaded a patch:
> >>
> >> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43124
> >>
> >> There is no confirmation on it.  Also there may be test failures on Darwin
> >> due to difference in assembly output.
> >
> > I posted a patch for Darwin build:
> >
> > https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01347.html
> >
> > This needs to be checked into trunk before I can start backport to GCC 7.
> 
> Darwin build has been fixed.  I believe that Solaris issue should be
> addressed by Solaris linker.
> 
> I backported all Spectre related patches to GCC 7:
> 
> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01389.html
> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01388.html
> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01386.html
> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01387.html
> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01384.html
> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01385.html
> 
> OK for gcc-7-branch?

OK,
thanks!
Honza
> 
> Thanks.
> 
> -- 
> H.J.

