Re: [102/nnn] poly_int: vect_permute_load/store_chain

2017-11-20 Thread Jeff Law
On 10/23/2017 11:41 AM, Richard Sandiford wrote:
> The GET_MODE_NUNITS patch made vect_grouped_store_supported and
> vect_grouped_load_supported check for a constant number of elements,
> so vect_permute_store_chain and vect_permute_load_chain can assert
> for that.  This patch adds commentary to that effect; the actual
> asserts will be added by a later, more mechanical, patch.
> 
> The patch also reorganises the function so that the asserts
> are linked specifically to code that builds permute vectors
> element-by-element.  This allows a later patch to add support
> for some variable-length permutes.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * tree-vect-data-refs.c (vect_permute_store_chain): Reorganize
>   so that both the length == 3 and length != 3 cases set up their
>   own permute vectors.  Add comments explaining why we know the
>   number of elements is constant.
>   (vect_permute_load_chain): Likewise.
This is OK.  Obviously it can't go in until if/when the whole it is ack'd.

Jeff


Re: [099/nnn] poly_int: struct_value_size

2017-11-20 Thread Jeff Law
On 10/23/2017 11:40 AM, Richard Sandiford wrote:
> This patch makes calls.c treat struct_value_size (one of the
> operands to a call pattern) as polynomial.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * calls.c (emit_call_1, expand_call): Change struct_value_size from
>   a HOST_WIDE_INT to a poly_int64.
This is OK.  Obviously it can't go in until if/when the whole it is ack'd.

Jeff


Re: [105/nnn] poly_int: expand_assignment

2017-11-20 Thread Jeff Law
On 10/23/2017 11:42 AM, Richard Sandiford wrote:
> This patch makes the CONCAT handing in expand_assignment cope with
> polynomial mode sizes.  The mode of the CONCAT must be complex,
> so we can base the tests on the sizes of the real and imaginary
> components.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * expr.c (expand_assignment): Cope with polynomial mode sizes
>   when assigning to a CONCAT.
This is OK.  Obviously it can't go in until if/when the whole it is ack'd.

Jeff


Re: [106/nnn] poly_int: GET_MODE_BITSIZE

2017-11-20 Thread Jeff Law
On 10/23/2017 11:43 AM, Richard Sandiford wrote:
> This patch changes GET_MODE_BITSIZE from an unsigned short
> to a poly_uint16.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * machmode.h (mode_to_bits): Return a poly_uint16 rather than an
>   unsigned short.
>   (GET_MODE_BITSIZE): Return a constant if ONLY_FIXED_SIZE_MODES,
>   or if measurement_type is polynomial.
>   * calls.c (shift_return_value): Treat GET_MODE_BITSIZE as polynomial.
>   * combine.c (make_extraction): Likewise.
>   * dse.c (find_shift_sequence): Likewise.
>   * dwarf2out.c (mem_loc_descriptor): Likewise.
>   * expmed.c (store_integral_bit_field, extract_bit_field_1): Likewise.
>   (extract_bit_field, extract_low_bits): Likewise.
>   * expr.c (convert_move, convert_modes, emit_move_insn_1): Likewise.
>   (optimize_bitfield_assignment_op, expand_assignment): Likewise.
>   (store_field, expand_expr_real_1): Likewise.
>   * fold-const.c (optimize_bit_field_compare, merge_ranges): Likewise.
>   * gimple-fold.c (optimize_atomic_compare_exchange_p): Likewise.
>   * reload.c (find_reloads): Likewise.
>   * reload1.c (alter_reg): Likewise.
>   * stor-layout.c (bitwise_mode_for_mode, compute_record_mode): Likewise.
>   * targhooks.c (default_secondary_memory_needed_mode): Likewise.
>   * tree-if-conv.c (predicate_mem_writes): Likewise.
>   * tree-ssa-strlen.c (handle_builtin_memcmp): Likewise.
>   * tree-vect-patterns.c (adjust_bool_pattern): Likewise.
>   * tree-vect-stmts.c (vectorizable_simd_clone_call): Likewise.
>   * valtrack.c (dead_debug_insert_temp): Likewise.
>   * varasm.c (mergeable_constant_section): Likewise.
>   * config/sh/sh.h (LOCAL_ALIGNMENT): Use as_a .
> 
> gcc/ada/
>   * gcc-interface/misc.c (enumerate_modes): Treat GET_MODE_BITSIZE
>   as polynomial.
> 
> gcc/c-family/
>   * c-ubsan.c (ubsan_instrument_shift): Treat GET_MODE_BITSIZE
>   as polynomial.
This is OK.  Obviously it can't go in until if/when the whole it is ack'd.

Jeff


Re: [107/nnn] poly_int: GET_MODE_SIZE

2017-11-20 Thread Jeff Law
On 10/23/2017 11:43 AM, Richard Sandiford wrote:
> This patch changes GET_MODE_SIZE from unsigned short to poly_uint16.
> The non-mechanical parts were handled by previous patches.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * machmode.h (mode_size): Change from unsigned short to
>   poly_uint16_pod.
>   (mode_to_bytes): Return a poly_uint16 rather than an unsigned short.
>   (GET_MODE_SIZE): Return a constant if ONLY_FIXED_SIZE_MODES,
>   or if measurement_type is not polynomial.
>   (fixed_size_mode::includes_p): Check for constant-sized modes.
>   * genmodes.c (emit_mode_size_inline): Make mode_size_inline
>   return a poly_uint16 rather than an unsigned short.
>   (emit_mode_size): Change the type of mode_size from unsigned short
>   to poly_uint16_pod.  Use ZERO_COEFFS for the initializer.
>   (emit_mode_adjustments): Cope with polynomial vector sizes.
>   * lto-streamer-in.c (lto_input_mode_table): Use bp_unpack_poly_value
>   for GET_MODE_SIZE.
>   * lto-streamer-out.c (lto_write_mode_table): Use bp_pack_poly_value
>   for GET_MODE_SIZE.
>   * auto-inc-dec.c (try_merge): Treat GET_MODE_SIZE as polynomial.
>   * builtins.c (expand_ifn_atomic_compare_exchange_into_call): Likewise.
>   * caller-save.c (setup_save_areas): Likewise.
>   (replace_reg_with_saved_mem): Likewise.
>   * calls.c (emit_library_call_value_1): Likewise.
>   * combine-stack-adj.c (combine_stack_adjustments_for_block): Likewise.
>   * combine.c (simplify_set, make_extraction, simplify_shift_const_1)
>   (gen_lowpart_for_combine): Likewise.
>   * convert.c (convert_to_integer_1): Likewise.
>   * cse.c (equiv_constant, cse_insn): Likewise.
>   * cselib.c (autoinc_split, cselib_hash_rtx): Likewise.
>   (cselib_subst_to_values): Likewise.
>   * dce.c (word_dce_process_block): Likewise.
>   * df-problems.c (df_word_lr_mark_ref): Likewise.
>   * dwarf2cfi.c (init_one_dwarf_reg_size): Likewise.
>   * dwarf2out.c (multiple_reg_loc_descriptor, mem_loc_descriptor)
>   (concat_loc_descriptor, concatn_loc_descriptor, loc_descriptor)
>   (rtl_for_decl_location): Likewise.
>   * emit-rtl.c (gen_highpart, widen_memory_access): Likewise.
>   * expmed.c (extract_bit_field_1, extract_integral_bit_field): Likewise.
>   * expr.c (emit_group_load_1, clear_storage_hints): Likewise.
>   (emit_move_complex, emit_move_multi_word, emit_push_insn): Likewise.
>   (expand_expr_real_1): Likewise.
>   * function.c (assign_parm_setup_block_p, assign_parm_setup_block)
>   (pad_below): Likewise.
>   * gimple-fold.c (optimize_atomic_compare_exchange_p): Likewise.
>   * gimple-ssa-store-merging.c (rhs_valid_for_store_merging_p): Likewise.
>   * ira.c (get_subreg_tracking_sizes): Likewise.
>   * ira-build.c (ira_create_allocno_objects): Likewise.
>   * ira-color.c (coalesced_pseudo_reg_slot_compare): Likewise.
>   (ira_sort_regnos_for_alter_reg): Likewise.
>   * ira-costs.c (record_operand_costs): Likewise.
>   * lower-subreg.c (interesting_mode_p, simplify_gen_subreg_concatn)
>   (resolve_simple_move): Likewise.
>   * lra-constraints.c (get_reload_reg, operands_match_p): Likewise.
>   (process_addr_reg, simplify_operand_subreg, lra_constraints): Likewise.
>   (CONST_POOL_OK_P): Reject variable-sized modes.
>   * lra-spills.c (slot, assign_mem_slot, pseudo_reg_slot_compare)
>   (add_pseudo_to_slot, lra_spill): Likewise.
>   * omp-low.c (omp_clause_aligned_alignment): Likewise.
>   * optabs-query.c (get_best_extraction_insn): Likewise.
>   * optabs-tree.c (expand_vec_cond_expr_p): Likewise.
>   * optabs.c (expand_vec_perm, expand_vec_cond_expr): Likewise.
>   (expand_mult_highpart, valid_multiword_target_p): Likewise.
>   * recog.c (offsettable_address_addr_space_p): Likewise.
>   * regcprop.c (maybe_mode_change): Likewise.
>   * reginfo.c (choose_hard_reg_mode, record_subregs_of_mode): Likewise.
>   * regrename.c (build_def_use): Likewise.
>   * regstat.c (dump_reg_info): Likewise.
>   * reload.c (complex_word_subreg_p, push_reload, find_dummy_reload)
>   (find_reloads, find_reloads_subreg_address): Likewise.
>   * reload1.c (eliminate_regs_1): Likewise.
>   * rtlanal.c (for_each_inc_dec_find_inc_dec, rtx_cost): Likewise.
>   * simplify-rtx.c (avoid_constant_pool_reference): Likewise.
>   (simplify_binary_operation_1, simplify_subreg): Likewise.
>   * targhooks.c (default_function_arg_padding): Likewise.
>   (default_hard_regno_nregs, default_class_max_nregs): Likewise.
>   * tree-cfg.c (verify_gimple_assign_binary): Likewise.
>   (verify_gimple_assign_ternary): Likewise.
>   * tree-inline.c (estimate_move_cost): Likewise.
>   * 

Re: [patch] remove cilk-plus

2017-11-20 Thread Jeff Law
On 11/16/2017 10:02 AM, Koval, Julia wrote:
> Thanks for your comments, fixed it.
> 
> 2017-11-16  Julia Koval  
>   Sebastian Peryt  
> 
>   * Makefile.def (target_modules): Remove libcilkrts.
>   * Makefile.in: Ditto.
>   * configure: Ditto.
>   * configure.ac: Ditto.
> 
> contrib/
>   * contrib/gcc_update: Ditto.
> 
> gcc/
>   * Makefile.in (cilkplus.def, cilk-builtins.def, c-family/cilk.o, 
>   c-family/c-cilkplus.o, c-family/array-notation-common.o,
>   cilk-common.o, cilk.h, cilk-common.c): Remove.
>   * builtin-types.def
>   (BT_FN_INT_PTR_PTR_PTR_FTYPE_BT_INT_BT_PTR_BT_PTR_BT_PTR): Remove.
>   * builtins.c (is_builtin_name): Remove cilkplus condition.
>   (BUILT_IN_CILK_DETACH, BUILT_IN_CILK_POP_FRAME): Remove.
>   * builtins.def (DEF_CILK_BUILTIN_STUB, DEF_CILKPLUS_BUILTIN,
>   cilk-builtins.def, cilkplus.def): Remove.
>   * cif-code.def (CILK_SPAWN): Remove.
>   * cilk-builtins.def: Delete.
>   * cilk-common.c: Ditto.
>   * cilk.h: Ditto.
>   * cilkplus.def: Ditto.
>   * config/darwin.h (fcilkplus): Delete.
>   * cppbuiltin.c: Ditto.
>   * doc/extend.texi: Remove cilkplus doc.
>   * doc/generic.texi: Ditto.
>   * doc/invoke.texi: Ditto.
>   * doc/passes.texi: Ditto.
>   * gcc.c (fcilkplus): Remove.
>   * gengtype.c (cilk.h): Remove.
>   * gimple-pretty-print.c (dump_gimple_omp_for): Remove cilkplus support.
>   * gimple.h (GF_OMP_FOR_KIND_CILKFOR, GF_OMP_FOR_KIND_CILKSIMD): Remove.
>   * gimplify.c (gimplify_return_expr, maybe_fold_stmt, gimplify_call_expr,
>   is_gimple_stmt, gimplify_modify_expr, gimplify_scan_omp_clauses,
>   gimplify_adjust_omp_clauses, gimplify_omp_for, gimplify_expr): Remove
>   cilkplus conditions.
>   * ipa-fnsummary.c (ipa_dump_fn_summary, compute_fn_summary,
>   inline_read_section): Ditto.
>   * ipa-inline-analysis.c (cilk.h): Remove.
>   * ira.c (ira_setup_eliminable_regset): Remove cilkplus support.
>   * lto-wrapper.c (merge_and_complain, append_compiler_options,
>   append_linker_options): Remove condition for fcilkplus.
>   * lto/lto-lang.c (cilk.h): Remove.
>   (lto_init): Remove condition for fcilkplus.
>   * omp-expand.c (expand_cilk_for_call): Delete.
>   (expand_omp_taskreg, expand_omp_for_static_chunk,
>   expand_omp_for): Remove cilkplus
>   conditions.
>   (expand_cilk_for): Delete.
>   * omp-general.c (omp_extract_for_data): Remove cilkplus support.
>   * omp-low.c (scan_sharing_clauses, create_omp_child_function,
>   execute_lower_omp, diagnose_sb_0): Ditto.
>   * omp-simd-clone.c (simd_clone_clauses_extract): Ditto.
>   * tree-core.h (OMP_CLAUSE__CILK_FOR_COUNT_): Delete.
>   * tree-nested.c: Ditto.
>   * tree-pretty-print.c (dump_omp_clause): Remove cilkplus support.
>   (dump_generic_node): Ditto.
>   * tree.c (OMP_CLAUSE__CILK_FOR_COUNT_): Delete.
>   * tree.def (cilk_simd, cilk_for, cilk_spawn_stmt, cilk_sync_stmt): 
> Delete.
>   * tree.h (CILK_SPAWN_FN, EXPR_CILK_SPAWN): Delete.
> 
> gcc/c-family/
>   * array-notation-common.c: Delete.
>   * c-cilkplus.c: Ditto.
>   * c-common.c (_Cilk_spawn, _Cilk_sync, _Cilk_for): Remove.
>   * c-common.def (ARRAY_NOTATION_REF): Remove.
>   * c-common.h (RID_CILK_SPAWN, build_array_notation_expr,
>   build_array_notation_ref, C_ORT_CILK, c_check_cilk_loop,
>   c_validate_cilk_plus_loop, cilkplus_an_parts, 
> cilk_ignorable_spawn_rhs_op,
>   cilk_recognize_spawn): Remove.
>   * c-gimplify.c (CILK_SPAWN_STMT): Remove.
>   * c-omp.c: Remove CILK_SIMD check.
>   * c-pragma.c: Ditto.
>   * c-pragma.h: Remove CILK related pragmas.
>   * c-pretty-print.c (c_pretty_printer::postfix_expression): Remove
>   ARRAY_NOTATION_REF condition.
>   (c_pretty_printer::expression): Ditto.
>   * c.opt (fcilkplus): Remove.
>   * cilk.c: Delete.
> 
> gcc/c/
>   * Make-lang.in (c/c-array-notation.o): Remove.
>   * c-array-notation.c: Delete.
>   * c-decl.c: Remove cilkplus condition.
>   * c-parser.c (c_parser_cilk_simd, c_parser_cilk_for,
>   c_parser_cilk_verify_simd, c_parser_array_notation,
>   c_parser_cilk_clause_vectorlength, c_parser_cilk_grainsize,
>   c_parser_cilk_simd_fn_vector_attrs,
>   c_finish_cilk_simd_fn_tokens): Delete.
>   (c_parser_declaration_or_fndef): Remove cilkplus condition.
>   (c_parser_direct_declarator_inner): Ditto.
>   (CILK_SIMD_FN_CLAUSE_MASK): Delete.
>   (c_parser_attributes, c_parser_compound_statement,
>   c_parser_statement_after_labels, c_parser_if_statement,
>   c_parser_switch_statement, c_parser_while_statement,
>   c_parser_do_statement, c_parser_for_statement,
>   c_parser_unary_expression, c_parser_postfix_expression,
>   c_parser_postfix_expression_after_primary,
>  

[PING] Plugin support on Windows/MinGW

2017-11-20 Thread Boris Kolpackov
Hi,

I would like to ping this patch:

https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01040.html

The changes are fairly conservative: they do not touch much of the
existing module implementation and plugin support on MinGW is disabled
by default (there are also fixes for a couple of bugs as a bonus).
I would really liked to have this patch considered for GCC 8, if
possible.

Thanks,
Boris


libgo patch committed: Fix Makefile bug setting LD_LIBRARY_PATH

2017-11-20 Thread Ian Lance Taylor
This patch by Than McIntosh fixes a small bug in the libgo Makefile
recipe that constructs the directory from which to pick up
libgcc_s.so; the gccgo invocation with -print-libgcc-file-name was
missing the flags, which meant that for -m32 builds we'd see the
64-bit libgcc dir.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 254983)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-5485b3faed476f6d051833d1790b5f77be9d1efc
+fecb92bda0aa6d70c89d14635ff568df77d2bb5f
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/Makefile.am
===
--- libgo/Makefile.am   (revision 254729)
+++ libgo/Makefile.am   (working copy)
@@ -1001,7 +1001,7 @@ CHECK = \
export MAKE; \
NM="$(NM)"; \
export NM; \
-   libgccdir=`${GOC} -print-libgcc-file-name | sed -e 's|/[^/]*$$||'`; \
+   libgccdir=`${GOC} ${GOCFLAGS} -print-libgcc-file-name | sed -e 
's|/[^/]*$$||'`; \

LD_LIBRARY_PATH="`${PWD_COMMAND}`/.libs:$${libgccdir}:${LD_LIBRARY_PATH}"; \
LD_LIBRARY_PATH=`echo $${LD_LIBRARY_PATH} | sed 
's,::*,:,g;s,^:*,,;s,:*$$,,'`; \
export LD_LIBRARY_PATH; \


Go patch committed: report error for ++/-- applied to a non-numeric type

2017-11-20 Thread Ian Lance Taylor
This patch to the Go frontend reports an error when ++ or -- is used
with a non-numeric type.  This avoids a later compiler crash.  This
fixes GCC PR 83071.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 254748)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-cb5dc1ce98857884a2215c461dd1d7de530f9f5e
+5485b3faed476f6d051833d1790b5f77be9d1efc
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/statements.cc
===
--- gcc/go/gofrontend/statements.cc (revision 254748)
+++ gcc/go/gofrontend/statements.cc (working copy)
@@ -1826,6 +1826,11 @@ Statement*
 Inc_dec_statement::do_lower(Gogo*, Named_object*, Block*, Statement_inserter*)
 {
   Location loc = this->location();
+  if (!this->expr_->type()->is_numeric_type())
+{
+  this->report_error("increment or decrement of non-numeric type");
+  return Statement::make_error_statement(loc);
+}
   Expression* oexpr = Expression::make_integer_ul(1, this->expr_->type(), loc);
   Operator op = this->is_inc_ ? OPERATOR_PLUSEQ : OPERATOR_MINUSEQ;
   return Statement::make_assignment_operation(op, this->expr_, oexpr, loc);


Re: [PATCH 1/7]: SVE: Add CLOBBER_HIGH expression

2017-11-20 Thread Jeff Law
On 11/20/2017 08:04 AM, Alan Hayward wrote:
>>>
>>> Yes, that was an early alternative option for the patch.
>>>
>>> With that it would effect every operation that uses SVE registers. A simple
>>> add of two registers now has 4 inputs and two outputs. It would get in the
>>> way when debugging any sve dumps and be generally annoying.
>>> Possible that the code for that in would all be in the aarch64 target,
>>> (making everyone else happy!) But I suspect that there would be still be
>>> strange dependency issues that’d need sorting in the common code.
>>>
>>> Whereas with this patch, there are no new oddities in non-tls 
>>> compiles/dumps.
>>> Although the patch touches a lot of files, the changes are mostly restricted
>>> to places where standard clobbers were already being checked.
>> I'm not entirely sure that it would require doubling the number of
>> inputs/outputs.  It's not conceptually much different than how we
>> describe DImode operations on 32bit targets.  The mode selects one or
>> more consecutive registers, so you don't actually need anything weird in
>> your patterns.  This is pretty standard stuff.
> 
> Ok, fair enough.
> 
>>
>>
>> It would be painful in that the Neon regs would have to interleave with
>> the upper part of the SVE regs in terms of register numbers.  It would
>> also mean that you couldn't tie together multiple neon regs into
>> something wider.  I'm not sure if the latter would be an issue or not.
> 
> And there’s also the weirdness that the register would not be split evenly - 
> it’ll be a TI
> reg followed by a reg of the size of multiple TIs.
> 
> All of that has the potential to complicate all non-sve aarch64 code.
Agreed.  Let's drop this line of exploration.


> 
>>
>> You might also look at TARGET_HARD_REGNO_CALL_PART_CLOBBERED.  I'd
>> totally forgotten about it.  And in fact it seems to come pretty close
>> to what you need…
> 
> Yes, some of the code is similar to the way
> TARGET_HARD_REGNO_CALL_PART_CLOBBERED works. Both that code and the
> CLOBBER expr code served as a starting point for writing the patch. The main 
> difference
> here, is that _PART_CLOBBERED is around all calls and is not tied to a 
> specific Instruction,
> it’s part of the calling abi. Whereas clobber_high is explicitly tied to an 
> expression (tls_desc).
> It meant there wasn’t really any opportunity to resume any existing code.
Understood.  Though your first patch mentions that you're trying to
describe partial preservation "around TLS calls".  Presumably those are
represented as normal insns, not call_insn.

That brings me back to Richi's idea of exposing a set of the low subreg
to itself using whatever mode is wide enough to cover the neon part of
the register.

That should tell the generic parts of the compiler that you're just
clobbering the upper part and at least in theory you can implement in
the aarch64 backend and the rest of the compiler should "just work"
because that's the existing semantics of a subreg store.

The only worry would be if a pass tried to get overly smart and
considered that kind of set a nop -- but I think I'd argue that's simply
wrong given the semantics of a partial store.

jeff







Re: [PATCH] Fix ICEs from expand_mul_overflow (PR target/82981)

2017-11-20 Thread Jeff Law
On 11/20/2017 02:07 PM, Jakub Jelinek wrote:
> Hi!
> 
> Apparently ARM can do the widening SImode multiply only using a libcall,
> so the recently changed expand_mul_overflow ignores it at first, then
> sees possibility to use the HImode code, but uses expand_simple_binop
> with OPTAB_DIRECT which requires actual HImode optabs, which ARM doesn't
> have, it needs to use widening to SImode.
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?
> 
> 2017-11-20  Jakub Jelinek  
> 
>   PR target/82981
>   * internal-fn.c (expand_mul_overflow): Use OPTAB_WIDEN instead of
>   OPTAB_DIRECT in calls to expand_simple_binop.
OK.
jeff


Re: [PATCH] Fix ICE with __RTL and -g (PR debug/82933)

2017-11-20 Thread Jeff Law
On 11/20/2017 02:10 PM, Jakub Jelinek wrote:
> Hi!
> 
> The dwarf2out.c code relies on the assembly_start debug hook being
> invoked before any RTL is processed from final.c (and needs it to be
> done just once).  Normally it is called from cgraphunit.c, but when
> __RTL is seen with a starting pass, run_rtl_passes is called already
> from the FE.  While it would be better to defer the rtl finalization
> until cgraph says so, it might be quite hard, so instead this patch
> hacks dwarf2out_assembly_start so that it can be invoked multiple times
> (and does nothing on the 2nd+ call) and invokes it from the run_rtl_passes
> function too.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2017-11-20  Jakub Jelinek  
> 
>   PR debug/82933
>   * run-rtl-passes.c: Include debug.h.
>   (run_rtl_passes): Call debug_hooks->assembly_start.
>   * dwarf2out.c (dwarf2out_assembly_start): Return early if invoked
>   multiple times.
> 
>   * gcc.dg/rtl/x86_64/pr82933.c: New test.
OK.
jeff


Re: Patch ping

2017-11-20 Thread Jim Wilson

On 11/20/2017 02:58 PM, Jim Wilson wrote:
The dwarf2out.c patch looks good to me, though the testcase does not 
fail on unpatched mainline anymore.  I had to go back to the 2017-10-22 
snapshot to see the failure.  There was a followup from Mark Wielaard 
mentioning that elfutils still fails on mainline, so maybe we can get a 
testcase from there.


The testcase was "broken" by Jan's big patch on Nov 3
https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00104.html
It didn't seem useful to look into that, so I tried running 
multidelta/creduce on the elfutils sources, and managed to get a small 
testcase that still fails with mainline.  I then fixed it to get rid of 
the warnings.


#include 
extern int c (void);
int
a(int b) {
  if (c())
abort();
}

There is no guarantee that this testcase will still show the problem in 
the future, but it seems good enough for now.


OK with this testcase.  And with your testcase too if you think that 
still has value.


Jim



Re: [PING][patch] PR81794: have "would be stringified in traditional C" warning in libcpp/macro.c be controlled by -Wtraditional

2017-11-20 Thread David Malcolm
On Wed, 2017-10-25 at 14:45 -0400, Eric Gallager wrote:
> On Sat, Sep 30, 2017 at 8:05 PM, Eric Gallager 
> wrote:
> > On Fri, Sep 29, 2017 at 11:15 AM, David Malcolm  > m> wrote:
> > > On Sun, 2017-09-17 at 20:00 -0400, Eric Gallager wrote:
> > > > Attached is a version of
> > > > https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00481.html that
> > > > contains
> > > > a combination of both the fix and the testcase update, as
> > > > requested
> > > > in
> > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81794#c2
> > > > 
> > > > I had to use a different computer than I usually use to send
> > > > this
> > > > email, as the hard drive that originally had this patch is
> > > > currently
> > > > unresponsive. Since it's also the one with my ssh keys on it, I
> > > > can't
> > > > commit with it. Sorry if the ChangeLogs get mangled.
> > > 
> > > Thanks for putting this together; sorry about the delay in
> > > reviewing
> > > it.
> > > 
> > > The patch mostly looks good.
> > > 
> > > Did you perform a full bootstrap and run of the testsuite with
> > > this
> > > patch?  If so, it's best to state this in the email, so that we
> > > know
> > > that the patch has survived this level of testing.
> > 
> > Yes, I bootstrapped with it, but I haven't done a full run of the
> > testsuite with it yet; just the one testcase I updated.
> 
> Update: I've now run the testsuite with it; test results are here:
> https://gcc.gnu.org/ml/gcc-testresults/2017-10/msg01751.html
> I'm pretty sure all the FAILs are unrelated to this patch.
> 
> > 
> > > 
> > > Some nits below:
> > > 
> > > > libcpp/ChangeLog:
> > > > 
> > > > 2017-03-24  Eric Gallager  
> > > > 
> > > >  * macro.c (check_trad_stringification): Have warning be
> > > > controlled by
> > > >  -Wtraditional.
> > > > 
> > > > gcc/testsuite/ChangeLog:
> > > > 
> > > > 2017-09-17  Eric Gallager  
> > > > 
> > > > PR preprocessor/81794
> > > > * gcc.dg/pragma-diag-7.c: Update to include check for
> > > > stringification.
> > > > 
> > > > On Sat, May 6, 2017 at 11:33 AM, Eric Gallager  > > > u.edu>
> > > > wrote:
> > > > > Pinging this: https://gcc.gnu.org/ml/gcc-patches/2017-03/msg0
> > > > > 1325.h
> > > > > tml
> > > > > 
> > > > > On 3/24/17, Eric Gallager  wrote:
> > > > > > It seemed odd to me that gcc was issuing a warning about
> > > > > > compatibility
> > > > > > with traditional C that I couldn't turn off by
> > > > > > pushing/popping
> > > > > > -Wtraditional over the problem area, so I made the attached
> > > > > > (minor)
> > > > > > patch to fix it. Survives bootstrap, but the only testing
> > > > > > I've
> > > > > > done
> > > > > > with it has been compiling the one file that was giving me
> > > > > > issues
> > > > > > previously, which I'd need to reduce further to turn it
> > > > > > into a
> > > > > > proper
> > > > > > test case.
> > > > > > 
> > > > > > Thanks,
> > > > > > Eric Gallager
> > > > > > 
> > > > > > libcpp/ChangeLog:
> > > > > > 
> > > > > > 2017-03-24  Eric Gallager  
> > > > > > 
> > > > > >   * macro.c (check_trad_stringification): Have warning
> > > > > > be
> > > > > > controlled by
> > > > > >   -Wtraditional.
> > > > > > 
> > > > > 
> > > > > So I did the reducing I mentioned above and now have a
> > > > > testcase for
> > > > > it; it was pretty similar to the one from here:
> > > > > https://gcc.gnu.org/ml/gcc-patches/2017-03/msg01319.html
> > > > > so I combined them into a single testcase and have attached
> > > > > the
> > > > > combined version. I can confirm that the testcase passes with
> > > > > my
> > > > > patch
> > > > > applied.
> > > 
> > > [...]
> > > 
> > > > diff --git a/gcc/testsuite/gcc.dg/pragma-diag-7.c
> > > > b/gcc/testsuite/gcc.dg/pragma-diag-7.c
> > > > index 402ee56..e06c410 100644
> > > > --- a/gcc/testsuite/gcc.dg/pragma-diag-7.c
> > > > +++ b/gcc/testsuite/gcc.dg/pragma-diag-7.c
> > > > @@ -7,3 +7,16 @@ unsigned long bad = 1UL; /* { dg-warning
> > > > "suffix" } */
> > > >  /* Note the extra space before the pragma on this next line:
> > > > */
> > > >   #pragma GCC diagnostic pop
> > > >  unsigned long ok_again = 2UL; /* { dg-bogus "suffix" } */
> > > > +
> > > > +/* Redundant with the previous pop, but just shows that it
> > > > fails to stop the
> > > > + * following warning with an unpatched GCC: */
> > > > +#pragma GCC diagnostic ignored "-Wtraditional"
> > > > +
> > > > +/* { dg-bogus "would be stringified" .+1 } */
> > > 
> > > As far as I can tell, this dg-bogus line doesn't actually get
> > > matched;
> > > when I run the testsuite without the libcpp fix, I get:
> > > 
> > >   FAIL: gcc.dg/pragma-diag-7.c (test for excess errors)
> > > 
> > > If I update the dg-bogus line to read:
> > > 
> > >   /* { dg-bogus "would be stringified" "" { target *-*-* } .+1 }
> > > */
> > > 
> > > then it's matched, and I get:
> > > 
> > > 

Re: [PATCH] detect nonstring arguments to string functions (PR 82945)

2017-11-20 Thread Jeff Law
On 11/19/2017 03:26 PM, Martin Sebor wrote:
>>> I have seen it in the early IL but this late I couldn't come
>>> up with a test case where such a loop would be necessary.
>>> The COMPONENT_REF always refers to the member array.  If you
>>> can show me an example to test with I'll add the handling if
>>> possible.  There are cases where it isn't, such as in
>>>
>>>     struct A { char a[4] __attribute__ ((nonstring)); } *pa;
>>>
>>>     strlen (pa->a + 1);
>>>
>>> where pa->a is represented as a MEM_REF (char[4], pa, 1) with
>>> the member information having been lost in translation.  (This
>>> is the same problem that I had solved for the -Warray-bounds
>>> warning for out-of bounds offsets by implementing it in
>>> tree-ssa-forwprop.c: because that's where the member is folded
>>> into a reference to the base object.)
>> Set up a nested structure then reference it like a.b.c.d.
> 
> I did that.  From my reply to Jakub:
> 
>   https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01671.html
> 
> with
> 
>   struct A { char a[4], b[4] __attribute__ ((nonstring)); };
>   struct B { struct A a; };
>   struct C { struct B b; };
> 
>   void f (struct C *p)
>   {
>     __builtin_strcpy (p->b.a.a, p->b.a.b);
>   }
> 
> in get_attr_nonstring_decl() where EXPR is p->a.b.b it is
> a COMPONENT_REF (COMPONENT_REF (..,), FIELD_DECL (b)).  I.e.,
> the outermost FIELD_DECL is the one of interest.
> 
> The code extracts TREE_OPERAND (decl, 1) from this outermost
> COMPONENT_REF, which is FIELD_DECL (b).
> 
> What test case gives a structure where it's necessary to loop
> over the components to get at the field?  I can't think of one.
Bah.  It's so damn easy to forget that the outermost COMPONENT_REF
refers to the last component in the expression.

Can you incorporate the lazy initialization from Jakub and commit?

Jeff


Re: [PATCH] Don't split call from its call arg location

2017-11-20 Thread Jeff Law
On 11/20/2017 04:04 AM, Claudiu Zissulescu wrote:
> When a port is using hardware loops (like ARC) makes use of
> reorg_loops to find and analyze loops that end in loop_end
> instructions. The very same function can be set to reorder the cfg
> such that the loop end occurs after the loop start. This task is
> performed by reorder_loops function which at its turn calls
> cfg_layout_finalize -> fixup_reoreder_chain ->
> force_nonfallthru_and_redirect (cfgrtl.c:1476).   However, the latter
> is splitting a call and its corresponding CALL_ARG_LOCATION note,
> leading to an assert in dwarf2out_var_location() at dwarf2out.c:26391.
> 
> Original post:
> https://gcc.gnu.org/ml/gcc/2017-11/msg00110.html
> 
> Ok to apply?
> Claudiu
> 
> gcc/
> 2017-11-20  Claudiu Zissulescu  
> 
>   * cfgrtl.c (force_nonfallthru_and_redirect): Don't split a call
>   and its corresponding call arg location note.
OK, but please add the test from the original message you posted to
gcc.target/arc with suitable dg-options.

jeff


Re: [014/nnn] poly_int: indirect_refs_may_alias_p

2017-11-20 Thread Jeff Law
On 11/20/2017 06:00 AM, Richard Sandiford wrote:
> Jeff Law  writes:
>> On 10/23/2017 11:05 AM, Richard Sandiford wrote:
>>> This patch makes indirect_refs_may_alias_p use ranges_may_overlap_p
>>> rather than ranges_overlap_p.  Unlike the former, the latter can handle
>>> negative offsets, so the fix for PR44852 should no longer be necessary.
>>> It can also handle offset_int, so avoids unchecked truncations to
>>> HOST_WIDE_INT.
>>>
>>>
>>> 2017-10-23  Richard Sandiford  
>>> Alan Hayward  
>>> David Sherwood  
>>>
>>> gcc/
>>> * tree-ssa-alias.c (indirect_ref_may_alias_decl_p)
>>> (indirect_refs_may_alias_p): Use ranges_may_overlap_p
>>> instead of ranges_overlap_p.
>> OK.
>>
>> Note that this highlighted a nit in patch 001 -- namely that there's new
>> function templates that aren't mentioned in the ChangeLog.
> 
> Do you mean ranges_may_overlap_p?  I can add that and the other new
> poly-int.h functions to the changelog if you think it's useful,
> but I thought for new files it was more usual just to do:
> 
>   * foo.h: New file.
That's fine.  I was just having trouble finding it when I wanted to look
at it.  My mailer won't unwrap a compressed patch :-)

Jeff


[Patch][aarch64][committed] Fix test broken by patch for PR target/81356

2017-11-20 Thread Steve Ellcey
My patch for PR target/81356 caused the test gfortran.dg/pr45636.f90
to generate an XPASS instead of an XFAIL.  This change seems correct
since we no longer define TARGET_USE_BY_PIECES_INFRASTRUCTURE_P
so I am going to check this patch in as obvious in order to make
the test a PASS instead of XPASS by removing aarch64 from the list
of expected failures.

Steve Ellcey
sell...@cavium.com


2017-11-20  Steve Ellcey  

PR target/81356
* gfortran.dg/pr45636.f90 (aarch64*-*-*): Remove from xfail list.


diff --git a/gcc/testsuite/gfortran.dg/pr45636.f90 
b/gcc/testsuite/gfortran.dg/pr45636.f90
index 38ec634..958833c 100644
--- a/gcc/testsuite/gfortran.dg/pr45636.f90
+++ b/gcc/testsuite/gfortran.dg/pr45636.f90
@@ -12,4 +12,4 @@ program main
 end program main
 ! This test will fail on targets which prefer memcpy/memset over
 ! move_by_pieces/store_by_pieces.
-! { dg-final { scan-tree-dump-times "memset" 0 "forwprop2" { xfail { { 
hppa*-*-* && { ! lp64 } } || { { mips*-*-* && { ! nomips16 } } || { 
aarch64*-*-* } } } } } }
+! { dg-final { scan-tree-dump-times "memset" 0 "forwprop2" { xfail { { 
hppa*-*-* && { ! lp64 } } || { mips*-*-* && { ! nomips16 } } } } } }


Re: [PATCH][aarch64] Fix pr81356 - copy empty string with wrz, not a ldrb/strb

2017-11-20 Thread Steve Ellcey
On Mon, 2017-11-20 at 15:24 +0100, Christophe Lyon wrote:
> 
> As a result of this patch, we now have:
> XPASS: gfortran.dg/pr45636.f90   -O   scan-tree-dump-times forwprop2
> "memset" 0
> instead of:
> XFAIL: gfortran.dg/pr45636.f90   -O   scan-tree-dump-times forwprop2
> "memset" 0 (found 2 times)
> 
> Is it expected?
> 
> Christophe

I think it is.  I hadn't seen this failure before but I may not have
tested my change with Fortran.  I am going to consider fixing this an
obvious change and will check in the fix (removing aarch64 from the
list of xfails) today.

Steve Ellcey
sell...@cavium.com


Re: Patch ping

2017-11-20 Thread Jim Wilson

On 11/19/2017 11:55 PM, Jakub Jelinek wrote:

I'd like to ping the following patches:

   http://gcc.gnu.org/ml/gcc-patches/2017-10/msg01895.html
   PR debug/82718
   Fix DWARF5 .debug_loclist handling with hot/cold partitioning


I already responded to this one.
https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01268.html

The dwarf2out.c patch looks good to me, though the testcase does not 
fail on unpatched mainline anymore.  I had to go back to the 2017-10-22 
snapshot to see the failure.  There was a followup from Mark Wielaard 
mentioning that elfutils still fails on mainline, so maybe we can get a 
testcase from there.


Jim


Small C++ PATCH to avoid duplicate visibility warning

2017-11-20 Thread Jason Merrill
We've been warning about a base with different visibility while
walking the fields list as well as the base list.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit de7f9f640df9dc1e181284009ba48ac281bef825
Author: Jason Merrill 
Date:   Fri Nov 17 17:13:40 2017 -0500

Avoid duplicate visibility warning.

* decl2.c (constrain_class_visibility): Don't warn about artificial
fields.

diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index bc0dbde..13e7b1d180a 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -2553,7 +2553,8 @@ constrain_class_visibility (tree type)
 vis = VISIBILITY_INTERNAL;
 
   for (t = TYPE_FIELDS (type); t; t = DECL_CHAIN (t))
-if (TREE_CODE (t) == FIELD_DECL && TREE_TYPE (t) != error_mark_node)
+if (TREE_CODE (t) == FIELD_DECL && TREE_TYPE (t) != error_mark_node
+   && !DECL_ARTIFICIAL (t))
   {
tree ftype = strip_pointer_or_array_types (TREE_TYPE (t));
int subvis = type_visibility (ftype);
diff --git a/gcc/testsuite/g++.dg/ext/visibility/warn6.C 
b/gcc/testsuite/g++.dg/ext/visibility/warn6.C
new file mode 100644
index 000..256ebd2ad2f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/visibility/warn6.C
@@ -0,0 +1,2 @@
+struct __attribute ((visibility ("hidden"))) A { int i; };
+struct B: A { }; // { dg-warning "base" }


[PATCH, rs6000] (v2) testcase updates for fold-vec-abs-* for power9 codegen

2017-11-20 Thread Will Schmidt
Hi,

Add additional scan-assembler entries for those tests with
simple variations in codegen. (for power9, versus power8, etc).


2017-11-20  Will Schmidt  

[testsuite]

* fold-vec-abs-char-fwrapv.c: Add xxspltib insn to expected output.
* fold-vec-abs-char.c: Add xxspltib insn to expected output list.

diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-char-fwrapv.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-char-fwrapv.c
index 739f06e..8b9887d 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-char-fwrapv.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-char-fwrapv.c
@@ -1,7 +1,7 @@
 /* Verify that overloaded built-ins for vec_abs with char
-   inputs produce the right results.  */
+   inputs produce the right code.  */
 
 /* { dg-do compile } */
 /* { dg-require-effective-target powerpc_altivec_ok } */
 /* { dg-options "-maltivec -O2 -fwrapv" } */
 
@@ -11,8 +11,8 @@ vector signed char
 test2 (vector signed char x)
 {
   return vec_abs (x);
 }
 
-/* { dg-final { scan-assembler-times "vspltisw|vxor" 1 } } */
+/* { dg-final { scan-assembler-times "vspltisw|vxor|xxspltib" 1 } } */
 /* { dg-final { scan-assembler-times "vsububm" 1 } } */
 /* { dg-final { scan-assembler-times "vmaxsb" 1 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-char.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-char.c
index 239c919..1f303b8 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-char.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-char.c
@@ -1,7 +1,7 @@
 /* Verify that overloaded built-ins for vec_abs with char
-   inputs produce the right results.  */
+   inputs produce the right code.  */
 
 /* { dg-do compile } */
 /* { dg-require-effective-target powerpc_altivec_ok } */
 /* { dg-options "-maltivec -O2" } */
 
@@ -11,8 +11,8 @@ vector signed char
 test2 (vector signed char x)
 {
   return vec_abs (x);
 }
 
-/* { dg-final { scan-assembler-times "vspltisw|vxor" 1 } } */
+/* { dg-final { scan-assembler-times "vspltisw|vxor|xxspltib" 1 } } */
 /* { dg-final { scan-assembler-times "vsububm" 1 } } */
 /* { dg-final { scan-assembler-times "vmaxsb" 1 } } */




[PATCH, rs6000] (v2) fold-vec-ld* testcase updates for power9

2017-11-20 Thread Will Schmidt
Hi, 
Update the scan-assembler stanza to include those instructions
generated for a power9 target.

2017-11-20  Will Schmidt  

[testsuite]

* fold-vec-ld-char.c: Add lxv insn to expected output.
* fold-vec-ld-double.c: Add lxv insn to expected output.
* fold-vec-ld-float.c: Add lxv insn to expected output.
* fold-vec-ld-int.c: Add lxv insn to expected output.
* fold-vec-ld-longlong.c: Add lxv insn to expected output.
* fold-vec-ld-short.c: Add lxv insn to expected output.

diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-char.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-char.c
index f9ef3e0..9ed1a94 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-char.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-char.c
@@ -65,7 +65,7 @@ vector bool char
 testld_cst_vbc (vector bool char vbc2)
 {
   return vec_ld (80, );
 }
 
-/* { dg-final { scan-assembler-times {\mlvx\M|\mlxvd2x\M|\mlxvw4x\M}  10 } } */
+/* { dg-final { scan-assembler-times {\mlvx\M|\mlxvd2x\M|\mlxvw4x\M|\mlxv\M} 
10 } } */
 
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-double.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-double.c
index 9c6fbb2..1577fb8 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-double.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-double.c
@@ -17,6 +17,6 @@ vector double
 testld_cst_vd (long long ll1, vector double vd)
 {
   return vec_ld (16, );
 }
 
-/* { dg-final { scan-assembler-times {\mlvx\M|\mlxvd2x\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mlvx\M|\mlxvd2x\M|\mlxv\M} 2 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-float.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-float.c
index eca847a..ff6b965 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-float.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-float.c
@@ -31,7 +31,7 @@ testld_cst_f (float f2)
   return vec_ld (16, );
 }
 
 // lvx - generated by ll_vf and ll_f
 // lxvd2x - generated by cst_vf and cst_f
-/* { dg-final { scan-assembler-times {\mlvx\M|\mlxvd2x\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mlvx\M|\mlxvd2x\M|\mlxv\M} 4 } } */
 
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-int.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-int.c
index 5dc6df6..8a477d0 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-int.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-int.c
@@ -65,7 +65,7 @@ vector bool int
 testld_cst_vbi (vector bool int vbi2)
 {
   return vec_ld (80, );
 }
 
-/* { dg-final { scan-assembler-times {\mlvx\M|\mlxvd2x\M|\mlxvw4x\M} 10 } } */
+/* { dg-final { scan-assembler-times {\mlvx\M|\mlxvd2x\M|\mlxvw4x\M|\mlxv\M} 
10 } } */
 
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-longlong.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-longlong.c
index db4a879..9b4a952 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-longlong.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-longlong.c
@@ -41,7 +41,7 @@ vector bool long long
 testld_cst_vbl (vector bool long vbl2)
 {
   return vec_ld (48, );
 }
 
-/* { dg-final { scan-assembler-times {\mlvx\M|\mlxvd2x\M} 6 } } */
+/* { dg-final { scan-assembler-times {\mlvx\M|\mlxvd2x\M|\mlxv\M} 6 } } */
 
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-short.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-short.c
index 5e42844..bb4835b 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-short.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-short.c
@@ -65,7 +65,7 @@ vector bool short
 testld_cst_vbs (vector bool short vbs2)
 {
   return vec_ld (80, );
 }
 
-/* { dg-final { scan-assembler-times {\mlvx\M|\mlxvd2x\M|\mlxvw4x\M} 10 } } */
+/* { dg-final { scan-assembler-times {\mlvx\M|\mlxvd2x\M|\mlxvw4x\M|\mlxv\M} 
10 } } */
 





[PATCH, rs6000] (v2) fold-vec-splat* testcase updates for power9

2017-11-20 Thread Will Schmidt
Hi,

Update the scan-assembler stanza to include those instructions
generated for a power9 target.

2017-11-20  Will Schmidt  

[testsuite]
* fold-vec-splat-8.c : Add vspltisb to expected output.
* fold-vec-splats-int.c : Add mtvsrws to expected output.
* fold-vec-splats-longlong.c : Add mtvsrdd to expected output.

diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-splat-8.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-splat-8.c
index 679fcb3..64d6320 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-splat-8.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-splat-8.c
@@ -41,6 +41,6 @@ vector unsigned char
 testuc_3 ()
 {
   return vec_splat_u8 (15);
 }
 
-/* { dg-final { scan-assembler-times "vspltisb" 6 } } */
+/* { dg-final { scan-assembler-times "xxspltib|vspltisb" 6 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-splats-int.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-splats-int.c
index 6671523..01b95c5 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-splats-int.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-splats-int.c
@@ -17,6 +17,6 @@ vector unsigned int
 test3u (unsigned int x)
 {
   return vec_splats (x);
 }
 
-/* { dg-final { scan-assembler-times {\mvspltw\M|\mxxspltw\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mvspltw\M|\mxxspltw\M|\mmtvsrws\M} 2 } 
} */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-splats-longlong.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-splats-longlong.c
index c5884ba..2dbf48e 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-splats-longlong.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-splats-longlong.c
@@ -17,6 +17,6 @@ vector unsigned long long
 test3u (unsigned long long x)
 {
   return vec_splats (x);
 }
 
-/* { dg-final { scan-assembler-times "xxpermdi" 2 } } */
+/* { dg-final { scan-assembler-times "xxpermdi|mtvsrdd" 2 } } */




Re: [RFC][PATCH] Change default to -fcommon

2017-11-20 Thread Michael Matz
Hi,

On Mon, 20 Nov 2017, Richard Biener wrote:

> Also we cannot raise alignment of commons and thus vectorization is 
> pessimized (all vectorizer testcases use - fno-common).

That would be a simple oversight then.  That's one of the nice things with 
common symbols, they contain their own alignment which you can freely 
adjust, you don't have to care for something like section alignment.

> IIRC LTO promotes commons to locals. 

Might be, but if so, probably for no good reason.


Ciao,
Michael.


Re: [RFC][PATCH] Change default to -fcommon

2017-11-20 Thread Michael Matz
Hi,

On Mon, 20 Nov 2017, Sandra Loosemore wrote:

> > I have a hard time imaging that, so can you give details?  FWIW I've 
> > personally always considered using common symbols nicer.
> 
> The optimization case I'm familiar with is for targets that support -G 
> and GP-relative addressing modes to address small data.  Generally you 
> can only do that if the variable is allocated in the same compilation 
> unit as the reference, so the compiler knows definitely where it's 
> placed

For this to work it doesn't have to be from the same compilation unit, but 
rather needs to be put into something like .sdata or .sbss.  That in turn 
can be placed near whereever GP points to (GOT-like usually) by the link 
editor and then benefit from gp-relative addressing because the working 
theory is that by putting small stuff into .sdata it doesn't become larger 
than whatever the allowed offset for this addressing mode can be.  What 
I'm getting at is that this needs help from the link editor.

Same compilation unit doesn't really enter the picture, except if you 
indeed have different GPs per compilation unit.  But sure, making 
those symbols SHN_COMMON (let's speak ELF :) ) indeed defeats this (well, 
except if the linker puts small SHN_COMMONs into the equivalent of .sbss).  
Could also be done by the symtab placing routines, remove DECL_COMMON and 
give it .sbss section or equivalent.  I can see how just not enabling 
-fcommon is the easier path of course.


Ciao,
Michael.


Re: [PATCH] Avoid static initialization in the strlen pass

2017-11-20 Thread Jakub Jelinek
On Mon, Nov 20, 2017 at 02:25:35PM -0700, Martin Sebor wrote:
> On 11/20/2017 02:14 PM, Jakub Jelinek wrote:
> > Hi!
> > 
> > All the hash_maps in tree-ssa-strlen.c except for the newly added one
> > were pointers to hash maps, which were constructed either lazily or during
> > the pass.  But strlen_to_stridx is now constructed at the compiler start,
> > which is something I'd prefer to avoid, it affects even -O0 that way and
> > empty/small file compilation, something e.g. the kernel folks care so much
> > about.
> > 
> > Apparently the hash map is only needed when one of the two warnings
> > is enabled, so this patch initializes it only in that case and otherwise
> > doesn't fill it or query it.
> 
> This same change is also in the latest patch I posted for 82945
> just yesterday:
> 
>   https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01670.html

Oops, sorry for missing that.  But you still initialize it unconditionally
in the pass and compute it even when it is only needed for the warning.

Jakub


Re: [PATCH] avoid assuming strncat/strncpy don't change length of source string (PR 83075)

2017-11-20 Thread Jakub Jelinek
On Mon, Nov 20, 2017 at 02:38:10PM -0700, Martin Sebor wrote:
> > I think it is better to have a dg-do run testcase and not scan for abort.
> > We can be able to optimize at some point n0 to 3 and n1 to 6 and optimize
> > away the abort.  In the testcase I've added to the PR there was a separate
> > function, you could add __attribute__((noipa)) to it to make sure that
> > it isn't IPA optimized.
> > Similarly for the strncat test.
> 
> Sure.  Attached is an update to make tests runnable.
> 
> > 
> > > Index: gcc/tree-ssa-strlen.c
> > > ===
> > > --- gcc/tree-ssa-strlen.c (revision 254961)
> > > +++ gcc/tree-ssa-strlen.c (working copy)
> > > @@ -1931,10 +1931,9 @@ handle_builtin_stxncpy (built_in_function, gimple_
> > >int sidx = get_stridx (src);
> > >strinfo *sisrc = sidx > 0 ? get_strinfo (sidx) : NULL;
> > > 
> > > -  /* Strncpy() et al. cannot modify the source string.  Prevent the rest
> > > - of the pass from invalidating the strinfo data.  */
> > > -  if (sisrc)
> > > -sisrc->dont_invalidate = true;
> > > +  /* Strncat() and strncpy() can modify the source string by specifying
> > > + as a destination SRC + strlen(SRC) and a bound <= strlen(SRC) so
> > > + SISRC->DONT_INVALIDATE must be left clear.  */
> > 
> > The destination could be SRC + N and as long as LEN <= N, you shouldn't
> > invalidate SRC.  Though, if LEN < strlen(SRC) I think you return early.
> 
> I admit I'm intrigued by the idea.  At the same time, for strncpy
> and strncat, these are the cases that trigger the truncation warning.
> They don't seem like something worth putting effort into optimizing.
> 
> I don't expect copying into the same array to be a common use of
> string functions but perhaps it's worth thinking about for memcpy
> and/or memmove?

I wasn't suggesting to optimize anything, just to clarify the
comment.  The 3rd argument is called LEN in the code, so instead
of talking about bound it should talk about LEN.  And the destination
can be closer to SRC than the strlen.  So I'd just change above your
"SRC + strlen(SRC) and a bound <= strlen(SRC)" to
"SRC + N for some N >= LEN" or similar.

> --- gcc/testsuite/gcc.dg/tree-ssa/strncat.c   (nonexistent)
> +++ gcc/testsuite/gcc.dg/tree-ssa/strncat.c   (working copy)
> @@ -0,0 +1,25 @@
> +/* PR tree-optimization/83075 - Invalid strncpy optimization
> +   { dg-do run }
> +   { dg-options "-O2 -Wno-stringop-overflow" } */
> +
> +void __attribute__ ((noipa))
> +test_overlapping_strncpy (void)

In this case the noipa attribute is pretty useless.
What I was suggesting is that you have a noipa
function with 2 char * arguments and the initialization and strcpy
is done in the caller.  Then it better tests what we care about,
it doesn't see for sure that the two pointers are related, the caller shows
they might.  If everything is in one function, then a future version
of the compiler might optimize everything away.

Jakub


Re: [PATCH libstdc++/66689] comp_ellint_3 and ellint_3 return garbage values

2017-11-20 Thread Jonathan Wakely

On 20/11/17 21:07 +, Jonathan Wakely wrote:

On 20/11/17 21:01 +, Jonathan Wakely wrote:

On 20/11/17 21:43 +0100, Christophe Lyon wrote:

On 20 November 2017 at 17:02, David Edelsohn  wrote:

This patch has introduced new regressions on at least PowerPC and AArch64.

FAIL: ext/special_functions/hyperg/check_value.cc execution test
FAIL: tr1/5_numerical_facilities/special_functions/17_hyperg/check_value.cc
execution test

Thanks, David


On AArch64 and ARM, I have also noticed
FAIL: special_functions/18_riemann_zeta/check_value.cc (test for excess errors)
UNRESOLVED: special_functions/18_riemann_zeta/check_value.cc
compilation failed to produce executable
because:
/libstdc++-v3/testsuite/special_functions/18_riemann_zeta/check_value.cc:
In function 'void test(const testcase_riemann_zeta (&)[Num],
Ret)':
/libstdc++-v3/testsuite/special_functions/18_riemann_zeta/check_value.cc:292:
error: 'riemann_zeta' is not a member of 'std'
/libstdc++-v3/testsuite/special_functions/18_riemann_zeta/check_value.cc:292:
note: suggested alternative: 'remainder'
compiler exited with status 1


The problem is that { dg-addition-options } was changed to dg-options,
and so the first dg-options that enables the special functions is not
used:

--- a/libstdc++-v3/testsuite/special_functions/18_riemann_zeta/check_value.cc
+++ b/libstdc++-v3/testsuite/special_functions/18_riemann_zeta/check_value.cc
@@ -21,7 +21,7 @@
//  riemann_zeta

// This can take long on simulators, timing out the test.
-// { dg-additional-options "-DMAX_ITERATIONS=5" { target simulator } }
+// { dg-options "-DMAX_ITERATIONS=5" { target simulator } }




I have a script to check dejagnu directives, and it says:

testsuite/tr1/5_numerical_facilities/special_functions/20_riemann_zeta/check_value_neg.cc
 has multiple dg-options directives
testsuite/ext/special_functions/airy_ai/check_nan.cc has dg-options after 
dg-add-options
testsuite/ext/special_functions/hyperg/check_nan.cc has dg-options after 
dg-add-options
testsuite/ext/special_functions/conf_hyperg/check_nan.cc has dg-options after 
dg-add-options
testsuite/ext/special_functions/airy_bi/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/02_assoc_legendre/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/14_expint/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/12_ellint_2/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/09_cyl_bessel_k/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/21_sph_neumann/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/15_hermite/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/19_sph_bessel/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/05_comp_ellint_2/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/11_ellint_1/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/17_legendre/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/10_cyl_neumann/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/06_comp_ellint_3/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/06_comp_ellint_3/pr66689.cc has dg-options after 
dg-add-options
testsuite/special_functions/01_assoc_laguerre/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/16_laguerre/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/13_ellint_3/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/13_ellint_3/pr66689.cc has dg-options after 
dg-add-options
testsuite/special_functions/07_cyl_bessel_i/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/18_riemann_zeta/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/18_riemann_zeta/check_value.cc has multiple 
dg-options directives
testsuite/special_functions/08_cyl_bessel_j/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/04_comp_ellint_1/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/03_beta/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/20_sph_legendre/check_nan.cc has dg-options after 
dg-add-options

For now I'll just fix the multiple dg-options one causing the FAILs.



I've committed this patch, which should help for Christophe's cases.
It won't help the AIX execution FAILs.

commit 1d810d9fd1fbe11a144907a74686d0a03945a1a1
Author: Jonathan Wakely 
Date:   Mon Nov 20 21:31:27 2017 +

Fix failing tests caused by duplicate dg-options

* testsuite/special_functions/18_riemann_zeta/check_value.cc: Fix
duplicate dg-options directive.
* testsuite/tr1/5_numerical_facilities/special_functions/

Re: [PATCH] avoid assuming strncat/strncpy don't change length of source string (PR 83075)

2017-11-20 Thread Martin Sebor

On 11/20/2017 12:39 PM, Jakub Jelinek wrote:

On Mon, Nov 20, 2017 at 12:15:09PM -0700, Martin Sebor wrote:

PR tree-optimization/83075
* tree-ssa-strlen.c (handle_builtin_stxncpy): Avoid assuming
strncat/strncpy don't change length of source string.

gcc/testsuite/ChangeLog:

PR tree-optimization/83075
* gcc.dg/tree-ssa/strncat.c: New test.
* gcc.dg/tree-ssa/strncpy-2.c: Same.

Index: gcc/testsuite/gcc.dg/tree-ssa/strncat.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/strncat.c (nonexistent)
+++ gcc/testsuite/gcc.dg/tree-ssa/strncat.c (working copy)
@@ -0,0 +1,22 @@
+/* PR tree-optimization/83075 - Invalid strncpy optimization
+   { dg-do compile }
+   { dg-options "-O2 -Wno-stringop-overflow -fdump-tree-optimized" } */
+
+void test_overlapping_strncpy (void)
+{
+  char a[8] = "";
+
+  __builtin_strcpy (a, "123");
+
+  unsigned n0 = __builtin_strlen (a);
+
+  __builtin_strncat (a + 3, a, n0);
+
+  unsigned n1 = __builtin_strlen (a);
+
+  if (n1 == n0)
+__builtin_abort ();
+}
+
+/* Verify the call to abort hasn't been eliminated.
+   { dg-final { scan-tree-dump-times "abort" 1 "optimized" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/strncpy-2.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/strncpy-2.c   (nonexistent)
+++ gcc/testsuite/gcc.dg/tree-ssa/strncpy-2.c   (working copy)
@@ -0,0 +1,22 @@
+/* PR tree-optimization/83075 - Invalid strncpy optimization
+   { dg-do compile }
+   { dg-options "-O2 -Wno-stringop-overflow -fdump-tree-optimized" } */
+
+void test_overlapping_strncpy (void)
+{
+  char a[8] = "";
+
+  __builtin_strcpy (a, "123");
+
+  unsigned n0 = __builtin_strlen (a);
+
+  __builtin_strncpy (a + 3, a, n0);
+
+  unsigned n1 = __builtin_strlen (a);
+
+  if (n1 == n0)
+__builtin_abort ();
+}
+
+/* Verify the call to abort hasn't been eliminated.
+   { dg-final { scan-tree-dump-times "abort" 1 "optimized" } } */


I think it is better to have a dg-do run testcase and not scan for abort.
We can be able to optimize at some point n0 to 3 and n1 to 6 and optimize
away the abort.  In the testcase I've added to the PR there was a separate
function, you could add __attribute__((noipa)) to it to make sure that
it isn't IPA optimized.
Similarly for the strncat test.


Sure.  Attached is an update to make tests runnable.




Index: gcc/tree-ssa-strlen.c
===
--- gcc/tree-ssa-strlen.c   (revision 254961)
+++ gcc/tree-ssa-strlen.c   (working copy)
@@ -1931,10 +1931,9 @@ handle_builtin_stxncpy (built_in_function, gimple_
   int sidx = get_stridx (src);
   strinfo *sisrc = sidx > 0 ? get_strinfo (sidx) : NULL;

-  /* Strncpy() et al. cannot modify the source string.  Prevent the rest
- of the pass from invalidating the strinfo data.  */
-  if (sisrc)
-sisrc->dont_invalidate = true;
+  /* Strncat() and strncpy() can modify the source string by specifying
+ as a destination SRC + strlen(SRC) and a bound <= strlen(SRC) so
+ SISRC->DONT_INVALIDATE must be left clear.  */


The destination could be SRC + N and as long as LEN <= N, you shouldn't
invalidate SRC.  Though, if LEN < strlen(SRC) I think you return early.


I admit I'm intrigued by the idea.  At the same time, for strncpy
and strncat, these are the cases that trigger the truncation warning.
They don't seem like something worth putting effort into optimizing.

I don't expect copying into the same array to be a common use of
string functions but perhaps it's worth thinking about for memcpy
and/or memmove?

void f (void)
{
  char a[10] = "";

  __builtin_strcpy (a, "123");

  unsigned n0 = __builtin_strlen (a);

  __builtin_memcpy (a + 4, a, 4);

  unsigned n1 = __builtin_strlen (a);

  if (n1 != n0)
__builtin_abort ();
}

Martin
PR tree-optimization/83075 - Invalid strncpy optimization

gcc/ChangeLog:

	PR tree-optimization/83075
	* tree-ssa-strlen.c (handle_builtin_stxncpy): Avoid assuming
	strncat/strncpy don't change length of source string.

gcc/testsuite/ChangeLog:

	PR tree-optimization/83075
	* gcc.dg/tree-ssa/strncat.c: New test.
	* gcc.dg/tree-ssa/strncpy-2.c: Same.

Index: gcc/testsuite/gcc.dg/tree-ssa/strncat.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/strncat.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/tree-ssa/strncat.c	(working copy)
@@ -0,0 +1,25 @@
+/* PR tree-optimization/83075 - Invalid strncpy optimization
+   { dg-do run }
+   { dg-options "-O2 -Wno-stringop-overflow" } */
+
+void __attribute__ ((noipa))
+test_overlapping_strncpy (void)
+{
+  char a[8] = "";
+
+  __builtin_strcpy (a, "123");
+
+  unsigned n0 = __builtin_strlen (a);
+
+  __builtin_strncat (a + 3, a, n0);
+
+  unsigned n1 = __builtin_strlen (a);
+
+  if (n1 == n0)
+__builtin_abort ();
+}
+
+int main (void)
+{
+  test_overlapping_strncpy ();
+}

Re: [PATCH] Avoid static initialization in the strlen pass

2017-11-20 Thread Martin Sebor

On 11/20/2017 02:14 PM, Jakub Jelinek wrote:

Hi!

All the hash_maps in tree-ssa-strlen.c except for the newly added one
were pointers to hash maps, which were constructed either lazily or during
the pass.  But strlen_to_stridx is now constructed at the compiler start,
which is something I'd prefer to avoid, it affects even -O0 that way and
empty/small file compilation, something e.g. the kernel folks care so much
about.

Apparently the hash map is only needed when one of the two warnings
is enabled, so this patch initializes it only in that case and otherwise
doesn't fill it or query it.


This same change is also in the latest patch I posted for 82945
just yesterday:

  https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01670.html

Martin



[PATCH] Avoid static initialization in the strlen pass

2017-11-20 Thread Jakub Jelinek
Hi!

All the hash_maps in tree-ssa-strlen.c except for the newly added one
were pointers to hash maps, which were constructed either lazily or during
the pass.  But strlen_to_stridx is now constructed at the compiler start,
which is something I'd prefer to avoid, it affects even -O0 that way and
empty/small file compilation, something e.g. the kernel folks care so much
about.

Apparently the hash map is only needed when one of the two warnings
is enabled, so this patch initializes it only in that case and otherwise
doesn't fill it or query it.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-11-20  Jakub Jelinek  

* tree-ssa-strlen.c (strlen_to_stridx): Change into a pointer to
hash_map.
(handle_builtin_strlen, strlen_optimize_stmt): Only access it
if non-NULL, instead of . use ->.
(handle_builtin_stxncpy): Return early if strlen_to_stridx
is NULL.  Spelling fix.  Instead of . use ->.
(pass_strlen::execute): Allocate strlen_to_stridx if
warn_stringop_{truncation,overflow}.  Instead of calling empty on it
delete it and clear it at the end of the pass.

--- gcc/tree-ssa-strlen.c.jj2017-11-15 09:40:03.0 +0100
+++ gcc/tree-ssa-strlen.c   2017-11-20 18:10:42.565458585 +0100
@@ -153,7 +153,7 @@ struct decl_stridxlist_map
 static hash_map *decl_to_stridxlist_htab;
 
 typedef std::pair stridx_strlenloc;
-static hash_map strlen_to_stridx;
+static hash_map *strlen_to_stridx;
 
 /* Obstack for struct stridxlist and struct decl_stridxlist_map.  */
 static struct obstack stridx_obstack;
@@ -1207,8 +1207,11 @@ handle_builtin_strlen (gimple_stmt_itera
  gcc_assert (si->full_string_p);
}
 
- location_t loc = gimple_location (stmt);
- strlen_to_stridx.put (lhs, stridx_strlenloc (idx, loc));
+ if (strlen_to_stridx)
+   {
+ location_t loc = gimple_location (stmt);
+ strlen_to_stridx->put (lhs, stridx_strlenloc (idx, loc));
+   }
  return;
}
 }
@@ -1253,8 +1256,11 @@ handle_builtin_strlen (gimple_stmt_itera
   set_strinfo (idx, si);
   find_equal_ptrs (src, idx);
 
-  location_t loc = gimple_location (stmt);
-  strlen_to_stridx.put (lhs, stridx_strlenloc (idx, loc));
+  if (strlen_to_stridx)
+   {
+ location_t loc = gimple_location (stmt);
+ strlen_to_stridx->put (lhs, stridx_strlenloc (idx, loc));
+   }
 }
 }
 
@@ -1909,6 +1915,9 @@ maybe_diag_stxncpy_trunc (gimple_stmt_it
 static void
 handle_builtin_stxncpy (built_in_function, gimple_stmt_iterator *gsi)
 {
+  if (strlen_to_stridx == NULL)
+return;
+
   gimple *stmt = gsi_stmt (*gsi);
 
   bool with_bounds = gimple_call_with_bounds_p (stmt);
@@ -1917,9 +1926,9 @@ handle_builtin_stxncpy (built_in_functio
   tree len = gimple_call_arg (stmt, with_bounds ? 3 : 2);
 
   /* If the length argument was computed from strlen(S) for some string
- S retrieve the strinfo index for the string (PSS->FIRST) alonng with
+ S retrieve the strinfo index for the string (PSS->FIRST) along with
  the location of the strlen() call (PSS->SECOND).  */
-  stridx_strlenloc *pss = strlen_to_stridx.get (len);
+  stridx_strlenloc *pss = strlen_to_stridx->get (len);
   if (!pss || pss->first <= 0)
 {
   if (maybe_diag_stxncpy_trunc (*gsi, src, len))
@@ -2966,9 +2975,12 @@ strlen_optimize_stmt (gimple_stmt_iterat
  fold_strstr_to_strncmp (gimple_assign_rhs1 (stmt),
  gimple_assign_rhs2 (stmt), stmt);
 
-   tree rhs1 = gimple_assign_rhs1 (stmt);
-   if (stridx_strlenloc *ps = strlen_to_stridx.get (rhs1))
- strlen_to_stridx.put (lhs, stridx_strlenloc (*ps));
+   if (strlen_to_stridx)
+ {
+   tree rhs1 = gimple_assign_rhs1 (stmt);
+   if (stridx_strlenloc *ps = strlen_to_stridx->get (rhs1))
+ strlen_to_stridx->put (lhs, stridx_strlenloc (*ps));
+ }
   }
 else if (TREE_CODE (lhs) != SSA_NAME && !TREE_SIDE_EFFECTS (lhs))
{
@@ -3202,6 +3214,9 @@ pass_strlen::execute (function *fun)
   ssa_ver_to_stridx.safe_grow_cleared (num_ssa_names);
   max_stridx = 1;
 
+  if (warn_stringop_truncation || warn_stringop_overflow)
+strlen_to_stridx = new hash_map (64);
+
   calculate_dominance_info (CDI_DOMINATORS);
 
   /* String length optimization is implemented as a walk of the dominator
@@ -3220,7 +3235,11 @@ pass_strlen::execute (function *fun)
   laststmt.len = NULL_TREE;
   laststmt.stridx = 0;
 
-  strlen_to_stridx.empty ();
+  if (strlen_to_stridx)
+{
+  delete strlen_to_stridx;
+  strlen_to_stridx = NULL;
+}
 
   return 0;
 }

Jakub


Re: [C/C++ PATCH] Fix ICE in get_atomic_generic_size (PR c++/83059)

2017-11-20 Thread Nathan Sidwell

On 11/20/2017 04:03 PM, Jakub Jelinek wrote:

Hi!

Not all INTEGER_CSTs satisfy tree_fits_uhwi_p, and those that don't
ICE in tree_to_uhwi.  But, what the memmodel_base function actually
does is mask only the 16 low bits, the upper bits are reserved for
target dependent flags and -Winvalid-memory-model already warns
about those during expansion.  So it makes no sense to warn about say
(-1 << 16) here, thus I'm using TREE_INT_CST_LOW instead of tree_to_*hwi,
because we really care just about the low 16 bits thereof.




- int i = tree_to_uhwi (p);
- if (i < 0 || (memmodel_base (i) >= MEMMODEL_LAST))
-   {
- warning_at (loc, OPT_Winvalid_memory_model,
- "invalid memory model argument %d of %qE", x + 1,
- function);
-   }
+ if (memmodel_base (TREE_INT_CST_LOW (p)) >= MEMMODEL_LAST)
+   warning_at (loc, OPT_Winvalid_memory_model,
+   "invalid memory model argument %d of %qE", x + 1,
+   function);


Ok with a comment mentioning memmodel_base only looks at low-order bits.

nathan

--
Nathan Sidwell


[PATCH] Fix ICE with __RTL and -g (PR debug/82933)

2017-11-20 Thread Jakub Jelinek
Hi!

The dwarf2out.c code relies on the assembly_start debug hook being
invoked before any RTL is processed from final.c (and needs it to be
done just once).  Normally it is called from cgraphunit.c, but when
__RTL is seen with a starting pass, run_rtl_passes is called already
from the FE.  While it would be better to defer the rtl finalization
until cgraph says so, it might be quite hard, so instead this patch
hacks dwarf2out_assembly_start so that it can be invoked multiple times
(and does nothing on the 2nd+ call) and invokes it from the run_rtl_passes
function too.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-11-20  Jakub Jelinek  

PR debug/82933
* run-rtl-passes.c: Include debug.h.
(run_rtl_passes): Call debug_hooks->assembly_start.
* dwarf2out.c (dwarf2out_assembly_start): Return early if invoked
multiple times.

* gcc.dg/rtl/x86_64/pr82933.c: New test.

--- gcc/run-rtl-passes.c.jj 2017-01-24 23:29:09.0 +0100
+++ gcc/run-rtl-passes.c2017-11-20 17:36:31.320854900 +0100
@@ -30,6 +30,7 @@ along with GCC; see the file COPYING3.
 #include "bitmap.h"
 #include "df.h"
 #include "regs.h"
+#include "debug.h" /* for debug_hooks.  */
 #include "insn-attr-common.h" /* for INSN_SCHEDULING.  */
 #include "insn-attr.h" /* for init_sched_attrs.  */
 #include "run-rtl-passes.h"
@@ -43,6 +44,9 @@ run_rtl_passes (char *initial_pass_name)
   cfun->pass_startwith = initial_pass_name;
   max_regno = max_reg_num ();
 
+  /* cgraphunit.c normally handles this.  */
+  (*debug_hooks->assembly_start) ();
+
   /* Pass "expand" normally sets this up.  */
 #ifdef INSN_SCHEDULING
   init_sched_attrs ();
--- gcc/dwarf2out.c.jj  2017-11-15 09:38:26.0 +0100
+++ gcc/dwarf2out.c 2017-11-20 17:31:48.222394813 +0100
@@ -27507,6 +27507,9 @@ dwarf2out_init (const char *filename ATT
 static void
 dwarf2out_assembly_start (void)
 {
+  if (text_section_line_info)
+return;
+
 #ifndef DWARF2_LINENO_DEBUGGING_INFO
   ASM_GENERATE_INTERNAL_LABEL (text_section_label, TEXT_SECTION_LABEL, 0);
   ASM_GENERATE_INTERNAL_LABEL (text_end_label, TEXT_END_LABEL, 0);
--- gcc/testsuite/gcc.dg/rtl/x86_64/pr82933.c.jj2017-11-20 
17:34:54.680063313 +0100
+++ gcc/testsuite/gcc.dg/rtl/x86_64/pr82933.c   2017-11-20 17:35:26.361667161 
+0100
@@ -0,0 +1,4 @@
+/* { dg-do run { target { { i?86-*-* x86_64-*-* } && lp64 } } } */
+/* { dg-options "-g" } */
+
+#include "into-cfglayout.c"

Jakub


Re: [PATCH 1/3] [ARM] Add bus_width_bits to tune_params

2017-11-20 Thread Charles Baylis
On 15 September 2017 at 18:01, Kyrill  Tkachov
 wrote:

> From what I can tell Ramana and Richard preferred to encode this attribute
> as
> a tuning struct property rather than an inline conditional based on
> arm_arch7.
> I agree that if we want to use that information, it should be encoded this
> way.
> What I'm not convinced about is whether we do want this parameter in the
> first place.
>
> The cost tables already encode information about the costs of different
> sized loads/stores.
> In patch 2, for example, you add the cost for extra_cost->ldst.load which is
> nominally just
> the cost of a normal 32-bit ldr. But we also have costs for ldst.ldrd which
> is the 64-bit two-register load
> which should reflect any extra cost due to a narrower bus in it. We also
> have costs for ldst.loadf (for 32-bit
> VFP loads) and ldst.loadd (for 64-bit VFP D-register loads). So I think we
> should use those cost fields
> depending on the mode class and size instead of using ldst.load
> unconditionally and adding a new bus_size parameter.
>
> So I think the way forward is to drop this patch and modify patch 2/3 to use
> the extra_cost->ldst fields as described above.
>
> Sorry for the back-and-forth. I think this is the best approach because it
> uses the existing fields more naturally and
> doesn't add new parameters that partly duplicate the information encoded in
> the existing fields.
> Ramana, Richard: if you prefer the bus_width approach I won't block it, but
> could you clarify your preference?
> If we do end up adding the bus_width parameter then this patch and patch 2/3
> look ok.
> Thanks,
> Kyrill
>
> P.S. I'm going on a 4-week holiday from today, so I won't be able to do any
> further review in that timeframe.
> As I said, if we go with the bus_size approach then these patches are ok. If
> we go with my suggestion, this would
> be dropped and patch 2 would be extended to select the appropriate
> extra_cost->ldst field depending on mode.

OK, I agree with dropping this patch. I have posted an updated patch 2
which does not require it.


Re: [PATCH 2/3] [ARM] Refactor costs calculation for MEM.

2017-11-20 Thread Charles Baylis
On 15 September 2017 at 18:01, Kyrill  Tkachov
 wrote:
>
> On 15/09/17 16:38, Charles Baylis wrote:
>>
>> On 13 September 2017 at 10:02, Kyrill  Tkachov
>>  wrote:
>>>
>>> Hi Charles,
>>>
>>> On 12/09/17 09:34, charles.bay...@linaro.org wrote:

 From: Charles Baylis 

 This patch moves the calculation of costs for MEM into a
 separate function, and reforms the calculation into two
 parts. Firstly any additional cost of the addressing mode
 is calculated, and then the cost of the memory access itself
 is added.

 In this patch, the calculation of the cost of the addressing
 mode is left as a placeholder, to be added in a subsequent
 patch.

>>> Can you please mention how has this series been tested?
>>> A bootstrap and test run on arm-none-linux-gnueabihf is required at
>>> least.
>>
>> It has been tested with make check on arm-unknown-linux-gnueabihf with
>> no regressions. I've successfully bootstrapped the next spin.
>
>
> Thanks.
>
>>> Also, do you have any benchmarking results for this?
>>> I agree that generating the addressing modes in the new tests is
>>> desirable.
>>> So I'm not objecting to the goal of this patch, but a check to make sure
>>> that this doesn't regress SPEC
>>> would be great.  Further comments on the patch inline.
>>
>> SPEC2006 scores are unaffected by this patch on Cortex-A57.
>
>
> Good, thanks for checking :)
>
>
 +/* Helper function for arm_rtx_costs_internal.  Calculates the cost of
 a
 MEM,
 +   considering the costs of the addressing mode and memory access
 +   separately.  */
 +static bool
 +arm_mem_costs (rtx x, const struct cpu_cost_table *extra_cost,
 +  int *cost, bool speed_p)
 +{
 +  machine_mode mode = GET_MODE (x);
 +  if (flag_pic
 +  && GET_CODE (XEXP (x, 0)) == PLUS
 +  && will_be_in_index_register (XEXP (XEXP (x, 0), 1)))
 +/* This will be split into two instructions.  Add the cost of the
 +   additional instruction here.  The cost of the memory access is
 computed
 +   below.  See arm.md:calculate_pic_address.  */
 +*cost = COSTS_N_INSNS (1);
 +  else
 +*cost = 0;
>>>
>>>
>>> For speed_p we want the size cost of the insn (COSTS_N_INSNS (1) for a
>>> each
>>> insn)
>>> plus the appropriate field in extra_cost. So you should unconditionally
>>> initialise the cost
>>> to COSTS_N_INSNS (1), conditionally increment it by COSTS_N_INSNS (1)
>>> with
>>> the condition above.
>>
>> OK. I also have to subtract that COSTS_N_INSNS (1) in the if (speed_p)
>> part because the cost of a single bus transfer is included in that
>> initial cost.
>>
 +
 +  /* Calculate cost of the addressing mode.  */
 +  if (speed_p)
 +{
 +  /* TODO: Add table-driven costs for addressing modes.  (See patch
 2) */
 +}
>>>
>>>
>>> You mean "patch 3". I recommend you just remove this conditional from
>>> this
>>> patch and add the logic
>>> in patch 3 entirely.
>>
>> OK.
>>
 +
 +  /* Calculate cost of memory access.  */
 +  if (speed_p)
 +{
 +  /* data transfer is transfer size divided by bus width.  */
 +  int bus_width_bytes = current_tune->bus_width / 4;
>>>
>>>
>>> This should be bus_width / BITS_PER_UNIT to get the size in bytes.
>>> BITS_PER_UNIT is 8 though, so you'll have to double check to make sure
>>> the
>>> cost calculation and generated code is still appropriate.
>>
>> Oops, I changed the units around and messed this up. I'll fix this.
>>
 +  *cost += CEIL (GET_MODE_SIZE (mode), bus_width_bytes);
 +  *cost += extra_cost->ldst.load;
 +}
 +  else
 +{
 +  *cost += COSTS_N_INSNS (1);
 +}
>>>
>>> Given my first comment above this else would be deleted.
>>
>> OK
>
>
> I have a concern about using the bus_width parameter which
> I explain in the thread for patch 1 (I don't think we need it, we should use
> the fields in extra_cost->ldst
> more carefully).

I have modified this patch accordingly. Patch 1 is no longer needed.

Passes "make check" (with patch 3) on arm-linux-gnueabihf with no
regressions. Bootstrap is in progress.

Can I still get this in during stage 3?

gcc/ChangeLog:

  Charles Baylis  

* config/arm/arm.c (arm_mem_costs): New function.
(arm_rtx_costs_internal): Use arm_mem_costs.

gcc/testsuite/ChangeLog:

  Charles Baylis  

* gcc.target/arm/addr-modes-float.c: New test.
* gcc.target/arm/addr-modes-int.c: New test.
* gcc.target/arm/addr-modes.h: New header.
From 26d9c0839ef7318074d3fd38dca3989bd3e51d54 Mon Sep 17 00:00:00 2001
From: Charles Baylis 
Date: Wed, 8 Feb 2017 16:52:10 +
Subject: [PATCH 1/3] [ARM] Refactor costs calculation for MEM.

This patch moves the 

Re: [PATCH libstdc++/66689] comp_ellint_3 and ellint_3 return garbage values

2017-11-20 Thread Jonathan Wakely

On 20/11/17 21:01 +, Jonathan Wakely wrote:

On 20/11/17 21:43 +0100, Christophe Lyon wrote:

On 20 November 2017 at 17:02, David Edelsohn  wrote:

This patch has introduced new regressions on at least PowerPC and AArch64.

FAIL: ext/special_functions/hyperg/check_value.cc execution test
FAIL: tr1/5_numerical_facilities/special_functions/17_hyperg/check_value.cc
execution test

Thanks, David


On AArch64 and ARM, I have also noticed
FAIL: special_functions/18_riemann_zeta/check_value.cc (test for excess errors)
UNRESOLVED: special_functions/18_riemann_zeta/check_value.cc
compilation failed to produce executable
because:
/libstdc++-v3/testsuite/special_functions/18_riemann_zeta/check_value.cc:
In function 'void test(const testcase_riemann_zeta (&)[Num],
Ret)':
/libstdc++-v3/testsuite/special_functions/18_riemann_zeta/check_value.cc:292:
error: 'riemann_zeta' is not a member of 'std'
/libstdc++-v3/testsuite/special_functions/18_riemann_zeta/check_value.cc:292:
note: suggested alternative: 'remainder'
compiler exited with status 1


The problem is that { dg-addition-options } was changed to dg-options,
and so the first dg-options that enables the special functions is not
used:

--- a/libstdc++-v3/testsuite/special_functions/18_riemann_zeta/check_value.cc
+++ b/libstdc++-v3/testsuite/special_functions/18_riemann_zeta/check_value.cc
@@ -21,7 +21,7 @@
//  riemann_zeta

// This can take long on simulators, timing out the test.
-// { dg-additional-options "-DMAX_ITERATIONS=5" { target simulator } }
+// { dg-options "-DMAX_ITERATIONS=5" { target simulator } }




I have a script to check dejagnu directives, and it says:

testsuite/tr1/5_numerical_facilities/special_functions/20_riemann_zeta/check_value_neg.cc
 has multiple dg-options directives
testsuite/ext/special_functions/airy_ai/check_nan.cc has dg-options after 
dg-add-options
testsuite/ext/special_functions/hyperg/check_nan.cc has dg-options after 
dg-add-options
testsuite/ext/special_functions/conf_hyperg/check_nan.cc has dg-options after 
dg-add-options
testsuite/ext/special_functions/airy_bi/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/02_assoc_legendre/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/14_expint/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/12_ellint_2/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/09_cyl_bessel_k/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/21_sph_neumann/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/15_hermite/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/19_sph_bessel/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/05_comp_ellint_2/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/11_ellint_1/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/17_legendre/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/10_cyl_neumann/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/06_comp_ellint_3/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/06_comp_ellint_3/pr66689.cc has dg-options after 
dg-add-options
testsuite/special_functions/01_assoc_laguerre/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/16_laguerre/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/13_ellint_3/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/13_ellint_3/pr66689.cc has dg-options after 
dg-add-options
testsuite/special_functions/07_cyl_bessel_i/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/18_riemann_zeta/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/18_riemann_zeta/check_value.cc has multiple 
dg-options directives
testsuite/special_functions/08_cyl_bessel_j/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/04_comp_ellint_1/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/03_beta/check_nan.cc has dg-options after 
dg-add-options
testsuite/special_functions/20_sph_legendre/check_nan.cc has dg-options after 
dg-add-options

For now I'll just fix the multiple dg-options one causing the FAILs.



[PATCH] Fix ICEs from expand_mul_overflow (PR target/82981)

2017-11-20 Thread Jakub Jelinek
Hi!

Apparently ARM can do the widening SImode multiply only using a libcall,
so the recently changed expand_mul_overflow ignores it at first, then
sees possibility to use the HImode code, but uses expand_simple_binop
with OPTAB_DIRECT which requires actual HImode optabs, which ARM doesn't
have, it needs to use widening to SImode.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2017-11-20  Jakub Jelinek  

PR target/82981
* internal-fn.c (expand_mul_overflow): Use OPTAB_WIDEN instead of
OPTAB_DIRECT in calls to expand_simple_binop.

--- gcc/internal-fn.c.jj2017-11-15 09:54:30.0 +0100
+++ gcc/internal-fn.c   2017-11-20 16:38:55.185145957 +0100
@@ -1760,7 +1760,7 @@ expand_mul_overflow (location_t loc, tre
  tem = convert_modes (mode, hmode, lopart, 1);
  tem = expand_shift (LSHIFT_EXPR, mode, tem, hprec, NULL_RTX, 1);
  tem = expand_simple_binop (mode, MINUS, loxhi, tem, NULL_RTX,
-1, OPTAB_DIRECT);
+1, OPTAB_WIDEN);
  emit_move_insn (loxhi, tem);
 
  emit_label (after_hipart_neg);
@@ -1774,7 +1774,7 @@ expand_mul_overflow (location_t loc, tre
 profile_probability::even ());
 
  tem = expand_simple_binop (mode, MINUS, loxhi, larger, NULL_RTX,
-1, OPTAB_DIRECT);
+1, OPTAB_WIDEN);
  emit_move_insn (loxhi, tem);
 
  emit_label (after_lopart_neg);
@@ -1783,7 +1783,7 @@ expand_mul_overflow (location_t loc, tre
  /* loxhi += (uns) lo0xlo1 >> (bitsize / 2);  */
  tem = expand_shift (RSHIFT_EXPR, mode, lo0xlo1, hprec, NULL_RTX, 1);
  tem = expand_simple_binop (mode, PLUS, loxhi, tem, NULL_RTX,
-1, OPTAB_DIRECT);
+1, OPTAB_WIDEN);
  emit_move_insn (loxhi, tem);
 
  /* if (loxhi >> (bitsize / 2)
@@ -1810,7 +1810,7 @@ expand_mul_overflow (location_t loc, tre
   convert_modes (hmode, mode, lo0xlo1, 1), 1);
 
  tem = expand_simple_binop (mode, IOR, loxhishifted, tem, res,
-1, OPTAB_DIRECT);
+1, OPTAB_WIDEN);
  if (tem != res)
emit_move_insn (res, tem);
  emit_jump (done_label);
@@ -1835,7 +1835,7 @@ expand_mul_overflow (location_t loc, tre
  if (!op0_medium_p)
{
  tem = expand_simple_binop (hmode, PLUS, hipart0, const1_rtx,
-NULL_RTX, 1, OPTAB_DIRECT);
+NULL_RTX, 1, OPTAB_WIDEN);
  do_compare_rtx_and_jump (tem, const1_rtx, GTU, true, hmode,
   NULL_RTX, NULL, do_error,
   profile_probability::very_unlikely 
());
@@ -1844,7 +1844,7 @@ expand_mul_overflow (location_t loc, tre
  if (!op1_medium_p)
{
  tem = expand_simple_binop (hmode, PLUS, hipart1, const1_rtx,
-NULL_RTX, 1, OPTAB_DIRECT);
+NULL_RTX, 1, OPTAB_WIDEN);
  do_compare_rtx_and_jump (tem, const1_rtx, GTU, true, hmode,
   NULL_RTX, NULL, do_error,
   profile_probability::very_unlikely 
());

Jakub


[committed] i386.c diagnostics fix

2017-11-20 Thread Jakub Jelinek
Hi!

While working on the just posted patch, I've noticed another spot
in i386.c that uses incorrect capitalization in diagnostics.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
committed to trunk as obvious.

2017-11-20  Jakub Jelinek  

* config/i386/i386.c (parse_mtune_ctrl_str): Start diagnostics
with lower case letter.

--- gcc/config/i386/i386.c.jj   2017-11-20 11:02:42.0 +0100
+++ gcc/config/i386/i386.c  2017-11-20 11:44:16.286754295 +0100
@@ -3280,7 +3280,7 @@ parse_mtune_ctrl_str (bool dump)
 }
 }
   if (i == X86_TUNE_LAST)
-error ("Unknown parameter to option -mtune-ctrl: %s",
+error ("unknown parameter to option -mtune-ctrl: %s",
clear ? curr_feature_string - 1 : curr_feature_string);
   curr_feature_string = next_feature_string;
 }

Jakub


[C/C++ PATCH] Fix ICE in get_atomic_generic_size (PR c++/83059)

2017-11-20 Thread Jakub Jelinek
Hi!

Not all INTEGER_CSTs satisfy tree_fits_uhwi_p, and those that don't
ICE in tree_to_uhwi.  But, what the memmodel_base function actually
does is mask only the 16 low bits, the upper bits are reserved for
target dependent flags and -Winvalid-memory-model already warns
about those during expansion.  So it makes no sense to warn about say
(-1 << 16) here, thus I'm using TREE_INT_CST_LOW instead of tree_to_*hwi,
because we really care just about the low 16 bits thereof.

The i386.c change is an obvious capitalization fix.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-11-20  Jakub Jelinek  

PR c++/83059
* c-common.c (get_atomic_generic_size): Use TREE_INT_CST_LOW
instead of tree_to_uhwi, formatting fix.
* config/i386/i386.c (ix86_memmodel_check): Start
-Winvalid-memory-model diagnostics with lowercase letter.

* c-c++-common/pr83059.c: New test.

--- gcc/c-family/c-common.c.jj  2017-11-15 09:38:22.0 +0100
+++ gcc/c-family/c-common.c 2017-11-20 11:35:08.720607317 +0100
@@ -6671,13 +6671,10 @@ get_atomic_generic_size (location_t loc,
   tree p = (*params)[x];
   if (TREE_CODE (p) == INTEGER_CST)
 {
- int i = tree_to_uhwi (p);
- if (i < 0 || (memmodel_base (i) >= MEMMODEL_LAST))
-   {
- warning_at (loc, OPT_Winvalid_memory_model,
- "invalid memory model argument %d of %qE", x + 1,
- function);
-   }
+ if (memmodel_base (TREE_INT_CST_LOW (p)) >= MEMMODEL_LAST)
+   warning_at (loc, OPT_Winvalid_memory_model,
+   "invalid memory model argument %d of %qE", x + 1,
+   function);
}
   else
if (!INTEGRAL_TYPE_P (TREE_TYPE (p)))
--- gcc/config/i386/i386.c.jj   2017-11-20 11:02:42.0 +0100
+++ gcc/config/i386/i386.c  2017-11-20 11:44:16.286754295 +0100
@@ -49066,7 +49066,7 @@ ix86_memmodel_check (unsigned HOST_WIDE_
   || ((val & IX86_HLE_ACQUIRE) && (val & IX86_HLE_RELEASE)))
 {
   warning (OPT_Winvalid_memory_model,
-  "Unknown architecture specific memory model");
+  "unknown architecture specific memory model");
   return MEMMODEL_SEQ_CST;
 }
   strong = (is_mm_acq_rel (model) || is_mm_seq_cst (model));
--- gcc/testsuite/c-c++-common/pr83059.c.jj 2017-11-20 12:01:52.491531090 
+0100
+++ gcc/testsuite/c-c++-common/pr83059.c2017-11-20 12:02:32.930024696 
+0100
@@ -0,0 +1,10 @@
+/* PR c++/83059 */
+/* { dg-do compile } */
+
+void
+foo (int *p, int *q, int *r)
+{
+  __atomic_compare_exchange (p, q, r, 0, 0, -1);   /* { dg-warning 
"invalid memory model argument 6" } */
+  /* { dg-warning "unknown architecture specifi" "" { target *-*-* } .-1 } */
+  /* { dg-warning "failure memory model cannot be stronger than success memory 
model" "" { target *-*-* } .-2 } */
+}

Jakub


Re: [14/nn] Add helpers for shift count modes

2017-11-20 Thread Richard Sandiford
Richard Biener  writes:
> On Thu, Oct 26, 2017 at 2:06 PM, Richard Biener
>  wrote:
>> On Mon, Oct 23, 2017 at 1:25 PM, Richard Sandiford
>>  wrote:
>>> This patch adds a stub helper routine to provide the mode
>>> of a scalar shift amount, given the mode of the values
>>> being shifted.
>>>
>>> One long-standing problem has been to decide what this mode
>>> should be for arbitrary rtxes (as opposed to those directly
>>> tied to a target pattern).  Is it the mode of the shifted
>>> elements?  Is it word_mode?  Or maybe QImode?  Is it whatever
>>> the corresponding target pattern says?  (In which case what
>>> should the mode be when the target doesn't have a pattern?)
>>>
>>> For now the patch picks word_mode, which should be safe on
>>> all targets but could perhaps become suboptimal if the helper
>>> routine is used more often than it is in this patch.  As it
>>> stands the patch does not change the generated code.
>>>
>>> The patch also adds a helper function that constructs rtxes
>>> for constant shift amounts, again given the mode of the value
>>> being shifted.  As well as helping with the SVE patches, this
>>> is one step towards allowing CONST_INTs to have a real mode.
>>
>> I think gen_shift_amount_mode is flawed and while encapsulating
>> constant shift amount RTX generation into a gen_int_shift_amount
>> looks good to me I'd rather have that ??? in this function (and
>> I'd use the mode of the RTX shifted, not word_mode...).

OK.  I'd gone for word_mode because that's what expand_binop uses
for CONST_INTs:

  op1_mode = (GET_MODE (op1) != VOIDmode
  ? as_a  (GET_MODE (op1))
  : word_mode);

But using the inner mode should be fine too.  The patch below does that.

>> In the end it's up to insn recognizing to convert the op to the
>> expected mode and for generic RTL it's us that should decide
>> on the mode -- on GENERIC the shift amount has to be an
>> integer so why not simply use a mode that is large enough to
>> make the constant fit?

...but I can do that instead if you think it's better.

>> Just throwing in some comments here, RTL isn't my primary
>> expertise.
>
> To add a little bit - shift amounts is maybe the only(?) place
> where a modeless CONST_INT makes sense!  So "fixing"
> that first sounds backwards.

But even here they have a mode conceptually, since out-of-range shift
amounts are target-defined rather than undefined.  E.g. if the target
interprets the shift amount as unsigned, then for a shift amount
(const_int -1) it matters whether the mode is QImode (and so we're
shifting by 255) or HImode (and so we're shifting by 65535.

OK, so shifts by 65535 make no sense in practice, but *conceptually*... :-)

Jeff Law  writes:
> On 10/26/2017 06:06 AM, Richard Biener wrote:
>> On Mon, Oct 23, 2017 at 1:25 PM, Richard Sandiford
>>  wrote:
>>> This patch adds a stub helper routine to provide the mode
>>> of a scalar shift amount, given the mode of the values
>>> being shifted.
>>>
>>> One long-standing problem has been to decide what this mode
>>> should be for arbitrary rtxes (as opposed to those directly
>>> tied to a target pattern).  Is it the mode of the shifted
>>> elements?  Is it word_mode?  Or maybe QImode?  Is it whatever
>>> the corresponding target pattern says?  (In which case what
>>> should the mode be when the target doesn't have a pattern?)
>>>
>>> For now the patch picks word_mode, which should be safe on
>>> all targets but could perhaps become suboptimal if the helper
>>> routine is used more often than it is in this patch.  As it
>>> stands the patch does not change the generated code.
>>>
>>> The patch also adds a helper function that constructs rtxes
>>> for constant shift amounts, again given the mode of the value
>>> being shifted.  As well as helping with the SVE patches, this
>>> is one step towards allowing CONST_INTs to have a real mode.
>> 
>> I think gen_shift_amount_mode is flawed and while encapsulating
>> constant shift amount RTX generation into a gen_int_shift_amount
>> looks good to me I'd rather have that ??? in this function (and
>> I'd use the mode of the RTX shifted, not word_mode...).
>> 
>> In the end it's up to insn recognizing to convert the op to the
>> expected mode and for generic RTL it's us that should decide
>> on the mode -- on GENERIC the shift amount has to be an
>> integer so why not simply use a mode that is large enough to
>> make the constant fit?
>> 
>> Just throwing in some comments here, RTL isn't my primary
>> expertise.
> I wonder if encapsulation + a target hook to specify the mode would be
> better?  We'd then have to argue over word_mode, vs QImode vs something
> else for the default, but at least we'd have a way for the target to
> specify the mode is generally best when working on shift counts.
>
> In the end I doubt there's a single definition that is 

Re: [PATCH libstdc++/66689] comp_ellint_3 and ellint_3 return garbage values

2017-11-20 Thread Jonathan Wakely

On 20/11/17 21:43 +0100, Christophe Lyon wrote:

On 20 November 2017 at 17:02, David Edelsohn  wrote:

This patch has introduced new regressions on at least PowerPC and AArch64.

FAIL: ext/special_functions/hyperg/check_value.cc execution test
FAIL: tr1/5_numerical_facilities/special_functions/17_hyperg/check_value.cc
execution test

Thanks, David


On AArch64 and ARM, I have also noticed
FAIL: special_functions/18_riemann_zeta/check_value.cc (test for excess errors)
UNRESOLVED: special_functions/18_riemann_zeta/check_value.cc
compilation failed to produce executable
because:
/libstdc++-v3/testsuite/special_functions/18_riemann_zeta/check_value.cc:
In function 'void test(const testcase_riemann_zeta (&)[Num],
Ret)':
/libstdc++-v3/testsuite/special_functions/18_riemann_zeta/check_value.cc:292:
error: 'riemann_zeta' is not a member of 'std'
/libstdc++-v3/testsuite/special_functions/18_riemann_zeta/check_value.cc:292:
note: suggested alternative: 'remainder'
compiler exited with status 1


The problem is that { dg-addition-options } was changed to dg-options,
and so the first dg-options that enables the special functions is not
used:

--- a/libstdc++-v3/testsuite/special_functions/18_riemann_zeta/check_value.cc
+++ b/libstdc++-v3/testsuite/special_functions/18_riemann_zeta/check_value.cc
@@ -21,7 +21,7 @@
//  riemann_zeta

// This can take long on simulators, timing out the test.
-// { dg-additional-options "-DMAX_ITERATIONS=5" { target simulator } }
+// { dg-options "-DMAX_ITERATIONS=5" { target simulator } }




[PATCH] Fix load_gsi computation in store-merging (PR tree-optimization/83047)

2017-11-20 Thread Jakub Jelinek
Hi!

This is something the bswap pass has been already doing, but not the
new load handling code in store-merging.  If all the loads have the same
vuse, it still doesn't mean they are necessarily in the same basic block.
If they are in multiple bbs, previously we've chosen randomly (well, from
the load corresponding to the first store in the group), now we pick at
least the last basic block.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

It might not be good enough, if some program has
  int x = q[0]; foo (x); int y = q[1]; p[0] = x; p[1] = y; and
foo conditionally exits or aborts or externally throws or loops forever
in case q[1] might not be mapped, then even when both loads are in the
same bb we might put the larger load on the first load rather than the
second.  I think we'd need to compute uids (perhaps lazily) and compare
which stmt comes last.  Thoughts on that?

2017-11-20  Jakub Jelinek  

PR tree-optimization/83047
* gimple-ssa-store-merging.c
(imm_store_chain_info::output_merged_store): If the loads with the
same vuse are in different basic blocks, for load_gsi pick a load
location that is dominated by the other loads.

* gcc.dg/pr83047.c: New test.

--- gcc/gimple-ssa-store-merging.c.jj   2017-11-17 08:40:25.0 +0100
+++ gcc/gimple-ssa-store-merging.c  2017-11-20 10:37:40.429859947 +0100
@@ -1857,7 +1857,30 @@ imm_store_chain_info::output_merged_stor
   store_immediate_info *infol = group->stores.last ();
   if (gimple_vuse (op.stmt) == gimple_vuse (infol->ops[j].stmt))
{
- load_gsi[j] = gsi_for_stmt (op.stmt);
+ /* We can't pick the location randomly; while we've verified
+all the loads have the same vuse, they can be still in different
+basic blocks and we need to pick the one from the last bb:
+int x = q[0];
+if (x == N) return;
+int y = q[1];
+p[0] = x;
+p[1] = y;
+otherwise if we put the wider load at the q[0] load, we might
+segfault if q[1] is not mapped.  */
+ basic_block bb = gimple_bb (op.stmt);
+ gimple *ostmt = op.stmt;
+ store_immediate_info *info;
+ FOR_EACH_VEC_ELT (group->stores, i, info)
+   {
+ gimple *tstmt = info->ops[j].stmt;
+ basic_block tbb = gimple_bb (tstmt);
+ if (dominated_by_p (CDI_DOMINATORS, tbb, bb))
+   {
+ ostmt = tstmt;
+ bb = tbb;
+   }
+   }
+ load_gsi[j] = gsi_for_stmt (ostmt);
  load_addr[j]
= force_gimple_operand_1 (unshare_expr (op.base_addr),
  _seq[j], is_gimple_mem_ref_addr,
--- gcc/testsuite/gcc.dg/pr83047.c.jj   2017-11-20 10:19:26.612657065 +0100
+++ gcc/testsuite/gcc.dg/pr83047.c  2017-11-20 10:24:15.0 +0100
@@ -0,0 +1,58 @@
+/* PR tree-optimization/83047 */
+/* { dg-do run { target mmap } } */
+/* { dg-options "-O2" } */
+
+#include 
+#include 
+#include 
+#ifndef MAP_ANONYMOUS
+#define MAP_ANONYMOUS MAP_ANON
+#endif
+#ifndef MAP_ANON
+#define MAP_ANON 0
+#endif
+#ifndef MAP_FAILED
+#define MAP_FAILED ((void *)-1)
+#endif
+#include 
+
+__attribute__((noipa)) void
+foo (char *p, char *q, int r)
+{
+  char a = q[0];
+  if (r || a == '\0')
+return;
+  char b = q[1];
+  p[0] = a;
+  p[1] = b;
+}
+
+int
+main ()
+{
+  char *p = mmap (NULL, 131072, PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+  if (p == MAP_FAILED)
+return 0;
+  if (munmap (p + 65536, 65536) < 0)
+return 0;
+  p[0] = 'c';
+  p[1] = 'd';
+  p[65536 - 2] = 'a';
+  p[65536 - 1] = 'b';
+  volatile int r = 1;
+  foo (p, p + 65536 - 2, r);
+  if (p[0] != 'c' || p[1] != 'd')
+abort ();
+  r = 0;
+  foo (p, p + 65536 - 2, r);
+  if (p[0] != 'a' || p[1] != 'b')
+abort ();
+  p[0] = 'e';
+  p[1] = 'f';
+  r = 1;
+  foo (p, p + 65536 - 1, r);
+  if (p[0] != 'e' || p[1] != 'f')
+abort ();
+  return 0;
+}

Jakub


Re: [RFC][PATCH] Change default to -fcommon

2017-11-20 Thread Richard Biener
On November 20, 2017 9:02:55 PM GMT+01:00, Sandra Loosemore 
 wrote:
>On 11/20/2017 10:34 AM, Michael Matz wrote:
>> What's your rationale for changing this?  In your initial mail you
>said:
>> 
>> "On many targets this means global variable accesses having an
>unnecessary
>> codesize and performance penalty in C code (the same source generates
>> better code when built as C++)."
>> 
>> I have a hard time imaging that, so can you give details?  FWIW I've
>> personally always considered using common symbols nicer.
>
>The optimization case I'm familiar with is for targets that support -G 
>and GP-relative addressing modes to address small data.  Generally you 
>can only do that if the variable is allocated in the same compilation 
>unit as the reference, so the compiler knows definitely where it's 
>placed (unless the backend also supports some other mechanism for 
>asserting that a variable is small data).  Putting tentatively-defined 
>variables in common defeats that optimization.

Also we cannot raise alignment of commons and thus vectorization is pessimized 
(all vectorizer testcases use - fno-common). 

IIRC LTO promotes commons to locals. 

Richard. 

>-Sandra



Re: [PATCH libstdc++/66689] comp_ellint_3 and ellint_3 return garbage values

2017-11-20 Thread Christophe Lyon
On 20 November 2017 at 17:02, David Edelsohn  wrote:
> This patch has introduced new regressions on at least PowerPC and AArch64.
>
> FAIL: ext/special_functions/hyperg/check_value.cc execution test
> FAIL: tr1/5_numerical_facilities/special_functions/17_hyperg/check_value.cc
> execution test
>
> Thanks, David

On AArch64 and ARM, I have also noticed
FAIL: special_functions/18_riemann_zeta/check_value.cc (test for excess errors)
UNRESOLVED: special_functions/18_riemann_zeta/check_value.cc
compilation failed to produce executable
because:
/libstdc++-v3/testsuite/special_functions/18_riemann_zeta/check_value.cc:
In function 'void test(const testcase_riemann_zeta (&)[Num],
Ret)':
/libstdc++-v3/testsuite/special_functions/18_riemann_zeta/check_value.cc:292:
error: 'riemann_zeta' is not a member of 'std'
/libstdc++-v3/testsuite/special_functions/18_riemann_zeta/check_value.cc:292:
note: suggested alternative: 'remainder'
compiler exited with status 1


Re: Increase precision of static profiles

2017-11-20 Thread Christophe Lyon
On 19 November 2017 at 11:01, Tom de Vries  wrote:
> On 11/17/2017 08:53 PM, Jan Hubicka wrote:
>>
>> Hi,
>> this patch makes static profile to be in range 0...2^30 rather than
>> 0...1.  This is safe now as profile-counts are taking care of
>> possible overflow when the profile ends up cummulating too high after
>> inlining.
>> MThere are two testcases that needs adusting. dump-2.c simply checks
>> for specific value of counter that is now different. pr77445-2
>> now gets one extra mismatch reported. The mismatch was present before
>> too but due to low precision it was not visible.
>>
>> Bootstrapped/regtested x86_64-linux, comitted.
>>
>> Honza
>>
>> * predict.c (determine_unlikely_bbs): Set cgraph node count to 0
>> when entry block was promoted unlikely.
>> (estimate_bb_frequencies): Increase frequency scale.
>> * profile-count.h (profile_count): Export precision info.
>> * gcc.dg/tree-ssa/dump-2.c: Fixup template for profile precision
>> changes.
>> * gcc.dg/tree-ssa/pr77445-2.c: Fixup template for profile
>> precision
>> changes.
>
>
> Hi,
>
> this caused PR 83043 - "FAIL: libgomp.graphite/force-parallel-1.c
> scan-tree-dump-times graphite "2 loops carried no dependency" 1 (found 0
> times)" ( https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83043 ).
>

I have also reported PR 83081
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83081

FAIL: gcc.dg/pr80218.c scan-rtl-dump-not ira "Invalid sum"
on arm-none-linux-gnueabi

This might be a duplicate.

Christophe

> Thanks,
> - Tom


Re: [patch] Add support for #pragma GCC unroll

2017-11-20 Thread Bernhard Reutner-Fischer
On 20 November 2017 at 15:45, Steve Kargl
 wrote:
> On Mon, Nov 20, 2017 at 12:57:47PM +0100, Bernhard Reutner-Fischer wrote:
>> On 20 November 2017 at 12:26, Eric Botcazou  wrote:
>> >> If anybody finds the time to push the corresponding Fortran changes then 
>> >> I'd
>> >> be grateful. I won't have time for this until end of stage 1...
>> >>
>> >> https://gcc.gnu.org/ml/fortran/2015-02/msg00014.html
>> >
>> > OK, I'm going to merge it in the main patch.
>>
>> [CCing fortran@]
>> Thanks alot in advance!
>
> The URL points to a nearly 3 year old patch.  I noticed
> that there is no documentation of the new Fortran directive.
> Section 7.2 of gfortran.info should be updated.  In particular,
> does '!GCC$ UNROLL 4' affect only the immediately following
> DO-LOOP or all DO-LOOPs that follow the directive until another
> 'GCC$ UNROLL ...' is found?  How does this new directive interface
> with OpenMP and OpenACC?

The documentation for the directive is missing indeed. We can fix this
during stage3.

Currently the directive works on the whole function (see
gfc_cfun_has_unroll()) and instructs the loop-optimizers to run on
that function.
The loop-optimizers will discover the ANNOTATE_EXPR and act accordingly.
Richard B. already noted that the RTL unroller might do more than
intended, see https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01468.html
I expect updates to the C and C++ in this area to be reflected to Fortran too.

The interaction with OpenMP and OpenACC in the Fortran FE is the same
as in the other frontends, obviously.
Eric's current respin of Mikes patch is here, FYI:
https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01452.html

thanks,


Re: [PATCH, rs6000] Testcase updates for power9 codegen

2017-11-20 Thread Will Schmidt
On Mon, 2017-11-20 at 11:23 -0600, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Nov 17, 2017 at 11:30:27AM -0600, Will Schmidt wrote:
> > [testsuite]
> > * fold-vec-neg-int.c: Specify -mcpu=power8
> 
> Dot at end of sentence (quite a few times), please.
> 
> > -/* { dg-final { scan-assembler-times "vspltisw|vxor" 1 } } */
> > +//# Power9 added xxspltib instruction.
> > +/* { dg-final { scan-assembler-times "vspltisw|vxor|xxspltib" 1 } } */
> 
> That's not the usual comment style (and it won't work with all compiler
> settings).  But does this comment bring any value?  Just leave it out?

In the grand scheme, no.  mostly useful while sifting out the changes in
my sandbox.  I'll cut those.

> > --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-int-fwrapv.c
> > +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-int-fwrapv.c
> > @@ -1,11 +1,11 @@
> >  /* Verify that overloaded built-ins for vec_abs with int
> > inputs produce the right results.  */
> >  
> >  /* { dg-do compile } */
> >  /* { dg-require-effective-target powerpc_altivec_ok } */
> > -/* { dg-options "-maltivec -O2 -fwrapv" } */
> > +/* { dg-options "-maltivec -O2 -fwrapv -mcpu=power8" } */
> 
> Is this testcase really testing something specific to power8 codegen?

Yes, in contrast to power9 codegen.

for example in this case
power8 gens:
/* { dg-final { scan-assembler-times "vspltisw|vxor" 1 } } */
/* { dg-final { scan-assembler-times "vsubuwm" 1 } } */
/* { dg-final { scan-assembler-times "vmaxsw" 1 } } */
power9 gens:
+/* { dg-final { scan-assembler-times "vnegw" 1 } } */

> Making all these testcases use -mcpu=power8 means they won't be
tested 
> with any other settings.  (Also, does that work if the user puts another
> -mcpu= in RUNTESTFLAGS).

possibly not.  I did (do?) have a do-not-override option in place
initially for a couple of the tests, but didn't seem to need it during
my sniff-testing.I can try a few runs with manually specifying -mcpu
flags and revisit. 
/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" }
{ "-mcpu=power8" } } */

> I'm all for making the testresults cleaner, but let's not do that by
> (effectively) disabling all failing tests ;-)

No, thats really not my intent here.  I'm not really convinced thats
what I've done here either.   I've added .p9 versions for most of what
I've touched, with the intent to continue to have good coverage.
Adding the -mcpu=foo option to dg-options shouldn't be disabling the
test..?

I see no -mcpu=power8-or-earlier option.  :-/

I'll need a suggestion or two on how to do some of these differently.

> (Please split this up a bit when you repost it btw).

Ok.

> 
> Segher
> 




Re: [PATCH] rs6000: Don't touch below the stack pointer (PR77687)

2017-11-20 Thread Segher Boessenkool
On Tue, Sep 19, 2017 at 09:21:50PM +, Segher Boessenkool wrote:
> With the 32-bit SVR4 ABI we don't have a red zone, so we have to restore
> the callee-saved registers before we restore the stack pointer.
> 
> The previous fix for this PR failed in two ways, for huge frames: first,
> we use a negative offset from r11 in that case, so the (mem:BLK 11) access
> does no good; second, sched does not handle accesses to mem:BLK correctly
> in this case (does not make dependencies).
> 
> This patch fixes it by doing a store to (mem:BLK (scratch)) instead.
> This means no unrelated (not to stack) loads/stores can be moved over the
> stack restore either, but so be it.

I backported this to GCC 7 now.


Segher


> 2017-09-19  Segher Boessenkool  
> 
>   PR target/77687
>   * config/rs6000/rs6000.md (stack_restore_tie): Store to a scratch
>   address instead of to r1 and r11.
> 
> gcc/testsuite/
>   PR target/77687
>   * gcc.target/powerpc/pr77687.c: New testcase.
> 
> 
> ---
>  gcc/config/rs6000/rs6000.md|  6 ++
>  gcc/testsuite/gcc.target/powerpc/pr77687.c | 20 
>  2 files changed, 22 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr77687.c
> 
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 7f17628..13ba743 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -13161,14 +13161,12 @@ (define_insn "stack_tie"
>  
>  ; Some 32-bit ABIs do not have a red zone, so the stack deallocation has to
>  ; stay behind all restores from the stack, it cannot be reordered to before
> -; one.  See PR77687.  This insn is an add or mr, and a stack_tie on the
> -; operands of that.
> +; one.  See PR77687.  This insn is an add or mr, and a memory clobber.
>  (define_insn "stack_restore_tie"
>[(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
>   (plus:SI (match_operand:SI 1 "gpc_reg_operand" "r,r")
>(match_operand:SI 2 "reg_or_cint_operand" "O,rI")))
> -   (set (mem:BLK (match_dup 0)) (const_int 0))
> -   (set (mem:BLK (match_dup 1)) (const_int 0))]
> +   (set (mem:BLK (scratch)) (const_int 0))]
>"TARGET_32BIT"
>"@
> mr %0,%1
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr77687.c 
> b/gcc/testsuite/gcc.target/powerpc/pr77687.c
> new file mode 100644
> index 000..95b27ae
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr77687.c
> @@ -0,0 +1,20 @@
> +/* { dg-options "-std=gnu99 -O2" } */
> +/* { dg-final { scan-assembler-not {\mmr r?1,r?11\M.*11.*\mblr\M} } } */
> +
> +/* PR77687: We used to do stack accesses (via r11) after restoring r1.  */
> +
> +void g(int, char *);
> +const char * dum = "hello";
> +
> +void f(int x)
> +{
> +   char big[20];
> + start:
> +   g(x, big);
> +   g(x, big);
> +   register void *p asm("r11") = &
> +   asm("" : : "r"(p));
> +   asm("" : : :"r28");
> +   asm("" : : :"r29");
> +   asm("" : : :"r30");
> +}
> -- 
> 1.8.3.1


[PATCH] PR 53796 Improve INQUIRE(RECL=...) handling

2017-11-20 Thread Janne Blomqvist
The current F2018 draft (N2137) specifies behavior of the RECL=
specifier in the INQUIRE statement, where it previously was left as
undefined. Namely:

- If the unit is not connected, RECL= should be given the value -1.
- If the unit is connected with stream access, RECL= should be given
  the value -2.

Further, as PR 53796 describes, the handling of RECL= is poor in other
ways as well. When the recl is set to the maximum possible
(GFC_INTEGER_8_HUGE / LLONG_MAX), which it does by default except for
preconnected units, and when INQUIRE(RECL=) is used with a 4 byte
integer, the value is truncated and the 4 byte value is thus
-1. Fixing this to generate an error is a lot of work, as currently
the truncation is done by the frontend, the library sees only an 8
byte value with no indication that the frontend is going to copy it to
a 4 byte one. Instead, this patch does a bit twiddling trick such that
the truncated 4 byte value is GFC_INTEGER_4_HUGE while still being
0. * GFC_INTEGER_8_HUGE which is large enough for all
practical purposes.

Finally, the patch removes GFORTRAN_DEFAULT_RECL which was used only
for preconnected units, and instead uses the same approach as describe
above.

Regtested on x86_64-pc-linux-gnu, Ok for trunk?

gcc/fortran/ChangeLog:

2017-11-20  Janne Blomqvist  

PR fortran/53796
* gfortran.texi: Remove mentions of GFORTRAN_DEFAULT_RECL.

libgfortran/ChangeLog:

2017-11-20  Janne Blomqvist  

PR fortran/53796
* io/inquire.c (inquire_via_unit): Set recl to -1 for unconnected
units.
* io/io.h (default_recl): New variable.
* io/open.c (new_unit): Set recl to default_recl for sequential,
-2 for stream access.
* io/transfer.c (read_block_form): Test against default_recl
instead of DEFAULT_RECL.
(write_block): Likewise.
* io/unit.c (init_units): Calculate max_offset, default_recl.
* libgfortran.h (DEFAULT_RECL): Remove.
* runtime/environ.c: Remove GFORTRAN_DEFAULT_RECL.

gcc/testsuite/ChangeLog:

2017-11-20  Janne Blomqvist  

PR fortran/53796
* gfortran.dg/inquire_recl_f2018.f90: New test.
---
 gcc/fortran/gfortran.texi|  9 -
 gcc/testsuite/gfortran.dg/inquire_recl_f2018.f90 | 42 
 libgfortran/io/inquire.c |  4 ++-
 libgfortran/io/io.h  |  5 +++
 libgfortran/io/open.c|  6 ++--
 libgfortran/io/transfer.c|  4 +--
 libgfortran/io/unit.c| 33 ---
 libgfortran/libgfortran.h|  8 +
 libgfortran/runtime/environ.c|  4 ---
 9 files changed, 79 insertions(+), 36 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/inquire_recl_f2018.f90

diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index 4b4688c..36c7b94 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -600,7 +600,6 @@ Malformed environment variables are silently ignored.
 * GFORTRAN_UNBUFFERED_PRECONNECTED:: Do not buffer I/O for preconnected units.
 * GFORTRAN_SHOW_LOCUS::  Show location for runtime errors
 * GFORTRAN_OPTIONAL_PLUS:: Print leading + where permitted
-* GFORTRAN_DEFAULT_RECL:: Default record length for new files
 * GFORTRAN_LIST_SEPARATOR::  Separator for list output
 * GFORTRAN_CONVERT_UNIT::  Set endianness for unformatted I/O
 * GFORTRAN_ERROR_BACKTRACE:: Show backtrace on run-time errors
@@ -683,14 +682,6 @@ where permitted by the Fortran standard.  If the first 
letter
 is @samp{n}, @samp{N} or @samp{0}, a plus sign is not printed
 in most cases.  Default is not to print plus signs.
 
-@node GFORTRAN_DEFAULT_RECL
-@section @env{GFORTRAN_DEFAULT_RECL}---Default record length for new files
-
-This environment variable specifies the default record length, in
-bytes, for files which are opened without a @code{RECL} tag in the
-@code{OPEN} statement.  This must be a positive integer.  The
-default value is 1073741824 bytes (1 GB).
-
 @node GFORTRAN_LIST_SEPARATOR
 @section @env{GFORTRAN_LIST_SEPARATOR}---Separator for list output
 
diff --git a/gcc/testsuite/gfortran.dg/inquire_recl_f2018.f90 
b/gcc/testsuite/gfortran.dg/inquire_recl_f2018.f90
new file mode 100644
index 000..8a13340
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/inquire_recl_f2018.f90
@@ -0,0 +1,42 @@
+! { dg-do run }
+! PR 53796 INQUIRE(RECL=...)
+program inqrecl
+  implicit none
+  integer(8) :: r
+  integer :: r4
+  ! F2018 (N2137) 12.10.2.26: recl for unconnected should be -1
+  inquire(10, recl=r)
+  if (r /= -1) then
+ call abort()
+  end if
+  
+  ! Formatted sequential
+  open(10, status="scratch")
+  inquire(10, recl=r)
+  inquire(10, recl=r4)
+  close(10)
+  if (r /= huge(0_8) - huge(0_4) - 1) then
+ call abort()
+  end if
+  if (r4 /= huge(0)) then
+ call 

Re: [RFC][PATCH] Change default to -fcommon

2017-11-20 Thread Sandra Loosemore

On 11/20/2017 10:34 AM, Michael Matz wrote:

What's your rationale for changing this?  In your initial mail you said:

"On many targets this means global variable accesses having an unnecessary
codesize and performance penalty in C code (the same source generates
better code when built as C++)."

I have a hard time imaging that, so can you give details?  FWIW I've
personally always considered using common symbols nicer.


The optimization case I'm familiar with is for targets that support -G 
and GP-relative addressing modes to address small data.  Generally you 
can only do that if the variable is allocated in the same compilation 
unit as the reference, so the compiler knows definitely where it's 
placed (unless the backend also supports some other mechanism for 
asserting that a variable is small data).  Putting tentatively-defined 
variables in common defeats that optimization.


-Sandra


Re: [PATCH, i386]: Add bswaphi2 insn pattern

2017-11-20 Thread Uros Bizjak
Hello!

Attached patch introduces bswaphi2 named insn pattern that results in
movbe instruction.

Without the patch, the following testcase:

void baz (char *buf, unsigned int data)
{
  buf[0] = data >> 8;
  buf[1] = data;
}

compiles to (-O2 -march=haswell):

rolw$8, %si
movw%si, (%rdi)

and with patched compiler:

movbe   %si, (%rdi)

2017-11-20  Uros Bizjak  

* config/i386/i386.md (bswaphi2): New expander.
(*bswaphi2_movbe): New insn pattern.
(bswaphi -> rorhi pepehole2): New peephole pattern.

testsuite/ChangeLog:

2017-11-20  Uros Bizjak  
Jakub Jelinek  

* gcc.target/i386/movbe-5.c: New test.

Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 254958)
+++ config/i386/i386.md (working copy)
@@ -14091,6 +14091,39 @@
(set_attr "modrm" "0")
(set_attr "mode" "")])
 
+(define_expand "bswaphi2"
+  [(set (match_operand:HI 0 "register_operand")
+   (bswap:HI (match_operand:HI 1 "nonimmediate_operand")))]
+  "TARGET_MOVBE")
+
+(define_insn "*bswaphi2_movbe"
+  [(set (match_operand:HI 0 "nonimmediate_operand" "=Q,r,m")
+   (bswap:HI (match_operand:HI 1 "nonimmediate_operand" "0,m,r")))]
+  "TARGET_MOVBE
+   && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
+  "@
+xchg{b}\t{%h0, %b0|%b0, %h0}
+movbe\t{%1, %0|%0, %1}
+movbe\t{%1, %0|%0, %1}"
+  [(set_attr "type" "imov")
+   (set_attr "modrm" "*,1,1")
+   (set_attr "prefix_0f" "*,1,1")
+   (set_attr "prefix_extra" "*,1,1")
+   (set_attr "pent_pair" "np,*,*")
+   (set_attr "athlon_decode" "vector,*,*")
+   (set_attr "amdfam10_decode" "double,*,*")
+   (set_attr "bdver1_decode" "double,*,*")
+   (set_attr "mode" "QI,HI,HI")])
+
+(define_peephole2
+  [(set (match_operand:HI 0 "general_reg_operand")
+   (bswap:HI (match_dup 0)))]
+  "TARGET_MOVBE
+   && !(TARGET_USE_XCHGB || optimize_function_for_size_p (cfun))
+   && peep2_regno_dead_p (0, FLAGS_REG)"
+  [(parallel [(set (match_dup 0) (rotate:HI (match_dup 0) (const_int 8)))
+ (clobber (reg:CC FLAGS_REG))])])
+
 (define_insn "*bswaphi_lowpart_1"
   [(set (strict_low_part (match_operand:HI 0 "register_operand" "+Q,r"))
(bswap:HI (match_dup 0)))
Index: testsuite/gcc.target/i386/movbe-5.c
===
--- testsuite/gcc.target/i386/movbe-5.c (nonexistent)
+++ testsuite/gcc.target/i386/movbe-5.c (working copy)
@@ -0,0 +1,18 @@
+/* PR tree-optimization/78821 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mmovbe" } */
+/* { dg-final { scan-assembler-times "movbe\[ \t\]" 2 } } */
+
+unsigned short
+foo (unsigned short *buf)
+{
+  unsigned short a = buf[0];
+  return ((unsigned short) (a >> 8)) | (unsigned short) (a << 8);
+}
+
+void
+bar (char *buf, unsigned int data)
+{
+  buf[0] = data >> 8;
+  buf[1] = data;
+}


[PATCH, i386]: Add bswaphi2 insn pattern

2017-11-20 Thread Uros Bizjak
Hello!

Attached patch introduces bswaphi2 named insn pattern that results in
movbe instruction.

Without the patch, the following testcase:


Re: [patch] Add support for #pragma GCC unroll

2017-11-20 Thread Sandra Loosemore

On 11/17/2017 03:23 AM, Eric Botcazou wrote:


Index: doc/extend.texi
===
--- doc/extend.texi (revision 254797)
+++ doc/extend.texi (working copy)
@@ -22376,6 +22376,18 @@ void ignore_vec_dep (int *a, int k, int
 @}
 @end smallexample
 
+@table @code

+@item #pragma GCC unroll @var{n}
+@cindex pragma GCC unroll @var{n}
+
+With this pragma, the programmer informs the optimizer how many times
+a loop should be unrolled.  A 0 or 1 informs the compiler to not
+perform any loop unrolling.  The pragma must be immediately before
+@samp{#pragma ivdep} or a @code{for}, @code{while} or @code{do} loop
+and applies only to the loop that follows.  @var{n} is an
+assignment-expression that evaluates to an integer constant.
+
+@end table
 
 @node Unnamed Fields

 @section Unnamed Structure and Union Fields


This documentation patch needs some work.

First of all, the structuring in this section is screwed up.  The 
discussion and examples for the previous item (#pragma ivdep) should be 
moved inside the @table so that you don't have to introduce another 
@table here, just insert another entry into the existing one.


Second, we shouldn't be talking about "the programmer" in the third 
person; programmers are "you", the readers of the manual.  The paragraph 
structure and phrasing seem awkward as well.  How about something like 
this instead?


You can use this pragma to control how many times a loop should be 
unrolled.  It must be placed immediately before a @code{for}, 
@code{while} or @code{do} loop or a @samp{#pragma ivdep}, and applies 
only to the loop that follows.  @var{n} is an integer constant 
expression; a value of 0 or 1 disables unrolling of the loop.


-Sandra


Re: [PATCH] avoid assuming strncat/strncpy don't change length of source string (PR 83075)

2017-11-20 Thread Jakub Jelinek
On Mon, Nov 20, 2017 at 12:15:09PM -0700, Martin Sebor wrote:
>   PR tree-optimization/83075
>   * tree-ssa-strlen.c (handle_builtin_stxncpy): Avoid assuming
>   strncat/strncpy don't change length of source string.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/83075
>   * gcc.dg/tree-ssa/strncat.c: New test.
>   * gcc.dg/tree-ssa/strncpy-2.c: Same.
> 
> Index: gcc/testsuite/gcc.dg/tree-ssa/strncat.c
> ===
> --- gcc/testsuite/gcc.dg/tree-ssa/strncat.c   (nonexistent)
> +++ gcc/testsuite/gcc.dg/tree-ssa/strncat.c   (working copy)
> @@ -0,0 +1,22 @@
> +/* PR tree-optimization/83075 - Invalid strncpy optimization
> +   { dg-do compile }
> +   { dg-options "-O2 -Wno-stringop-overflow -fdump-tree-optimized" } */
> +
> +void test_overlapping_strncpy (void)
> +{
> +  char a[8] = "";
> +
> +  __builtin_strcpy (a, "123");
> +
> +  unsigned n0 = __builtin_strlen (a);
> +
> +  __builtin_strncat (a + 3, a, n0);
> +
> +  unsigned n1 = __builtin_strlen (a);
> +
> +  if (n1 == n0)
> +__builtin_abort ();
> +}
> +
> +/* Verify the call to abort hasn't been eliminated.
> +   { dg-final { scan-tree-dump-times "abort" 1 "optimized" } } */
> Index: gcc/testsuite/gcc.dg/tree-ssa/strncpy-2.c
> ===
> --- gcc/testsuite/gcc.dg/tree-ssa/strncpy-2.c (nonexistent)
> +++ gcc/testsuite/gcc.dg/tree-ssa/strncpy-2.c (working copy)
> @@ -0,0 +1,22 @@
> +/* PR tree-optimization/83075 - Invalid strncpy optimization
> +   { dg-do compile }
> +   { dg-options "-O2 -Wno-stringop-overflow -fdump-tree-optimized" } */
> +
> +void test_overlapping_strncpy (void)
> +{
> +  char a[8] = "";
> +
> +  __builtin_strcpy (a, "123");
> +
> +  unsigned n0 = __builtin_strlen (a);
> +
> +  __builtin_strncpy (a + 3, a, n0);
> +
> +  unsigned n1 = __builtin_strlen (a);
> +
> +  if (n1 == n0)
> +__builtin_abort ();
> +}
> +
> +/* Verify the call to abort hasn't been eliminated.
> +   { dg-final { scan-tree-dump-times "abort" 1 "optimized" } } */

I think it is better to have a dg-do run testcase and not scan for abort.
We can be able to optimize at some point n0 to 3 and n1 to 6 and optimize
away the abort.  In the testcase I've added to the PR there was a separate
function, you could add __attribute__((noipa)) to it to make sure that
it isn't IPA optimized.
Similarly for the strncat test.

> Index: gcc/tree-ssa-strlen.c
> ===
> --- gcc/tree-ssa-strlen.c (revision 254961)
> +++ gcc/tree-ssa-strlen.c (working copy)
> @@ -1931,10 +1931,9 @@ handle_builtin_stxncpy (built_in_function, gimple_
>int sidx = get_stridx (src);
>strinfo *sisrc = sidx > 0 ? get_strinfo (sidx) : NULL;
>  
> -  /* Strncpy() et al. cannot modify the source string.  Prevent the rest
> - of the pass from invalidating the strinfo data.  */
> -  if (sisrc)
> -sisrc->dont_invalidate = true;
> +  /* Strncat() and strncpy() can modify the source string by specifying
> + as a destination SRC + strlen(SRC) and a bound <= strlen(SRC) so
> + SISRC->DONT_INVALIDATE must be left clear.  */

The destination could be SRC + N and as long as LEN <= N, you shouldn't
invalidate SRC.  Though, if LEN < strlen(SRC) I think you return early.

Jakub


Re: [Patch][aarch64] Use IFUNCs to enable LSE instructions in libatomic on aarch64

2017-11-20 Thread Steve Ellcey
On Mon, 2017-11-20 at 18:27 +, James Greenhalgh wrote:
> 
> If you have the time, would you mind giving me a quick run-down of what
> design decisions went in to this patch, and why they are the right thing
> to do? Sorry to offload that, but it will be the most efficient route
> to a review.

The main design decision was to use the existing IFUNC infrastructure
that is used on ARM32 to enable atomic instructions that were added
with armv7-a, on i386 to enable instructions added with i586, and on
x86_64 to enable instructions added with cx16.

The basic idea for all these is to allow users who create programs that
use the atomic_* functions to use new instructions on machines that
support them while also working on older machines that do not support
them and to not have to create two separate executables.

Some atomic_* functions get inlined into programs, and those will
either use or not use LSE instructions based on the compiler arguments
used during compilations.  If you want your program to work on all
machines you have to not compile for LSE intructions.  But other
functions (or all functions if -fno-inline-atomics is used) will call
the libatomic library.  Currently those functions do not use LSE
instructions but with this patch we can use the IFUNC infrastructure to
check for LSE support and use LSE in libatomic on machines where it is
supported or not use it on machines where it is not supported.

As an example of what this change does, __atomic_compare_exchange_8 will
turn into a call to libat_compare_exchange_8_i1 on a machine that supports
LSE:

 :
   0:   f9400023    ldr x3, [x1]
   4:   aa0303e4    mov x4, x3
   8:   c8e4fc02    casal   x4, x2, [x0]
   c:   eb03009f    cmp x4, x3
  10:   1a9f17e0    csetw0, eq
  14:   3540    cbnzw0, 1c 
  18:   f924    str x4, [x1]
  1c:   d65f03c0    ret

But call libat_compare_exchange_8 on a machine without LSE:

 :
   0:   f9400023    ldr x3, [x1]
   4:   c85ffc04    ldaxr   x4, [x0]
   8:   eb03009f    cmp x4, x3
   c:   5461    b.ne18 
  10:   c805fc02    stlxr   w5, x2, [x0]
  14:   3585    cbnzw5, 4 
  18:   1a9f17e0    csetw0, eq
  1c:   3440    cbz w0, 24 
  20:   d65f03c0    ret
  24:   f924    str x4, [x1]
  28:   d65f03c0    ret
  2c:   d503201f    nop

Steve Ellcey
sell...@cavium.com


[PATCH] avoid assuming strncat/strncpy don't change length of source string (PR 83075)

2017-11-20 Thread Martin Sebor

Unlike strcat and strcpy, it is valid for strncat and strncpy
to result in modifying the source string by overwriting its
terminating nul.  The attached patch removes an assumption
to the contrary introduced with the -Wstringop-truncation
enhancement.

Martin
PR tree-optimization/83075 - Invalid strncpy optimization

gcc/ChangeLog:

	PR tree-optimization/83075
	* tree-ssa-strlen.c (handle_builtin_stxncpy): Avoid assuming
	strncat/strncpy don't change length of source string.

gcc/testsuite/ChangeLog:

	PR tree-optimization/83075
	* gcc.dg/tree-ssa/strncat.c: New test.
	* gcc.dg/tree-ssa/strncpy-2.c: Same.

Index: gcc/testsuite/gcc.dg/tree-ssa/strncat.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/strncat.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/tree-ssa/strncat.c	(working copy)
@@ -0,0 +1,22 @@
+/* PR tree-optimization/83075 - Invalid strncpy optimization
+   { dg-do compile }
+   { dg-options "-O2 -Wno-stringop-overflow -fdump-tree-optimized" } */
+
+void test_overlapping_strncpy (void)
+{
+  char a[8] = "";
+
+  __builtin_strcpy (a, "123");
+
+  unsigned n0 = __builtin_strlen (a);
+
+  __builtin_strncat (a + 3, a, n0);
+
+  unsigned n1 = __builtin_strlen (a);
+
+  if (n1 == n0)
+__builtin_abort ();
+}
+
+/* Verify the call to abort hasn't been eliminated.
+   { dg-final { scan-tree-dump-times "abort" 1 "optimized" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/strncpy-2.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/strncpy-2.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/tree-ssa/strncpy-2.c	(working copy)
@@ -0,0 +1,22 @@
+/* PR tree-optimization/83075 - Invalid strncpy optimization
+   { dg-do compile }
+   { dg-options "-O2 -Wno-stringop-overflow -fdump-tree-optimized" } */
+
+void test_overlapping_strncpy (void)
+{
+  char a[8] = "";
+
+  __builtin_strcpy (a, "123");
+
+  unsigned n0 = __builtin_strlen (a);
+
+  __builtin_strncpy (a + 3, a, n0);
+
+  unsigned n1 = __builtin_strlen (a);
+
+  if (n1 == n0)
+__builtin_abort ();
+}
+
+/* Verify the call to abort hasn't been eliminated.
+   { dg-final { scan-tree-dump-times "abort" 1 "optimized" } } */
Index: gcc/tree-ssa-strlen.c
===
--- gcc/tree-ssa-strlen.c	(revision 254961)
+++ gcc/tree-ssa-strlen.c	(working copy)
@@ -1931,10 +1931,9 @@ handle_builtin_stxncpy (built_in_function, gimple_
   int sidx = get_stridx (src);
   strinfo *sisrc = sidx > 0 ? get_strinfo (sidx) : NULL;
 
-  /* Strncpy() et al. cannot modify the source string.  Prevent the rest
- of the pass from invalidating the strinfo data.  */
-  if (sisrc)
-sisrc->dont_invalidate = true;
+  /* Strncat() and strncpy() can modify the source string by specifying
+ as a destination SRC + strlen(SRC) and a bound <= strlen(SRC) so
+ SISRC->DONT_INVALIDATE must be left clear.  */
 
   /* Retrieve the strinfo data for the string S that LEN was computed
  from as some function F of strlen (S) (i.e., LEN need not be equal


Re: [patches] Re: [PATCH] RISC-V: Implement __umulsidi3, umul_ppmm and __muluw3

2017-11-20 Thread Palmer Dabbelt

On Sun, 19 Nov 2017 23:31:56 PST (-0800), ja...@redhat.com wrote:

On Sun, Nov 19, 2017 at 08:53:00PM -0800, Jim Wilson wrote:

> 2017-11-24  Kito Cheng  
>
> * longlong.h [__riscv] (__umulsidi3): Define.
> [__riscv] (umul_ppmm) Likewise.
> [__riscv] (__muluw3) Likewise.


BTW, two colons missing after )


Thanks.  Committed!


Re: [committed][PATCH] Fix bogus propagation in DOM

2017-11-20 Thread Jeff Law
On 11/20/2017 03:25 AM, Richard Biener wrote:
> On Sun, Nov 19, 2017 at 9:16 PM, Jeff Law  wrote:
>> On my local branch gcc.dg/torture/pr56349.c fails by sending GCC into an
>> infinite loop trying to simplify a self-referring statement. ie
>> something like
>>
>> x_1 = x_1 + 10;
>>
>> That, of course, shouldn't be happening in SSA form.  After some digging
>> I've found the culprit.
>>
>> Let's say we've got a PHI.
>>
>> a_1 = PHI (a_0, a_2)
>>
>> If DOM decides that the edge associated with a_2 is not executable, then
>> DOM will consider the PHI a degenerate and enter a_1 = a_0 into its
>> equivalence table.
>>
>> That in turn will result in propagation of a_0 into uses of a_1.
>>
>> That, of course, isn't right.  There's nothing that guarantees that the
>> definition of a_0 dominates the uses of a_1.  In the testcase that bogus
>> propagation cascades and eventually results in a self-referring node
>> like I showed above.
>>
>> The solution here is to note whether or not we ignored any PHI
>> arguments.  If we do and the equivalence we want to enter is SSA_NAME =
>> SSA_NAME, then we must reject the equivalence.Obviously if we wanted
>> to enter SSA_NAME = CONST, then we can still do so.
>>
>> Bootstrapped and regression tested on x86_64.  Installing on the trunk.
> 
> Hmm, but if the edge isn't executable then it will be removed and thus the
> definition _will_ dominate.  So I think the error happens elsewhere.  With
> the change you remove one of the advantages of tracking unexecutable
> edges, namely that we can treat those merges optimistically resulting in
> more CSE.
> 
> You didn't add a testcase so I can't have a quick look myself.
> 
> Short: I think you're papering over an issue elsehwere.
Depends on your point of view :-)

It's not something I think you can trigger on the trunk right now.  But
the testcase is pr56349.  You need the embedded vrp bits installed into
DOM and for DOM to also use those bits to detect branches that have a
static destination.

So I'll just walk you through it...

At the start of the second DOM pass we have this:

f ()
{
  int k__lsm.8;
  int * k;
  int a;
  int _2;
  int b.0_3;
  int _4;
  unsigned int ivtmp_5;
  int _6;
  short int iftmp.3_14;
  int _15;
  int b.4_25;
  short int iftmp.3_27;
  short int c.2_33;
  unsigned int ivtmp_38;

;;   basic block 2, loop depth 0
;;pred:   ENTRY
  a_9 = 1;
  ivtmp_38 = 1;
  a_31 = a_9 + 1;
  ivtmp_5 = ivtmp_38 + 4294967295;
  _2 = 1;
  b.0_3 = b;
  _4 = b.0_3 | _2;
  b = _4;
  if (_4 == 0)
goto ; [66.00%]
  else
goto ; [34.00%]
;;succ:   12
;;10

;;   basic block 3, loop depth 1
;;pred:   5
;;3
  goto ; [100.00%]
;;succ:   3

;;   basic block 4, loop depth 0
;;pred:   11
;;12
lbl1:
  c.2_33 = c;
;;succ:   5

;;   basic block 5, loop depth 1
;;pred:   4
;;5
  if (c.2_33 != 0)
goto ; [85.00%]
  else
goto ; [15.00%]
;;succ:   3
;;5

;;   basic block 6, loop depth 0
;;pred:   11
  b.4_25 = b;
  if (b.4_25 != 0)
goto ; [50.00%]
  else
goto ; [50.00%]
;;succ:   7
;;8

;;   basic block 7, loop depth 0
;;pred:   6
  iftmp.3_27 = (short int) b.4_25;
  goto ; [100.00%]
;;succ:   9

;;   basic block 8, loop depth 0
;;pred:   6
;;12
  # k_37 = PHI 
;;succ:   9

;;   basic block 9, loop depth 0
;;pred:   7
;;8
  # iftmp.3_14 = PHI 
  # k_44 = PHI 
  c = iftmp.3_14;
  if (iftmp.3_14 != 0)
goto ; [50.00%]
  else
goto ; [50.00%]
;;succ:   10
;;11

;;   basic block 10, loop depth 0
;;pred:   2
;;9
  # k_10 = PHI <0B(2), k_44(9)>
lbl2:
  b = 0;
;;succ:   11

;;   basic block 11, loop depth 0
;;pred:   10
;;9
  # k_11 = PHI 
  k_30 = k_11 + 4;
  _6 = MEM[(int *)k_11 + 4B];
  if (_6 != 0)
goto ; [66.00%]
  else
goto ; [34.00%]
;;succ:   6
;;4

;;   basic block 12, loop depth 0
;;pred:   2
  _15 = MEM[(int *)0B];
  if (_15 != 0)
goto ; [66.00%]
  else
goto ; [34.00%]
;;succ:   8
;;4

}

The first tidbit of interest is we can statically determine that bb2
will always transfer control to bb10.  The edge 2->12 is marked as not
executable.  That also means that 12->4 and 12->8 are not executable as
well.

The PHI in BB8 is critical:

  # k_37 = PHI 

With the edge 8->12 being unexecutable the PHI is essentially k_37 =
k_30 and we'll try to propagate k_30 into the uses of k_37.

The first use of interest is BB9.


  # k_44 = PHI 

Which turns into
  # k_44 = PHI 

And we've lost.  The assignment to k_30 does not currently dominate k_44
and won't until after we've done cfg 

Re: Patch ping

2017-11-20 Thread Nathan Sidwell

On 11/20/2017 02:55 AM, Jakub Jelinek wrote:


   http://gcc.gnu.org/ml/gcc-patches/2017-11/msg00851.html
   C++2A P0428R2 - familiar template syntax for generic lambdas


Are there existing testcases checking this is permitted w/o warning 
under the appropriate circumstances?  That seems to be the only thing 
missing from the patch itself.


Otherwise ok.

nathan

--
Nathan Sidwell


Re: [patch, fortran] Implement maxloc and minloc for character

2017-11-20 Thread Thomas Koenig

Am 20.11.2017 um 09:30 schrieb Janne Blomqvist:

On Sun, Nov 19, 2017 at 11:11 PM, Thomas Koenig  wrote:

There is one question regarding the ABI. Apparently, the string length
is passed as an int even on a 64-bit system. I verified that this
is indeed the case by doing the actual work on a
powerpc64-unknown-linux-gnu box (gcc110 on the gcc compile farm),
which is big-endian. If we were actually passing an eight-byte
quantity, and only getting the upper bytes, we would crash & burn.

Now, I _thought_ we were passing string lengths as size_t now (Janne?),
but maybe something was missing in that change.


Unfortunately I had to revert the charlen->size_t patch since it
caused regressions on aix/power (presumably due to endianness issues).


Ah, that explains it. I had forgotten the reversion part.


There's apparently some other process for getting compile farm
accounts nowadays, and we have broken the ABI again for gcc 8, so
maybe I should dust off the patch and try again. Or what do you think?


You can apply at https://cfarm.tetaneutral.net/ . These machines are
indeed quite nice to work on, especially because of the different
architectures (and because there are some very powerful machines
there).

So, any other comments about my patch? OK for trunk?

Regards

Thomas


Re: Patch ping

2017-11-20 Thread Nathan Sidwell

On 11/20/2017 02:55 AM, Jakub Jelinek wrote:

Hi!



   http://gcc.gnu.org/ml/gcc-patches/2017-11/msg01026.html
   C++2A P0329R4: Designated Initialization


OK, thanks


--
Nathan Sidwell


Re: [Patch][aarch64] Use IFUNCs to enable LSE instructions in libatomic on aarch64

2017-11-20 Thread James Greenhalgh
On Mon, Nov 20, 2017 at 05:39:25PM +, Steve Ellcey wrote:
> Re-ping with a CC to the Aarch64 maintainers.

If I'm completely honest with myself, I don't know enough about this area
to review the patch.

Szabolcs' OK holds a lot of weight with me, but I'd like to understand more
of the top-level overview of what this patch means.

If you have the time, would you mind giving me a quick run-down of what
design decisions went in to this patch, and why they are the right thing
to do? Sorry to offload that, but it will be the most efficient route
to a review.

Otherwise, I'll try and make some time for this once I've made it
through the Stack-smashing and SVE reviews. (Or another maintainer
might beat me to it :-) )

Thanks,
James

> > > > looks good to me, but i cannot approve.
> > > > 
> > > > (this will make libatomic depend on ifuncs on aarch64*-linux-gnu.)
> > > If you build GCC with default options, ifunc is enabled on aarch64
> > > and
> > > used by libatomic but if you configure with '--disable-gnu-
> > > indirect-
> > > function' then GCC will not recognize the 'resolver' attribute and
> > > libatomic will build the 'old' way and not use IFUNC's.
> > > 
> > > It seems like using '--disable-version-specific-runtime-libs' should
> > > allow GCC to recognize the 'resolver' attribute for IFUNCs but to not
> > > use them when building libatomic but when I tried that I got ifuncs
> > > in
> > > libatomic anyway.  I think that is a bug and I will look into it some
> > > more.
> > > 
> > > The recent alias checking improvements made to GCC and which caused
> > > some glibc build problems also caused the ifunc check in libatomic to
> > > fail.  That problem has been fixed but it involved a change to
> > > libatomic/ibatomic_i.h that conflicted with this patch so I am
> > > including an updated version that will apply cleanly to the top of
> > > tree.  Only libatomic_i.h changed.
> > > 
> > > Steve Ellcey
> > > sell...@cavium.com
> > > 
> > > 2017-10-03  Steve Ellcey  
> > > 
> > >   * Makefile.am (ARCH_AARCH64_LINUX): Add IFUNC_OPTIONS and
> > >   libatomic_la_LIBADD.
> > >   * config/linux/aarch64/host-config.h: New file.
> > >   * configure.ac (IFUNC_RESOLVER_ARGS): Define.
> > >   (ARCH_AARCH64_LINUX): New conditional for IFUNC builds.
> > >   * configure.tgt (aarch64): Set ARCH and try_ifunc.
> > >   (aarch64*-*-linux*) Update config_path.
> > >   (aarch64*-*-linux*) Set IFUNC_RESOLVER_ARGS.
> > >   * libatomic_i.h (GEN_SELECTOR): Add IFUNC_RESOLVER_ARGS
> > > argument.
> > >   * Makefile.in: Regenerate.
> > >   * auto-config.h.in: Regenerate.
> > >   * configure: Regenerate.


Re: [PATCH] Add _Float/_FloatX rounding built-ins & improve gimple optimization of _Float/_FloatX built-in functions

2017-11-20 Thread Michael Meissner
On Mon, Nov 20, 2017 at 11:32:08AM -0600, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Nov 17, 2017 at 07:35:05PM -0500, Michael Meissner wrote:
> > Here is the fixed patch.  It fixes the btrunc2 insn to use the correct
> > XSRPQI variant for truncf128.  I added the float128-hw11.c test as a runtime
> > test to make sure round, trunc, ceil, and floor return the correct values.  
> > The
> > machine independent portions are the same.
> > 
> > Assuming the machine independent versions are approved, can I check in the
> > PowerPC bits?
> 
> The rs6000 parts are okay, with a trivial fix:
> 
> > --- gcc/testsuite/gcc.target/powerpc/float128-hw11.c(revision 0)
> > +++ gcc/testsuite/gcc.target/powerpc/float128-hw11.c(revision 0)
> > @@ -0,0 +1,59 @@
> > +/* { dg-do run { target { powerpc*-*-* && lp64 } } } */
> > +/* { dg-require-effective-target powerpc_p9vector_ok } */
> > +/* { dg-options "-mpower9-vector -O2" } */
> > +/* { dg-options "-mvsx -O2" } */
> 
> This last line should not be there.

Ok.  I will also substitute p9vector_hw for powerpc_p9vector_ok, since the
former says we have power9 hardware to run tests, and the later says the
compiler supports compiling for a power9 target.  Sorry about that.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: [Patch][aarch64] Use IFUNCs to enable LSE instructions in libatomic on aarch64

2017-11-20 Thread Steve Ellcey
Re-ping with a CC to the Aarch64 maintainers.

Steve Ellcey
sell...@cavium.com

On Tue, 2017-10-24 at 11:11 -0700, Steve Ellcey wrote:
> Ping.
> 
> Steve Ellcey
> sell...@cavium.com
> 
> On Tue, 2017-10-03 at 11:57 -0700, Steve Ellcey wrote:
> > 
> > On Mon, 2017-10-02 at 15:38 +0100, Szabolcs Nagy wrote:
> > > 
> > > 
> > >  
> > > looks good to me, but i cannot approve.
> > > 
> > > (this will make libatomic depend on ifuncs on aarch64*-linux-gnu.)
> > If you build GCC with default options, ifunc is enabled on aarch64
> > and
> > used by libatomic but if you configure with '--disable-gnu-
> > indirect-
> > function' then GCC will not recognize the 'resolver' attribute and
> > libatomic will build the 'old' way and not use IFUNC's.
> > 
> > It seems like using '--disable-version-specific-runtime-libs' should
> > allow GCC to recognize the 'resolver' attribute for IFUNCs but to not
> > use them when building libatomic but when I tried that I got ifuncs
> > in
> > libatomic anyway.  I think that is a bug and I will look into it some
> > more.
> > 
> > The recent alias checking improvements made to GCC and which caused
> > some glibc build problems also caused the ifunc check in libatomic to
> > fail.  That problem has been fixed but it involved a change to
> > libatomic/ibatomic_i.h that conflicted with this patch so I am
> > including an updated version that will apply cleanly to the top of
> > tree.  Only libatomic_i.h changed.
> > 
> > Steve Ellcey
> > sell...@cavium.com
> > 
> > 2017-10-03  Steve Ellcey  
> > 
> > * Makefile.am (ARCH_AARCH64_LINUX): Add IFUNC_OPTIONS and
> > libatomic_la_LIBADD.
> > * config/linux/aarch64/host-config.h: New file.
> > * configure.ac (IFUNC_RESOLVER_ARGS): Define.
> > (ARCH_AARCH64_LINUX): New conditional for IFUNC builds.
> > * configure.tgt (aarch64): Set ARCH and try_ifunc.
> > (aarch64*-*-linux*) Update config_path.
> > (aarch64*-*-linux*) Set IFUNC_RESOLVER_ARGS.
> > * libatomic_i.h (GEN_SELECTOR): Add IFUNC_RESOLVER_ARGS
> > argument.
> > * Makefile.in: Regenerate.
> > * auto-config.h.in: Regenerate.
> > * configure: Regenerate.


Re: [RFC][PATCH] Change default to -fcommon

2017-11-20 Thread Michael Matz
Hi,

On Mon, 20 Nov 2017, Wilco Dijkstra wrote:

> > Note you have to make sure GFortran still works!  So I think the patch 
> > should be changed to make the default behavior be frontend dependent 
> > or have a fortran/ adjustment that fixes things up for the fortran 
> > dialects that need it.
> 
> Fortran doesn't use flag_no_common, so COMMON globals are not affected.
> 
> There is one use in Ada which looks like an optimization for specific targets:
> 
>   /* Ada doesn't feature Fortran-like COMMON variables so we shouldn't
>  try to fiddle with DECL_COMMON.  However, on platforms that don't
>  support global BSS sections, uninitialized global variables would
>  go in DATA instead, thus increasing the size of the executable.  */
>   if (!flag_no_common
>   && TREE_CODE (var_decl) == VAR_DECL
>   && TREE_PUBLIC (var_decl)
>   && !have_global_bss_p ())
> DECL_COMMON (var_decl) = 1;
> 
> I don't understand how this works - if there is no bss support in the 
> linker, wouldn't common variables would still end up in the data 
> section?

bss _sections_ != bss-like segments in the executable.  Targets might not 
have a bss section that could be named in the asm file, or no way to 
switch to it without disrupting surrounding code, but they might have 
common symbols, which ultimately might or might not be collected in some 
bss-like segment.  In that case you want to use them instead of symbols in 
.data.

What's your rationale for changing this?  In your initial mail you said:

"On many targets this means global variable accesses having an unnecessary 
codesize and performance penalty in C code (the same source generates 
better code when built as C++)."

I have a hard time imaging that, so can you give details?  FWIW I've 
personally always considered using common symbols nicer.


Ciao,
Michael.


Re: [PATCH] Add _Float/_FloatX rounding built-ins & improve gimple optimization of _Float/_FloatX built-in functions

2017-11-20 Thread Segher Boessenkool
Hi!

On Fri, Nov 17, 2017 at 07:35:05PM -0500, Michael Meissner wrote:
> Here is the fixed patch.  It fixes the btrunc2 insn to use the correct
> XSRPQI variant for truncf128.  I added the float128-hw11.c test as a runtime
> test to make sure round, trunc, ceil, and floor return the correct values.  
> The
> machine independent portions are the same.
> 
> Assuming the machine independent versions are approved, can I check in the
> PowerPC bits?

The rs6000 parts are okay, with a trivial fix:

> --- gcc/testsuite/gcc.target/powerpc/float128-hw11.c  (revision 0)
> +++ gcc/testsuite/gcc.target/powerpc/float128-hw11.c  (revision 0)
> @@ -0,0 +1,59 @@
> +/* { dg-do run { target { powerpc*-*-* && lp64 } } } */
> +/* { dg-require-effective-target powerpc_p9vector_ok } */
> +/* { dg-options "-mpower9-vector -O2" } */
> +/* { dg-options "-mvsx -O2" } */

This last line should not be there.

Thanks,


Segher


Re: [PATCH, rs6000] Testcase updates for power9 codegen

2017-11-20 Thread Segher Boessenkool
Hi!

On Fri, Nov 17, 2017 at 11:30:27AM -0600, Will Schmidt wrote:
> [testsuite]
>   * fold-vec-neg-int.c: Specify -mcpu=power8

Dot at end of sentence (quite a few times), please.

> -/* { dg-final { scan-assembler-times "vspltisw|vxor" 1 } } */
> +//# Power9 added xxspltib instruction.
> +/* { dg-final { scan-assembler-times "vspltisw|vxor|xxspltib" 1 } } */

That's not the usual comment style (and it won't work with all compiler
settings).  But does this comment bring any value?  Just leave it out?

> --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-int-fwrapv.c
> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-int-fwrapv.c
> @@ -1,11 +1,11 @@
>  /* Verify that overloaded built-ins for vec_abs with int
> inputs produce the right results.  */
>  
>  /* { dg-do compile } */
>  /* { dg-require-effective-target powerpc_altivec_ok } */
> -/* { dg-options "-maltivec -O2 -fwrapv" } */
> +/* { dg-options "-maltivec -O2 -fwrapv -mcpu=power8" } */

Is this testcase really testing something specific to power8 codegen?
Making all these testcases use -mcpu=power8 means they won't be tested
with any other settings.  (Also, does that work if the user puts another
-mcpu= in RUNTESTFLAGS).

I'm all for making the testresults cleaner, but let's not do that by
(effectively) disabling all failing tests ;-)

(Please split this up a bit when you repost it btw).


Segher


Re: Adjust empty class parameter passing ABI (PR c++/60336)

2017-11-20 Thread Richard Biener
On November 20, 2017 4:51:04 PM GMT+01:00, Marek Polacek  
wrote:
>On Thu, Nov 16, 2017 at 02:20:59PM -0500, Jason Merrill wrote:
>> On Thu, Nov 16, 2017 at 12:41 PM, Marek Polacek 
>wrote:
>> > On Tue, Nov 14, 2017 at 07:34:54AM +0100, Richard Biener wrote:
>> >> On November 14, 2017 6:21:41 AM GMT+01:00, Jason Merrill
> wrote:
>> >> >On Mon, Nov 13, 2017 at 1:02 PM, Marek Polacek
>
>> >> >> In the end I did two bootstraps with the patch, but modifed one
>of
>> >> >them
>> >> >> to always return false for ix86_is_empty_record.  Then I
>compared all
>> >> >the
>> >> >> *.o in both dirs.  The result is attached.  Then I looked at
>> >> >DW_AT_producer
>> >> >> for all these .o that differ; all of them are C++.  Is this
>enough to
>> >> >> clear our concerns?
>> >> >
>> >> >Hmm, a bunch of these are right at the beginning, bytes 41 and
>65, in
>> >> >the header.
>> >> >
>> >> >Did you build them in the different trunk/trunk2 directories?  I
>think
>> >> >Jakub was suggesting building them in the same directory.
>> >> >> And I also ran a bootstrap with --enable-cxx-flags=-Wabi=11,
>and
>> >> >didn't
>> >> >> see any warnings.
>> >> >
>> >> >If there's a codegen change, there ought to be a warning to go
>along
>> >> >with it.
>> >>
>> >> The question was of course also for unintended changes but yes (I
>was mainly concerned by libstdc++ ABI changes).
>> >
>> > Ok, I did two bootstraps in the same dir, one with
>ix86_is_empty_record always
>> > returning false.  There were a few object files that differ in
>their assembly
>> > between those two bootstraps.  Previously I didn't see any warnings
>because
>> > I hadn't thought of -Wsystem-headers.  Also, we intentionally don't
>warn if
>> > the empty parameter is the last one:
>> >
>> > +  bool seen_empty_type = false;
>> > +  FOREACH_FUNCTION_ARGS (fntype, argtype, iter)
>> > +   {
>> > + if (VOID_TYPE_P (argtype))
>> > +   break;
>> > + if (TYPE_EMPTY_P (argtype))
>> > +   seen_empty_type = true;
>> > + else if (seen_empty_type)
>> > +   {
>> > + cum->warn_empty = true;
>> > + break;
>> > +   }
>> > +   }
>> >
>> > After enabling -Wsystem-headers and tweaking the code above so that
>we warn
>> > even if the empty parameter is trailing I can see the warnings that
>correspond
>> > to the assembly changes.  Below is a summary of what I found. 
>TL;DR: I don't
>> > see any unintended changes.
>> 
>> Looks good to me.
>
>Thanks!
>
>Richi, are you ok with the patch now?

Yes. 

Richard. 

>Honza/Uros, are the config/i386/* changes ok?
>
>The last version of the patch is
>https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00969.html
>
>   Marek



Re: [RFA][PATCH] 8/n Pull evrp range analyzer into its own file

2017-11-20 Thread Richard Biener
On November 20, 2017 4:46:08 PM GMT+01:00, Jeff Law  wrote:
>On 11/20/2017 03:05 AM, Richard Biener wrote:
>> On Sat, Nov 18, 2017 at 9:00 AM, Jeff Law  wrote:
>>>
>>>
>>>
>>> This is just a straight cut-n-paste pulling the evrp_range_analyzer
>>> methods out of gimple-ssa-evrp.c into gimple-ssa-evrp-analyze.[ch].
>>>
>>> Bootstrapped and regression tested on x86_64.
>>>
>>>
>>> OK for the trunk?
>> 
>> Hum.   That makes things harder to follow - what's the advantage to
>> making two files of EVRP?It's just separation of the analysis engine
>from the clients.  ie,
>gimple-ssa-evrp.c is just a one client of the engine
>gimple-ssa-evrp-analyze.c
>
>Other clients (DOM, sprintf, array bounds checking, _b_o_s, whatever)
>wouldn't ever need to look at to look a gimple-ssa-evrp.c.
>
>Then again, I don't expect the clients to have to look deeply at
>gimple-ssa-evrp-analyze.c either.  THe idea is to make range
>information
>available in a dead simple way :-)
>
>It's probably the least important piece.  If you'd prefer to keep them
>together it's not a big deal -- and if at some later date we want to
>pull them apart again, it shouldn't be hard.
>
>Your call.

I don't have a strong opinion here, so just go ahead with what is less work for 
you. 

Richard. 

>
>jeff



Re: std::advance istreambuf_iterator overload

2017-11-20 Thread Jonathan Wakely

On 20/11/17 07:40 +0100, François Dumont wrote:

Here is the latest version I plan to commit after validation.


I like the simplified logic for the while-loop.

I'd like to keep the change to the money_get/get/[char/wchar_t]/9.cc 
tests. At the moment those tests passes only because there is no 
operation on the is iterator (like a check for eof). If we add any, 
internal _M_buf will be reseted and tests will start to fail.


Those changes seem useful then, thanks.


Tested under Linux x86_64 normal and debug modes.

Ok to commit ?


OK for trunk, thanks!




Re: [PATCH libstdc++/66689] comp_ellint_3 and ellint_3 return garbage values

2017-11-20 Thread Jonathan Wakely

On 18/11/17 11:49 -0500, Ed Smith-Rowland wrote:

On 11/17/2017 03:54 PM, Jonathan Wakely wrote:


Hmm, you're probably right. I'd be tempted to though.

I had an idea.  What about a macro _GLIBCXX_ELLINT_3_POS_NU or 
something that:


1. would allow users to detect which convention is on by default.

2. They could set or unset to get the other convention.

It's bloody but it would work.  it would prevent users from having to 
test the compiler version and guess or check the value every time.


True, but I'd prefer not to have to maintain that forever, it's just
another variation that needs to be tested.

I feel that distros are likely to pick up gcc-7 soon and I'd like to 
do *something*.  This would be something of a transition path.


Fedora 26 has been shipping with GCC 7 for some months, and Fedora 27
is just out, also using GCC 7.

Maybe we should just change it for GCC 8 as you suggested initially.



Re: [libstdc++] Expose Airy functions.

2017-11-20 Thread Jonathan Wakely

On 18/11/17 20:33 -0500, Ed Smith-Rowland wrote:

Here is the final patch fir libstdc++ Airy functions...


OK for trunk, thanks.



Re: [RFC][PATCH 12/n Embed range analysis in DOM

2017-11-20 Thread Jeff Law
On 11/20/2017 03:13 AM, Richard Biener wrote:
> On Sat, Nov 18, 2017 at 10:31 AM, Jeff Law  wrote:
>> And a WIP.  I can justify continuing work on this during stage3 for
>> pr78496.  But I thought it was worth giving folks a looksie.
>>
>> The goal here is to make tree-vrp's threading obsolete and remove it and
>> at the same time pick up all the missed jump threads in pr78496.
>>
>> Essentially this patch embeds the evrp analysis into DOM and adds some
>> simplification code to use the results of the evrp analysis within jump
>> threading.
>>
>> This bootstraps, but doesn't quite pass regression testing.  There's a
>> few tests that need adjustment.  There's also two failing tests which I
>> think are a manifestation of a latent bug in the EVRP bits I've been
>> worried about since I started looking at the code.
>>
>> It does find *all* the missing threads in pr78496.  I'm still evaluating
>> the impact of dropping tree-vrp.c's jump threading, but it looks promising.
>>
>> There's two patches I'm not posting at this time.  First is the range
>> analyzer embedded in the sprintf warning pass to avoid a false positive
>> reported to gcc-bugs a while back.  I'm pretty sure it tickles the same
>> latent bug I mentioned above with the range analyzer embedded in DOM.
>> It also needs minor fixes to deal with being called when the optimizer
>> is not on.  Given the false positive posted to gcc-bugs and the tiny
>> size of the patch I can justify wrapping that up early in stage3.
>>
>> The second patch I'm not posting rips jump threading out of tree-vrp.c
>> It's just too rough in its current state.  I'm sure there's a bug that
>> says GCC has gotten slower than I can use to justify pushing on this
>> early in stage3 as well.
>>
>> I'm calling it a night from my virtual office in Hawaii.
> 
> So DOM now does EVRP in parallel but only uses the result for
> threading, not for other simplification.  That somehow feels like
> a waste of cycles?  Isn't this the opportunity to "merge" DOM and EVRP
> to make one (configurable) DOM-walker optimization pass?
Yes.  I hadn't really figured out how I want to wire in the
simplifications to the main part of DOM.  But yes, that's in the plan
after removal of tree-vrp's jump threading.  There's all kinds of things
that I think simplify or get stronger in DOM with the availability of
quality context specific range data.  I don't really consider that
stage3 material though.

Though I did ultimately end up wiring in one simplification after
posting that patch to address a missed optimization at -O1 which
eventually led to a false positive warning.

--


One of the things I'm still concerned about with the EVRP code is that
it sets the global range information attached to SSA_NAMEs.  It is
careful to only set it at an object's DEF site, so it *may* be OK,
though I'm still not 100% convinced.  Anyway...

In the case I'm looking at on the trunk the range information set during
early VRP is coarse enough that the range for the argument passed to new
is [0,0x].

In my local tree where we run the EVRP analysis engine during DOM (or
the sprintf warnings) we end up with a much tighter range simply because
we're running it after other optimizations have cleaned things up.  THe
range we get is: [0xfffe,0xfff]

The maximum size for a call to new is 0x7fff.

In the warning code which checks for overflow in calls to allocators we
compare the low bound of the object to the maximum size allowed.

So on the trunk we compare 0 to 0x7fff.  Since the low bound
is smaller than the max size we don't warn.

In  my code where we have a nice tight range for the argument to new
we're comparing 0xfffe to 0x7fff.  Of course
since the argument to new has low bound larger than the maximum allowed
we get a warning.

As it turns out the code is unexecutable.  So even though I'm not
entirely sure I know how I want to wire in exploiting the evrp analysis
in DOM in general, I did add a chunk of code to allow DOM to use the
range info to prove the result of a conditional was a compile-time constant.

There's a similar situation that arises in the libgomp Fortran testsuite
where a block of unreachable code passes a huge size to memset that we
get a diagnostic for.  The same code fixed this problem as well.


> I applaud the first and foremost goal of ripping threading out of VRP
That's certainly where I'm going.

> (and to be honest I'd rather get rid of the SSA propagator VRP
> implementation completely at some point...).
I think that's probably a gcc-9 goal.  I doubt the additional precision
we get from propagating through the lattice is all that important in
practice and that it probably can go away.  The evrp style analysis gets
most of the precision and should be notably cheaper.

Jeff



RFC: SAFE_MACRO_STMT

2017-11-20 Thread Tom de Vries

Hi,

when writing macros that are intended to be used as if they were single 
statements, there are a couple of common problems:


1. Ending in semicolon
...
#define foo stmt;
...

This works fine if we use it like:
...
void
bar (void)
{
  foo;
}
...

but we get an "else without a previous if" error when we use it like:
...
void
bar (void)
{
  if (1)
foo;
  else
{}
}
...

2. Using if-then-else without wrapping in "do {} while (0)"
...
#define foo if (1) stmt; else {}
...

Again this works fine if we use it like:
...
void
bar (void)
{
  foo;
}
...

but we get an "suggest explicit braces to avoid ambiguous else 
[-Wparentheses]" warning when we use it like:

...
void
bar (void)
{
  if (1)
foo;
}
...

3. Multi statement without wrapping it in "do {} while (0)"
...
#define foo stmt; stmt
...

Again this works fine if we use it like:
...
void
bar (void)
{
  foo;
}
...

but we get a "macro expands to multiple statements 
[-Wmultistatement-macros]" warning when we use it like:

...
void
bar (void)
{
  if (1)
foo;
}
...


I've written a macro SAFE_MACRO_STMT that triggers these three 
situations for a macro, but does not change the semantics:

...
#define SAFE_MACRO_STMT(stmt)   \
  do {  \
stmt;   \
if (0)  \
  stmt; \
else\
  {}\
if (0)  \
  stmt; \
  } while (0)
...

In other words, by wrapping the macro usage in SAFE_MACRO_STMT, we 
trigger an error or warning if the macro body contains any of the problems:

...
void
bar (void)
{
  SAFE_MACRO_STMT (foo);
}
...

The first demonstrator patch applies this strategy to 
ASM_OUTPUT_ADDR_VEC_ELT. This smokes out a type 1 error in the ft32 
definition of ASM_OUTPUT_ADDR_VEC_ELT, which is fixed by the second patch.


Any comments on this approach?

Thanks,
- Tom
Use SAFE_MACRO_STMT for ASM_OUTPUT_ADDR_VEC_ELT

---
 gcc/final.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/gcc/final.c b/gcc/final.c
index 672c5bb..17c1c12 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -2169,6 +2169,17 @@ asm_show_source (const char *filename, int linenum)
   fputc ('\n', asm_out_file);
 }
 
+#define SAFE_MACRO_STMT(stmt)   \
+  do {  \
+stmt;	\
+if (0)  \
+  stmt; \
+else\
+  {}\
+if (0)  \
+  stmt; \
+  } while (0)
+
 /* The final scan for one insn, INSN.
Args are same as in `final', except that INSN
is the insn being scanned.
@@ -2562,8 +2573,10 @@ final_scan_insn (rtx_insn *insn, FILE *file, int optimize_p ATTRIBUTE_UNUSED,
 		if (GET_CODE (body) == ADDR_VEC)
 		  {
 #ifdef ASM_OUTPUT_ADDR_VEC_ELT
-		ASM_OUTPUT_ADDR_VEC_ELT
-		  (file, CODE_LABEL_NUMBER (XEXP (XVECEXP (body, 0, idx), 0)));
+		SAFE_MACRO_STMT
+		  (ASM_OUTPUT_ADDR_VEC_ELT
+		   (file,
+			CODE_LABEL_NUMBER (XEXP (XVECEXP (body, 0, idx), 0;
 #else
 		gcc_unreachable ();
 #endif
[ft32] Remove semicolon after ASM_OUTPUT_ADDR_VEC_ELT

---
 gcc/config/ft32/ft32.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/ft32/ft32.h b/gcc/config/ft32/ft32.h
index b276f25..c32db40 100644
--- a/gcc/config/ft32/ft32.h
+++ b/gcc/config/ft32/ft32.h
@@ -205,7 +205,7 @@ enum reg_class
 /* This is how to output an element of a case-vector that is absolute.  */
 
 #define ASM_OUTPUT_ADDR_VEC_ELT(FILE, VALUE)  \
-fprintf (FILE, "\t.long\t.L%d\n", VALUE);\
+fprintf (FILE, "\t.long\t.L%d\n", VALUE)\
 
 /* Passing Arguments in Registers */
 


Re: [PATCH libstdc++/66689] comp_ellint_3 and ellint_3 return garbage values

2017-11-20 Thread David Edelsohn
This patch has introduced new regressions on at least PowerPC and AArch64.

FAIL: ext/special_functions/hyperg/check_value.cc execution test
FAIL: tr1/5_numerical_facilities/special_functions/17_hyperg/check_value.cc
execution test

Thanks, David


Re: Adjust empty class parameter passing ABI (PR c++/60336)

2017-11-20 Thread Marek Polacek
On Thu, Nov 16, 2017 at 02:20:59PM -0500, Jason Merrill wrote:
> On Thu, Nov 16, 2017 at 12:41 PM, Marek Polacek  wrote:
> > On Tue, Nov 14, 2017 at 07:34:54AM +0100, Richard Biener wrote:
> >> On November 14, 2017 6:21:41 AM GMT+01:00, Jason Merrill 
> >>  wrote:
> >> >On Mon, Nov 13, 2017 at 1:02 PM, Marek Polacek 
> >> >> In the end I did two bootstraps with the patch, but modifed one of
> >> >them
> >> >> to always return false for ix86_is_empty_record.  Then I compared all
> >> >the
> >> >> *.o in both dirs.  The result is attached.  Then I looked at
> >> >DW_AT_producer
> >> >> for all these .o that differ; all of them are C++.  Is this enough to
> >> >> clear our concerns?
> >> >
> >> >Hmm, a bunch of these are right at the beginning, bytes 41 and 65, in
> >> >the header.
> >> >
> >> >Did you build them in the different trunk/trunk2 directories?  I think
> >> >Jakub was suggesting building them in the same directory.
> >> >> And I also ran a bootstrap with --enable-cxx-flags=-Wabi=11, and
> >> >didn't
> >> >> see any warnings.
> >> >
> >> >If there's a codegen change, there ought to be a warning to go along
> >> >with it.
> >>
> >> The question was of course also for unintended changes but yes (I was 
> >> mainly concerned by libstdc++ ABI changes).
> >
> > Ok, I did two bootstraps in the same dir, one with ix86_is_empty_record 
> > always
> > returning false.  There were a few object files that differ in their 
> > assembly
> > between those two bootstraps.  Previously I didn't see any warnings because
> > I hadn't thought of -Wsystem-headers.  Also, we intentionally don't warn if
> > the empty parameter is the last one:
> >
> > +  bool seen_empty_type = false;
> > +  FOREACH_FUNCTION_ARGS (fntype, argtype, iter)
> > +   {
> > + if (VOID_TYPE_P (argtype))
> > +   break;
> > + if (TYPE_EMPTY_P (argtype))
> > +   seen_empty_type = true;
> > + else if (seen_empty_type)
> > +   {
> > + cum->warn_empty = true;
> > + break;
> > +   }
> > +   }
> >
> > After enabling -Wsystem-headers and tweaking the code above so that we warn
> > even if the empty parameter is trailing I can see the warnings that 
> > correspond
> > to the assembly changes.  Below is a summary of what I found.  TL;DR: I 
> > don't
> > see any unintended changes.
> 
> Looks good to me.

Thanks!

Richi, are you ok with the patch now?
Honza/Uros, are the config/i386/* changes ok?

The last version of the patch is
https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00969.html

Marek


Re: [RFA][PATCH] 8/n Pull evrp range analyzer into its own file

2017-11-20 Thread Jeff Law
On 11/20/2017 03:05 AM, Richard Biener wrote:
> On Sat, Nov 18, 2017 at 9:00 AM, Jeff Law  wrote:
>>
>>
>>
>> This is just a straight cut-n-paste pulling the evrp_range_analyzer
>> methods out of gimple-ssa-evrp.c into gimple-ssa-evrp-analyze.[ch].
>>
>> Bootstrapped and regression tested on x86_64.
>>
>>
>> OK for the trunk?
> 
> Hum.   That makes things harder to follow - what's the advantage to
> making two files of EVRP?It's just separation of the analysis engine from the 
> clients.  ie,
gimple-ssa-evrp.c is just a one client of the engine
gimple-ssa-evrp-analyze.c

Other clients (DOM, sprintf, array bounds checking, _b_o_s, whatever)
wouldn't ever need to look at to look a gimple-ssa-evrp.c.

Then again, I don't expect the clients to have to look deeply at
gimple-ssa-evrp-analyze.c either.  THe idea is to make range information
available in a dead simple way :-)

It's probably the least important piece.  If you'd prefer to keep them
together it's not a big deal -- and if at some later date we want to
pull them apart again, it shouldn't be hard.

Your call.

jeff



Re: [RFC][PATCH] Change default to -fcommon

2017-11-20 Thread Wilco Dijkstra
Richard Biener wrote:

> A target specific default might be a good idea if we decide to revert.
> 
> Note I proposed this change a few times already, but the fear was always
> we'll break too much legacy code.

It will definitely break some code, but new warnings with -Werror might too...

> Note you have to make sure GFortran still works!  So I think the patch should
> be changed to make the default behavior be frontend dependent or have a
> fortran/ adjustment that fixes things up for the fortran dialects that need 
> it.

Fortran doesn't use flag_no_common, so COMMON globals are not affected.

There is one use in Ada which looks like an optimization for specific targets:

  /* Ada doesn't feature Fortran-like COMMON variables so we shouldn't
 try to fiddle with DECL_COMMON.  However, on platforms that don't
 support global BSS sections, uninitialized global variables would
 go in DATA instead, thus increasing the size of the executable.  */
  if (!flag_no_common
  && TREE_CODE (var_decl) == VAR_DECL
  && TREE_PUBLIC (var_decl)
  && !have_global_bss_p ())
DECL_COMMON (var_decl) = 1;

I don't understand how this works - if there is no bss support in the linker,
wouldn't common variables would still end up in the data section?

Wilco

RE: [RFC, vectorizer] Allow half_type for left shift in vect_operation_fits_smaller_type?

2017-11-20 Thread Jon Beniston
Hi Richard,

>Not digged long into this "interesting" function but this case is only
valid if type == 
>final type and if the result is not shifted back.
vect_recog_over_widening_pattern 
>works on a whole sequence of stmts after all, thus
>
>  b = (T_PROMOTED) a;
>  c = b << 2;
>  d = b >> 2;
>  e = (T_ORIG) b;
>
>would be miscompiled by your new case.

Here is the followup patch.
  
It supports half type left shift operation in
vect_recog_over_widening_patter
function.  As you suggested, the patch keeps half type lshift flag and gives
up on right shift operation or different new_type/use_type cases.
  
Two test cases are added, one should be recognized as pattern and the other
shouldn't.

Bootstrap OK and no gcc/g++ regression on x86_64/AArch64.
  
Does this look OK?

2017-11-20  Jon Beniston   gcc/
* tree-vect-patterns.c (vect_operation_fits_smaller_type): New
parameter.  Support half type lshift.
(vect_recog_over_widening_patter): Support half type lshift.

gcc/testsuite

* gcc.dg/vect/vect-over-widen-5.c: New test.
* gcc.dg/vect/vect-over-widen-6.c: New test.


diff --git a/gcc/testsuite/gcc.dg/vect/vect-over-widen-5.c
b/gcc/testsuite/gcc.dg/vect/vect-over-widen-5.c
new file mode 100644
index 000..a3e5a44
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-over-widen-5.c
@@ -0,0 +1,46 @@
+/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_shift } */
+
+#include 
+#include "tree-vect.h"
+
+#define N 256
+
+short a[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
+short b[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
+
+__attribute__ ((noinline))
+int foo (void)
+{
+  int i;
+
+  for (i=0; i> 2);
+  a[i] = x;
+}
+}
+
+int main (void)
+{
+  int i;
+
+  check_vect ();
+
+  for (i=0; i> 2))
+  abort ();
+  }
+  
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-not "vect_recog_over_widening_pattern:
detected" "vect" } } */
+/* { dg-final { scan-tree-dump-not "pattern recognized" "vect" } } */
diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index 1cd6e57..daadcfb 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -1240,12 +1240,15 @@ vect_recog_widen_sum_pattern (vec *stmts,
tree *type_in,
  statements for STMT: the first one is a type promotion and the
second
  one is the operation itself.  We return the type promotion
statement
 in NEW_DEF_STMT and further store it in STMT_VINFO_PATTERN_DEF_SEQ
of
- the second pattern statement.  */
+ the second pattern statement.
+   HALF_TYPE_LSHIFT_P - Set to TRUE if STMT is left shift operation can be
+done in smaller type that has half precision of promoted type.  */
 
 static bool
 vect_operation_fits_smaller_type (gimple *stmt, tree def, tree *new_type,
  tree *op0, tree *op1, gimple
**new_def_stmt,
- vec *stmts)
+ vec *stmts,
+ bool *half_type_lshift_p)
 {
   enum tree_code code;
   tree const_oprnd, oprnd;
@@ -1296,6 +1299,7 @@ vect_operation_fits_smaller_type (gimple *stmt, tree
def, tree *new_type,
 return false;
 }
 
+  unsigned HOST_WIDE_INT half_prec = TYPE_PRECISION (half_type);
   /* Can we perform the operation on a smaller type?  */
   switch (code)
 {
@@ -1305,11 +1309,11 @@ vect_operation_fits_smaller_type (gimple *stmt, tree
def, tree *new_type,
 if (!int_fits_type_p (const_oprnd, half_type))
   {
 /* HALF_TYPE is not enough.  Try a bigger 

Re: [PATCH 1/7]: SVE: Add CLOBBER_HIGH expression

2017-11-20 Thread Alan Hayward

> On 17 Nov 2017, at 19:31, Jeff Law  wrote:
> 
> On 11/16/2017 11:50 AM, Alan Hayward wrote:
>> 
>>> On 16 Nov 2017, at 18:24, Richard Biener  wrote:
>>> 
>>> On November 16, 2017 7:05:30 PM GMT+01:00, Jeff Law  wrote:
 On 11/16/2017 05:34 AM, Alan Hayward wrote:
> This is a set of patches aimed at supporting aarch64 SVE register
> preservation around TLS calls.
> 
> Across a TLS call, Aarch64 SVE does not explicitly preserve the
> SVE vector registers. However, the Neon vector registers are
 preserved.
> Due to overlapping of registers, this means the lower 128bits of all
> SVE vector registers will be preserved.
> 
> The existing GCC code will currently incorrectly assume preservation
> of all of the SVE registers.
> 
> This patch introduces a CLOBBER_HIGH expression. This behaves a bit
 like
> a CLOBBER expression. CLOBBER_HIGH can only refer to a single
 register.
> The mode of the expression indicates the size of the lower bits which
> will be preserved. If the register contains a value bigger than this
> mode then the code will treat the register as clobbered.
> 
> The means in order to evaluate if a clobber high is relevant, we need
 to ensure
> the mode of the existing value in a register is tracked.
> 
> The following patches in this series add support for the
 CLOBBER_HIGH,
> with the final patch adding CLOBBER_HIGHs around TLS_DESC calls for
> aarch64. The testing performed on these patches is also detailed in
 the
> final patch.
> 
> These patches are based on top of the linaro-dev/sve branch.
> 
> A simpler alternative to this patch would be to assume all Neon and
 SVE
> registers are clobbered across TLS calls, however this would be a
> performance regression against all Aarch64 targets.
 So just a couple design questions.
 
 Presumably there's no reasonable way to set up GCC's view of the
 register file to avoid this problem?  ISTM that if the SVE register was
 split into two, one for the part that overlapped with the neon register
 and one that did not, then this could be handled via standard
 mechanisms?
 
>> 
>> Yes, that was an early alternative option for the patch.
>> 
>> With that it would effect every operation that uses SVE registers. A simple
>> add of two registers now has 4 inputs and two outputs. It would get in the
>> way when debugging any sve dumps and be generally annoying.
>> Possible that the code for that in would all be in the aarch64 target,
>> (making everyone else happy!) But I suspect that there would be still be
>> strange dependency issues that’d need sorting in the common code.
>> 
>> Whereas with this patch, there are no new oddities in non-tls compiles/dumps.
>> Although the patch touches a lot of files, the changes are mostly restricted
>> to places where standard clobbers were already being checked.
> I'm not entirely sure that it would require doubling the number of
> inputs/outputs.  It's not conceptually much different than how we
> describe DImode operations on 32bit targets.  The mode selects one or
> more consecutive registers, so you don't actually need anything weird in
> your patterns.  This is pretty standard stuff.

Ok, fair enough.

> 
> 
> It would be painful in that the Neon regs would have to interleave with
> the upper part of the SVE regs in terms of register numbers.  It would
> also mean that you couldn't tie together multiple neon regs into
> something wider.  I'm not sure if the latter would be an issue or not.

And there’s also the weirdness that the register would not be split evenly - 
it’ll be a TI
reg followed by a reg of the size of multiple TIs.

All of that has the potential to complicate all non-sve aarch64 code.

> 
> You might also look at TARGET_HARD_REGNO_CALL_PART_CLOBBERED.  I'd
> totally forgotten about it.  And in fact it seems to come pretty close
> to what you need…

Yes, some of the code is similar to the way
TARGET_HARD_REGNO_CALL_PART_CLOBBERED works. Both that code and the
CLOBBER expr code served as a starting point for writing the patch. The main 
difference
here, is that _PART_CLOBBERED is around all calls and is not tied to a specific 
Instruction,
it’s part of the calling abi. Whereas clobber_high is explicitly tied to an 
expression (tls_desc).
It meant there wasn’t really any opportunity to resume any existing code.


Alan.




Re: VRP: x+1 and -x cannot be INT_MIN

2017-11-20 Thread Richard Biener
On Mon, Nov 20, 2017 at 3:10 PM, Marc Glisse  wrote:
> On Mon, 20 Nov 2017, Richard Biener wrote:
>
>>> I had to adapt one testcase where for VR_VARYING | [1, 1] we used to
>>> produce
>>> ~[0, 0] and now produce [-INT_MAX, INT_MAX]. I am surprised at how late
>>> the
>>> transformation now happens (only after removing __builtin_unreachable, in
>>> forwprop3, while trunk currently has it in evrp IIRC), but I didn't
>>> investigate, doesn't seem like the right time with all the VRP changes
>>> going
>>> on.
>>
>>
>> Interesting - can you open a bugreport so we don't forget?  I suspect it's
>> the effect of zero_nonzero_bits_from_vr handling VARYING and [INT_MIN,
>> INT_MAX]
>> differently rippling down.  At some point in time I wanted to get rid of
>> VARYING
>> in favor of [INT_MIN, INT_MAX] ...
>
>
> I filed PR 83072 about missing the optimization in evrp, but now I am not
> sure if that's what you wanted in the PR or if you wanted one about chosing
> between ~[0, 0] and [-INT_MAX, INT_MAX] for VR_VARYING | [1, 1]...

Yes, the one choosing between these two.  The EVRP missed optimization
is of course also useful to track.

Richard.

> --
> Marc Glisse


Re: Fix gcc-7 and gcc-8 build of GO for Hurd and not kFreeBSD in debian/rules.defs

2017-11-20 Thread Aurelien Jarno
On 2017-11-20 14:31, Svante Signell wrote:
> On Mon, 2017-11-20 at 14:04 +0100, Aurelien Jarno wrote:
> > On 2017-11-20 13:20, Svante Signell wrote:
> > > On Thu, 2017-11-16 at 20:39 +0100, Svante Signell wrote:
> > > 
> > > Seeing the recent update of debian/rules.defs in gcc-7 a mistake was
> > > revealed in the patch debian_rules.defs for both gcc-7 and gcc-8. The
> > > correct patch is inlined below:
> > > 
> > > --- a/debian/rules.defs   2017-11-20 12:59:25.0 +0100
> > > +++ b/debian/rules.defs   2017-11-20 13:01:54.0 +0100
> > > @@ -807,7 +807,7 @@
> > >  ifeq (,$(filter $(distrelease),lenny etch squeeze dapper hardy jaunty
> > > karmic
> > > lucid maverick natty oneiric))
> > >    go_no_cpus := $(filter-out arm, $(go_no_cpus))
> > >  endif
> > > -go_no_systems := gnu kfreebsd-gnu
> > > +go_no_systems := kfreebsd
> > >  
> > >  ifneq ($(with_base_only),yes)
> > >    ifneq ($(separate_lang),yes)
> > > @@ -817,7 +817,7 @@
> > >  ifneq (,$(filter $(DEB_TARGET_ARCH_CPU),$(go_no_cpus)))
> > >    with_go := disabled for cpu $(DEB_TARGET_ARCH_CPU)
> > >  endif
> > > -ifneq (,$(findstring $(DEB_TARGET_GNU_SYSTEM),$(go_no_systems)))
> > > +ifneq (,$(findstring $(DEB_TARGET_ARCH_OS),$(go_no_systems)))
> > >    with_go := disabled for system $(DEB_TARGET_GNU_SYSTEM)
> > >  endif
> > >  ifeq ($(go_no_cross)-$(DEB_CROSS),yes-yes)
> > > 
> > > Changing back to go_no_systems := kfreebsd-gnu as aurel32 did disables the
> > > build of go also for GNU/Hurd. The culprit is to match DEB_TARGET_ARCH_OS 
> > > to
> > > kfreebsd to disable the build of go for GNU/kFreeBSD, not to match
> > > DEB_TARGET_GNU_SYSTEM to kfreebsd-gnu, since that will match gnu too :(
> > 
> > I did that change because changing go_no_systems from kfreebsd-gnu to
> > kfreebsd is what caused GCC 7 to FTBFS on at least kfreebsd-i386 since
> > version 7.2.0-15 (see bug#881656).
> > 
> > Your change also looks fine to me. Should I commit it?
> 
> Yes please. And do this change also for gcc-8, otherwise it will FTBFS on
> kFreeBSD too.

Done, thanks!

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net


Re: [PATCH GCC]A simple implementation of loop interchange

2017-11-20 Thread Richard Biener
On Thu, Nov 16, 2017 at 4:18 PM, Bin.Cheng  wrote:
> On Tue, Oct 24, 2017 at 3:30 PM, Michael Matz  wrote:
>> Hello,
>>
>> On Fri, 22 Sep 2017, Bin.Cheng wrote:
>>
>>> This is updated patch for loop interchange with review suggestions
>>> resolved.  Changes are:
>>>   1) It does more light weight checks like rectangle loop nest check
>>> earlier than before.
>>>   2) It checks profitability of interchange before data dependence 
>>> computation.
>>>   3) It calls find_data_references_in_loop only once for a loop nest now.
>>>   4) Data dependence is open-computed so that we can skip instantly at
>>> unknown dependence.
>>>   5) It improves code generation in mapping induction variables for
>>> loop nest, as well as
>>>  adding a simple dead code elimination pass.
>>>   6) It changes magic constants into parameters.
>>
>> So I have a couple comments/questions.  Something stylistic:
> Hi Michael,
> Thanks for reviewing.
>
>>
>>> +class loop_cand
>>> +{
>>> +public:
>>> ...
>>> +  friend class tree_loop_interchange;
>>> +private:
>>
>> Just make this all public (and hence a struct, not class).
>> No need for friends in file local classes.
> Done.
>
>>
>>> +single_use_in_loop (tree var, struct loop *loop)
>>> ...
>>> +  FOR_EACH_IMM_USE_FAST (use_p, iterator, var)
>>> +{
>>> +  stmt = USE_STMT (use_p);
>>> ...
>>> +  basic_block bb = gimple_bb (stmt);
>>> +  gcc_assert (bb != NULL);
>>
>> This pattern reoccurs often in your patch: you check for a bb associated
>> for a USE_STMT.  Uses of SSA names always occur in basic blocks, no need
>> for checking.
> Done.
>
>>
>> Then, something about your handling of simple reductions:
>>
>>> +void
>>> +loop_cand::classify_simple_reduction (reduction_p re)
>>> +{
>>> ...
>>> +  /* Require memory references in producer and consumer are the same so
>>> + that we can undo reduction during interchange.  */
>>> +  if (re->init_ref && !operand_equal_p (re->init_ref, re->fini_ref, 0))
>>> +return;
>>
>> Where is it checked that the undoing transformation is legal also
>> from a data dep point of view?  Think code like this:
>>
>>sum = X[i];
>>for (j ...)
>>  sum += X[j];
>>X[i] = sum;
>>
>> Moving the store into the inner loop isn't always correct and I don't seem
>> to find where the above situation is rejected.
> Yeah.  for the old patch, it's possible to have such loop wrongly 
> interchanged;
> in practice, it's hard to create an example.  The pass will give up
> when computing
> data dep between references in inner/outer loops.  In this updated
> patch, it's fixed
> by giving up if there is any dependence between references of inner/outer 
> loops.
>
>>
>> Maybe I'm confused because I also don't see where you even can get into
>> the above situation (though I do see testcases about this).  The thing is,
>> for an 2d loop nest to contain something like the above reduction it can't
>> be perfect:
>>
>>for (j) {
>>  int sum = X[j];  // 1
>>  for (i)
>>sum += Y[j][i];
>>  X[j] = sum;  // 2
>>}
>>
>> But you do check for perfectness in proper_loop_form_for_interchange and
>> prepare_perfect_loop_nest, so either you can't get into the situation or
>> the checking can't be complete, or you define the above to be perfect
>> nevertheless (probably because the load and store are in outer loop
>> header/exit blocks?).  The latter would mean that you accept also other
>> code in header/footer of loops from a pure CFG perspective, so where is it
>> checked that that other code (which aren't simple reductions) isn't
>> harmful to the transformation?
> Yes, I used the name perfect loop nest, but the pass can handle special form
> imperfect loop nest for the simple reduction.  I added comments describing
> this before function prepare_perfect_loop_nest.
>
>>
>> Then, the data dependence part of the new pass:
>>
>>> +bool
>>> +tree_loop_interchange::valid_data_dependences (unsigned inner, unsigned 
>>> outer)
>>> +{
>>> +  struct data_dependence_relation *ddr;
>>> +
>>> +  for (unsigned i = 0; ddrs.iterate (i, ); ++i)
>>> +{
>>> +  /* Skip no-dependence case.  */
>>> +  if (DDR_ARE_DEPENDENT (ddr) == chrec_known)
>>> + continue;
>>> +
>>> +  for (unsigned j = 0; j < DDR_NUM_DIR_VECTS (ddr); ++j)
>>> + {
>>> +   lambda_vector dist_vect = DDR_DIST_VECT (ddr, j);
>>> +   unsigned level = dependence_level (dist_vect, loop_nest.length ());
>>> +
>>> +   /* If there is no carried dependence.  */
>>> +   if (level == 0)
>>> + continue;
>>> +
>>> +   level --;
>>> +   /* Skip case which has '>' as the leftmost direction.  */
>>> +   if (!lambda_vector_lexico_pos (dist_vect, level))
>>> + return false;
>>
>> Shouldn't happen as dist vectors are forced positive via DDR_REVERSED.
> Done.
>
>>
>>> +   /* If dependence is carried by outer loop of the two loops for
>>> +  interchange.  */
>>> +   if 

Re: [patch] Add support for #pragma GCC unroll

2017-11-20 Thread Steve Kargl
On Mon, Nov 20, 2017 at 12:57:47PM +0100, Bernhard Reutner-Fischer wrote:
> On 20 November 2017 at 12:26, Eric Botcazou  wrote:
> >> If anybody finds the time to push the corresponding Fortran changes then 
> >> I'd
> >> be grateful. I won't have time for this until end of stage 1...
> >>
> >> https://gcc.gnu.org/ml/fortran/2015-02/msg00014.html
> >
> > OK, I'm going to merge it in the main patch.
> 
> [CCing fortran@]
> Thanks alot in advance!

The URL points to a nearly 3 year old patch.  I noticed
that there is no documentation of the new Fortran directive.
Section 7.2 of gfortran.info should be updated.  In particular,
does '!GCC$ UNROLL 4' affect only the immediately following
DO-LOOP or all DO-LOOPs that follow the directive until another
'GCC$ UNROLL ...' is found?  How does this new directive interface
with OpenMP and OpenACC?

-- 
Steve


Re: [PATCH][aarch64] Fix pr81356 - copy empty string with wrz, not a ldrb/strb

2017-11-20 Thread Christophe Lyon
On 17 November 2017 at 22:41, James Greenhalgh  wrote:
> On Fri, Sep 15, 2017 at 07:22:39PM +0100, Steve Ellcey wrote:
>> PR 81356 points out that doing a __builtin_strcpy of an empty string on
>> aarch64 does a copy from memory instead of just writing out a zero byte.
>> In looking at this I found that it was because of
>> aarch64_use_by_pieces_infrastructure_p, which returns false for
>> STORE_BY_PIECES.  The comment says:
>>
>>   /* STORE_BY_PIECES can be used when copying a constant string, but
>>  in that case each 64-bit chunk takes 5 insns instead of 2 (LDR/STR).
>>  For now we always fail this and let the move_by_pieces code copy
>>  the string from read-only memory.  */
>>
>> But this doesn't seem to be the case anymore.  When I remove this function
>> and the TARGET_USE_BY_PIECES_INFRASTRUCTURE_P macro that uses it the code
>> for __builtin_strcpy of a constant string seems to be either better or the
>> same.  The only time I got more instructions after removing this function
>> was on an 8 byte __builtin_strcpy where we now generate a mov and 3 movk
>> instructions to create the source followed by a store instead of doing a
>> load/store of 8 bytes.  The comment may have been applicable for
>> -mstrict-align at one time but it doesn't seem to be the case now.  I still
>> get better code without this routine under that option as well.
>
> I've been trying to remember why this is the way it is, It looks like it dates
> back to the initial port, and the only note I have on it points at an
> incorrect PR number. So, I think this is probably a safe and sensible
> choice.
>
> OK.
>

As a result of this patch, we now have:
XPASS: gfortran.dg/pr45636.f90   -O   scan-tree-dump-times forwprop2 "memset" 0
instead of:
XFAIL: gfortran.dg/pr45636.f90   -O   scan-tree-dump-times forwprop2
"memset" 0 (found 2 times)

Is it expected?

Christophe


> Reviewed-by: James Greenhalgh 
>
> Thanks,
> James
>
>>
>> Bootstrapped and tested without regressions, OK to checkin?
>>
>> Steve Ellcey
>> sell...@cavium.com
>>
>>
>>
>> 2017-09-15  Steve Ellcey  
>>
>>   PR target/81356
>>   * config/aarch64/aarch64.c (aarch64_use_by_pieces_infrastructure_p):
>>   Remove.
>>   (TARGET_USE_BY_PIECES_INFRASTRUCTURE_P): Remove define.
>>
>>
>> 2017-09-15  Steve Ellcey  
>>
>>   * gcc.target/aarch64/pr81356.c: New test.
>
>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> index 1c14008..fc72236 100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
>> @@ -14118,22 +14118,6 @@ aarch64_asan_shadow_offset (void)
>>return (HOST_WIDE_INT_1 << 36);
>>  }
>>
>> -static bool
>> -aarch64_use_by_pieces_infrastructure_p (unsigned HOST_WIDE_INT size,
>> - unsigned int align,
>> - enum by_pieces_operation op,
>> - bool speed_p)
>> -{
>> -  /* STORE_BY_PIECES can be used when copying a constant string, but
>> - in that case each 64-bit chunk takes 5 insns instead of 2 (LDR/STR).
>> - For now we always fail this and let the move_by_pieces code copy
>> - the string from read-only memory.  */
>> -  if (op == STORE_BY_PIECES)
>> -return false;
>> -
>> -  return default_use_by_pieces_infrastructure_p (size, align, op, speed_p);
>> -}
>> -
>>  static rtx
>>  aarch64_gen_ccmp_first (rtx_insn **prep_seq, rtx_insn **gen_seq,
>>   int code, tree treeop0, tree treeop1)
>> @@ -15631,10 +15615,6 @@ aarch64_libgcc_floating_mode_supported_p
>>  #undef TARGET_LEGITIMIZE_ADDRESS
>>  #define TARGET_LEGITIMIZE_ADDRESS aarch64_legitimize_address
>>
>> -#undef TARGET_USE_BY_PIECES_INFRASTRUCTURE_P
>> -#define TARGET_USE_BY_PIECES_INFRASTRUCTURE_P \
>> -  aarch64_use_by_pieces_infrastructure_p
>> -
>>  #undef TARGET_SCHED_CAN_SPECULATE_INSN
>>  #define TARGET_SCHED_CAN_SPECULATE_INSN aarch64_sched_can_speculate_insn
>>
>
>> diff --git a/gcc/testsuite/gcc.target/aarch64/pr81356.c 
>> b/gcc/testsuite/gcc.target/aarch64/pr81356.c
>> index e69de29..9fd6baa 100644
>> --- a/gcc/testsuite/gcc.target/aarch64/pr81356.c
>> +++ b/gcc/testsuite/gcc.target/aarch64/pr81356.c
>> @@ -0,0 +1,9 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" } */
>> +
>> +void f(char *a)
>> +{
>> +  __builtin_strcpy (a, "");
>> +}
>> +
>> +/* { dg-final { scan-assembler-not "ldrb" } } */
>


Re: [PATCH][ARM] Fix test armv8_2-fp16-move-1.c

2017-11-20 Thread Kyrill Tkachov


On 20/11/17 14:14, Christophe Lyon wrote:

Hi,

On 17 November 2017 at 12:12, Kyrill  Tkachov
 wrote:

On 17/11/17 10:45, Sudi Das wrote:

Hi Kyrill

Thanks I have made the change.


Thanks Sudi, I've committed this on your behalf with r254863.

Kyrill



Sudi



From: Kyrill Tkachov 
Sent: Thursday, November 16, 2017 5:03 PM
To: Sudi Das; gcc-patches@gcc.gnu.org
Cc: nd; Ramana Radhakrishnan; Richard Earnshaw
Subject: Re: [PATCH][ARM] Fix test armv8_2-fp16-move-1.c

Hi Sudi,

On 16/11/17 16:37, Sudi Das wrote:

Hi

This patch fixes the test case armv8_2-fp16-move-1.c for
arm-none-linux-gnueabihf where 2 of the scan-assembler directives were
failing. We now generate less vmov between core and VFP registers.
Thus changing those directives to reflect that.

Is this ok for trunk?
If yes could someone commit it on my behalf?

Sudi


*** gcc/testsuite/ChangeLog ***

2017-11-16  Sudakshina Das  

   * gcc.target/arm/armv8_2-fp16-move-1.c: Edit vmov
scan-assembler
   directives.


diff --git a/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-1.c
b/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-1.c
index bb4e68f..0ed8560 100644
--- a/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-1.c
+++ b/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-1.c
@@ -101,8 +101,8 @@ test_select_8 (__fp16 a, __fp16 b, __fp16 c)
/* { dg-final { scan-assembler-times {vselgt\.f16\ts[0-9]+, s[0-9]+,
s[0-9]+} 1 } }  */
/* { dg-final { scan-assembler-times {vselge\.f16\ts[0-9]+, s[0-9]+,
s[0-9]+} 1 } }  */
-/* { dg-final { scan-assembler-times {vmov\.f16\ts[0-9]+, r[0-9]+} 4 }
}  */
-/* { dg-final { scan-assembler-times {vmov\.f16\tr[0-9]+, s[0-9]+} 4 } }
*/
+/* { dg-final { scan-assembler-times {vmov\.f16\ts[0-9]+, r[0-9]+} 2 } }
*/
+/* { dg-final { scan-assembler-times {vmov\ts[0-9]+, s[0-9]+} 4 } }  */
Some of the moves between core and fp registers were the result of
inefficient codegen and in hindsight
scanning for them was not very useful. Now that we emit only the required
ones I think scanning for the plain
vmovs between two S-registers doesn't test anything useful.
So can you please just remove the second scan-assembler directive here?


You are probably already aware of that: the tests fail on
arm-none-linux-gnueabi/arm-none-eabi
FAIL: gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times
vmov\\.f16\\ts[0-9]+, r[0-9]+ 2 (found 38 times)

but this is not a regression, the previous version of the test had the
same problem.


Grrr, that's because the softfp ABI necessitates moves between core and 
FP registers,
so scanning for a particular number of vmovs between them is just not 
gonna be stable across
soft-float ABIs. At this point I'd just remove the scan for vmovs 
completely as it doesn't

seem to check anything useful.

A patch to remove that scan for VMOV is pre-approved.

Thanks for reminding me of this Christophe, softfp tends to slip my mind :(
Kyrill


Christophe


Thanks,
Kyrill








Re: [PATCH] RTEMS: Enable some libstdc++ features

2017-11-20 Thread Jonathan Wakely

On 20/11/17 09:03 +0100, Sebastian Huber wrote:

libstdc++/
* acinclude.m4 (GLIBCXX_ENABLE_LIBSTDCXX_TIME): Add rtems*.
(GLIBCXX_ENABLE_FILESYSTEM_TS): Likewise.
* configure: Regenerate.


OK, thanks.




Re: [v3 PATCH] Implement LWG 2353

2017-11-20 Thread Jonathan Wakely

On 19/11/17 02:40 +0200, Ville Voutilainen wrote:

--- /dev/null
+++ b/libstdc++-v3/testsuite/24_iterators/operations/lwg2353.cc
@@ -0,0 +1,26 @@
+// { dg-options "-D_GLIBCXX_CONCEPT_CHECKS" }
+// { dg-do run { target c++11 } }
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+template
+auto


This will FAIL if the test is run with
--target_board=unix/-std=gnu++11

OK for trunk with an explicitly specified return type here.



Re: [PATCH][ARM] Fix test armv8_2-fp16-move-1.c

2017-11-20 Thread Christophe Lyon
Hi,

On 17 November 2017 at 12:12, Kyrill  Tkachov
 wrote:
>
> On 17/11/17 10:45, Sudi Das wrote:
>>
>> Hi Kyrill
>>
>> Thanks I have made the change.
>
>
> Thanks Sudi, I've committed this on your behalf with r254863.
>
> Kyrill
>
>
>> Sudi
>>
>>
>>
>> From: Kyrill Tkachov 
>> Sent: Thursday, November 16, 2017 5:03 PM
>> To: Sudi Das; gcc-patches@gcc.gnu.org
>> Cc: nd; Ramana Radhakrishnan; Richard Earnshaw
>> Subject: Re: [PATCH][ARM] Fix test armv8_2-fp16-move-1.c
>>
>> Hi Sudi,
>>
>> On 16/11/17 16:37, Sudi Das wrote:
>>>
>>> Hi
>>>
>>> This patch fixes the test case armv8_2-fp16-move-1.c for
>>> arm-none-linux-gnueabihf where 2 of the scan-assembler directives were
>>> failing. We now generate less vmov between core and VFP registers.
>>> Thus changing those directives to reflect that.
>>>
>>> Is this ok for trunk?
>>> If yes could someone commit it on my behalf?
>>>
>>> Sudi
>>>
>>>
>>> *** gcc/testsuite/ChangeLog ***
>>>
>>> 2017-11-16  Sudakshina Das  
>>>
>>>   * gcc.target/arm/armv8_2-fp16-move-1.c: Edit vmov
>>> scan-assembler
>>>   directives.
>>>
>> diff --git a/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-1.c
>> b/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-1.c
>> index bb4e68f..0ed8560 100644
>> --- a/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-1.c
>> +++ b/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-1.c
>> @@ -101,8 +101,8 @@ test_select_8 (__fp16 a, __fp16 b, __fp16 c)
>>/* { dg-final { scan-assembler-times {vselgt\.f16\ts[0-9]+, s[0-9]+,
>> s[0-9]+} 1 } }  */
>>/* { dg-final { scan-assembler-times {vselge\.f16\ts[0-9]+, s[0-9]+,
>> s[0-9]+} 1 } }  */
>>-/* { dg-final { scan-assembler-times {vmov\.f16\ts[0-9]+, r[0-9]+} 4 }
>> }  */
>> -/* { dg-final { scan-assembler-times {vmov\.f16\tr[0-9]+, s[0-9]+} 4 } }
>> */
>> +/* { dg-final { scan-assembler-times {vmov\.f16\ts[0-9]+, r[0-9]+} 2 } }
>> */
>> +/* { dg-final { scan-assembler-times {vmov\ts[0-9]+, s[0-9]+} 4 } }  */
>>Some of the moves between core and fp registers were the result of
>> inefficient codegen and in hindsight
>> scanning for them was not very useful. Now that we emit only the required
>> ones I think scanning for the plain
>> vmovs between two S-registers doesn't test anything useful.
>> So can you please just remove the second scan-assembler directive here?
>>

You are probably already aware of that: the tests fail on
arm-none-linux-gnueabi/arm-none-eabi
FAIL: gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times
vmov\\.f16\\ts[0-9]+, r[0-9]+ 2 (found 38 times)

but this is not a regression, the previous version of the test had the
same problem.

Christophe

>> Thanks,
>> Kyrill
>>
>>
>
>


Re: VRP: x+1 and -x cannot be INT_MIN

2017-11-20 Thread Marc Glisse

On Mon, 20 Nov 2017, Richard Biener wrote:


I had to adapt one testcase where for VR_VARYING | [1, 1] we used to produce
~[0, 0] and now produce [-INT_MAX, INT_MAX]. I am surprised at how late the
transformation now happens (only after removing __builtin_unreachable, in
forwprop3, while trunk currently has it in evrp IIRC), but I didn't
investigate, doesn't seem like the right time with all the VRP changes going
on.


Interesting - can you open a bugreport so we don't forget?  I suspect it's
the effect of zero_nonzero_bits_from_vr handling VARYING and [INT_MIN, INT_MAX]
differently rippling down.  At some point in time I wanted to get rid of VARYING
in favor of [INT_MIN, INT_MAX] ...


I filed PR 83072 about missing the optimization in evrp, but now I am not 
sure if that's what you wanted in the PR or if you wanted one about 
chosing between ~[0, 0] and [-INT_MAX, INT_MAX] for VR_VARYING | [1, 1]...


--
Marc Glisse


Re: [PATCH] Fix test-suite fallout of default -Wreturn-type.

2017-11-20 Thread Jason Merrill
On Sun, Nov 19, 2017 at 7:56 AM, Thomas Schwinge
 wrote:
> OK to fix the first four issues as follows?  If approving this, please
> respond with "Reviewed-by: NAME " so that your effort will be
> recorded.  See .

OK.

Reviewed-by: Jason Merrill 


Re: [RFTesting] New POINTER_DIFF_EXPR

2017-11-20 Thread Jason Merrill
On Sun, Nov 19, 2017 at 6:54 PM, Marc Glisse  wrote:
> new version, regtested on powerpc64le-unknown-linux-gnu. The front-end parts
> are up for review.

The C++ changes are OK.

Jason


[PATCH] Fix Werror=stringop-overflow in target.c

2017-11-20 Thread Martin Liška
Hello.

This is fix of compilation error I see with 
--enable-offload-targets=nvptx-none=. It's explained
in very detail way here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83007#c1

Ready for trunk?
Martin

libgomp/ChangeLog:

2017-11-20  Martin Liska  

* target.c (gomp_target_init): Use proper string operation.
---
 libgomp/target.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)


diff --git a/libgomp/target.c b/libgomp/target.c
index 8ac05e8c641..4838dc98de6 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -2668,7 +2668,10 @@ gomp_target_init (void)
 	  }
 
 	strcpy (plugin_name, prefix);
-	strncat (plugin_name, cur, next ? next - cur : strlen (cur));
+	if (next)
+	  strncat (plugin_name, cur, next - cur);
+	else
+	  strcpy (plugin_name, cur);
 	strcat (plugin_name, suffix);
 
 	if (gomp_load_plugin_for_device (_device, plugin_name))



Re: [PATCH] make canonicalize_condition keep its promise

2017-11-20 Thread Aaron Sawdey
On Sun, 2017-11-19 at 16:44 -0700, Jeff Law wrote:
> On 11/15/2017 08:40 AM, Aaron Sawdey wrote:
> > So, the story of this very small patch starts with me adding
> > patterns
> > for ppc instructions bdz[tf] and bdnz[tf] such as this:
> > 
> >   [(set (pc)
> > (if_then_else
> >   (and
> >  (ne (match_operand:P 1 "register_operand" "c,*b,*b,*b")
> >  (const_int 1))
> >  (match_operator 3 "branch_comparison_operator"
> >   [(match_operand 4 "cc_reg_operand" "y,y,y,y")
> >    (const_int 0)]))
> >   (label_ref (match_operand 0))
> >   (pc)))
> >    (set (match_operand:P 2 "nonimmediate_operand"
> > "=1,*r,m,*d*wi*c*l")
> > (plus:P (match_dup 1)
> > (const_int -1)))
> >    (clobber (match_scratch:P 5 "=X,X,,r"))
> >    (clobber (match_scratch:CC 6 "=X,,,"))
> >    (clobber (match_scratch:CCEQ 7 "=X,,,"))]
> > 
> > However when this gets to the loop_doloop pass, we get an assert
> > fail
> > in iv_number_of_iterations():
> > 
> >   gcc_assert (COMPARISON_P (condition));
> > 
> > This is happening because this branch insn tests two things ANDed
> > together so the and is at the top of the expression, not a
> > comparison.
> 
> Is this something you've created for an existing loop?  Presumably an
> existing loop that previously was a simple loop?

The rtl to use this instruction is generated by new code I'm working on
to do a builtin expansion of memcmp using a loop. I call
gen_bdnztf_di() to generate rtl for the insn. It would be nice to be
able to generate this instruction from doloop conversion but that is
beyond the scope of what I'm working on presently.

> > 
> > This condition is extracted from the insn by get_condition() which
> > is
> > pretty straightforward, and which calls canonicalize_condition()
> > before
> > returning it. Now, one could put a test for a jump condition that
> > is
> > not a conditional test in here but the comment for
> > canonicalize_condition() says:
> > 
> >    (1) The code will always be a comparison operation (EQ, NE, GT,
> > etc.).
> > 
> > So, this patch adds a test at the end that just returns 0 if the
> > return
> > rtx is not a comparison. As it happens, doloop conversion is not
> > needed
> > here because I'm already generating rtl for a branch-decrement
> > counter
> > based loop.
> 
> I'm not at all familiar with this code, but I would probably guess
> that
> COND is supposed to appear within INSN.  Canonicalize comparison is
> supposed to modify the COND expression that appears within INSN.  THe
> overall structure of INSN doesn't much matter as long as COND refers
> to
> a comparison code with two operands.
> 
> My gut tells me the problem is upstream or downstream in the call
> chain
> or that you've taken a loop that was considered a simple loop and
> modified the exit test in a way that other code is not prepared to
> handle.

Yes, get_condition() is the other place that this could be fixed. It
expects rtl of the form:

(set (pc) (if_then_else (cond ...

and here it's seeing something of the form:

(set (pc) (if_then_else (and (cond ... ) (cond ...)

It seems legitimate to add a check for this in get_condition() as well:

  cond = XEXP (SET_SRC (set), 0);
  if (!COMPARISON_P (cond))
return NULL_RTX;

There are a couple of uses of canonicalize_condition() in ifcvt.c and
it looks like the code there may expect the return to be a conditional.
This suggests it might be a good idea to do both checks.

Thanks,
   Aaron

-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain



Re: [PATCH] Fix test-suite fallout of default -Wreturn-type.

2017-11-20 Thread Martin Liška
On 11/19/2017 01:56 PM, Thomas Schwinge wrote:
> Hi!
> 
> On Wed, 18 Oct 2017 14:48:13 +0200, Martin Liška  wrote:
>> This is second patch that addresses test-suite fallout. All these tests fail 
>> because -Wreturn-type is
>> now on by default.
>>
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> 
> ... but alters some LTO test cases as follows:
> 
>> --- a/gcc/testsuite/g++.dg/lto/20080907_0.C
>> +++ b/gcc/testsuite/g++.dg/lto/20080907_0.C
>> @@ -1,3 +1,5 @@
>>  // { dg-lto-do assemble }
>> +// { dg-lto-options "-Wno-return-type" }
> 
> This means that instead of the default (without explicit
> "dg-lto-options") "LTO options torture testing" testing, we're now *only*
> testing one variant, with only the "-Wno-return-type" option specified.
> 
>> --- a/gcc/testsuite/g++.dg/lto/20080915_0.C
>> +++ b/gcc/testsuite/g++.dg/lto/20080915_0.C
>> @@ -1,4 +1,6 @@
>>  // { dg-lto-do assemble }
>> +// { dg-lto-options "-Wno-return-type" }
> 
> Same, but...
> 
>> @@ -16,7 +18,7 @@ int func(const Bar& b) {
>>  }
>>  
>>  struct Baz {
>> - Bar& operator*() {}
>> + Bar& operator*() { static Bar a; return a; }
>>  };
> 
> ... given that you're addressing here the (only, as it seems)
> "-Wreturn-type" issue, you don't need to specify "-Wno-return-type" at
> all.
> 
>> --- a/gcc/testsuite/g++.dg/lto/20091002-1_0.C
>> +++ b/gcc/testsuite/g++.dg/lto/20091002-1_0.C
>> @@ -1,6 +1,6 @@
>>  // { dg-lto-do link }
>>  // { dg-require-effective-target fpic }
>> -// { dg-lto-options {{-fPIC -flto}} }
>> +// { dg-lto-options {{-fPIC -flto -Wno-return-type}} }
>>  // { dg-extra-ld-options "-fPIC -r -nostdlib" }
> 
> ACK.  This only tested one variant: "-fPIC -flto", now testing "-fPIC
> -flto -Wno-return-type".  (Similarly for several more.)
> 
>> --- a/gcc/testsuite/g++.dg/lto/20101010-1_0.C
>> +++ b/gcc/testsuite/g++.dg/lto/20101010-1_0.C
>> @@ -1,4 +1,5 @@
>>  // { dg-lto-do link }
>> +// { dg-lto-options "-Wno-return-type" }
> 
> NACK, see above.
> 
>> --- a/gcc/testsuite/g++.dg/lto/20101010-2_0.C
>> +++ b/gcc/testsuite/g++.dg/lto/20101010-2_0.C
>> @@ -1,4 +1,5 @@
>>  // { dg-lto-do link }
>> +// { dg-lto-options "-Wno-return-type" }
> 
> NACK, see above.
> 
>> --- a/gcc/testsuite/g++.dg/lto/pr65475c_0.C
>> +++ b/gcc/testsuite/g++.dg/lto/pr65475c_0.C
>> @@ -1,6 +1,7 @@
>>  /* { dg-lto-do link } */
>> -/* { dg-lto-options "-O2  -w" } */
>>  /* { dg-extra-ld-options { -O2 -Wno-odr -r -nostdlib } } */
>> +/* { dg-lto-options { "-O2 -w -Wno-return-type" } } */
> 
> I don't know about that one.  This previously tested two variants
> separately: "-O2" and "-w", and you've now changed it to test one
> combined variant: "-O2 -w -Wno-return-type".  Your change is correct if
> this previously meant to test the combined variant "-O2 -w" instead of
> each of them separately -- which I don't know/have not verified.

Hello.

I don't think so, I've just tested that and all invocations of the source file
have '-O2  -w' in corresponding command line. Can you please verify that it 
builds
all possible combinations of provided arguments of dg-lto-options?

> 
>> --- a/gcc/testsuite/g++.dg/lto/pr69589_0.C
>> +++ b/gcc/testsuite/g++.dg/lto/pr69589_0.C
>> @@ -1,5 +1,5 @@
>>  // { dg-lto-do link }
>> -// { dg-lto-options "-O2 -rdynamic" }
>> +// { dg-lto-options { "-O2 -rdynamic -Wno-return-type" } }
>>  // { dg-extra-ld-options "-r -nostdlib" }
> 
> Same: "-O2" and "-rdynamic" separately vs. now "-O2 -rdynamic
> -Wno-return-type" combined.
> 
> Unfortunately, we can't use "dg-options" or "dg-additional-options" here:
> 
> WARNING: lto.exp does not support dg-additional-options
> WARNING: lto.exp does not support dg-options in primary source file

I know, I've tried that :)

Martin

> 
> OK to fix the first four issues as follows?  If approving this, please
> respond with "Reviewed-by: NAME " so that your effort will be
> recorded.  See .
> 
> diff --git gcc/testsuite/g++.dg/lto/20080907_0.C 
> gcc/testsuite/g++.dg/lto/20080907_0.C
> index a423196..153d0ab 100644
> --- gcc/testsuite/g++.dg/lto/20080907_0.C
> +++ gcc/testsuite/g++.dg/lto/20080907_0.C
> @@ -1,5 +1,7 @@
>  // { dg-lto-do assemble }
> -// { dg-lto-options "-Wno-return-type" }
> +
> +/* "WARNING: lto.exp does not support dg-additional-options" */
> +#pragma GCC diagnostic ignored "-Wreturn-type"
>  
>  struct Foo { void func (); }; Foo & bar () { } struct Baz { Baz (Baz &); };
>  Baz dummy() { bar().func(); }
> diff --git gcc/testsuite/g++.dg/lto/20080915_0.C 
> gcc/testsuite/g++.dg/lto/20080915_0.C
> index 40c5042..c91e756 100644
> --- gcc/testsuite/g++.dg/lto/20080915_0.C
> +++ gcc/testsuite/g++.dg/lto/20080915_0.C
> @@ -1,5 +1,4 @@
>  // { dg-lto-do assemble }
> -// { dg-lto-options "-Wno-return-type" }
>  
>  struct Foo {
>   static const int dummy;
> diff --git gcc/testsuite/g++.dg/lto/20101010-1_0.C 
> gcc/testsuite/g++.dg/lto/20101010-1_0.C
> index 8f694c7..bb3e6d4 100644
> --- 

Re: Fix gcc-7 and gcc-8 build of GO for Hurd and not kFreeBSD in debian/rules.defs

2017-11-20 Thread Svante Signell
On Mon, 2017-11-20 at 14:04 +0100, Aurelien Jarno wrote:
> On 2017-11-20 13:20, Svante Signell wrote:
> > On Thu, 2017-11-16 at 20:39 +0100, Svante Signell wrote:
> > 
> > Seeing the recent update of debian/rules.defs in gcc-7 a mistake was
> > revealed in the patch debian_rules.defs for both gcc-7 and gcc-8. The
> > correct patch is inlined below:
> > 
> > --- a/debian/rules.defs 2017-11-20 12:59:25.0 +0100
> > +++ b/debian/rules.defs 2017-11-20 13:01:54.0 +0100
> > @@ -807,7 +807,7 @@
> >  ifeq (,$(filter $(distrelease),lenny etch squeeze dapper hardy jaunty
> > karmic
> > lucid maverick natty oneiric))
> >    go_no_cpus := $(filter-out arm, $(go_no_cpus))
> >  endif
> > -go_no_systems := gnu kfreebsd-gnu
> > +go_no_systems := kfreebsd
> >  
> >  ifneq ($(with_base_only),yes)
> >    ifneq ($(separate_lang),yes)
> > @@ -817,7 +817,7 @@
> >  ifneq (,$(filter $(DEB_TARGET_ARCH_CPU),$(go_no_cpus)))
> >    with_go := disabled for cpu $(DEB_TARGET_ARCH_CPU)
> >  endif
> > -ifneq (,$(findstring $(DEB_TARGET_GNU_SYSTEM),$(go_no_systems)))
> > +ifneq (,$(findstring $(DEB_TARGET_ARCH_OS),$(go_no_systems)))
> >    with_go := disabled for system $(DEB_TARGET_GNU_SYSTEM)
> >  endif
> >  ifeq ($(go_no_cross)-$(DEB_CROSS),yes-yes)
> > 
> > Changing back to go_no_systems := kfreebsd-gnu as aurel32 did disables the
> > build of go also for GNU/Hurd. The culprit is to match DEB_TARGET_ARCH_OS to
> > kfreebsd to disable the build of go for GNU/kFreeBSD, not to match
> > DEB_TARGET_GNU_SYSTEM to kfreebsd-gnu, since that will match gnu too :(
> 
> I did that change because changing go_no_systems from kfreebsd-gnu to
> kfreebsd is what caused GCC 7 to FTBFS on at least kfreebsd-i386 since
> version 7.2.0-15 (see bug#881656).
> 
> Your change also looks fine to me. Should I commit it?

Yes please. And do this change also for gcc-8, otherwise it will FTBFS on
kFreeBSD too.


Re: Fix gcc-7 and gcc-8 build of GO for Hurd and not kFreeBSD in debian/rules.defs

2017-11-20 Thread Aurelien Jarno
On 2017-11-20 13:20, Svante Signell wrote:
> On Thu, 2017-11-16 at 20:39 +0100, Svante Signell wrote:
> 
> Seeing the recent update of debian/rules.defs in gcc-7 a mistake was revealed 
> in
> the patch debian_rules.defs for both gcc-7 and gcc-8. The correct patch is
> inlined below:
> 
> --- a/debian/rules.defs   2017-11-20 12:59:25.0 +0100
> +++ b/debian/rules.defs   2017-11-20 13:01:54.0 +0100
> @@ -807,7 +807,7 @@
>  ifeq (,$(filter $(distrelease),lenny etch squeeze dapper hardy jaunty karmic
> lucid maverick natty oneiric))
>    go_no_cpus := $(filter-out arm, $(go_no_cpus))
>  endif
> -go_no_systems := gnu kfreebsd-gnu
> +go_no_systems := kfreebsd
>  
>  ifneq ($(with_base_only),yes)
>    ifneq ($(separate_lang),yes)
> @@ -817,7 +817,7 @@
>  ifneq (,$(filter $(DEB_TARGET_ARCH_CPU),$(go_no_cpus)))
>    with_go := disabled for cpu $(DEB_TARGET_ARCH_CPU)
>  endif
> -ifneq (,$(findstring $(DEB_TARGET_GNU_SYSTEM),$(go_no_systems)))
> +ifneq (,$(findstring $(DEB_TARGET_ARCH_OS),$(go_no_systems)))
>    with_go := disabled for system $(DEB_TARGET_GNU_SYSTEM)
>  endif
>  ifeq ($(go_no_cross)-$(DEB_CROSS),yes-yes)
> 
> Changing back to go_no_systems := kfreebsd-gnu as aurel32 did disables the 
> build
> of go also for GNU/Hurd. The culprit is to match DEB_TARGET_ARCH_OS to 
> kfreebsd
> to disable the build of go for GNU/kFreeBSD, not to match 
> DEB_TARGET_GNU_SYSTEM
> to kfreebsd-gnu, since that will match gnu too :(

I did that change because changing go_no_systems from kfreebsd-gnu to
kfreebsd is what caused GCC 7 to FTBFS on at least kfreebsd-i386 since
version 7.2.0-15 (see bug#881656).

Your change also looks fine to me. Should I commit it?

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net


Re: [014/nnn] poly_int: indirect_refs_may_alias_p

2017-11-20 Thread Richard Sandiford
Jeff Law  writes:
> On 10/23/2017 11:05 AM, Richard Sandiford wrote:
>> This patch makes indirect_refs_may_alias_p use ranges_may_overlap_p
>> rather than ranges_overlap_p.  Unlike the former, the latter can handle
>> negative offsets, so the fix for PR44852 should no longer be necessary.
>> It can also handle offset_int, so avoids unchecked truncations to
>> HOST_WIDE_INT.
>> 
>> 
>> 2017-10-23  Richard Sandiford  
>>  Alan Hayward  
>>  David Sherwood  
>> 
>> gcc/
>>  * tree-ssa-alias.c (indirect_ref_may_alias_decl_p)
>>  (indirect_refs_may_alias_p): Use ranges_may_overlap_p
>>  instead of ranges_overlap_p.
> OK.
>
> Note that this highlighted a nit in patch 001 -- namely that there's new
> function templates that aren't mentioned in the ChangeLog.

Do you mean ranges_may_overlap_p?  I can add that and the other new
poly-int.h functions to the changelog if you think it's useful,
but I thought for new files it was more usual just to do:

* foo.h: New file.

Thanks,
Richard


Re: [PATCH][AArch64] Restrict POST_INC operand in aarch64_simd_mem_operand_p to register

2017-11-20 Thread Dominik Inführ
Thanks for the review! Could you also please commit this patch for me? I don’t 
have commit rights.

Thanks,
Dominik

> On 17 Nov 2017, at 23:18, James Greenhalgh  wrote:
> 
> On Tue, Oct 31, 2017 at 02:47:29PM +0100, Dominik Inführ wrote:
>> Hi,
>> 
>> I have a custom optimization pass, that moves an expression into an
>> POST_INC-expression. GCC then ICE’s in df-scan.c since it expects REG_P to be
>> true for POST_INC’s operand. aarch64_simd_mem_operand_p doesn’t seem to check
>> POST_INC’s operand. Here is a patch that fixes this for me, although I am not
>> sure if this is the right way to address this. GCC bootstraps and it causes
>> no test regressions.
> 
> OK!
> 
> Reviewed-by: James Greenhalgh 
> 
> Thanks,
> James
> 
>> 
>> Dominik
>> 
>> ChangeLog:
>> 2017-10-31  Dominik Infuehr  
>> 
>>  * config/aarch64/aarch64.c (aarch64_simd_mem_operand_p): Check
>>  if register given as operand for POST_INC.
>> 
>> -
>> 
>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> index ed30b8c..bb61506 100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
>> @@ -11850,8 +11850,15 @@ aarch64_simd_lane_bounds (rtx operand, 
>> HOST_WIDE_INT low, HOST_WIDE_INT high,
>> bool
>> aarch64_simd_mem_operand_p (rtx op)
>> {
>> -  return MEM_P (op) && (GET_CODE (XEXP (op, 0)) == POST_INC
>> -   || REG_P (XEXP (op, 0)));
>> +  if (!MEM_P (op))
>> +return false;
>> +
>> +  rtx opnd = XEXP (op, 0);
>> +
>> +  if (GET_CODE (opnd) == POST_INC)
>> +opnd = XEXP(opnd, 0);
>> +
>> +  return REG_P (opnd);
>> }
>> 
>> /* Emit a register copy from operand to operand, taking care not to
> 
> 



signature.asc
Description: Message signed with OpenPGP using GPGMail


  1   2   >