Re: [PATCH] Fix PR63259: bswap not recognized when finishing with rotation

2014-10-07 Thread Andrew Pinski
On Tue, Oct 7, 2014 at 11:43 PM, Thomas Preud'homme
 wrote:
>> From: Jakub Jelinek [mailto:ja...@redhat.com]
>> Sent: Wednesday, October 08, 2014 2:39 PM
>>
>> Doesn't it turn 16-bit {L,R}ROTATE_EXPR used alone into
>> __builtin_bswap16?
>> For those the question is if the canonical GIMPLE should be the rotation
>> or
>> byteswap, I'd think rotation would be perhaps better.  Or depending on
>> if
>> the backend has bswaphi2 or rotate pattern?
>
> Good point. It seems better to keep the status quo.


There is already code which converts __builtin_bswap16 to the rotate
so maybe it does not matter.  Though I have not seen code which does
the rotate into a bswaphi2 pattern.

Thanks,
Andrew


>
>>
>> Also, perhaps you could short-circuit this if the rotation isn't by constant
>> or not a multiple of BITS_PER_UNIT.  So
>> switch (code)
>>   {
>>   case BIT_IOR_EXPR:
>> break;
>>   case LROTATE_EXPR:
>>   case RROTATE_EXPR:
>> if (!tree_fits_uhwi_p (gimple_assign_rhs2 (cur_stmt))
>> || (tree_to_uhwi (gimple_assign_rhs2 (cur_stmt))
>> % BITS_PER_UNIT))
>>   continue;
>> break;
>>   default:
>> continue;
>>   }
>> ?
>
> Right. Thanks for the comments.
>
> Best regards,
>
> Thomas
>
>
>
>


RE: [PATCH] Fix PR63259: bswap not recognized when finishing with rotation

2014-10-07 Thread Thomas Preud'homme
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme
> Sent: Wednesday, October 08, 2014 2:43 PM
> > Also, perhaps you could short-circuit this if the rotation isn't by constant

Note that do_shift_rotate already checks for this. Is it enough?

Best regards,

Thomas 






Re: [Patch, MIPS] Add Octeon3 support

2014-10-07 Thread Hurugalawadi, Naveen
Hi,

>> Patches adding new -march= values need to update invoke.texi.

Thanks for reviewing the patch and suggestion.
Please find attached the modified patch which updates octeon3
in invoke.texi
Please review the patch and let us know if there should be any
further modifications.

Thanks,

2014-10-08  Andrew Pinski  

* config/mips/mips-cpus.def (octeon3): New cpu.
* config/mips/mips.c (mips_rtx_cost_data): Add octeon3.
(mips_print_operand ): Fix a bug as the mode
of the comparison no longer matches mode of the operands.
(mips_issue_rate): Handle PROCESSOR_OCTEON3.
* config/mips/mips.h (TARGET_OCTEON):  Add Octeon3.
(TARGET_OCTEON2): Likewise.
(TUNE_OCTEON): Add Octeon3.
* config/mips/mips.md (processor): Add octeon3.
* config/mips/octeon.md (octeon_fpu): New automaton and cpu_unit.
(octeon_arith): Add octeon3.
(octeon_condmove): Remove.
(octeon_condmove_o1): New reservation.
(octeon_condmove_o2): New reservation.
(octeon_condmove_o3_int_on_cc): New reservation.
(octeon_load_o2): Add octeon3.
(octeon_cop_o2): Likewise.
(octeon_store): Likewise.
(octeon_brj_o2): Likewise.
(octeon_imul3_o2): Likewise.
(octeon_imul_o2): Likewise.
(octeon_mfhilo_o2): Likewise.
(octeon_imadd_o2): Likewise.
(octeon_idiv_o2_si): Likewise.
(octeon_idiv_o2_di): Likewise.
(octeon_fpu): Add to the automaton.
(octeon_fpu): New cpu unit.
(octeon_condmove_o2): Check for non floating point modes.
(octeon_load_o2): Add prefetchx.
(octeon_cop_o2): Don't check for octeon3.
(octeon3_faddsubcvt): New reservation.
(octeon3_fmul): Likewise.
(octeon3_fmadd): Likewise.
(octeon3_div_sf): Likewise.
(octeon3_div_df): Likewise.
(octeon3_sqrt_sf): Likewise.
(octeon3_sqrt_df): Likewise.
(octeon3_rsqrt_sf): Likewise.
(octeon3_rsqrt_df): Likewise.
(octeon3_fabsnegmov): Likewise.
(octeon_fcond): Likewise.
(octeon_fcondmov): Likewise.
(octeon_fpmtc1): Likewise.
(octeon_fpmfc1): Likewise.
(octeon_fpload): Likewise.
(octeon_fpstore): Likewise.
* config/mips/mips-tables.opt: Regenerate.
* doc/invoke.texi (-march=@var{arch}): Add octeon3.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index e198b2b..43cfac2 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,54 @@
+2014-10-08  Andrew Pinski  
+
+	* config/mips/mips-cpus.def (octeon3): New cpu.
+	* config/mips/mips.c (mips_rtx_cost_data): Add octeon3.
+	(mips_print_operand ): Fix a bug as the mode
+	of the comparison no longer matches mode of the operands.
+	(mips_issue_rate): Handle PROCESSOR_OCTEON3.
+	* config/mips/mips.h (TARGET_OCTEON):  Add Octeon3.
+	(TARGET_OCTEON2): Likewise.
+	(TUNE_OCTEON): Add Octeon3.
+	* config/mips/mips.md (processor): Add octeon3.
+	* config/mips/octeon.md (octeon_fpu): New automaton and cpu_unit.
+	(octeon_arith): Add octeon3.
+	(octeon_condmove): Remove.
+	(octeon_condmove_o1): New reservation.
+	(octeon_condmove_o2): New reservation.
+	(octeon_condmove_o3_int_on_cc): New reservation.
+	(octeon_load_o2): Add octeon3.
+	(octeon_cop_o2): Likewise.
+	(octeon_store): Likewise.
+	(octeon_brj_o2): Likewise.
+	(octeon_imul3_o2): Likewise.
+	(octeon_imul_o2): Likewise.
+	(octeon_mfhilo_o2): Likewise.
+	(octeon_imadd_o2): Likewise.
+	(octeon_idiv_o2_si): Likewise.
+	(octeon_idiv_o2_di): Likewise.
+	(octeon_fpu): Add to the automaton.
+	(octeon_fpu): New cpu unit.
+	(octeon_condmove_o2): Check for non floating point modes.
+	(octeon_load_o2): Add prefetchx.
+	(octeon_cop_o2): Don't check for octeon3.
+	(octeon3_faddsubcvt): New reservation.
+	(octeon3_fmul): Likewise.
+	(octeon3_fmadd): Likewise.
+	(octeon3_div_sf): Likewise.
+	(octeon3_div_df): Likewise.
+	(octeon3_sqrt_sf): Likewise.
+	(octeon3_sqrt_df): Likewise.
+	(octeon3_rsqrt_sf): Likewise.
+	(octeon3_rsqrt_df): Likewise.
+	(octeon3_fabsnegmov): Likewise.
+	(octeon_fcond): Likewise.
+	(octeon_fcondmov): Likewise.
+	(octeon_fpmtc1): Likewise.
+	(octeon_fpmfc1): Likewise.
+	(octeon_fpload): Likewise.
+	(octeon_fpstore): Likewise.
+	* config/mips/mips-tables.opt: Regenerate.
+	* doc/invoke.texi (-march=@var{arch}): Add octeon3.
+
 2014-10-07  Iain Sandoe  
 
 	PR target/61387
diff --git a/gcc/config/mips/mips-cpus.def b/gcc/config/mips/mips-cpus.def
index d5528d3..e2985b8 100644
--- a/gcc/config/mips/mips-cpus.def
+++ b/gcc/config/mips/mips-cpus.def
@@ -162,4 +162,5 @@ MIPS_CPU ("loongson3a", PROCESSOR_LOONGSON_3A, 65, PTF_AVOID_BRANCHLIKELY)
 MIPS_CPU ("octeon", PROCESSOR_OCTEON, 65, PTF_AVOID_BRANCHLIKELY)
 MIPS_CPU ("octeon+", PROCESSOR_OCTEON, 65, PTF_AVOID_BRANCHLIKELY)
 MIPS_CPU ("octeon2", PROCESSOR_OCTEON2, 65, PTF_AVOID_BRANCHLIKELY)
+MIPS_CPU ("octeon3", PROCESSOR_OCTEON3, 65, PTF_AVOID_BRANCHLIKELY)
 MIPS_CPU ("xlp", PROCESSOR_XLP, 65, PTF_AVOID_BRANCHLIKELY)
diff --git a/gcc/config/mips/mips-tables.opt b/gcc/config/mips/mips-tables.opt
index 5791b41..99d2ed8 100644
--- a/gcc/config/mips/mips-tables.opt
+++ b/gcc/config/mips/mips-ta

RE: [PATCH] Fix PR63259: bswap not recognized when finishing with rotation

2014-10-07 Thread Thomas Preud'homme
> From: Jakub Jelinek [mailto:ja...@redhat.com]
> Sent: Wednesday, October 08, 2014 2:39 PM
> 
> Doesn't it turn 16-bit {L,R}ROTATE_EXPR used alone into
> __builtin_bswap16?
> For those the question is if the canonical GIMPLE should be the rotation
> or
> byteswap, I'd think rotation would be perhaps better.  Or depending on
> if
> the backend has bswaphi2 or rotate pattern?

Good point. It seems better to keep the status quo.

> 
> Also, perhaps you could short-circuit this if the rotation isn't by constant
> or not a multiple of BITS_PER_UNIT.  So
> switch (code)
>   {
>   case BIT_IOR_EXPR:
> break;
>   case LROTATE_EXPR:
>   case RROTATE_EXPR:
> if (!tree_fits_uhwi_p (gimple_assign_rhs2 (cur_stmt))
> || (tree_to_uhwi (gimple_assign_rhs2 (cur_stmt))
> % BITS_PER_UNIT))
>   continue;
> break;
>   default:
> continue;
>   }
> ?

Right. Thanks for the comments.

Best regards,

Thomas






Re: [PATCH] Fix PR63259: bswap not recognized when finishing with rotation

2014-10-07 Thread Jakub Jelinek
On Wed, Oct 08, 2014 at 09:56:51AM +0800, Thomas Preud'homme wrote:
> --- a/gcc/tree-ssa-math-opts.c
> +++ b/gcc/tree-ssa-math-opts.c
> @@ -2377,11 +2377,16 @@ pass_optimize_bswap::execute (function *fun)
>  {
> gimple src_stmt, cur_stmt = gsi_stmt (gsi);
> tree fndecl = NULL_TREE, bswap_type = NULL_TREE, load_type;
> +   enum tree_code code;
> struct symbolic_number n;
> bool bswap;
>  
> -   if (!is_gimple_assign (cur_stmt)
> -   || gimple_assign_rhs_code (cur_stmt) != BIT_IOR_EXPR)
> +   if (!is_gimple_assign (cur_stmt))
> + continue;
> +
> +   code = gimple_assign_rhs_code (cur_stmt);
> +   if (code != BIT_IOR_EXPR && code != LROTATE_EXPR
> +   && code != RROTATE_EXPR)
>   continue;
>  
> src_stmt = find_bswap_or_nop (cur_stmt, &n, &bswap);

Doesn't it turn 16-bit {L,R}ROTATE_EXPR used alone into __builtin_bswap16?
For those the question is if the canonical GIMPLE should be the rotation or
byteswap, I'd think rotation would be perhaps better.  Or depending on if
the backend has bswaphi2 or rotate pattern?
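
To make the equivalence under discussion concrete: for a 16-bit value, rotating by 8 bits in either direction swaps the two bytes, which is what __builtin_bswap16 computes. A minimal sketch, assuming a GCC-compatible compiler; rotate_left16 and swap_by_rotation16 are hypothetical helpers for illustration, not part of the patch:

```c
#include <stdint.h>

/* Hypothetical helper: rotate a 16-bit value left by N bits.  */
uint16_t
rotate_left16 (uint16_t x, unsigned n)
{
  unsigned v = x;   /* widen first so the shifts cannot overflow */
  n &= 15;
  if (n == 0)
    return x;
  return (uint16_t) ((v << n) | (v >> (16 - n)));
}

/* Rotating by 8 exchanges the high and low byte, so the result is the
   same value __builtin_bswap16 would return.  */
uint16_t
swap_by_rotation16 (uint16_t x)
{
  return rotate_left16 (x, 8);
}
```

Since both forms compute the same value, the choice between them is purely a question of which one should be canonical in GIMPLE.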

Also, perhaps you could short-circuit this if the rotation isn't by constant
or not a multiple of BITS_PER_UNIT.  So
  switch (code)
{
case BIT_IOR_EXPR:
  break;
case LROTATE_EXPR:
case RROTATE_EXPR:
  if (!tree_fits_uhwi_p (gimple_assign_rhs2 (cur_stmt))
  || (tree_to_uhwi (gimple_assign_rhs2 (cur_stmt))
  % BITS_PER_UNIT))
continue;
  break;
default:
  continue;
}
?

Jakub
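
The short-circuit suggested above boils down to a simple predicate on the rotation amount. The sketch below restates it outside of GCC, under stated assumptions: the amount_is_constant flag stands in for tree_fits_uhwi_p, CHAR_BIT stands in for the target's BITS_PER_UNIT, and the function name is hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* Stand-in for the suggested check.  A rotation can only complete a
   byte swap if its amount is a known constant and moves whole bytes.
   AMOUNT_IS_CONSTANT models tree_fits_uhwi_p; CHAR_BIT models
   BITS_PER_UNIT (an assumption: host and target bytes match).  */
bool
rotation_worth_analysing (bool amount_is_constant, unsigned long amount)
{
  if (!amount_is_constant)
    return false;
  return amount % CHAR_BIT == 0;
}
```

A rotation by 16 passes the check, while a rotation by 4 or by a run-time amount is skipped before any further analysis is attempted.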


Re: [PATCH RFC]Pair load store instructions using a generic scheduling fusion pass

2014-10-07 Thread Bin.Cheng
On Wed, Oct 8, 2014 at 1:28 PM, Jeff Law  wrote:
> On 10/06/14 19:31, Bin.Cheng wrote:
>>
>> On Tue, Oct 7, 2014 at 1:20 AM, Mike Stump  wrote:
>>>
>>> On Oct 6, 2014, at 4:32 AM, Richard Biener 
>>> wrote:

>>>> On Mon, Oct 6, 2014 at 11:57 AM, Bin.Cheng 
>>>> wrote:
>>>>
>>>> How many merging opportunities does sched2 undo again?  ISTR it
>>>> has the tendency of pushing stores down and loads up.
>>>
>>>
>>> So, the pass works by merging 2 or more loads into 1 load (at least on my
>>> port).  sched2 would need to rip apart 1 load into 2 loads to be able to
>>> undo the real work.  The non-real work, doesn't matter any.  Can sched2 rip
>>> apart a single load?
>>
>> On ARM and AARCH64, the two merged load/store are transformed into
>> single parallel insn by the following peephole2 pass, so that sched2
>> would not undo the fusion work.  I thought sched2 works on the basis of
>> instructions, and it isn't good practice to have sched2 do split work.
>
> It's certainly advantageous for sched2 to split insns that generate multiple
> instructions.  Running after register allocation, sched2 is ideal for
> splitting because we know the alternative for each insn and thus we can
> (possibly for the first time) accurately know if a particular insn will
> generate multiple assembly instructions.
>
> If the port has a splitter to rip apart a double-word load into single-word
> loads, then we'd obviously only want to do that in cases where the
> double-word load actually generates > 1 assembly instruction.
>
> Addressing issues in that space seems out of scope for Bin's work to me,
> except perhaps for such issues on aarch64/arm which are Bin's primary
> concerns.


Hi Jeff,

Thanks very much for the explanation.  I may well be wrong here, but
what you describe seems to fit pass_split_before_sched2 very well.  It
would be nice if we could differentiate the cases up front by
generating different patterns, rather than splitting some of the
instructions later, though I have no idea whether that is possible.

For arm/aarch64, I guess it's not an issue; otherwise the peephole2
wouldn't work at all.  The ARM maintainers should have an answer to this.


>
> jeff


Re: [PATCH] Fix PR bootstrap/63432 in jump threading

2014-10-07 Thread Jeff Law

On 10/07/14 22:39, Teresa Johnson wrote:

> This patch addresses PR bootstrap/63432 which was an insanity in the
> probabilities created during jump threading. This was caused by threading
> multiple times along the same path leading to the second jump thread path being
> corrupted, which in turn caused the profile update code to fail. There
> was code in mark_threaded_blocks that intended to catch and suppress
> these cases of threading multiple times along the same path, but it
> was sensitive to the order in which the paths were discovered and recorded.
> This patch changes the detection to do two passes and removes the ordering
> sensitivity.
>
> Also, while fixing this I realized that the previous method of checking
> the entry BB's profile count was not an accurate way to determine whether
> the function has any non-zero profile counts. Created a new routine
> to walk the path and see if it has all zero profile counts and estimated
> frequencies.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu. Also did an LTO
> profiledbootstrap.
>
> Ok for trunk?
>
> Thanks,
> Teresa
>
> 2014-10-07  Teresa Johnson  
>
>  PR bootstrap/63432.
>  * tree-ssa-threadupdate.c (estimated_freqs_path): New function.
>  (ssa_fix_duplicate_block_edges): Invoke it.
>  (mark_threaded_blocks): Make two passes to avoid ordering dependences.

OK.

Thanks,
jeff



Re: [PATCH RFC]Pair load store instructions using a generic scheduling fusion pass

2014-10-07 Thread Jeff Law

On 10/06/14 19:31, Bin.Cheng wrote:

> On Tue, Oct 7, 2014 at 1:20 AM, Mike Stump  wrote:
>
>> On Oct 6, 2014, at 4:32 AM, Richard Biener  wrote:
>>
>>> On Mon, Oct 6, 2014 at 11:57 AM, Bin.Cheng  wrote:
>>>
>>> How many merging opportunities does sched2 undo again?  ISTR it
>>> has the tendency of pushing stores down and loads up.
>>
>>
>> So, the pass works by merging 2 or more loads into 1 load (at least on my
>> port).  sched2 would need to rip apart 1 load into 2 loads to be able to undo
>> the real work.  The non-real work, doesn't matter any.  Can sched2 rip apart a
>> single load?
>
> On ARM and AARCH64, the two merged load/store are transformed into
> single parallel insn by the following peephole2 pass, so that sched2
> would not undo the fusion work.  I thought sched2 works on the basis of
> instructions, and it isn't good practice to have sched2 do split work.

It's certainly advantageous for sched2 to split insns that generate 
multiple instructions.  Running after register allocation, sched2 is 
ideal for splitting because we know the alternative for each insn 
and thus we can (possibly for the first time) accurately know if a 
particular insn will generate multiple assembly instructions.


If the port has a splitter to rip apart a double-word load into 
single-word loads, then we'd obviously only want to do that in cases 
where the double-word load actually generates > 1 assembly instruction.


Addressing issues in that space seems out of scope for Bin's work to me, 
except perhaps for such issues on aarch64/arm which are Bin's primary 
concerns.


jeff


[PATCH] Fix PR bootstrap/63432 in jump threading

2014-10-07 Thread Teresa Johnson
This patch addresses PR bootstrap/63432 which was an insanity in the
probabilities created during jump threading. This was caused by threading
multiple times along the same path leading to the second jump thread path being
corrupted, which in turn caused the profile update code to fail. There
was code in mark_threaded_blocks that intended to catch and suppress
these cases of threading multiple times along the same path, but it
was sensitive to the order in which the paths were discovered and recorded.
This patch changes the detection to do two passes and removes the ordering
sensitivity.
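
The ordering problem described above can be illustrated with a toy model. This is only a sketch under stated assumptions: edges are plain integers, edge_aux[] plays the role of the edge's aux field, and every name below is hypothetical rather than taken from tree-ssa-threadupdate.c. It contrasts a single sweep, which checks and attaches each path in vector order, with the two-pass scheme, which attaches every path first and only then cancels overlaps.

```c
#include <stdbool.h>
#include <stddef.h>

#define NO_PATH (-1)

/* Toy model of a jump thread path: EDGES[0] is the incoming edge the
   path is recorded on, the remaining entries are downstream edges.  */
struct toy_path
{
  const int *edges;
  size_t n_edges;
  bool cancelled;
};

/* True if any downstream edge of P already carries a recorded path.  */
bool
downstream_edge_has_path (const struct toy_path *p, const int *edge_aux)
{
  for (size_t j = 1; j < p->n_edges; j++)
    if (edge_aux[p->edges[j]] != NO_PATH)
      return true;
  return false;
}

/* Single sweep: check and attach each path in vector order.  A path
   recorded later on a downstream edge is invisible to earlier paths.  */
size_t
mark_single_pass (struct toy_path *paths, size_t n, int *edge_aux)
{
  size_t kept = 0;
  for (size_t i = 0; i < n; i++)
    {
      struct toy_path *p = &paths[i];
      if (edge_aux[p->edges[0]] != NO_PATH
          || downstream_edge_has_path (p, edge_aux))
        p->cancelled = true;
      else
        {
          edge_aux[p->edges[0]] = (int) i;
          kept++;
        }
    }
  return kept;
}

/* Two passes: attach every path first, then cancel the overlaps.  */
size_t
mark_two_pass (struct toy_path *paths, size_t n, int *edge_aux)
{
  size_t kept = 0;
  for (size_t i = 0; i < n; i++)
    {
      if (edge_aux[paths[i].edges[0]] == NO_PATH)
        edge_aux[paths[i].edges[0]] = (int) i;
      else
        paths[i].cancelled = true;
    }
  for (size_t i = 0; i < n; i++)
    {
      struct toy_path *p = &paths[i];
      if (p->cancelled)
        continue;
      if (downstream_edge_has_path (p, edge_aux))
        {
          edge_aux[p->edges[0]] = NO_PATH;
          p->cancelled = true;
        }
      else
        kept++;
    }
  return kept;
}

/* Path A = e0 -> e1 -> e2 is recorded before path B = e1 -> e3, so B
   starts on an edge that lies downstream on A.  */
size_t
kept_for_overlapping_pair (bool two_pass)
{
  static const int a[] = { 0, 1, 2 };
  static const int b[] = { 1, 3 };
  struct toy_path paths[] = { { a, 3, false }, { b, 2, false } };
  int edge_aux[4] = { NO_PATH, NO_PATH, NO_PATH, NO_PATH };
  return two_pass ? mark_two_pass (paths, 2, edge_aux)
                  : mark_single_pass (paths, 2, edge_aux);
}
```

In this model the single sweep keeps both overlapping paths because B is not yet recorded when A is examined, whereas the two-pass version sees B on A's downstream edge and cancels A.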

Also, while fixing this I realized that the previous method of checking
the entry BB's profile count was not an accurate way to determine whether
the function has any non-zero profile counts. Created a new routine
to walk the path and see if it has all zero profile counts and estimated
frequencies.

Bootstrapped and tested on x86_64-unknown-linux-gnu. Also did an LTO
profiledbootstrap.

Ok for trunk?

Thanks,
Teresa

2014-10-07  Teresa Johnson  

PR bootstrap/63432.
* tree-ssa-threadupdate.c (estimated_freqs_path): New function.
(ssa_fix_duplicate_block_edges): Invoke it.
(mark_threaded_blocks): Make two passes to avoid ordering dependences.

Index: tree-ssa-threadupdate.c
===
--- tree-ssa-threadupdate.c (revision 215830)
+++ tree-ssa-threadupdate.c (working copy)
@@ -959,6 +959,43 @@ update_joiner_offpath_counts (edge epath, basic_bl
 }


+/* Check if the paths through RD all have estimated frequencies but zero
+   profile counts.  This is more accurate than checking the entry block
+   for a zero profile count, since profile insanities sometimes creep in.  */
+
+static bool
+estimated_freqs_path (struct redirection_data *rd)
+{
+  edge e = rd->incoming_edges->e;
+  vec *path = THREAD_PATH (e);
+  edge ein;
+  edge_iterator ei;
+  bool non_zero_freq = false;
+  FOR_EACH_EDGE (ein, ei, e->dest->preds)
+{
+  if (ein->count)
+return false;
+  non_zero_freq |= ein->src->frequency != 0;
+}
+
+  for (unsigned int i = 1; i < path->length (); i++)
+{
+  edge epath = (*path)[i]->e;
+  if (epath->src->count)
+return false;
+  non_zero_freq |= epath->src->frequency != 0;
+  edge esucc;
+  FOR_EACH_EDGE (esucc, ei, epath->src->succs)
+{
+  if (esucc->count)
+return false;
+  non_zero_freq |= esucc->src->frequency != 0;
+}
+}
+  return non_zero_freq;
+}
+
+
 /* Invoked for routines that have guessed frequencies and no profile
counts to record the block and edge frequencies for paths through RD
in the profile count fields of those blocks and edges.  This is because
@@ -1058,9 +1095,11 @@ ssa_fix_duplicate_block_edges (struct redirection_
  data we first take a snapshot of the existing block and edge frequencies
  by copying them into the empty profile count fields.  These counts are
  then used to do the incremental updates, and cleared at the end of this
- routine.  */
+ routine.  If the function is marked as having a profile, we still check
+ to see if the paths through RD are using estimated frequencies because
+ the routine had zero profile counts.  */
   bool do_freqs_to_counts = (profile_status_for_fn (cfun) != PROFILE_READ
- || !ENTRY_BLOCK_PTR_FOR_FN (cfun)->count);
+ || estimated_freqs_path (rd));
   if (do_freqs_to_counts)
 freqs_to_counts_path (rd);

@@ -2077,35 +2116,52 @@ mark_threaded_blocks (bitmap threaded_blocks)

   /* Now iterate again, converting cases where we want to thread
  through a joiner block, but only if no other edge on the path
- already has a jump thread attached to it.  */
+ already has a jump thread attached to it.  We do this in two passes,
+ to avoid situations where the order in the paths vec can hide overlapping
+ threads (the path is recorded on the incoming edge, so we would miss
+ cases where the second path starts at a downstream edge on the same
+ path).  First record all joiner paths, deleting any in the unexpected
+ case where there is already a path for that incoming edge.  */
   for (i = 0; i < paths.length (); i++)
 {
   vec *path = paths[i];

   if ((*path)[1]->type == EDGE_COPY_SRC_JOINER_BLOCK)
+{
+ /* Attach the path to the starting edge if none is yet recorded.  */
+  if ((*path)[0]->e->aux == NULL)
+(*path)[0]->e->aux = path;
+ else if (dump_file && (dump_flags & TDF_DETAILS))
+   dump_jump_thread_path (dump_file, *path, false);
+}
+}
+  /* Second, look for paths that have any other jump thread attached to
+ them, and either finish converting them or cancel them.  */
+  for (i = 0; i < paths.length (); i++)
+{
+  vec *path = paths[i];
+  edge e = (*path)[0]->e;

Re: [PATCH] add overlap function to gcov-tool

2014-10-07 Thread Jan Hubicka
> Hi,
> 
> This patch adds overlap functionality to gcov-tool. The overlap score
> estimates the similarity of two profiles. Currently it only computes
> overlap for arc counters.
> 
> The overlap score is defined as
> \sum minimum (p1-counter[i] / p1-sum-all, p2-counter[i] / p2-sum-all)
> where p1-counter[i] and p2-counter[i] are two matched counters from
> profile1 and profile2.
> p1-sum-all and p2-sum-all are the sum-all counters in profile1 and
> profile2, respectively.

The patch looks fine in general.  My statistics are all rusty, but can't we use
one of the established techniques like Kullback-Leibler to compare the
probability distributions?  It would also be nice to have the ability to compare
branch probabilities between train runs.

Honza
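
For reference, the overlap score quoted above can be sketched in a few lines of C. This is an illustration only, not the code from the patch: overlap_score is a hypothetical name, the counters are assumed to be already matched by index, and sum-all is assumed to be the plain sum of each profile's counters.

```c
#include <stddef.h>

/* Sum over all matched counters of the smaller of the two normalized
   values.  Returns 0.0 when either profile has no counts at all.  */
double
overlap_score (const unsigned long long *p1, const unsigned long long *p2,
               size_t n)
{
  double sum1 = 0.0, sum2 = 0.0, score = 0.0;

  for (size_t i = 0; i < n; i++)
    {
      sum1 += (double) p1[i];
      sum2 += (double) p2[i];
    }
  if (sum1 == 0.0 || sum2 == 0.0)
    return 0.0;

  for (size_t i = 0; i < n; i++)
    {
      double f1 = (double) p1[i] / sum1;
      double f2 = (double) p2[i] / sum2;
      score += f1 < f2 ? f1 : f2;
    }
  return score;
}
```

Because each profile is normalized by its own total, two profiles that differ only by a constant scale factor still score as a perfect match.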
> 
> The resulting score is a value ranging from 0.0 to 1.0 where 0.0 means
> no match and 1.0 means a perfect match.
> 
> This tool can be used in performance triaging and reducing the fdo
> training set size (where similar inputs can be pruned).
> 
> Tested with spec2006 profiles.
> 
> Thanks,
> 
> -Rong

> 2014-10-07  Rong Xu  
> 
>   * gcc/gcov-tool.c (profile_overlap): New driver function
> to compute profile overlap. 
>   (print_overlap_usage_message): New.
>   (overlap_usage): New.
>   (do_overlap): New.
>   (print_usage): Add calls to overlap function.
>   (main): Ditto.
>   * libgcc/libgcov-util.c (read_gcda_file): Fix format.
>   (find_match_gcov_info): Ditto.
>   (calculate_2_entries): New.
>   (compute_one_gcov): Ditto.
>   (gcov_info_count_all_cold): Ditto.
>   (gcov_info_count_all_zero): Ditto.
>   (extract_file_basename): Ditto.
>   (get_file_basename): Ditto.
>   (set_flag): Ditto.
>   (matched_gcov_info): Ditto.
>   (calculate_overlap): Ditto.
>   (gcov_profile_overlap): Ditto.
>   * libgcc/libgcov-driver.c (compute_summary): Make
> it available for external calls.
>   * gcc/doc/gcov-tool.texi: Add documentation.
> 
> Index: gcc/gcov-tool.c
> ===
> --- gcc/gcov-tool.c   (revision 215981)
> +++ gcc/gcov-tool.c   (working copy)
> @@ -39,6 +39,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
>  #include 
>  
>  extern int gcov_profile_merge (struct gcov_info*, struct gcov_info*, int, 
> int);
> +extern int gcov_profile_overlap (struct gcov_info*, struct gcov_info*);
>  extern int gcov_profile_normalize (struct gcov_info*, gcov_type);
>  extern int gcov_profile_scale (struct gcov_info*, float, int, int);
>  extern struct gcov_info* gcov_read_profile_dir (const char*, int);
> @@ -368,6 +369,121 @@ do_rewrite (int argc, char **argv)
>return ret;
>  }
>  
> +/* Driver function to computer the overlap score b/w profile D1 and D2.
> +   Return 1 on error and 0 if OK.  */
> +
> +static int
> +profile_overlap (const char *d1, const char *d2)
> +{
> +  struct gcov_info *d1_profile;
> +  struct gcov_info *d2_profile;
> +
> +  d1_profile = gcov_read_profile_dir (d1, 0);
> +  if (!d1_profile)
> +return 1;
> +
> +  if (d2)
> +{
> +  d2_profile = gcov_read_profile_dir (d2, 0);
> +  if (!d2_profile)
> +return 1;
> +
> +  return gcov_profile_overlap (d1_profile, d2_profile);
> +}
> +
> +  return 1;
> +}
> +
> +/* Usage message for profile overlap.  */
> +
> +static void
> +print_overlap_usage_message (int error_p)
> +{
> +  FILE *file = error_p ? stderr : stdout;
> +
> +  fnotice (file, "  overlap [options] Compute the 
> overlap of two profiles\n");
> +  fnotice (file, "-v, --verbose   Verbose mode\n");
> +  fnotice (file, "-h, --hotonly   Only print info 
> for hot objects/functions\n");
> +  fnotice (file, "-f, --function  Print function 
> level info\n");
> +  fnotice (file, "-F, --fullname  Print full 
> filename\n");
> +  fnotice (file, "-o, --objectPrint object level 
> info\n");
> +  fnotice (file, "-t , --hot_threshold  Set the threshold 
> for hotness\n");
> +
> +}
> +
> +static const struct option overlap_options[] =
> +{
> +  { "verbose",no_argument,   NULL, 'v' },
> +  { "function",   no_argument,   NULL, 'f' },
> +  { "fullname",   no_argument,   NULL, 'F' },
> +  { "object", no_argument,   NULL, 'o' },
> +  { "hotonly",no_argument,   NULL, 'h' },
> +  { "hot_threshold",  required_argument, NULL, 't' },
> +  { 0, 0, 0, 0 }
> +};
> +
> +/* Print overlap usage and exit.  */
> +
> +static void
> +overlap_usage (void)
> +{
> +  fnotice (stderr, "Overlap subcomand usage:");
> +  print_overlap_usage_message (true);
> +  exit (FATAL_EXIT_CODE);
> +}
> +
> +int overlap_func_level;
> +int overlap_obj_level;
> +int overlap_hot_only;
> +int overlap_use_fullname;
> +double overlap_hot_threshold = 0.005;
> +
> +/* Driver for profile 

Re: [PATCH GCC]Improve candidate selecting in IVOPT

2014-10-07 Thread Bin.Cheng
Ping.  Any review comments?

Thanks,
bin

On Wed, Oct 1, 2014 at 6:31 AM, Sebastian Pop  wrote:
> Bin Cheng wrote:
>> Hi,
>> As analyzed in PR62178, IVOPT can't find the optimal iv set for that case.
>> The problem with current heuristic algorithm is it only replaces candidate
>> with ones not in current solution one by one, starting from small solution.
>> This patch adds another heuristic which starts from assigning the best
>> candidate for each iv use, then replaces candidate with ones in the current
>> solution.
>> Before this patch, there are two runs of find_optimal_set_1 to find the
>> optimal iv sets, we name them as set_a and set_b.  After this patch we will
>> have set_c.  At last, IVOPT chooses the best one from set_a/set_b/set_c.  To
>> prove that this patch is necessary, I collected instrumental data for gcc
>> bootstrap, spec2k, eembc and can confirm for some cases only the newly added
>> heuristic can find the optimal iv set.  The number of these cases in which
>> set_c is the optimal one is on the same level of set_b.
>> As for the compilation time, the newly added function actually is one
>> iteration of previous selection algorithm, it should be much faster than
>> previous process.
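
The starting point of the heuristic described in the quoted text (give each iv use its individually cheapest candidate) can be sketched as follows. This is a toy model under stated assumptions: costs are plain unsigned integers in a row-major table, candidate set-up costs and the later replacement step are ignored, and both function names are hypothetical rather than taken from tree-ssa-loop-ivopts.c.

```c
#include <stddef.h>

/* Index of the cheapest entry in COSTS; the first one wins on ties.
   Assumes N_CANDS is at least 1.  */
size_t
cheapest_cand (const unsigned *costs, size_t n_cands)
{
  size_t best = 0;
  for (size_t i = 1; i < n_cands; i++)
    if (costs[i] < costs[best])
      best = i;
  return best;
}

/* COST is row-major: cost[use * n_cands + cand].  Records the cheapest
   candidate of every use in CHOSEN and returns the summed use cost of
   that initial assignment.  */
unsigned
initial_best_per_use (const unsigned *cost, size_t n_uses, size_t n_cands,
                      size_t *chosen)
{
  unsigned total = 0;
  for (size_t use = 0; use < n_uses; use++)
    {
      const unsigned *row = cost + use * n_cands;
      chosen[use] = cheapest_cand (row, n_cands);
      total += row[chosen[use]];
    }
  return total;
}
```

Starting from this per-use optimum tends to select many distinct candidates, which is why the description follows it with a step that replaces candidates by ones already in the solution.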
>>
>> I also added one target dependent test case.
>> Bootstrap and test on x86_64, test on aarch64.  Any comments?
>
> I verified that the patch fixes the performance regression on intmm.  I have
> seen improvements to other benchmarks, and very small degradations that could
> very well be noise.
>
> Thanks for fixing this perf issue!
> Sebastian
>
>>
>> 2014-09-30  Bin Cheng  
>>
>>   PR tree-optimization/62178
>>   * tree-ssa-loop-ivopts.c (enum sel_type): New.
>>   (iv_ca_add_use): Add parameter RELATED_P and find the best cand
>>   for iv use if it's true.
>>   (try_add_cand_for, get_initial_solution): Change parameter ORIGINALP
>>   to SELECT_TYPE and handle it.
>>   (find_optimal_iv_set_1): Ditto.
>>   (try_prune_iv_set, find_optimal_iv_set_2): New functions.
>>   (find_optimal_iv_set): Call find_optimal_iv_set_2 and choose the
>>   best candidate set.
>>
>> gcc/testsuite/ChangeLog
>> 2014-09-30  Bin Cheng  
>>
>>   PR tree-optimization/62178
>>   * gcc.target/aarch64/pr62178.c: New test.
>
>> Index: gcc/testsuite/gcc.target/aarch64/pr62178.c
>> ===
>> --- gcc/testsuite/gcc.target/aarch64/pr62178.c(revision 0)
>> +++ gcc/testsuite/gcc.target/aarch64/pr62178.c(revision 0)
>> @@ -0,0 +1,17 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O3" } */
>> +
>> +int a[30 +1][30 +1], b[30 +1][30 +1], r[30 +1][30 +1];
>> +
>> +void Intmm (int run) {
>> +  int i, j, k;
>> +
>> +  for ( i = 1; i <= 30; i++ )
>> +for ( j = 1; j <= 30; j++ ) {
>> +  r[i][j] = 0;
>> +  for(k = 1; k <= 30; k++ )
>> +r[i][j] += a[i][k]*b[k][j];
>> +}
>> +}
>> +
>> +/* { dg-final { scan-assembler "ld1r\\t\{v\[0-9\]+\."} } */
>> Index: gcc/tree-ssa-loop-ivopts.c
>> ===
>> --- gcc/tree-ssa-loop-ivopts.c(revision 215113)
>> +++ gcc/tree-ssa-loop-ivopts.c(working copy)
>> @@ -254,6 +254,14 @@ struct iv_inv_expr_ent
>>hashval_t hash;
>>  };
>>
>> +/* Types used to start selecting the candidate for each IV use.  */
>> +enum sel_type
>> +{
>> +  SEL_ORIGINAL,  /* Start selecting from original cands.  */
>> +  SEL_IMPORTANT, /* Start selecting from important cands.  */
>> +  SEL_RELATED/* Start selecting from related cands.  */
>> +};
>> +
>>  /* The data used by the induction variable optimizations.  */
>>
>>  typedef struct iv_use *iv_use_p;
>> @@ -5417,22 +5425,51 @@ iv_ca_set_cp (struct ivopts_data *data, struct iv_
>>  }
>>
>>  /* Extend set IVS by expressing USE by some of the candidates in it
>> -   if possible.  Consider all important candidates if candidates in
>> -   set IVS don't give any result.  */
>> +   if possible.  If RELATED_P is FALSE, consider all important
>> +   candidates if candidates in set IVS don't give any result;
>> +   otherwise, try to find the best one from related or all candidates,
>> +   depending on consider_all_candidates.  */
>>
>>  static void
>>  iv_ca_add_use (struct ivopts_data *data, struct iv_ca *ivs,
>> -struct iv_use *use)
>> +struct iv_use *use, bool related_p)
>>  {
>>struct cost_pair *best_cp = NULL, *cp;
>>bitmap_iterator bi;
>>unsigned i;
>>struct iv_cand *cand;
>>
>> -  gcc_assert (ivs->upto >= use->id);
>> +  gcc_assert (ivs->upto == use->id);
>>ivs->upto++;
>>ivs->bad_uses++;
>>
>> +  if (related_p)
>> +{
>> +  if (data->consider_all_candidates)
>> + {
>> +   for (i = 0; i < n_iv_cands (data); i++)
>> + {
>> +   cand = iv_cand (data, i);
>> +   cp = get_use_iv_cost (data, use, cand);
>> +   if (cheaper_cost_pair (cp, best_cp))
>> + 

[PATCH PING]Improve induction variable elimination

2014-10-07 Thread Bin Cheng
Hi,
This patch was posted some time ago as part of a series at
https://gcc.gnu.org/ml/gcc-patches/2014-07/msg01392.html .  Since the
preceding patch has changed following review comments, and this one has gone
unreviewed for a long time, I have rebased and updated it as attached.

With this patch, spec2k/fp improves slightly on aarch64.
Bootstrapped and tested on x86_64 and x86.  I am also prepared to fix any
regressions that show up later.  Is it OK?
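
The arithmetic behind iv_nowrap_period in the patch (span = max - base + step - 1, then period = span / step) can be tried out on fixed-width integers. The following is a minimal sketch for a 32-bit unsigned induction variable; the function name is hypothetical, and returning 0 for a zero step is an assumption made here, whereas the patch asserts that the step is a constant.

```c
#include <stdint.h>

/* Number of steps an unsigned 32-bit IV starting at BASE and advancing
   by STEP can take before it would wrap: the ceiling of
   (UINT32_MAX - BASE) / STEP.  The span is computed in 64 bits so the
   "+ step - 1" rounding term cannot overflow.  */
uint64_t
nowrap_period_u32 (uint32_t base, uint32_t step)
{
  if (step == 0)
    return 0;   /* assumption: treat a zero step as "cannot decide" */

  uint64_t span = (uint64_t) UINT32_MAX - base + step - 1;
  return span / step;
}
```

For example, an IV that starts 5 below the maximum and advances by 2 takes the values max-5, max-3 and max-1 before it wraps, and the function returns 3.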

2014-09-30  Bin Cheng  

* tree-ssa-loop-ivopts.c (iv_nowrap_period)
(nowrap_cand_for_loop_niter_p): New functions.
(period_greater_niter_exit): New function refactored from
may_eliminate_iv.
(difference_cannot_overflow_p): Handle zero offset.
(iv_elimination_compare_lt): New parameter.  Check wrapping
behavior for candidate of wrapping type.  Handle folded forms
of may_be_zero expression.
(may_eliminate_iv): Call period_greater_niter_exit.  Pass new
argument for iv_elimination_compare_lt.

gcc/testsuite/ChangeLog
2014-09-30  Bin Cheng  

* gcc.dg/tree-ssa/ivopts-lt-3.c: New test.
* gcc.dg/tree-ssa/ivopts-lt-4.c: New test.

Index: gcc/tree-ssa-loop-ivopts.c
===
--- gcc/tree-ssa-loop-ivopts.c  (revision 215108)
+++ gcc/tree-ssa-loop-ivopts.c  (working copy)
@@ -4451,6 +4451,44 @@ iv_period (struct iv *iv)
   return period;
 }
 
+/* Returns no wrapping period of induction variable IV.  For now
+   only unsigned type IV is handled, we could extend it in case
+   of non-overflow for signed ones.  Return zero if it can't be
+   decided.  */
+
+static tree
+iv_nowrap_period (struct iv *iv)
+{
+  bool overflow;
+  tree type;
+  tree base = iv->base, step = iv->step;
+  widest_int base_val, step_val, max_val, span, period;
+
+  gcc_assert (step && TREE_CODE (step) == INTEGER_CST);
+
+  type = TREE_TYPE (base);
+  if (!TYPE_UNSIGNED (type) || TREE_CODE (base) != INTEGER_CST)
+return integer_zero_node;
+
+  base_val = wi::to_widest (base);
+  step_val = wi::to_widest (step);
+  if (!POINTER_TYPE_P (type) && TYPE_MAX_VALUE (type)
+  && TREE_CODE (TYPE_MAX_VALUE (type)) == INTEGER_CST)
+max_val = wi::to_widest (TYPE_MAX_VALUE (type));
+  else
+{
+  wide_int max_wi = wi::max_value (TYPE_PRECISION (type), UNSIGNED);
+  max_val = wi::to_widest (wide_int_to_tree (type, max_wi));
+}
+
+  span = max_val - base_val + step_val - 1;
+  period = wi::div_trunc (span, step_val, UNSIGNED, &overflow);
+  if (overflow)
+return integer_zero_node;
+
+  return wide_int_to_tree (type, period);
+}
+
 /* Returns the comparison operator used when eliminating the iv USE.  */
 
 static enum tree_code
@@ -4483,6 +4521,10 @@ difference_cannot_overflow_p (struct ivopts_data *
   tree e1, e2;
   aff_tree aff_e1, aff_e2, aff_offset;
 
+  /* No overflow if offset is zero.  */
+  if (integer_zerop (offset))
+return true;
+
   if (!nowrap_type_p (TREE_TYPE (base)))
 return false;
 
@@ -4538,7 +4580,84 @@ difference_cannot_overflow_p (struct ivopts_data *
 }
 }
 
-/* Tries to replace loop exit by one formulated in terms of a LT_EXPR
+/* Check whether PERIOD of CAND is greater than the number of iterations
+   described by DESC for which the exit condition is true.  The exit
+   condition is comparison against USE.  */
+
+static bool
+period_greater_niter_exit (struct ivopts_data *data,
+  struct iv_use *use, struct iv_cand *cand,
+  tree period, struct tree_niter_desc *desc)
+{
+  struct loop *loop = data->current_loop;
+
+  /* If the number of iterations is constant, compare against it directly.  */
+  if (TREE_CODE (desc->niter) == INTEGER_CST)
+{
+  /* See cand_value_at.  */
+  if (stmt_after_increment (loop, cand, use->stmt))
+{
+  if (!tree_int_cst_lt (desc->niter, period))
+return false;
+}
+  else
+{
+  if (tree_int_cst_lt (period, desc->niter))
+return false;
+}
+}
+
+  /* If not, and if this is the only possible exit of the loop, see whether
+ we can get a conservative estimate on the number of iterations of the
+ entire loop and compare against that instead.  */
+  else
+{
+  widest_int period_value, max_niter;
+
+  max_niter = desc->max;
+  if (stmt_after_increment (loop, cand, use->stmt))
+max_niter += 1;
+  period_value = wi::to_widest (period);
+  if (wi::gtu_p (max_niter, period_value))
+{
+  /* See if we can take advantage of inferred loop bound information.  */
+  if (data->loop_single_exit_p)
+{
+  if (!max_loop_iterations (loop, &max_niter))
+return false;
+  /* The loop bound is already adjusted by adding 1.  */
+  if (wi::gtu_p (max_niter, period_value))
+return false;
+}
+  else
+ 

[PATCH] Fix PR63259: bswap not recognized when finishing with rotation

2014-10-07 Thread Thomas Preud'homme
Currently the bswap pass only looks for bswap patterns by examining bitwise
OR statements and following their def-use chains. However, a rotation
(left or right) can finish a manual byteswap, as shown in the following example:

unsigned
byteswap_ending_with_rotation (unsigned in)
{
in = ((in & 0xff00ff00) >>  8) | ((in & 0x00ff00ff) <<  8);
in = ((in & 0xffff0000) >> 16) | ((in & 0x0000ffff) << 16);
return in;
}

which is compiled into:

byteswap_ending_with_rotation (unsigned int in)
{
  unsigned int _2;
  unsigned int _3;
  unsigned int _4;
  unsigned int _5;

  :
  _2 = in_1(D) & 4278255360;
  _3 = _2 >> 8;
  _4 = in_1(D) & 16711935;
  _5 = _4 << 8;
  in_6 = _5 | _3;
  in_7 = in_6 r>> 16;
  return in_7;

}

This patch adds rotations (left and right) to the list of statements to consider
for byte swap.
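
As a minimal sketch of why the rotation form is still a byte swap, the
function from the example can be checked against a plain byte-by-byte
reference. The 16-bit masks are written out here as 0xffff0000 and
0x0000ffff, which is an assumption consistent with the "r>> 16" in the
dump above; the names swap_ending_with_rotation and swap_reference are
illustrative only and are not part of the patch.

```c
#include <stdint.h>

/* Manual byte swap whose second step GCC folds into a rotation by 16.
   The masks in the second statement are assumed (see lead-in).  */
uint32_t
swap_ending_with_rotation (uint32_t in)
{
  in = ((in & 0xff00ff00u) >> 8) | ((in & 0x00ff00ffu) << 8);
  in = ((in & 0xffff0000u) >> 16) | ((in & 0x0000ffffu) << 16);
  return in;
}

/* Straightforward reference: move each byte to its mirrored position.  */
uint32_t
swap_reference (uint32_t in)
{
  return ((in & 0x000000ffu) << 24)
         | ((in & 0x0000ff00u) << 8)
         | ((in & 0x00ff0000u) >> 8)
         | ((in & 0xff000000u) >> 24);
}
```

Both functions map 0x12345678 to 0x78563412, which is the result a 32-bit
bswap is expected to produce.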

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2014-09-30  Thomas Preud'homme  

PR tree-optimization/63259
* tree-ssa-math-opts.c (pass_optimize_bswap::execute): Also consider
bswap in LROTATE_EXPR and RROTATE_EXPR statements.

*** gcc/testsuite/ChangeLog ***

2014-09-30  Thomas Preud'homme  

PR tree-optimization/63259
* optimize-bswapsi-1.c (swap32_e): New bswap pass test.


diff --git a/gcc/testsuite/gcc.dg/optimize-bswapsi-1.c b/gcc/testsuite/gcc.dg/optimize-bswapsi-1.c
index 580e6e0..d4b5740 100644
--- a/gcc/testsuite/gcc.dg/optimize-bswapsi-1.c
+++ b/gcc/testsuite/gcc.dg/optimize-bswapsi-1.c
@@ -64,5 +64,16 @@ swap32_d (SItype in)
 | (((in >> 24) & 0xFF) << 0);
 }
 
-/* { dg-final { scan-tree-dump-times "32 bit bswap implementation found at" 4 "bswap" } } */
+/* This variant comes from PR63259.  It compiles to a gimple sequence that ends
+   with a rotation instead of a bitwise OR.  */
+
+unsigned
+swap32_e (unsigned in)
+{
+  in = ((in & 0xff00ff00) >>  8) | ((in & 0x00ff00ff) <<  8);
+  in = ((in & 0xffff0000) >> 16) | ((in & 0x0000ffff) << 16);
+  return in;
+}
+
+/* { dg-final { scan-tree-dump-times "32 bit bswap implementation found at" 5 "bswap" } } */
 /* { dg-final { cleanup-tree-dump "bswap" } } */
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index 3c6e935..2023f2e 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2377,11 +2377,16 @@ pass_optimize_bswap::execute (function *fun)
 {
  gimple src_stmt, cur_stmt = gsi_stmt (gsi);
  tree fndecl = NULL_TREE, bswap_type = NULL_TREE, load_type;
+ enum tree_code code;
  struct symbolic_number n;
  bool bswap;
 
- if (!is_gimple_assign (cur_stmt)
- || gimple_assign_rhs_code (cur_stmt) != BIT_IOR_EXPR)
+ if (!is_gimple_assign (cur_stmt))
+   continue;
+
+ code = gimple_assign_rhs_code (cur_stmt);
+ if (code != BIT_IOR_EXPR && code != LROTATE_EXPR
+ && code != RROTATE_EXPR)
continue;
 
  src_stmt = find_bswap_or_nop (cur_stmt, &n, &bswap);

Testing was done by running the testsuite on an arm-none-eabi target with QEMU
emulating Cortex-M3: no regressions were found. Due to the potential increase
in compilation time, a bootstrap with a sequential build (no -j option when
calling make) and with default options was made with and without the patch.
The results show no increase in compilation time:

r215662 with patch:
make  6167.48s user 401.03s system 99% cpu 1:49:52.07 total

r215662 without patch:
make  6136.63s user 400.32s system 99% cpu 1:49:27.28 total

Is it ok for trunk?

Best regards,

Thomas Preud'homme





[committed] MAINTAINERS (Write After Approval): Add myself.

2014-10-07 Thread Felix Yang
Index: MAINTAINERS
===
--- MAINTAINERS(revision 215985)
+++ MAINTAINERS(working copy)
@@ -583,6 +583,7 @@ Chung-Ju Wu
 Le-Chun Wu
 Mingjie Xing
 Canqun Yang
+Fei Yang
 Jeffrey Yasskin
 Joey Ye
 Greta Yorsh


Cheers,
Felix


Re: [Fortran, Patch] Implement IMPLICIT NONE

2014-10-07 Thread Dominique Dhumieres
Patch:

--- ../_clean/gcc/testsuite/gfortran.dg/implicit_4.f90  2014-10-07 00:21:56.0 +0200
+++ gcc/testsuite/gfortran.dg/implicit_4.f90  2014-10-07 19:09:45.0 +0200
@@ -6,7 +6,7 @@ END
 
 SUBROUTINE a
 IMPLICIT REAL(b-j)
-implicit none  ! { dg-error "Type IMPLICIT NONE statement at .1. following an IMPLICIT statement" }
+implicit none  ! { dg-error "IMPLICIT NONE .type. statement at .1. following an IMPLICIT statement" }
 END SUBROUTINE a
 
 subroutine b

Note that the loci are badly placed:

/opt/gcc/work/gcc/testsuite/gfortran.dg/implicit_4.f90:4.40:

IMPLICIT NONE ! { dg-error "Duplicate" }
1
Error: Duplicate IMPLICIT NONE statement at (1)
/opt/gcc/work/gcc/testsuite/gfortran.dg/implicit_4.f90:9.105:

r "IMPLICIT NONE .type. statement at .1. following an IMPLICIT statement" }
   1
Error: IMPLICIT NONE (type) statement at (1) following an IMPLICIT statement
/opt/gcc/work/gcc/testsuite/gfortran.dg/implicit_4.f90:14.8:

implicit real(g-k) ! { dg-error "IMPLICIT statement at .1. following an IMPLICI
1
Error: IMPLICIT statement at (1) following an IMPLICIT NONE (type) statement
/opt/gcc/work/gcc/testsuite/gfortran.dg/implicit_4.f90:19.47:

implicit integer (b-c) ! { dg-error "already" }
   1
Error: Letter B already has an IMPLICIT type at (1)
/opt/gcc/work/gcc/testsuite/gfortran.dg/implicit_4.f90:20.57:

implicit real(d-f), complex(f-g) ! { dg-error "already" }
 1
Error: Letter F already has an IMPLICIT type at (1)

i.e., at the end of the comment and not where the error occurs.

Dominique


Re: RFA: fix mode confusion in caller-save.c:replace_reg_with_saved_mem

2014-10-07 Thread Joern Rennecke
On 7 October 2014 18:38, Jeff Law  wrote:
> On 10/06/14 20:57, Joern Rennecke wrote:
>>
>> On 6 October 2014 19:58, Jeff Law  wrote:
>>>
>>> What makes word_mode special here?  ie, why is special casing for
>>> word_mode
>>> the right thing to do?
>>
>>
>> The patch does not special-case word mode.  The if condition tests if
>> smode would
>> cover multiple hard registers.
>> If that would be the case, smode is replaced with word_mode.
>
> SO I'll ask another way.  Why do you want to change smode to word_mode?

Because SImode covers four hard registers, whereas the intention is to
have a single one.

(concatn:SI [
(reg:SI 18 r18)
(reg:SI 19 r19)
(mem/c:QI (plus:HI (reg/f:HI 28 r28)
(const_int 43 [0x2b])) [6  S1 A8])
(mem/c:QI (plus:HI (reg/f:HI 28 r28)
(const_int 44 [0x2c])) [6  S1 A8])
])

(see original post) is invalid RTL, and thus the cause of the later ICE.


[Patch, testsuite] check if -shared is supported

2014-10-07 Thread Christophe Lyon
Hi,

When Jason added the new g++.dg/ipa/devirt-28a.C test along with his
fix for PR c++/58678
(https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00838.html), this new
test was failing in the ARM and AArch64 configurations I am testing.

For the arm*-none-eabi and aarch64*-none-elf configurations, this was
simply because -shared is not supported by these targets. The attached
patch adds support to test availability of this option, similarly to
what is done for -fpic.

For the record, for the arm*linux configurations, the test was also
failing because testglue.o contained relocations incompatible with
-shared. I managed to have them work by adding
set_board_info wrap_compile_flags "-mword-relocations"
to my .exp dejagnu configuration.

In summary, this patch makes devirt-28a.C:
- PASS on arm*linux*
- UNSUPPORTED on arm*-none-eabi and aarch64*-none-elf
instead of FAIL.

Is it OK for trunk, and 4.9 (since Jason's patch was also committed to 4.9) ?

2014-10-08  Christophe Lyon  

* lib/target-supports.exp (check_effective_target_shared): New
function.
* g++.dg/ipa/devirt-28a.C: Check if -shared is supported.

Thanks,

Christophe.
diff --git a/gcc/testsuite/g++.dg/ipa/devirt-28a.C b/gcc/testsuite/g++.dg/ipa/devirt-28a.C
index bdd1682..65d5fcd 100644
--- a/gcc/testsuite/g++.dg/ipa/devirt-28a.C
+++ b/gcc/testsuite/g++.dg/ipa/devirt-28a.C
@@ -1,6 +1,6 @@
 // PR c++/58678
 // { dg-options "-O3 -flto -shared -fPIC -Wl,--no-undefined" }
-// { dg-do link { target { gld && fpic } } }
+// { dg-do link { target { { gld && fpic } && shared } } }
 
 struct A {
   virtual ~A();
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 77e45cb..7ae6161 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -840,6 +840,19 @@ proc check_effective_target_fpic { } {
 return 0
 }
 
+# Return 1 if -shared is supported, as in no warnings or errors
+# emitted, 0 otherwise.
+
+proc check_effective_target_shared { } {
+# Note that M68K has a multilib that supports -fpic but not
+# -fPIC, so we need to check both.  We test with a program that
+# requires GOT references.
+return [check_no_compiler_messages shared executable {
+	extern int foo (void); extern int bar;
+	int baz (void) { return foo () + bar; }
+} "-shared -fpic"]
+}
+
 # Return 1 if -pie, -fpie and -fPIE are supported, 0 otherwise.
 
 proc check_effective_target_pie { } {


Re: C++ Patch for c++/60894

2014-10-07 Thread Jason Merrill

On 09/24/2014 05:15 PM, Jason Merrill wrote:

On 09/24/2014 05:06 PM, Fabien Chêne wrote:

Unfortunately, just stripping the USING_DECL in lookup_and_check_tag
does not really work because some diagnostic code expects the
USING_DECL not to be stripped.


It seems to me that the problem is that lookup_and_check_tag is 
rejecting a USING_DECL rather than returning it.  What if we return the 
USING_DECL?


Jason




Re: sort_heap complexity guarantee

2014-10-07 Thread Daniel Krügler
2014-10-07 23:11 GMT+02:00 François Dumont :
> On 06/10/2014 23:05, Daniel Krügler wrote:
>> François, could you please submit a corresponding LWG issue by sending
>> an email using the recipe described here:
>>
>> http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-active.html#submit_issue
>>
>> ?
>>
> I just did, requesting to use 2N log(N).
>
> And is it ok to commit those ?

Looks fine to me - Thanks!

- Daniel


Re: sort_heap complexity guarantee

2014-10-07 Thread François Dumont

On 06/10/2014 23:05, Daniel Krügler wrote:

2014-10-06 23:00 GMT+02:00 François Dumont :

On 05/10/2014 22:54, Marc Glisse wrote:

On Sun, 5 Oct 2014, François Dumont wrote:


I took a look at PR 61217 regarding the pop_heap complexity guarantee.
It looks like we have no tests to check the complexity of our algorithms, so I
started writing some, beginning with the heap operations. I found no issue with
make_heap, push_heap and pop_heap, despite what the bug report says;
however, the attached testcase for sort_heap is failing.

The Standard says std::sort_heap shall use less than N * log(N)
comparisons, but with my test using 1000 random values the test shows:

8687 comparisons on 6907.76 max allowed

Is this a known issue with sort_heap? Can you confirm that the test is
valid?

I would first look for confirmation that the standard didn't just forget a
big-O or something. I would expect an implementation as n calls to pop_heap
to be legal, and if pop_heap makes 2*log(n) comparisons, that naively sums
to too much. And I don't expect the standard to contain an advanced
amortized analysis or anything like that...


Good point: with n calls to pop_heap the limit must be 2*log(1) +
2*log(2) + ... + 2*log(n), which is 2*log(n!) and is also necessarily <
2*n*log(n). I guess the Standard committee forgot the factor 2 in the
limit, so this is what I am using as the limit in the final test, unless someone
prefers the stricter 2*log(n!)?
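
The bound discussed above can be written out as a short derivation. This
is only a sketch of the argument in the thread: it assumes one pop_heap
call per element costing at most 2*log(k) comparisons on a heap of size
k, and it assumes natural logarithms, since 1000*ln(1000) is about
6907.76, which matches the limit printed by the test.

```latex
\sum_{k=1}^{n} 2\log k \;=\; 2\log(n!) \;<\; 2\,n\log n \qquad (n \ge 2)
```

For n = 1000 this gives 2*log(n!) of roughly 11824 and 2*n*log(n) of
roughly 13816, so the observed 8687 comparisons exceed n*log(n) but stay
below both candidate limits.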

François, could you please submit a corresponding LWG issue by sending
an email using the recipe described here:

http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-active.html#submit_issue

?


I just did, requesting to use 2N log(N).

And is it ok to commit those ?

François



Towards GNU11

2014-10-07 Thread Marek Polacek
Hi!

I'd like to kick off a discussion about moving the default standard
for C from gnu89 to gnu11.

This really shouldn't be much of a surprise: the docs have mentioned for a
year now that gnu11 is the intended future default.  I would presume now
is a good time to make this move: together with the new naming scheme
this should make GCC more modern (C89 really is as old as the hills).
And we're still in stage1.

Prerequisites should be largely complete at this point:
- we have -Wc90-c99-compat option that warns about features not present
  in ISO C90, but present in ISO C99,
- we have -Wc99-c11-compat option that warns about features not present
  in ISO C99, but present in ISO C11,
- the testsuite has been adjusted so all the tests that pass with gnu89
  default should pass with gnu11 default as well (see my recent batch
  of cleanup patches).  This unfortunately isn't correct for all archs,
  I just don't have enough resources to test everything.  But generally
  the fallout from moving to gnu11 is easy to fix: just add proper decls
  and return types (to fix defaulting to int), or for inline stuff use
  -fgnu89-inline/gnu_inline attribute.  I'd appreciate testing on other
  architectures than x86_64/ppc64.

The things I had to fix in the testsuite nicely reflect what we can expect
in real life: mostly a bunch of new warnings about missing declarations
and defaulting to int (this is probably going to be a pain with -Werror,
but I feel that people really should write proper declarations), different
inline semantics (in C99 semantics, the TU has to have the body of the inline
function etc.), new "return with no value, in function returning non-void"
warnings.  Different rules for constant expressions, the fact that in C90
non-lvalue arrays do not decay to pointers, and slightly different rules for
compatible types (?) might come into play as well.
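
To illustrate the kind of fallout described above, here is a small sketch
of code written so that it should behave the same under gnu89 and gnu11.
The function names are purely illustrative; the point is the explicit
prototype, the explicit return type and the use of "static inline".

```c
/* Under gnu89 an undeclared function is implicitly declared and a
   missing return type defaults to int; with gnu11 both draw warnings.
   Writing the prototype and the return type is valid in either mode.  */
int add_one (int x);

/* "static inline" has the same meaning under the gnu89 and the C99/C11
   inline rules, so it sidesteps the inline differences.  */
static inline int
twice (int x)
{
  return 2 * x;
}

int
add_one (int x)
{
  return x + 1;
}

/* Report which C standard the translation unit was compiled as;
   0 means __STDC_VERSION__ is not defined (typically the C90 modes).  */
long
c_standard_version (void)
{
#ifdef __STDC_VERSION__
  return __STDC_VERSION__;
#else
  return 0L;
#endif
}
```

Code that relies on the old extern inline behaviour would instead need
-fgnu89-inline or the gnu_inline attribute, as noted in the list above.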

In turn, you can use all C99 and C11 features even with -pedantic.

Comments?

Regtested/bootstrapped on powerpc64-linux and x86_64-linux.

2014-10-07  Marek Polacek  

* doc/invoke.texi: Update to reflect that GNU11 is the default
mode for C.
* c-common.h (c_language_kind): Update comment.
c-family/
* c-opts.c (c_common_init_options): Make -std=gnu11 the default for C.

diff --git gcc/c-family/c-common.h gcc/c-family/c-common.h
index 1e3477f..a895084 100644
--- gcc/c-family/c-common.h
+++ gcc/c-family/c-common.h
@@ -445,7 +445,7 @@ struct GTY(()) sorted_fields_type {
 
 typedef enum c_language_kind
 {
-  clk_c= 0,/* C90, C94 or C99 */
+  clk_c= 0,/* C90, C94, C99 or C11 */
   clk_objc = 1,/* clk_c with ObjC features.  */
   clk_cxx  = 2,/* ANSI/ISO C++ */
   clk_objcxx   = 3 /* clk_cxx with ObjC features.  */
diff --git gcc/c-family/c-opts.c gcc/c-family/c-opts.c
index 3f295d8..eb078e3 100644
--- gcc/c-family/c-opts.c
+++ gcc/c-family/c-opts.c
@@ -250,6 +250,9 @@ c_common_init_options (unsigned int decoded_options_count,
 
   if (c_language == clk_c)
 {
+  /* The default for C is gnu11.  */
+  set_std_c11 (false /* ISO */);
+
   /* If preprocessing assembly language, accept any of the C-family
 front end options since the driver may pass them through.  */
   for (i = 1; i < decoded_options_count; i++)
diff --git gcc/doc/invoke.texi gcc/doc/invoke.texi
index 5fe7e15..fa84ed4 100644
--- gcc/doc/invoke.texi
+++ gcc/doc/invoke.texi
@@ -1692,8 +1692,7 @@ interfaces) and L (Analyzability).  The name @samp{c1x} is deprecated.
 
 @item gnu90
 @itemx gnu89
-GNU dialect of ISO C90 (including some C99 features). This
-is the default for C code.
+GNU dialect of ISO C90 (including some C99 features).
 
 @item gnu99
 @itemx gnu9x
@@ -1701,8 +1700,8 @@ GNU dialect of ISO C99.  The name @samp{gnu9x} is deprecated.
 
 @item gnu11
 @itemx gnu1x
-GNU dialect of ISO C11.  This is intended to become the default in a
-future release of GCC.  The name @samp{gnu1x} is deprecated.
+GNU dialect of ISO C11.  This is the default for C code.
+The name @samp{gnu1x} is deprecated.
 
 @item c++98
 @itemx c++03

Marek


[Google 4.9] Backport of r210828

2014-10-07 Thread Sterling Augustine
The enclosed patch for google 4.9 is a backport of r210828 from
trunk.

googleref:b/14623977

The given tests now pass when run by hand, but time out under dejagnu.
I will be sending a different change to fix that.

OK for google 4.9?

Index: gcc/config/aarch64/aarch64-builtins.c
===
--- gcc/config/aarch64/aarch64-builtins.c   (revision 215958)
+++ gcc/config/aarch64/aarch64-builtins.c   (working copy)
@@ -371,6 +371,12 @@ static aarch64_simd_builtin_datum aarch64_simd_bui
 enum aarch64_builtins
 {
   AARCH64_BUILTIN_MIN,
+
+  AARCH64_BUILTIN_GET_FPCR,
+  AARCH64_BUILTIN_SET_FPCR,
+  AARCH64_BUILTIN_GET_FPSR,
+  AARCH64_BUILTIN_SET_FPSR,
+
   AARCH64_SIMD_BUILTIN_BASE,
 #include "aarch64-simd-builtins.def"
   AARCH64_SIMD_BUILTIN_MAX = AARCH64_SIMD_BUILTIN_BASE
@@ -752,6 +758,24 @@ aarch64_init_simd_builtins (void)
 void
 aarch64_init_builtins (void)
 {
+  tree ftype_set_fpr
+= build_function_type_list (void_type_node, unsigned_type_node, NULL);
+  tree ftype_get_fpr
+= build_function_type_list (unsigned_type_node, NULL);
+
+  aarch64_builtin_decls[AARCH64_BUILTIN_GET_FPCR]
+= add_builtin_function ("__builtin_aarch64_get_fpcr", ftype_get_fpr,
+   AARCH64_BUILTIN_GET_FPCR, BUILT_IN_MD, NULL, NULL_TREE);
+  aarch64_builtin_decls[AARCH64_BUILTIN_SET_FPCR]
+= add_builtin_function ("__builtin_aarch64_set_fpcr", ftype_set_fpr,
+   AARCH64_BUILTIN_SET_FPCR, BUILT_IN_MD, NULL, NULL_TREE);
+  aarch64_builtin_decls[AARCH64_BUILTIN_GET_FPSR]
+= add_builtin_function ("__builtin_aarch64_get_fpsr", ftype_get_fpr,
+   AARCH64_BUILTIN_GET_FPSR, BUILT_IN_MD, NULL, NULL_TREE);
+  aarch64_builtin_decls[AARCH64_BUILTIN_SET_FPSR]
+= add_builtin_function ("__builtin_aarch64_set_fpsr", ftype_set_fpr,
+   AARCH64_BUILTIN_SET_FPSR, BUILT_IN_MD, NULL, NULL_TREE);
+
   if (TARGET_SIMD)
 aarch64_init_simd_builtins ();
 }
@@ -964,7 +988,37 @@ aarch64_expand_builtin (tree exp,
 {
   tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0);
   int fcode = DECL_FUNCTION_CODE (fndecl);
+  int icode;
+  rtx pat, op0;
+  tree arg0;
 
+  switch (fcode)
+{
+case AARCH64_BUILTIN_GET_FPCR:
+case AARCH64_BUILTIN_SET_FPCR:
+case AARCH64_BUILTIN_GET_FPSR:
+case AARCH64_BUILTIN_SET_FPSR:
+  if ((fcode == AARCH64_BUILTIN_GET_FPCR)
+ || (fcode == AARCH64_BUILTIN_GET_FPSR))
+   {
+ icode = (fcode == AARCH64_BUILTIN_GET_FPSR) ?
+   CODE_FOR_get_fpsr : CODE_FOR_get_fpcr;
+ target = gen_reg_rtx (SImode);
+ pat = GEN_FCN (icode) (target);
+   }
+  else
+   {
+ target = NULL_RTX;
+ icode = (fcode == AARCH64_BUILTIN_SET_FPSR) ?
+   CODE_FOR_set_fpsr : CODE_FOR_set_fpcr;
+ arg0 = CALL_EXPR_ARG (exp, 0);
+ op0 = expand_normal (arg0);
+ pat = GEN_FCN (icode) (op0);
+   }
+  emit_insn (pat);
+  return target;
+}
+
   if (fcode >= AARCH64_SIMD_BUILTIN_BASE)
 return aarch64_simd_expand_builtin (fcode, exp, target);
 
@@ -1196,6 +1250,106 @@ aarch64_gimple_fold_builtin (gimple_stmt_iterator
   return changed;
 }
 
+void
+aarch64_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
+{
+  const unsigned AARCH64_FE_INVALID = 1;
+  const unsigned AARCH64_FE_DIVBYZERO = 2;
+  const unsigned AARCH64_FE_OVERFLOW = 4;
+  const unsigned AARCH64_FE_UNDERFLOW = 8;
+  const unsigned AARCH64_FE_INEXACT = 16;
+  const unsigned HOST_WIDE_INT AARCH64_FE_ALL_EXCEPT = (AARCH64_FE_INVALID
+   | AARCH64_FE_DIVBYZERO
+   | AARCH64_FE_OVERFLOW
+   | AARCH64_FE_UNDERFLOW
+   | AARCH64_FE_INEXACT);
+  const unsigned HOST_WIDE_INT AARCH64_FE_EXCEPT_SHIFT = 8;
+  tree fenv_cr, fenv_sr, get_fpcr, set_fpcr, mask_cr, mask_sr;
+  tree ld_fenv_cr, ld_fenv_sr, masked_fenv_cr, masked_fenv_sr, hold_fnclex_cr;
+  tree hold_fnclex_sr, new_fenv_var, reload_fenv, restore_fnenv, get_fpsr, set_fpsr;
+  tree update_call, atomic_feraiseexcept, hold_fnclex, masked_fenv, ld_fenv;
+
+  /* Generate the equivalence of :
+   unsigned int fenv_cr;
+   fenv_cr = __builtin_aarch64_get_fpcr ();
+
+   unsigned int fenv_sr;
+   fenv_sr = __builtin_aarch64_get_fpsr ();
+
+   Now set all exceptions to non-stop
+   unsigned int mask_cr
+   = ~(AARCH64_FE_ALL_EXCEPT << AARCH64_FE_EXCEPT_SHIFT);
+   unsigned int masked_cr;
+   masked_cr = fenv_cr & mask_cr;
+
+   And clear all exception flags
+ 

[PATCH] add overlap function to gcov-tool

2014-10-07 Thread Rong Xu
Hi,

This patch adds overlap functionality to gcov-tool. The overlap score
estimates the similarity of two profiles. Currently it only computes
overlap for arc counters.

The overlap score is defined as
\sum minimum (p1-counter[i] / p1-sum-all, p2-counter[i] / p2-sum-all)
where p1-counter[i] and p2-counter[i] are two matched counters from
profile1 and profile2.
p1-sum-all and p2-sum-all are the sum-all counters in profile1 and
profile2, respectively.

The resulting score is a value ranging from 0.0 to 1.0 where 0.0 means
no match and 1.0 means a perfect match.
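
The definition above can be sketched in a few lines of C. This is not the
code from the patch: overlap_score is a hypothetical helper, the counters
are assumed to be already matched by index, and the sum-all values are
simply computed from the arrays passed in rather than taken from the
profile summary.

```c
#include <stddef.h>

/* Hypothetical helper, not the gcov-tool implementation: sum over all
   matched counters the smaller of the two normalized counter values.
   Returns 0.0 if either profile has no counts at all.  */
double
overlap_score (const unsigned long long *p1,
               const unsigned long long *p2, size_t n)
{
  double sum1 = 0.0, sum2 = 0.0, score = 0.0;
  size_t i;

  /* Stand-ins for p1-sum-all and p2-sum-all.  */
  for (i = 0; i < n; i++)
    {
      sum1 += (double) p1[i];
      sum2 += (double) p2[i];
    }
  if (sum1 == 0.0 || sum2 == 0.0)
    return 0.0;

  for (i = 0; i < n; i++)
    {
      double a = (double) p1[i] / sum1;
      double b = (double) p2[i] / sum2;
      score += a < b ? a : b;
    }
  return score;
}
```

Two profiles that differ only by a constant scale factor score 1.0 under
this definition, and two profiles with no counter in common score 0.0.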

This tool can be used in performance triaging and reducing the fdo
training set size (where similar inputs can be pruned).

Tested with spec2006 profiles.

Thanks,

-Rong
2014-10-07  Rong Xu  

* gcc/gcov-tool.c (profile_overlap): New driver function
to compute profile overlap. 
(print_overlap_usage_message): New.
(overlap_usage): New.
(do_overlap): New.
(print_usage): Add calls to overlap function.
(main): Ditto.
* libgcc/libgcov-util.c (read_gcda_file): Fix format.
(find_match_gcov_info): Ditto.
(calculate_2_entries): New.
(compute_one_gcov): Ditto.
(gcov_info_count_all_cold): Ditto.
(gcov_info_count_all_zero): Ditto.
(extract_file_basename): Ditto.
(get_file_basename): Ditto.
(set_flag): Ditto.
(matched_gcov_info): Ditto.
(calculate_overlap): Ditto.
(gcov_profile_overlap): Ditto.
* libgcc/libgcov-driver.c (compute_summary): Make
it available for external calls.
* gcc/doc/gcov-tool.texi: Add documentation.

Index: gcc/gcov-tool.c
===
--- gcc/gcov-tool.c (revision 215981)
+++ gcc/gcov-tool.c (working copy)
@@ -39,6 +39,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 #include 
 
 extern int gcov_profile_merge (struct gcov_info*, struct gcov_info*, int, int);
+extern int gcov_profile_overlap (struct gcov_info*, struct gcov_info*);
 extern int gcov_profile_normalize (struct gcov_info*, gcov_type);
 extern int gcov_profile_scale (struct gcov_info*, float, int, int);
 extern struct gcov_info* gcov_read_profile_dir (const char*, int);
@@ -368,6 +369,121 @@ do_rewrite (int argc, char **argv)
   return ret;
 }
 
+/* Driver function to compute the overlap score between profiles D1 and D2.
+   Return 1 on error and 0 if OK.  */
+
+static int
+profile_overlap (const char *d1, const char *d2)
+{
+  struct gcov_info *d1_profile;
+  struct gcov_info *d2_profile;
+
+  d1_profile = gcov_read_profile_dir (d1, 0);
+  if (!d1_profile)
+return 1;
+
+  if (d2)
+{
+  d2_profile = gcov_read_profile_dir (d2, 0);
+  if (!d2_profile)
+return 1;
+
+  return gcov_profile_overlap (d1_profile, d2_profile);
+}
+
+  return 1;
+}
+
+/* Usage message for profile overlap.  */
+
+static void
+print_overlap_usage_message (int error_p)
+{
+  FILE *file = error_p ? stderr : stdout;
+
+  fnotice (file, "  overlap [options] Compute the overlap of two profiles\n");
+  fnotice (file, "-v, --verbose   Verbose mode\n");
+  fnotice (file, "-h, --hotonly   Only print info for hot objects/functions\n");
+  fnotice (file, "-f, --function  Print function level info\n");
+  fnotice (file, "-F, --fullname  Print full filename\n");
+  fnotice (file, "-o, --object    Print object level info\n");
+  fnotice (file, "-t , --hot_threshold  Set the threshold for hotness\n");
+
+}
+
+static const struct option overlap_options[] =
+{
+  { "verbose",no_argument,   NULL, 'v' },
+  { "function",   no_argument,   NULL, 'f' },
+  { "fullname",   no_argument,   NULL, 'F' },
+  { "object", no_argument,   NULL, 'o' },
+  { "hotonly",no_argument,   NULL, 'h' },
+  { "hot_threshold",  required_argument, NULL, 't' },
+  { 0, 0, 0, 0 }
+};
+
+/* Print overlap usage and exit.  */
+
+static void
+overlap_usage (void)
+{
+  fnotice (stderr, "Overlap subcomand usage:");
+  print_overlap_usage_message (true);
+  exit (FATAL_EXIT_CODE);
+}
+
+int overlap_func_level;
+int overlap_obj_level;
+int overlap_hot_only;
+int overlap_use_fullname;
+double overlap_hot_threshold = 0.005;
+
+/* Driver for profile overlap sub-command.  */
+
+static int
+do_overlap (int argc, char **argv)
+{
+  int opt;
+  int ret;
+
+  optind = 0;
+  while ((opt = getopt_long (argc, argv, "vfFoht:", overlap_options, NULL)) != -1)
+{
+  switch (opt)
+{
+case 'v':
+  verbose = true;
+  gcov_set_verbose ();
+  break;
+case 'f':
+  overlap_func_level = 1;
+  break;
+case 'F':
+  overlap_use_fullname = 1;
+  break;
+case 'o':
+  overla

Re: [PATCH 5/n] OpenMP 4.0 offloading infrastructure: libgomp

2014-10-07 Thread Jakub Jelinek
On Tue, Oct 07, 2014 at 10:12:22PM +0400, Ilya Verbin wrote:
> > And, is __gnu_offload_{funcs,vars} named that way just because the plugin
> > isn't able to add symbols around the sections for you?  As it doesn't
> > contain a dot, it would collide with user declarations put into
> > __attribute__((section ("__gnu_offload_funcs"))).
> 
> Renamed to .gnu.offload_{funcs,vars}.
> Automatically provided symbols __start__*, __stop__* don't work with shared
> libraries, since the symbols from exec override the respective symbols in dso.

...

Thanks.

One more thing, I've noticed that running target-1.exe testcase also leaves
/tmp/offload_XX directories around (one for each invocation).
That can be useful for debugging, but generally should be cleaned up in
__cxa_atexit callback or similar.

OT, from the various discussions with Kirill on IRC, it seems you or
your colleagues typed pretty much all target related tests from the OpenMP 4.0.1
examples; can those also be submitted for inclusion in the testsuite?
AFAIK we already have the appendix-a/ testcases and had permissions from
OpenMP committee to use them, so if we put these into the same directory
(sure, it is not appendix-a anymore, but no tests are in that appendix
anymore), it would be appreciated.

Jakub


Re: [PATCH v2] libstdc++: Add hexfloat/defaultfloat io manipulators.

2014-10-07 Thread Andreas Schwab
Jonathan Wakely  writes:

> diff --git a/libstdc++-v3/src/c++98/locale_facets.cc b/libstdc++-v3/src/c++98/locale_facets.cc
> index 3669acb..7ed04e6 100644
> --- a/libstdc++-v3/src/c++98/locale_facets.cc
> +++ b/libstdc++-v3/src/c++98/locale_facets.cc
> @@ -69,19 +69,26 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  if (__flags & ios_base::showpoint)
>*__fptr++ = '#';
>  
> -// As per DR 231: _always_, not only when 
> -// __flags & ios_base::fixed || __prec > 0
> -*__fptr++ = '.';
> -*__fptr++ = '*';
> +ios_base::fmtflags __fltfield = __flags & ios_base::floatfield;
> +
> +if (__fltfield != (ios_base::fixed | ios_base::scientific))
> +  {
> +// As per DR 231: not only when __flags & ios_base::fixed || __prec > 0
> +*__fptr++ = '.';
> +*__fptr++ = '*';
> +  }
>  
>  if (__mod)
>*__fptr++ = __mod;
> -ios_base::fmtflags __fltfield = __flags & ios_base::floatfield;
>  // [22.2.2.2.2] Table 58
>  if (__fltfield == ios_base::fixed)
>*__fptr++ = 'f';
>  else if (__fltfield == ios_base::scientific)
>*__fptr++ = (__flags & ios_base::uppercase) ? 'E' : 'e';
> +#ifdef _GLIBCXX_USE_C99
> +else if (__fltfield == (ios_base::fixed | ios_base::scientific))
> +  *__fptr++ = (__flags & ios_base::uppercase) ? 'A' : 'a';
> +#endif

That cannot work.  std::__convert_from_v always passes __prec before
__v, but the format is "%a".
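
A small sketch of the mismatch in plain C terms: a format containing
".*" consumes an int precision before the value, while a bare "%a"
consumes only the value, so passing the precision as well would not match
the format. The helper names below are illustrative only and are not part
of libstdc++.

```c
#include <stdio.h>
#include <stdlib.h>

/* "%.*f" consumes an int (the precision) and then the double.  */
int
format_fixed (char *buf, size_t size, int prec, double v)
{
  return snprintf (buf, size, "%.*f", prec, v);
}

/* "%a" has no '*', so only the double may be passed.  Calling
   snprintf (buf, size, "%a", prec, v) would mismatch the format.  */
int
format_hex (char *buf, size_t size, double v)
{
  return snprintf (buf, size, "%a", v);
}

/* With no precision given, hexadecimal output represents the value
   exactly, so strtod is expected to read back the same double.  */
int
hex_round_trips (double v)
{
  char buf[64];
  format_hex (buf, sizeof buf, v);
  return strtod (buf, NULL) == v;
}
```

So the format built for the hexfloat case either has to keep the ".*" or
the caller has to stop passing the precision for that case.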

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [patch] tag ../include/*

2014-10-07 Thread Mike Stump
On Oct 7, 2014, at 9:37 AM, Aldy Hernandez  wrote:
> Is there a reason we don't create etags for toplevel include files?

I don’t think there is.

>  If not, could I please apply this patch?

I’m in favor.

Re: [GOOGLE] Handle missing BINFO for LIPO

2014-10-07 Thread Xinliang David Li
Ok (please also guard it with L_IPO_COMP_MODE).

David

On Tue, Oct 7, 2014 at 11:27 AM, Teresa Johnson  wrote:
> We may have missing BINFO on a type if that type is a builtin, since
> in LIPO mode we will reset builtin types to their original tree nodes
> before parsing subsequent modules. Handle incomplete information by
> returning false so we won't put an entry in the type inheritance graph
> for optimization.
>
> Passes regression tests. Ok for google branches?
>
> Teresa
>
> 2014-10-07  Teresa Johnson  
>
> Google ref b/16511102.
> * ipa-devirt.c (polymorphic_type_binfo_p): Handle missing BINFO.
>
> Index: ipa-devirt.c
> ===
> --- ipa-devirt.c(revision 215830)
> +++ ipa-devirt.c(working copy)
> @@ -177,7 +177,10 @@ static inline bool
>  polymorphic_type_binfo_p (tree binfo)
>  {
>/* See if BINFO's type has an virtual table associtated with it.  */
> -  return BINFO_VTABLE (TYPE_BINFO (BINFO_TYPE (binfo)));
> +  tree type_binfo = TYPE_BINFO (BINFO_TYPE (binfo));
> +  if (!type_binfo)
> +return false;
> +  return BINFO_VTABLE (type_binfo);
>  }
>
>  /* One Definition Rule hashtable helpers.  */
>
>
> --
> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


[GOOGLE] Handle missing BINFO for LIPO

2014-10-07 Thread Teresa Johnson
We may have missing BINFO on a type if that type is a builtin, since
in LIPO mode we will reset builtin types to their original tree nodes
before parsing subsequent modules. Handle incomplete information by
returning false so we won't put an entry in the type inheritance graph
for optimization.

Passes regression tests. Ok for google branches?

Teresa

2014-10-07  Teresa Johnson  

Google ref b/16511102.
* ipa-devirt.c (polymorphic_type_binfo_p): Handle missing BINFO.

Index: ipa-devirt.c
===
--- ipa-devirt.c(revision 215830)
+++ ipa-devirt.c(working copy)
@@ -177,7 +177,10 @@ static inline bool
 polymorphic_type_binfo_p (tree binfo)
 {
   /* See if BINFO's type has an virtual table associtated with it.  */
-  return BINFO_VTABLE (TYPE_BINFO (BINFO_TYPE (binfo)));
+  tree type_binfo = TYPE_BINFO (BINFO_TYPE (binfo));
+  if (!type_binfo)
+return false;
+  return BINFO_VTABLE (type_binfo);
 }

 /* One Definition Rule hashtable helpers.  */


-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


Re: [Fortran, Patch] Implement IMPLICIT NONE

2014-10-07 Thread Andreas Schwab
Tobias Burnus  writes:

> diff --git a/gcc/testsuite/gfortran.dg/implicit_4.f90 b/gcc/testsuite/gfortran.dg/implicit_4.f90
> index 2e871b0..9bf8d86 100644
> --- a/gcc/testsuite/gfortran.dg/implicit_4.f90
> +++ b/gcc/testsuite/gfortran.dg/implicit_4.f90
> @@ -5,13 +5,13 @@ IMPLICIT NONE ! { dg-error "Duplicate" }
>  END
>  
>  SUBROUTINE a
> -IMPLICIT REAL(b-j) ! { dg-error "cannot follow" }
> -implicit none  ! { dg-error "cannot follow" }
> +IMPLICIT REAL(b-j)
> +implicit none  ! { dg-error "Type IMPLICIT NONE statement at .1. following an IMPLICIT statement" }

That doesn't match.

/usr/local/gcc/gcc-20141007/gcc/testsuite/gfortran.dg/implicit_4.f90:9:103: Error: IMPLICIT NONE (type) statement at (1) following an IMPLICIT statement

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


[committed] Fix missing include in check_effective_target_fd_truncate

2014-10-07 Thread Marek Polacek
With -std=gnu11 as the default mode, many Fortran tests ended up as
UNSUPPORTED because the check_effective_target_fd_truncate routine
was missing the string.h header (it uses strncmp), hence it failed.
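
As a minimal illustration of the problem (the helper name `has_prefix` is hypothetical, not taken from target-supports.exp): C99 and later no longer have implicit function declarations, so a call to strncmp without its header produces a diagnostic in gnu11 mode, and presumably any compiler output is enough to make an effective-target check fail. Including the header gives the call a proper prototype.

```c
#include <stdbool.h>
#include <string.h>   /* declares strncmp and strlen */

/* Return true if S starts with PREFIX.  */
static bool
has_prefix (const char *s, const char *prefix)
{
  return strncmp (s, prefix, strlen (prefix)) == 0;
}
```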

Applying to trunk.

2014-10-07  Marek Polacek  

* lib/target-supports.exp (check_effective_target_fd_truncate):
Include <string.h>.

diff --git gcc/testsuite/lib/target-supports.exp 
gcc/testsuite/lib/target-supports.exp
index 77e45cb..2144683 100644
--- gcc/testsuite/lib/target-supports.exp
+++ gcc/testsuite/lib/target-supports.exp
@@ -5284,6 +5284,7 @@ proc check_effective_target_fd_truncate { } {
#include 
#include 
#include 
+   #include <string.h>
int main ()
{
  FILE *f = fopen ("tst.tmp", "wb");

Marek


Re: [PATCH 5/n] OpenMP 4.0 offloading infrastructure: libgomp

2014-10-07 Thread Ilya Verbin
On 07 Oct 16:30, Jakub Jelinek wrote:
> I think it is useful, doesn't have to be in the initial checkin, but I'd
> certainly prefer if from the (optional) --enable-offload-target argument
> it would figure out everything it needs to add for testing.
> And, if mkoffload isn't flexible enough to be convinced to find it in that
> scenario, it better should be made more flexible.

Ok, then we will implement this in a separate patch.

> I thought .gnu.target_lto* sections hold LTO bytecode and are desirable only
> in the ET_REL objects for ld(1)/lto-wrapper purposes.  For large programs
> containing large target regions the LTO bytecode could be very big, so
> leaving it in the binary is undesirable.

Already fixed in kyukhin/gomp4-offload branch.
 
> For .offload_image_section name, wouldn't it be better to prefix that
> with .gnu?

Renamed to .gnu.offload_images, I'll update the branch tomorrow after testing.

> And, is __gnu_offload_{funcs,vars} named that way just because the plugin
> isn't able to add symbols around the sections for you?  As it doesn't contain
> a dot, it would collide with user declarations put into
> __attribute__((section ("__gnu_offload_funcs"))).

Renamed to .gnu.offload_{funcs,vars}.
Automatically provided symbols __start__*, __stop__* don't work with shared
libraries, since the symbols from exec override the respective symbols in dso.
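
For readers unfamiliar with the linker-provided symbols mentioned here, the following is a minimal sketch. It assumes an ELF target with a GNU-compatible linker and GCC-style attributes. The section name `demo_entries` and all identifiers are hypothetical and unrelated to the offloading sections. For a section whose name is a valid C identifier, the linker defines `__start_<name>` and `__stop_<name>` around its contents.

```c
#include <stddef.h>

/* Place two entries in a custom section.  "demo_entries" is a
   hypothetical name; it has to be a valid C identifier for the linker
   to provide the __start_/__stop_ symbols.  */
static int entry_a __attribute__ ((section ("demo_entries"), used)) = 1;
static int entry_b __attribute__ ((section ("demo_entries"), used)) = 2;

/* Defined by the linker, not by any object file.  */
extern int __start_demo_entries[];
extern int __stop_demo_entries[];

/* Number of int-sized entries the linker collected in the section.  */
static size_t
count_entries (void)
{
  return (size_t) (__stop_demo_entries - __start_demo_entries);
}

/* Walk the section from its start symbol to its stop symbol.  */
static int
sum_entries (void)
{
  int sum = 0;
  for (int *p = __start_demo_entries; p != __stop_demo_entries; ++p)
    sum += *p;
  return sum;
}
```

In a single executable this walks exactly the entries placed in the section. The concern raised above is that, when an executable and a shared library both define such a section, references from the library can resolve to the executable's symbols.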
 
> Looking at the symbols:
> perhaps it would be better to have . somewhere in the names too, though if
> you are accessing that from C or declaring them in C, it might be too hard to
> bother.  It is all in reserved namespace anyway, but use two underscores
> prefix instead of one for those IMHO.

All these symbols are declared/accessed in C, so I renamed them to __offload_*.

On 07 Oct 16:45, Jakub Jelinek wrote:
> Also, something that I believe has been discussed in the past, but can't
> find it on your wiki page nor in *.opt, are option overrides for the
> offloading target, i.e. some option you can pass to the host compiler driver
> during linking that will tell the driver for which offloading targets (if
> any at all) to produce the offloading support (defaulting to all configured
> offloading target is fine) and optionally what extra options beyond what has
> been passed on the command line should be passed to the offloading compiler.
> 
> Say, if I want to link target-1.exe such that it will only support host
> fallback and not x86_64-intelmicemul-linux-gnu , how do I achieve that now?

Unfortunately, this is still under development.  I hope to have a working patch
in a week.  For now, without it, lto-wrapper builds offload images for all
offload targets specified at configure time.

  -- Ilya


Re: [C PATCH] Print header hints (PR c/59717)

2014-10-07 Thread Jason Merrill

On 10/07/2014 11:04 AM, Jakub Jelinek wrote:

adding a hint in this case is less obvious than in the C case, because,
what if this wasn't supposed to be ::abort (), but std::abort (), or
some other namespace abort, or some class abort () method etc.?


It still seems reasonable to offer a hint if no declaration was found.

Jason




Re: [patch] remove dwarf2out's current_function_has_inlines

2014-10-07 Thread Jason Merrill

On 10/07/2014 01:16 PM, Aldy Hernandez wrote:

Errr... a static that only gets written to?

OK to commit?


Yes.  This should have been removed with

2010-09-03  Marcin Baczynski  

* dwarf2out.c (file scope): Remove #if0'd code.
(gen_subprogram_die): Same.

Jason



[jit] Documentation tweaks

2014-10-07 Thread David Malcolm
Committed to branch dmalcolm/jit:

gcc/jit/ChangeLog.jit:
* docs/internals/index.rst (Overview of code structure): Directly
include the comment from jit-common.h as rst, rather than as a
quoted C++ comment.
* jit-common.h: Convert the summary format to valid reStructured
text for inclusion by docs/internals/index.rst.
* notes.txt: Clarify where libgccjit.c, jit-recording.c and
jit-playback.c fit into the high-level diagram.
---
 gcc/jit/ChangeLog.jit| 10 ++
 gcc/jit/docs/internals/index.rst |  7 +++
 gcc/jit/jit-common.h | 12 +++-
 gcc/jit/notes.txt| 13 +
 4 files changed, 29 insertions(+), 13 deletions(-)

diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index 4592002..1a76543 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,5 +1,15 @@
 2014-10-07  David Malcolm  
 
+   * docs/internals/index.rst (Overview of code structure): Directly
+   include the comment from jit-common.h as rst, rather than as a
+   quoted C++ comment.
+   * jit-common.h: Convert the summary format to valid reStructured
+   text for inclusion by docs/internals/index.rst.
+   * notes.txt: Clarify where libgccjit.c, jit-recording.c and
+   jit-playback.c fit into the high-level diagram.
+
+2014-10-07  David Malcolm  
+
* Make-lang.in (jit_OBJS): Drop jit/internal-api.o.
Add jit/jit-recording.o and jit/jit-playback.o.
 
diff --git a/gcc/jit/docs/internals/index.rst b/gcc/jit/docs/internals/index.rst
index 3065c60..1e3952c 100644
--- a/gcc/jit/docs/internals/index.rst
+++ b/gcc/jit/docs/internals/index.rst
@@ -152,7 +152,6 @@ Overview of code structure
 
 Here is a high-level summary from ``jit-common.h``:
 
-   .. literalinclude:: ../../jit-common.h
-:start-after: /* Summary.  */
-:end-before: namespace gcc {
-:language: c++
+.. include:: ../../jit-common.h
+  :start-after: This comment is included by the docs.
+  :end-before: End of comment for inclusion in the docs.  */
diff --git a/gcc/jit/jit-common.h b/gcc/jit/jit-common.h
index 5c41ddd..58e4a8c 100644
--- a/gcc/jit/jit-common.h
+++ b/gcc/jit/jit-common.h
@@ -36,9 +36,9 @@ along with GCC; see the file COPYING3.  If not see
 
 const int NUM_GCC_JIT_TYPES = GCC_JIT_TYPE_FILE_PTR + 1;
 
-/* Summary.  */
+/* This comment is included by the docs.
 
-/* In order to allow jit objects to be usable outside of a compile
+   In order to allow jit objects to be usable outside of a compile
whilst working with the existing structure of GCC's code the
C API is implemented in terms of a gcc::jit::recording::context,
which records the calls made to it.
@@ -79,15 +79,17 @@ const int NUM_GCC_JIT_TYPES = GCC_JIT_TYPE_FILE_PTR + 1;
 
During a playback, we associate objects from the recording with
their counterparts during this playback.  For simplicity, we store this
-   within the recording objects, as "void *m_playback_obj", casting it to
+   within the recording objects, as ``void *m_playback_obj``, casting it to
the appropriate playback object subclass.  For these casts to make
sense, the two class hierarchies need to have the same structure.
 
-   Note that the playback objects that "m_playback_obj" points to are
+   Note that the playback objects that ``m_playback_obj`` points to are
GC-allocated, but the recording objects don't own references:
these associations only exist within a part of the code where
the GC doesn't collect, and are set back to NULL before the GC can
-   run.  */
+   run.
+
+   End of comment for inclusion in the docs.  */
 
 namespace gcc {
 
diff --git a/gcc/jit/notes.txt b/gcc/jit/notes.txt
index 54dca8f..d337cb4 100644
--- a/gcc/jit/notes.txt
+++ b/gcc/jit/notes.txt
@@ -5,8 +5,13 @@ Client Code   . Generated .libgccjit.so
│  .   .  .   .
 ──>  .   .
   .   .│ .   .
-
-  .   (record API calls) .
+  .   .V .   .
+  .   .──> libgccjit.c   .
+  .   .│ (error-checking).
+  .   .│ .
+  .   .──> jit-recording.c
+  .   .  (record API calls)
+  .   .<───  .
   .   .│ .   .
<───  .   .
│  .   .  .   .
@@ -27,8 +32,8 @@ Client Code   . Generated .libgccjit.so
   .   .  .│  .
 ..│..V...
   .   .  .│  .   No GC in here
-  .   .  

Re: [C PATCH] Print header hints (PR c/59717)

2014-10-07 Thread Joseph S. Myers
On Tue, 7 Oct 2014, Marek Polacek wrote:

> 2014-10-07  Marek Polacek  
> 
>   PR c/59717
>   * c-decl.c (header_for_builtin_fn): New function.
>   (implicitly_declare): Suggest which header to include.
> 
>   * gcc.dg/pr59717.c: New test.

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [patch] Work harder to find DECL_STRUCT_FUNCTION

2014-10-07 Thread Jan Hubicka
> On Mon, Oct 6, 2014 at 11:52 AM, Eric Botcazou  wrote:
> > Hi,
> >
> > you can have chains of clone functions in the callgraph but
> > can_inline_edge_p stops at the first clone when it is looking for
> > DECL_STRUCT_FUNCTION, which can fool the following conditions in the
> > predicate.
> >
> > Tested on x86_64-suse-linux, OK for the mainline?
> 
> I wonder if this is worth abstracting into a callee_fn () cgraph edge method?
> 
> Honza's call.

I would rather fix can_inline_edge_p to not use DECL_STRUCT_FUNCTION - it is not
available during WPA and thus all the code using it is wrong.  The
non_call_exceptions code has a FIXME explaining that; I see that someone added
cilk.  It should be easy to move these flags to the cgraph node itself -
originally I did not want to duplicate them and worried about performance
implications.

Honza
> 
> Thanks,
> Richard.
> 
> >
> > 2014-10-06  Eric Botcazou  
> >
> > * ipa-inline.c (can_inline_edge_p): Recurse on clones to find the
> > DECL_STRUCT_FUNCTION of the original node.
> >
> >
> > --
> > Eric Botcazou


[jit] Eliminate internal-api.c/h in favor of jit-common.h, jit-playback.c/h, jit-recording.c/h

2014-10-07 Thread David Malcolm
jit/internal-api.c and .h were getting large, so I broke them out into:

  * jit-common.h (forward decls of types)
  * jit-recording.h/c (the gcc::jit::recording classes)
  * jit-playback.h/c (the gcc::jit::playback classes)

Committed to branch dmalcolm/jit as 3071567787aef4a8ada8b38c890d01c19b4b998f

Not posting the full patch here as it's 400KB, but it can be seen at:
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=3071567787aef4a8ada8b38c890d01c19b4b998f

gcc/jit/ChangeLog.jit:
* Make-lang.in (jit_OBJS): Drop jit/internal-api.o.
Add jit/jit-recording.o and jit/jit-playback.o.

* internal-api.c: Delete, moving content to new files jit-recording.c
and jit-playback.c.
* internal-api.h: Delete, moving content to new files
jit-common.h, jit-playback.h, jit-recording.h.
* jit-common.h: New file, containing the forward decls of classes
formerly in internal-api.h.
* jit-recording.c: New file, containing the gcc::jit::recording
code formerly in internal-api.c, and gcc::jit::dump.
* jit-recording.h: New file, containing the gcc::jit::recording
prototypes formerly in internal-api.h.
* jit-playback.c: New file, containing the gcc::jit::playback
code formerly in internal-api.c.
* jit-playback.h: New file, containing the gcc::jit::playback
prototypes formerly in internal-api.h.

* dummy-frontend.c: Don't include "internal-api.h".  Add includes
of jit-common.h and jit-playback.h.
* jit-builtins.h: Replace include of internal-api.h with
jit-common.h.
* jit-builtins.c: Replace include of internal-api.h with
jit-common.h.  Add include of jit-recording.h.
* libgccjit.c: Likewise.

* docs/internals/index.rst (Overview of code structure): Update
to reflect the above changes.
---
 gcc/jit/ChangeLog.jit|   31 +
 gcc/jit/Make-lang.in |6 +-
 gcc/jit/docs/internals/index.rst |   18 +-
 gcc/jit/dummy-frontend.c |3 +-
 gcc/jit/internal-api.c   | 5473 --
 gcc/jit/internal-api.h   | 2264 
 gcc/jit/jit-builtins.c   |3 +-
 gcc/jit/jit-builtins.h   |2 +-
 gcc/jit/jit-common.h |  180 ++
 gcc/jit/jit-playback.c   | 2098 +++
 gcc/jit/jit-playback.h   |  564 
 gcc/jit/jit-recording.c  | 3415 
 gcc/jit/jit-recording.h  | 1593 +++
 gcc/jit/libgccjit.c  |3 +-
 14 files changed, 7902 insertions(+), 7751 deletions(-)
 delete mode 100644 gcc/jit/internal-api.c
 delete mode 100644 gcc/jit/internal-api.h
 create mode 100644 gcc/jit/jit-common.h
 create mode 100644 gcc/jit/jit-playback.c
 create mode 100644 gcc/jit/jit-playback.h
 create mode 100644 gcc/jit/jit-recording.c
 create mode 100644 gcc/jit/jit-recording.h


Re: RFA: fix mode confusion in caller-save.c:replace_reg_with_saved_mem

2014-10-07 Thread Jeff Law

On 10/06/14 20:57, Joern Rennecke wrote:

On 6 October 2014 19:58, Jeff Law  wrote:

What makes word_mode special here?  ie, why is special casing for word_mode
the right thing to do?


The patch does not special-case word mode.  The if condition tests if
smode would cover multiple hard registers.
If that would be the case, smode is replaced with word_mode.

So I'll ask another way.  Why do you want to change smode to word_mode?

Jeff



[PATCH] More testsuite cleanups

2014-10-07 Thread Marek Polacek
Some more cleanups revealed by testing on ppc64.

Applying to trunk.

2014-10-07  Marek Polacek  

* gcc.dg/guality/pr41616-1.c: Use -fgnu89-inline.
* gcc.dg/iftrap-1.c: Fix implicit declarations.
* gcc.target/powerpc/pr26350.c: Likewise.
* gcc.target/powerpc/altivec-consts.c: Likewise.
* gcc.target/powerpc/altivec-varargs-1.c: Likewise.
* gcc.target/powerpc/le-altivec-consts.c: Likewise.
* gcc.target/powerpc/ppc-vector-memcpy.c: Likewise.
* gcc.target/powerpc/ppc-vector-memset.c: Likewise.
* gcc.target/powerpc/pr47862.c: Likewise.
* gcc.target/powerpc/pr48053-1.c: Likewise.
* gcc.target/powerpc/pr53487.c: Likewise.
* gcc.dg/vect/pr48765.c: Fix implicit declarations and defaulting
to int.
* gcc.target/powerpc/20050603-1.c: Fix defaulting to int.
* gcc.target/powerpc/altivec-2.c: Likewise.
* gcc.target/powerpc/pr47755-2.c: Likewise.

diff --git gcc/testsuite/gcc.dg/guality/pr41616-1.c 
gcc/testsuite/gcc.dg/guality/pr41616-1.c
index 24f64ab..fcd1ad5 100644
--- gcc/testsuite/gcc.dg/guality/pr41616-1.c
+++ gcc/testsuite/gcc.dg/guality/pr41616-1.c
@@ -1,5 +1,5 @@
 /* { dg-do run { xfail *-*-* } } */
-/* { dg-options "-g" } */
+/* { dg-options "-g -fgnu89-inline" } */
 
 #include "guality.h"
 
diff --git gcc/testsuite/gcc.dg/iftrap-1.c gcc/testsuite/gcc.dg/iftrap-1.c
index 1427820..c6d5584 100644
--- gcc/testsuite/gcc.dg/iftrap-1.c
+++ gcc/testsuite/gcc.dg/iftrap-1.c
@@ -3,6 +3,8 @@
 /* { dg-do compile { target rs6000-*-* powerpc*-*-* sparc*-*-* ia64-*-* } } */
 /* { dg-final { scan-assembler-not "^\t(trap|ta|break)\[ \t\]" } } */
 
+void bar (void);
+
 void f1(int p)
 {
   if (p)
diff --git gcc/testsuite/gcc.dg/vect/pr48765.c 
gcc/testsuite/gcc.dg/vect/pr48765.c
index 50839e3..2b2907b 100644
--- gcc/testsuite/gcc.dg/vect/pr48765.c
+++ gcc/testsuite/gcc.dg/vect/pr48765.c
@@ -33,8 +33,10 @@ static char *regs_change_size;
 static HARD_REG_SET *after_insn_hard_regs;
 static int stupid_find_reg (int, enum reg_class, enum machine_mode, int, int,
int);
+enum reg_class reg_preferred_class (int);
 void
 stupid_life_analysis (f, nregs, file)
+ int nregs, file;
  rtx f;
 {
   register int i;
@@ -52,7 +54,7 @@ stupid_life_analysis (f, nregs, file)
 static int
 stupid_find_reg (call_preserved, class, mode, born_insn, dead_insn,
 changes_size)
- int call_preserved;
+ int call_preserved, born_insn, dead_insn, changes_size;
  enum reg_class class;
  enum machine_mode mode;
 {
diff --git gcc/testsuite/gcc.target/powerpc/20050603-1.c 
gcc/testsuite/gcc.target/powerpc/20050603-1.c
index 041551b..f801c43 100644
--- gcc/testsuite/gcc.target/powerpc/20050603-1.c
+++ gcc/testsuite/gcc.target/powerpc/20050603-1.c
@@ -15,6 +15,7 @@ test_reg_save_restore (int *p)
 setlocale (LC_ALL, "C");
 testreg = ext_func(p);
 }
+int
 main() {
   testreg = &x;
   test_reg_save_restore (&y);
diff --git gcc/testsuite/gcc.target/powerpc/altivec-2.c 
gcc/testsuite/gcc.target/powerpc/altivec-2.c
index 4f341dd..a91ac0c 100644
--- gcc/testsuite/gcc.target/powerpc/altivec-2.c
+++ gcc/testsuite/gcc.target/powerpc/altivec-2.c
@@ -23,6 +23,7 @@ int xxx[sizeof(foobar) == 16 ? 69 : -1];
 
 int nc17[sizeof(shoe) == sizeof (char *) ? 69 : -1];
 
+void
 code ()
 {
   *shoe = polish;
diff --git gcc/testsuite/gcc.target/powerpc/altivec-consts.c 
gcc/testsuite/gcc.target/powerpc/altivec-consts.c
index 2afd13f..36cb60c 100644
--- gcc/testsuite/gcc.target/powerpc/altivec-consts.c
+++ gcc/testsuite/gcc.target/powerpc/altivec-consts.c
@@ -6,6 +6,7 @@
 /* Check that "easy" AltiVec constants are correctly synthesized.  */
 
 extern void abort (void);
+extern int memcmp (const void *, const void *, __SIZE_TYPE__);
 
 typedef __attribute__ ((vector_size (16))) unsigned char v16qi;
 typedef __attribute__ ((vector_size (16))) unsigned short v8hi;
diff --git gcc/testsuite/gcc.target/powerpc/altivec-varargs-1.c 
gcc/testsuite/gcc.target/powerpc/altivec-varargs-1.c
index 1349ae5..d62f5bb 100644
--- gcc/testsuite/gcc.target/powerpc/altivec-varargs-1.c
+++ gcc/testsuite/gcc.target/powerpc/altivec-varargs-1.c
@@ -7,6 +7,7 @@
 
 extern void exit (int);
 extern void abort (void);
+extern int memcmp (const void *, const void *, __SIZE_TYPE__);
 
 #define vector __attribute__((vector_size (16)))
 
diff --git gcc/testsuite/gcc.target/powerpc/le-altivec-consts.c 
gcc/testsuite/gcc.target/powerpc/le-altivec-consts.c
index 75733d6..15ec650 100644
--- gcc/testsuite/gcc.target/powerpc/le-altivec-consts.c
+++ gcc/testsuite/gcc.target/powerpc/le-altivec-consts.c
@@ -6,6 +6,7 @@
 /* Check that "easy" AltiVec constants are correctly synthesized.  */
 
 extern void abort (void);
+extern int memcmp (const void *, const void *, __SIZE_TYPE__);
 
 typedef __attribute__ ((vector_size (16))) unsigned char v16qi;
 typedef __attribute__ ((vector_size (16))) unsigned short v8hi;
diff --git gcc/testsuite/gcc

[patch] remove dwarf2out's current_function_has_inlines

2014-10-07 Thread Aldy Hernandez

Errr... a static that only gets written to?

OK to commit?
commit 7b1c19385fd06d6a2d0844d453bf1c7683071440
Author: Aldy Hernandez 
Date:   Tue Oct 7 10:14:02 2014 -0700

* dwarf2out.c: Remove current_function_has_inlines.
(gen_subprogram_die): Same.
(gen_inlined_subroutine_die): Same.

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index b5fcfa4..1b30ea9 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -2954,9 +2954,6 @@ static GTY(()) unsigned int loclabel_num;
 /* Unique label counter for point-of-call tables.  */
 static GTY(()) unsigned int poc_label_num;
 
-/* Record whether the function being analyzed contains inlined functions.  */
-static int current_function_has_inlines;
-
 /* The last file entry emitted by maybe_emit_file().  */
 static GTY(()) struct dwarf_file_data * last_emitted_file;
 
@@ -18613,7 +18610,6 @@ gen_subprogram_die (tree decl, dw_die_ref context_die)
   if (DECL_NAME (DECL_RESULT (decl)))
gen_decl_die (DECL_RESULT (decl), NULL, subr_die);
 
-  current_function_has_inlines = 0;
   decls_for_scope (outer_scope, subr_die, 0);
 
   if (call_arg_locations && !dwarf_strict)
@@ -19270,7 +19266,6 @@ gen_inlined_subroutine_die (tree stmt, dw_die_ref 
context_die, int depth)
   add_call_src_coords_attributes (stmt, subr_die);
 
   decls_for_scope (stmt, subr_die, depth);
-  current_function_has_inlines = 1;
 }
 }
 


Re: [Patch, MIPS] Add Octeon3 support

2014-10-07 Thread Joseph S. Myers
Patches adding new -march= values need to update invoke.texi.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [Java PATCH] Generate declarations in jvgenmain.c

2014-10-07 Thread Tom Tromey
Marek> I saw declarations of JvRunMain{,Name} with no parameters and with
Marek> some parameters.

Oh yeah, duh.

Marek>  So I decided to make it prototype-less function
Marek> declaration for now.  I think we don't have to worry about
Marek> -Wstrict-prototypes for now.

Thanks for looking.

Tom


Re: SD-6 C++ feature-testing macros for 4.9

2014-10-07 Thread Jason Merrill

On 10/04/2014 07:28 PM, Ed Smith-Rowland wrote:

This really does build clean and test clean on x86_64-linux.
It's basically the same as for 5.0 except experimental/any isn't in and
variable templates aren't in.


OK.

Jason



[patch] tag ../include/*

2014-10-07 Thread Aldy Hernandez
Is there a reason we don't create etags for toplevel include files?  If 
not, could I please apply this patch?


Thanks.
Aldy
commit a679529d14f005d8c88517f72d2b5295d8c82f0f
Author: Aldy Hernandez 
Date:   Tue Oct 7 09:32:21 2014 -0700

* Makefile.in (TAGS): Tag ../include files.

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 97b439a..df43b9c 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -3772,6 +3772,7 @@ TAGS: lang.tags
  fi;   \
done;   \
etags -o TAGS.sub c-family/*.h c-family/*.c *.h *.c *.cc \
+ ../include/*.h \
  --language=none --regex="/\(char\|unsigned 
int\|int\|bool\|void\|HOST_WIDE_INT\|enum [A-Za-z_0-9]+\) 
[*]?\([A-Za-z_0-9]+\)/\2/" common.opt\
  --language=none 
--regex="/\(DEF_RTL_EXPR\|DEFTREECODE\|DEFGSCODE\).*(\([A-Za-z_0-9]+\)/\2/" 
rtl.def tree.def gimple.def \
  --language=none --regex="/DEFTIMEVAR (\([A-Za-z_0-9]+\)/\1/" 
timevar.def \


Re: [Java PATCH] Generate declarations in jvgenmain.c

2014-10-07 Thread Marek Polacek
On Tue, Oct 07, 2014 at 10:03:26AM -0600, Tom Tromey wrote:
> > "Marek" == Marek Polacek  writes:
> 
> Marek> [CCing java-patches now]
> Marek> Java testsuite breaks with -std=gnu11 as a default and/or with 
> Marek> -Wimplicit-function-declaration on
> 
> I don't recall how one gets warnings when compiling this generated code,
> but if it is generally possible then I think this:

I'm not sure I understand, but this piece of code gets compiled when
running the libjava testsuite.  And when the warning triggers, we get
many fails.
 
> Marek> +  if (indirect)
> Marek> +fprintf (stream, "extern void JvRunMainName ();\n");
> Marek> +  else
> Marek> +fprintf (stream, "extern void JvRunMain ();\n");
> 
> ... will fail with -Wstrict-prototypes, since in C those should
> read "(void)" rather than "()".
> 
> If it's not possible then no big deal.

I saw declarations of JvRunMain{,Name} with no parameters and with
some parameters.  So I decided to make it prototype-less function
declaration for now.  I think we don't have to worry about
-Wstrict-prototypes for now.

Marek


Re: [Java PATCH] Generate declarations in jvgenmain.c

2014-10-07 Thread Tom Tromey
> "Marek" == Marek Polacek  writes:

Marek> [CCing java-patches now]
Marek> Java testsuite breaks with -std=gnu11 as a default and/or with 
Marek> -Wimplicit-function-declaration on

I don't recall how one gets warnings when compiling this generated code,
but if it is generally possible then I think this:

Marek> +  if (indirect)
Marek> +fprintf (stream, "extern void JvRunMainName ();\n");
Marek> +  else
Marek> +fprintf (stream, "extern void JvRunMain ();\n");

... will fail with -Wstrict-prototypes, since in C those should
read "(void)" rather than "()".

If it's not possible then no big deal.
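
To make the distinction concrete, here is a small sketch. The function names are hypothetical and are not the JvRunMain declarations from the patch. Before C23, an empty parameter list in a declaration says nothing about the parameters, which is what -Wstrict-prototypes complains about. Writing `(void)` makes it a prototype. In C23 the two forms are equivalent.

```c
/* Non-prototype declaration: before C23 this leaves the parameters
   unspecified, and -Wstrict-prototypes warns about it.  */
extern int demo_unprototyped ();

/* Prototype declaration: explicitly takes no arguments.  */
extern int demo_prototyped (void);

/* Both definitions use (void); the first is still compatible with the
   earlier non-prototype declaration.  */
int
demo_unprototyped (void)
{
  return 1;
}

int
demo_prototyped (void)
{
  return 2;
}
```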

Tom


Re: [PATCH, Pointer Bounds Checker 14/x] Pointer Bounds Checker passes

2014-10-07 Thread Ilya Enkovich
2014-10-03 23:59 GMT+04:00 Jeff Law :
> On 10/03/14 02:50, Ilya Enkovich wrote:
>>
>> Attached is an updated version of the patch.  It has disabled
>> instrumentation for builtin calls.
>>
>> Thanks,
>> Ilya
>> --
>> gcc/
>>
>> 2014-10-02  Ilya Enkovich
>>
>> * tree-chkp.c: New.
>> * tree-chkp.h: New.
>> * rtl-chkp.c: New.
>> * rtl-chkp.h: New.
>> * Makefile.in (OBJS): Add tree-chkp.o, rtl-chkp.o.
>> (GTFILES): Add tree-chkp.c.
>> * c-family/c.opt (fchkp-check-incomplete-type): New.
>> (fchkp-zero-input-bounds-for-main): New.
>> (fchkp-first-field-has-own-bounds): New.
>> (fchkp-narrow-bounds): New.
>> (fchkp-narrow-to-innermost-array): New.
>> (fchkp-optimize): New.
>> (fchkp-use-fast-string-functions): New.
>> (fchkp-use-nochk-string-functions): New.
>> (fchkp-use-static-bounds): New.
>> (fchkp-use-static-const-bounds): New.
>> (fchkp-treat-zero-dynamic-size-as-infinite): New.
>> (fchkp-check-read): New.
>> (fchkp-check-write): New.
>> (fchkp-store-bounds): New.
>> (fchkp-instrument-calls): New.
>> (fchkp-instrument-marked-only): New.
>> * cppbuiltin.c (define_builtin_macros_for_compilation_flags): Add
>> __CHKP__ macro when Pointer Bounds Checker is on.
>> * passes.def (pass_ipa_chkp_versioning): New.
>> (pass_early_local_passes): Removed.
>> (pass_build_ssa_passes): New.
>> (pass_fixup_cfg): Moved to pass_chkp_instrumentation_passes.
>> (pass_chkp_instrumentation_passes): New.
>> (pass_ipa_chkp_produce_thunks): New.
>> (pass_local_optimization_passes): New.
>> (pass_chkp_opt): New.
>> * toplev.c: include tree-chkp.h.
>> (compile_file): Add chkp_finish_file call.
>> * tree-pass.h (make_pass_ipa_chkp_versioning): New.
>> (make_pass_ipa_chkp_produce_thunks): New.
>> (make_pass_chkp): New.
>> (make_pass_chkp_opt): New.
>> (make_pass_early_local_passes): Removed.
>> (make_pass_build_ssa_passes): New.
>> (make_pass_chkp_instrumentation_passes): New.
>> (make_pass_local_optimization_passes): New.
>> * tree.h (called_as_built_in): New.
>> * builtins.c (called_as_built_in): Not static anymore.
>> * passes.c (pass_manager::execute_early_local_passes): Execute
>> early passes in three steps.
>> (execute_all_early_local_passes): Removed.
>> (pass_data_early_local_passes): Removed.
>> (pass_early_local_passes): Removed.
>> (execute_build_ssa_passes): New.
>> (pass_data_build_ssa_passes): New.
>> (pass_build_ssa_passes): New.
>> (pass_data_chkp_instrumentation_passes): New.
>> (pass_chkp_instrumentation_passes): New.
>> (pass_data_local_optimization_passes): New.
>> (pass_local_optimization_passes): New.
>> (make_pass_early_local_passes): Removed.
>> (make_pass_build_ssa_passes): New.
>> (make_pass_chkp_instrumentation_passes): New.
>> (make_pass_local_optimization_passes): New.
>>
>> gcc/testsuite
>>
>> 2014-10-02  Ilya Enkovich
>>
>> * gcc.dg/pr37858.c: Replace early_local_cleanups pass name
>> with build_ssa_passes.
>
> General question.  At the RTL level you represent the bounds with an RTX
> which is perfectly reasonable.  What are the structure sharing assumptions
> of those values?  Do they follow the existing RTL structure sharing
> assumptions?
>
> Minor nit: 2014 in the copyright year for all these files ;-)
>
> So, for example if there are two references to the same bounds in RTL, are
> they distinct RTXs with the same underlying values?  Or is it a single rtx
> object that is shared?  It looks like you generally create new RTXs, but I'm
> a bit concerned that you might shove those things into a hash table and
> return them and embed a single reference into multiple hunks of parent RTL.

For the expander, bounds are ordinary variables and SSA names, which are
expanded like all other values, so I believe the regular sharing
assumptions are followed.

Hash tables are used just to link pointer values returned by a call with
the returned bounds.  This is required to expand retbnd calls.  Similarly,
returned bounds are associated with DECL_RESULT using
SET_DECL_BOUNDS_RTL.

>
>
>
>
>
>
>>
>>
>> mpx-9-pass.patch
>>
>>
>> diff --git a/gcc/builtins.c b/gcc/builtins.c
>> index 17754e5..78ac91f 100644
>> --- a/gcc/builtins.c
>> +++ b/gcc/builtins.c
>> @@ -255,7 +255,7 @@ is_builtin_fn (tree decl)
>>  of the optimization level.  This means whenever a function is invoked
>> with
>>  its "internal" name, which normally contains the prefix "__builtin".
>> */
>>
>> -static bool
>> +bool
>>   called_as_built_in (tree node)
>>   {
>> /* Note that we must use DECL_NAME, not DECL_ASSEMBLER_NAME_SET_P
>> since
>
> Is there some reason yo

Re: [C PATCH] Print header hints (PR c/59717)

2014-10-07 Thread Marek Polacek
On Tue, Oct 07, 2014 at 05:00:26PM +0200, Richard Biener wrote:
> On Tue, Oct 7, 2014 at 4:51 PM, Marek Polacek  wrote:
> > On Tue, Oct 07, 2014 at 04:39:55PM +0200, Richard Biener wrote:
> >> Why not annotate builtins.def with the info?
> >
> > Because I think that would be more hairy, I'd have to change DEF_BUILTIN
> > and all the builtins.  That seemed superfluous given that this hint is
> > only for a C FE...
> 
> All C family frontends, no?  And builtins.def is used by (and only by)
> all C family frontends...

As Jakub pointed out, only C and ObjC for now.
 
> Well - just a suggestion ;)

Thanks - builtins.def was where I originally started.

Marek


Re: [C PATCH] Print header hints (PR c/59717)

2014-10-07 Thread Marek Polacek
On Tue, Oct 07, 2014 at 05:00:05PM +0200, Jakub Jelinek wrote:
> On Tue, Oct 07, 2014 at 04:51:31PM +0200, Marek Polacek wrote:
> > On Tue, Oct 07, 2014 at 04:39:55PM +0200, Richard Biener wrote:
> > > Why not annotate builtins.def with the info?
> > 
> > Because I think that would be more hairy, I'd have to change DEF_BUILTIN
> > and all the builtins.  That seemed superfluous given that this hint is
> > only for a C FE...
> 
> Guess it depends on how many DEF_*_BUILTIN classes would this affect,

At least the following:
DEF_LIB_BUILTIN
DEF_C94_BUILTIN
DEF_C99_BUILTIN
DEF_C11_BUILTIN
DEF_C99_COMPL_BUILTIN
DEF_C99_C90RES_BUILTIN
I think that is quite a lot.

> if just a couple, you could add DEF_*_BUILTIN_WITH_C_HINT, with an extra
> arg.  But as the builtins.def info already has quite long lines, making them
> even longer might not be best.  So perhaps the switch is good enough.

Yeah, that the lines are long enough already was one of the things
that discouraged me from tweaking builtins.def.

Marek


Re: [C PATCH] Print header hints (PR c/59717)

2014-10-07 Thread Jakub Jelinek
On Tue, Oct 07, 2014 at 05:00:26PM +0200, Richard Biener wrote:
> On Tue, Oct 7, 2014 at 4:51 PM, Marek Polacek  wrote:
> > On Tue, Oct 07, 2014 at 04:39:55PM +0200, Richard Biener wrote:
> >> Why not annotate builtins.def with the info?
> >
> > Because I think that would be more hairy, I'd have to change DEF_BUILTIN
> > and all the builtins.  That seemed superfluous given that this hint is
> > only for a C FE...
> 
> All C family frontends, no?  And builtins.def is used by (and only by)
> all C family frontends...

Well, the C++ FE on say:
void
bar (void)
{
  abort ();
}

just errors out:
/tmp/a.c: In function ‘void bar()’:
/tmp/a.c:4:10: error: ‘abort’ was not declared in this scope
   abort ();
  ^
adding a hint in this case is less obvious than in the C case, because,
what if this wasn't supposed to be ::abort (), but std::abort (), or
some other namespace abort, or some class abort () method etc.?

Jakub


Re: [C PATCH] Print header hints (PR c/59717)

2014-10-07 Thread Richard Biener
On Tue, Oct 7, 2014 at 4:51 PM, Marek Polacek  wrote:
> On Tue, Oct 07, 2014 at 04:39:55PM +0200, Richard Biener wrote:
>> Why not annotate builtins.def with the info?
>
> Because I think that would be more hairy, I'd have to change DEF_BUILTIN
> and all the builtins.  That seemed superfluous given that this hint is
> only for a C FE...

All C family frontends, no?  And builtins.def is used by (and only by)
all C family frontends...

Well - just a suggestion ;)

I'd like to see some easier to grok specification of the number of arguments
expected to the builtins for example (for the match-and-simplify work).

Richard.

> Marek


Re: [C PATCH] Print header hints (PR c/59717)

2014-10-07 Thread Jakub Jelinek
On Tue, Oct 07, 2014 at 04:51:31PM +0200, Marek Polacek wrote:
> On Tue, Oct 07, 2014 at 04:39:55PM +0200, Richard Biener wrote:
> > Why not annotate builtins.def with the info?
> 
> Because I think that would be more hairy, I'd have to change DEF_BUILTIN
> and all the builtins.  That seemed superfluous given that this hint is
> only for a C FE...

Guess it depends on how many DEF_*_BUILTIN classes would this affect,
if just a couple, you could add DEF_*_BUILTIN_WITH_C_HINT, with an extra
arg.  But as the builtins.def info already has quite long lines, making them
even longer might not be best.  So perhaps the switch is good enough.

Jakub


Re: [C PATCH] Print header hints (PR c/59717)

2014-10-07 Thread Marek Polacek
On Tue, Oct 07, 2014 at 04:39:55PM +0200, Richard Biener wrote:
> Why not annotate builtins.def with the info?

Because I think that would be more hairy, I'd have to change DEF_BUILTIN
and all the builtins.  That seemed superfluous given that this hint is
only for a C FE...

Marek


Re: [patch] Fix miscompilation of gnat1 in LTO bootstrap

2014-10-07 Thread Richard Biener
On Tue, Oct 7, 2014 at 10:04 AM, Eric Botcazou  wrote:
>> Testcase?  I think it would be better to handle this in the canonical type
>> merging code in lto.c - or how does it end up working without LTO?  That is,
>> what does the Ada frontend do to make sure get_alias_set handles this
>> correctly?
>
> It manages the alias sets, see gcc-interface/utils.c:relate_alias_sets.

Ugh :/

I can't see how this can work with LTO.  We need a middle-end way
to represent the alias relation of those types.  At least I can't see how
your simple patch covers all cases here?

With LTO we preserve TYPE_ALIAS_SET == 0, so another way to
fix this (and which I'd like more) is to do your patch in the Ada frontend,
that is, use alias-set zero for all types you relate if flag_lto.

Another way is to make LTO canonical type merging handle the
case of type_contains_placeholder_p "better", that is by treating
two types with those equivalent more easily.  For arrays this simply
means hashing and comparing non-constant TYPE_DOMAIN the
same / as equal.  There is already some code handling PLACEHOLDER_EXPR
specially, but it doesn't seem to be enough (why in this case)?

Thanks,
Richard.

> --
> Eric Botcazou


Re: [PATCH 5/n] OpenMP 4.0 offloading infrastructure: libgomp

2014-10-07 Thread Ilya Verbin
On 07 Oct 16:30, Jakub Jelinek wrote:
> Another thing I've noticed, when target-1.exe is built, there are tons of
> sections that IMHO should have been stripped away:

Could you please re-checkout the branch?  I fixed this issue a week ago.

Thanks,
  -- Ilya


Re: [PATCH 5/n] OpenMP 4.0 offloading infrastructure: libgomp

2014-10-07 Thread Jakub Jelinek
Hi!

Also, something that I believe has been discussed in the past, but can't
find it on your wiki page nor in *.opt, are option overrides for the
offloading target, i.e. some option you can pass to the host compiler driver
during linking that will tell the driver for which offloading targets (if
any at all) to produce the offloading support (defaulting to all configured
offloading targets is fine) and optionally what extra options beyond what has
been passed on the command line should be passed to the offloading compiler.

Say, if I want to link target-1.exe such that it will only support host
fallback and not x86_64-intelmicemul-linux-gnu, how do I achieve that now?

Jakub


Re: [PATCH] PR lto/59441 Add initialization and release of bitmap obstack

2014-10-07 Thread Richard Biener
On Tue, Oct 7, 2014 at 2:55 PM, Ilya Palachev  wrote:
> Hi all,
>
> Attached patch fixes PR lto/59441.
> The reason of failure was that the default bitmap obstack was released just
> before the execution of early local passes.
> The error was found using valgrind. It reported that there were 153 invalid
> reads and 173 invalid writes into the field of the default bitmap obstack
> structure,
> and all of them were trying to access data that was free'd previously (at
> the same point of the program).
>
> The solution is to add initialization and release of the bitmap obstack
> before and after the execution of early local passes.
> After applying this patch valgrind does not report any errors for the same
> testcase.
>
> The patch was bootstrapped and regtested on x86_64-unknown-linux-gnu.
>
> Ok for trunk?

Ok.

Thanks,
Richard.

> Best regards,
> Ilya Palachev


Re: [C PATCH] Print header hints (PR c/59717)

2014-10-07 Thread Richard Biener
On Tue, Oct 7, 2014 at 2:53 PM, Marek Polacek  wrote:
> PR59717 is a request for hints which header to include if the compiler warns
> about incompatible implicit declarations.  E.g., if one uses abort
> without declaring it first, we now say
> note: include ‘<stdlib.h>’ or provide a declaration of ‘abort’
> I've added hints only for standard functions which means we won't display
> the hint for functions such as mempcpy.
>
> The implementation is based on a function that just maps built_in_function
> codes to header names.
>
> Two remarks:
> * header_for_builtin_fn is long and I don't want to unnecessarily
>   inflate already big c-decl.c file, so it might make sense to move
>   the function into c-errors.c;
> * we don't issue "incompatible implicit declaration of built-in function"
>   warning for functions that return int and whose parameter types don't need
>   default promotions - for instance putc, fputs, ilogb, strcmp, vprintf,
>   isnan, isalpha, ...  Therefore for such functions we don't print the hint
>   either.
>   header_for_builtin_fn is ready for them, though.  (The cases for 
>   and  could be removed.)
>
> Bootstrapped/regtested on x86_64-linux, ok for trunk?

Why not annotate builtins.def with the info?

Richard.

> 2014-10-07  Marek Polacek  
>
> PR c/59717
> * c-decl.c (header_for_builtin_fn): New function.
> (implicitly_declare): Suggest which header to include.
>
> * gcc.dg/pr59717.c: New test.
>
> diff --git gcc/c/c-decl.c gcc/c/c-decl.c
> index ce5a8de..e23284a 100644
> --- gcc/c/c-decl.c
> +++ gcc/c/c-decl.c
> @@ -2968,6 +2968,189 @@ implicit_decl_warning (location_t loc, tree id, tree olddecl)
>  }
>  }
>
> +/* This function represents mapping of a function code FCODE
> +   to its respective header.  */
> +
> +static const char *
> +header_for_builtin_fn (enum built_in_function fcode)
> +{
> +  switch (fcode)
> +{
> +CASE_FLT_FN (BUILT_IN_ACOS):
> +CASE_FLT_FN (BUILT_IN_ACOSH):
> +CASE_FLT_FN (BUILT_IN_ASIN):
> +CASE_FLT_FN (BUILT_IN_ASINH):
> +CASE_FLT_FN (BUILT_IN_ATAN):
> +CASE_FLT_FN (BUILT_IN_ATANH):
> +CASE_FLT_FN (BUILT_IN_ATAN2):
> +CASE_FLT_FN (BUILT_IN_CBRT):
> +CASE_FLT_FN (BUILT_IN_CEIL):
> +CASE_FLT_FN (BUILT_IN_COPYSIGN):
> +CASE_FLT_FN (BUILT_IN_COS):
> +CASE_FLT_FN (BUILT_IN_COSH):
> +CASE_FLT_FN (BUILT_IN_ERF):
> +CASE_FLT_FN (BUILT_IN_ERFC):
> +CASE_FLT_FN (BUILT_IN_EXP):
> +CASE_FLT_FN (BUILT_IN_EXP2):
> +CASE_FLT_FN (BUILT_IN_EXPM1):
> +CASE_FLT_FN (BUILT_IN_FABS):
> +CASE_FLT_FN (BUILT_IN_FDIM):
> +CASE_FLT_FN (BUILT_IN_FLOOR):
> +CASE_FLT_FN (BUILT_IN_FMA):
> +CASE_FLT_FN (BUILT_IN_FMAX):
> +CASE_FLT_FN (BUILT_IN_FMIN):
> +CASE_FLT_FN (BUILT_IN_FMOD):
> +CASE_FLT_FN (BUILT_IN_FREXP):
> +CASE_FLT_FN (BUILT_IN_HYPOT):
> +CASE_FLT_FN (BUILT_IN_ILOGB):
> +CASE_FLT_FN (BUILT_IN_LDEXP):
> +CASE_FLT_FN (BUILT_IN_LGAMMA):
> +CASE_FLT_FN (BUILT_IN_LLRINT):
> +CASE_FLT_FN (BUILT_IN_LLROUND):
> +CASE_FLT_FN (BUILT_IN_LOG):
> +CASE_FLT_FN (BUILT_IN_LOG10):
> +CASE_FLT_FN (BUILT_IN_LOG1P):
> +CASE_FLT_FN (BUILT_IN_LOG2):
> +CASE_FLT_FN (BUILT_IN_LOGB):
> +CASE_FLT_FN (BUILT_IN_LRINT):
> +CASE_FLT_FN (BUILT_IN_LROUND):
> +CASE_FLT_FN (BUILT_IN_MODF):
> +CASE_FLT_FN (BUILT_IN_NAN):
> +CASE_FLT_FN (BUILT_IN_NEARBYINT):
> +CASE_FLT_FN (BUILT_IN_NEXTAFTER):
> +CASE_FLT_FN (BUILT_IN_NEXTTOWARD):
> +CASE_FLT_FN (BUILT_IN_POW):
> +CASE_FLT_FN (BUILT_IN_REMAINDER):
> +CASE_FLT_FN (BUILT_IN_REMQUO):
> +CASE_FLT_FN (BUILT_IN_RINT):
> +CASE_FLT_FN (BUILT_IN_ROUND):
> +CASE_FLT_FN (BUILT_IN_SCALBLN):
> +CASE_FLT_FN (BUILT_IN_SCALBN):
> +CASE_FLT_FN (BUILT_IN_SIN):
> +CASE_FLT_FN (BUILT_IN_SINH):
> +CASE_FLT_FN (BUILT_IN_SINCOS):
> +CASE_FLT_FN (BUILT_IN_SQRT):
> +CASE_FLT_FN (BUILT_IN_TAN):
> +CASE_FLT_FN (BUILT_IN_TANH):
> +CASE_FLT_FN (BUILT_IN_TGAMMA):
> +CASE_FLT_FN (BUILT_IN_TRUNC):
> +case BUILT_IN_ISINF:
> +case BUILT_IN_ISNAN:
> +  return "<math.h>";
> +CASE_FLT_FN (BUILT_IN_CABS):
> +CASE_FLT_FN (BUILT_IN_CACOS):
> +CASE_FLT_FN (BUILT_IN_CACOSH):
> +CASE_FLT_FN (BUILT_IN_CARG):
> +CASE_FLT_FN (BUILT_IN_CASIN):
> +CASE_FLT_FN (BUILT_IN_CASINH):
> +CASE_FLT_FN (BUILT_IN_CATAN):
> +CASE_FLT_FN (BUILT_IN_CATANH):
> +CASE_FLT_FN (BUILT_IN_CCOS):
> +CASE_FLT_FN (BUILT_IN_CCOSH):
> +CASE_FLT_FN (BUILT_IN_CEXP):
> +CASE_FLT_FN (BUILT_IN_CIMAG):
> +CASE_FLT_FN (BUILT_IN_CLOG):
> +CASE_FLT_FN (BUILT_IN_CONJ):
> +CASE_FLT_FN (BUILT_IN_CPOW):
> +CASE_FLT_FN (BUILT_IN_CPROJ):
> +CASE_FLT_FN (BUILT_IN_CREAL):
> +CASE_FLT_FN (BUILT_IN_CSIN):
> +CASE_FLT_FN (BUILT_IN_CSINH):
> +CASE_FLT_FN (BUILT_IN_CSQRT):
> +CASE_FLT_FN (BUILT_IN_CTAN):
> +CASE_FLT_FN (BUILT_IN_CTANH):
> +  return "<complex.h>";
> +case BUILT_IN_MEMCHR:
> +case BUILT_IN_M

[jit] Use the full name of the installed driver binary

2014-10-07 Thread David Malcolm
On Fri, 2014-09-26 at 21:55 +, Joseph S. Myers wrote:
> On Thu, 25 Sep 2014, David Malcolm wrote:
> 
> > Should this have the $(exeext) suffix seen in Makefile.in?
> >   $(target_noncanonical)-gcc-$(version)$(exeext)
> 
> Depends on whether that's needed for the pex code to find it.
> > As for (B), would it make sense to "bake in" the path to the binary into
> > the pex invocation, and hence to turn off PEX_SEARCH?  If so, presumably
> > I need to somehow expand the Makefile's value of $(bindir) into
> > internal-api.c, right?  (I tried this in configure.ac, but merely got
> > "$(exec_prefix)/bin" iirc).
> 
> An installation must be relocatable.  Thus, you can't just hardcode 
> looking in the configured prefix; you'd need to locate it relative to 
> libgccjit.so in some way (i.e. using make_relative_prefix, but I don't 
> know offhand how libgccjit.so would locate itself).
> 
> > A better long-term approach to this would be to extract the spec
> > machinery from gcc.c (perhaps into a "libdriver.a"?) and run it directly
> > from the jit library - but that's a rather involved patch, I suspect.
> 
> And you'd still need libgccjit.so to locate itself for proper 
> relocatability in finding other pieces such as assembler and linker.
> 
> > I wonder if the appropriate approach here is to have a single library
> > with multiple plugin backends e.g. one for the CPU, one for each GPU
> > family, with the ability to load multiple "backends" at once.
> 
> If you can get that working, sure.
> 
> > Unfortunately, "backend" is horribly overloaded here - I mean basically
> > all of gcc here, everything other than the libgccjit.h API seen by
> > client code.
> 
> (Though preferably as much as possible could be shared, i.e. properly 
> define the parts of GCC that need building separately for each target and 
> limit them as much as possible.  Joern's multi-target patches from 2010 
> that selectively built parts of GCC using namespaces while sharing others 
> without an obvious clear separation seemed very fragile.  For something 
> robust you either build everything separately for each target, or have a 
> well-defined separation between bits needing building separately and bits 
> that can be built once and ways to avoid non-obvious target dependencies 
> in bits built once.)

I've been experimenting with directly embedding the gcc.c driver code
in-process, but that patch was getting unwieldy, so for now, I'm going
with the simpler approach: just call the driver out-of-process,
specifying the full installed name:
  $(target_noncanonical)-gcc-$(version)$(exeext)
as expanded at configuration time, requiring it to be on the PATH.
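
As a rough illustration of the naming scheme (not the actual internal-api.c code), the full driver name and the places a PATH search would look for it can be sketched like this.  The helper names are made up, and the sketch assumes a POSIX-style ':'-separated PATH:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Build "$(target_noncanonical)-gcc-$(version)$(exeext)".
std::string driver_name(const std::string &target_noncanonical,
                        const std::string &version,
                        const std::string &exeext) {
  return target_noncanonical + "-gcc-" + version + exeext;
}

// List the file names a PATH search would try, in order.
// Assumes ':' as the separator; an empty entry is treated as ".".
std::vector<std::string> path_candidates(const std::string &path_list,
                                         const std::string &name) {
  std::vector<std::string> candidates;
  std::istringstream dirs(path_list);
  std::string dir;
  while (std::getline(dirs, dir, ':')) {
    if (dir.empty())
      dir = ".";
    candidates.push_back(dir + "/" + name);
  }
  return candidates;
}
```

Searching for the fully qualified name rather than a bare "gcc" means a differently configured compiler earlier on the PATH is not picked up by accident.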

Hopefully this addresses the last of the concerns raised in your initial
review; I'll do some more testing and then try to resubmit to the list
(I'm also thinking about breaking up internal-api.c/h, as they've become
rather large, into jit-recording/jit-playback.c/h).

Committed to branch dmalcolm/jit:

gcc/ChangeLog.jit:
* Makefile.in (site.exp): When constructing site.exp, add a line
to set "bindir".
* configure.ac: Generate a gcc-driver-name.h file containing
GCC_DRIVER_NAME for the benefit of jit/internal-api.c.
* configure: Regenerate.

gcc/jit/ChangeLog.jit:
* docs/internals/index.rst
(Using a working copy without installing): Rename to...
(Using a working copy without installing every time): ...this, and
update to reflect the need to have installed the driver binary
when running directly from a build directory.
(Running the test suite): Add PATH setting to the example.
* docs/intro/install.rst ("Hello world"): Likewise.
* internal-api.c: Include new autogenerated header
"gcc-driver-name.h".
(gcc::jit::playback::context::compile): Rather than looking for a
"gcc" on the path, look for GCC_DRIVER_NAME from gcc-driver-name.h,
as created by the configure script, so that we are using one for
the correct target.

gcc/testsuite/ChangeLog.jit:
* jit.dg/jit.exp (jit-dg-test): Prepend the installed bindir to
the PATH before invoking built binaries using the library, so that
the library can find the driver.  Restore the PATH immediately
afterwards.
---
 gcc/ChangeLog.jit|  8 +
 gcc/Makefile.in  |  1 +
 gcc/configure|  6 
 gcc/configure.ac |  6 
 gcc/jit/ChangeLog.jit| 16 ++
 gcc/jit/docs/internals/index.rst | 64 
 gcc/jit/docs/intro/install.rst   | 36 ++
 gcc/jit/internal-api.c   | 10 +--
 gcc/testsuite/ChangeLog.jit  |  7 +
 gcc/testsuite/jit.dg/jit.exp | 14 +
 10 files changed, 147 insertions(+), 21 deletions(-)

diff --git a/gcc/ChangeLog.jit b/gcc/ChangeLog.jit
index e71f7c4..ca73c04 100644
--- a/gcc/ChangeLog.jit
+++ b/gcc/ChangeLog.

Re: [PATCH 5/n] OpenMP 4.0 offloading infrastructure: libgomp

2014-10-07 Thread Jakub Jelinek
On Tue, Oct 07, 2014 at 05:51:53PM +0400, Ilya Verbin wrote:
> On 07 Oct 15:06, Jakub Jelinek wrote:
> > Still have issues with the non-installed testing.
> 
> The idea was that the offload compiler should be installed.
> 
> > If I add
> > -B /usr/src/gcc-git/objinst/usr/local/lib/gcc/x86_64-pc-linux-gnu/5.0.0/ \
> > -B /usr/src/gcc-git/objinst/usr/local/libexec/gcc/x86_64-pc-linux-gnu/5.0.0/
> 
> Yes, since lto-wrapper uses COMPILER_PATH + "/accel//" to find
> mkoffload, it requires that the offload compiler with mkoffload are installed.
> Probably, it can be extended to search in the build paths, specified by
> --enable-offload-targets option.
> 
> > to the command line so it at least finds mkoffload, it then can't find for
> > some reason the offload compiler:
> 
> mkoffload itself also wants the offload compiler with correct name
> (-accel--gcc).  It can be extended to use xgcc.  But I don't know,
> how to construct all paths for it (-B, -I, -L)?
> 
> > So, what exactly should be added (by libgomp.exp) so that the testing
> > succeeds in the case of non-installed offload and non-installed host
> > compilers?
> 
> Looks like, that non-installed offload compiler requires some complications.
> Is this really necessary?

I think it is useful, doesn't have to be in the initial checkin, but I'd
certainly prefer if from the (optional) --enable-offload-target argument
it would figure out everything it needs to add for testing.
And, if mkoffload isn't flexible enough to be convinced to find it in that
scenario, it better should be made more flexible.

Another thing I've noticed, when target-1.exe is built, there are tons of
sections that IMHO should have been stripped away:

  [ 0]  NULL  00  00  00  0  0  0
  [ 1]  .interp  PROGBITS  00400238  000238  1c  00  A  0  0  1
  [ 2]  .note.ABI-tag  NOTE  00400254  000254  20  00  A  0  0  4
  [ 3]  .hash  HASH  00400278  000278  94  04  A  4  0  8
  [ 4]  .dynsym  DYNSYM  00400310  000310  0001b0  18  A  5  1  8
  [ 5]  .dynstr  STRTAB  004004c0  0004c0  000189  00  A  0  0  1
  [ 6]  .gnu.version  VERSYM  0040064a  00064a  24  02  A  4  0  2
  [ 7]  .gnu.version_r  VERNEED  00400670  000670  70  00  A  5  2  8
  [ 8]  .rela.dyn  RELA  004006e0  0006e0  18  18  A  4  0  8
  [ 9]  .rela.plt  RELA  004006f8  0006f8  000150  18  A  4  11  8
  [10]  .init  PROGBITS  00400848  000848  1a  00  AX  0  0  4
  [11]  .plt  PROGBITS  00400870  000870  f0  10  AX  0  0  16
  [12]  .text  PROGBITS  00400960  000960  000b44  00  AX  0  0  16
  [13]  .fini  PROGBITS  004014a4  0014a4  09  00  AX  0  0  4
  [14]  .rodata  PROGBITS  004014b0  0014b0  20  00  A  0  0  8
  [15]  .eh_frame_hdr  PROGBITS  004014d0  0014d0  94  00  A  0  0  4
  [16]  .eh_frame  PROGBITS  00401568  001568  00032c  00  A  0  0  8
  [17]  .init_array  INIT_ARRAY  00601dd8  001dd8  10  00  WA  0  0  8
  [18]  .fini_array  FINI_ARRAY  00601de8  001de8  08  00  WA  0  0  8
  [19]  .jcr  PROGBITS  00601df0  001df0  08  00  WA  0  0  8
  [20]  .dynamic  DYNAMIC  00601df8  001df8  000200  10  WA  5  0  8
  [21]  .got  PROGBITS  00601ff8  001ff8  08  08  WA  0  0  8
  [22]  .got.plt  PROGBITS  00602000  002000  88  08  WA  0  0  8
  [23]  .data  PROGBITS  006020a0  0020a0  000120  00  WA  0  0  32
  [24]  .offload_image_section  PROGBITS  006021c0  0021c0  003439  00  WA  0  0  16
  [25]  __gnu_offload_funcs  PROGBITS  00605600  005600  18  00  WA  0  0  8
  [26]  __gnu_offload_vars  PROGBITS  00605618  005618  10  00  WA  0  0  8
  [27]  .bss  NOBITS  00605628  005628  08  00  WA  0  0  4
  [28]  .comment  PROGBITS  005628  55  01  MS  0  0  1
  [29]  .gnu.target_lto_.profile.3e3ce5aae4e95dd4  PROGBITS  00567d  14  00  0  0  1
  [30]  .gnu.target_lto_.jmpfuncs.3e3ce5aae4e95dd4  PROGBITS  005691  28  00  0  0  1
  [31]  .gnu.target_lto_.inline.3e3ce5aae4e95dd4  PROGBITS  0056b9  000130  00  0  0  1
  [32]  .gnu.target_lto_.pureconst.3e3ce5aae4e95dd4  PROGBITS  0057e9  1d  00  0  0  1
  [33]  .gnu.target_lto_fn2._omp_fn.1.3e3ce5aae4e95dd4  PROGBITS  005806  0005fc  00  0  0  1
  [34]  .gnu.target_lto_fn2._omp_fn.0.3e3ce5aae4e95dd4  PROGBITS  005e02  000765

Re: [PATCH 2/2] PR debug/63240 Add DWARF representation for C++11 defaulted member function.

2014-10-07 Thread Siva Chandra
On Tue, Oct 7, 2014 at 4:05 AM, Mark Wielaard  wrote:
> To be honest my original patches for a deleted/defaulted markers on
> special member functions was really just meant to give the consumer a
> way to know why GCC produced a declaration in the first place. Which I
> still think is useful information for the consumer to have, but
> certainly not enough to solve the abi problem with inferior function
> calls Siva was seeing. Maybe GDB has enough information/smarts, but I
> don't think other consumers have. So an explicit "trivial/non-trivial"
> marker on special member functions seems like a good idea.
>
> But looking at the definition of trivial copy constructor and trivial
> destructor they do look more like class concepts instead of individual
> constructor/destructor concepts (since they rely on properties of other
> members and the base class). Currently GCC doesn't output declarations
> unless the user declares them. So an implicit copy constructor or
> destructor doesn't get a DWARF class member declaration. But I don't
> think a consumer can conclude just from that fact that the copy
> constructor or destructor is trivial. Nor can it assume they are
> non-trivial just because they are represented in DWARF. So should
> we always output them and add a flag value to indicate
> (non-trivialness). Or should we add attributes on the class itself?

I also feel that triviality of special methods is more like a class
concept. Also, this concept is specified by the language.

> Taking a step back and looking at the actual function that is causing
> the trouble because abi/calling convention seems unclear. Which makes me
> wonder if the issue isn't actually with the DWARF declaration of the
> function that has special calling conventions. I am slightly surprised
> the special return value passed in rule isn't expressed in the mangling
> of the function name (or is it?). So the calling convention needs to be
> interpreted from the DWARF representation. We already add a synthetic
> formal parameter for "this" if necessary to be passed in. Why don't we
> just add a similar synthetic "return" formal parameter if that is how
> the function is really being invoked? That seems like a more direct way
> to solve the inferior function call issue.

Triviality (or not) is specified by the language. Similarly, the
'this' pointer is specified by the language. However, function calling
convention is specified by the ABI. ISTR that DWARF cannot/should not
describe the ABI. Maybe I am wrong, but if it is indeed possible to
specify the ABI in DWARF, then I agree that it probably is the best
solution for the function call issue.

Thanks,
Siva Chandra


Re: [PATCH 5/n] OpenMP 4.0 offloading infrastructure: libgomp

2014-10-07 Thread Thomas Schwinge
Hi!

On Tue, 7 Oct 2014 17:51:53 +0400, Ilya Verbin  wrote:
> On 07 Oct 15:06, Jakub Jelinek wrote:
> > Still have issues with the non-installed testing.
> 
> The idea was that the offload compiler should be installed.
> 
> > If I add
> > -B /usr/src/gcc-git/objinst/usr/local/lib/gcc/x86_64-pc-linux-gnu/5.0.0/ \
> > -B /usr/src/gcc-git/objinst/usr/local/libexec/gcc/x86_64-pc-linux-gnu/5.0.0/
> 
> Yes, since lto-wrapper uses COMPILER_PATH + "/accel//" to find
> mkoffload, it requires that the offload compiler with mkoffload are installed.
> Probably, it can be extended to search in the build paths, specified by
> --enable-offload-targets option.
> 
> > to the command line so it at least finds mkoffload, it then can't find for
> > some reason the offload compiler:
> 
> mkoffload itself also wants the offload compiler with correct name
> (-accel--gcc).  It can be extended to use xgcc.  But I don't know,
> how to construct all paths for it (-B, -I, -L)?

For what it's worth, I first build accel-nvptx GCC (in
$T/build-gcc-accel-nvptx/), then "normal" GCC ($PWD, that is, in
$T/build-gcc/), and use the following steps to make offloading work for
build-tree testing of both GCC builds:

[...]
mkdir -p gcc/accel/nvptx-none &&
ln -vsf \
  "$T"/build-gcc-accel-nvptx/gcc/lto1 \
  "$T"/build-gcc-accel-nvptx/gcc/mkoffload \
  "$T"/build-gcc-accel-nvptx/gcc/xgcc \
  gcc/accel/nvptx-none/ &&
cat > gcc/x86_64-unknown-linux-gnu-accel-nvptx-none-gcc <<"EOF" &&
#! /bin/sh
set -e
d=$(dirname "$0")
"$d"/accel/nvptx-none/xgcc -B"$d"/accel/nvptx-none/ "$@"
EOF
chmod +x gcc/x86_64-unknown-linux-gnu-accel-nvptx-none-gcc &&
[...]

> > So, what exactly should be added (by libgomp.exp) so that the testing
> > succeeds in the case of non-installed offload and non-installed host
> > compilers?
> 
> Looks like, that non-installed offload compiler requires some complications.
> Is this really necessary?


Grüße,
 Thomas


Re: [PATCH 5/n] OpenMP 4.0 offloading infrastructure: libgomp

2014-10-07 Thread Ilya Verbin
On 07 Oct 15:06, Jakub Jelinek wrote:
> Still have issues with the non-installed testing.

The idea was that the offload compiler should be installed.

> If I add
> -B /usr/src/gcc-git/objinst/usr/local/lib/gcc/x86_64-pc-linux-gnu/5.0.0/ \
> -B /usr/src/gcc-git/objinst/usr/local/libexec/gcc/x86_64-pc-linux-gnu/5.0.0/

Yes, since lto-wrapper uses COMPILER_PATH + "/accel//" to find
mkoffload, it requires that the offload compiler with mkoffload are installed.
Probably, it can be extended to search in the build paths, specified by
--enable-offload-targets option.

> to the command line so it at least finds mkoffload, it then can't find for
> some reason the offload compiler:

mkoffload itself also wants the offload compiler with correct name
(-accel--gcc).  It can be extended to use xgcc.  But I don't know,
how to construct all paths for it (-B, -I, -L)?

> So, what exactly should be added (by libgomp.exp) so that the testing
> succeeds in the case of non-installed offload and non-installed host
> compilers?

Looks like, that non-installed offload compiler requires some complications.
Is this really necessary?

Thanks,
  -- Ilya


Re: [PATCH 2/2] PR debug/63240 Add DWARF representation for C++11 defaulted member function.

2014-10-07 Thread Siva Chandra
On Mon, Oct 6, 2014 at 5:55 PM, Jason Merrill  wrote:
> On 10/06/2014 08:50 PM, Siva Chandra wrote:
>> But, the question is whether it is required to determine the parameter
>> passing ABI. If there is no special marker to indicate that the user
>> declared 'tor is explicitly defaulted, then GDB could (in the absence
>> of other properties which make the 'tor non-trivial) incorrectly
> conclude that the 'tor is user defined, and hence not-trivial.
>
> I've been thinking that we should just mark the 'tor as trivial or not
> directly rather than hint at it.  Does GDB have enough information to
> determine triviality if we just add defaulted info?

Barring some incompleteness, for which patches are very close to
getting committed, I believe GDB has the rest of the information.
After those patches are committed, the algorithm used by GDB to
determine whether a value is returned in a hidden param or not is as
follows:

1. If the value is of a dynamic class (as in, has virtual bases or
   virtual functions), return in hidden param.
2. Else, go over all methods that are found in the DWARF:
   2a. If a method is marked artificial, ignore it.
   2b. If the method is a copy-constructor or a destructor, conclude
       that a pointer to the value is to be returned in the hidden first
       param.  This is because the presence of a copy-ctor or dtor which
       is not artificial indicates that it was user declared and not
       implicit.
3. If a decision was not made in 2, do 1 and 2 for base class
   subobjects and non-static members.
4. If a decision was not made in 3, then conclude that it should not
   be passed in a hidden param.

If an explicitly defaulted copy-ctor or dtor is not marked as such,
step 2 is broken.
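
To make the steps above concrete, here is a small standalone model of the decision procedure.  It is only a sketch: the ClassInfo and Method types are invented for illustration and do not correspond to GDB's internal data structures; "artificial" stands for a method carrying DW_AT_artificial.

```cpp
#include <vector>

// Hypothetical, simplified view of one method as read from DWARF.
struct Method {
  enum Kind { CopyCtor, Dtor, Other };
  Kind kind;
  bool artificial;  // true if the compiler generated it implicitly
};

// Hypothetical, simplified view of one class type.
struct ClassInfo {
  bool dynamic = false;                    // virtual bases or virtual functions
  std::vector<Method> methods;             // methods found in the DWARF
  std::vector<const ClassInfo *> bases;    // base class subobjects
  std::vector<const ClassInfo *> members;  // class-typed non-static members
};

// True if a value of this class would be returned via a hidden first param.
bool returned_in_hidden_param(const ClassInfo &c) {
  // Step 1: dynamic classes always use the hidden param.
  if (c.dynamic)
    return true;

  // Step 2: a non-artificial copy-ctor or dtor is taken as user declared.
  for (const Method &m : c.methods) {
    if (m.artificial)
      continue;  // 2a
    if (m.kind == Method::CopyCtor || m.kind == Method::Dtor)
      return true;  // 2b
  }

  // Step 3: repeat steps 1 and 2 for bases and non-static members.
  for (const ClassInfo *b : c.bases)
    if (returned_in_hidden_param(*b))
      return true;
  for (const ClassInfo *f : c.members)
    if (returned_in_hidden_param(*f))
      return true;

  // Step 4: nothing forced the hidden param.
  return false;
}
```

In this model an explicitly defaulted copy-ctor shows up as a non-artificial CopyCtor entry and is indistinguishable from a user-provided one, so step 2b answers "hidden param" for it; that is the breakage described above.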


Re: [Patch ARM-AArch64/testsuite v2 01/21] Neon intrinsics execution tests initial framework.

2014-10-07 Thread Christophe Lyon
On 1 October 2014 17:11, Marcus Shawcroft  wrote:
> On 30 September 2014 15:27, Christophe Lyon  wrote:
>> On 10 July 2014 12:12, Marcus Shawcroft  wrote:
>>> On 1 July 2014 11:05, Christophe Lyon  wrote:
>>>> * documentation (README)
>>>> * dejagnu driver (neon-intrinsics.exp)
>>>> * support macros (arm-neon-ref.h, compute-ref-data.h)
>>>> * Tests for 3 intrinsics: vaba, vld1, vshl
>>>
>>> Hi, The terminology in armv8 is advsimd rather than neon.  Can we
>>> rename neon-intrinsics to advsimd-intrinsics or simd-intrinsics
>>> throughout please.  The existing gcc.target/aarch64/simd directory of
>>> tests will presumably be superseded by this more comprehensive set of
>>> tests so I suggest these tests go in gcc.target/aarch64/advsimd and we
>>> eventually remove gcc.target/aarch64/simd/ directory.
>>>
>>> GNU style should apply throughout this patch series, notably double
>>> space after period in comments and README text.  Space before left
>>> parenthesis in function/macro call and function declaration.  The
>>> function name in a declaration goes on a new line.  The GCC wiki notes
>>> on test case state individual test should have file names ending in
>>> _, see here https://gcc.gnu.org/wiki/TestCaseWriting
>>>
>>
>> Hi,
>>
>> For the record, these tests are based on a testsuite I wrote quite
>> some time ago:
>> https://gitorious.org/arm-neon-tests/
>>
>> where obviously I had no such requirement (and v8 wasn't public yet)
>>
>> So I prefer to apply the changes you request in my main version before
>> re-submitting it here.
>> (libsanitizer-style, sort-of).
>>
>> This will take me some time, so the next version of my patch series
>> should not be expected really soon :-(
>
>
Ramana, Marcus,

> Hi Christophe,   Given that this test suite code is an existing body
> of work I see no reason to impose the GNU style change I originally
> asked for. I withdraw my original comment that these patches should
> conform to GNU style.  My comment on file names is also withdrawn.  I
> would like to see the terminology corrected.
>

Thanks, I have updated my patch according to this.

But meanwhile I have also updated my testsuite, and fixed the #define
flag I used to toggle float16 tests: I now use __ARM_FP16_FORMAT_IEEE,
such as:
#if defined(__ARM_FP16_FORMAT_IEEE)
  TEST_VLD1(vector, buffer, , float, f, 16, 4);
  TEST_VLD1(vector, buffer, q, float, f, 16, 8);
#endif

Which reminded me that:
- on ARM (AArch32), float16x4_t is supported, but float16x8_t isn't yet
- on AArch64, -mfp16-format=ieee is rejected, and I didn't see a
similar option in the doc

What do you prefer me to do for these tests? I can think of:
- do not include them at all until fp16 is fully supported on both
AArch32 and AArch64
- include only those with float16x4_t
- include both float16x4_t and float16x8_t tests, leaving float16x8_t commented
- include both, uncommented, but do not test with -mfp16-format=ieee

Thanks,

Christophe.


> Thanks
> /Marcus


Re: [PATCH 5/n] OpenMP 4.0 offloading infrastructure: libgomp

2014-10-07 Thread Jakub Jelinek
On Mon, Oct 06, 2014 at 07:53:17PM +0400, Ilya Verbin wrote:
> This patch adds plugin support to libgomp, as well as memory mapping and
> interaction with target devices through plugin's interface.

Still have issues with the non-installed testing.

( mkdir objmic && cd objmic && ../configure \
--build=x86_64-intelmicemul-linux-gnu \
--host=x86_64-intelmicemul-linux-gnu --target=x86_64-intelmicemul-linux-gnu \
--enable-as-accelerator-for=x86_64-pc-linux-gnu --disable-bootstrap \
&& make && make install DESTDIR=`cd ..; pwd`/objinst )
( mkdir objhost && cd objhost && ../configure --build=x86_64-pc-linux-gnu \
--host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu \
--enable-offload-targets=x86_64-intelmicemul-linux-gnu=/usr/src/gcc-git/objmic \
--disable-bootstrap && make )
( mkdir objhost2 && cd objhost2 && ../configure --build=x86_64-pc-linux-gnu \
--host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu \
--enable-offload-targets=x86_64-intelmicemul-linux-gnu=/usr/src/gcc-git/objinst/usr/local \
--disable-bootstrap && make )

All 3 succeeded for me.

Now, in objhost make check-target-libgomp doesn't really work, in objhost2
it does.

E.g. trying to link target-1.exe, I get:

lto-wrapper: fatal error: Problem with building target image for x86_64-intelmicemul-linux-gnu.

compilation terminated.
/usr/bin/ld: lto-wrapper failed
collect2: error: ld returned 1 exit status

If I add
-B /usr/src/gcc-git/objinst/usr/local/lib/gcc/x86_64-pc-linux-gnu/5.0.0/ \
-B /usr/src/gcc-git/objinst/usr/local/libexec/gcc/x86_64-pc-linux-gnu/5.0.0/
to the command line so it at least finds mkoffload, it then can't find for
some reason the offload compiler:

(null): fatal error: offload compiler x86_64-pc-linux-gnu-accel-x86_64-intelmicemul-linux-gnu-gcc not found.
compilation terminated.
lto-wrapper: fatal error: /usr/src/gcc-git/objinst/usr/local/libexec/gcc/x86_64-pc-linux-gnu/5.0.0//accel/x86_64-intelmicemul-linux-gnu/mkoffload returned 1 exit status
compilation terminated.
/usr/bin/ld: lto-wrapper failed
collect2: error: ld returned 1 exit status

So, what exactly should be added (by libgomp.exp) so that the testing succeeds
in the case of non-installed offload and non-installed host compilers?

Jakub


[PATCH] PR lto/59441 Add initialization and release of bitmap obstack

2014-10-07 Thread Ilya Palachev

Hi all,

Attached patch fixes PR lto/59441.
The failure occurred because the default bitmap obstack was released
just before the execution of early local passes.
The error was found using valgrind, which reported 153 invalid reads
and 173 invalid writes into the field of the default bitmap obstack
structure; all of them were trying to access data that had been freed
previously (at the same point of the program).

The solution is to add initialization and release of the bitmap obstack
before and after the execution of early local passes.
After applying this patch valgrind does not report any errors for the 
same testcase.
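
The bracketing idea can be shown with a toy model.  This is only a sketch under stated assumptions: ToyObstack and ScopedObstack are invented names, the nesting counter is an assumption of the model, and the patch itself simply calls bitmap_obstack_initialize (NULL) and bitmap_obstack_release (NULL) around the pass execution.

```cpp
#include <cassert>

// Toy stand-in for a shared default obstack.
struct ToyObstack {
  int depth = 0;      // number of active initialize calls
  bool live = false;  // whether the backing storage may be used

  void initialize() {
    if (depth++ == 0)
      live = true;
  }
  void release() {
    assert(depth > 0);
    if (--depth == 0)
      live = false;
  }
};

// Pairs initialize with release, even on early exit from the scope.
class ScopedObstack {
 public:
  explicit ScopedObstack(ToyObstack &o) : o_(o) { o_.initialize(); }
  ~ScopedObstack() { o_.release(); }
  ScopedObstack(const ScopedObstack &) = delete;
  ScopedObstack &operator=(const ScopedObstack &) = delete;

 private:
  ToyObstack &o_;
};

// Stand-in for a pass that needs a live obstack; reports whether it had one.
bool run_pass(const ToyObstack &o) { return o.live; }

// Run the pass with the obstack guaranteed live for its duration.
bool run_pass_guarded(ToyObstack &o) {
  ScopedObstack guard(o);
  return run_pass(o);
}
```

In this model run_pass on a released obstack reports failure, while run_pass_guarded always sees a live one and leaves the obstack in the state it found it.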


The patch was bootstrapped and regtested on x86_64-unknown-linux-gnu.

Ok for trunk?

Best regards,
Ilya Palachev
From 9bf2878c0a74475283b5424f24e46b31feb13cf7 Mon Sep 17 00:00:00 2001
From: Ilya Palachev 
Date: Tue, 7 Oct 2014 16:09:25 +0400
Subject: [PATCH] Add initialization and release of bitmap obstack

gcc/

2014-10-07  Ilya Palachev  

	* cgraphunit.c (process_new_functions): Add initialization and
	release of bitmap obstack before and after running of passes.

gcc/testsuite/

2014-10-07  Ilya Palachev  

	* g++.dg/lto/pr59441_0.C: New test from bugzilla.
---
 gcc/cgraphunit.c |  6 +-
 gcc/testsuite/g++.dg/lto/pr59441_0.C | 26 ++
 2 files changed, 31 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/lto/pr59441_0.C

diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index d463505..ee42ad1 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -323,7 +323,11 @@ symbol_table::process_new_functions (void)
 	  push_cfun (DECL_STRUCT_FUNCTION (fndecl));
 	  if (state == IPA_SSA
 	  && !gimple_in_ssa_p (DECL_STRUCT_FUNCTION (fndecl)))
-	g->get_passes ()->execute_early_local_passes ();
+	{
+	  bitmap_obstack_initialize (NULL);
+	  g->get_passes ()->execute_early_local_passes ();
+	  bitmap_obstack_release (NULL);
+	}
 	  else if (inline_summary_vec != NULL)
 	compute_inline_parameters (node, true);
 	  free_dominance_info (CDI_POST_DOMINATORS);
diff --git a/gcc/testsuite/g++.dg/lto/pr59441_0.C b/gcc/testsuite/g++.dg/lto/pr59441_0.C
new file mode 100644
index 000..3c766e5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/lto/pr59441_0.C
@@ -0,0 +1,26 @@
+// { dg-lto-do assemble }
+// { dg-lto-options { { -shared -fPIC -flto -O -fvtable-verify=std } } }
+
+template < typename T > struct A
+{
+  T foo ();
+};
+
+template < typename T > struct C: virtual public A < T >
+{
+  C & operator<< (C & (C &));
+};
+
+template < typename T >
+C < T > &endl (C < int > &c)
+{
+  c.foo ();
+return c;
+}
+
+C < int > cout;
+void
+fn ()
+{
+  cout << endl;
+}
-- 
2.1.1



[C PATCH] Print header hints (PR c/59717)

2014-10-07 Thread Marek Polacek
PR59717 is a request for hints which header to include if the compiler warns
about incompatible implicit declarations.  E.g., if one uses abort
without declaring it first, we now say
note: include ‘<stdlib.h>’ or provide a declaration of ‘abort’
I've added hints only for standard functions which means we won't display
the hint for functions such as mempcpy.

The implementation is based on a function that just maps built_in_function
codes to header names.
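
A much-reduced sketch of such a mapping, for illustration only. ToyBuiltin, toy_header_for_builtin and toy_hint are hypothetical names, the enum is not GCC's built_in_function, and returning a null pointer for "no hint" is an assumption of this sketch.

```cpp
#include <cassert>
#include <string>

// Hypothetical, tiny stand-in for a table of built-in function codes.
enum class ToyBuiltin { Abort, Memcpy, Sqrt, Cabs, Mempcpy };

// Map a function code to the standard header that declares it, or
// nullptr when no hint should be given (e.g. the non-standard mempcpy).
inline const char *toy_header_for_builtin(ToyBuiltin code) {
  switch (code) {
  case ToyBuiltin::Abort:   return "<stdlib.h>";
  case ToyBuiltin::Memcpy:  return "<string.h>";
  case ToyBuiltin::Sqrt:    return "<math.h>";
  case ToyBuiltin::Cabs:    return "<complex.h>";
  case ToyBuiltin::Mempcpy: return nullptr;
  }
  return nullptr;
}

// Build the text of the note, or an empty string when there is no hint.
inline std::string toy_hint(ToyBuiltin code, const char *name) {
  assert(name != nullptr);
  const char *header = toy_header_for_builtin(code);
  if (header == nullptr)
    return std::string();
  return std::string("include '") + header +
         "' or provide a declaration of '" + name + "'";
}
```

With this sketch, toy_hint (ToyBuiltin::Abort, "abort") yields a note naming <stdlib.h>, and the mempcpy case yields no note at all.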

Two remarks:
* header_for_builtin_fn is long and I don't want to unnecessarily
  inflate already big c-decl.c file, so it might make sense to move
  the function into c-errors.c;
* we don't issue "incompatible implicit declaration of built-in function"
  warning for functions that return int and whose parameter types don't need
  default promotions - for instance putc, fputs, ilogb, strcmp, vprintf, isnan,
  isalpha, ...  Therefore for such functions we don't print the hint either.
  header_for_builtin_fn is ready for them, though.  (The cases for 
  and  could be removed.)

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2014-10-07  Marek Polacek  

PR c/59717
* c-decl.c (header_for_builtin_fn): New function.
(implicitly_declare): Suggest which header to include.

* gcc.dg/pr59717.c: New test.

diff --git gcc/c/c-decl.c gcc/c/c-decl.c
index ce5a8de..e23284a 100644
--- gcc/c/c-decl.c
+++ gcc/c/c-decl.c
@@ -2968,6 +2968,189 @@ implicit_decl_warning (location_t loc, tree id, tree 
olddecl)
 }
 }
 
+/* This function represents mapping of a function code FCODE
+   to its respective header.  */
+
+static const char *
+header_for_builtin_fn (enum built_in_function fcode)
+{
+  switch (fcode)
+{
+CASE_FLT_FN (BUILT_IN_ACOS):
+CASE_FLT_FN (BUILT_IN_ACOSH):
+CASE_FLT_FN (BUILT_IN_ASIN):
+CASE_FLT_FN (BUILT_IN_ASINH):
+CASE_FLT_FN (BUILT_IN_ATAN):
+CASE_FLT_FN (BUILT_IN_ATANH):
+CASE_FLT_FN (BUILT_IN_ATAN2):
+CASE_FLT_FN (BUILT_IN_CBRT):
+CASE_FLT_FN (BUILT_IN_CEIL):
+CASE_FLT_FN (BUILT_IN_COPYSIGN):
+CASE_FLT_FN (BUILT_IN_COS):
+CASE_FLT_FN (BUILT_IN_COSH):
+CASE_FLT_FN (BUILT_IN_ERF):
+CASE_FLT_FN (BUILT_IN_ERFC):
+CASE_FLT_FN (BUILT_IN_EXP):
+CASE_FLT_FN (BUILT_IN_EXP2):
+CASE_FLT_FN (BUILT_IN_EXPM1):
+CASE_FLT_FN (BUILT_IN_FABS):
+CASE_FLT_FN (BUILT_IN_FDIM):
+CASE_FLT_FN (BUILT_IN_FLOOR):
+CASE_FLT_FN (BUILT_IN_FMA):
+CASE_FLT_FN (BUILT_IN_FMAX):
+CASE_FLT_FN (BUILT_IN_FMIN):
+CASE_FLT_FN (BUILT_IN_FMOD):
+CASE_FLT_FN (BUILT_IN_FREXP):
+CASE_FLT_FN (BUILT_IN_HYPOT):
+CASE_FLT_FN (BUILT_IN_ILOGB):
+CASE_FLT_FN (BUILT_IN_LDEXP):
+CASE_FLT_FN (BUILT_IN_LGAMMA):
+CASE_FLT_FN (BUILT_IN_LLRINT):
+CASE_FLT_FN (BUILT_IN_LLROUND):
+CASE_FLT_FN (BUILT_IN_LOG):
+CASE_FLT_FN (BUILT_IN_LOG10):
+CASE_FLT_FN (BUILT_IN_LOG1P):
+CASE_FLT_FN (BUILT_IN_LOG2):
+CASE_FLT_FN (BUILT_IN_LOGB):
+CASE_FLT_FN (BUILT_IN_LRINT):
+CASE_FLT_FN (BUILT_IN_LROUND):
+CASE_FLT_FN (BUILT_IN_MODF):
+CASE_FLT_FN (BUILT_IN_NAN):
+CASE_FLT_FN (BUILT_IN_NEARBYINT):
+CASE_FLT_FN (BUILT_IN_NEXTAFTER):
+CASE_FLT_FN (BUILT_IN_NEXTTOWARD):
+CASE_FLT_FN (BUILT_IN_POW):
+CASE_FLT_FN (BUILT_IN_REMAINDER):
+CASE_FLT_FN (BUILT_IN_REMQUO):
+CASE_FLT_FN (BUILT_IN_RINT):
+CASE_FLT_FN (BUILT_IN_ROUND):
+CASE_FLT_FN (BUILT_IN_SCALBLN):
+CASE_FLT_FN (BUILT_IN_SCALBN):
+CASE_FLT_FN (BUILT_IN_SIN):
+CASE_FLT_FN (BUILT_IN_SINH):
+CASE_FLT_FN (BUILT_IN_SINCOS):
+CASE_FLT_FN (BUILT_IN_SQRT):
+CASE_FLT_FN (BUILT_IN_TAN):
+CASE_FLT_FN (BUILT_IN_TANH):
+CASE_FLT_FN (BUILT_IN_TGAMMA):
+CASE_FLT_FN (BUILT_IN_TRUNC):
+case BUILT_IN_ISINF:
+case BUILT_IN_ISNAN:
+  return "<math.h>";
+CASE_FLT_FN (BUILT_IN_CABS):
+CASE_FLT_FN (BUILT_IN_CACOS):
+CASE_FLT_FN (BUILT_IN_CACOSH):
+CASE_FLT_FN (BUILT_IN_CARG):
+CASE_FLT_FN (BUILT_IN_CASIN):
+CASE_FLT_FN (BUILT_IN_CASINH):
+CASE_FLT_FN (BUILT_IN_CATAN):
+CASE_FLT_FN (BUILT_IN_CATANH):
+CASE_FLT_FN (BUILT_IN_CCOS):
+CASE_FLT_FN (BUILT_IN_CCOSH):
+CASE_FLT_FN (BUILT_IN_CEXP):
+CASE_FLT_FN (BUILT_IN_CIMAG):
+CASE_FLT_FN (BUILT_IN_CLOG):
+CASE_FLT_FN (BUILT_IN_CONJ):
+CASE_FLT_FN (BUILT_IN_CPOW):
+CASE_FLT_FN (BUILT_IN_CPROJ):
+CASE_FLT_FN (BUILT_IN_CREAL):
+CASE_FLT_FN (BUILT_IN_CSIN):
+CASE_FLT_FN (BUILT_IN_CSINH):
+CASE_FLT_FN (BUILT_IN_CSQRT):
+CASE_FLT_FN (BUILT_IN_CTAN):
+CASE_FLT_FN (BUILT_IN_CTANH):
+  return "<complex.h>";
+case BUILT_IN_MEMCHR:
+case BUILT_IN_MEMCMP:
+case BUILT_IN_MEMCPY:
+case BUILT_IN_MEMMOVE:
+case BUILT_IN_MEMSET:
+case BUILT_IN_STRCAT:
+case BUILT_IN_STRCHR:
+case BUILT_IN_STRCMP:
+case BUILT_IN_STRCPY:
+case BUILT_IN_STRCSPN:
+case BUILT_IN_STRLEN:
+case BUILT_IN_STRNCAT:
+case BUILT_IN_STRNCMP:
+case BUILT_IN_STRNCPY:
+case BUILT_IN_STRPBRK:
+ca

Re: RFA: Merge definitions of get_some_local_dynamic_name

2014-10-07 Thread Richard Sandiford
Richard Sandiford  writes:
> Rainer Orth  writes:
>> Hi Richard,
>>> Rainer Orth  writes:
 Hi Richard,
>> It seems the new get_some_local_dynamic_name implementation in
>> function.c lost the non-NULL check the sparc.c version had.  I'm
>> currently testing the following patch:
>
> Could you do a:
>
>   call debug_rtx (...)
>
> on the insn that contains a null pointer?  Normally insn patterns
> shouldn't contain nulls, so I was wondering whether this was some
> SPARC-specific construct.

 proved a bit difficult to do: at the default -O2, insn was optimized
 away, at -g3 -O0, I only got

 can't compute CFA for this frame

 with gdb 7.8 even after recompiling all of the gcc dir with -g3 -O0.

 Here's what I find after inserting the call in the source:

 (insn 30 6 28 (sequence [
 (call_insn:TI 8 6 7 (parallel [
 (set (reg:SI 8 %o0)
 (call (mem:SI (unspec:SI [
 (symbol_ref:SI 
 ("__tls_get_addr"))
 ] UNSPEC_TLSLDM) [0  S4 A32])
 (const_int 1 [0x1])))
 (clobber (reg:SI 15 %o7))
 ]) 
 /vol/gcc/src/hg/trunk/local/libgo/runtime/proc.c:936 390 {tldm_call32}
  (expr_list:REG_EH_REGION (const_int -2147483648 
 [0x8000])
 (nil))
 (expr_list (use (reg:SI 8 %o0))
 (nil)))
 (insn 7 8 28 (set (reg:SI 8 %o0)
 (plus:SI (reg:SI 23 %l7)
 (unspec:SI [
 (reg:SI 8 %o0 [112])
 ] UNSPEC_TLSLDM))) 388 {tldm_add32}
  (nil))
 ]) /vol/gcc/src/hg/trunk/local/libgo/runtime/proc.c:936 -1
  (nil))
>>>
>>> Bah, a sequence.  Hadn't thought of that.
>>>
>>> IMO it's a bug for a walk on a PATTERN to pull in non-PATTERN parts
>>> of an insn.  We should really be looking at the patterns of the two
>>> subinsns instead and ignore the other stuff.  Let me have a think
>>> about it.
>>
>> did you come to a conclusion here?
>
> Sorry, forgot to come back to this.  I have a patch that iterates over
> PATTERNs of a SEQUENCE if the SEQUENCE (rather than its containing insn)
> is the topmost iterated rtx.  So if PATTERN (insn) is a SEQUENCE:
>
>FOR_EACH_SUBRTX (, insn, x)
>  ...
>
> will iterate over the insns in the SEQUENCE (including pattern, notes,
> jump label, etc.), whereas:
>
>FOR_EACH_SUBRTX (, PATTERN (insn), x)
>  ...
>
> would only iterate over the patterns of the insns in the SEQUENCE.
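
The difference between the two walks can be sketched with a toy model. This is not the rtlanal.c iterator: ToyRtx, toy_insn_p and toy_walk are hypothetical, and an "insn" here has just a pattern and one note.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy rtx: a leaf, an insn (pattern + note), or a SEQUENCE of insns.
struct ToyRtx {
  enum Kind { Leaf, Insn, Sequence } kind;
  std::string name;
  std::vector<const ToyRtx *> ops;  // Insn: {pattern, note}; Sequence: insns
};

inline bool toy_insn_p(const ToyRtx *x) { return x->kind == ToyRtx::Insn; }

// Return the names in the order a worklist walk starting at ROOT visits
// them.  If ROOT itself is a SEQUENCE, only the patterns (ops[0]) of its
// insns are queued; otherwise every operand of every rtx is queued.
// Operands are pushed in reverse so the first one ends up on top.
inline std::vector<std::string> toy_walk(const ToyRtx *root) {
  std::vector<std::string> seen;
  std::vector<const ToyRtx *> queue{root};
  while (!queue.empty()) {
    const ToyRtx *x = queue.back();
    queue.pop_back();
    seen.push_back(x->name);
    if (x == root && x->kind == ToyRtx::Sequence) {
      for (auto it = x->ops.rbegin(); it != x->ops.rend(); ++it) {
        assert(toy_insn_p(*it) && !(*it)->ops.empty());
        queue.push_back((*it)->ops[0]);
      }
    } else {
      for (auto it = x->ops.rbegin(); it != x->ops.rend(); ++it)
        queue.push_back(*it);
    }
  }
  return seen;
}
```

Starting the walk at the containing insn reaches the sub-insns and their notes; starting it at the SEQUENCE reaches only the sub-insn patterns.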

Does this work for you?  I tested it on x86_64-linux-gnu but obviously
that's not particularly useful for SEQUENCEs.

Thanks,
Richard

gcc/
* rtlanal.c (generic_subrtx_iterator <T>::add_subrtxes_to_queue):
Add the parts of an insn in reverse order, with the pattern at
the top of the queue.  Detect when we're iterating over a SEQUENCE
pattern and in that case just consider patterns of subinstructions.

Index: gcc/rtlanal.c
===
--- gcc/rtlanal.c   2014-09-25 16:40:44.944406590 +0100
+++ gcc/rtlanal.c   2014-10-07 13:13:57.698132753 +0100
@@ -128,29 +128,58 @@ generic_subrtx_iterator ::add_subrtxe
value_type *base,
size_t end, rtx_type x)
 {
-  const char *format = GET_RTX_FORMAT (GET_CODE (x));
+  enum rtx_code code = GET_CODE (x);
+  const char *format = GET_RTX_FORMAT (code);
   size_t orig_end = end;
-  for (int i = 0; format[i]; ++i)
-if (format[i] == 'e')
-  {
-   value_type subx = T::get_value (x->u.fld[i].rt_rtx);
-   if (__builtin_expect (end < LOCAL_ELEMS, true))
- base[end++] = subx;
-   else
- base = add_single_to_queue (array, base, end++, subx);
-  }
-else if (format[i] == 'E')
-  {
-   int length = GET_NUM_ELEM (x->u.fld[i].rt_rtvec);
-   rtx *vec = x->u.fld[i].rt_rtvec->elem;
-   if (__builtin_expect (end + length <= LOCAL_ELEMS, true))
- for (int j = 0; j < length; j++)
-   base[end++] = T::get_value (vec[j]);
-   else
- for (int j = 0; j < length; j++)
-   base = add_single_to_queue (array, base, end++,
-   T::get_value (vec[j]));
-  }
+  if (__builtin_expect (INSN_P (x), false))
+{
+  /* Put the pattern at the top of the queue, since that's what
+we're likely to want most.  It also allows for the SEQUENCE
+code below.  */
+  for (int i = GET_RTX_LENGTH (GET_CODE (x)) - 1; i >= 0; --i)
+   if (format[i] == 'e')
+ {
+

Re: [wwwdocs] Add feature-testing macros and std::is_final to gcc-5/changes.html

2014-10-07 Thread Jonathan Wakely

On 07/10/14 08:39 -0400, Ed Smith-Rowland wrote:
OK, here is a patch for both using typename as a class key for 
template template parms and for __has_include, etc.

Are these too wordy?


They look OK to me, although you say "__has_include_next and
__has_include_next" in both places, the first should be just
__has_include.

I can make that change and commit it, thanks.


Re: [wwwdocs] Add feature-testing macros and std::is_final to gcc-5/changes.html

2014-10-07 Thread Ed Smith-Rowland

On 10/02/2014 10:24 AM, Jonathan Wakely wrote:

On 02/10/14 10:09 -0400, Ed Smith-Rowland wrote:

On 10/02/2014 06:14 AM, Jonathan Wakely wrote:

On 02/10/14 11:12 +0100, Jonathan Wakely wrote:

Note Ed's recent changes. Committed to CVS.


And fix a markup error that I expected xmllint to catch :-(
Thank you! I tried to do this and couldn't for permissions.  I'm 
probably not doing it right.


If I remember my cvs-fu you need CVS_RSH=ssh and use
CVSROOT=:ext:$USER@gcc.gnu.org:/cvs/gcc (with your sourceware.org
username as $USER) and then it should work over SSH just like svn and
git.

Anyway, the real thing I wanted to suggest is we put a line for 
C-family about the availability of __has_include and 
__has_include_next.  We could mention clang has it.


Good idea, I'm happy to commit a patch if you can prepare something.


OK, here is a patch for both using typename as a class key for template 
template parms and for __has_include, etc.

Are these too wordy?

Ed

Index: htdocs/gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.14
diff -r1.14 changes.html
56a57,81
> New preprocessor constructs, __has_include_next
> and __has_include_next, to test the availability of 
> headers
> have been added.
> This demonstrates a way to include the header <optional>
> only if it is available:
> 
> #ifdef __has_include
> #  if __has_include(<optional>)
> #include <optional>
> #define have_optional 1
> #  elif __has_include(<experimental/optional>)
> #include <experimental/optional>
> #define have_optional 1
> #define experimental_optional
> #  else
> #define have_optional 0
> #  endif
> #endif
> 
> The header search paths for __has_include_next
> and __has_include_next are equivalent to those
> of the standard directive #include
> and the extension #include_next respectively.
> 
> 
88a114,117
>   G++ now allows typename in a template template parameter.
> 
>   template<template<typename> typename X> struct D; // OK
> 


[wwwdocs] Update libstdc++ section of gcc-5/changes.html

2014-10-07 Thread Jonathan Wakely

Document the latest additions.

? gcc-5/.changes.html.swp
Index: gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.14
diff -u -u -r1.14 changes.html
--- gcc-5/changes.html  2 Oct 2014 10:13:36 -   1.14
+++ gcc-5/changes.html  7 Oct 2014 12:34:27 -
@@ -96,15 +96,33 @@
  std::deque meets the allocator-aware container 
requirements;
  movable and swappable iostream classes;
  support for std::aligned_union;
+ I/O manipulators std::hexfloat and
+std::defaultfloat;
+
   
 
+ Support for the C++11 hexfloat manipulator changes how
+  the num_put facet formats floating point types when
+  ios_base::fixed|ios_base::scientific is set in a stream's
+  fmtflags. This change affects all language modes, even
+  though the C++98 standard gave no special meaning to that combination
+  of flags. To prevent the use of hexadecimal notation for floating point
+  types use str.unsetf(std::ios_base::floatfield) to clear
+  the relevant bits in str.flags().
+
 https://gcc.gnu.org/onlinedocs/libstdc++/manual/status.html#status.iso.2014";>
   Improved experimental support for C++14, including:
   
  std::is_final type trait; 
   
 
-An implementation of std::experimental::any.
+https://gcc.gnu.org/onlinedocs/libstdc++/manual/status.html#status.iso.2014";>
+  Improved experimental support for the Library Fundamentals TS, 
including:
+  
+ Class std::experimental::any; 
+ Function template std::experimental::apply; 
+  
+
 New random number distributions logistic_distribution and
   uniform_on_sphere_distribution as extensions.
 https://sourceware.org/gdb/current/onlinedocs/gdb/Xmethods-In-Python.html";>GDB
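
Not part of the patch: a minimal sketch of the formatting change described in the hexfloat note above, using only standard iostream calls. The helper names toy_format_hex and toy_format_default are hypothetical, and the exact hexadecimal digits printed are left unspecified here because they depend on the C library.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Format VALUE with both fixed and scientific set in floatfield, which
// C++11 defines as hexadecimal floating-point output (the same state
// that std::hexfloat selects).
inline std::string toy_format_hex(double value) {
  std::ostringstream os;
  os.setf(std::ios_base::fixed | std::ios_base::scientific,
          std::ios_base::floatfield);
  os << value;
  return os.str();
}

// Same stream setup, but clear the floatfield bits before output to get
// the default decimal notation back.
inline std::string toy_format_default(double value) {
  std::ostringstream os;
  os.setf(std::ios_base::fixed | std::ios_base::scientific,
          std::ios_base::floatfield);
  os.unsetf(std::ios_base::floatfield);
  os << value;
  return os.str();
}
```

Code that set both flags under C++98, where the combination had no special meaning, can call unsetf as in the second helper to keep decimal output.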


Re: [Java PATCH] Generate declarations in jvgenmain.c

2014-10-07 Thread Andrew Haley
On 10/07/2014 09:31 AM, Marek Polacek wrote:
> Bootstrapped/regtested on x86_64-linux, ok for trunk?

OK, thanks.

Andrew.



Re: [PATCH Fortran] move more diagnostics to the common machinery (try 2)

2014-10-07 Thread Tobias Burnus
Manuel López-Ibáñez wrote:
> and of course, with -Werror=missing-include-dirs you get:
> 
> f951: Error: Nonexistent include directory [...]
> [-Werror=missing-include-dirs]
> f951: some warnings being treated as errors
>
> plus colors!

Awesome!

> Bootstrapped and regression tested on x86_64-linux-gnu.
> OK?

OK. Thanks for the patch!

Tobias


> gcc/fortran/ChangeLog:
>
> 2014-10-04  Manuel López-Ibáñez  
> 
> * gfortran.h (gfc_warning_cmdline): Add overload that takes an
> option.
> (gfc_error_cmdline): Declare.
> * error.c (gfc_warning_cmdline): New overload that takes an option.
> (gfc_error_cmdline): New.
> * lang.opt (Wmissing-include-dirs): New.
> * scanner.c (add_path_to_list): Use the new functions.
> (load_file): Likewise.
> * options.c (gfc_init_options): Wmissing-include-dirs is enabled
> by default in Fortran.
> (gfc_handle_option): Accept automatically handled options.


Re: [PATCH 2/2] PR debug/63240 Add DWARF representation for C++11 defaulted member function.

2014-10-07 Thread Mark Wielaard
On Mon, 2014-10-06 at 20:55 -0400, Jason Merrill wrote:
> On 10/06/2014 08:50 PM, Siva Chandra wrote:
> > On Sat, Oct 4, 2014 at 11:14 AM, Jason Merrill  wrote:
> >> On 10/03/2014 05:41 PM, Siva Chandra wrote:
> >>>
> >>> I understand that knowing whether a copy-ctor or a d-tor has been
> >>> explicitly defaulted is not sufficient to determine the parameter
> >>> passing ABI. However, why is it not necessary? I could be wrong, but
> >>> doesn't the example I have given show that it is necessary?
> >>
> >> An implicitly declared 'tor can also be trivial.
> >
> > But, the question is whether it is required to determine the parameter
> > passing ABI. If there is no special marker to indicate that the user
> > declared 'tor is explicitly defaulted, then GDB could (in the absence
> > of other properties which make the 'tor non-trivial) incorrectly
> > conclude that the 'tor is user defined, and hence not-trivial.
> 
> I've been thinking that we should just mark the 'tor as trivial or not 
> directly rather than hint at it.  Does GDB have enough information to 
> determine triviality if we just add defaulted info?

To be honest my original patches for a deleted/defaulted markers on
special member functions was really just meant to give the consumer a
way to know why GCC produced a declaration in the first place. Which I
still think is useful information for the consumer to have, but
certainly not enough to solve the abi problem with inferior function
calls Siva was seeing. Maybe GDB has enough information/smarts, but I
don't think other consumers have. So an explicit "trivial/non-trivial"
marker on special member functions seems like a good idea.

But looking at the definition of trivial copy constructor and trivial
destructor they do look more like class concepts instead of individual
constructor/destructor concepts (since they rely on properties of other
members and the base class). Currently GCC doesn't output declarations
unless the user declares them. So an implicit copy constructor or
destructor doesn't get a DWARF class member declaration. But I don't
think a consumer can conclude just from that fact that the copy
constructor or destructor is trivial. Nor can it assume they are
non-trivial just because they are represented in DWARF. So should
we always output them and add a flag value to indicate
(non-)trivialness? Or should we add attributes on the class itself?
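
To make the point concrete, a small sketch using the standard type traits. The Toy* classes and toy_trivial_copy are hypothetical and only illustrate that "defaulted" and "trivial" are different properties, and that triviality depends on the members.

```cpp
#include <cassert>
#include <type_traits>

// Explicitly defaulted on first declaration: still trivial.
struct ToyDefaulted {
  ToyDefaulted() = default;
  ToyDefaulted(const ToyDefaulted &) = default;
  int x;
};

// User-provided copy constructor: never trivial.
struct ToyUserProvided {
  ToyUserProvided() = default;
  ToyUserProvided(const ToyUserProvided &other) : x(other.x) {}
  int x;
};

// Defaulted here too, but a member's copy constructor is non-trivial,
// so this class's copy constructor is non-trivial as well.
struct ToyDefaultedWithMember {
  ToyDefaultedWithMember() = default;
  ToyDefaultedWithMember(const ToyDefaultedWithMember &) = default;
  ToyUserProvided member;
};

// No declaration at all: the implicit copy constructor is trivial.
struct ToyImplicit {
  int x;
};

template <typename T>
constexpr bool toy_trivial_copy() {
  return std::is_trivially_copy_constructible<T>::value;
}

static_assert(toy_trivial_copy<ToyDefaulted>(), "defaulted is trivial");
static_assert(!toy_trivial_copy<ToyUserProvided>(), "user-provided is not");
static_assert(!toy_trivial_copy<ToyDefaultedWithMember>(), "member decides");
static_assert(toy_trivial_copy<ToyImplicit>(), "implicit is trivial");
```

ToyDefaulted and ToyDefaultedWithMember declare their copy constructors identically, yet only the first is trivial, so a "defaulted" marker alone would not tell a consumer which case it is looking at.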

Taking a step back and looking at the actual function that is causing
the trouble because abi/calling convention seems unclear. Which makes me
wonder if the issue isn't actually with the DWARF declaration of the
function that has special calling conventions. I am slightly surprised
the special return value passed in rule isn't expressed in the mangling
of the function name (or is it?). So the calling convention needs to be
interpreted from the DWARF representation. We already add a synthetic
formal parameter for "this" if necessary to be passed in. Why don't we
just add a similar synthetic "return" formal parameter if that is how
the function is really being invoked? That seems like a more direct way
to solve the inferior function call issue.

Cheers,

Mark




Re: [gofrontend-dev] [PATCH 5/9] Gccgo port to s390[x] -- part I

2014-10-07 Thread Dominik Vogt
On Mon, Oct 06, 2014 at 07:29:33AM -0700, Ian Lance Taylor wrote:
> On Mon, Oct 6, 2014 at 12:42 AM, Dominik Vogt  wrote:
> > On s390[x] the symbol value of a section symbol is definitely not
> > zero.
> 
> Is that true even in an object file?

No.

> I agree that in an executable a
> section symbol will have a non-zero value, but that case doesn't arise
> since an executable won't have (non-dynamic) relocations.  But I'm
> quite surprised to hear that the section symbol would be non-zero in
> an object file.

I spent a day looking at that issue again, and while it's true
that section symbols don't necessarily have a zero value, that is
not the problem here.  The problem is about how cgo determines the
names of functions(?) from an object file.  On s390 we need to do
an indirect lookup of (non-section-)symbols to find the names, and
the symbol value is not zero.

The only points in that patch are that on one hand - as far as I
know - the ABI does not guarantee that section symbols are either
zero or not relocated, even if that may be the case in reality.
And on the other hand, if that code is ever modified to handle
non-section symbols, it's not obvious that you not only need to
remove the test for the symbol type but also modify the
calculations below.  So, apply the patch or drop it as you like,
but in any case, at least a comment in the code would improve
maintainability.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



[PATCH] Fix PR ipa/61190, 2nd edition

2014-10-07 Thread Bernd Edlinger
Hi Honza,


as you know, we have a wrong code bug, when a pure or const method is called
via a virtual thunk.
I had some more ideas for how to fix that, but all of them had some serious
drawbacks, so I leave the details out...


But now I have a new insight into why the "obvious" fix for this serious code
generation bug did not work in the first place.


The reason was that if ipa-pure-const.c calls set_const_flag or set_pure_flag
for a thunk, it calls the same function later for the called method, and this
overwrites the flags of _all_ associated thunks and aliases.
However, that should at least not be done for virtual thunks, as these need to
be IPA_NEITHER even if the method itself has different attributes, because
the assembler thunk accesses the vtable, while other thunks do not.


So I refactored set_const_flag and set_pure_flag to exclude the virtual
thunks, taking care that other users of call_for_symbol_thunks_and_aliases
do not get a different behavior than before this patch.
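
The idea can be sketched on a toy call graph. This is not the cgraph code: ToyNode and toy_set_const_flag are hypothetical, and the exclude_virtual_thunks parameter only mirrors the name used in the ChangeLog below.

```cpp
#include <cassert>
#include <vector>

// Toy call-graph node.  A thunk forwards to the node that lists it.
struct ToyNode {
  bool is_thunk = false;
  bool virtual_offset_p = false;  // true for a virtual thunk
  bool is_const = false;
  std::vector<ToyNode *> thunks;
};

// Set the const flag on NODE and, recursively, on its thunks.  With
// EXCLUDE_VIRTUAL_THUNKS, virtual thunks (and anything behind them)
// keep their own flag, since they read the vtable.
inline void toy_set_const_flag(ToyNode &node, bool value,
                               bool exclude_virtual_thunks) {
  node.is_const = value;
  for (ToyNode *t : node.thunks) {
    assert(t != nullptr && t->is_thunk);
    if (exclude_virtual_thunks && t->virtual_offset_p)
      continue;
    toy_set_const_flag(*t, value, exclude_virtual_thunks);
  }
}
```

Without the exclusion, marking the method const also marks its virtual thunk const, which is the overwrite described above; with it, the virtual thunk keeps whatever state it was given.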


The attached patch was boot-strapped and
regression-tested on x86_64-linux-gnu.
Ok for trunk?


PS: As a side note, there are two identical functions, named
"call_for_symbol_and_aliases", in class symtab_node and in class cgraph_node,
which inherits from symtab_node.  Neither function is declared virtual.
Is that what's intended?  Usually this could lead to errors, or at least
some serious compiler warnings.


Thanks
Bernd.
  2014-10-07  Bernd Edlinger  

PR ipa/61190
* cgraph.h (symtab_node::call_for_symbol_and_aliases): Fix comment.
(cgraph_node::call_for_symbol_and_aliases): Likewise.
(cgraph_node::call_for_symbol_thunks_and_aliases_1): New function.
(cgraph_node::call_for_symbol_thunks_and_aliases): Adjust comment.
Call call_for_symbol_thunks_and_aliases_1.
* cgraph.c (cgraph_node::call_for_symbol_thunks_and_aliases): Renamed
to cgraph_node::call_for_symbol_thunks_and_aliases_1.  Added new
parameter exclude_virtual_thunks.
(cgraph_node::set_const_flag): Don't propagate to virtual thunks.
(cgraph_node::set_pure_flag): Likewise.
* ipa-pure-const.c (analyze_function): For virtual thunks
set pure_const_state to IPA_NEITHER.

testsuite/ChangeLog:
2014-10-07  Bernd Edlinger  

PR ipa/61190
* g++.old-deja/g++.mike/p4736b.C: Use -O2.



patch-pr61190.diff
Description: Binary data


Re: [PATCH] SPARC: add mcpu=leon3v7 target

2014-10-07 Thread Daniel Hellstrom

On 10/07/2014 11:07 AM, Eric Botcazou wrote:

Ok, I will update that. Is there a way of generating the comments
automatically?

Do you mean the ChangeLog?  If so, contrib/mklog will generate a skeleton but
you'll still need to write the description sentences.


Perfect, thanks!


Re: [PATCH] SPARC: add mcpu=leon3v7 target

2014-10-07 Thread Eric Botcazou
> Ok, I will update that. Is there a way of generating the comments
> automatically?

Do you mean the ChangeLog?  If so, contrib/mklog will generate a skeleton but 
you'll still need to write the description sentences.

-- 
Eric Botcazou


Re: [PATCH] SPARC: add mcpu=leon3v7 target

2014-10-07 Thread Daniel Hellstrom

On 10/07/2014 09:26 AM, Eric Botcazou wrote:

You're right. I have attached an updated patch. The new code becomes:

   #ifdef HAVE_AS_LEON
   #define AS_LEON_FLAG "-Aleon"
+#define AS_LEONV7_FLAG "-Aleon"
   #else
   #define AS_LEON_FLAG "-Av8"
+#define AS_LEONV7_FLAG "-Av7"
   #endif

The patch is OK for all active branches (trunk, 4.9 and 4.8), modulo nits in
the ChangeLog entry: capital letter at the beginning and period at the end of
every sentence.

* config.gcc (sparc*-*-*): Accept mcpu=leon3v7 processor.
* doc/invoke.texi (SPARC options): Add mcpu=leon3v7 comment.
* config/sparc/leon.md (leon3_load, leon_store, leon_fp_*): Handle
leon3v7 as leon3.
* config/sparc/sparc-opts.h (enum processor_type): Add LEON3V7.
* config/sparc/sparc.c (sparc_option_override): Add leon3v7 support.
* config/sparc/sparc.h (TARGET_CPU_leon3v7): New define.
* config/sparc/sparc.md (cpu): Add leon3v7.
* config/sparc/sparc.opt (enum processor_type): Add leon3v7.


Ok, I will update that. Is there a way of generating the comments automatically?


I assume that you have applied for write access so you'll be able to install
it yourself.  Otherwise let me know if I can lend a hand.


Thanks, I'll let you know. I just applied.

Daniel


Re: [Java PATCH] Generate declarations in jvgenmain.c

2014-10-07 Thread Marek Polacek
On Mon, Oct 06, 2014 at 11:00:48PM +0200, Mark Wielaard wrote:
> On Mon, Oct 06, 2014 at 11:54:00AM +0200, Marek Polacek wrote:
> > Java testsuite breaks with -std=gnu11 as a default and/or with 
> > -Wimplicit-function-declaration on, since the jvgenmain.c program
> > that generates a C file containing 'main' function which calls either
> > 'JvRunMainName' or 'JvRunMain' does not generate forward declarations
> > for these functions.  The fix is obvious IMHO.
> > 
> > Bootstrapped/regtested on x86_64-linux, ok for trunk?
> 
> I cannot approve (java) patches, but it does look ok to me.
> With one nitpick. JvRunMain is only used when -findirect-dispatch
> is given, and otherwise JvRunMainName is used. So you could output
> only the actually used forward declaration by checking if (indirect).

Yeah, that will be better.
 
> If no java maintainer responds, try CCing java-patc...@gcc.gnu.org
> to draw their attention.

Done (separate mail).  Thanks.

Marek


[Java PATCH] Generate declarations in jvgenmain.c

2014-10-07 Thread Marek Polacek
[CCing java-patches now]

Java testsuite breaks with -std=gnu11 as a default and/or with 
-Wimplicit-function-declaration on, since the jvgenmain.c program
that generates a C file containing 'main' function which calls either
'JvRunMainName' or 'JvRunMain' does not generate forward declarations
for these functions.  The following patch generates such a declaration
depending on whether -findirect-dispatch is given.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2014-10-07  Marek Polacek  

* jvgenmain.c (main): Provide declaration for JvRunMain{,Name}.

diff --git gcc/java/jvgenmain.c gcc/java/jvgenmain.c
index 5b14258..82e468d 100644
--- gcc/java/jvgenmain.c
+++ gcc/java/jvgenmain.c
@@ -127,6 +127,10 @@ main (int argc, char **argv)
   /* At this point every element of ARGV from 1 to LAST_ARG is a `-D'
  option.  Process them appropriately.  */
   fprintf (stream, "extern const char **_Jv_Compiler_Properties;\n");
+  if (indirect)
+fprintf (stream, "extern void JvRunMainName ();\n");
+  else
+fprintf (stream, "extern void JvRunMain ();\n");
   fprintf (stream, "static const char *props[] =\n{\n");
   for (i = 1; i < last_arg; ++i)
 {

Marek


Re: [PATCH] Enhance array types debug info. for Ada

2014-10-07 Thread Jakub Jelinek
On Tue, Oct 07, 2014 at 10:08:23AM +0200, Pierre-Marie de Rodat wrote:
> >>gcc/fortran/
> >>* trans-types.c (gfc_get_array_descr_info): Use PLACEHOLDER_EXPR nodes
> >>instead of VAR_DECL ones in type-related expressions.  Remove base_decl
> >>initialization.
> >
> >Ugh, I must say I don't like PLACEHOLDER_EXPRs at all.
> 
> Why so? I know that as far as supported front-ends are concerned,
> PLACEHOLDER_EXPR nodes are used only in GNAT, but it seems to me they
> describe the best what object the bound/stride/allocated/associated
> expressions (self-)reference.

But isn't there a risk that you will have PLACEHOLDER_EXPRs (likely for Ada
only) in some trees not constructed by the langhook?
I mean, DW_OP_push_object_address isn't meaningful in all DWARF contexts,
in some it is forbidden, in others there is really no object to push, and as
implemented, you emit DW_OP_push_object_address (which emits the address of
a context related particular object) for any kind of PLACEHOLDER_EXPR with
RECORD_TYPE.

Thus, I'd feel safer, even if you decide to use a PLACEHOLDER_EXPR, that
the translation of that to DW_OP_push_object_address would be done only
if the PLACEHOLDER_EXPR is equal to some global variable, normally NULL,
and only changed temporarily while emitting loc for the array descriptor.
But then IMHO a DEBUG_EXPR_DECL is better.

That said, if Jason is fine with the patchset as is, I can live with it,
as other FEs don't use PLACEHOLDER_EXPRs, worst case it will affect Ada
only.
Also, please verify that with your patch the generated debug info for some
Fortran arrays is the same.

Jakub


Re: [PATCH] Indirect-call topn targets profiler (instrumentation)

2014-10-07 Thread Bernhard Reutner-Fischer
On 6 October 2014 22:31:18 CEST, Jan Hubicka  wrote:
>> 

>> Is it ok to commit these two patches now?
>
>Yes, it is OK, thanks!

I do not see documentation of the new parameter added to doc in the ChangeLog?
Also, I would not abbreviate "indir" in the param name.

Thanks,



Re: [PING] Enhance array types debug info. for Ada

2014-10-07 Thread Pierre-Marie de Rodat

On 10/03/2014 06:41 PM, Jason Merrill wrote:

Patches 1-4 are OK.


+  bool pell_conversions = true;


I don't understand "pell".  Do you mean "strip"?


Absolutely: I thought it was correct English. I replaced all occurrences
of "pell" with "strip". Updated patches will follow...


Thank you very much for your review! :-)

--
Pierre-Marie de Rodat


Re: [PATCH] Enhance array types debug info. for Ada

2014-10-07 Thread Pierre-Marie de Rodat

Jakub,

First, thank you very much for reviewing this set of patches.

I think it's better to start with an answer to your last mail:

On 10/03/2014 11:20 AM, Jakub Jelinek wrote:

What kind of more complex expressions do you need and why?


GNAT can produce array types that make sense only as part of a record 
type and whose bounds are equal to members of this record type. Such 
ARRAY_TYPE nodes get generated from the kind of example you could see on 
the Dwarf-Discuss mailing list:


type Array_Type is array (Integer range <>) of Integer;
type Record_Type (N : Integer) is record
   A : Array_Type (1 .. N);
end record;

In this case, the "A" field's type is an ARRAY_TYPE node whose upper 
bound is:


COMPONENT_REF (PLACEHOLDER_EXPR (),
   FIELD_DECL("N"))

Upcoming patches will actually extend the need to handle more complex 
expressions: Ada arrays can contain dynamically-sized objects (their 
size is bounded, though). As a consequence, debuggers need these arrays 
to have a DW_AT_byte_stride attribute in order to decode them. The size 
expressions that describe the array stride in GCC can contain fairly 
complex operations such as unsigned divisions, unsigned comparisons, 
bitwise operations, calls to size functions (see 
stor-layout.c:self_referential_size).


On 10/03/2014 11:18 AM, Jakub Jelinek wrote:

+ /* Instead of producing a dedicated DW_TAG_array_type DIE for  this type, let
+the circuitry wrap the main variant with DIEs for qualifiers  (for
+instance: DW_TAG_const_type, ...). */
+ if (type != TYPE_MAIN_VARIANT (type))
+ {
+   gen_type_die (TYPE_MAIN_VARIANT (type), context_die);
+   return;
+ }


I don't like this, can you explain why? I'd say that if you only
want to see TYPE_MAIN_VARIANT here, it should be responsibility of
the callers to ensure that.


Agreed. I have updated the patch to:

 1. remove this hunk;
 2. in gen_type_die_with_usage, which is the only caller, move the 
type_main_variant call on "type" right before the array descriptors 
handling.



@@ -19941,7 +19991,8 @@ gen_type_die_with_usage (tree type, dw_die_ref context_die,
/* If this is an array type with hidden descriptor, handle it first.  */
if (!TREE_ASM_WRITTEN (type)
&& lang_hooks.types.get_array_descr_info
-  && lang_hooks.types.get_array_descr_info (type, &info)
+  && lang_hooks.types.get_array_descr_info (type,
+   init_array_descr_info (&info))


Just memset it to 0 instead?


Sure. I was not sure whether it was considered good style, but
it's done now.
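
For reference, a minimal sketch of the memset approach, assuming a trimmed-down descriptor and a made-up hook. descr_info, get_descr_info and query are hypothetical names; the real struct array_descr_info has more fields.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical, trimmed-down descriptor; the real struct array_descr_info
// carries more fields.
struct descr_info {
  int ndimensions;
  void *element_type;
  void *base_decl;
};

// Hypothetical hook: fills the descriptor and reports whether TYPE is
// handled.  It only sets the fields it knows about.
inline bool get_descr_info (int type, descr_info *info)
{
  if (type != 1)
    return false;
  info->ndimensions = 2;
  return true;
}

// Caller side: clear the whole descriptor before handing it to the hook, so
// that every field the hook leaves alone reads as zero/null.
inline descr_info query (int type, bool *handled)
{
  descr_info info;
  std::memset (&info, 0, sizeof info);
  *handled = get_descr_info (type, &info);
  return info;
}
```

The point of clearing in the caller is that hooks written before a field was added keep working: the new field simply stays zero.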



+  enum array_descr_ordering ordering;
tree element_type;
tree base_decl;
tree data_location;
tree allocated;
tree associated;
+


Why the extra vertical space?

struct array_descr_dimen
  {


It made the separation between "global" and "dimension-local" 
information clearer to me. As I suppose you don't like it and as there 
is already one indentation level, I removed it.



* dwarf2out.c (gen_type_die_with_usage): Enable the array lang-hook
even when (dwarf_version < 3 && dwarf_strict).
(gen_descr_array_die): Do not output DW_AT_data_location,
DW_AT_associated, DW_AT_allocated and DW_AT_byte_stride DWARF
attributes when (dwarf_version < 3 && dwarf_strict).


This patch sounds very wrong.  DW_OP_push_object_address is not in DWARF2
either, and that is the basis of all the fields, so there is really nothing
you can really output correctly for DWARF2.  It isn't the default on sane
targets, where GCC defaults to DWARF4 these days, so why bother?


Generating DW_OP_push_object_address in strict DWARF2 mode is indeed a
bug (the patch is adjusted). However, if I understand correctly, not all
fields/attributes have to rely on it.


In the case of the first Ada example I quoted above, such an operation 
would not be emitted: instead, add_bound_info/add_scalar_info are going 
to output a DW_AT_upper_bound attribute that is a reference to another 
DIE. This is valid DWARF2 and, I think, justifies enabling the language 
hook in this case.


We have several platforms that default to strict DWARF2. These are
widely used platforms on which some DWARF consumers crash when given
DIEs and tags they do not handle.



gcc/fortran/
* trans-types.c (gfc_get_array_descr_info): Use PLACEHOLDER_EXPR nodes
instead of VAR_DECL ones in type-related expressions.  Remove base_decl
initialization.


Ugh, I must say I don't like PLACEHOLDER_EXPRs at all.


Why so? I know that as far as supported front-ends are concerned,
PLACEHOLDER_EXPR nodes are used only in GNAT, but it seems to me they
best describe which object the bound/stride/allocated/associated
expressions (self-)reference.


I have attached to this mail the 3 patches, updated thanks to
your (Jakub's and Jason's) comments; they successfully pass the GCC
testsuite on x86_64-pc-linux-gnu.


Tha

Re: [Java PATCH] Generate declarations in jvgenmain.c

2014-10-07 Thread Andrew Haley
On 06/10/14 22:00, Mark Wielaard wrote:
> If no java maintainer responds, try CCing java-patc...@gcc.gnu.org
> to draw their attention.

Please.  I can't see the patch here.

Andrew.



Re: [patch] Fix miscompilation of gnat1 in LTO bootstrap

2014-10-07 Thread Eric Botcazou
> Testcase?  I think it would be better to handle this in the canonical type
> merging code in lto.c - or how does it end up working without LTO?  That is,
> what does the Ada frontend do to make sure get_alias_set handles this
> correctly?

It manages the alias sets, see gcc-interface/utils.c:relate_alias_sets.

-- 
Eric Botcazou


Re: [patch] Turn 1 lra_assert into 1 gcc_assert

2014-10-07 Thread Eric Botcazou
> The docs on the asm_p flags say there is sth wrong with the asm constraints
> so maybe better do
> 
>  if (!asm_p)
>error_at (loc, "");
> 
> with an appropriate message and location?

OK, I guess I can copy-and-paste reload1.c:spill_failure there.

-- 
Eric Botcazou


Re: [patch] Work harder to find DECL_STRUCT_FUNCTION

2014-10-07 Thread Richard Biener
On Tue, Oct 7, 2014 at 9:43 AM, Eric Botcazou  wrote:
>> I wonder if this is worth abstracting into a callee_fn () cgraph edge
>> method?
>
> That would rather be a cgraph node method without "callee" in the name since
> we also apply it to callers, something like:
>
> struct function *cgraph_node::cfun (void)
>
> and the code in can_inline_edge_p would just be:
>
>   struct function *caller_cfun = e->caller->cfun ();
>   struct function *callee_cfun = callee ? callee->cfun () : NULL;

Ah, ok.  Yes agreed - but without the 'c' (nothing is "current" here IMHO).
Maybe ->get_fun () to be consistent with other method names.

I'll pre-approve a patch to do that.
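
A rough sketch of what such a node method could look like, using heavily simplified stand-in types. function_body, node, edge and the origin link are hypothetical; in particular, walking back through the node a clone was made from is an assumption about what "work harder" means, not something stated in this thread.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-ins for GCC's types.
struct function_body { int id; };

struct node {
  function_body *body;   // stand-in for DECL_STRUCT_FUNCTION (decl)
  node *origin;          // stand-in for the node this one was cloned from

  // Return the function body for this node, falling back to the nodes it
  // was cloned from (assumed fallback, see the note above).
  function_body *get_fun ()
  {
    for (node *n = this; n; n = n->origin)
      if (n->body)
        return n->body;
    return nullptr;
  }
};

struct edge { node *caller; node *callee; };

// Mirrors the shape of the can_inline_edge_p snippet quoted above: the
// caller is always present, the callee may be missing.
inline bool both_have_bodies (const edge &e)
{
  function_body *caller_fun = e.caller->get_fun ();
  function_body *callee_fun = e.callee ? e.callee->get_fun () : nullptr;
  return caller_fun && callee_fun;
}
```

Because the method sits on the node rather than on the edge, the same call works for both ends of an edge.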

Thanks,
Richard.

> --
> Eric Botcazou


Re: [PATCH 0/14+2][Vectorizer] Made reductions endianness-neutral, fixes PR/61114

2014-10-07 Thread Richard Biener
On Tue, Oct 7, 2014 at 9:45 AM, Richard Biener
 wrote:
> On Mon, Oct 6, 2014 at 7:30 PM, Alan Lawrence  wrote:
>> Ok, so unless there are objections, I plan to commit patches 1, 2, 4, 5, and
>> 6,
>> which have been previously approved, in that sequence. (Of those, all bar
>> patch
>> 2 are AArch64 only.) I think this is better than maintaining an
>> ever-expanding
>> patch series.
>
> Agreed.
>
>> Then I'll get to work on migrating all backends to the new _scal_ optab (and
>> removing the vector optab). Certainly I'd like to replace vec_shr/l with
>> vec_perm_expr too, but I'm conscious that the end of stage 1 is approaching!
>
> I suppose we all are.  It will last until end of October at least
> (stage1 of gcc 4.9
> ended Nov 22nd, certainly a bit late).
>
> I do expect we will continue merging already developed / posted stuff through
> stage3 (as usual).
>
> That said, it would be really nice to get rid of VEC_RSHIFT_EXPR.

And you can fix performance regressions you introduce (badly handled
VEC_PERM) until the GCC 5 release happens (and even after that).
Heh.  Easy way out ;)

Richard.

> Thanks,
> Richard.
>
>> --Alan
>>
>>
>>
>>
>> Richard Biener wrote:
>>>
>>> On Thu, Sep 18, 2014 at 1:41 PM, Alan Lawrence 
>>> wrote:

>>>> The end goal here is to remove this code from tree-vect-loop.c
>>>> (vect_create_epilog_for_reduction):
>>>>
>>>>   if (BYTES_BIG_ENDIAN)
>>>>     bitpos = size_binop (MULT_EXPR,
>>>>                          bitsize_int (TYPE_VECTOR_SUBPARTS (vectype) - 1),
>>>>                          TYPE_SIZE (scalar_type));
>>>>   else
>>>>
>>>> as this is the root cause of PR/61114 (see testcase there, failing on all
>>>> bigendian targets supporting reduc_[us]plus_optab). Quoting Richard
>>>> Biener,
>>>> "all code conditional on BYTES/WORDS_BIG_ENDIAN in tree-vect* is
>>>> suspicious". The code snippet above is used on two paths:
>>>>
>>>> (Path 1) (patches 1-6) Reductions using REDUC_(PLUS|MIN|MAX)_EXPR =
>>>> reduc_[us](plus|min|max)_optab.
>>>> The optab is documented as "the scalar result is stored in the least
>>>> significant bits of operand 0", but the tree code as "the first element
>>>> in
>>>> the vector holding the result of the reduction of all elements of the
>>>> operand". This mismatch means that when the tree code is folded, the code
>>>> snippet above reads the result from the wrong end of the vector.
>>>>
>>>> The strategy (as per
>>>> https://gcc.gnu.org/ml/gcc-patches/2014-08/msg00041.html) is to define
>>>> new
>>>> tree codes and optabs that produce scalar results directly; this seems
>>>> better than tying (the element of the vector into which the result is
>>>> placed) to (the endianness of the target), and avoids generating extra
>>>> moves
>>>> on current bigendian targets. However, the previous optabs are retained
>>>> for
>>>> now as a migration strategy so as not to break existing backends; moving
>>>> individual platforms over will follow.
>>>>
>>>> A complication here is on AArch64, where we directly generate
>>>> REDUC_PLUS_EXPRs from intrinsics in gimple_fold_builtin; I temporarily
>>>> remove this folding in order to decouple the midend and AArch64 backend.
>>>
>>>
>>> Sounds fine.  I hope we can transition all backends for 5.0 and remove
>>> the vector variant optabs (maybe renaming the scalar ones).
>>>
>>>> (Path 2) (patches 7-13) Reductions using whole-vector-shifts, i.e.
>>>> VEC_RSHIFT_EXPR and vec_shr_optab. Here the tree code as well as the
>>>> optab
>>>> is defined in an endianness-dependent way, leading to significant
>>>> complication in fold-const.c. (Moreover, the "equivalent" vec_shl_optab
>>>> is
>>>> never used!). Few platforms appear to handle vec_shr_optab (and fewer
>>>> bigendian - I see only PowerPC and MIPS), so it seems pertinent to change
>>>> the existing optab to be endianness-neutral.
>>>>
>>>> Patch 10 defines vec_shr for AArch64, for the old specification; patch 13
>>>> updates that implementation to fit the new endianness-neutral
>>>> specification,
>>>> serving as a guide for other existing backends. Patches/RFCs 15 and 16
>>>> are
>>>> equivalents for MIPS and PowerPC; I haven't tested these but hope they
>>>> act
>>>> as useful pointers for the port maintainers.
>>>>
>>>> Finally patch 14 cleans up the affected part of tree-vect-loop.c
>>>> (vect_create_epilog_for_reduction).
>>>
>>>
>>> As said during the individual patches review I'd like the vectorizer to
>>> use a VEC_PERM_EXPR instead of VEC_RSHIFT_EXPR (with
>>> only whole-element amounts).  This means we can remove
>>> VEC_RSHIFT_EXPR.  It also means that if the backend defines
>>> vec_perm_const (which it really should) it can handle the special
>>> permutes that boil down to a possibly more efficient vector shift
>>> there (a good optimization anyway).  Until it does that all backends
>>> would at least create correct code (with the endian dependent
>>> vec_shr removed).
>>>
>>> Richard.
>>>
>>>> --Alan

>>>
>>
>>

Re: [PATCH 0/14+2][Vectorizer] Made reductions endianness-neutral, fixes PR/61114

2014-10-07 Thread Richard Biener
On Mon, Oct 6, 2014 at 7:30 PM, Alan Lawrence  wrote:
> Ok, so unless there are objections, I plan to commit patches 1, 2, 4, 5, and
> 6,
> which have been previously approved, in that sequence. (Of those, all bar
> patch
> 2 are AArch64 only.) I think this is better than maintaining an
> ever-expanding
> patch series.

Agreed.

> Then I'll get to work on migrating all backends to the new _scal_ optab (and
> removing the vector optab). Certainly I'd like to replace vec_shr/l with
> vec_perm_expr too, but I'm conscious that the end of stage 1 is approaching!

I suppose we all are.  It will last until end of October at least
(stage1 of gcc 4.9
ended Nov 22nd, certainly a bit late).

I do expect we will continue merging already developed / posted stuff through
stage3 (as usual).

That said, it would be really nice to get rid of VEC_RSHIFT_EXPR.

Thanks,
Richard.
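
As an illustration of the VEC_PERM_EXPR idea, here is a small stand-alone C++ model of a reduction epilogue in which every whole-vector shift is written as a two-input permute with a zero vector. It is only a sketch: vec, permute, shift_down and reduce_plus are hypothetical names, the vector length is fixed at 4, and element indices are plain array indices, so nothing in it depends on target endianness.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t N = 4;   // number of elements; must be a power of two
using vec = std::array<int, N>;

// Two-input permute in the style of VEC_PERM_EXPR: selector values 0..N-1
// pick from A, values N..2N-1 pick from B.
inline vec permute (const vec &a, const vec &b,
                    const std::array<std::size_t, N> &sel)
{
  vec r{};
  for (std::size_t i = 0; i < N; ++i)
    r[i] = sel[i] < N ? a[sel[i]] : b[sel[i] - N];
  return r;
}

// Move every element K positions towards element 0, filling with zeros.
// Requires K <= N so that every selector value stays below 2N.
inline vec shift_down (const vec &v, std::size_t k)
{
  assert (k <= N);
  std::array<std::size_t, N> sel{};
  for (std::size_t i = 0; i < N; ++i)
    sel[i] = i + k;            // indices >= N select from the zero vector
  return permute (v, vec{}, sel);
}

// Log-step sum reduction: add the upper half onto the lower half until the
// total sits in element 0.
inline int reduce_plus (vec v)
{
  for (std::size_t k = N / 2; k >= 1; k /= 2)
    {
      vec s = shift_down (v, k);
      for (std::size_t i = 0; i < N; ++i)
        v[i] += s[i];
    }
  return v[0];
}
```

A backend that recognizes selectors of the form i + k could still emit a single shift instruction for them, which is the optimization mentioned in the quoted text below.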

> --Alan
>
>
>
>
> Richard Biener wrote:
>>
>> On Thu, Sep 18, 2014 at 1:41 PM, Alan Lawrence 
>> wrote:
>>>
>>> The end goal here is to remove this code from tree-vect-loop.c
>>> (vect_create_epilog_for_reduction):
>>>
> >>>   if (BYTES_BIG_ENDIAN)
> >>>     bitpos = size_binop (MULT_EXPR,
> >>>                          bitsize_int (TYPE_VECTOR_SUBPARTS (vectype) - 1),
> >>>                          TYPE_SIZE (scalar_type));
> >>>   else
>>>
>>> as this is the root cause of PR/61114 (see testcase there, failing on all
>>> bigendian targets supporting reduc_[us]plus_optab). Quoting Richard
>>> Biener,
>>> "all code conditional on BYTES/WORDS_BIG_ENDIAN in tree-vect* is
>>> suspicious". The code snippet above is used on two paths:
>>>
>>> (Path 1) (patches 1-6) Reductions using REDUC_(PLUS|MIN|MAX)_EXPR =
>>> reduc_[us](plus|min|max)_optab.
>>> The optab is documented as "the scalar result is stored in the least
>>> significant bits of operand 0", but the tree code as "the first element
>>> in
>>> the vector holding the result of the reduction of all elements of the
>>> operand". This mismatch means that when the tree code is folded, the code
>>> snippet above reads the result from the wrong end of the vector.
>>>
>>> The strategy (as per
>>> https://gcc.gnu.org/ml/gcc-patches/2014-08/msg00041.html) is to define
>>> new
>>> tree codes and optabs that produce scalar results directly; this seems
>>> better than tying (the element of the vector into which the result is
>>> placed) to (the endianness of the target), and avoids generating extra
>>> moves
>>> on current bigendian targets. However, the previous optabs are retained
>>> for
>>> now as a migration strategy so as not to break existing backends; moving
>>> individual platforms over will follow.
>>>
>>> A complication here is on AArch64, where we directly generate
>>> REDUC_PLUS_EXPRs from intrinsics in gimple_fold_builtin; I temporarily
>>> remove this folding in order to decouple the midend and AArch64 backend.
>>
>>
>> Sounds fine.  I hope we can transition all backends for 5.0 and remove
>> the vector variant optabs (maybe renaming the scalar ones).
>>
>>> (Path 2) (patches 7-13) Reductions using whole-vector-shifts, i.e.
>>> VEC_RSHIFT_EXPR and vec_shr_optab. Here the tree code as well as the
>>> optab
>>> is defined in an endianness-dependent way, leading to significant
>>> complication in fold-const.c. (Moreover, the "equivalent" vec_shl_optab
>>> is
>>> never used!). Few platforms appear to handle vec_shr_optab (and fewer
>>> bigendian - I see only PowerPC and MIPS), so it seems pertinent to change
>>> the existing optab to be endianness-neutral.
>>>
>>> Patch 10 defines vec_shr for AArch64, for the old specification; patch 13
>>> updates that implementation to fit the new endianness-neutral
>>> specification,
>>> serving as a guide for other existing backends. Patches/RFCs 15 and 16
>>> are
>>> equivalents for MIPS and PowerPC; I haven't tested these but hope they
>>> act
>>> as useful pointers for the port maintainers.
>>>
>>> Finally patch 14 cleans up the affected part of tree-vect-loop.c
>>> (vect_create_epilog_for_reduction).
>>
>>
>> As said during the individual patches review I'd like the vectorizer to
>> use a VEC_PERM_EXPR instead of VEC_RSHIFT_EXPR (with
>> only whole-element amounts).  This means we can remove
>> VEC_RSHIFT_EXPR.  It also means that if the backend defines
>> vec_perm_const (which it really should) it can handle the special
>> permutes that boil down to a possibly more efficient vector shift
>> there (a good optimization anyway).  Until it does that all backends
>> would at least create correct code (with the endian dependent
>> vec_shr removed).
>>
>> Richard.
>>
>>> --Alan
>>>
>>
>
>

Re: [patch] Work harder to find DECL_STRUCT_FUNCTION

2014-10-07 Thread Eric Botcazou
> I wonder if this is worth abstracting into a callee_fn () cgraph edge
> method?

That would rather be a cgraph node method without "callee" in the name since 
we also apply it to callers, something like:

struct function *cgraph_node::cfun (void)

and the code in can_inline_edge_p would just be:

  struct function *caller_cfun = e->caller->cfun ();
  struct function *callee_cfun = callee ? callee->cfun () : NULL;

-- 
Eric Botcazou


[PATCH][match-and-simplify] Change (match ...) syntax

2014-10-07 Thread Richard Biener

After internal discussion this changes

(match logical_inverted_value
 (ne truth_valued_p@0 integer_onep)
 (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
  (logical_inverted_value @0)))

to

(match (logical_inverted_value @0)
 (ne truth_valued_p@0 integer_onep)
 (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))))

thus avoids repeating 'logical_inverted_value' and puts whether
this is an expression or predicate matcher and its operands first.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2014-10-07  Richard Biener  

* genmatch.c (parser::parse_pattern): Change match parsing
to expect the matching template first, not as result.
(parser::parse_simplify): Likewise.
* match-bitwise.pd: Adjust.

Index: gcc/genmatch.c
===
--- gcc/genmatch.c  (revision 215917)
+++ gcc/genmatch.c  (working copy)
@@ -2163,7 +2163,8 @@ private:
   operand *parse_op ();
 
   void parse_pattern ();
-  void parse_simplify (source_location, vec<simplify *>&, predicate_id *);
+  void parse_simplify (source_location, vec<simplify *>&, predicate_id *,
+  expr *);
   void parse_for (source_location);
   void parse_if (source_location);
   void parse_predicates (source_location);
@@ -2528,7 +2529,8 @@ parser::parse_op ()
 
 void
 parser::parse_simplify (source_location match_location,
-   vec<simplify *>& simplifiers, predicate_id *matcher)
+   vec<simplify *>& simplifiers, predicate_id *matcher,
+   expr *result)
 {
   /* Reset the capture map.  */
   capture_ids = new std::map;
@@ -2549,12 +2551,8 @@ parser::parse_simplify (source_location
 {
   if (!matcher)
fatal_at (token, "expected transform expression");
-  else if (matcher->nargs > 0)
-   fatal_at (token, "expected match operand expression");
-  if (matcher->nargs == -1)
-   matcher->nargs = 0;
   simplifiers.safe_push
-   (new simplify (match, match_location, NULL, token->src_loc,
+   (new simplify (match, match_location, result, token->src_loc,
   active_ifs.copy (), active_fors.copy (),
   capture_ids));
   return;
@@ -2579,12 +2577,8 @@ parser::parse_simplify (source_location
{
  if (!matcher)
fatal_at (token, "manual transform not implemented");
- else if (matcher->nargs > 0)
-   fatal_at (token, "expected match operand expression");
- if (matcher->nargs == -1)
-   matcher->nargs = 0;
  simplifiers.safe_push
- (new simplify (match, match_location, NULL,
+ (new simplify (match, match_location, result,
 paren_loc, active_ifs.copy (),
 active_fors.copy (), capture_ids));
}
@@ -2599,19 +2593,9 @@ parser::parse_simplify (source_location
}
  else
{
- operand *op = parse_expr ();
- if (matcher)
-   {
- expr *e = dyn_cast <expr *> (op);
- if (!e)
-   fatal_at (token, "match operand expression cannot "
- "be captured");
- if (matcher->nargs == -1)
-   matcher->nargs = e->ops.length ();
- if (matcher->nargs == 0
- || (unsigned) matcher->nargs != e->ops.length ())
-   fatal_at (token, "match arity doesn't match");
-   }
+ operand *op = result;
+ if (!matcher)
+   op = parse_expr ();
  simplifiers.safe_push
  (new simplify (match, match_location, op,
 token->src_loc, active_ifs.copy (),
@@ -2644,7 +2628,8 @@ parser::parse_simplify (source_location
  if (matcher)
fatal_at (token, "expected match operand expression");
  simplifiers.safe_push
- (new simplify (match, match_location, parse_op (),
+ (new simplify (match, match_location,
+matcher ? result : parse_op (),
 token->src_loc, active_ifs.copy (),
 active_fors.copy (), capture_ids));
  /* A "default" result closes the enclosing scope.  */
@@ -2811,9 +2796,15 @@ parser::parse_pattern ()
   const cpp_token *token = peek ();
   const char *id = get_ident ();
   if (strcmp (id, "simplify") == 0)
-parse_simplify (token->src_loc, simplifiers, NULL);
+parse_simplify (token->src_loc, simplifiers, NULL, NULL);
   else if (strcmp (id, "match") == 0)
 {
+  bool with_args = false;
+  if (peek ()->type == CPP_OPEN_PAREN)
+   {
+ eat_token (CPP_OPEN_PAREN);
+ with_args = true;
+   }
   const char *name = get_ident ();
   id_base *id = get_operator (name

Re: [PATCH] SPARC: add mcpu=leon3v7 target

2014-10-07 Thread Eric Botcazou
> You're right. I have attached an updated patch. The new code becomes:
> 
>   #ifdef HAVE_AS_LEON
>   #define AS_LEON_FLAG "-Aleon"
> +#define AS_LEONV7_FLAG "-Aleon"
>   #else
>   #define AS_LEON_FLAG "-Av8"
> +#define AS_LEONV7_FLAG "-Av7"
>   #endif

The patch is OK for all active branches (trunk, 4.9 and 4.8), modulo nits in 
the ChangeLog entry: capital letter at the beginning and period at the end of 
every sentence.

* config.gcc (sparc*-*-*): Accept mcpu=leon3v7 processor.
* doc/invoke.texi (SPARC options): Add mcpu=leon3v7 comment.
* config/sparc/leon.md (leon3_load, leon_store, leon_fp_*): Handle
leon3v7 as leon3.
* config/sparc/sparc-opts.h (enum processor_type): Add LEON3V7.
* config/sparc/sparc.c (sparc_option_override): Add leon3v7 support.
* config/sparc/sparc.h (TARGET_CPU_leon3v7): New define.
* config/sparc/sparc.md (cpu): Add leon3v7.
* config/sparc/sparc.opt (enum processor_type): Add leon3v7.

I assume that you have applied for write access so you'll be able to install 
it yourself.  Otherwise let me know if I can lend a hand.

-- 
Eric Botcazou


Re: [PATCH 2/2] PR debug/63240 Add DWARF representation for C++11 defaulted member function.

2014-10-07 Thread Dodji Seketeli
Jason Merrill  writes:

> On 10/06/2014 08:50 PM, Siva Chandra wrote:
>> On Sat, Oct 4, 2014 at 11:14 AM, Jason Merrill  wrote:
>>> On 10/03/2014 05:41 PM, Siva Chandra wrote:

 I understand that knowing whether a copy-ctor or a d-tor has been
 explicitly defaulted is not sufficient to determine the parameter
 passing ABI. However, why is it not necessary? I could be wrong, but
 doesn't the example I have given show that it is necessary?
>>>
>>> An implicitly declared 'tor can also be trivial.
>>
>> But, the question is whether it is required to determine the parameter
>> passing ABI. If there is no special marker to indicate that the user
>> declared 'tor is explicitly defaulted, then GDB could (in the absence
>> of other properties which make the 'tor non-trivial) incorrectly
>> conclude that the 'tor is user defined, and hence not-trivial.
>
> I've been thinking that we should just mark the 'tor as trivial or not
> directly rather than hint at it.

FWIW, this would be my inclination too.  I think it would make the job
of the debug info consumer a lot easier.
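
The distinction under discussion can be checked directly in C++11 with the standard type traits. In this small sketch the three struct names are made up for illustration; the point is that "explicitly defaulted" and "implicitly declared" both yield a trivial copy constructor, so a marker for "defaulted" alone does not say whether the 'tor is trivial.

```cpp
#include <cassert>
#include <type_traits>

// Copy constructor implicitly declared by the compiler.
struct Implicit {
  int x;
};

// Copy constructor explicitly defaulted on its first declaration.
struct Defaulted {
  int x;
  Defaulted () = default;
  Defaulted (const Defaulted &) = default;
};

// Copy constructor provided by the user, hence never trivial.
struct UserProvided {
  int x;
  UserProvided () : x (0) {}
  UserProvided (const UserProvided &o) : x (o.x) {}
};

constexpr bool implicit_trivial
  = std::is_trivially_copy_constructible<Implicit>::value;
constexpr bool defaulted_trivial
  = std::is_trivially_copy_constructible<Defaulted>::value;
constexpr bool user_provided_trivial
  = std::is_trivially_copy_constructible<UserProvided>::value;
```

A debugger has no access to these traits, which is why recording triviality itself in the debug info would save it from re-deriving the answer.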

Thanks.

-- 
Dodji


Re: Fix libgomp crash without TLS (PR42616)

2014-10-07 Thread Jakub Jelinek
On Wed, Oct 01, 2014 at 08:44:59PM +0400, Varvara Rainchik wrote:
> Ok, then here it is a new patch (tested and bootstrapped on linux).
> 
> On linux with --disable-tls now all libgomp make check tests pass; for
> Android I've patched toolchain and tried test from one of the
> mentioned bugs, test passes too.

> Is there some benchmark to check performance?

There is SPEC OMP,
http://www.spec.org/hpg/omp2001/
EPCC,
http://www2.epcc.ed.ac.uk/computing/research_activities/openmpbench/openmp_index.html
NAS,
http://www.nas.nasa.gov/publications/npb.html
http://phase.hpcc.jp/Omni/benchmarks/NPB/
Rodinia,
https://www.cs.virginia.edu/~skadron/wiki/rodinia/index.php/Main_Page

Now, I wonder on which OS and why does config/tls.m4 CHECK_GCC_TLS
actually fail?  Can you figure that out?

If we get rid of HAVE_TLS code altogether, we might lose support of
some very old OSes, e.g. some Linux distros with a recent gcc and binutils
(so that emutls isn't used), but very old glibc (that doesn't support
TLS or supports it incorrectly, think of pre-2002 glibc).  So, if we get
rid of !HAVE_TLS code in libgomp, it would be nice if config/tls.m4 detected
it properly and we'd just fail at configure time.
And if we don't, then for Android, Darwin and/or M$Win (or whatever
other OS you had in mind that supports pthreads but not native TLS),
find out why HAVE_AS_TLS is not defined (guess
config.log should explain that).
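
For readers unfamiliar with the two code paths being weighed here, the following is a generic sketch of per-thread state with native TLS versus a pthread-key fallback. It is not libgomp's code: thread_state, get_thread_state and the SKETCH_HAVE_TLS macro are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical per-thread data.
struct thread_state {
  int team_id;
};

#ifdef SKETCH_HAVE_TLS
// Native TLS: one zero-initialized instance per thread, accessed directly.
static thread_local thread_state tls_state;

inline thread_state *get_thread_state ()
{
  return &tls_state;
}
#else
// Fallback: a per-thread pointer stored under a pthread key and allocated
// lazily on first use.
static pthread_key_t state_key;
static pthread_once_t state_once = PTHREAD_ONCE_INIT;

// Runs at thread exit for every thread that allocated a state.
static void destroy_state (void *p)
{
  delete static_cast<thread_state *> (p);
}

static void make_key ()
{
  pthread_key_create (&state_key, destroy_state);
}

inline thread_state *get_thread_state ()
{
  pthread_once (&state_once, make_key);
  void *p = pthread_getspecific (state_key);
  if (!p)
    {
      p = new thread_state ();   // value-initialized: team_id starts at 0
      pthread_setspecific (state_key, p);
    }
  return static_cast<thread_state *> (p);
}
#endif
```

The fallback costs a function call and a key lookup on every access, which is why the performance question raised above matters when choosing between the two.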

> 2014-10-01  Varvara Rainchik  
> 
> * libgomp.h (HAVE_TLS): Set to 1.

Jakub