Re: [PATCH, fortran, v4] Use Levenshtein spelling suggestions in Fortran FE

2017-10-19 Thread Bernhard Reutner-Fischer
[forgot to CC gcc-patches]

On Sat, Jun 18, 2016 at 09:58:47PM +0200, Bernhard Reutner-Fischer wrote:
> Hi,
> 
> Ok for trunk?

This was ACKed about a year ago by Janne and Jerry, and since there were
no objections in the meantime I've installed this first step towards
providing spelling suggestions in the Fortran FE as r253877.

cheers,
> 
> Changes for v3 -> v4:
> 
> - rebased
> - Use 4 argument levenshtein_distance() to save multiple strlen(typo)
>   calls as suggested by dmalcolm
> 
> Changes for v2 -> v3:
> 
> - rebased
> 
> Changes for v1 -> v2:
> 
> - subroutines using interfaces
> - keyword arguments (named parameters)
> 
> Rewrite C++ autovec in plain C.
> Factor out levenshtein distance handling into a commonly used
> gfc_closest_fuzzy_match().
> 
> gcc/fortran/ChangeLog
> 
> 2015-12-27  Bernhard Reutner-Fischer  
> 
>   * gfortran.h (gfc_lookup_function_fuzzy): New declaration.
>   (gfc_closest_fuzzy_match): New declaration.
>   (vec_push): New definition.
>   * misc.c (gfc_closest_fuzzy_match): New definition.
>   * resolve.c: Include spellcheck.h.
>   (lookup_function_fuzzy_find_candidates): New static function.
>   (lookup_uop_fuzzy_find_candidates): Likewise.
>   (lookup_uop_fuzzy): Likewise.
>   (resolve_operator) : Call lookup_uop_fuzzy.
>   (gfc_lookup_function_fuzzy): New definition.
>   (resolve_unknown_f): Call gfc_lookup_function_fuzzy.
>   * interface.c (check_interface0): Likewise.
>   (lookup_arg_fuzzy_find_candidates): New static function.
>   (lookup_arg_fuzzy): Likewise.
>   (compare_actual_formal): Call lookup_arg_fuzzy.
>   * symbol.c: Include spellcheck.h.
>   (lookup_symbol_fuzzy_find_candidates): New static function.
>   (lookup_symbol_fuzzy): Likewise.
>   (gfc_set_default_type): Call lookup_symbol_fuzzy.
>   (lookup_component_fuzzy_find_candidates): New static function.
>   (lookup_component_fuzzy): Likewise.
>   (gfc_find_component): Call lookup_component_fuzzy.
> 
> gcc/testsuite/ChangeLog
> 
> 2015-12-27  Bernhard Reutner-Fischer  
> 
>   * gfortran.dg/spellcheck-operator.f90: New testcase.
>   * gfortran.dg/spellcheck-procedure_1.f90: New testcase.
>   * gfortran.dg/spellcheck-procedure_2.f90: New testcase.
>   * gfortran.dg/spellcheck-structure.f90: New testcase.
>   * gfortran.dg/spellcheck-parameter.f90: New testcase.
> 
> ---
> 
> David Malcolm's nice Levenshtein distance spelling check helpers
> were used in some parts of other frontends. This proposed patch adds
> some spelling corrections to the fortran frontend.
> 
> Suggestions are printed if we can find a suitable name, currently
> using a very simple cutoff factor:
> /* If more than half of the letters were misspelled, the suggestion is
>    likely to be meaningless.  */
> cutoff = MAX (strlen (typo), strlen (best_guess)) / 2;
> which effectively skips names with fewer than 4 characters.
> For e.g. structures, one could try to be much smarter in an attempt to
> also provide suggestions for single-letter members/components.
> 
> This patch covers (at least partly):
> - user-defined operators
> - structures (types and their components)
> - functions
> - symbols (variables)
> 
> If anybody has a testcase where a spelling-suggestion would make sense
> then please pass it along so we maybe can add support for GCC-7.
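
For readers who want to see the cutoff heuristic quoted above in isolation,
here is a minimal, self-contained sketch in plain C.  It is not the code from
the patch: edit_distance() and closest_match() are hypothetical stand-ins for
GCC's levenshtein_distance() and gfc_closest_fuzzy_match(), and rejecting a
guess only when the distance is strictly greater than the cutoff is an
assumption.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Plain Levenshtein distance between A and B, computed with a single
   row of the dynamic-programming table.  Hypothetical helper, used
   here only for illustration.  Returns (size_t) -1 if out of memory.  */
static size_t
edit_distance (const char *a, const char *b)
{
  size_t la = strlen (a), lb = strlen (b);
  size_t *row = malloc ((lb + 1) * sizeof *row);
  if (row == NULL)
    return (size_t) -1;

  for (size_t j = 0; j <= lb; j++)
    row[j] = j;			/* distance from "" to b[0..j) */

  for (size_t i = 1; i <= la; i++)
    {
      size_t diag = row[0];	/* table entry [i-1][j-1] */
      row[0] = i;
      for (size_t j = 1; j <= lb; j++)
	{
	  size_t up = row[j];	/* table entry [i-1][j] */
	  size_t best = diag + (a[i - 1] != b[j - 1]);	/* substitute */
	  if (up + 1 < best)
	    best = up + 1;				/* delete */
	  if (row[j - 1] + 1 < best)
	    best = row[j - 1] + 1;			/* insert */
	  row[j] = best;
	  diag = up;
	}
    }

  size_t result = row[lb];
  free (row);
  return result;
}

/* Return the entry of the NULL-terminated CANDIDATES array closest to
   TYPO, or NULL if even the best guess is too far away.  The strict
   comparison against the cutoff is an assumption of this sketch.  */
static const char *
closest_match (const char *typo, const char *const *candidates)
{
  const char *best = NULL;
  size_t best_distance = (size_t) -1;

  for (; *candidates != NULL; candidates++)
    {
      size_t dist = edit_distance (typo, *candidates);
      if (dist < best_distance)
	{
	  best_distance = dist;
	  best = *candidates;
	}
    }

  if (best == NULL)
    return NULL;

  /* If more than half of the letters were misspelled, give up.  */
  size_t tl = strlen (typo), bl = strlen (best);
  size_t cutoff = (tl > bl ? tl : bl) / 2;
  return best_distance > cutoff ? NULL : best;
}
```

With the candidate list { "integer_value", "total", "x", NULL }, the typo
"integer_valeu" is two edits away from "integer_value" and well under the
cutoff of 6, so it is suggested.  The typo "y" is one edit away from "x", but
the cutoff for two single-letter names is 0, so no suggestion is made.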


[committed] Fix ICE with F2008 BLOCK inside of !$OMP PARALLEL (PR fortran/82568)

2017-10-19 Thread Jakub Jelinek
Hi!

As OpenMP 4.5 supports only F2003 and earlier, this is strictly speaking
code with unspecified behavior, but we shouldn't ICE on it, so if we find
a sequential loop iterator inside of BLOCK inside of parallel or task
generating construct, and the loop iterator is explicitly or implicitly
declared inside of the BLOCK, we make sure the variable isn't added to
PRIVATE clauses on the surrounding OMP PARALLEL (or task generating
construct).

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk,
queued for backporting.

2017-10-19  Jakub Jelinek  

PR fortran/82568
* gfortran.h (gfc_resolve_do_iterator): Add a bool arg.
(gfc_resolve_omp_local_vars): New declaration.
* openmp.c (omp_current_ctx): Make static.
(gfc_resolve_omp_parallel_blocks): Handle EXEC_OMP_TASKLOOP
and EXEC_OMP_TASKLOOP_SIMD.
(gfc_resolve_do_iterator): Add ADD_CLAUSE argument, if false,
don't actually add any clause.  Move omp_current_ctx test
earlier.
(handle_local_var, gfc_resolve_omp_local_vars): New functions.
* resolve.c (gfc_resolve_code): Call gfc_resolve_omp_parallel_blocks
instead of just gfc_resolve_omp_do_blocks for EXEC_OMP_TASKLOOP
and EXEC_OMP_TASKLOOP_SIMD.
(gfc_resolve_code): Adjust gfc_resolve_do_iterator caller.
(resolve_codes): Call gfc_resolve_omp_local_vars.

* gfortran.dg/gomp/pr82568.f90: New test.

--- gcc/fortran/gfortran.h.jj   2017-10-09 09:41:20.0 +0200
+++ gcc/fortran/gfortran.h  2017-10-18 17:06:23.441304104 +0200
@@ -3103,7 +3103,8 @@ void gfc_free_omp_declare_simd_list (gfc
 void gfc_free_omp_udr (gfc_omp_udr *);
 gfc_omp_udr *gfc_omp_udr_find (gfc_symtree *, gfc_typespec *);
 void gfc_resolve_omp_directive (gfc_code *, gfc_namespace *);
-void gfc_resolve_do_iterator (gfc_code *, gfc_symbol *);
+void gfc_resolve_do_iterator (gfc_code *, gfc_symbol *, bool);
+void gfc_resolve_omp_local_vars (gfc_namespace *);
 void gfc_resolve_omp_parallel_blocks (gfc_code *, gfc_namespace *);
 void gfc_resolve_omp_do_blocks (gfc_code *, gfc_namespace *);
 void gfc_resolve_omp_declare_simd (gfc_namespace *);
--- gcc/fortran/openmp.c.jj 2017-09-22 20:51:49.0 +0200
+++ gcc/fortran/openmp.c    2017-10-18 17:42:21.632834909 +0200
@@ -5262,7 +5262,7 @@ resolve_omp_atomic (gfc_code *code)
 }
 
 
-struct fortran_omp_context
+static struct fortran_omp_context
 {
   gfc_code *code;
   hash_set *sharing_clauses;
@@ -5345,6 +5345,8 @@ gfc_resolve_omp_parallel_blocks (gfc_cod
 case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE_PARALLEL_DO:
 case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD:
 case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE_SIMD:
+case EXEC_OMP_TASKLOOP:
+case EXEC_OMP_TASKLOOP_SIMD:
 case EXEC_OMP_TEAMS_DISTRIBUTE:
 case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO:
 case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD:
@@ -5390,8 +5392,11 @@ gfc_omp_restore_state (struct gfc_omp_sa
construct, where they are predetermined private.  */
 
 void
-gfc_resolve_do_iterator (gfc_code *code, gfc_symbol *sym)
+gfc_resolve_do_iterator (gfc_code *code, gfc_symbol *sym, bool add_clause)
 {
+  if (omp_current_ctx == NULL)
+    return;
+
   int i = omp_current_do_collapse;
   gfc_code *c = omp_current_do_code;
 
@@ -5410,9 +5415,6 @@ gfc_resolve_do_iterator (gfc_code *code,
   c = c->block->next;
 }
 
-  if (omp_current_ctx == NULL)
-    return;
-
   /* An openacc context may represent a data clause.  Abort if so.  */
   if (!omp_current_ctx->is_openmp && !oacc_is_loop (omp_current_ctx->code))
     return;
@@ -5421,7 +5423,7 @@ gfc_resolve_do_iterator (gfc_code *code,
       && omp_current_ctx->sharing_clauses->contains (sym))
     return;
 
-  if (! omp_current_ctx->private_iterators->add (sym))
+  if (! omp_current_ctx->private_iterators->add (sym) && add_clause)
     {
       gfc_omp_clauses *omp_clauses = omp_current_ctx->code->ext.omp_clauses;
       gfc_omp_namelist *p;
@@ -5433,6 +5435,22 @@ gfc_resolve_do_iterator (gfc_code *code,
     }
 }
 
+static void
+handle_local_var (gfc_symbol *sym)
+{
+  if (sym->attr.flavor != FL_VARIABLE
+      || sym->as != NULL
+      || (sym->ts.type != BT_INTEGER && sym->ts.type != BT_REAL))
+    return;
+  gfc_resolve_do_iterator (sym->ns->code, sym, false);
+}
+
+void
+gfc_resolve_omp_local_vars (gfc_namespace *ns)
+{
+  if (omp_current_ctx)
+    gfc_traverse_ns (ns, handle_local_var);
+}
 
 static void
 resolve_omp_do (gfc_code *code)
--- gcc/fortran/resolve.c.jj    2017-10-14 14:12:20.0 +0200
+++ gcc/fortran/resolve.c   2017-10-18 17:41:44.627283366 +0200
@@ -10916,6 +10916,8 @@ gfc_resolve_code (gfc_code *code, gfc_na
case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD:
case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE_SIMD:
case EXEC_OMP_TASK:
+   case EXEC_OMP_TASKLOOP:
+   case EXEC_OMP_TASKLOOP_SIMD:
case 

Re: [PATCH] Do not instrument use-after-scope for vars with large alignment (PR sanitizer/82517).

2017-10-19 Thread Jakub Jelinek
On Thu, Oct 19, 2017 at 09:21:47AM +0200, Martin Liška wrote:
> Hi.
> 
> As discussed with Jakub, use-after-scope sanitization should not be done for
> variables that have bigger alignment than MAX_SUPPORTED_STACK_ALIGNMENT.
> In this case, we can't put the variable into a fixed stack slot.
> 
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> 
> Ready to be installed?
> Martin
> 
> gcc/ChangeLog:
> 
> 2017-10-18  Martin Liska  
> 
>   PR sanitizer/82517
>   * gimplify.c (gimplify_decl_expr): Do not instrument variables
>   that have a large alignment.
>   (gimplify_target_expr): Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
> 2017-10-18  Martin Liska  
> 
>   PR sanitizer/82517
>   * gcc.dg/asan/pr82517.c: New test.

Ok, thanks.

Jakub


[PATCH] Do not instrument use-after-scope for vars with large alignment (PR sanitizer/82517).

2017-10-19 Thread Martin Liška
Hi.

As discussed with Jakub, use-after-scope sanitization should not be done for
variables that have bigger alignment than MAX_SUPPORTED_STACK_ALIGNMENT.
In this case, we can't put the variable into a fixed stack slot.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin

gcc/ChangeLog:

2017-10-18  Martin Liska  

PR sanitizer/82517
* gimplify.c (gimplify_decl_expr): Do not instrument variables
that have a large alignment.
(gimplify_target_expr): Likewise.

gcc/testsuite/ChangeLog:

2017-10-18  Martin Liska  

PR sanitizer/82517
* gcc.dg/asan/pr82517.c: New test.
---
 gcc/gimplify.c  |  5 -
 gcc/testsuite/gcc.dg/asan/pr82517.c | 43 +
 2 files changed, 47 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/asan/pr82517.c


diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index c3fd6ace84e..19411c98fce 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -1656,6 +1656,7 @@ gimplify_decl_expr (tree *stmt_p, gimple_seq *seq_p)
 	  && TREE_ADDRESSABLE (decl)
 	  && !TREE_STATIC (decl)
 	  && !DECL_HAS_VALUE_EXPR_P (decl)
+	  && DECL_ALIGN (decl) <= MAX_SUPPORTED_STACK_ALIGNMENT
 	  && dbg_cnt (asan_use_after_scope))
 	{
 	  asan_poisoned_variables->add (decl);
@@ -6505,7 +6506,9 @@ gimplify_target_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p)
 	  clobber = build2 (MODIFY_EXPR, TREE_TYPE (temp), temp, clobber);
 	  gimple_push_cleanup (temp, clobber, false, pre_p, true);
 	}
-	  if (asan_poisoned_variables && dbg_cnt (asan_use_after_scope))
+	  if (asan_poisoned_variables
+	  && DECL_ALIGN (temp) <= MAX_SUPPORTED_STACK_ALIGNMENT
+	  && dbg_cnt (asan_use_after_scope))
 	{
 	  tree asan_cleanup = build_asan_poison_call_expr (temp);
 	  if (asan_cleanup)
diff --git a/gcc/testsuite/gcc.dg/asan/pr82517.c b/gcc/testsuite/gcc.dg/asan/pr82517.c
new file mode 100644
index 000..c7743ecb8b1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/asan/pr82517.c
@@ -0,0 +1,43 @@
+/* PR sanitizer/82517.  */
+
+static int *pp;
+
+void
+baz ()
+{
+  return;
+}
+
+void
+bar (int *p)
+{
+  *p = 1;
+}
+
+void
+foo (int a)
+{
+  if (a == 2)
+    {
+    lab:
+      baz ();
+      return;
+    }
+  if (a > 1)
+    {
+      int x __attribute__ ((aligned (256)));
+      pp = &x;
+      bar (&x);
+      if (!x)
+	goto lab;
+    }
+}
+
+int
+main (int argc, char **argv)
+{
+  foo (4);
+  foo (3);
+
+  return 0;
+}



Re: [RFA] Zen tuning part 9: Add support for scatter/gather in vectorizer costmodel

2017-10-19 Thread Jan Hubicka
Hi,
this is a proof-of-concept patch for the vectorizer costs to use the costs
used for rtx_cost and register_move_cost, which are readily available in
ix86_costs, instead of using its own set of random values.  At least until we
have evidence that vectorizer costs need to differ, I do not think we want to
complicate CPU tuning by having them twice.

This is of course quite an intrusive change to what we have because it affects
all x86 targets.  I have finally worked out that the "random" values used by
the AMD targets correspond to latencies of bdver1.

I have benchmarked them on Zen and also on a temporarily patched Czerny
(Haswell).  It seems to cause no regression and quite nice improvements:
  - 27.3% for facerec on Zen
  - 7% for mgrid on Haswell
  - maybe 1% for galgel on Haswell
  - 3% for facerec on Haswell
  - maybe 1% for apsi on Haswell
  - there may be a small off-noise improvement for rnflow and a regression
    for fatigue2 on Haswell

So I would say that the outcome is surprisingly good (especially given the
lack of noteworthy regressions).  I also know that the vectorizer hurts
performance on Zen and Mesa/tonto benchmarks, which is not cured by this patch
alone.

There is testsuite fallout though.

./testsuite/g++/g++.sum:FAIL: g++.dg/vect/slp-pr56812.cc  -std=c++11  scan-tree-dump-times slp1 "basic block vectorized" 1 (found 0 times)
./testsuite/g++/g++.sum:FAIL: g++.dg/vect/slp-pr56812.cc  -std=c++14  scan-tree-dump-times slp1 "basic block vectorized" 1 (found 0 times)
./testsuite/g++/g++.sum:FAIL: g++.dg/vect/slp-pr56812.cc  -std=c++98  scan-tree-dump-times slp1 "basic block vectorized" 1 (found 0 times)

  Here we now vectorize the loop first, while originally we unrolled it and
  SLP vectorized afterwards.

./testsuite/gcc/gcc.sum:FAIL: gcc.target/i386/l_fma_double_1.c scan-assembler-times vfmadd[123]+sd 56 (found 32 times)
./testsuite/gcc/gcc.sum:FAIL: gcc.target/i386/l_fma_double_2.c scan-assembler-times vfmadd[123]+sd 56 (found 32 times)
./testsuite/gcc/gcc.sum:FAIL: gcc.target/i386/l_fma_double_3.c scan-assembler-times vfmadd[123]+sd 56 (found 32 times)
./testsuite/gcc/gcc.sum:FAIL: gcc.target/i386/l_fma_double_4.c scan-assembler-times vfmadd[123]+sd 56 (found 32 times)
./testsuite/gcc/gcc.sum:FAIL: gcc.target/i386/l_fma_double_5.c scan-assembler-times vfmadd[123]+sd 56 (found 32 times)
./testsuite/gcc/gcc.sum:FAIL: gcc.target/i386/l_fma_double_6.c scan-assembler-times vfmadd[123]+sd 56 (found 32 times)
./testsuite/gcc/gcc.sum:FAIL: gcc.target/i386/l_fma_float_1.c scan-assembler-times vfmadd[123]+ss 120 (found 64 times)
./testsuite/gcc/gcc.sum:FAIL: gcc.target/i386/l_fma_float_2.c scan-assembler-times vfmadd[123]+ss 120 (found 64 times)
./testsuite/gcc/gcc.sum:FAIL: gcc.target/i386/l_fma_float_3.c scan-assembler-times vfmadd[123]+ss 120 (found 64 times)
./testsuite/gcc/gcc.sum:FAIL: gcc.target/i386/l_fma_float_4.c scan-assembler-times vfmadd[123]+ss 120 (found 64 times)
./testsuite/gcc/gcc.sum:FAIL: gcc.target/i386/l_fma_float_5.c scan-assembler-times vfmadd[123]+ss 120 (found 64 times)
./testsuite/gcc/gcc.sum:FAIL: gcc.target/i386/l_fma_float_6.c scan-assembler-times vfnmsub[123]+ss 120 (found 64 times)

And friends; clearly we do not vectorize all loops.  I did not look into the
details yet.

./testsuite/gcc/gcc.sum:FAIL: gcc.target/i386/pr61403.c scan-assembler blend

Here again we vectorize the loop while originally we did SLP.  I am not sure
why the loop vectorizer does not use blend.

./testsuite/gcc/gcc.sum:FAIL: gcc.target/i386/pr79683.c scan-assembler-times padd 1 (found 0 times)

Here we are supposed to vectorize two integer additions, but since the generic
cost model now claims that the latency of a vector add is twice that of an
integer add, we don't.  I think it makes sense.
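
The reasoning for pr79683.c can be illustrated with a toy cost comparison.
This is only a sketch: struct add_costs, the helper names and the unit costs
are hypothetical, the real vectorizer also accounts for loads, stores and
other overhead, and requiring the vector cost to be strictly lower is an
assumption.

```c
/* Hypothetical per-operation costs; not taken from any real cost table.  */
struct add_costs
{
  int scalar_add;	/* cost of one scalar integer addition */
  int vector_add;	/* cost of one vector integer addition */
};

/* Cost of N_ADDS independent scalar additions.  */
static int
scalar_total (const struct add_costs *c, int n_adds)
{
  return n_adds * c->scalar_add;
}

/* Cost of the same work done with vector additions that each process
   LANES elements.  */
static int
vector_total (const struct add_costs *c, int n_adds, int lanes)
{
  int n_vector_ops = (n_adds + lanes - 1) / lanes;	/* round up */
  return n_vector_ops * c->vector_add;
}

/* Assumed rule for this sketch: vectorize only if strictly cheaper.  */
static int
vectorization_profitable (const struct add_costs *c, int n_adds, int lanes)
{
  return vector_total (c, n_adds, lanes) < scalar_total (c, n_adds);
}
```

With two additions and two lanes, costs of { 1, 1 } give a vector total of 1
against a scalar total of 2, so vectorizing wins.  With { 1, 2 }, that is, a
vector add costing twice an integer add, both totals are 2 and the sketch
declines to vectorize.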

./testsuite/gcc/gcc.sum:FAIL: gcc.target/i386/pr79723.c scan-assembler mov[au]p.[ \t][^,]+, %gs:

Similarly here.

If it seems to make sense, I will clean it up (remove now unused entries and
scale conditional costs by COSTS_N_INSNS) and fix the testsuite fallout.

Honza

Index: i386.c
===
--- i386.c  (revision 253824)
+++ i386.c  (working copy)
@@ -44015,50 +44015,56 @@ static int
 ix86_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
                                  tree vectype, int)
 {
+  bool fp = false;
+  if (vectype != NULL)
+    fp = FLOAT_TYPE_P (vectype);
+
   switch (type_of_cost)
     {
       case scalar_stmt:
-        return ix86_cost->scalar_stmt_cost;
+        return fp ? ix86_cost->addss : COSTS_N_INSNS (1);
 
       case scalar_load:
-        return ix86_cost->scalar_load_cost;
+        return COSTS_N_INSNS (fp ? ix86_cost->sse_load[0]
+                              : ix86_cost->int_load [2]) / 2;
 
       case scalar_store:
-        return ix86_cost->scalar_store_cost;
+        return COSTS_N_INSNS (fp ? ix86_cost->sse_store[0]
+                              : ix86_cost->int_store [2]) / 2;
 
       case vector_stmt:
-return 

Re: [PATCH PR/82546] tree node size

2017-10-19 Thread Olivier Hainque

> On 18 Oct 2017, at 15:59, Eric Botcazou  wrote:
> 
>> I'd think so. LANG_TYPE is treated specially in several
>> places and Ada debug types are pretty sensitive so this would
>> require caution but I don't see/know-of obvious reasons why this
>> couldn't be done.
> 
> LANG_TYPE is only used in Ada to trigger the specific treatment in 
> gen_type_die_with_usage for DW_TAG_unspecified_type, so very minor.

OK, thanks for confirming Eric!


