[Bug libstdc++/77537] pair constructors do not properly SFINAE

2016-09-08 Thread ville.voutilainen at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77537

--- Comment #1 from Ville Voutilainen  ---
See https://gcc.gnu.org/ml/gcc-patches/2016-08/msg01230.html

Re: [PATCH] [PR libcpp/71681] Fix handling header.gcc in subdirectories

2016-09-08 Thread Andris Pavenis

On 09/08/2016 12:09 PM, Thomas Schwinge wrote:

Hi!

A few review comments:

On Wed, 7 Sep 2016 20:19:20 +0300, Andris Pavenis  wrote:

This patch fixes handling header.gcc in subdirectories when command line option 
-remap has been
used.

(I have not yet reviewed the logic of your change itself.)  Wouldn't it
be simpler to just unconditionally add a directory separator here?

Is it OK to assume that a directory separator always is "/"?  (Think DOS,
Windows etc.  But maybe there's some "translation layer" beneath this --
I don't know.)

No.

DJGPP supports both '/' and '\'. '\' is OK except in some cases (special handling of paths 
beginning with /dev/). Blind conversion off all '/' to '\' do not work for DJGPP due to this reason

(had related problems in directory gcc/ada).

Windows targets (WINGW, Cygwin): at least in Ada gcc/ada/s-os_lib.adb converts all '/' to '\' for 
Windows targets and without submitted but not yet committed patch also for DJGPP (that broke 
bootstrapping gcc for DJGPP due to gnatmake not working). I have not done however real testing for 
Windows targets myself.

Can you provide some test cases?  (Ah, I now see you got some "Test
script to reproduce problem" attached to  --
this should be turned into a regression test for the GCC testsuite.)

Which could more appropriate place for test-case?
- gcc/testsuite/c-c++-common/cpp
- gcc/testsuite/gcc.dg/cpp

Also should this test be in a separate subdirectory under either of them?

It is good practice to assign a GCC PR to yourself if you're working on
it, and it also helps to post to the PR a comment with a link to the
mailing list archive for patch submissions, etc.

Done

Andris



[Bug c/77520] wrong value for extended ASCII characters in -Wformat message

2016-09-08 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77520

Martin Sebor  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2016-09-09
   Assignee|unassigned at gcc dot gnu.org  |msebor at gcc dot 
gnu.org
 Ever confirmed|0   |1
  Known to fail||4.9.3, 5.3.0, 6.2.0, 7.0

--- Comment #1 from Martin Sebor  ---
Patch posted for review:
https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00487.html

[Bug c/77521] %qc format directive should quote non-printable characters

2016-09-08 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77521

Martin Sebor  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2016-09-09
   Assignee|unassigned at gcc dot gnu.org  |msebor at gcc dot 
gnu.org
 Ever confirmed|0   |1
  Known to fail||4.9.3, 5.3.0, 6.1.0, 7.0

--- Comment #1 from Martin Sebor  ---
Patch posted for review:
https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00487.html

[PATCH] avoid non-printable characters in diagnostics (c/77620, c/77521)

2016-09-08 Thread Martin Sebor

While working on the -Wformat-length pass I noticed that in some
diagnostics that make use of the %qc and %qs directives GCC prints
non-printable characters raw.  For example, it might print a newline,
corrupting the diagnostic stream (bug 77521).

Some other diagnostics that try to avoid this problem by using
a directive such as %x when the character is not printable might
use the sign-extended value of the character, printing a very
large hexadecimal value (bug 77520).

The attached patch changes the pretty printer to detect non-printable
characters in %qc and %qs directives (but not %c or %s) and print
those in hexadecimal (via "\\x%02x").

Martin

PS I used hexadecimal based on what c-format.c does but now that
I checked more carefully how %qE formats string literals I see it
uses octal.  I think hexadecimal is preferable because it avoids
ambiguity but I'm open to changing it to octal if there's a strong
preference for it.  Incidentally, %qE too suffers from bug 77520
(see below).  The patch doesn't try to fix that.

$ cat z.C && gcc z.C
constexpr int i = "ABC\x7f_\x80XYZ";
z.C:1:19: error: invalid conversion from ‘const char*’ to ‘int’ 
[-fpermissive]

 constexpr int i = "ABC\x7f_\x80XYZ";
   ^
z.C:1:19: error: ‘(int)((const char*)"ABC\177_\3777600XYZ")’ is not 
a constant expression
PR c/77520 - wrong value for extended ASCII characters in -Wformat message
PR c/77521 - %qc format directive should quote non-printable characters

gcc/c-family/ChangeLog:
2016-09-08  Martin Sebor  

	PR c/77520
	PR c/77521
	* c-format.c (argument_parser::find_format_char_info): Use %qc
	format directive unconditionally.

gcc/ChangeLog:
2016-09-08  Martin Sebor  

	PR c/77520
	PR c/77521
	* pretty-print.c (pp_quoted_string): New function.
	(pp_format): Call it for %c and %s directives.

gcc/testsuite/ChangeLog:
2016-09-08  Martin Sebor  

	PR c/77520
	PR c/77521
	* gcc.dg/pr77520.c: New test.
	* gcc.dg/pr77521.c: New test.

diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c
index 09d514e..0c17340 100644
--- a/gcc/c-family/c-format.c
+++ b/gcc/c-family/c-format.c
@@ -2355,20 +2355,12 @@ argument_parser::find_format_char_info (char format_char)
 ++fci;
   if (fci->format_chars == 0)
 {
-  if (ISGRAPH (format_char))
-	format_warning_at_char (format_string_loc, format_string_cst,
-format_chars - orig_format_chars,
-OPT_Wformat_,
-"unknown conversion type character"
-" %qc in format",
-format_char);
-  else
-	format_warning_at_char (format_string_loc, format_string_cst,
-format_chars - orig_format_chars,
-OPT_Wformat_,
-"unknown conversion type character"
-" 0x%x in format",
-format_char);
+  format_warning_at_char (format_string_loc, format_string_cst,
+			  format_chars - orig_format_chars,
+			  OPT_Wformat_,
+			  "unknown conversion type character"
+			  " %qc in format",
+			  format_char);
   return NULL;
 }
 
diff --git a/gcc/pretty-print.c b/gcc/pretty-print.c
index 325263e..a39815e 100644
--- a/gcc/pretty-print.c
+++ b/gcc/pretty-print.c
@@ -30,6 +30,8 @@ along with GCC; see the file COPYING3.  If not see
 #include 
 #endif
 
+static void pp_quoted_string (pretty_printer *, const char *, size_t = -1);
+
 /* Overwrite the given location/range within this text_info's rich_location.
For use e.g. when implementing "+" in client format decoders.  */
 
@@ -555,8 +557,20 @@ pp_format (pretty_printer *pp, text_info *text)
 	  break;
 
 	case 'c':
-	  pp_character (pp, va_arg (*text->args_ptr, int));
-	  break;
+	  {
+	/* When quoting, print alphanumeric, punctuation, and the space
+	   character unchanged, and all others in hexadecimal with the
+	   "\x" prefix.  Otherwise print them all unchanged.  */
+	int chr = va_arg (*text->args_ptr, int);
+	if (ISPRINT (chr) || !quote)
+	  pp_character (pp, chr);
+	else
+	  {
+		const char str [2] = { chr, '\0' };
+		pp_quoted_string (pp, str, 1);
+	  }
+	break;
+	  }
 
 	case 'd':
 	case 'i':
@@ -577,7 +591,10 @@ pp_format (pretty_printer *pp, text_info *text)
 	  break;
 
 	case 's':
-	  pp_string (pp, va_arg (*text->args_ptr, const char *));
+	  if (quote)
+	pp_quoted_string (pp, va_arg (*text->args_ptr, const char *));
+	  else
+	pp_string (pp, va_arg (*text->args_ptr, const char *));
 	  break;
 
 	case 'p':
@@ -939,6 +956,41 @@ pp_string (pretty_printer *pp, const char *str)
   pp_maybe_wrap_text (pp, str, str + strlen (str));
 }
 
+/* Append the leading N characters of STRING to the output area of
+   PRETTY-PRINTER, quoting in hexadecimal non-printable characters.
+   Setting N = -1 is as if N were set to strlen (STRING).  The STRING
+   may be line-wrapped if in appropriate mode.  */
+static void
+pp_quoted_string (pretty_printer *pp, const char *str, size_t n /* = -1 */)
+{
+  gcc_checking_assert (str);
+
+  const char *last = str;

[PATCH] PR fortran/77506

2016-09-08 Thread Steve Kargl
Regression tested on x86_64-*-freebsd.  OK to commit?

2016-09-08  Steven G. Kargl  

PR fortran/77506
* array.c (gfc_match_array_constructor): CHARACTER(len=*) cannot
appear in an array constructor.

2016-09-08  Steven G. Kargl  

PR fortran/77506
* gfortran.dg/pr77506.f90: New test.

Index: gcc/fortran/array.c
===
--- gcc/fortran/array.c (revision 240039)
+++ gcc/fortran/array.c (working copy)
@@ -1142,6 +1142,15 @@ gfc_match_array_constructor (gfc_expr **
  gfc_restore_last_undo_checkpoint ();
  goto cleanup;
}
+
+ if (ts.type == BT_CHARACTER
+ && ts.u.cl && !ts.u.cl->length && !ts.u.cl->length_from_typespec)
+   {
+ gfc_error ("Type-spec at %L cannot contain an asterisk for a "
+"type parameter", );
+ gfc_restore_last_undo_checkpoint ();
+ goto cleanup;
+   }
}
 }
   else if (m == MATCH_ERROR)
Index: gcc/testsuite/gfortran.dg/pr77506.f90
===
--- gcc/testsuite/gfortran.dg/pr77506.f90   (nonexistent)
+++ gcc/testsuite/gfortran.dg/pr77506.f90   (working copy)
@@ -0,0 +1,4 @@
+! { dg-do compile }
+program foo
+   print *, [character(len=*)::'ab','cd'] ! { dg-error "contain an asterisk" }
+end program foo

-- 
Steve


[PATCH 6/9] df selftests

2016-09-08 Thread David Malcolm
gcc/ChangeLog:
* df-core.c: Include selftest.h and selftest-rtl.h.
(selftest::dataflow_test::dataflow_test): New ctor.
(selftest::dataflow_test::~dataflow_test): New dtor.
(selftest::test_df_single_set): New function.
(selftest::df_core_c_tests): New function.
* selftest-run-tests.c (selftest::run_tests): Call it.
* selftest.h (selftest::df_core_c_tests): New decl.
---
 gcc/df-core.c| 77 
 gcc/selftest-run-tests.c |  1 +
 gcc/selftest.h   |  1 +
 3 files changed, 79 insertions(+)

diff --git a/gcc/df-core.c b/gcc/df-core.c
index e531d58..cb8e2f9 100644
--- a/gcc/df-core.c
+++ b/gcc/df-core.c
@@ -384,6 +384,8 @@ are write-only operations.
 #include "cfganal.h"
 #include "tree-pass.h"
 #include "cfgloop.h"
+#include "selftest.h"
+#include "selftest-rtl.h"
 
 static void *df_get_bb_info (struct dataflow *, unsigned int);
 static void df_set_bb_info (struct dataflow *, unsigned int, void *);
@@ -2482,3 +2484,78 @@ debug_df_chain (struct df_link *link)
   df_chain_dump (link, stderr);
   fputc ('\n', stderr);
 }
+
+#if CHECKING_P
+
+namespace selftest {
+
+/* dataflow_test's constructor.  Initialize df.  */
+
+dataflow_test::dataflow_test ()
+{
+  rest_of_handle_df_initialize ();
+}
+
+/* dataflow_test's destructor.  Clean up df.  */
+
+dataflow_test::~dataflow_test ()
+{
+  rest_of_handle_df_finish ();
+}
+
+/* Verify df_note on a trivial function.  */
+
+void
+test_df_single_set ()
+{
+  const char *input_dump
+= "(insn 1 0 0 2 (set (reg:SI 100) (reg:SI 101)) -1 (nil))\n";
+  rtl_dump_test t (input_dump, 100);
+  //print_rtl_with_bb (stdout, get_insns (), 1024);
+
+  dataflow_test dftest;
+
+  df_note_add_problem ();
+  df_analyze ();
+  //df_dump (stderr);
+  df_finish_pass (false);
+
+  rtx_insn *insn = get_insn_by_uid (1);
+
+  ASSERT_NE (NULL, REG_NOTES (insn));
+  rtx_expr_list *note0 = as_a  (REG_NOTES (insn));
+
+  rtx_expr_list *note1 = note0->next ();
+  ASSERT_NE (NULL, note1);
+
+  ASSERT_EQ (NULL, note1->next ());
+
+  ASSERT_EQ (SET_SRC (PATTERN (insn)), note0->element ());
+  ASSERT_EQ (REG_DEAD, REG_NOTE_KIND (note0));
+
+  ASSERT_EQ (SET_DEST (PATTERN (insn)), note1->element ());
+  ASSERT_EQ (REG_UNUSED, REG_NOTE_KIND (note1));
+
+  struct df_lr_bb_info *bb_info = df_lr_get_bb_info (2);
+  ASSERT_NE (NULL, bb_info);
+
+  /* "r100 = r101;" so we should have a use of r101.  */
+  ASSERT_FALSE (bitmap_bit_p (_info->use, t.effective_regno (100)));
+  ASSERT_TRUE (bitmap_bit_p (_info->use, t.effective_regno (101)));
+
+  /* ...and a def of r100.  */
+  ASSERT_TRUE (bitmap_bit_p (_info->def, t.effective_regno (100)));
+  ASSERT_FALSE (bitmap_bit_p (_info->def, t.effective_regno (101)));
+}
+
+/* Run all of the selftests within this file.  */
+
+void
+df_core_c_tests ()
+{
+  test_df_single_set ();
+}
+
+} // namespace selftest
+
+#endif /* #if CHECKING_P */
diff --git a/gcc/selftest-run-tests.c b/gcc/selftest-run-tests.c
index c90037c..14e5828 100644
--- a/gcc/selftest-run-tests.c
+++ b/gcc/selftest-run-tests.c
@@ -64,6 +64,7 @@ selftest::run_tests ()
   gimple_c_tests ();
   rtl_tests_c_tests ();
   read_rtl_function_c_tests ();
+  df_core_c_tests ();
 
   /* Higher-level tests, or for components that other selftests don't
  rely on.  */
diff --git a/gcc/selftest.h b/gcc/selftest.h
index 75fea6f..037a5ee 100644
--- a/gcc/selftest.h
+++ b/gcc/selftest.h
@@ -190,6 +190,7 @@ extern void forcibly_ggc_collect ();
 /* Declarations for specific families of tests (by source file), in
alphabetical order.  */
 extern void bitmap_c_tests ();
+extern void df_core_c_tests ();
 extern void diagnostic_c_tests ();
 extern void diagnostic_show_locus_c_tests ();
 extern void edit_context_c_tests ();
-- 
1.8.5.3



[PATCH 5/9] Introduce class function_reader

2016-09-08 Thread David Malcolm
This patch generalizes the RTL-reading capabilities so that they
can be run on the host as well as the build machine.
The available rtx in rtl.def changes dramatically between these
two configurations, so a fair amount of #ifdef GENERATOR_FILE is
required to express this.

This patch introduces a function_reader subclass of rtx_reader,
capable of reading an RTL function dump (or part of one),
reconstructing a cfun with a CFG and basic blocks containing insns.

gcc/ChangeLog:
* Makefile.in (OBJS): Add errors.o, read-md.o, read-rtl.o,
read-rtl-function.o, and selftest-rtl.o.
* cfgexpand.c (pass_expand::execute): Move stack initializations
to rtl_data::init_stack_alignment and call it.  Pass "true"
for new "emit_insns" param of expand_function_start.
* emit-rtl.c (gen_reg_rtx): Move regno_pointer_align and
regno_reg_rtx resizing logic to...
(emit_status::ensure_regno_capacity): ...this new method.
(init_emit): Allocate regno_reg_rtx using ggc_cleared_vec_alloc
rather than ggc_vec_alloc.
(rtl_data::init_stack_alignment): New method.
(get_insn_by_uid): New function.
* emit-rtl.h (rtl_data::init_stack_alignment): New method.
* errors.c: Use consistent pattern for bconfig.h vs config.h
includes.
(progname): Wrap with #ifdef GENERATOR_FILE.
(error): Likewise.  Add "error: " to message.
(fatal): Likewise.
(internal_error): Likewise.
(trim_filename): Likewise.
(fancy_abort): Likewise.
* errors.h (struct file_location): Move here from read-md.h.
(file_location::file_location): Likewise.
(error_at): New decl.
* function-tests.c (selftest::verify_three_block_rtl_cfg): Remove
"static".
* function.c (instantiate_decls): Guard call to
instantiate_decls_1 with if (DECL_INITIAL (fndecl)).
(expand_function_start): Add param "emit_insns", and use it to
guard the various gen/emit calls.
* function.h (emit_status::ensure_regno_capacity): New method.
(expand_function_start): Add bool param to decl.
* gensupport.c (gen_reader::gen_reader): Add NULL for new policy
param of rtx_reader ctor.
* print-rtl.c (print_rtx): Print "(nil)" rather than an empty
string for NULL strings.  Print "(nil)" for NULL basic blocks.
* read-md.c (read_skip_construct): Provide forward decl.
(read_skip_spaces): Support '/'.
(require_char): New function.
(require_word_ws): New function.
(peek_char): New function.
(read_name): Rename to...
(read_name_1): ...this new static function, adding "out_loc" param,
and converting "missing name or number" to returning false, rather
than failing.
(read_name): Reimplement in terms of read_name_1.
(read_name_or_nil): New function.
(read_string): Handle "(nil)" by returning NULL.  */
(rtx_reader::rtx_reader): Add rtl_reader_policy * param, using
it to initialize m_policy.
(rtx_reader::~rtx_reader): Free m_base_dir.  Clean up global data.
* read-md.h (struct file_location): Move to errors.h.
(file_location::file_location): Likewise.
Include errors.h.
(class regno_remapper): New class.
(struct rtl_reader_policy): New struct.
(rtx_reader::rtx_reader): Add rtl_reader_policy * param.
(rtx_reader::add_fixup_insn_uid): New vfunc.
(rtx_reader::add_fixup_bb): New vfunc.
(rtx_reader::add_fixup_note_insn_basic_block): New vfunc.
(rtx_reader::add_fixup_source_location): New vfunc.
(rtx_reader::add_fixup_jump_label): New vfunc.
(rtx_reader::add_fixup_expr): New vfunc.
(rtx_reader::remap_regno): New method.
(rtx_reader::m_policy): New field.
(noop_reader::noop_reader): Add NULL for new policy param of
rtx_reader ctor.
(peek_char): New decl.
(require_char): New decl.
(require_word_ws): New decl.
(read_name): Convert return type from void to file_location.
(read_name_or_nil): New decl.
* read-rtl-function.c: New file.
* read-rtl-function.h: New file.
* read-rtl.c: Potentially include config.h rather than bconfig.h.
For host, include function.h and emit-rtl.h.
(apply_subst_iterator): Wrap with #ifdef GENERATOR_FILE.
(bind_subst_iter_and_attr): Likewise.
(add_condition_to_string): Likewise.
(add_condition_to_rtx): Likewise.
(apply_attribute_uses): Likewise.
(add_current_iterators): Likewise.
(apply_iterators): Likewise.
(initialize_iterators): Guard usage of apply_subst_iterator with
#ifdef GENERATOR_FILE.
(read_conditions): Wrap with #ifdef GENERATOR_FILE.
(read_mapping): Likewise.
(add_define_attr_for_define_subst): Likewise.

[PATCH 4/9] Expose forcibly_ggc_collect and run it after all selftests

2016-09-08 Thread David Malcolm
Force a GC at the end of the selftests, to shake out GC-related
issues.  For example, if any GC-managed items have buggy (or missing)
finalizers, this last collection will ensure that things that were
failed to be finalized can be detected by valgrind.

gcc/ChangeLog:
* ggc-tests.c (forcibly_ggc_collect): Rename to...
(selftest::forcibly_ggc_collect): ...this, and remove "static".
(test_basic_struct): Update for above renaming.
(test_length): Likewise.
(test_union): Likewise.
(test_finalization): Likewise.
(test_deletable_global): Likewise.
(test_inheritance): Likewise.
(test_chain_next): Likewise.
(test_user_struct): Likewise.
(test_tree_marking): Likewise.
* selftest-run-tests.c (selftest::run_tests): Call
selftest::forcibly_ggc_collect at the end of the selftests.
* selftest.h (selftest::forcibly_ggc_collect): New decl.
---
 gcc/ggc-tests.c  | 28 ++--
 gcc/selftest-run-tests.c |  6 ++
 gcc/selftest.h   |  5 +
 3 files changed, 25 insertions(+), 14 deletions(-)

diff --git a/gcc/ggc-tests.c b/gcc/ggc-tests.c
index 7f97231..b9cd276 100644
--- a/gcc/ggc-tests.c
+++ b/gcc/ggc-tests.c
@@ -27,19 +27,19 @@ along with GCC; see the file COPYING3.  If not see
 
 #if CHECKING_P
 
-/* The various GTY markers must be outside of a namespace to be seen by
-   gengtype, so we don't put this file within the selftest namespace.  */
-
 /* A helper function for writing ggc tests.  */
 
-static void
-forcibly_ggc_collect ()
+void
+selftest::forcibly_ggc_collect ()
 {
   ggc_force_collect = true;
   ggc_collect ();
   ggc_force_collect = false;
 }
 
+/* The various GTY markers must be outside of a namespace to be seen by
+   gengtype, so we don't put this file within the selftest namespace.  */
+
 
 
 /* Verify that a simple struct works, and that it can
@@ -58,7 +58,7 @@ test_basic_struct ()
   root_test_struct = ggc_cleared_alloc  ();
   root_test_struct->other = ggc_cleared_alloc  ();
 
-  forcibly_ggc_collect ();
+  selftest::forcibly_ggc_collect ();
 
   ASSERT_TRUE (ggc_marked_p (root_test_struct));
   ASSERT_TRUE (ggc_marked_p (root_test_struct->other));
@@ -88,7 +88,7 @@ test_length ()
   for (int i = 0; i < count; i++)
 root_test_of_length->elem[i] = ggc_cleared_alloc  ();
 
-  forcibly_ggc_collect ();
+  selftest::forcibly_ggc_collect ();
 
   ASSERT_TRUE (ggc_marked_p (root_test_of_length));
   for (int i = 0; i < count; i++)
@@ -162,7 +162,7 @@ test_union ()
   test_struct *referenced_by_other = ggc_cleared_alloc  ();
   other->m_ptr = referenced_by_other;
 
-  forcibly_ggc_collect ();
+  selftest::forcibly_ggc_collect ();
 
   ASSERT_TRUE (ggc_marked_p (root_test_of_union_1));
   ASSERT_TRUE (ggc_marked_p (ts));
@@ -202,7 +202,7 @@ test_finalization ()
 
   test_struct_with_dtor::dtor_call_count = 0;
 
-  forcibly_ggc_collect ();
+  selftest::forcibly_ggc_collect ();
 
   /* Verify that the destructor was run for each instance.  */
   ASSERT_EQ (count, test_struct_with_dtor::dtor_call_count);
@@ -220,7 +220,7 @@ test_deletable_global ()
   test_of_deletable = ggc_cleared_alloc  ();
   ASSERT_TRUE (test_of_deletable != NULL);
 
-  forcibly_ggc_collect ();
+  selftest::forcibly_ggc_collect ();
 
   ASSERT_EQ (NULL, test_of_deletable);
 }
@@ -293,7 +293,7 @@ test_inheritance ()
   test_some_subclass_as_base_ptr = new some_subclass ();
   test_some_other_subclass_as_base_ptr = new some_other_subclass ();
 
-  forcibly_ggc_collect ();
+  selftest::forcibly_ggc_collect ();
 
   /* Verify that the roots and everything referenced by them got marked
  (both for fields in the base class and those in subclasses).  */
@@ -372,7 +372,7 @@ test_chain_next ()
   tail_node = new_node;
 }
 
-  forcibly_ggc_collect ();
+  selftest::forcibly_ggc_collect ();
 
   /* If we got here, we survived.  */
 
@@ -439,7 +439,7 @@ test_user_struct ()
 
   num_calls_to_user_gt_ggc_mx = 0;
 
-  forcibly_ggc_collect ();
+  selftest::forcibly_ggc_collect ();
 
   ASSERT_TRUE (ggc_marked_p (root_user_struct_ptr));
   ASSERT_TRUE (ggc_marked_p (referenced));
@@ -457,7 +457,7 @@ test_tree_marking ()
 {
   dummy_unittesting_tree = build_int_cst (integer_type_node, 1066);
 
-  forcibly_ggc_collect ();
+  selftest::forcibly_ggc_collect ();
 
   ASSERT_TRUE (ggc_marked_p (dummy_unittesting_tree));
 }
diff --git a/gcc/selftest-run-tests.c b/gcc/selftest-run-tests.c
index d9d3ea1..54a9b0f 100644
--- a/gcc/selftest-run-tests.c
+++ b/gcc/selftest-run-tests.c
@@ -80,6 +80,12 @@ selftest::run_tests ()
   /* Run any lang-specific selftests.  */
   lang_hooks.run_lang_selftests ();
 
+  /* Force a GC at the end of the selftests, to shake out GC-related
+ issues.  For example, if any GC-managed items have buggy (or missing)
+ finalizers, this last collection will ensure that things that were
+ failed to be finalized can be detected by valgrind.  */
+  forcibly_ggc_collect ();

[PATCH 0/9] RFC: selftests based on RTL dumps

2016-09-08 Thread David Malcolm
The current selftest code is adequate for testing individual
instructions, but most interesting passes have logic involving the
interaction of multiple instructions, or require a CFG and function
to be runnable.  In theory we could write selftests by programatically
constructing a function and CFG "by hand" via API calls, but this is
awkward, tedious, and the resulting code is unnatural.  Examples can
be seen in function-tests.c; that file constructs trivial 3-block
functions, and is pushing the limits of what I'd want to write
"by hand".

This patch kit provides a more natural way to write RTL selftests,
by providing a parser for RTL dumps (or fragments of RTL dumps).  You
can copy and paste fragments of RTL dump into the source for a pass
and then have selftests that run part of the pass on that dump,
asserting that desired outcomes occur.

This is an updated and rewritten version of the RTL frontend work I
posted in May (c.f. "[PATCH 0/4] RFC: RTL frontend"
https://gcc.gnu.org/ml/gcc-patches/2016-05/msg00352.html).

In that patch kit, I introduced an rtl1 frontend, capable of parsing
RTL dumps, and running just one RTL pass on them, in the hope of
better testing individual RTL passes.

This patch kit takes a somewhat different approach: it provides
the infrastructure for parsing RTL dumps, but doesn't wire it up as
a frontend.  Instead, it just provides a set of classes for use when
writing selftests.  It would be possible to build out an "rtl1"
frontend as a followup to this kit.

The advantage of this approach is that it's possible to run and test
passes at finer granularity: for example, rather than being limited
to testing all of, say, pass "final", we can also write tests that
run just final_scan_insn on individual rtx_insn *, and verify that
the expected output is emitted.  Tests can be written at various
different levels, testing how the components of a pass handle fragments
of an insn, how they handle entire insns, basic blocks, and so on up
to running a whole pass on a whole function.

The disadvantage is that changing a selftest requires recompilation
of cc1 (though that drawback affects selftests in general).

An overview of the kit follows; patches 6-9 are probably the most
interesting, as they show example of the kinds of selftest that can
be written using this approach:

  * Patch 1 tidies up some global state in .md parsing, wrapping it up in
  an rtx_reader class.

  * Patches 2-4 add some selftest support code.

  * Patch 5 introduces a function_reader subclass of patch 1's rtx_reader,
  capable of parsing RTL function dumps (or fragments thereof), and
  generalizes things so that rtx parsing can be run from the host, rather
  than just at build time.

   * Patch 6 uses the infrastructure to write a dataflow selftest.  It's
   a trivial example, but hopefully shows how more interesting selftests
   could be written.  (it's much easier to write a selftest if a similar
   one already exists)

   * Patch 7 does the same, but for combine.c.

   * Patch 8 does the same, but for final.c, showing examples of both
   assembling an entire function, and of assembling individual rtx_insn *
   (to exercise the .md files and asm output code)

   * Patch 9 does the same, but for cse.c.  I attempted to create a
   selftest that directly reproduces PR 71779, though I haven't yet
   been able to to reproduce the issue (just load the insns and run cse
   on them).

These tests are very target-specific and were developed mostly for
target==x86_64, and a little for target==aarch64.
I put clauses like this in the various test functions, which is a kludge:

/* Only run these tests for i386.  */
 #ifndef I386_OPTS_H
return;
 #endif

Is there a better way to express this condition?  (I guess I could
add a selftest::target_is_x86_p () predicate).

Posting for comment (doesn't fully bootstrap yet due to a few stray
warnings).  Patches 1-4 could probably be applicable even without
the rest of the kit.

Thoughts?

David Malcolm (9):
  Introduce class rtx_reader
  Add selftest::read_file
  selftest.h: add temp_override fixture
  Expose forcibly_ggc_collect and run it after all selftests
  Introduce class function_reader
  df selftests
  combine.c selftests
  final.c selftests
  cse.c selftests

 gcc/Makefile.in  |5 +
 gcc/cfgexpand.c  |7 +-
 gcc/combine.c|  155 ++
 gcc/cse.c|  109 +
 gcc/df-core.c|   77 +++
 gcc/emit-rtl.c   |   70 ++-
 gcc/emit-rtl.h   |2 +
 gcc/errors.c |   23 +-
 gcc/errors.h |   13 +
 gcc/final.c  |  271 +++
 gcc/function-tests.c |2 +-
 gcc/function.c   |   41 +-
 gcc/function.h   |4 +-
 gcc/genconstants.c   |3 +-
 gcc/genenums.c   |3 +-
 gcc/genmddeps.c  |3 +-
 gcc/genpreds.c   |9 +-
 gcc/gensupport.c |   29 +-
 gcc/gensupport.h |6 +-
 gcc/ggc-tests.c  

[PATCH 1/9] Introduce class rtx_reader

2016-09-08 Thread David Malcolm
Bundle up various global variables within gensupport.c into a
class rtx_reader, with a view towards making it easier to run the
code more than once in-process.

gcc/ChangeLog:
* genconstants.c (main): Introduce noop_reader and convert call
to read_md_files to a method call.
* genenums.c (main): Likewise.
* genmddeps.c (main): Likewise.
* genpreds.c (write_tm_constrs_h): Replace use of "in_fname" with
rtx_reader_ptr->get_top_level_filename ().
(write_tm_preds_h): Likewise.
(write_insn_preds_c): Likewise.
* gensupport.c (class gen_reader): New subclass of rtx_reader.
(rtx_handle_directive): Convert to...
(gen_reader::handle_unknown_directive): ...this.
(init_rtx_reader_args_cb): Convert return type from bool to
rtx_reader *.  Create a gen_reader instance, using it for the
call to read_md_files.  Return it if no errors occur.
(init_rtx_reader_args): Convert return type from bool to
rtx_reader *.
* gensupport.h (init_rtx_reader_args_cb): Likewise.
(init_rtx_reader_args_cb): Likewise.
* read-md.c (struct file_name_list): Move to class rtx_reader.
(read_md_file): Delete in favor of rtx_reader::m_read_md_file.
(read_md_filename): Delete in favor of
rtx_reader::m_read_md_filename.
(read_md_lineno): Delete in favor of rtx_reader::m_read_md_lineno.
(in_fname): Delete in favor of rtx_reader::m_toplevel_fname.
(base_dir): Delete in favor of rtx_reader::m_base_dir.
(first_dir_md_include): Delete in favor of
rtx_reader::m_first_dir_md_include.
(last_dir_md_include_ptr): Delete in favor of
rtx_reader::m_last_dir_md_include_ptr.
(max_include_len): Delete.
(rtx_reader_ptr): New.
(fatal_with_file_and_line): Use get_filename and get_lineno
accessors of rtx_reader_ptr.
(require_char_ws): Likewise.
(rtx_reader::read_char): New method, based on ::read_char.
(rtx_reader::unread_char): New method, based on ::unread_char.
(read_escape): Use get_filename and get_lineno accessors of
rtx_reader_ptr.
(read_braced_string): Use get_lineno accessor of rtx_reader_ptr.
(read_string): Use get_filename and get_lineno accessors of
rtx_reader_ptr.
(rtx_reader::rtx_reader): New ctor.
(rtx_reader::~rtx_reader): New dtor.
(handle_include): Convert from a function to...
(rtx_reader::handle_include): ...this method, converting
handle_directive from a callback to a virtual function.
(handle_file): Likewise, converting to...
(rtx_reader::handle_file): ...this method.
(handle_toplevel_file): Likewise, converting to...
(rtx_reader::handle_toplevel_file): ...this method.
(rtx_reader::get_current_location): New method.
(parse_include): Convert from a function to...
(rtx_reader::add_include_path): ...this method, dropping redundant
update to unused max_include_len.
(read_md_files): Convert from a function to...
(rtx_reader::read_md_files): ...this method, converting
handle_directive from a callback to a virtual function.
(noop_reader::handle_unknown_directive): New method.
* read-md.h (directive_handler_t): Delete this typedef.
(in_fname): Delete.
(read_md_file): Delete.
(read_md_lineno): Delete.
(read_md_filename): Delete.
(class rtx_reader): New class.
(rtx_reader_ptr): New decl.
(class noop_reader): New subclass of rtx_reader.
(read_char): Reimplement in terms of rtx_reader::read_char.
(unread_char): Reimplement in terms of rtx_reader::unread_char.
(read_md_files): Delete.
* read-rtl.c (read_rtx_code): Update for deletion of globals
read_md_filename and read_md_lineno.
---
 gcc/genconstants.c |   3 +-
 gcc/genenums.c |   3 +-
 gcc/genmddeps.c|   3 +-
 gcc/genpreds.c |   9 ++-
 gcc/gensupport.c   |  29 +--
 gcc/gensupport.h   |   6 +-
 gcc/read-md.c  | 225 ++---
 gcc/read-md.h  |  98 ++-
 gcc/read-rtl.c |   3 +-
 9 files changed, 242 insertions(+), 137 deletions(-)

diff --git a/gcc/genconstants.c b/gcc/genconstants.c
index c10e3e3..e8be5b6 100644
--- a/gcc/genconstants.c
+++ b/gcc/genconstants.c
@@ -79,7 +79,8 @@ main (int argc, const char **argv)
 {
   progname = "genconstants";
 
-  if (!read_md_files (argc, argv, NULL, NULL))
+  noop_reader reader;
+  if (!reader.read_md_files (argc, argv, NULL))
 return (FATAL_EXIT_CODE);
 
   /* Initializing the MD reader has the side effect of loading up
diff --git a/gcc/genenums.c b/gcc/genenums.c
index db46a67..8af8d9a 100644
--- a/gcc/genenums.c
+++ b/gcc/genenums.c
@@ -49,7 +49,8 @@ main (int argc, const char **argv)
 {
   progname = "genenums";
 
-  if 

[PATCH 8/9] final.c selftests

2016-09-08 Thread David Malcolm
gcc/ChangeLog:
* final.c: Include selftest.h and selftest-rtl.h.
(class selftest::temp_asm_out): New subclass of
selftest::named_temp_file.
(selftest::temp_asm_out::temp_asm_out): New ctor.
(selftest::temp_asm_out::~temp_asm_out): New dtor.
(class selftest::asm_out_test): New subclass of
selftest::rtl_dump_test.
(selftest::asm_out_test::asm_out_test): New ctor.
(selftest::test_jump_insn): New function.
(selftest::test_empty_function): New function.
(selftest::test_asm_for_insn): New function.
(TEST_ASM_FOR_INSN): New macro.
(selftest::test_x86_64_leal): New function.
(selftest::test_x86_64_negl): New function.
(selftest::test_x86_64_cmpl): New function.
(selftest::test_x86_64_cmovge): New function.
(selftest::test_x86_64_ret): New function.
(selftest::final_c_tests): New function.
* selftest-run-tests.c (selftest::run_tests): Call
selftest::final_c_tests.
* selftest.h (selftest::final_c_tests): New decl.
---
 gcc/final.c  | 271 +++
 gcc/selftest-run-tests.c |   1 +
 gcc/selftest.h   |   1 +
 3 files changed, 273 insertions(+)

diff --git a/gcc/final.c b/gcc/final.c
index eccc3d8..990f898 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -78,6 +78,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "asan.h"
 #include "rtl-iter.h"
 #include "print-rtl.h"
+#include "selftest.h"
+#include "selftest-rtl.h"
 
 #ifdef XCOFF_DEBUGGING_INFO
 #include "xcoffout.h"  /* Needed for external data declarations.  */
@@ -4894,3 +4896,272 @@ get_call_reg_set_usage (rtx_insn *insn, HARD_REG_SET 
*reg_set,
   COPY_HARD_REG_SET (*reg_set, default_set);
   return false;
 }
+
+#if CHECKING_P
+
+namespace selftest {
+
+/* Fixture for temporarily setting the global "asm_out_file"
+   to a named temporary file.  */
+
+class temp_asm_out : public named_temp_file
+{
+ public:
+  temp_asm_out (const location );
+  ~temp_asm_out ();
+};
+
+/* Constructor.  Open a tempfile for writing and set "asm_out_file" to it.  */
+
+temp_asm_out::temp_asm_out (const location )
+: named_temp_file (".s")
+{
+  gcc_assert (asm_out_file == NULL);
+  asm_out_file = fopen (get_filename (), "w");
+  if (!asm_out_file)
+::selftest::fail_formatted (loc, "unable to open tempfile: %s",
+   get_filename ());
+}
+
+/* Destructor.  Close the tempfile and unset "asm_out_file".
+   The tempfile is unlinked by the named_temp_file dtor.  */
+
+temp_asm_out::~temp_asm_out ()
+{
+  fclose (asm_out_file);
+  asm_out_file = NULL;
+}
+
+/* A subclass of rtl_dump_test for testing asm output.
+   Overrides asm_out_file to a tempfile, and temporarily
+   sets "reload_completed = 1;".  */
+
+class asm_out_test : public rtl_dump_test
+{
+ public:
+  asm_out_test (const location , const char *dump_content);
+
+  /* Get the asm output written so far.  The caller should free
+ the returned ptr.  */
+  char *
+  get_output (const location ) const
+  {
+fflush (asm_out_file);
+return read_file (loc, m_asm_out.get_filename ());
+  }
+
+ private:
+  temp_asm_out m_asm_out;
+  temp_override  m_override_reload_completed;
+  temp_override  m_override_in_section;
+};
+
+/* asm_out_test's constructor.  Write DUMP_CONTENT to a tempfile,
+   overrides "asm_out_file" to a tempfile, and temporarily
+   sets "reload_completed = 1;".
+   Assume that no pseudo regs are present in DUMP_CONTENT (by
+   using the default arg to rtl_dump_test's ctor).  */
+
+asm_out_test::asm_out_test (const location , const char *dump_content)
+: rtl_dump_test (dump_content),
+  m_asm_out (loc),
+  /* Temporarily set reload_completed = true.
+ Needed e.g. by ix86_can_use_return_insn_p.  */
+  m_override_reload_completed (reload_completed, true),
+  /* Temporarily set "in_section = NULL;".  */
+  m_override_in_section (in_section, NULL)
+{
+}
+
+/* Test writing asm for a jump_insn.  */
+
+static void
+test_jump_insn ()
+{
+  const char *input_dump
+= ("(jump_insn 1 0 2 4 (set (pc)\n"
+   "(label_ref 3)) ../../src/gcc/testsuite/rtl.dg/test.c:4 -1\n"
+   " (nil)\n"
+   " -> 3)\n"
+   "(barrier 2 1 3)\n"
+   "(code_label 3 2 0 5 2 (nil) [1 uses])\n");
+  asm_out_test t (SELFTEST_LOCATION, input_dump);
+
+  rtx_insn *jump_insn = get_insn_by_uid (1);
+  ASSERT_EQ (JUMP_INSN, GET_CODE (jump_insn));
+
+  rtx_insn *barrier = get_insn_by_uid (2);
+  ASSERT_EQ (BARRIER, GET_CODE (barrier));
+
+  rtx_insn *code_label = get_insn_by_uid (3);
+  ASSERT_EQ (CODE_LABEL, GET_CODE (code_label));
+
+  int seen;
+  final_scan_insn (jump_insn, stderr, 0, 0, );
+
+  char *c = t.get_output (SELFTEST_LOCATION);
+  ASSERT_STREQ ("\tjmp\t.L2\n", c);
+  free (c);
+}
+
+/* Test writing asm for an empty function.  */
+
+static void
+test_empty_function ()
+{
+  /* Dump of the dump from cc1 in "test.c.289r.dwarf2" 

[PATCH 9/9] cse.c selftests

2016-09-08 Thread David Malcolm
This patch uses rtl_dump_test to start building out a test suite
for cse.

I attempted to create a reproducer for PR 71779; however I'm not yet
able to replicate the bogus cse reported there via the test case.

gcc/ChangeLog:
* cse.c: Include selftest.h and selftest-rtl.h.
(selftest::test_simple_cse): New function.
(selftest::test_pr71779): New function.
(selftest::cse_c_tests): New function.
* selftest-run-tests.c (selftest::run_tests): Call
selftest::cse_c_tests.
* selftest.h (selftest::cse_c_tests): New decl.
---
 gcc/cse.c| 109 +++
 gcc/selftest-run-tests.c |   1 +
 gcc/selftest.h   |   1 +
 3 files changed, 111 insertions(+)

diff --git a/gcc/cse.c b/gcc/cse.c
index 0bfd7ff..f4f06fe 100644
--- a/gcc/cse.c
+++ b/gcc/cse.c
@@ -41,6 +41,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-pass.h"
 #include "dbgcnt.h"
 #include "rtl-iter.h"
+#include "selftest.h"
+#include "selftest-rtl.h"
 
 #ifndef LOAD_EXTEND_OP
 #define LOAD_EXTEND_OP(M) UNKNOWN
@@ -7773,3 +7775,110 @@ make_pass_cse_after_global_opts (gcc::context *ctxt)
 {
   return new pass_cse_after_global_opts (ctxt);
 }
+
+#if CHECKING_P
+
+namespace selftest {
+
+/* Selftests for CSE.  */
+
+/* Simple test of eliminating a redundant (reg + 1) computation
+   i.e. that:
+ r101 = r100 + 1;
+ r102 = r100 + 1; <<< common subexpression
+ *r103 = r101 * r102;
+   can be CSE-ed to:
+ r101 = r100 + 1;
+ r102 = r101; <<< replaced
+ *r103 = r101 * r102;
+   by cse_main.  */
+
+static void
+test_simple_cse ()
+{
+  /* Only run this tests for i386.  */
+#ifndef I386_OPTS_H
+  return;
+#endif
+
+  const char *input_dump
+= (/* "r101 = r100 + 1;" */
+   "(insn 1 0 2 2 (set (reg:SI 101)\n"
+   "   (plus:SI (reg:SI 100)\n"
+   "(const_int 1 [0x1]))) -1 (nil))\n"
+   /* "r102 = r100 + 1;" */
+   "(insn 2 1 3 2 (set (reg:SI 102)\n"
+   "   (plus:SI (reg:SI 100)\n"
+   "(const_int 1 [0x1]))) -1 (nil))\n"
+   /* "*r103 = r101 * r102;" */
+   "(insn 3 2 0 2 (set (mem:SI (reg:SI 103) [1 i+0 S4 A32])\n"
+   "   (mult:SI (reg:SI 101) (reg:SI 102))) -1 (nil))\n"
+   );
+  rtl_dump_test t (input_dump, 100);
+  dataflow_test df_test;
+
+  int tem;
+  tem = cse_main (get_insns (), max_reg_num ());
+  ASSERT_EQ (0, tem);
+
+  /* Verify that insn 2's SET_SRC has been replaced with
+ the SET_DEST of insn 1.  */
+  ASSERT_EQ (SET_DEST (PATTERN (get_insn_by_uid (1))),
+SET_SRC (PATTERN (get_insn_by_uid (2;
+}
+
+/* Towards a regression test for PR 71779.  */
+
+static void
+test_pr71779 ()
+{
+  /* Only run this tests for target==aarch64.  */
+#ifndef GCC_AARCH64_H
+  return;
+#endif
+
+  /* Dump taken from comment 2 of PR 71779, of
+ "...the relevant memory access coming out of expand"
+ with basic block IDs added, and prev/next insns set to
+ 0 at ends.  */
+  const char *input_dump
+= (";; MEM[(struct isl_obj *)] = _obj_map_vtable;\n"
+   "(insn 1045 0 1046 2 (set (reg:SI 480)\n"
+   "(high:SI (symbol_ref:SI (\"isl_obj_map_vtable\") [flags 0xc0] 
))) y.c:12702 -1\n"
+   " (nil))\n"
+   "(insn 1046 1045 1047 2 (set (reg/f:SI 479)\n"
+   "(lo_sum:SI (reg:SI 480)\n"
+   "(symbol_ref:SI (\"isl_obj_map_vtable\") [flags 0xc0] 
))) y.c:12702 -1\n"
+   " (expr_list:REG_EQUAL (symbol_ref:SI (\"isl_obj_map_vtable\") 
[flags 0xc0] )\n"
+   "(nil)))\n"
+   "(insn 1047 1046 1048 2 (set (reg:DI 481)\n"
+   "(subreg:DI (reg/f:SI 479) 0)) y.c:12702 -1\n"
+   " (nil))\n"
+   "(insn 1048 1047 1049 2 (set (zero_extract:DI (reg/v:DI 191 [ 
obj1D.17368 ])\n"
+   "(const_int 32 [0x20])\n"
+   "(const_int 0 [0]))\n"
+   "(reg:DI 481)) y.c:12702 -1\n"
+   " (nil))\n"
+   /* Extra insn, to avoid all of the above from being deleted by DCE.  */
+   "(insn 1049 1048 0 2 (set (mem:DI (reg:DI 191) [1 i+0 S4 A32])\n"
+   " (const_int 1 [0x1])) -1 (nil))\n");
+
+  rtl_dump_test t (input_dump);
+  dataflow_test df_test;
+
+  int tem;
+  tem = cse_main (get_insns (), max_reg_num ());
+  ASSERT_EQ (0, tem);
+}
+
+/* Run all of the selftests within this file.  */
+
+void
+cse_c_tests ()
+{
+  test_simple_cse ();
+  test_pr71779 ();
+}
+
+} // namespace selftest
+#endif /* CHECKING_P */
diff --git a/gcc/selftest-run-tests.c b/gcc/selftest-run-tests.c
index 015572c..5fdfb42 100644
--- a/gcc/selftest-run-tests.c
+++ b/gcc/selftest-run-tests.c
@@ -65,6 +65,7 @@ selftest::run_tests ()
   rtl_tests_c_tests ();
   read_rtl_function_c_tests ();
   df_core_c_tests ();
+  cse_c_tests ();
 
   /* Higher-level tests, or for components that other selftests don't
  rely on.  */
diff 

[PATCH 7/9] combine.c selftests

2016-09-08 Thread David Malcolm
gcc/ChangeLog:
* combine.c: Include selftest.h and selftest-rtl.h.
(try_combine): Add assertion on this_basic_block.
(class selftest::combine_test): New subclass of
selftest::tl_dump_test.
(selftest::combine_test::combine_test): New ctor.
(selftest::test_combining_shifts): New function.
(selftest::test_non_combinable_shifts): New function.
(selftest::combine_c_tests): New function.
* selftest-run-tests.c (selftest::run_tests): Run
selftest::combine_c_tests.
* selftest.h (selftest::combine_c_tests): New decl.
---
 gcc/combine.c| 155 +++
 gcc/selftest-run-tests.c |   1 +
 gcc/selftest.h   |   1 +
 3 files changed, 157 insertions(+)

diff --git a/gcc/combine.c b/gcc/combine.c
index 1b262f9..9c148bb 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -102,6 +102,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "valtrack.h"
 #include "rtl-iter.h"
 #include "print-rtl.h"
+#include "selftest.h"
+#include "selftest-rtl.h"
 
 #ifndef LOAD_EXTEND_OP
 #define LOAD_EXTEND_OP(M) UNKNOWN
@@ -2625,6 +2627,8 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, 
rtx_insn *i0,
   rtx new_other_notes;
   int i;
 
+  gcc_assert (this_basic_block);
+
   /* Immediately return if any of I0,I1,I2 are the same insn (I3 can
  never be).  */
   if (i1 == i2 || i0 == i2 || (i0 && i0 == i1))
@@ -14441,3 +14445,154 @@ make_pass_combine (gcc::context *ctxt)
 {
   return new pass_combine (ctxt);
 }
+
+#if CHECKING_P
+
+namespace selftest {
+
+/* A subclass of rtl_dump_test for testing combine.c.  */
+
+class combine_test : public rtl_dump_test
+{
+ public:
+   combine_test (const char *dump_content, int dumped_first_pseudo_regno);
+
+ private:
+   dataflow_test m_df_test;
+};
+
+/* combine_test's constructor.  Write DUMP_CONTENT to a tempfile and load
+   it.  Initialize df and perform dataflow analysis.  */
+
+combine_test::combine_test (const char *dump_content,
+   int dumped_first_pseudo_regno)
+: rtl_dump_test (dump_content, dumped_first_pseudo_regno),
+  m_df_test ()
+{
+  /* The dataflow instance should have been created by m_df_test's ctor.  */
+  gcc_assert (df);
+
+  /* From rest_of_handle_combine.  */
+  df_set_flags (/*DF_LR_RUN_DCE + */ DF_DEFER_INSN_RESCAN);
+  df_note_add_problem ();
+  df_analyze ();
+}
+
+/* Verify that combine_instructions works, for combining a pair of shifts.
+   Ideally we'd test try_combine by itself, but a fair amount of
+   refactoring would be needed to do so.  */
+
+static void
+test_combining_shifts ()
+{
+  /* Taken from
+   gcc/testsuite/gcc.dg/asr_div1.c -O2 -fdump-rtl-all -mtune=cortex-a53
+ for aarch64, hand editing the prev/next insns to 0 as needed, and
+ editing whitespace to avoid over-long lines.  */
+  const char *input_dump
+= ("(insn 8 0 9 2 (set (reg:DI 78)\n"
+   "(lshiftrt:DI (reg:DI 76)\n"
+   "(const_int 32 [0x20])))\n"
+   "../../src/gcc/testsuite/gcc.dg/asr_div1.c:14\n"
+   "641 {*aarch64_lshr_sisd_or_int_di3}\n"
+   " (expr_list:REG_DEAD (reg:DI 76)\n"
+   "(nil)))\n"
+   "(insn 9 8 0 2 (set (reg:SI 79)\n"
+   "(ashiftrt:SI (subreg:SI (reg:DI 78) 0)\n"
+   "(const_int 3 [0x3])))\n"
+   "../../src/gcc/testsuite/gcc.dg/asr_div1.c:14\n"
+   "642 {*aarch64_ashr_sisd_or_int_si3}\n"
+   " (expr_list:REG_DEAD (reg:DI 78)\n"
+   "(nil)))\n");
+  combine_test t (input_dump, 76);
+
+  rtx_insn *insn_8 = get_insn_by_uid (8);
+  ASSERT_TRUE (insn_8);
+
+  rtx_insn *insn_9 = get_insn_by_uid (9);
+  ASSERT_TRUE (insn_9);
+
+  int rebuild_jump_labels_after_combine
+= combine_instructions (get_insns (), max_reg_num ());
+  ASSERT_FALSE (rebuild_jump_labels_after_combine);
+
+  /* Verify that insns 8 and 9 were combined.  */
+  ASSERT_EQ (1, combine_merges);
+
+  /* insn 8 should now be deleted.  */
+  ASSERT_EQ (NOTE, GET_CODE (insn_8));
+  ASSERT_EQ (NOTE_INSN_DELETED, NOTE_KIND (insn_8));
+
+  /* insn 9 should now be a shift of 35.
+ On aarch64 it's a set; on x86_64 it's a parallel of a set and a clobber
+ of CC.  */
+  rtx set_in_9 = single_set (insn_9);
+  ASSERT_TRUE (set_in_9);
+  rtx src_of_9 = SET_SRC (set_in_9);
+  ASSERT_EQ (ASHIFTRT, GET_CODE (src_of_9));
+  rtx amt = XEXP (src_of_9, 1);
+  ASSERT_TRUE (CONST_INT_P (amt));
+  ASSERT_EQ (35, INTVAL (amt));
+}
+
+/* Test of failing to combine instructions.
+
+   Similar to test_combining_shifts, but with the input register
+   for the 2nd shift hand-edited (from 78 to 80) so that it doesn't come
+   from the output of the 1st shift, so that the shifts should *not*
+   be combinable.  */
+
+static void
+test_non_combinable_shifts ()
+{
+  const char *input_dump
+= ("(insn 8 0 9 2 (set (reg:DI 78)\n"
+   "(lshiftrt:DI (reg:DI 76)\n"
+   "  

[PATCH 2/9] Add selftest::read_file

2016-09-08 Thread David Malcolm
This is used later in the kit by the selftests for final.c

gcc/ChangeLog:
* selftest.c (selftest::read_file): New function.
(selftest::test_read_file): New function.
(selftest::selftest_c_tests): Call test_read_file.
* selftest.h (selftest::read_file): New decl.
---
 gcc/selftest.c | 60 ++
 gcc/selftest.h |  7 +++
 2 files changed, 67 insertions(+)

diff --git a/gcc/selftest.c b/gcc/selftest.c
index 0db69d2..cf7031f 100644
--- a/gcc/selftest.c
+++ b/gcc/selftest.c
@@ -151,6 +151,53 @@ temp_source_file::temp_source_file (const location ,
   fclose (out);
 }
 
+/* Read the contents of PATH into memory, returning a 0-terminated buffer
+   that must be freed by the caller.
+   Fail (and abort) if there are any problems, with LOC as the reported
+   location of the failure.  */
+
+char *
+read_file (const location , const char *path)
+{
+  FILE *f_in = fopen (path, "r");
+  if (!f_in)
+fail_formatted (loc, "unable to open file: %s", path);
+
+  /* Read content, allocating FIXME.  */
+  char *result = NULL;
+  size_t total_sz = 0;
+  size_t alloc_sz = 0;
+  char buf[4096];
+  size_t iter_sz_in;
+
+  while ( (iter_sz_in = fread (buf, 1, sizeof (buf), f_in)) )
+{
+  gcc_assert (alloc_sz >= total_sz);
+  size_t old_total_sz = total_sz;
+  total_sz += iter_sz_in;
+  /* Allow 1 extra byte for 0-termination.  */
+  if (alloc_sz < (total_sz + 1))
+   {
+ size_t new_alloc_sz = alloc_sz ? alloc_sz * 2: total_sz + 1;
+ result = (char *)xrealloc (result, new_alloc_sz);
+ alloc_sz = new_alloc_sz;
+   }
+  memcpy (result + old_total_sz, buf, iter_sz_in);
+}
+
+  if (!feof (f_in))
+fail_formatted (loc, "error reading from %s: %s", path,
+   xstrerror (errno));
+
+  fclose (f_in);
+
+  /* 0-terminate the buffer.  */
+  gcc_assert (total_sz < alloc_sz);
+  result[total_sz] = '\0';
+
+  return result;
+}
+
 /* Selftests for the selftest system itself.  */
 
 /* Sanity-check the ASSERT_ macros with various passing cases.  */
@@ -181,6 +228,18 @@ test_named_temp_file ()
   fclose (f);
 }
 
+/* Verify read_file (and also temp_source_file).  */
+
+static void
+test_read_file ()
+{
+  temp_source_file t (SELFTEST_LOCATION, "test1.s",
+ "\tjmp\t.L2\n");
+  char *buf = read_file (SELFTEST_LOCATION, t.get_filename ());
+  ASSERT_STREQ ("\tjmp\t.L2\n", buf);
+  free (buf);
+}
+
 /* Run all of the selftests within this file.  */
 
 void
@@ -188,6 +247,7 @@ selftest_c_tests ()
 {
   test_assertions ();
   test_named_temp_file ();
+  test_read_file ();
 }
 
 } // namespace selftest
diff --git a/gcc/selftest.h b/gcc/selftest.h
index 3938560..e5f5c60 100644
--- a/gcc/selftest.h
+++ b/gcc/selftest.h
@@ -146,6 +146,13 @@ class line_table_test
 extern void
 for_each_line_table_case (void (*testcase) (const line_table_case &));
 
+/* Read the contents of PATH into memory, returning a 0-terminated buffer
+   that must be freed by the caller.
+   Fail (and abort) if there are any problems, with LOC as the reported
+   location of the failure.  */
+
+extern char *read_file (const location , const char *path);
+
 /* Declarations for specific families of tests (by source file), in
alphabetical order.  */
 extern void bitmap_c_tests ();
-- 
1.8.5.3



[PATCH 3/9] selftest.h: add temp_override fixture

2016-09-08 Thread David Malcolm
We have a lot of global state in our code.  Ideally we'd reduce the
amount of such global state, but a prerequisite for sane refactoring
is having automated testing in place to ensure that the refactoring
doesn't break anything.

However, the global state itself makes it hard to write such automated
testing.

To break this Catch-22, this patch introduces a class temp_override,
for temporarily assigning a value to a global variable, saving the old
value, and then restoring that old value in a dtor.

gcc/ChangeLog:
* selftest.h (selftest::temp_override): New class.
---
 gcc/selftest.h | 29 +
 1 file changed, 29 insertions(+)

diff --git a/gcc/selftest.h b/gcc/selftest.h
index e5f5c60..4c50217 100644
--- a/gcc/selftest.h
+++ b/gcc/selftest.h
@@ -153,6 +153,35 @@ for_each_line_table_case (void (*testcase) (const 
line_table_case &));
 
 extern char *read_file (const location , const char *path);
 
+/* A fixture for temporarily overriding a global variable with a new
+   value.  The original value of the variable is captured in the ctor,
+   and restored in the dtor.  */
+
+template 
+class temp_override
+{
+ public:
+  temp_override (T& var, T new_value)
+  : m_var (var),
+/* Record the current value of VAR.  */
+m_old_value (var)
+  {
+/* Set the var to the new value.  */
+m_var = new_value;
+  }
+
+  ~temp_override ()
+  {
+/* Restore the value of the variable to that stored in the
+   ctor.  */
+m_var = m_old_value;
+  }
+
+ private:
+  T& m_var;
+  T m_old_value;
+};
+
 /* Declarations for specific families of tests (by source file), in
alphabetical order.  */
 extern void bitmap_c_tests ();
-- 
1.8.5.3



[Bug libstdc++/77537] New: pair constructors do not properly SFINAE

2016-09-08 Thread Casey at Carter dot net
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77537

Bug ID: 77537
   Summary: pair constructors do not properly SFINAE
   Product: gcc
   Version: 6.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Casey at Carter dot net
  Target Milestone: ---

Created attachment 39586
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39586=edit
Patch

Both static_asserts in this program fail:

#include 
#include 

struct moveonly {
moveonly() = default;
moveonly(moveonly&&) = default;
};

template
using P = std::pair;
static_assert(!std::is_constructible::value,
"FAIL");
static_assert(!std::is_constructible>::value,
"FAIL");

Because the resolution of PR 70437 makes many of the pair constructors
inappropriately participate in overload resolution.

Attached patch is a poorly-tested fix. (It works for this repro, and the repro
in 70437, but caveat emptor.)

[Bug fortran/77505] Negative character length not treated as LEN=0

2016-09-08 Thread kargl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77505

kargl at gcc dot gnu.org changed:

   What|Removed |Added

   Priority|P3  |P4
   Severity|minor   |normal

[PATCH] PR fortran/77507

2016-09-08 Thread Steve Kargl
The attached patch fixes issues with using keywords with
the IEEE_VALUE and C_ASSOCIATED intrinsic routines. 
Regression tested on x86_64-*-freebsd.  OK to commit?

2016-09-08  Steven G. Kargl  

PR fortran/77507
* intrinsic.c (add_functions):  Use correct keyword.

2016-09-08  Steven G. Kargl  

PR fortran/77507
* ieee/ieee_arithmetic.F90 (IEEE_VALUE_4,IEEE_VALUE_8,IEEE_VALULE_10,
IEEE_VALUE_16):  Use correct keyword.

2016-09-08  Steven G. Kargl  

PR fortran/77507
* gfortran.dg/pr77507.f90: New test.

-- 
Steve
Index: gcc/fortran/intrinsic.c
===
--- gcc/fortran/intrinsic.c	(revision 240039)
+++ gcc/fortran/intrinsic.c	(working copy)
@@ -1239,7 +1239,8 @@ add_functions (void)
 *z = "z", *ln = "len", *ut = "unit", *han = "handler",
 *num = "number", *tm = "time", *nm = "name", *md = "mode",
 *vl = "values", *p1 = "path1", *p2 = "path2", *com = "command",
-*ca = "coarray", *sub = "sub", *dist = "distance", *failed="failed";
+*ca = "coarray", *sub = "sub", *dist = "distance", *failed="failed",
+*c_ptr_1 = "c_ptr_1", *c_ptr_2 = "c_ptr_2";
 
   int di, dr, dd, dl, dc, dz, ii;
 
@@ -2914,8 +2915,8 @@ add_functions (void)
   /* The following functions are part of ISO_C_BINDING.  */
   add_sym_2 ("c_associated", GFC_ISYM_C_ASSOCIATED, CLASS_INQUIRY, ACTUAL_NO,
 	 BT_LOGICAL, dl, GFC_STD_F2003, gfc_check_c_associated, NULL, NULL,
-	 "C_PTR_1", BT_VOID, 0, REQUIRED,
-	 "C_PTR_2", BT_VOID, 0, OPTIONAL);
+	 c_ptr_1, BT_VOID, 0, REQUIRED,
+	 c_ptr_2, BT_VOID, 0, OPTIONAL);
   make_from_module();
 
   add_sym_1 ("c_loc", GFC_ISYM_C_LOC, CLASS_INQUIRY, ACTUAL_NO,
Index: libgfortran/ieee/ieee_arithmetic.F90
===
--- libgfortran/ieee/ieee_arithmetic.F90	(revision 240039)
+++ libgfortran/ieee/ieee_arithmetic.F90	(working copy)
@@ -857,12 +857,12 @@ contains
 
   ! IEEE_VALUE
 
-  elemental real(kind=4) function IEEE_VALUE_4(X, C) result(res)
-implicit none
+  elemental real(kind=4) function IEEE_VALUE_4(X, CLASS) result(res)
+
 real(kind=4), intent(in) :: X
-type(IEEE_CLASS_TYPE), intent(in) :: C
+type(IEEE_CLASS_TYPE), intent(in) :: CLASS
 
-select case (C%hidden)
+select case (CLASS%hidden)
   case (1) ! IEEE_SIGNALING_NAN
 res = -1
 res = sqrt(res)
@@ -895,12 +895,12 @@ contains
  end select
   end function
 
-  elemental real(kind=8) function IEEE_VALUE_8(X, C) result(res)
-implicit none
+  elemental real(kind=8) function IEEE_VALUE_8(X, CLASS) result(res)
+
 real(kind=8), intent(in) :: X
-type(IEEE_CLASS_TYPE), intent(in) :: C
+type(IEEE_CLASS_TYPE), intent(in) :: CLASS
 
-select case (C%hidden)
+select case (CLASS%hidden)
   case (1) ! IEEE_SIGNALING_NAN
 res = -1
 res = sqrt(res)
@@ -934,12 +934,12 @@ contains
   end function
 
 #ifdef HAVE_GFC_REAL_10
-  elemental real(kind=10) function IEEE_VALUE_10(X, C) result(res)
-implicit none
+  elemental real(kind=10) function IEEE_VALUE_10(X, CLASS) result(res)
+
 real(kind=10), intent(in) :: X
-type(IEEE_CLASS_TYPE), intent(in) :: C
+type(IEEE_CLASS_TYPE), intent(in) :: CLASS
 
-select case (C%hidden)
+select case (CLASS%hidden)
   case (1) ! IEEE_SIGNALING_NAN
 res = -1
 res = sqrt(res)
@@ -971,15 +971,16 @@ contains
 res = 0
  end select
   end function
+
 #endif
 
 #ifdef HAVE_GFC_REAL_16
-  elemental real(kind=16) function IEEE_VALUE_16(X, C) result(res)
-implicit none
+  elemental real(kind=16) function IEEE_VALUE_16(X, CLASS) result(res)
+
 real(kind=16), intent(in) :: X
-type(IEEE_CLASS_TYPE), intent(in) :: C
+type(IEEE_CLASS_TYPE), intent(in) :: CLASS
 
-select case (C%hidden)
+select case (CLASS%hidden)
   case (1) ! IEEE_SIGNALING_NAN
 res = -1
 res = sqrt(res)
Index: gcc/testsuite/gfortran.dg/pr77507.f90
===
--- gcc/testsuite/gfortran.dg/pr77507.f90	(nonexistent)
+++ gcc/testsuite/gfortran.dg/pr77507.f90	(working copy)
@@ -0,0 +1,7 @@
+! { dg-do compile }
+Program p
+  Use ieee_arithmetic
+  Use iso_c_binding
+  Print *, ieee_value(x=1.0, class=ieee_negative_inf)
+  Print *, c_associated(c_ptr_1=c_null_ptr)
+End Program


[Bug fortran/69514] ICE with nested array constructor

2016-09-08 Thread kargl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69514

kargl at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
   Assignee|unassigned at gcc dot gnu.org  |kargl at gcc dot gnu.org
   Target Milestone|--- |7.0

--- Comment #8 from kargl at gcc dot gnu.org ---
Fixed on trunk.

gcc-6-20160908 is now available

2016-09-08 Thread gccadmin
Snapshot gcc-6-20160908 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/6-20160908/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 6 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-6-branch 
revision 240038

You'll find:

 gcc-6-20160908.tar.bz2   Complete GCC

  MD5=befb90f66fcbe4f742dbe14644da077d
  SHA1=7cc96ea766f967979b529ae0d0cb60db92d8bf99

Diffs from 6-20160901 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-6
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


[Committed] PR fortran/69514

2016-09-08 Thread Steve Kargl
I've committed the attached patch to trunk after
completing regression testing on x86_64-*-freebsd.

2016-09-08  Steven G. Kargl  

PR fortran/69514
* array.c (gfc_match_array_constructor):  If type-spec is present,
walk the array constructor performing possible conversions for 
numeric types.

2016-09-08  Steven G. Kargl  
Louis Krupp  

PR fortran/69514
* gfortran.dg/pr69514_1.f90: New test.
* gfortran.dg/pr69514_2.f90: New test.

-- 
Steve
Index: gcc/fortran/array.c
===
--- gcc/fortran/array.c	(revision 240038)
+++ gcc/fortran/array.c	(working copy)
@@ -1089,6 +1089,7 @@ match_array_cons_element (gfc_constructo
 match
 gfc_match_array_constructor (gfc_expr **result)
 {
+  gfc_constructor *c;
   gfc_constructor_base head, new_cons;
   gfc_undo_change_set changed_syms;
   gfc_expr *expr;
@@ -1194,8 +1195,6 @@ done:
 	 be converted.  See PR fortran/67803.  */
   if (ts.type == BT_CHARACTER)
 	{
-	  gfc_constructor *c;
-
 	  c = gfc_constructor_first (head);
 	  for (; c; c = gfc_constructor_next (c))
 	{
@@ -1218,6 +1217,14 @@ done:
 		}
 	}
 	}
+
+  /* Walk the constructor and ensure type conversion for numeric types.  */
+  if (gfc_numeric_ts ())
+	{
+	  c = gfc_constructor_first (head);
+	  for (; c; c = gfc_constructor_next (c))
+	gfc_convert_type (c->expr, , 1);
+	}
 }
   else
 expr = gfc_get_array_expr (BT_UNKNOWN, 0, );
Index: gcc/testsuite/gfortran.dg/pr69514_1.f90
===
--- gcc/testsuite/gfortran.dg/pr69514_1.f90	(nonexistent)
+++ gcc/testsuite/gfortran.dg/pr69514_1.f90	(working copy)
@@ -0,0 +1,5 @@
+! { dg-do run }
+program foo
+   real, parameter :: x(3) = 2.0 * [real :: 1, 2, 3 ]
+   if (any(x /= [2., 4., 6.])) call abort
+end program foo
Index: gcc/testsuite/gfortran.dg/pr69514_2.f90
===
--- gcc/testsuite/gfortran.dg/pr69514_2.f90	(nonexistent)
+++ gcc/testsuite/gfortran.dg/pr69514_2.f90	(working copy)
@@ -0,0 +1,49 @@
+! { dg-do run }
+program p
+ implicit none
+
+ real   , parameter :: arr(3) = [ real:: 2, 2.5, (1.5, 2.5) ]
+ real   , parameter :: ari(3) = [ integer :: 2, 2.5, (1.5, 2.5) ]
+ real   , parameter :: arc(3) = [ complex :: 2, 2.5, (1.5, 2.5) ]
+ integer, parameter :: air(3) = [ real:: 2, 2.5, (1.5, 2.5) ]
+ integer, parameter :: aii(3) = [ integer :: 2, 2.5, (1.5, 2.5) ]
+ integer, parameter :: aic(3) = [ complex :: 2, 2.5, (1.5, 2.5) ]
+ complex, parameter :: acr(3) = [ real:: 2, 2.5, (1.5, 2.5) ]
+ complex, parameter :: aci(3) = [ integer :: 2, 2.5, (1.5, 2.5) ]
+ complex, parameter :: acc(3) = [ complex :: 2, 2.5, (1.5, 2.5) ]
+
+ real   , parameter :: mrr(3) =  4.5   * [ real:: 2, 2.5, (3.5, 4.0) ]
+ real   , parameter :: mri(3) =  4.5   * [ integer :: 2, 2.5, (3.5, 4.0) ]
+ real   , parameter :: mrc(3) =  4.5   * [ complex :: 2, 2.5, (3.5, 4.0) ]
+ integer, parameter :: mir(3) =  4 * [ real:: 2, 2.5, (3.5, 4.0) ]
+ integer, parameter :: mii(3) =  4 * [ integer :: 2, 2.5, (3.5, 4.0) ]
+ integer, parameter :: mic(3) =  4 * [ complex :: 2, 2.5, (3.5, 4.0) ]
+ complex, parameter :: mcr(3) = (4.5, 5.5) * [ real:: 2, 2.5, (3.5, 4.0) ]
+ complex, parameter :: mci(3) = (4.5, 5.5) * [ integer :: 2, 2.5, (3.5, 4.0) ]
+ complex, parameter :: mcc(3) = (4.5, 5.5) * [ complex :: 2, 2.5, (3.5, 4.0) ]
+
+ if (any(arr /= [2.00, 2.50, 1.50])) call abort
+ if (any(ari /= [2.00, 2.00, 1.00])) call abort
+ if (any(arc /= [2.00, 2.50, 1.50])) call abort
+
+ if (any(air /= [2, 2, 1])) call abort
+ if (any(aii /= [2, 2, 1])) call abort
+ if (any(aic /= [2, 2, 1])) call abort
+
+ if (any(acr /= [(2.00, 0.00), (2.50, 0.00), (1.50, 0.00)])) call abort
+ if (any(aci /= [(2.00, 0.00), (2.00, 0.00), (1.00, 0.00)])) call abort
+ if (any(acc /= [(2.00, 0.00), (2.50, 0.00), (1.50, 2.50)])) call abort
+
+ if (any(mrr /= [9.00, 11.25, 15.75])) call abort
+ if (any(mri /= [9.00,  9.00, 13.50])) call abort
+ if (any(mrc /= [9.00, 11.25, 15.75])) call abort
+
+ if (any(mir /= [8, 10, 14])) call abort
+ if (any(mii /= [8,  8, 12])) call abort
+ if (any(mic /= [8, 10, 14])) call abort
+
+ if (any(mcr /= [(9.00, 11.00), (11.25, 13.75), (15.75, 19.25)])) call abort
+ if (any(mci /= [(9.00, 11.00), ( 9.00, 11.00), (13.50, 16.50)])) call abort
+ if (any(mcc /= [(9.00, 11.00), (11.25, 13.75), (-6.25, 37.25)])) call abort
+
+end program p


[Bug fortran/69514] ICE with nested array constructor

2016-09-08 Thread kargl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69514

--- Comment #7 from kargl at gcc dot gnu.org ---
Author: kargl
Date: Thu Sep  8 22:33:10 2016
New Revision: 240039

URL: https://gcc.gnu.org/viewcvs?rev=240039=gcc=rev
Log:
2016-09-08  Steven G. Kargl  

PR fortran/69514
* array.c (gfc_match_array_constructor):  If type-spec is present,
walk the array constructor performing possible conversions for 
numeric types.

2016-09-08  Steven G. Kargl  
Louis Krupp  

PR fortran/69514
* gfortran.dg/pr69514_1.f90: New test.
* gfortran.dg/pr69514_2.f90: New test.

Added:
trunk/gcc/testsuite/gfortran.dg/pr69514_1.f90
trunk/gcc/testsuite/gfortran.dg/pr69514_2.f90
Modified:
trunk/gcc/fortran/ChangeLog
trunk/gcc/fortran/array.c
trunk/gcc/testsuite/ChangeLog

Re: [PATCH] - improve sprintf buffer overflow detection (middle-end/49905)

2016-09-08 Thread Joseph Myers
On Thu, 8 Sep 2016, Martin Sebor wrote:

> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> index da133a4..4607495 100644
> --- a/gcc/doc/tm.texi.in
> +++ b/gcc/doc/tm.texi.in
> @@ -4081,6 +4081,13 @@ In either case, it remains possible to select 
> code-generation for the alternate
>  scheme, by means of compiler command line switches.
>  @end defmac
>  
> +@deftypefn {Target Hook} {const char *} TARGET_LIBC_PRINTF_POINTER_FORMAT 
> (tree, const char **@var{flags})
> +A hook to determine the target @code{printf} implementation format string
> +that the most closely corresponds to the @code{%p} format directive.
> +The object pointed to by the @var{flags} is set to a string consisting
> +of recognized format flags such as the @code{'#'} character.
> +@end deftypefn

No, the substance of hook documentation should go in target.def with just 
an @hook line in tm.texi.in leading to the documentation going in tm.texi 
automatically.

You appear to be defining a target macro masquerading as a hook.  Please 
don't (new target macros should be avoided where possible); use a proper 
hook.  (Maybe the settings depending on OS rather than architecture means 
it needs to be one of those whose default is a manual setting in 
target-def.h rather than automatically generated, but that should be the 
limit of deviation from the normal workings of hooks.)

> +  const char *pfmt = TARGET_LIBC_PRINTF_POINTER_FORMAT (arg, );

With a proper hook them you'd call targetm.libc_printf_pointer_format.

> + inform (callloc,
> + (nbytes + exact == 1
> +  ? "format output %wu byte into a destination of size %wu"
> +  : "format output %wu bytes into a destination of size %wu"),
> + nbytes + exact, info.objsize);

You need to use G_() around both format strings in such a case; xgettext 
doesn't know how to extract them both.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] - improve sprintf buffer overflow detection (middle-end/49905)

2016-09-08 Thread Martin Sebor

On 09/08/2016 01:45 PM, David Malcolm wrote:

On Thu, 2016-09-08 at 13:03 -0600, Martin Sebor wrote:

Attached is another update to the patch to address the last round
of comments and suggestions, most notably to:

   *  implement the checking of the implementation limit of 4,095 on
  the output of a single directive to allow for the Glibc failure
  due to ENOMEM (the patch issues a warning and disables the
  optimization when this happens)
   *  implement checking for exceeding INT_MAX bytes (warn and disable
  optimization)
   *  call set_range_info when the return value optimization is not
  possible
   *  remove code to work around tree-optimization/71831 (now on
  trunk)

The -fprintf-return value optimization is still disabled.  GCC
successfully bootstraps with it and most tests pass but there's
a failure in the Fortran libgomp tests that I am yet to figure
out.

I'm hoping to get the patch reviewed and hopefully approved while
I debug the libgomp failure.

Martin


I see that you also integrated the substring_loc and format_warning API
into this revision - thanks.


Yes, I forgot to mention it among the highlights.  Thanks for making
the API available to the middle-end!



The patch has a lot of macro-based testcases, presumably for exercising
all of the format codes and boundary conditions, but it seems to be
lacking what I'd call a "usability test case" - a test case that shows
a representative example of idiomatic but buggy code, along with the
full output of the warning, with carets and underlining, so that we can
easily see what the user experience is.  (sorry if there is one and I
didn't see it).


There is a simple test (in the test_sprintf_note() function in
builtin-sprintf-warn-1.c) that exercises the notes but there's always
room for more and more robust test cases :)  I'll see about adding
a few.

FWIW, the challenge here is not in adding them but rather in knowing
when to stop and what level of detail to go to.  With too many test
cases (exercising this level of detail) it can get very tedious and
time consuming to then change the diagnostics or add more detail.
This is not an excuse for not having enough tests.  But striking
the right balance between the level of detail in them is something
I had to grapple with on this project.


 From a marketing point-of-view, I think any new diagnostics like this
deserve a screenshot on the website's gcc-7/changes.html page, showing
a simple example of the above that makes a casual reader think "gcc 7
looks neat; I've definitely made that mistake; I wonder if that's going
to find bugs in my code; I'd better try it at some point".

So please can you add a test case that demonstrates such a screenshot
-worthy example, using:

   /* { dg-options "-fdiagnostics-show-caret" } */

and you can use:

   /* { dg-begin-multiline-output "" }
copy the source, underlines and carets here, omitting trailing dg
directives.
  { dg-end-multiline-output "" } */

(we'd strip away all the dg- directives when making the screenshots for
the website).

The act of creating such an example sometimes suggests tweaks e.g. to
the exact wording of the warning.

Sorry if this seems like I'm picking on you Martin; I just wanted to
share some thoughts that I'm trying to crystallize into general
guidelines on writing diagnostics.


Not at all!  Thanks for the review and for the suggestion to make
use of the multiline DejaGnu directives.  I'll add more tests in
the next revision of the patch (as I have been in each iteration).

Martin


Re: [PATCH, PING] DWARF: process all TYPE_DECL nodes when iterating on scopes

2016-09-08 Thread Pierre-Marie de Rodat

On 09/07/2016 11:30 AM, Richard Biener wrote:

Ok, had time to look at this issue again.  I see the patch works like dwarf2out
works currently with respect to DIE creation order and re-location.


Thank you very much for helping me with this again!

So yes, that was the intent of the patch.


this might be incomplete though for the case where it's say

 typedef const T X;

thus the type of decl is a qualified type?  In this case the qualification might
be at the correct scope but the actual type not or you just relocate the
qualification but not the type DIE it refers to?


I haven’t tested this yet but I guess you are right. A complete patch 
should also probably see if the unqualifier type should be relocated.



That said, with the idea of early debug in place and thus giving
more responsibility to the frontends I wonder in what order the Ada
FE calls debug_hooks.early_global_decl ()? Maybe it's the middle-end
and it should arrange for a more natural order on function nests.  So
I'd appreciate if you can investigate this side of the issue a bit,
that is, simply avoid the bad ordering.


Reordering compilation of function nests was the first idea I had in 
mind when I first worked on this issue, maybe two years ago. I thought 
it would make sense debug info generation-wise, but I wondered if this 
specific order was motivated by code generation concerns.


Anyway I agree it would be a more elegant way out. As the GNU cauldron 
is going to keep me busy, I think I’ll investigate this way next week. 
Thanks again! :-)


--
Pierre-Marie de Rodat


Re: RFA: Small PATCH to add pow2p_hwi to hwint.h

2016-09-08 Thread Jason Merrill
On Thu, Sep 8, 2016 at 11:55 AM, Joseph Myers  wrote:
> On Thu, 8 Sep 2016, Jason Merrill wrote:
>
>> Various places in GCC use negate, bit-and and compare to test whether
>> an integer is a power of 2, but I think it would be clearer for this
>> test to be wrapped in a function.
>
> (x & -x) == x is also true for 0.  Whatever the correct function semantics
> for 0, the comment needs to reflect them.

Yep, I was just realizing that, too.  This much larger patch introduces:

 least_bit_hwi: (x & -x)
 pow2_or_zerop: (x & -x) == x
 pow2p_hwi: x && x == (x & -x)
 ctz_or_zero: floor_log2 (x & -x)

and replaces these patterns accordingly.  I'm not at all attached to the names.

Tested x86_64-pc-linux-gnu.
commit 4b76501b72f3953ac57cb077b4e25a90afb6d9a9
Author: Jason Merrill 
Date:   Thu Sep 8 13:10:12 2016 -0400

Add inline functions for various bitwise operations.

* hwint.h (least_bit_hwi, pow2_or_zerop, pow2p_hwi, ctz_or_zero):
New.
* hwint.c (exact_log2): Use pow2p_hwi.
(ctz_hwi, ffs_hwi): Use least_bit_hwi.
* alias.c (memrefs_conflict_p): Use pow2_or_zerop.
* builtins.c (get_object_alignment_2, get_object_alignment)
(get_pointer_alignment, fold_builtin_atomic_always_lock_free): Use
least_bit_hwi.
* calls.c (compute_argument_addresses, store_one_arg): Use
least_bit_hwi.
* cfgexpand.c (expand_one_stack_var_at): Use least_bit_hwi.
* combine.c (force_to_mode): Use least_bit_hwi.
* emit-rtl.c (set_mem_attributes_minus_bitpos, adjust_address_1):
Use least_bit_hwi.
* expmed.c (synth_mult, expand_divmod): Use ctz_or_zero, ctz_hwi.
(init_expmed_one_conv): Use pow2p_hwi.
* fold-const.c (round_up_loc, round_down_loc): Use pow2_or_zerop.
(fold_binary_loc): Use pow2p_hwi.
* function.c (assign_parm_find_stack_rtl): Use least_bit_hwi.
* gimple-fold.c (gimple_fold_builtin_memory_op): Use pow2p_hwi.
* gimple-ssa-strength-reduction.c (replace_ref): Use least_bit_hwi.
* hsa-gen.c (gen_hsa_addr_with_align, hsa_bitmemref_alignment):
Use least_bit_hwi.
* ipa-cp.c (ipcp_alignment_lattice::meet_with_1): Use least_bit_hwi.
* ipa-prop.c (ipa_modify_call_arguments): Use least_bit_hwi.
* omp-low.c (oacc_loop_fixed_partitions)
(oacc_loop_auto_partitions): Use least_bit_hwi.
* rtlanal.c (nonzero_bits1): Use ctz_or_zero.
* stor-layout.c (place_field): Use least_bit_hwi.
* tree-pretty-print.c (dump_generic_node): Use pow2p_hwi.
* tree-sra.c (build_ref_for_offset): Use least_bit_hwi.
* tree-ssa-ccp.c (ccp_finalize): Use least_bit_hwi.
* tree-ssa-math-opts.c (bswap_replace): Use least_bit_hwi.
* tree-ssa-strlen.c (handle_builtin_memcmp): Use pow2p_hwi.
* tree-vect-data-refs.c (vect_analyze_group_access_1)
(vect_grouped_store_supported, vect_grouped_load_supported)
(vect_permute_load_chain, vect_shift_permute_load_chain)
(vect_transform_grouped_load): Use pow2p_hwi.
* tree-vect-generic.c (expand_vector_divmod): Use ctz_or_zero.
* tree-vect-patterns.c (vect_recog_divmod_pattern): Use ctz_or_zero.
* tree-vect-stmts.c (vectorizable_mask_load_store): Use
least_bit_hwi.
* tsan.c (instrument_expr): Use least_bit_hwi.
* var-tracking.c (negative_power_of_two_p): Use pow2_or_zerop.

diff --git a/gcc/alias.c b/gcc/alias.c
index f4b5a92..277125e 100644
--- a/gcc/alias.c
+++ b/gcc/alias.c
@@ -2534,7 +2534,7 @@ memrefs_conflict_p (int xsize, rtx x, int ysize, rtx y, 
HOST_WIDE_INT c)
 {
   HOST_WIDE_INT sc = INTVAL (XEXP (x, 1));
   unsigned HOST_WIDE_INT uc = sc;
-  if (sc < 0 && -uc == (uc & -uc))
+  if (sc < 0 && pow2_or_zerop (-uc))
{
  if (xsize > 0)
xsize = -xsize;
@@ -2549,7 +2549,7 @@ memrefs_conflict_p (int xsize, rtx x, int ysize, rtx y, 
HOST_WIDE_INT c)
 {
   HOST_WIDE_INT sc = INTVAL (XEXP (y, 1));
   unsigned HOST_WIDE_INT uc = sc;
-  if (sc < 0 && -uc == (uc & -uc))
+  if (sc < 0 && pow2_or_zerop (-uc))
{
  if (ysize > 0)
ysize = -ysize;
diff --git a/gcc/builtins.c b/gcc/builtins.c
index 1073e35..0ccba15 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -305,7 +305,7 @@ get_object_alignment_2 (tree exp, unsigned int *alignp,
{
  ptr_bitmask = TREE_INT_CST_LOW (TREE_OPERAND (addr, 1));
  ptr_bitmask *= BITS_PER_UNIT;
- align = ptr_bitmask & -ptr_bitmask;
+ align = least_bit_hwi (ptr_bitmask);
  addr = TREE_OPERAND (addr, 0);
}
 
@@ -325,7 +325,7 @@ get_object_alignment_2 (tree exp, unsigned int *alignp,
  

[Bug tree-optimization/77536] New: Vectorizer not maintaining relationship of relative block frequencies in absence of real profile data

2016-09-08 Thread pthaugen at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77536

Bug ID: 77536
   Summary: Vectorizer not maintaining relationship of relative
block frequencies in absence of real profile data
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pthaugen at gcc dot gnu.org
CC: dje at gcc dot gnu.org, rguenth at gcc dot gnu.org,
wschmidt at gcc dot gnu.org
  Target Milestone: ---
  Host: powerpc64*-unknown-linux-gnu
Target: powerpc64*-unknown-linux-gnu
 Build: powerpc64*-unknown-linux-gnu

// gcc -O3 -ffast-math -mcpu=power8 -mrecip=all -S test.c

void sub(float *restrict x, float *restrict y, float *restrict z, int n)
{
  int i;

  for (i = 0; i < n; i++)
x[i] = z[i]/y[i];
}


In the dump before vectorizer (148t.ifcvt) the loop block has freq=8500 which
is almost 6X the freq of the entry block(1500). After vectoriztion the now
vectorized loop block has freq=662. While I agree the freq of the loop block
should drop some after vectorization, it should still maintain the relationship
of being the hottest block in the function.

[Bug bootstrap/77512] gcc compilation stops with Arithmetic Exception

2016-09-08 Thread ebotcazou at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77512

Eric Botcazou  changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
   Last reconfirmed||2016-09-08
 CC||ebotcazou at gcc dot gnu.org
 Ever confirmed|0   |1
  Build||sparc-sun-solaris2.10

--- Comment #1 from Eric Botcazou  ---
> First stage was compiled using gcc 4.0.3

Any options passed to this compiler during the first stage?  Could you post the
output of 'uname -a' on the machine?

> Prerequisits are dowload using contrib/download_prerequisites
> GNU binutils-2.27 are used
> configuration was done with:
> ../gcc-6.2.0/configure --prefix=$HOME/gcc-6.2.0-bin
> --target=sparc-sun-solaris2.10 --with-gnu-as --with-gnu-ld --disable-libgcj
> --enable-languages=c,c++

I'll try to reproduce locally.

Re: [PATCH] - improve sprintf buffer overflow detection (middle-end/49905)

2016-09-08 Thread David Malcolm
On Thu, 2016-09-08 at 13:03 -0600, Martin Sebor wrote:
> Attached is another update to the patch to address the last round
> of comments and suggestions, most notably to:
> 
>   *  implement the checking of the implementation limit of 4,095 on
>  the output of a single directive to allow for the Glibc failure
>  due to ENOMEM (the patch issues a warning and disables the
>  optimization when this happens)
>   *  implement checking for exceeding INT_MAX bytes (warn and disable
>  optimization)
>   *  call set_range_info when the return value optimization is not
>  possible
>   *  remove code to work around tree-optimization/71831 (now on
>  trunk)
> 
> The -fprintf-return value optimization is still disabled.  GCC
> successfully bootstraps with it and most tests pass but there's
> a failure in the Fortran libgomp tests that I am yet to figure
> out.
> 
> I'm hoping to get the patch reviewed and hopefully approved while
> I debug the libgomp failure.
> 
> Martin

I see that you also integrated the substring_loc and format_warning API
into this revision - thanks.

The patch has a lot of macro-based testcases, presumably for exercising
all of the format codes and boundary conditions, but it seems to be
lacking what I'd call a "usability test case" - a test case that shows
a representative example of idiomatic but buggy code, along with the
full output of the warning, with carets and underlining, so that we can
easily see what the user experience is.  (sorry if there is one and I
didn't see it).

>From a marketing point-of-view, I think any new diagnostics like this
deserve a screenshot on the website's gcc-7/changes.html page, showing
a simple example of the above that makes a casual reader think "gcc 7
looks neat; I've definitely made that mistake; I wonder if that's going
to find bugs in my code; I'd better try it at some point".

So please can you add a test case that demonstrates such a screenshot
-worthy example, using:

  /* { dg-options "-fdiagnostics-show-caret" } */

and you can use:

  /* { dg-begin-multiline-output "" }
copy the source, underlines and carets here, omitting trailing dg
directives.
 { dg-end-multiline-output "" } */

(we'd strip away all the dg- directives when making the screenshots for
the website).

The act of creating such an example sometimes suggests tweaks e.g. to
the exact wording of the warning.

Sorry if this seems like I'm picking on you Martin; I just wanted to
share some thoughts that I'm trying to crystallize into general
guidelines on writing diagnostics.

Hope this is constructive
Dave


[PATCH] ada/77535: GNAT.Perfect_Hash_Generators for non-1-based strings

2016-09-08 Thread Florian Weimer
This patch fixes GNAT.Perfect_Hash_Generators for strings which are
not 1-based.  It does this by introducing its own storage type which
fixes the first index as 1.  This is also a minor optimization because
it avoids the need to store the index.

Okay for trunk?

Should I try to construct a new test case for this?  I don't see any
existing tests for this package.

2016-09-08  Florian Weimer  

PR ada/77535
Make all word strings start with 1.
* g-pehage.adb (Word_Storage): New type.
(Word_Type): Use Word_Storage.
(Free_Word): Instantiate Unchecked_Deallocation.
(Apply_Position_Selection, Put_Initial_Keys, Put_Reduced_Keys)
(Resize_Word, Select_Char_Position, Select_Character_Set): Adjust
indirection through Word_Type.
(New_Word): Allocate Word_Storage instead of String.

Index: gcc/ada/g-pehage.adb
===
--- gcc/ada/g-pehage.adb	(revision 240038)
+++ gcc/ada/g-pehage.adb	(working copy)
@@ -32,6 +32,7 @@
 with Ada.IO_Exceptions;   use Ada.IO_Exceptions;
 with Ada.Characters.Handling; use Ada.Characters.Handling;
 with Ada.Directories;
+with Ada.Unchecked_Deallocation;
 
 with GNAT.Heap_Sort_G;
 with GNAT.OS_Lib;  use GNAT.OS_Lib;
@@ -102,8 +103,12 @@
No_Edge   : constant Edge_Id   := -1;
No_Table  : constant Table_Id  := -1;
 
-   type Word_Type is new String_Access;
-   procedure Free_Word (W : in out Word_Type) renames Free;
+   type Word_Storage (Length : Natural) is record
+  Word : String (1 .. Length);
+   end record;
+   type Word_Type is access Word_Storage;
+   procedure Free_Word is
+  new Ada.Unchecked_Deallocation (Word_Storage, Word_Type);
function New_Word (S : String) return Word_Type;
 
procedure Resize_Word (W : in out Word_Type; Len : Natural);
@@ -574,7 +579,7 @@
begin
   for J in 0 .. NK - 1 loop
  declare
-IW : constant String := WT.Table (Initial (J)).all;
+IW : constant String := WT.Table (Initial (J)).Word;
 RW : String (1 .. IW'Length) := (others => ASCII.NUL);
 N  : Natural := IW'First - 1;
 
@@ -1312,7 +1317,8 @@
 
function New_Word (S : String) return Word_Type is
begin
-  return new String'(S);
+  return new Word_Storage'(Length => S'Length,
+   Word => S);
end New_Word;
 
--
@@ -1913,7 +1919,7 @@
  K := Get_Key (J);
  Put (File, Image (J, M),   F1, L1, J, 1, 3, 1);
  Put (File, Image (K.Edge, M),  F1, L1, J, 1, 3, 2);
- Put (File, Trim_Trailing_Nuls (WT.Table (Initial (J)).all),
+ Put (File, Trim_Trailing_Nuls (WT.Table (Initial (J)).Word),
 F1, L1, J, 1, 3, 3);
   end loop;
end Put_Initial_Keys;
@@ -1995,7 +2001,7 @@
  K := Get_Key (J);
  Put (File, Image (J, M),   F1, L1, J, 1, 3, 1);
  Put (File, Image (K.Edge, M),  F1, L1, J, 1, 3, 2);
- Put (File, Trim_Trailing_Nuls (WT.Table (Reduced (J)).all),
+ Put (File, Trim_Trailing_Nuls (WT.Table (Reduced (J)).Word),
 F1, L1, J, 1, 3, 3);
   end loop;
end Put_Reduced_Keys;
@@ -2075,7 +2081,7 @@
-
 
procedure Resize_Word (W : in out Word_Type; Len : Natural) is
-  S1 : constant String := W.all;
+  S1 : constant String := W.Word;
   S2 : String (1 .. Len) := (others => ASCII.NUL);
   L  : constant Natural := S1'Length;
begin
@@ -2161,7 +2167,7 @@
Right := Offset + R;
 end if;
 
-return WT.Table (Left)(C) < WT.Table (Right)(C);
+return WT.Table (Left).Word (C) < WT.Table (Right).Word (C);
  end Lt;
 
  --
@@ -2221,8 +2227,8 @@
   --  Two contiguous words are identical when they have the
   --  same Cth character.
 
-  elsif WT.Table (Reduced (N))(C) =
-WT.Table (Reduced (N + 1))(C)
+  elsif WT.Table (Reduced (N)).Word (C) =
+WT.Table (Reduced (N + 1)).Word (C)
   then
  L := N + 1;
 
@@ -2265,7 +2271,7 @@
 
 N := (others => 0);
 for K in Table (S).First .. Table (S).Last loop
-   C := WT.Table (Reduced (K))(Pos);
+   C := WT.Table (Reduced (K)).Word (Pos);
N (C) := N (C) + 1;
 end loop;
 
@@ -2288,7 +2294,7 @@
   --  Initialize the reduced words set
 
   for K in 0 .. NK - 1 loop
- WT.Table (Reduced (K)) := New_Word (WT.Table (Initial (K)).all);
+ WT.Table (Reduced (K)) := New_Word (WT.Table (Initial (K)).Word);
   end loop;
 
   declare
@@ -2384,7 +2390,7 @@
 Same_Keys_Sets_Table (J).Last
   loop
  Put (Output,
-  

Re: [PATCH] - improve sprintf buffer overflow detection (middle-end/49905)

2016-09-08 Thread Martin Sebor

On 09/08/2016 01:21 PM, Jeff Law wrote:

On 08/24/2016 10:40 AM, Martin Sebor wrote:

On 08/23/2016 05:00 PM, Joseph Myers wrote:

Some observations:

* Does -fprintf-return-value allow for the possibility of snprintf
failing
because of a memory allocation failure and so returning -1 when GCC
computed bounds on what it could return if successful?


No.  I recall having seen Glibc fail with ENOMEM years ago when
formatting a floating point number to a very large precision but
I haven't seen any implementation fail.  I haven't yet looked to
see if the Glibc failure can still happen.  My reading of C and
POSIX is that snprintf is only allowed to fail due to an encoding
error, not because it runs out of memory, so such a failure would
seem like a bug.

At the same time, the cause of the failure doesn't really matter.
If the function could fail, unless GCC could determine the failure
at compile time and avoid the optimization, the transformation
wouldn't be guaranteed to be safe.  It might be something to
consider and possibly accommodate in the implementation.  It's
one of the reasons why I want to expose the optimization to
more code before enabling it.

I also haven't yet thought about how to deal with it but if it
is a possibility we want to allow for then maybe a target hook
for libc implementers to set to indicate whether sprintf can fail
and when would work.  Libc implementations that can fail under
any conditions (whether allowed by the standard or not) would
need to disable (or not enable, depending on the default) the
optimization.  I'm certainly open to other ideas.

So what are the implications for the optimization part of this patch?


By coincidence I've just posted an updated patch that handles this
situation by avoiding the optimization (and issuing a warning pointing
out that a directive produced more that 4,095 bytes worth of output).
Ditto for INT_MAX.

Martin


Re: [PATCH] - improve sprintf buffer overflow detection (middle-end/49905)

2016-09-08 Thread Jeff Law

On 08/24/2016 10:40 AM, Martin Sebor wrote:

On 08/23/2016 05:00 PM, Joseph Myers wrote:

Some observations:

* Does -fprintf-return-value allow for the possibility of snprintf
failing
because of a memory allocation failure and so returning -1 when GCC
computed bounds on what it could return if successful?


No.  I recall having seen Glibc fail with ENOMEM years ago when
formatting a floating point number to a very large precision but
I haven't seen any implementation fail.  I haven't yet looked to
see if the Glibc failure can still happen.  My reading of C and
POSIX is that snprintf is only allowed to fail due to an encoding
error, not because it runs out of memory, so such a failure would
seem like a bug.

At the same time, the cause of the failure doesn't really matter.
If the function could fail, unless GCC could determine the failure
at compile time and avoid the optimization, the transformation
wouldn't be guaranteed to be safe.  It might be something to
consider and possibly accommodate in the implementation.  It's
one of the reasons why I want to expose the optimization to
more code before enabling it.

I also haven't yet thought about how to deal with it but if it
is a possibility we want to allow for then maybe a target hook
for libc implementers to set to indicate whether sprintf can fail
and when would work.  Libc implementations that can fail under
any conditions (whether allowed by the standard or not) would
need to disable (or not enable, depending on the default) the
optimization.  I'm certainly open to other ideas.

So what are the implications for the optimization part of this patch?

Jeff


[Bug ada/77535] GNAT.Perfect_Hash_Generators access invalid memory with non-1-based strings

2016-09-08 Thread fw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77535

Florian Weimer  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2016-09-08
   Assignee|unassigned at gcc dot gnu.org  |fw at gcc dot gnu.org
 Ever confirmed|0   |1

Re: [PATCH] - improve sprintf buffer overflow detection (middle-end/49905)

2016-09-08 Thread Jeff Law

On 08/24/2016 05:14 PM, Manuel López-Ibáñez wrote:



I agree.  The challenge is that not all the bits this depends on
(the g_string_concat_db and parse_in globals defined in the front
end) are available in the middle end.  I've been talking to David
Malcolm about how best to factor things out of c-format.c and make
it available in both parts of the compiler under a convenient API.


Perhaps diagnostics_context could have pointers to those, forward
defined in the .h file and include the relevant libcpp headers in
diagnostics.c (or input.c). FEs that make use of those features could
initialize them (via some API) to some existing object. Otherwise,
they will work like in your patch (but within diagnostic.c). Similar
to how we initialize the FE-specific pretty-printers.

We already depend on libcpp for line-map.c, so internally depending on
other libcpp features is not so bad. The important thing is to hide
this from the clients, so that the clients do not need to be aware of
what diagnostics.c requires. That is, the middle-end and Ada should
not include headers that include libcpp headers, but diagnostics.c can
include whatever it needs.

Otherwise, the future will be again a mess and we get further away
from ever separating the FEs from the ME.
Warnings which rely on things like const/copy propagation, range 
analysis, inherently belong in the the middle end.  They're next to 
useless in the front-end.


This implies that a goodly amount of what's in c-format needs to move 
and likely bits of libcpp/line-map as well.





BTW, it would be nice to explain in comments why each header needs to
be included, besides obvious ones such as tree.h and gimple.h (it
would be great if we had guidelines on how to order included headers,
why not group together all gimple*, tree*, backend-stuff, diagnostics
stuff?). On the other hand, it is unfair to nitpick your patch
regarding this when other commits do the same.

I see this as busy work and work that easily gets out of date.

I'd rather just use Andrew's header file refactoring work to be able to 
answer these kinds of questions, canonicalize include ordering and to 
eliminate unnecessary/redundant includes.  In fact, ISTM it ought to be 
run before we hit stage1 close :-)


Jeff



[Bug ada/77535] New: GNAT.Perfect_Hash_Generators access invalid memory with non-1-based strings

2016-09-08 Thread fw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77535

Bug ID: 77535
   Summary: GNAT.Perfect_Hash_Generators access invalid memory
with non-1-based strings
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: ada
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fw at gcc dot gnu.org
  Target Milestone: ---

Created attachment 39585
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39585=edit
phg2.adb

Natasha Kerensikova reported this to comp.lang.ada.  The attached file was
downloaded from:

  http://users.instinctive.eu/nat/phg/

When running it under valgrind, I get:

==4196== Invalid read of size 1
==4196==at 0x5032D23: ??? (in /usr/lib/x86_64-linux-gnu/libgnat-4.9.so.1)
==4196==by 0x5035254: gnat__perfect_hash_generators__compute (in
/usr/lib/x86_64-linux-gnu/libgnat-4.9.so.1)
==4196==by 0x402F3B: _ada_phg2 (in /tmp/g/phg2)
==4196==by 0x402DA5: main (in /tmp/g/phg2)
==4196==  Address 0x5e61837 is 23 bytes after a block of size 16 alloc'd
==4196==at 0x4C28C20: malloc (vg_replace_malloc.c:296)
==4196==by 0x507F403: __gnat_malloc (in
/usr/lib/x86_64-linux-gnu/libgnat-4.9.so.1)
==4196==by 0x5033413: ??? (in /usr/lib/x86_64-linux-gnu/libgnat-4.9.so.1)
==4196==by 0x5035098: gnat__perfect_hash_generators__compute (in
/usr/lib/x86_64-linux-gnu/libgnat-4.9.so.1)
==4196==by 0x402F3B: _ada_phg2 (in /tmp/g/phg2)
==4196==by 0x402DA5: main (in /tmp/g/phg2)

Others have reported a segfault.

This is apparently due to GNAT.Perfect_Hash_Generators not dealing properly
with non-1-based strings.

[Bug fortran/77532] ICE in check_dtio_interface1, at fortran/interface.c:4622

2016-09-08 Thread pault at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77532

Paul Thomas  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |pault at gcc dot gnu.org

--- Comment #2 from Paul Thomas  ---
It's mine!

Paul

[Bug fortran/77534] ICE in check_dtio_arg_TKR_intent, at fortran/interface.c:4572

2016-09-08 Thread pault at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77534

Paul Thomas  changed:

   What|Removed |Added

 CC||pault at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |pault at gcc dot gnu.org

--- Comment #3 from Paul Thomas  ---
Thanks Gerhard and Steve! I'll take it.

Paul

[Bug fortran/77533] ICE in check_dtio_interface1, at fortran/interface.c:4614

2016-09-08 Thread pault at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77533

Paul Thomas  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |pault at gcc dot gnu.org

--- Comment #3 from Paul Thomas  ---
It's obviously mine :-(

Paul

[Bug fortran/77533] ICE in check_dtio_interface1, at fortran/interface.c:4614

2016-09-08 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77533

Dominique d'Humieres  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-09-08
 CC||pault at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #2 from Dominique d'Humieres  ---
Confirmed on trunk (7.0).

[Bug fortran/77534] ICE in check_dtio_arg_TKR_intent, at fortran/interface.c:4572

2016-09-08 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77534

Dominique d'Humieres  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-09-08
 Ever confirmed|0   |1

--- Comment #2 from Dominique d'Humieres  ---
Confirmed.

[Bug fortran/77534] ICE in check_dtio_arg_TKR_intent, at fortran/interface.c:4572

2016-09-08 Thread kargl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77534

kargl at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kargl at gcc dot gnu.org

--- Comment #1 from kargl at gcc dot gnu.org ---
One should return from check_dtio_arg_TKR_intent() after
any error is issued instead of falling through to the 
end of the function.  The following prevents the ICE.


Index: interface.c
===
--- interface.c (revision 240038)
+++ interface.c (working copy)
@@ -4559,8 +4570,11 @@ check_dtio_arg_TKR_intent (gfc_symbol *f
   int kind, int rank, sym_intent intent)
 {
   if (fsym->ts.type != type)
-gfc_error ("DTIO dummy argument at %L must be of type %s",
-  >declared_at, gfc_basic_typename (type));
+{
+  gfc_error ("DTIO dummy argument at %L must be of type %s",
+>declared_at, gfc_basic_typename (type));
+  return;
+}

   if (fsym->ts.type != BT_CLASS && fsym->ts.type != BT_DERIVED
   && fsym->ts.kind != kind)

[Bug fortran/77532] ICE in check_dtio_interface1, at fortran/interface.c:4622

2016-09-08 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77532

Dominique d'Humieres  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-09-08
 CC||pault at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Dominique d'Humieres  ---
Confirmed on trunk (7.0), not implemented before.

[Bug fortran/77525] wrong requirement of an upper bound for an array argument

2016-09-08 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77525

Dominique d'Humieres  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-09-08
 Ever confirmed|0   |1

--- Comment #1 from Dominique d'Humieres  ---
Confirmed from 4.5 up to trunk (7.0). Fortran 2003: Procedure components at (1)
are not yet implemented in gfortran-4.4.

> This is a regression because gfortran 4.9.2 is OK

Are you sure of that? I don't have 4.9.2, but I see the error with various
4.9.0 and 4.9.4.

[Bug fortran/77534] New: ICE in check_dtio_arg_TKR_intent, at fortran/interface.c:4572

2016-09-08 Thread gerhard.steinmetz.fort...@t-online.de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77534

Bug ID: 77534
   Summary: ICE in check_dtio_arg_TKR_intent, at
fortran/interface.c:4572
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gerhard.steinmetz.fort...@t-online.de
  Target Milestone: ---

With a nonpolymorphic dtv :


$ cat z1.f90
module m
   type t
   end type
   interface read(unformatted)
  module procedure s
   end interface
contains
   subroutine s(dtv)
  type(t), intent(inout) :: dtv
   end
end


$ gfortran-7-20160904 z1.f90
z1.f90:8:19:

subroutine s(dtv)
   1
Error: DTIO dummy argument at (1) must be of type CLASS
f951: internal compiler error: Segmentation fault
0xc2100f crash_signal
../../gcc/toplev.c:336
0x6926db check_dtio_arg_TKR_intent
../../gcc/fortran/interface.c:4572
0x6927f9 check_dtio_interface1
../../gcc/fortran/interface.c:4662
0x69a2d8 gfc_check_dtio_interfaces(gfc_symbol*)
../../gcc/fortran/interface.c:4741
0x70c87b do_traverse_symtree
../../gcc/fortran/symbol.c:3939
0x6f65c0 resolve_types
../../gcc/fortran/resolve.c:15658
0x6f1d9c gfc_resolve(gfc_namespace*)
../../gcc/fortran/resolve.c:15730
0x6dd3a4 gfc_parse_file()
../../gcc/fortran/parse.c:6056
0x71f602 gfc_be_parse_file
../../gcc/fortran/f95-lang.c:198

[Bug fortran/77533] ICE in check_dtio_interface1, at fortran/interface.c:4614

2016-09-08 Thread gerhard.steinmetz.fort...@t-online.de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77533

--- Comment #1 from Gerhard Steinmetz  
---

Sidenote : one message mentions a "KIND = 0".


$ cat z3.f90
module m
   type t
   contains
  procedure :: s
  generic :: write(formatted) => s
   end type
contains
   subroutine s(x)
  class(t), intent(in) : x
   end
end


$ gfortran-7-20160904 z3.f90
z3.f90:9:15:

   class(t), intent(in) : x
   1
Error: Invalid character in name at (1)
z3.f90:4:15:

   procedure :: s
   1
Error: Non-polymorphic passed-object dummy argument of 's' at (1)
z3.f90:8:17:

subroutine s(x)
 1
Error: DTIO dummy argument at (1) must be of type CLASS
z3.f90:8:17:

subroutine s(x)
 1
Error: DTIO dummy argument at (1) must be of KIND = 0
z3.f90:8:17:

subroutine s(x)
 1
Error: DTIO dummy argument at (1) must have intent IN

Re: [PATCH 8/9] shrink-wrap: shrink-wrapping for separate components

2016-09-08 Thread Jeff Law

On 07/31/2016 07:42 PM, Segher Boessenkool wrote:


Deciding what blocks should run with a certain component active so that
the total cost of executing the prologues (and epilogues) is optimal, is
not a computationally feasible problem.
Really?  It's just a dataflow problem is it not and one that ought to 
converge very quickly I'd think.  Or is it more a function of having to 
run it on so many independent components?  I'm still pondering the value 
of having every GPR be an independent component :-)


ISTM this adds a fair amount of complexity to the implementation in that 
prologues and epilogues for a particular component can run more than 
once.  Can you give a concrete example where this happens so that we can 
all understand it better?


If we keep this aspect of the implementation it seems like a note in the 
developer section would be in order.  It's certainly non-intuitive.


I only glanced over the code that seems related to this aspect of the 
implementation.  If we decide to go forward, I'd like to look at it 
again more closely.




Now all that is left is inserting prologues and epilogues on all edges
that jump into resp. out of the "active" set of blocks.  Often we need
to insert some components' prologues (or epilogues) on all edges into
(or out of) a block.  In theory cross-jumping can unify all such, but
in practice that often fails; besides, that is a lot of work.  So in
this case we insert the prologue and epilogue components at the "head"
or "tail" of a block, instead.
Cross jumping is rather simplistic, so I'm not surprised that it doesn't 
catch all this stuff.




As a final optimisation, if a block needs a prologue and its immediate
dominator has the block as a post-dominator, the dominator gets the
prologue as well.
So why not just put it in the idom and not in the dominated block? 
Doesn't this just end up running that component's prologue twice?




2016-06-07  Segher Boessenkool  

* function.c (thread_prologue_and_epilogue_insns): Recompute the
live info.  Call try_shrink_wrapping_separate.  Compute the
prologue_seq afterwards, if it has possibly changed.  Compute the
split_prologue_seq and epilogue_seq later, too.
* shrink-wrap.c: #include cfgbuild.h.
(dump_components): New function.
(struct sw): New struct.
(SW): New function.
(init_separate_shrink_wrap): New function.
(fini_separate_shrink_wrap): New function.
(place_prologue_for_one_component): New function.
(spread_components): New function.
(disqualify_problematic_components): New function.
(emit_common_heads_for_components): New function.
(emit_common_tails_for_components): New function.
(insert_prologue_epilogue_for_components): New function.
(try_shrink_wrapping_separate): New function.
* shrink-wrap.h: Declare try_shrink_wrapping_separate.
---
 gcc/function.c|  15 +-
 gcc/shrink-wrap.c | 715 ++
 gcc/shrink-wrap.h |   1 +
 3 files changed, 729 insertions(+), 2 deletions(-)

diff --git a/gcc/function.c b/gcc/function.c
index bba0705..390e9a6 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -5912,6 +5912,12 @@ make_epilogue_seq (void)
 void
 thread_prologue_and_epilogue_insns (void)
 {
+  if (optimize > 1)
+{
+  df_live_add_problem ();
+  df_live_set_all_dirty ();
+}

Perhaps conditional on separate shrink wrapping?


@@ -5922,9 +5928,7 @@ thread_prologue_and_epilogue_insns (void)
   edge entry_edge = single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun));
   edge orig_entry_edge = entry_edge;

-  rtx_insn *split_prologue_seq = make_split_prologue_seq ();
   rtx_insn *prologue_seq = make_prologue_seq ();
-  rtx_insn *epilogue_seq = make_epilogue_seq ();

   /* Try to perform a kind of shrink-wrapping, making sure the
  prologue/epilogue is emitted only around those parts of the
@@ -5932,6 +5936,13 @@ thread_prologue_and_epilogue_insns (void)

   try_shrink_wrapping (_edge, prologue_seq);

+  df_analyze ();
+  try_shrink_wrapping_separate (entry_edge->dest);
+  if (crtl->shrink_wrapped_separate)
+prologue_seq = make_prologue_seq ();

Perhaps push the df_analyze call into try_shrink_wrapping_separate?

ANd if it was successful, do you need to update the DF information?  Is 
that perhaps the cause of some of the issues we're seeing with DCE, 
regrename, the scheduler?





diff --git a/gcc/shrink-wrap.c b/gcc/shrink-wrap.c
index b85b1c3..643e375 100644
--- a/gcc/shrink-wrap.c
+++ b/gcc/shrink-wrap.c
@@ -34,6 +34,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "output.h"
 #include "tree-pass.h"
 #include "cfgrtl.h"
+#include "cfgbuild.h"
 #include "params.h"
 #include "bb-reorder.h"
 #include "shrink-wrap.h"
@@ -1006,3 +1007,717 @@ try_shrink_wrapping (edge *entry_edge, rtx_insn 
*prologue_seq)
   BITMAP_FREE (bb_with);
   free_dominance_info 

[Bug fortran/77533] New: ICE in check_dtio_interface1, at fortran/interface.c:4614

2016-09-08 Thread gerhard.steinmetz.fort...@t-online.de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77533

Bug ID: 77533
   Summary: ICE in check_dtio_interface1, at
fortran/interface.c:4614
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gerhard.steinmetz.fort...@t-online.de
  Target Milestone: ---

Referencing an unknown type :


$ cat z1.f90
module m
   type t
  type(unknown), pointer :: next
   contains
  procedure :: s
  generic :: write(formatted) => s
   end type
contains
   subroutine s(x)
   end
end


$ gfortran-7-20160904 z1.f90
z1.f90:3:36:

   type(unknown), pointer :: next
1
Error: The pointer component 'next' of 't' at (1) is a type that has not been
declared
f951: internal compiler error: Segmentation fault
0xc2100f crash_signal
../../gcc/toplev.c:336
0x692790 check_dtio_interface1
../../gcc/fortran/interface.c:4614
0x69a293 gfc_check_dtio_interfaces(gfc_symbol*)
../../gcc/fortran/interface.c:4729
0x70c87b do_traverse_symtree
../../gcc/fortran/symbol.c:3939
0x6f65c0 resolve_types
../../gcc/fortran/resolve.c:15658
0x6f1d9c gfc_resolve(gfc_namespace*)
../../gcc/fortran/resolve.c:15730
0x6dd3a4 gfc_parse_file()
../../gcc/fortran/parse.c:6056
0x71f602 gfc_be_parse_file
../../gcc/fortran/f95-lang.c:198

[Bug fortran/77532] New: ICE in check_dtio_interface1, at fortran/interface.c:4622

2016-09-08 Thread gerhard.steinmetz.fort...@t-online.de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77532

Bug ID: 77532
   Summary: ICE in check_dtio_interface1, at
fortran/interface.c:4622
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gerhard.steinmetz.fort...@t-online.de
  Target Milestone: ---

With following test skeleton :


$ cat z1.f90
module m
   type t
   end type
   interface read(unformatted)
   end interface
end


$ gfortran-7-20160904 z1.f90
f951: internal compiler error: in check_dtio_interface1, at
fortran/interface.c:4622
0x6929f4 check_dtio_interface1
../../gcc/fortran/interface.c:4622
0x69a2d8 gfc_check_dtio_interfaces(gfc_symbol*)
../../gcc/fortran/interface.c:4741
0x70c87b do_traverse_symtree
../../gcc/fortran/symbol.c:3939
0x6f65c0 resolve_types
../../gcc/fortran/resolve.c:15658
0x6f1d9c gfc_resolve(gfc_namespace*)
../../gcc/fortran/resolve.c:15730
0x6dd3a4 gfc_parse_file()
../../gcc/fortran/parse.c:6056
0x71f602 gfc_be_parse_file
../../gcc/fortran/f95-lang.c:198

Re: [PATCH 7/9] cprop: Leave RTX_FRAME_RELATED_P instructions alone

2016-09-08 Thread Jeff Law

On 07/31/2016 07:42 PM, Segher Boessenkool wrote:

Doing cprop on frame-related instructions blows up spectacularly.
So don't.

2016-06-07  Segher Boessenkool  

* regcprop.c (copyprop_hardreg_forward_1): Don't change
RTX_FRAME_RELATED_P instructions.

Example or testcase?

jeff



Re: [PATCH 4/9] regrename: Don't rename restores

2016-09-08 Thread Jeff Law

On 07/31/2016 07:42 PM, Segher Boessenkool wrote:

A restore is supposed to restore some certain register.  Restoring it
into some other register will not work.  Don't.

2016-06-07  Segher Boessenkool  

* regrename.c (build_def_use): Invalidate chains that have a
REG_CFA_RESTORE on some instruction.
Again, how is this different from a normal epilogue that restores 
registers which regrename seems to not muck with.


Jeff



Re: [PATCH 6/9] sel-sched: Don't mess with register restores

2016-09-08 Thread Jeff Law

On 07/31/2016 07:42 PM, Segher Boessenkool wrote:

If selective scheduling copies register restores it confuses dwarf2cfi.

2016-06-07  Segher Boessenkool  

* sel-sched-ir.c (init_global_and_expr_for_insn): Don't copy
instructions with a REG_CFA_RESTORE note.
Similarly, I think you're papering over a lifetime problem of some kind 
here.


jeff



Re: [PATCH 5/9] regrename: Don't run if function was separately shrink-wrapped

2016-09-08 Thread Jeff Law

On 07/31/2016 07:42 PM, Segher Boessenkool wrote:

Unfortunately even after the previous patch there are still a few cases
where regrename creates invalid code when used together with separate
shrink-wrapping.  At noreturn exits regrename thinks it can use all
callee-saved registers, but that is not true.  This patch disables
regrename for functions that were separately shrink-wrapped.

2016-06-07  Segher Boessenkool  

* regrename.c (gate): Disable regrename if shrink_wrapped_separate
is set in crtl.
I think this (and the prior patches) are masking a deeper issue.  It's 
almost as if we don't have correct lifetime information and the various 
return points.  I'd really like to see a deeper analysis of these issues.


jeff



Re: [PATCH 3/9] dce: Don't dead-code delete separately wrapped restores

2016-09-08 Thread Jeff Law

On 07/31/2016 07:42 PM, Segher Boessenkool wrote:

Deleting restores (before a noreturn) that are dead confuses dwarf2cfi.

2016-06-07  Segher Boessenkool  

* dce.c (delete_unmarked_insns): Don't delete instructions with
a REG_CFA_RESTORE note.
I don't really understand this one.  Why is the restore marked dead and 
why doesn't that happen for normal epilogues?  Something wonky seems to 
be going on here.


jeff



Re: [PATCH 2/9] cfgcleanup: Don't confuse CFI when -fshrink-wrap-separate

2016-09-08 Thread Jeff Law

On 07/31/2016 07:42 PM, Segher Boessenkool wrote:

cfgcleanup would try to join noreturn paths with different components
handled.  This then fails in dwarf2cfi.

2016-06-07  Segher Boessenkool  

* cfgcleanup.c (outgoing_edges_match): Don't join noreturn calls
if shrink_wrapped_separate.
So if you only have fake edges, then you've got a non-returning call. If 
you could twiddle the comment to make that clear it'd be appreciated.


Obviously you could look at the components and allow joining if the 
components are the same.  But I don't know if that happens enough in 
practice to be worth the effort.


OK with the comment update if/when the kit as a whole is approved.

jeff





[Bug driver/77529] -fno-pie disables -fPIC

2016-09-08 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77529

--- Comment #4 from H.J. Lu  ---
(In reply to Matthias Klose from comment #0)
> seen with a GCC 6 configured with --enable-default-pie:
> 
> $ gcc -E -dM  - < /dev/null 2>&1|grep -i 'pi[ce]'
> #define __pie__ 2
> #define __PIE__ 2
> #define __pic__ 2
> #define __PIC__ 2
> 
> $ gcc -E -dM -fPIC - < /dev/null 2>&1|grep -i 'pi[ce]'
> #define __pic__ 2
> #define __PIC__ 2
> 
> $ gcc -E -dM -fPIC -fno-pie - < /dev/null 2>&1|grep -i 'pi[ce]'
> $ gcc -E -dM -fno-pie -fPIC - < /dev/null 2>&1|grep -i 'pi[ce]'
> #define __pic__ 2
> #define __PIC__ 2
> 
> I would expect the last behavior to be the default one.

--enable-default-pie should only change the default.  When -fno-pie -fPIC
are used, the driver behavior should be independent of --enable-default-pie.
GCC 6 without --enable-default-pie, I got

[hjl@gnu-6 gas]$ gcc -E -dM  - < /dev/null 2>&1|grep -i 'pi[ce]'
[hjl@gnu-6 gas]$ gcc -E -dM -fPIC - < /dev/null 2>&1|grep -i 'pi[ce]'
#define __pic__ 2
#define __PIC__ 2
[hjl@gnu-6 gas]$ gcc -E -dM -fPIC -fno-pie - < /dev/null 2>&1|grep -i 'pi[ce]'
[hjl@gnu-6 gas]$ gcc -E -dM -fno-pie -fPIC - < /dev/null 2>&1|grep -i 'pi[ce]'
#define __pic__ 2
#define __PIC__ 2
[hjl@gnu-6 gas]$ 

They match what you got with --enable-default-pie when -fPIC or -fno-pie
are used.

Re: [PATCH 1/9] separate shrink-wrap: New command-line flag, status flag, hooks, and doc

2016-09-08 Thread Jeff Law

On 07/31/2016 07:42 PM, Segher Boessenkool wrote:

This patch adds a new command-line flag "-fshrink-wrap-separate", a status
flag "shrink_wrapped_separate", hooks for abstracting the target components,
and documentation for all those.

2016-06-07  Segher Boessenkool  

* common.opt (-fshrink-wrap-separate): New flag.
* doc/invoke.texi: Document it.
* doc/tm.texi.in (Shrink-wrapping separate components): New subsection.
* doc/tm.texi: Regenerate.
* emit-rtl.h (struct rtl_data): New field shrink_wrapped_separate.
* target.def (shrink_wrap): New hook vector.
(get_separate_components, components_for_bb, disqualify_components,
emit_prologue_components, emit_epilogue_components,
set_handled_components): New hooks.
---
 gcc/common.opt  |  4 
 gcc/doc/invoke.texi | 11 ++-
 gcc/doc/tm.texi | 54 ++
 gcc/doc/tm.texi.in  | 29 +++
 gcc/emit-rtl.h  |  4 
 gcc/target.def  | 57 +
 6 files changed, 158 insertions(+), 1 deletion(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 8a292ed..97d305f 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 9edb006..5a5c5cab 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -4852,6 +4853,59 @@ This hook should add additional registers that are 
computed by the prologue to t
 True if a function's return statements should be checked for matching the 
function's return type.  This includes checking for falling off the end of a 
non-void function.  Return false if no such check should be made.
 @end deftypefn

+@node Shrink-wrapping separate components
+@subsection Shrink-wrapping separate components
+@cindex shrink-wrapping separate components
+
+The prologue does a lot of separate things: save callee-saved registers,
+do whatever needs to be done to be able to call things (save the return
+address, align the stack, whatever; different for each target), set up a
+stack frame, do whatever needs to be done for the static chain (if anything),
+set up registers for PIC, etc.
The prologue may perform a variety of target dependent tasks such as 
saving callee saved registers, saving the return address, aligning the 
stack, create a local stack frame, initialize the PIC register, etc.


On some targets some of these tasks may be independent of others and 
thus may be shrink-wrapped separately.  These independent tasks are 
referred to as components and are handled generically by the target 
independent parts of GCC.


Each component has a slot in a sbitmap that is generated and maintained 
for each basic block.





+@deftypefn {Target Hook} sbitmap TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS 
(void)
+This hook should return an @code{sbitmap} with the bits set for those
+components that can be separately shrink-wrapped in the current function.
+Return @code{NULL} if the current function should not get any separate
+shrink-wrapping.
+Don't define this hook if it would always return @code{NULL}.
+If it is defined, the other hooks in this group have to be defined as well.
+@end deftypefn
+
+@deftypefn {Target Hook} sbitmap TARGET_SHRINK_WRAP_COMPONENTS_FOR_BB 
(basic_block)
+This hook should return an @code{sbitmap} with the bits set for those
+components where either the prologue component has to be executed before
+the @code{basic_block}, or the epilogue component after it, or both.
+@end deftypefn

Who is responsible for allocating and releasing the sbitmaps?


I don't have major concerns with this patch -- I'd like to see 
clarification done on the ownership of the sbitmaps (ie, who allocates 
and releases those objects).  I'd like to see if we can get a better 
introduction as well.


Jeff



Re: [PATCH 1/9] separate shrink-wrap: New command-line flag, status flag, hooks, and doc

2016-09-08 Thread Jeff Law

On 08/29/2016 03:31 AM, Bernd Schmidt wrote:

On 08/01/2016 03:42 AM, Segher Boessenkool wrote:

+@deftypefn {Target Hook} void
TARGET_SHRINK_WRAP_EMIT_PROLOGUE_COMPONENTS (sbitmap)
+Emit prologue insns for the components indicated by the parameter.
+@end deftypefn
+
+@deftypefn {Target Hook} void
TARGET_SHRINK_WRAP_EMIT_EPILOGUE_COMPONENTS (sbitmap)
+Emit epilogue insns for the components indicated by the parameter.
+@end deftypefn


How do these actually know where to save/restore registers? The frame
pointer may have been eliminated, and SP isn't necessarily constant
during the function. Seems like you'd have to calculate CFA reg/offset
much like dwarf2out does and pass it to this hook.
So I think the confusion here is these hooks are independent of 
placement. ie, the target independent code does something like:


FOR_EACH_BB
  Build the component bitmap using the incoming edge components
  Emit the prologue components at the start of the block
  Emit the epilogue components at the end of the block


The components handled by a particular block start are set/cleared by 
the other hooks.


jeff


Re: why do we need xtensa-config.h?

2016-09-08 Thread Oleksij Rempel
Am 08.09.2016 um 18:10 schrieb augustine.sterl...@gmail.com:
> On Thu, Sep 8, 2016 at 8:14 AM, Oleksij Rempel  wrote:
>> Am 07.09.2016 um 18:21 schrieb augustine.sterl...@gmail.com:
>>> On Tue, Sep 6, 2016 at 11:55 PM, Thomas Schwinge
>>>  wrote:
 Hi!

 Neither do I really know anything about Xtensa, nor do I have a lot of
 experience in these parts of GCC back ends, but:
>>>
>>> There is a lot of background to know here. Unfortunately, I have no
>>> familiarity with making debian packages, so I'm unfamiliar with that
>>> side of it.
>>>
>>> First--and perhaps most important--the current method of configuring
>>> GCC for xtensa targets has worked well for nearly two decades. As far
>>> as I know, it is rare to encounter problems. Because of that, the bar
>>> to change it will probably be fairly high to change it.
>>>
 On Tue, 6 Sep 2016 20:42:53 +0200, Oleksij Rempel  
 wrote:
> i'm one of ath9k-htc-firmware developers. Currently i'm looking for the
> way to provide this firmware as opensource/free package for debian. Main
> problem seems to be the need to patch gcc xtensa-config.h to make it
> suitable for our CPU.
>
> I have fallowing questions:
>
> do we really need this patch?
> https://github.com/qca/open-ath9k-htc-firmware/blob/master/local/patches/gcc.patch

 That I can't tell.  ;-)
>>>
>>> You need something like that patch, for sure.
>>>
> Is it possible or welcome to extend gcc to be configurable without
> patching it all the time?

 Yes, I would think.  The macros modified in the above patch to GCC's
 include/xtensa-config.h file look like these ought to be modifiable with
 -m* options defined by the Xtensa back end, and you'd then assign
 specific defaults to a specific CPU variant, and build GCC (or build a
 multilib) for that configuration.
>>>
>>> Today, there are literally hundreds of variants of the xtensa cpu
>>> actually realized and in use. Having a list of all those variants and
>>> their defaults inside gcc would be awkward and unwieldy.
>>>
>>> But--and here's the rub--literally tomorrow, someone could design a
>>> hundred more that are different from all of the ones already out
>>> there. There is literally an unlimited number of potential variants,
>>> each with potentially brand new, never conceived instructions. (Adding
>>> clever custom instructions is xtensa's raison d'etre.)
>>>
>>> With the current configurability mechanism, supporting all of those
>>> variants inside gcc (and, in fact, the rest of the gnu-toolchain) is
>>> simply a matter of using the correct xtensa-config.h for that
>>> particular variant. If we were to go with the "-m with defaults"
>>> mechanism, we would need some way of adding the defaults for the new
>>> variant to gcc.
>>>
>>> But that is patching gcc also, and once you go there, you may as well
>>> use the original method.
>>>

 This file include/xtensa-config.h is #included in
 gcc/config/xtensa/xtensa.h and libgcc/config/xtensa/crti.S,
>>>
>>> Note that "-m" options can't change the instructions in crti.S and
>>> lib?funcs.S, but macros can and do.
>>>
>>>
>>>
>>> On the debian packaging side. Forgive me for my ignorance on the
>>> topic; I don't know that the tool-flow is, or what the requirements
>>> are. As far as I am aware, this is the first time someone has tried to
>>> make a debian package for xtensa.
>>
>> The point is to provide a package for "free" repository. It means, any
>> one should be able to use "apt-get source package_name", patch it and
>> recompile it from source.
>>
>> Right now it would work, but the package scripts should download
>> toolchain source, patch and compile it and the compile actual firmware.
>> This is wary high overhead.
>>
>> This is why i asked my self, why the toolchain can't be modular or
>> configurable enough to work without patching and recompiling it.
>>
>>> Anyway, I wouldn't expect patching gcc (or configuring it) for an
>>> individual package is the right thing. It should probably happen as
>>> part of some kind of "setup toolchain" step.
>>>
>>> Typically, patching gcc for a xtensa config happens just once
>>> immediately after designing the processor, or--if you aren't the
>>> designer yourself--when one starts development for that variant.
>>
>> after quick look i noticed that:
>> XSHAL_USE_ABSOLUTE_LITERALS affects TRAMPOLINE_SIZE. This seems to be
>> hardcoded every where in gcc, so can't be changed dynamically?
> 
> TRAMPOLINE_SIZE is used in quite a few places (so beyond my authority
> to accept patches), but I suspect it would be acceptable to put a
> function behind TRAMPOLINE_SIZE that calculated the value dynamically.
> 
>>
>> XCHAL_HAVE_MUL32_HIGH probably can be changed dynamically.
>> XCHAL_ICACHE_SIZE and XCHAL_DCACHE_SIZE will enable or disable part of
>> __xtensa_sync_caches, why not to make it 

Re: [PATCH v2 0/9] Separate shrink-wrapping

2016-09-08 Thread Jeff Law

On 08/26/2016 09:03 AM, Bernd Schmidt wrote:

On 08/26/2016 04:50 PM, Segher Boessenkool wrote:

The head comment starts with

+/* Separate shrink-wrapping
+
+   Instead of putting all of the prologue and epilogue in one spot, we
+   can put parts of it in places where those components are executed
less
+   frequently.

and that is the long and short of it.


And that comment puzzles me. Surely prologue and epilogue are executed
only once currently, so how does frequency come into it? Again - please
provide an example.
Right, they're executed once currently.  But the prologue could be sunk 
into locations where they are not executed every time the function is 
called.  That's the basis behind shrink wrapping.


Segher's code essentially allows individual components of the prologue 
to sink to different points within the function rather than forcing the 
prologue to be sunk as an atomic unit.


Conceptually you could run the standard algorithm on each independent 
component.






The full-prologue algorithm makes as many blocks run without prologue as
possible, by duplicating blocks where that helps.  If you do this for
every component you can and up with 2**40 blocks for just 40 components,


Ok, so why wouldn't we use the existing code with the duplication part
disabled? That's a later addition anyway and isn't necessary to do
shrink-wrapping in the first place.
I think the concern here is the balance between code explosion and the 
runtime gains.


jeff


Re: [PATCH v2 0/9] Separate shrink-wrapping

2016-09-08 Thread Jeff Law

On 08/26/2016 10:27 AM, Segher Boessenkool wrote:

On Fri, Aug 26, 2016 at 05:03:34PM +0200, Bernd Schmidt wrote:

On 08/26/2016 04:50 PM, Segher Boessenkool wrote:

The head comment starts with

+/* Separate shrink-wrapping
+
+   Instead of putting all of the prologue and epilogue in one spot, we
+   can put parts of it in places where those components are executed less
+   frequently.

and that is the long and short of it.


And that comment puzzles me. Surely prologue and epilogue are executed
only once currently, so how does frequency come into it? Again - please
provide an example.


If some component is only needed for 0.01% of executions of a function,
running it once for every execution is 1 times too much.

The trivial example is a function that does an early exit, but uses one
or a few non-volatile registers before that exit.  This happens in e.g.
glibc's malloc, if you want an easily accessed example.  With the current
code, *all* components will be saved and then restored shortly afterwards.
So can you expand on the malloc example a bit -- I'm pretty sure I 
understand what you're trying to do, but a concrete example may help 
Bernd and be useful for archival purposes.


I also know that Carlos is interested in the malloc example -- so I'd 
like to be able to pass that along to him.


Given the multiple early exit and fast paths through the allocator, I'm 
not at all surprised that sinking different components of the prologue 
to different locations is useful.


Also if there's a case where sinking into a loop occurs, definitely 
point that out.





The full-prologue algorithm makes as many blocks run without prologue as
possible, by duplicating blocks where that helps.  If you do this for
every component you can and up with 2**40 blocks for just 40 components,


Ok, so why wouldn't we use the existing code with the duplication part
disabled?


That would not perform nearly as well.


That's a later addition anyway and isn't necessary to do
shrink-wrapping in the first place.


No, it always did that, just not as often (it only duplicated straight-line
code before).
Presumably (I haven't looked yet), the duplication is so that we can 
isolate one or more paths which in turn allows sinking the prologue 
further on some of those paths.


This is something I'll definitely want to look at -- block duplication 
to facilitate code elimination (or in this case avoid code insertion) 
hits several areas of interest to me -- and how we balance duplication 
vs runtime savings is always interesting.


Jeff



Re: [PATCH v2 0/9] Separate shrink-wrapping

2016-09-08 Thread Jeff Law

On 08/30/2016 06:31 AM, Michael Matz wrote:

Hi,

On Fri, 26 Aug 2016, Bernd Schmidt wrote:


And that comment puzzles me. Surely prologue and epilogue are executed only
once currently, so how does frequency come into it? Again - please provide an
example.


int some_global;
int foo (void) {
  if (!some_global) {
call_this();
call_that();
x = some + large / computation;
while (x--) { much_more_code(); }
some_global = 1;
  }
  return some_global;
}

Now you need the full pro/epilogue (saving/restoring all call clobbered
regs presumably used by the large computation and loop) for only one call
of that function (the first), and then never again.  But you do need a
part of it always assuming the architecture is right, and this is a shared
library, namely the setup of the PIC pointer for the access to
some_global.




A prologue does many things, and in some cases many of them aren't
necessary for all calls (indeed that's often the case for functions
containing early-outs), so much so that the full prologue including
unnecessary parts needs more time than the functional body of a functions
particular call.  It's obvious that it would be better to move those parts
of the prologue to places where they actually are needed:

[ ... ]
Right.  Essentially Segher's patch introduces the concept of prologue 
components that are independent of each other and which can be 
shrink-wrapped separately.  The degree of independence is highly target 
specific, of course.


I think one of the questions (and I haven't looked through the whole 
thread yet to see if it's answered) is why the basic shrink-wrapping 
algorithm can't be applied to each of the prologue components -- though 
you may have answered it with your loop example below.





int f2 (void) {
  setup_pic_reg();
  if (!some_global) {
save_call_clobbers();
code_from_above();
restore_call_clobbers();
  }
  retreg = some_global;
  restore_pic_reg();
}

That includes moving parts of the prologue into loops:

int g() {
  int sum = 0;
  for (int i = 0; i < NUM; i++) {
sum += i;
if (sum >= CUTOFF) {
  some_long_winded_expression();
  that_eventually_calls_abort();
}
  }
  return sum;
}

Here all parts of the prologue that somehow make it possible to call other
functions are necessary only when the program will abort eventually: hence
is necessary only at one call of g() at most.  Again it's sensible to move
those parts inside the unlikely condition, even though it's inside a loop.
Thanks.  I'd been wondering about when it'd be useful to push prologue 
code into a loop nest when I saw the patches fly by, but didn't think 
about it too much.  I haven't looked at the shrink-wrapping literature 
in years, but I don't recall it having any concept that there were cases 
where sinking into a loop nest was profitable.


Jeff


Re: [x86] Disable STV pass if -mstackrealign is enabled.

2016-09-08 Thread Eric Botcazou
> What should I look for with  'svn annotate' ?

"Disable STV" line 5959 of config/i386/i386.c.

-- 
Eric Botcazou


[Bug preprocessor/71681] header.gcc file lookup is broken for -remap

2016-09-08 Thread andris at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71681

--- Comment #3 from Andris Pavenis  ---
Patch in mailing list

https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00395.html

Re: why do we need xtensa-config.h?

2016-09-08 Thread augustine.sterl...@gmail.com
On Thu, Sep 8, 2016 at 8:14 AM, Oleksij Rempel  wrote:
> Am 07.09.2016 um 18:21 schrieb augustine.sterl...@gmail.com:
>> On Tue, Sep 6, 2016 at 11:55 PM, Thomas Schwinge
>>  wrote:
>>> Hi!
>>>
>>> Neither do I really know anything about Xtensa, nor do I have a lot of
>>> experience in these parts of GCC back ends, but:
>>
>> There is a lot of background to know here. Unfortunately, I have no
>> familiarity with making debian packages, so I'm unfamiliar with that
>> side of it.
>>
>> First--and perhaps most important--the current method of configuring
>> GCC for xtensa targets has worked well for nearly two decades. As far
>> as I know, it is rare to encounter problems. Because of that, the bar
>> to change it will probably be fairly high to change it.
>>
>>> On Tue, 6 Sep 2016 20:42:53 +0200, Oleksij Rempel  
>>> wrote:
 i'm one of ath9k-htc-firmware developers. Currently i'm looking for the
 way to provide this firmware as opensource/free package for debian. Main
 problem seems to be the need to patch gcc xtensa-config.h to make it
 suitable for our CPU.

 I have fallowing questions:

 do we really need this patch?
 https://github.com/qca/open-ath9k-htc-firmware/blob/master/local/patches/gcc.patch
>>>
>>> That I can't tell.  ;-)
>>
>> You need something like that patch, for sure.
>>
 Is it possible or welcome to extend gcc to be configurable without
 patching it all the time?
>>>
>>> Yes, I would think.  The macros modified in the above patch to GCC's
>>> include/xtensa-config.h file look like these ought to be modifiable with
>>> -m* options defined by the Xtensa back end, and you'd then assign
>>> specific defaults to a specific CPU variant, and build GCC (or build a
>>> multilib) for that configuration.
>>
>> Today, there are literally hundreds of variants of the xtensa cpu
>> actually realized and in use. Having a list of all those variants and
>> their defaults inside gcc would be awkward and unwieldy.
>>
>> But--and here's the rub--literally tomorrow, someone could design a
>> hundred more that are different from all of the ones already out
>> there. There is literally an unlimited number of potential variants,
>> each with potentially brand new, never conceived instructions. (Adding
>> clever custom instructions is xtensa's raison d'etre.)
>>
>> With the current configurability mechanism, supporting all of those
>> variants inside gcc (and, in fact, the rest of the gnu-toolchain) is
>> simply a matter of using the correct xtensa-config.h for that
>> particular variant. If we were to go with the "-m with defaults"
>> mechanism, we would need some way of adding the defaults for the new
>> variant to gcc.
>>
>> But that is patching gcc also, and once you go there, you may as well
>> use the original method.
>>
>>>
>>> This file include/xtensa-config.h is #included in
>>> gcc/config/xtensa/xtensa.h and libgcc/config/xtensa/crti.S,
>>
>> Note that "-m" options can't change the instructions in crti.S and
>> lib?funcs.S, but macros can and do.
>>
>>
>>
>> On the debian packaging side. Forgive me for my ignorance on the
>> topic; I don't know that the tool-flow is, or what the requirements
>> are. As far as I am aware, this is the first time someone has tried to
>> make a debian package for xtensa.
>
> The point is to provide a package for "free" repository. It means, any
> one should be able to use "apt-get source package_name", patch it and
> recompile it from source.
>
> Right now it would work, but the package scripts should download
> toolchain source, patch and compile it and the compile actual firmware.
> This is wary high overhead.
>
> This is why i asked my self, why the toolchain can't be modular or
> configurable enough to work without patching and recompiling it.
>
>> Anyway, I wouldn't expect patching gcc (or configuring it) for an
>> individual package is the right thing. It should probably happen as
>> part of some kind of "setup toolchain" step.
>>
>> Typically, patching gcc for a xtensa config happens just once
>> immediately after designing the processor, or--if you aren't the
>> designer yourself--when one starts development for that variant.
>
> after quick look i noticed that:
> XSHAL_USE_ABSOLUTE_LITERALS affects TRAMPOLINE_SIZE. This seems to be
> hardcoded every where in gcc, so can't be changed dynamically?

TRAMPOLINE_SIZE is used in quite a few places (so beyond my authority
to accept patches), but I suspect it would be acceptable to put a
function behind TRAMPOLINE_SIZE that calculated the value dynamically.

>
> XCHAL_HAVE_MUL32_HIGH probably can be changed dynamically.
> XCHAL_ICACHE_SIZE and XCHAL_DCACHE_SIZE will enable or disable part of
> __xtensa_sync_caches, why not to make it dynamically as command line option?
>
>
> IMO most of xtensa-config.h can be changed on runtime. Or do i miss some
> thing?

Unless we can make it all of what xtensa-config.h 

Re: RFA: Small PATCH to add pow2p_hwi to hwint.h

2016-09-08 Thread Joseph Myers
On Thu, 8 Sep 2016, Jason Merrill wrote:

> Various places in GCC use negate, bit-and and compare to test whether
> an integer is a power of 2, but I think it would be clearer for this
> test to be wrapped in a function.

(x & -x) == x is also true for 0.  Whatever the correct function semantics 
for 0, the comment needs to reflect them.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH][expmed.c] PR middle-end/77426 Delete duplicate condition in synth_mult

2016-09-08 Thread Jeff Law

On 09/07/2016 06:59 AM, Kyrill Tkachov wrote:

Hi all,

The duplicate mode check in synth can just be deleted IMO. It was
introduced as part of r139821 that was
a much larger change introducing size/speed differentiation to the RTL
midend. So I think it's just a typo/copy-pasto.

Tested on aarch64-none-elf.
Ok?

Thanks,
Kyrill

2016-09-07  Kyrylo Tkachov  

PR middle-end/77426
* expmed.c (synth_mult): Delete duplicate mode check.

OK.
jeff


[Bug c/77530] optimization prevents excess precision from being removed with x86/amd64 long double and rounding to 53 bits

2016-09-08 Thread vincent-gcc at vinc17 dot net
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77530

--- Comment #3 from Vincent Lefèvre  ---
More precisely, up to NetBSD 6, it is 53 bits (gcc70 is NetBSD 5.1). As of
NetBSD 7, it is 64 bits.

[Bug c/77530] optimization prevents excess precision from being removed with x86/amd64 long double and rounding to 53 bits

2016-09-08 Thread vincent-gcc at vinc17 dot net
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77530

--- Comment #2 from Vincent Lefèvre  ---
This seems to depend on the NetBSD version. The default rounding precision has
changed in NetBSD 7.0: https://www.netbsd.org/changes/changes-7.0.html

[Bug c/77530] optimization prevents excess precision from being removed with x86/amd64 long double and rounding to 53 bits

2016-09-08 Thread joseph at codesourcery dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77530

--- Comment #1 from joseph at codesourcery dot com  ---
The x86 back end uses TARGET_96_ROUND_53_LONG_DOUBLE only for the case of 
32-bit mode on DragonflyBSD and FreeBSD.  From what you're saying here, it 
needs to do so for NetBSD as well (both 32-bit and 64-bit?).

[Bug c/77531] New: __attribute__((alloc_size(1,2))) could also warn on multiplication overflow

2016-09-08 Thread crrodriguez at opensuse dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77531

Bug ID: 77531
   Summary: __attribute__((alloc_size(1,2))) could also warn on
multiplication overflow
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: crrodriguez at opensuse dot org
  Target Milestone: ---

Using the example in the docs:

void* my_calloc(size_t x , size_t y) __attribute__((alloc_size(1,2)))

since alloc_size(1,2) means the function will return memory of  x *y..when the
compiler knows that multiplying x * y will cause an integer overflow..let's say
a obvious case my_calloc(SIZE_MAX, SIZE_MAX); it could either warn, error or
trap before an integer overflow actually happens.. (using the same logic as
__builtin_mul_overflow() I guess)

Re: [PATCH][v3] GIMPLE store merging pass

2016-09-08 Thread Bernhard Reutner-Fischer
On 8 September 2016 at 10:31, Kyrill Tkachov
 wrote:
>
> On 07/09/16 20:03, Bernhard Reutner-Fischer wrote:
>>
>> On September 6, 2016 5:14:47 PM GMT+02:00, Kyrill Tkachov
>>  wrote:

> Thanks, fixed all the above in my tree (will be retesting).
>

>> What about debug statements? ISTM you should skip those.
>> (Isn't visited reset before entry of a pass?)
>
>
> Yes, I'll skip debug statements. Comments in gimple.h say that the visited
> property is undefined at pass boundaries, so I'd rather initialize it here.

Right.
>
>
>> Maybe I missed the bikeshedding about the name but I'd have used
>> -fmerge-stores instead.
>
>
> Wouldn't be hard to change. I can change it any point if there's a
> consensus.

Did you consider any relation to any of
https://gcc.gnu.org/PR22141
https://gcc.gnu.org/PR23684
https://gcc.gnu.org/PR47059
https://gcc.gnu.org/PR54422
and their dups
or https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg77311.html
(the -fmerge-bitfields suggestion from imgtec; maybe the testcases are
of interest)

thanks,


[Bug driver/77529] -fno-pie disables -fPIC

2016-09-08 Thread trippels at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77529

Markus Trippelsdorf  changed:

   What|Removed |Added

 CC||hjl.tools at gmail dot com

--- Comment #3 from Markus Trippelsdorf  ---
CCing H.J.

Re: why do we need xtensa-config.h?

2016-09-08 Thread Oleksij Rempel
Am 07.09.2016 um 18:21 schrieb augustine.sterl...@gmail.com:
> On Tue, Sep 6, 2016 at 11:55 PM, Thomas Schwinge
>  wrote:
>> Hi!
>>
>> Neither do I really know anything about Xtensa, nor do I have a lot of
>> experience in these parts of GCC back ends, but:
> 
> There is a lot of background to know here. Unfortunately, I have no
> familiarity with making debian packages, so I'm unfamiliar with that
> side of it.
> 
> First--and perhaps most important--the current method of configuring
> GCC for xtensa targets has worked well for nearly two decades. As far
> as I know, it is rare to encounter problems. Because of that, the bar
> to change it will probably be fairly high to change it.
> 
>> On Tue, 6 Sep 2016 20:42:53 +0200, Oleksij Rempel  
>> wrote:
>>> i'm one of ath9k-htc-firmware developers. Currently i'm looking for the
>>> way to provide this firmware as opensource/free package for debian. Main
>>> problem seems to be the need to patch gcc xtensa-config.h to make it
>>> suitable for our CPU.
>>>
>>> I have fallowing questions:
>>>
>>> do we really need this patch?
>>> https://github.com/qca/open-ath9k-htc-firmware/blob/master/local/patches/gcc.patch
>>
>> That I can't tell.  ;-)
> 
> You need something like that patch, for sure.
> 
>>> Is it possible or welcome to extend gcc to be configurable without
>>> patching it all the time?
>>
>> Yes, I would think.  The macros modified in the above patch to GCC's
>> include/xtensa-config.h file look like these ought to be modifiable with
>> -m* options defined by the Xtensa back end, and you'd then assign
>> specific defaults to a specific CPU variant, and build GCC (or build a
>> multilib) for that configuration.
> 
> Today, there are literally hundreds of variants of the xtensa cpu
> actually realized and in use. Having a list of all those variants and
> their defaults inside gcc would be awkward and unwieldy.
> 
> But--and here's the rub--literally tomorrow, someone could design a
> hundred more that are different from all of the ones already out
> there. There is literally an unlimited number of potential variants,
> each with potentially brand new, never conceived instructions. (Adding
> clever custom instructions is xtensa's raison d'etre.)
> 
> With the current configurability mechanism, supporting all of those
> variants inside gcc (and, in fact, the rest of the gnu-toolchain) is
> simply a matter of using the correct xtensa-config.h for that
> particular variant. If we were to go with the "-m with defaults"
> mechanism, we would need some way of adding the defaults for the new
> variant to gcc.
> 
> But that is patching gcc also, and once you go there, you may as well
> use the original method.
> 
>>
>> This file include/xtensa-config.h is #included in
>> gcc/config/xtensa/xtensa.h and libgcc/config/xtensa/crti.S,
> 
> Note that "-m" options can't change the instructions in crti.S and
> lib?funcs.S, but macros can and do.
> 
> 
> 
> On the debian packaging side. Forgive me for my ignorance on the
> topic; I don't know that the tool-flow is, or what the requirements
> are. As far as I am aware, this is the first time someone has tried to
> make a debian package for xtensa.

The point is to provide a package for "free" repository. It means, any
one should be able to use "apt-get source package_name", patch it and
recompile it from source.

Right now it would work, but the package scripts should download
toolchain source, patch and compile it and the compile actual firmware.
This is wary high overhead.

This is why i asked my self, why the toolchain can't be modular or
configurable enough to work without patching and recompiling it.

> Anyway, I wouldn't expect patching gcc (or configuring it) for an
> individual package is the right thing. It should probably happen as
> part of some kind of "setup toolchain" step.
> 
> Typically, patching gcc for a xtensa config happens just once
> immediately after designing the processor, or--if you aren't the
> designer yourself--when one starts development for that variant.

after quick look i noticed that:
XSHAL_USE_ABSOLUTE_LITERALS affects TRAMPOLINE_SIZE. This seems to be
hardcoded every where in gcc, so can't be changed dynamically?

XCHAL_HAVE_MUL32_HIGH probably can be changed dynamically.
XCHAL_ICACHE_SIZE and XCHAL_DCACHE_SIZE will enable or disable part of
__xtensa_sync_caches, why not to make it dynamically as command line option?


IMO most of xtensa-config.h can be changed on runtime. Or do i miss some
thing?

> Surely if someone is building this package, they would have already
> built some other software for that particular xtensa target. (Perhaps
> as part of a larger debian build?)
> 
> Also, this package should probably only be built when targeting this
> particular xtensa variant (not xtensa generally). I don't know how one
> restricts this in the debian packaging mechanism.
> 
> Hope this helps, and I'm happy to answer more questions.
> 


[Bug driver/77529] -fno-pie disables -fPIC

2016-09-08 Thread trippels at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77529

--- Comment #2 from Markus Trippelsdorf  ---
Clang behaves exactly the same.

Re: RFA: Small PATCH to add pow2p_hwi to hwint.h

2016-09-08 Thread Jakub Jelinek
On Thu, Sep 08, 2016 at 08:53:26AM -0600, Jeff Law wrote:
> On 09/07/2016 11:56 PM, Jason Merrill wrote:
> > Various places in GCC use negate, bit-and and compare to test whether
> > an integer is a power of 2, but I think it would be clearer for this
> > test to be wrapped in a function.
> > 
> > OK for trunk?
> > 
> I think the canonical way we've written that is
> 
> exact_log2 (x) != -1

We have integer_pow2p which works on trees (and is actually implemented by
testing if popcnt is 1 on the wide-int).
Calling exact_log2 for such a test if you don't need the value is a waste of
compile time, ctz_hwi doesn't need to be computed.  And, exact_log2 is -1
even on 0, which is not pow2p_hwi.

I think having pow2p_hwi would be nice, if we really start using it where
appropriate.

Jakub


Re: Make max_align_t respect _Float128 [version 2]

2016-09-08 Thread Paul Eggert

On 09/08/2016 04:56 AM, Mark Wielaard wrote:

I don't think there is anything valgrind can do to detect that
compw really only depends on d[0] if the result is false.

valgrind (with the default --partial-loads-ok=yes) could and should do 
the same thing with cmpw that it already does with cmpl. The attached 
program compiles and runs OK with gcc -O2 and valgrind on x86-64, even 
though the machine code uses cmpl to load bytes that were not allocated.



Do gnulib and glibc syncronize?


They do at times, though not as often as we'd like. fts.c in particular 
has strayed quite a bit, to cater GNU utilities' robustness needs over 
what glibc provides. (GNU coreutils, for example, uses Gnulib fts.c even 
on glibc platforms.) At some point somebody should merge Gnulib fts.c 
back into glibc.


#include 
#include 

struct s { struct s *next; char d[]; };

int
main (void)
{
  struct s *p = malloc (offsetof (struct s, d) + 1);
  p->d[0] = 1;
  return p->d[0] == 2 && p->d[1] == 3 && p->d[2] == 4 && p->d[3] == 5;
}


[Bug driver/77529] -fno-pie disables -fPIC

2016-09-08 Thread trippels at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77529

Markus Trippelsdorf  changed:

   What|Removed |Added

 CC||trippels at gcc dot gnu.org

--- Comment #1 from Markus Trippelsdorf  ---
Pilot error. Looks like a dup of PR77464.

Re: RFA: Small PATCH to add pow2p_hwi to hwint.h

2016-09-08 Thread Jeff Law

On 09/07/2016 11:56 PM, Jason Merrill wrote:

Various places in GCC use negate, bit-and and compare to test whether
an integer is a power of 2, but I think it would be clearer for this
test to be wrapped in a function.

OK for trunk?


I think the canonical way we've written that is

exact_log2 (x) != -1


Though you could argue that's far from obvious.

Presumably you have a follow-up to start using pow2p_hwi?

jeff


[Bug debug/61013] [4.9/4.10 Regression] Option parsing difference between < 4.9 and 4.9

2016-09-08 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61013

--- Comment #17 from Jakub Jelinek  ---
As mentioned in PR77454, do we want to treat -gdwarf-N the same as -g (in
addition to setting the dwarf version), or should it be treated just like
setting of the DWARF version only if some debug info level is already enabled?
In particular, the questionable option order is
-g1 -gdwarf-4
which is now treated like
-g1 -g2 -gdwarf-4
rather than
-gdwarf-4 -g1

[Bug driver/77497] [5/6/7 Regression] Setting DWARF level and debug level together has flag-ordering-dependent results

2016-09-08 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77497

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
This changed in r210442 aka PR61013.

[Bug driver/77497] [5/6/7 Regression] Setting DWARF level and debug level together has flag-ordering-dependent results

2016-09-08 Thread torne at google dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77497

--- Comment #2 from torne at google dot com ---
Ah. I can see the logic in "-g1 -g" resulting in -g2 level output (as explained
in that bug), but it seems less ideal here where the second -g is only really
intending to define the DWARF level - especially since there's no other choice
than to do it in two flags because you can't set the debug level and the dwarf
level at once, unlike the other debug formats where you can say -gstabs1 or
similar to set both together.

[Bug c/77530] New: optimization prevents excess precision from being removed with x86/amd64 long double and rounding to 53 bits

2016-09-08 Thread vincent-gcc at vinc17 dot net
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77530

Bug ID: 77530
   Summary: optimization prevents excess precision from being
removed with x86/amd64 long double and rounding to 53
bits
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vincent-gcc at vinc17 dot net
  Target Milestone: ---

Under NetBSD/amd64, long double is on 80 bits but the processor is configured
to round on 53 bits (IEEE double precision) instead of 64 bits. The problem is
that when evaluating expressions at runtime, GCC seems to assume that the
rounding precision is 64 bits, yielding inconsistencies.

A test case:

#include 
#include 

int main (void)
{
  long double d = LDBL_MAX;
  volatile long double e = 0x8p+16380L;

  d -= 1.0;  /* no excess precision */
  printf ("d=%La e=%La\n", d, e);
  d -= e;
  printf ("d=%La\n", d);
  return 0;
}

When I compile this program with GCC 4.1.3 and the -O option under NetBSD
(gcc70.fsffrance.org), I get:

d=0xf.fffp+16380 e=0x8p+16380
d=0x8p+16380

The excess precision is kept in the first subtraction, but removed in the
second one.

There isn't a more recent GCC version installed on this machine, but I can
simulate this bug under Linux with the options "-std=c99 -mpc64 -O". I get the
same incorrect result with GCC 6.2.0 under Debian/unstable (amd64).

[Bug target/77476] [7 Regression] [AVX-512] illegal kmovb instruction on KNL

2016-09-08 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77476

Jakub Jelinek  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from Jakub Jelinek  ---
Fixed.

[Bug target/77527] [7 Regression] ICE: in make_decl_rtl, at varasm.c:1311 with -mstringop-strategy=libcall

2016-09-08 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77527

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
So, what I see happening is that lim2 creates a v_lsm.5 VAR_DECL here, and it
is is_gimple_reg and only used in SSA_NAMEs.  As it is a wide vector, it is
BLKmode in the end, and during expansion emit_block_move called because of an
assignment to it calls mark_decl_addressable on the VAR_DECL (which queues
TREE_ADDRESSABLE set for after expansion).  The v_lsm.5 VAR_DECL then appears
in MEM_EXPRs of various MEMs, but because the VAR_DECL has been originally
gimple register, it doesn't have DECL_RTL set during expansion (each SSA_NAME
for it has its own I bet).  Then when nonoverlapping_memrefs_p is called on it
during DSE, it sees  the MEM_EXPR is not is_gimple_reg and uses DECL_RTL, which
ICEs, because it doesn't have the rtl and is not meant to get rtl created that
way.

I wonder if the queued addressable decls set to TREE_ADDRESSABLE after it
shouldn't have DECL_RTL set at that point (but to which one), or alternatively
if nonoverlapping_memrefs_p should use DECL_RTL_IF_SET instead and somehow deal
with the fact when that returns NULL_RTX (dunno if it should handle that
conservatively, by giving up, or if there are cases which it could still
handle).

Re: [x86] Disable STV pass if -mstackrealign is enabled.

2016-09-08 Thread H.J. Lu
On Wed, Sep 7, 2016 at 11:00 PM, Eric Botcazou  wrote:
>> Is there a testcase to show the problem with -mincoming-stack-boundary=
>> on Linux?
>
> I don't know, 'svn annotate' will probably tell you.

What should I look for with  'svn annotate' ?

-- 
H.J.


[Bug driver/77529] New: -fno-pie disables -fPIC

2016-09-08 Thread doko at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77529

Bug ID: 77529
   Summary: -fno-pie disables -fPIC
   Product: gcc
   Version: 6.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: driver
  Assignee: unassigned at gcc dot gnu.org
  Reporter: doko at gcc dot gnu.org
  Target Milestone: ---

seen with a GCC 6 configured with --enable-default-pie:

$ gcc -E -dM  - < /dev/null 2>&1|grep -i 'pi[ce]'
#define __pie__ 2
#define __PIE__ 2
#define __pic__ 2
#define __PIC__ 2

$ gcc -E -dM -fPIC - < /dev/null 2>&1|grep -i 'pi[ce]'
#define __pic__ 2
#define __PIC__ 2

$ gcc -E -dM -fPIC -fno-pie - < /dev/null 2>&1|grep -i 'pi[ce]'
$ gcc -E -dM -fno-pie -fPIC - < /dev/null 2>&1|grep -i 'pi[ce]'
#define __pic__ 2
#define __PIC__ 2

I would expect the last behavior to be the default one.

[PATCH][AArch64] Align FP callee-saves

2016-09-08 Thread Wilco Dijkstra
If the number of integer callee-saves is odd, the FP callee-saves use 8-byte 
aligned
LDP/STP.  Since 16-byte alignment may be faster on some CPUs, align the FP
callee-saves to 16 bytes and use the alignment gap for the last FP callee-save 
when
possible. Besides slightly different offsets for FP callee-saves, the generated 
code
doesn't change.

Bootstrap and regression pass, OK for commit?


ChangeLog:
2016-09-08  Wilco Dijkstra  

* config/aarch64/aarch64.c (aarch64_layout_frame):
Align FP callee-saves.
--
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
fed3b6e803821392194dc34a6c3df5f653d2e33e..075b3802c72a68f63b47574e19186e7ce3440b28
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -2735,7 +2735,7 @@ static void
 aarch64_layout_frame (void)
 {
   HOST_WIDE_INT offset = 0;
-  int regno;
+  int regno, last_fp_reg = INVALID_REGNUM;
 
   if (reload_completed && cfun->machine->frame.laid_out)
 return;
@@ -2781,7 +2781,10 @@ aarch64_layout_frame (void)
   for (regno = V0_REGNUM; regno <= V31_REGNUM; regno++)
 if (df_regs_ever_live_p (regno)
&& !call_used_regs[regno])
-  cfun->machine->frame.reg_offset[regno] = SLOT_REQUIRED;
+  {
+   cfun->machine->frame.reg_offset[regno] = SLOT_REQUIRED;
+   last_fp_reg = regno;
+  }
 
   if (cfun->machine->frame.emit_frame_chain)
 {
@@ -2805,9 +2808,21 @@ aarch64_layout_frame (void)
offset += UNITS_PER_WORD;
   }
 
+  HOST_WIDE_INT max_int_offset = offset;
+  offset = ROUND_UP (offset, STACK_BOUNDARY / BITS_PER_UNIT);
+  bool has_align_gap = offset != max_int_offset;
+
   for (regno = V0_REGNUM; regno <= V31_REGNUM; regno++)
 if (cfun->machine->frame.reg_offset[regno] == SLOT_REQUIRED)
   {
+   /* If there is an alignment gap between integer and fp callee-saves,
+  allocate the last fp register to it if possible.  */
+   if (regno == last_fp_reg && has_align_gap && (offset & 8) == 0)
+ {
+   cfun->machine->frame.reg_offset[regno] = max_int_offset;
+   break;
+ }
+
cfun->machine->frame.reg_offset[regno] = offset;
if (cfun->machine->frame.wb_candidate1 == INVALID_REGNUM)
  cfun->machine->frame.wb_candidate1 = regno;



Re: [patch, libgomp, OpenACC] Additional enter/exit data map handling

2016-09-08 Thread Thomas Schwinge
Hi!

On Thu, 8 Sep 2016 19:18:30 +0800, Chung-Lin Tang  
wrote:
> On 2016/9/6 8:11 PM, Thomas Schwinge wrote:
> > On Mon, 29 Aug 2016 15:46:47 +0800, Chung-Lin Tang 
> >  wrote:
> >> this patch is a port of some changes from gomp-4_0-branch,
> >> including adding additional map type handling in OpenACC enter/exit data
> >> directives, and some pointer set handling changes. Updated
> >> testsuite case are also included.
> >>
> >> Tested on trunk to ensure no regressions, is this okay for trunk?
> > 
> >> 2016-08-29  Cesar Philippidis  
> >> Thomas Schwinge  
> >> Chung-Lin Tang  
> > 
> > Maybe I'm misremembering, but I can't remember having been involved in
> > this.  ;-)
> 
> A part of this was picked from r223178, which you committed to 
> gomp-4_0-branch.

Heh, right, though that was a commit containing "Assorted OpenACC
changes", so merging various changes from our internal development
branch, done by several people.  Anyway, nothing to waste much time on.
;-)


> >> +/* Returns the number of mappings associated with the pointer or pset. 
> >> PSET
> >> +   have three mappings, whereas pointer have two.  */
> >> +
> >>  static int
> >> -find_pset (int pos, size_t mapnum, unsigned short *kinds)
> >> +find_pointer (int pos, size_t mapnum, unsigned short *kinds)
> >>  {
> >>if (pos + 1 >= mapnum)
> >>  return 0;
> >>  
> >>unsigned char kind = kinds[pos+1] & 0xff;
> >>  
> >> -  return kind == GOMP_MAP_TO_PSET;
> >> +  if (kind == GOMP_MAP_TO_PSET)
> >> +return 3;
> >> +  else if (kind == GOMP_MAP_POINTER)
> >> +return 2;
> >> +
> >> +  return 0;
> >>  }
> > 
> > I'm still confused about that find_pset/find_pointer handling.  Why is
> > that required?  Essentially, that means that GOACC_enter_exit_data is
> > skipping over some mappings, right?  If yes, why do the front ends
> > (Fortran only?) then emit these mappings to begin with, if we're then
> > ignoring them in the runtime?
> 
> It's not skipping mappings. GOMP_MAP_PSET uses 3 continuous entries while
> GOMP_MAP_POINTER uses 2, see how these are eventually processed together
> in gomp_map_vars().

I now see how for the "pointer != 0" case, *the address of*
"hostaddrs[i]" etc. is passed to gomp_acc_insert_pointer, which then
calls gomp_map_vars.  So, you're (or more precisely, those who once
committed these changes to our internal development branch) indeed just
extend the existing GOMP_MAP_TO_PSET handling to also cover
GOMP_MAP_POINTER.  This code still doesn't look very pretty generally,
but that's not your task to fix, right now.


Thus, your patch is back in the queue, waiting for approval.


Grüße
 Thomas


[Bug tree-optimization/65068] Improve rewriting for address type induction variables in IVOPT

2016-09-08 Thread wdijkstr at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65068

Wilco  changed:

   What|Removed |Added

 CC||wdijkstr at arm dot com

--- Comment #3 from Wilco  ---
Note gcc.target/aarch64/ldp_vec_64_1.c shows the same issue. Same story there,
most cores produce inefficient code, only -mcpu=vulcan gets close:

foo:
add x1, x1, 8
add x2, x0, 8192
.p2align 3
.L2:
ldr d0, [x1, -8]
ldr d1, [x1], 16
add v0.2s, v0.2s, v1.2s
str d0, [x0], 16
cmp x2, x0
bne .L2
ret

That still shows an odd pre-increment in the loop header, which blocks the use
of ldp.

Also in all cases aarch64_legitimize_address is called with offset 32760 on
V2SI - this offset does not occur anywhere in the source, so it should not
matter how we split it. However what we do for that case affects IVOpt, which
is clearly a bug.

[Bug libstdc++/77528] std::queue default constructor unnecessarily creates temporary of underlying Container

2016-09-08 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77528

Jonathan Wakely  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-09-08
 Ever confirmed|0   |1

--- Comment #1 from Jonathan Wakely  ---
Untested fix:

--- a/libstdc++-v3/include/bits/stl_queue.h
+++ b/libstdc++-v3/include/bits/stl_queue.h
@@ -143,12 +143,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   queue(const _Sequence& __c = _Sequence())
   : c(__c) { }
 #else
+  template::value>::type>
+   queue()
+   : c() { }
+
   explicit
   queue(const _Sequence& __c)
   : c(__c) { }

   explicit
-  queue(_Sequence&& __c = _Sequence())
+  queue(_Sequence&& __c)
   : c(std::move(__c)) { }

   template>

[PATCH] -fsanitize=thread fixes (PR sanitizer/68260, take 2)

2016-09-08 Thread Jakub Jelinek
On Thu, Sep 08, 2016 at 02:34:18PM +0200, Jakub Jelinek wrote:
> I can split the patch into two, one dealing just with the
> __atomic_clear/__atomic_test_and_set instrumentation and another for the
> preexisting -fnon-call-exceptions ICEs.  For the latter, one possibility
> would be to error out on the -fsanitize=thread -fnon-call-exceptions
> combination.

Here is just the hopefully non-controversial first split part.
Ok for trunk?

2016-09-08  Jakub Jelinek  

PR sanitizer/68260
* tsan.c: Include target.h.
(enum tsan_atomic_action): Add bool_clear and bool_test_and_set.
(BOOL_CLEAR, BOOL_TEST_AND_SET): Define.
(tsan_atomic_table): Add BUILT_IN_ATOMIC_CLEAR and
BUILT_IN_ATOMIC_TEST_AND_SET entries.
(instrument_builtin_call): Handle bool_clear and bool_test_and_set.

* c-c++-common/tsan/pr68260.c: New test.

--- gcc/tsan.c.jj   2016-07-22 15:55:31.591287885 +0200
+++ gcc/tsan.c  2016-09-08 14:46:08.645027959 +0200
@@ -40,6 +40,7 @@ along with GCC; see the file COPYING3.
 #include "tsan.h"
 #include "asan.h"
 #include "builtins.h"
+#include "target.h"
 
 /* Number of instrumented memory accesses in the current function.  */
 
@@ -240,7 +241,8 @@ instrument_expr (gimple_stmt_iterator gs
 enum tsan_atomic_action
 {
   check_last, add_seq_cst, add_acquire, weak_cas, strong_cas,
-  bool_cas, val_cas, lock_release, fetch_op, fetch_op_seq_cst
+  bool_cas, val_cas, lock_release, fetch_op, fetch_op_seq_cst,
+  bool_clear, bool_test_and_set
 };
 
 /* Table how to map sync/atomic builtins to their corresponding
@@ -274,6 +276,10 @@ static const struct tsan_map_atomic
   TRANSFORM (fcode, tsan_fcode, fetch_op, code)
 #define FETCH_OPS(fcode, tsan_fcode, code) \
   TRANSFORM (fcode, tsan_fcode, fetch_op_seq_cst, code)
+#define BOOL_CLEAR(fcode, tsan_fcode) \
+  TRANSFORM (fcode, tsan_fcode, bool_clear, ERROR_MARK)
+#define BOOL_TEST_AND_SET(fcode, tsan_fcode) \
+  TRANSFORM (fcode, tsan_fcode, bool_test_and_set, ERROR_MARK)
 
   CHECK_LAST (ATOMIC_LOAD_1, TSAN_ATOMIC8_LOAD),
   CHECK_LAST (ATOMIC_LOAD_2, TSAN_ATOMIC16_LOAD),
@@ -463,7 +469,11 @@ static const struct tsan_map_atomic
   LOCK_RELEASE (SYNC_LOCK_RELEASE_2, TSAN_ATOMIC16_STORE),
   LOCK_RELEASE (SYNC_LOCK_RELEASE_4, TSAN_ATOMIC32_STORE),
   LOCK_RELEASE (SYNC_LOCK_RELEASE_8, TSAN_ATOMIC64_STORE),
-  LOCK_RELEASE (SYNC_LOCK_RELEASE_16, TSAN_ATOMIC128_STORE)
+  LOCK_RELEASE (SYNC_LOCK_RELEASE_16, TSAN_ATOMIC128_STORE),
+
+  BOOL_CLEAR (ATOMIC_CLEAR, TSAN_ATOMIC8_STORE),
+
+  BOOL_TEST_AND_SET (ATOMIC_TEST_AND_SET, TSAN_ATOMIC8_EXCHANGE)
 };
 
 /* Instrument an atomic builtin.  */
@@ -615,6 +625,57 @@ instrument_builtin_call (gimple_stmt_ite
build_int_cst (NULL_TREE,
   MEMMODEL_RELEASE));
return;
+ case bool_clear:
+ case bool_test_and_set:
+   if (BOOL_TYPE_SIZE != 8)
+ {
+   decl = NULL_TREE;
+   for (j = 1; j < 5; j++)
+ if (BOOL_TYPE_SIZE == (8 << j))
+   {
+ enum built_in_function tsan_fcode
+   = (enum built_in_function)
+ (tsan_atomic_table[i].tsan_fcode + j);
+ decl = builtin_decl_implicit (tsan_fcode);
+ break;
+   }
+   if (decl == NULL_TREE)
+ return;
+ }
+   last_arg = gimple_call_arg (stmt, num - 1);
+   if (!tree_fits_uhwi_p (last_arg)
+   || memmodel_base (tree_to_uhwi (last_arg)) >= MEMMODEL_LAST)
+ return;
+   t = TYPE_ARG_TYPES (TREE_TYPE (decl));
+   t = TREE_VALUE (TREE_CHAIN (t));
+   if (tsan_atomic_table[i].action == bool_clear)
+ {
+   update_gimple_call (gsi, decl, 3, gimple_call_arg (stmt, 0),
+   build_int_cst (t, 0), last_arg);
+   return;
+ }
+   t = build_int_cst (t, targetm.atomic_test_and_set_trueval);
+   update_gimple_call (gsi, decl, 3, gimple_call_arg (stmt, 0),
+   t, last_arg);
+   stmt = gsi_stmt (*gsi);
+   lhs = gimple_call_lhs (stmt);
+   if (lhs == NULL_TREE)
+ return;
+   if (targetm.atomic_test_and_set_trueval != 1
+   || !useless_type_conversion_p (TREE_TYPE (lhs),
+  TREE_TYPE (t)))
+ {
+   tree new_lhs = make_ssa_name (TREE_TYPE (t));
+   gimple_call_set_lhs (stmt, new_lhs);
+   if (targetm.atomic_test_and_set_trueval != 1)
+ g = gimple_build_assign (lhs, NE_EXPR, new_lhs,
+  build_int_cst (TREE_TYPE (t), 0));
+   else
+ g = gimple_build_assign (lhs, NOP_EXPR, new_lhs);
+

[Bug libstdc++/77524] Empty std::deque allocates unnecessary heap memory

2016-09-08 Thread d.rigby at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77524

--- Comment #5 from Dave Rigby  ---
(In reply to Jonathan Wakely from comment #4)
> Please create a new bug for those container adaptors, as that can and should
> be fixed for the default configuration. We'll keep this bug for the
> non-backwards-compatible std::deque change to be done at some future date.
> Thanks.

Done - bug 77528

[Bug libstdc++/77528] New: std::queue default constructor unnecessarily creates temporary of underlying Container

2016-09-08 Thread d.rigby at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77528

Bug ID: 77528
   Summary: std::queue default constructor unnecessarily creates
temporary of underlying Container
   Product: gcc
   Version: 6.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: d.rigby at me dot com
  Target Milestone: ---

Spun out from bug 77524 - std::queue (defaulting to using std::deque as the
underlying Container) performs 4 allocations for an empty std::queue.
Initial analysis from Jonathan Wakely (bug 77524 comment 4):


Ouch. It's caused by the same issue, but the default constructor for std::queue
constructs a temporary of the underlying sequence, then move constructs the
data member from the temporary. With std::deque that means we allocate for the
temporary, then allocate again for the data member, then deallocate the memory
of the temporary. That sucks.

  explicit
  queue(_Sequence&& __c = _Sequence())
  : c(std::move(__c)) { }

 IIRC the "default" constructor is defined that way so that it's possible to
explicitly instantiate std::queue<_Sequence> with a non-DefaultConstructible 
_Sequence. But it means std::queue is horribly inefficient. At
least for an empty std::deque the initial allocation is probably going to
end up being useful, because in most cases the deque doesn't stay empty and the
allocation ends up being useful later. In this std::queue constructor the
temporary is definitely never used (it goes out of scope immediately) so
pre-allocating is entirely useless. std::stack has the same issue.

Please create a new bug for those container adaptors, as that can and should be
fixed for the default configuration. We'll keep this bug for the
non-backwards-compatible std::deque change to be done at some future date.
Thanks.


Re: [PATCH] -fsanitize=thread fixes (PR sanitizer/68260)

2016-09-08 Thread Jakub Jelinek
On Wed, Sep 07, 2016 at 10:40:20PM +0200, Jakub Jelinek wrote:
> On Wed, Sep 07, 2016 at 09:07:45AM +0200, Richard Biener wrote:
> > > @@ -493,6 +504,8 @@ instrument_builtin_call (gimple_stmt_ite
> > >   if (!tree_fits_uhwi_p (last_arg)
> > >   || memmodel_base (tree_to_uhwi (last_arg)) >= MEMMODEL_LAST)
> > > return;
> > > + if (lookup_stmt_eh_lp (stmt))
> > > +   remove_stmt_from_eh_lp (stmt);
> > 
> > These changes look bogus to me -- how can the tsan instrumentation
> > function _not_ throw when the original function we replace it can?
> 
> The __tsan*atomic* functions are right now declared to be nothrow, so the
> patch just matches how they are declared.
> While the sync-builtins.def use
> #define ATTR_NOTHROWCALL_LEAF_LIST (flag_non_call_exceptions ? \
> ATTR_LEAF_LIST : ATTR_NOTHROW_LEAF_LIST)
> I guess I could use the same for the tsan atomics, but wonder if it will work
> properly when the libtsan is built with exceptions disabled and without
> -fnon-call-exceptions.  Perhaps it would, at least if it is built with
> -fasynchronous-unwind-tables (which is the case for x86_64 and aarch64 and
> tsan isn't supported elsewhere).

Actually, looking at the libsanitizer/tsan/ implementation, I have doubts
the atomics handling code would behave reasonably if an exception is thrown
during the atomic memory access, I'm afraid some classes wouldn't be
destructed.

I can split the patch into two, one dealing just with the
__atomic_clear/__atomic_test_and_set instrumentation and another for the
preexisting -fnon-call-exceptions ICEs.  For the latter, one possibility
would be to error out on the -fsanitize=thread -fnon-call-exceptions
combination.

Jakub


Re: Make max_align_t respect _Float128 [version 2]

2016-09-08 Thread Florian Weimer

On 09/08/2016 02:26 PM, Bernd Schmidt wrote:

On 09/08/2016 01:21 AM, Paul Eggert wrote:


Sure, attached. On Fedora 24 x86-64 (GCC 6.1.1 20160621, valgrind
3.11.0), when I compile with "gcc -O2 flexouch.c" and run with "valgrind
./a.out", valgrind complains "Invalid read of size 2". This is because
GCC compiles "p->d[0] == 2 && p->d[1] == 3" into "cmpw $770, 8(%rax);
sete %al", which loads the uninitialized byte p->d[1] simultaneously
with the initialized byte p->d[0].


Interesting. That optimization doesn't really depend on d being a
flexible array, so you can also reproduce a (different) valgrind warning
with the following:

#include 
#include 

struct s { int x; char d[2]; };

__attribute__((noinline,noclone)) void foo (struct s *p)
{
  p->d[0] = 1;
}

int
main (void)
{
  struct s *p = malloc (sizeof *p);
  foo (p);
  return p->d[0] == 2 && p->d[1] == 3;
}


Very interesting.  So the ASan failure reported for gnulib fts and this 
valgrind issue have separate causes (ASan does not care about undefined 
memory).


Thanks,
Florian



[Bug libstdc++/77524] Empty std::deque allocates unnecessary heap memory

2016-09-08 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77524

--- Comment #4 from Jonathan Wakely  ---
(In reply to Dave Rigby from comment #3)
> As an aside, std::queue<> (defaulting to using std::deque as the underlying
> Container) suffers from a similar issue - I see 4 allocations for an empty
> std::queue. Is this the same underlying issue, or should I raise an
> additional bug on that?

Ouch. It's caused by the same issue, but the default constructor for std::queue
constructs a temporary of the underlying sequence, then move constructs the
data member from the temporary. With std::deque that means we allocate for the
temporary, then allocate again for the data member, then deallocate the memory
of the temporary. That sucks.

  explicit
  queue(_Sequence&& __c = _Sequence())
  : c(std::move(__c)) { }

IIRC the "default" constructor is defined that way so that it's possible to
explicitly instantiate std::queue<_Sequence> with a non-DefaultConstructible
_Sequence. But it means std::queue is horribly inefficient. At
least for an empty std::deque the initial allocation is probably going to
end up being useful, because in most cases the deque doesn't stay empty and the
allocation ends up being useful later. In this std::queue constructor the
temporary is definitely never used (it goes out of scope immediately) so
pre-allocating is entirely useless. std::stack has the same issue.

Please create a new bug for those container adaptors, as that can and should be
fixed for the default configuration. We'll keep this bug for the
non-backwards-compatible std::deque change to be done at some future date.
Thanks.

Re: Make max_align_t respect _Float128 [version 2]

2016-09-08 Thread Bernd Schmidt

On 09/08/2016 01:21 AM, Paul Eggert wrote:


Sure, attached. On Fedora 24 x86-64 (GCC 6.1.1 20160621, valgrind
3.11.0), when I compile with "gcc -O2 flexouch.c" and run with "valgrind
./a.out", valgrind complains "Invalid read of size 2". This is because
GCC compiles "p->d[0] == 2 && p->d[1] == 3" into "cmpw $770, 8(%rax);
sete %al", which loads the uninitialized byte p->d[1] simultaneously
with the initialized byte p->d[0].


Interesting. That optimization doesn't really depend on d being a 
flexible array, so you can also reproduce a (different) valgrind warning 
with the following:


#include 
#include 

struct s { int x; char d[2]; };

__attribute__((noinline,noclone)) void foo (struct s *p)
{
  p->d[0] = 1;
}

int
main (void)
{
  struct s *p = malloc (sizeof *p);
  foo (p);
  return p->d[0] == 2 && p->d[1] == 3;
}


Bernd


[committed] Fix ICE during fortran !$omp atomic write expansion (PR fortran/77500)

2016-09-08 Thread Jakub Jelinek
Hi!

expr2 for atomic write or swap is not something we want to take appart, we
just want to expand it as an rvalue as is, so we shouldn't be skipping its
conversions.  And, for the other cases, the patch adds function.isym check
so that we don't ICE if there is a call to an external function rather than
internal.

Tested on x86_64-linux, committed to trunk so far.

2016-09-08  Jakub Jelinek  

PR fortran/77500
* trans-openmp.c (gfc_trans_omp_atomic): For atomic write or
swap, don't try to look through GFC_ISYM_CONVERSION.  In other cases,
check that value.function.isym is non-NULL before dereferencing it.

* gfortran.dg/gomp/pr77500.f90: New test.

--- gcc/fortran/trans-openmp.c.jj   2016-09-06 20:00:37.0 +0200
+++ gcc/fortran/trans-openmp.c  2016-09-08 12:24:13.112540373 +0200
@@ -2818,7 +2818,11 @@ gfc_trans_omp_atomic (gfc_code *code)
   gfc_start_block ();
 
   expr2 = code->expr2;
-  if (expr2->expr_type == EXPR_FUNCTION
+  if (((atomic_code->ext.omp_atomic & GFC_OMP_ATOMIC_MASK)
+   != GFC_OMP_ATOMIC_WRITE)
+  && (atomic_code->ext.omp_atomic & GFC_OMP_ATOMIC_SWAP) == 0
+  && expr2->expr_type == EXPR_FUNCTION
+  && expr2->value.function.isym
   && expr2->value.function.isym->id == GFC_ISYM_CONVERSION)
 expr2 = expr2->value.function.actual->expr;
 
@@ -2857,6 +2861,7 @@ gfc_trans_omp_atomic (gfc_code *code)
  var = code->expr1->symtree->n.sym;
  expr2 = code->expr2;
  if (expr2->expr_type == EXPR_FUNCTION
+ && expr2->value.function.isym
  && expr2->value.function.isym->id == GFC_ISYM_CONVERSION)
expr2 = expr2->value.function.actual->expr;
}
@@ -2914,6 +2919,7 @@ gfc_trans_omp_atomic (gfc_code *code)
}
   e = expr2->value.op.op1;
   if (e->expr_type == EXPR_FUNCTION
+ && e->value.function.isym
  && e->value.function.isym->id == GFC_ISYM_CONVERSION)
e = e->value.function.actual->expr;
   if (e->expr_type == EXPR_VARIABLE
@@ -2927,6 +2933,7 @@ gfc_trans_omp_atomic (gfc_code *code)
{
  e = expr2->value.op.op2;
  if (e->expr_type == EXPR_FUNCTION
+ && e->value.function.isym
  && e->value.function.isym->id == GFC_ISYM_CONVERSION)
e = e->value.function.actual->expr;
  gcc_assert (e->expr_type == EXPR_VARIABLE
@@ -3041,6 +3048,7 @@ gfc_trans_omp_atomic (gfc_code *code)
  code = code->next;
  expr2 = code->expr2;
  if (expr2->expr_type == EXPR_FUNCTION
+ && expr2->value.function.isym
  && expr2->value.function.isym->id == GFC_ISYM_CONVERSION)
expr2 = expr2->value.function.actual->expr;
 
--- gcc/testsuite/gfortran.dg/gomp/pr77500.f90.jj   2016-09-08 
12:36:11.252253375 +0200
+++ gcc/testsuite/gfortran.dg/gomp/pr77500.f90  2016-09-08 12:30:50.0 
+0200
@@ -0,0 +1,9 @@
+! PR fortran/77500
+! { dg-do compile }
+
+program pr77500
+   real :: x
+!$omp atomic write
+   x = f()
+!$omp end atomic
+end

Jakub


[committed] Fix ICE with safelen(0) in fortran (PR fortran/77516)

2016-09-08 Thread Jakub Jelinek
Hi!

This patch fixes ICE on safelen(0).  Not adding a warning for this in the
FE, as it has been added already on gomp-4_5-branch, so when it is merged,
the testcase will just need to be adjusted for the added warning.

Tested on x86_64-linux, committed to trunk so far.

2016-09-08  Jakub Jelinek  

PR fortran/77516
* omp-low.c (lower_rec_simd_input_clauses): Use max_vf for non-positive
OMP_CLAUSE_SAFELEN_EXPR.

* gfortran.dg/gomp/pr77516.f90: New test.

--- gcc/omp-low.c.jj2016-09-06 20:00:33.0 +0200
+++ gcc/omp-low.c   2016-09-08 11:35:42.960174529 +0200
@@ -4302,7 +4302,9 @@ lower_rec_simd_input_clauses (tree new_v
{
  tree c = find_omp_clause (gimple_omp_for_clauses (ctx->stmt),
OMP_CLAUSE_SAFELEN);
- if (c && TREE_CODE (OMP_CLAUSE_SAFELEN_EXPR (c)) != INTEGER_CST)
+ if (c
+ && (TREE_CODE (OMP_CLAUSE_SAFELEN_EXPR (c)) != INTEGER_CST
+ || tree_int_cst_sgn (OMP_CLAUSE_SAFELEN_EXPR (c)) != 1))
max_vf = 1;
  else if (c && compare_tree_int (OMP_CLAUSE_SAFELEN_EXPR (c),
  max_vf) == -1)
--- gcc/testsuite/gfortran.dg/gomp/pr77516.f90.jj   2016-09-08 
12:06:07.363581303 +0200
+++ gcc/testsuite/gfortran.dg/gomp/pr77516.f90  2016-09-08 11:59:33.0 
+0200
@@ -0,0 +1,12 @@
+! PR fortran/77516
+! { dg-do compile }
+
+program pr77516
+   integer :: i, x
+   x = 0
+!$omp simd safelen(0) reduction(+:x)
+   do i = 1, 8
+  x = x + 1
+   end do
+   print *, x
+end

Jakub


  1   2   >