[v3] __once_proxy(void) workaround for implicit extern "C" systems

2011-11-06 Thread Jonathan Wakely
This is necessary for systems that treat all system headers as
implicitly extern "C" (at least AIX and OpenBSD) because otherwise the
empty parameter list is treated as declaring void __once_proxy(...).

This has come up recently because we are now enabling  etc. on
additional platforms.
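
To illustrate the point, here is a minimal sketch showing that in standard C++ the two spellings declare the same function type, so switching to (void) costs nothing on conforming systems while staying unambiguous where headers are treated as implicitly extern "C". The names demo_proxy_a and demo_proxy_b are hypothetical and are not part of libstdc++.

```cpp
#include <type_traits>

// Hypothetical stand-ins for __once_proxy; not part of libstdc++.
extern "C" void demo_proxy_a();      // empty parameter list
extern "C" void demo_proxy_b(void);  // explicit void parameter list

// In standard C++ both declarations have the same type: a function
// taking no arguments and returning void.
constexpr bool same_proxy_type =
    std::is_same<decltype(demo_proxy_a), decltype(demo_proxy_b)>::value;

static_assert(same_proxy_type, "() and (void) are identical in C++");
```

The static_assert only checks the C++ side; the variadic interpretation described above is specific to the affected systems and is not reproduced here.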

PR libstdc++/50982
* include/std/mutex (__once_proxy): Use void parameter list to
work on implicit extern "C" systems.

Tested x86_64-linux and with reports of this solving bootstrap
failures on AIX and OpenBSD, committed to trunk.

Index: include/std/mutex
===
--- include/std/mutex   (revision 181054)
+++ include/std/mutex   (working copy)
@@ -796,7 +796,7 @@
   __get_once_mutex();
 #endif

-  extern "C" void __once_proxy();
+  extern "C" void __once_proxy(void);

   /// call_once
   template


vector garbage collected while still in use

2011-11-06 Thread Xinliang David Li
I have seen a compiler build error (segmentation fault) in libstdc++-v3.
It turns out that a vector allocated in gc memory is collected before the
vector is released. The gc call comes from a call to synthesize_method
from cp_finish_decl.

The following patch fixes the problem. Compiler bootstraps and tested
on linux/x86-64. Ok for trunk (or better fix suggested)?
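
The idea behind the fix can be sketched with a toy mark-and-sweep collector. This is only an illustration of why a registered root keeps an object alive. It is not GCC's ggc, and ToyHeap, kNoObject and survives_collection are hypothetical names.

```cpp
#include <cassert>   // for the checks mentioned in the note below
#include <cstddef>
#include <vector>

// Toy collector: NOT GCC's ggc, only an illustration of GC roots.
const std::size_t kNoObject = static_cast<std::size_t>(-1);

struct ToyHeap {
  struct Obj { bool marked; bool live; };
  std::vector<Obj> objs;            // every allocated object
  std::vector<std::size_t*> roots;  // addresses of registered root slots

  std::size_t alloc() {
    objs.push_back(Obj{false, true});
    return objs.size() - 1;
  }
  void add_root(std::size_t* slot) { roots.push_back(slot); }

  // Mark objects referenced from a registered root, then sweep the rest.
  void collect() {
    for (Obj& o : objs) o.marked = false;
    for (std::size_t* slot : roots)
      if (*slot != kNoObject) objs[*slot].marked = true;
    for (Obj& o : objs)
      if (!o.marked) o.live = false;
  }
  bool is_live(std::size_t id) const { return objs[id].live; }
};

// A handle held only in a local variable is invisible to the collector;
// storing it in a registered static slot keeps the object alive.
bool survives_collection(bool register_root) {
  static std::size_t root_slot;      // plays the role of cleanups_vec
  ToyHeap heap;
  root_slot = kNoObject;
  heap.add_root(&root_slot);
  std::size_t local = heap.alloc();  // plays the role of 'cleanups'
  if (register_root) root_slot = local;
  heap.collect();                    // a collection in mid-function
  root_slot = kNoObject;             // like 'cleanups_vec = NULL'
  return heap.is_live(local);
}
```

In this sketch, assert(survives_collection(true)) holds, while survives_collection(false) returns false because nothing the collector knows about refers to the object.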

thanks,

David

2011-11-05  Xinliang David Li  

* cp/decl.c (cp_finish_decl): Prevent cleanups from
being garbage collected before being released.


Index: cp/decl.c
===
--- cp/decl.c   (revision 181013)
+++ cp/decl.c   (working copy)
@@ -5902,6 +5902,8 @@ value_dependent_init_p (tree init)
FLAGS is LOOKUP_ONLYCONVERTING if the = init syntax was used, else 0
if the (init) syntax was used.  */

+static GTY (()) VEC(tree,gc) *cleanups_vec;
+
 void
 cp_finish_decl (tree decl, tree init, bool init_const_expr_p,
tree asmspec_tree, int flags)
@@ -5914,6 +5916,7 @@ cp_finish_decl (tree decl, tree init, bo
   bool var_definition_p = false;
   tree auto_node;

+  cleanups_vec = cleanups;
   if (decl == error_mark_node)
 return;
   else if (! decl)
@@ -6319,6 +6322,7 @@ cp_finish_decl (tree decl, tree init, bo
   FOR_EACH_VEC_ELT (tree, cleanups, i, t)
 push_cleanup (decl, t, false);
   release_tree_vector (cleanups);
+  cleanups_vec = NULL;

   if (was_readonly)
 TREE_READONLY (decl) = 1;


Re: [Patch, Fortran, OOP] PR 50919: Don't use vtable for NON_OVERRIDABLE TBP

2011-11-06 Thread Paul Richard Thomas
Dear Janus,

On Mon, Nov 7, 2011 at 12:14 AM, Janus Weil  wrote:

> The patch actually consists of two parts:
> 1) The resolve.c part prevents the conversion to a PPC call via the
> _vptr (for functions and subroutines).

This is obviously OK

> 2) The class.c part prevents adding the non-overridable TBP to the vtable.
>
> As noted by Tobias, the second part breaks the ABI, so we might
> consider deferring it until other ABI-breaking features will be
> implemented (cf. http://gcc.gnu.org/wiki/LibgfortranAbiCleanup). On
> the other hand, one could argue that the OOP ABI is still quite young
> and hasn't really stabilized yet (it was broken already from 4.5 to
> 4.6), so we might as well break it again. I know that there are a
> couple of real-world codes out there, which make use of gfortran's OOP
> features already, but I have a hard time estimating how many such
> projects exist, or how problematic an ABI break would be for them
> (user input welcome).

Do we need to exclude it from the _vtable?  I have to confess that,
although I tried, I could not think of any reason to exclude it.  On
the other hand, I could not see any great harm in retaining a pointer,
albeit a redundant one.

>
> So, the question is: Should I commit both parts, or only the resolve.c
> one for now? The patch was regtested on x86_64-unknown-linux-gnu.

In spite of the above remark, I think that you should commit both
parts.  Perhaps, until just before the 4.7 release, a warning should be
triggered saying that pre-existing code containing non-overridable
procedures should be recompiled before linking?

OK for trunk.
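
The two parts discussed above can be pictured with a hand-written dispatch table. This is a loose C++ analogy under stated assumptions, not gfortran's real _vtab layout, and every name in it is hypothetical.

```cpp
#include <cassert>   // for the checks mentioned in the note below

// Hand-written dispatch tables; a loose analogy, not gfortran's layout.
struct Shape;
struct VtabWithSlot    { int (*area)(const Shape*); int (*id)(const Shape*); };
struct VtabWithoutSlot { int (*area)(const Shape*); };

struct Shape { const VtabWithSlot* vptr; int side; };

int square_area(const Shape* s) { return s->side * s->side; }
int shape_id(const Shape*) { return 7; }  // the "non-overridable" procedure

const VtabWithSlot square_vtab = { square_area, shape_id };

// Overridable procedure: the call has to go through the table.
int call_area(const Shape* s) { return s->vptr->area(s); }

// Non-overridable procedure: the target is known at compile time, so it
// can be called directly (part 1).  A retained table slot is redundant.
int call_id(const Shape* s) { return shape_id(s); }

// Dropping the slot (part 2) changes the table layout, which is where
// the ABI concern comes from.
static_assert(sizeof(VtabWithSlot) != sizeof(VtabWithoutSlot),
              "removing a slot changes the table layout");
```

With Shape sq = { &square_vtab, 3 }, call_area(&sq) yields 9 and call_id(&sq) yields 7 whether or not the id slot stays in the table.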

Thanks for the patch

Paul


Re: [PATCH] Add an intermediate coverage format for gcov

2011-11-06 Thread Sharad Singhai
Sorry about the delay. I have updated the patch to output demangled
names under a new option (-m) and added a test case. Okay for trunk?
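
As a rough idea of how a tool might consume the intermediate format documented in the patch below, here is a minimal sketch of a reader for the four record tags (file, function, lcount, branch). CoverageSummary and summarize_intermediate are hypothetical names, and the sketch assumes one tag:value record per line as the documentation describes.

```cpp
#include <cassert>   // for the checks mentioned in the note below
#include <sstream>
#include <string>
#include <vector>

// Hypothetical summary of one intermediate .gcov file.
struct CoverageSummary {
  std::vector<std::string> files;
  int functions = 0;
  int lines = 0;
  int branches_taken = 0;
};

CoverageSummary summarize_intermediate(const std::string& text) {
  CoverageSummary out;
  std::istringstream in(text);
  std::string line;
  while (std::getline(in, line)) {
    std::string::size_type colon = line.find(':');
    if (colon == std::string::npos) continue;  // not a tag:value record
    std::string tag = line.substr(0, colon);
    std::string rest = line.substr(colon + 1);
    if (tag == "file") {
      out.files.push_back(rest);
    } else if (tag == "function") {
      ++out.functions;
    } else if (tag == "lcount") {
      ++out.lines;
    } else if (tag == "branch") {
      // rest is "<line_number>,<branch_coverage_type>"
      std::string::size_type comma = rest.find(',');
      if (comma != std::string::npos && rest.substr(comma + 1) == "taken")
        ++out.branches_taken;
    }
  }
  return out;
}
```

Fed the array.cc sample from the documentation, this sketch reports one file, two functions, four lcount records and one taken branch.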

Sharad

2011-11-06   Sharad Singhai  

* doc/gcov.texi: Document gcov intermediate format.
* gcov.c (print_usage): Handle new option.
(process_args): Handle new option.
(get_gcov_file_intermediate_name): New function.
(output_intermediate_file): New function.
(generate_results): Handle new option.
* testsuite/lib/gcov.exp: Handle intermediate format.
* testsuite/g++.dg/gcov/gcov-8.C: New testcase.
* testsuite/g++.dg/gcov/gcov-9.C: New testcase.

Index: doc/gcov.texi
===
--- doc/gcov.texi   (revision 179873)
+++ doc/gcov.texi   (working copy)
@@ -131,6 +131,8 @@ gcov [@option{-v}|@option{--version}] [@
  [@option{-o}|@option{--object-directory} @var{directory|file}] @var{sourcefiles}
  [@option{-u}|@option{--unconditional-branches}]
  [@option{-d}|@option{--display-progress}]
+ [@option{-i}|@option{--intermediate-format}]
+ [@option{-m}|@option{--demangled-names}]
 @c man end
 @c man begin SEEALSO
 gpl(7), gfdl(7), fsf-funding(7), gcc(1) and the Info entry for @file{gcc}.
@@ -216,6 +218,49 @@ Unconditional branches are normally not
 @itemx --display-progress
 Display the progress on the standard output.

+@item -i
+@itemx --intermediate-format
+Output gcov file in an easy-to-parse intermediate text format that can
+be used by @command{lcov} or other tools. The output is a single
+@file{.gcov} file per @file{.gcda} file. No source code is required.
+
+The format of the intermediate @file{.gcov} file is plain text with
+one entry per line
+
+@smallexample
+file:@var{source_file_name}
+function:@var{line_number},@var{execution_count},@var{function_name}
+lcount:@var{line_number},@var{execution_count}
+branch:@var{line_number},@var{branch_coverage_type}
+
+Where the @var{branch_coverage_type} is
+   notexec (Branch not executed)
+   taken (Branch executed and taken)
+   nottaken (Branch executed, but not taken)
+
+There can be multiple @var{file} entries in an intermediate gcov
+file. All entries following a @var{file} pertain to that source file
+until the next @var{file} entry.
+@end smallexample
+
+Here is a sample when @option{-i} is used in conjunction with the @option{-b} option:
+
+@smallexample
+file:array.cc
+function:11,1,_Z3sumRKSt6vectorIPiSaIS0_EE
+function:22,1,main
+lcount:11,1
+lcount:12,1
+lcount:14,1
+branch:14,taken
+lcount:26,1
+branch:28,nottaken
+@end smallexample
+
+@item -m
+@itemx --demangled-names
+Display demangled function names in output.
+
 @end table

 @command{gcov} should be run with the current directory the same as that
Index: gcov.c
===
--- gcov.c  (revision 179873)
+++ gcov.c  (working copy)
@@ -39,6 +39,7 @@ along with Gcov; see the file COPYING3.
 #include "intl.h"
 #include "diagnostic.h"
 #include "version.h"
+#include "demangle.h"

 #include 

@@ -168,6 +169,7 @@ typedef struct function_info
 {
   /* Name of function.  */
   char *name;
+  char *demangled_name;
   unsigned ident;
   unsigned lineno_checksum;
   unsigned cfg_checksum;
@@ -311,6 +313,14 @@ static int flag_gcov_file = 1;

 static int flag_display_progress = 0;

+/* Output *.gcov file in intermediate format used by 'lcov'.  */
+
+static int flag_intermediate_format = 0;
+
+/* Output demangled function names.  */
+
+static int flag_demangled_names = 0;
+
 /* For included files, make the gcov output file name include the name
of the input source file.  For example, if x.h is included in a.c,
then the output file name is a.c##x.h.gcov instead of x.h.gcov.  */
@@ -433,9 +443,11 @@ print_usage (int error_p)
   fnotice (file, "  -l, --long-file-names           Use long output file names for included\n\
                                     source files\n");
   fnotice (file, "  -f, --function-summaries        Output summaries for each function\n");
+  fnotice (file, "  -m, --demangled-names           Output demangled function names\n");
   fnotice (file, "  -o, --object-directory DIR|FILE Search for object files in DIR or called FILE\n");
   fnotice (file, "  -p, --preserve-paths            Preserve all pathname components\n");
   fnotice (file, "  -u, --unconditional-branches    Show unconditional branch counts too\n");
+  fnotice (file, "  -i, --intermediate-format       Output .gcov file in intermediate text format\n");
   fnotice (file, "  -d, --display-progress          Display progress information\n");
   fnotice (file, "\nFor bug reporting instructions, please see:\n%s.\n",
   bug_report_url);
@@ -467,11 +479,13 @@ static const struct option options[] =
   { "no-output",no_argument,   NULL, 'n' },
   { "long-file-names",  no_argument,   NULL, 'l' },
   { "function-summaries",   no_argument,   NULL, 'f'

Re: [patch] 19/n: trans-mem: middle end/misc patches (LAST PATCH)

2011-11-06 Thread Aldy Hernandez



False.  You get the equivalent of bootstrap comparison mismatches.
If we actually used tm during the bootstrap.

The simplest thing to do is to change the hash this table uses.
E.g. use the DECL_UID right from the start, rather than the pointer.


Woah!  Can it be that easy?  That's as easy as changing the hash, no 
conversion necessary.
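
A minimal sketch of the difference between the two hashes, assuming a hypothetical DeclSketch node whose uid field plays the role of DECL_UID. It is not the varasm.c code, only an illustration of why an identifier-based hash is stable while an address-based one is not.

```cpp
#include <cassert>   // for the checks mentioned in the note below
#include <cstdint>

// Hypothetical stand-in for a declaration node; 'uid' plays the role
// of DECL_UID.
struct DeclSketch { unsigned uid; };

// Address-based hash: depends on where the node happens to be
// allocated, so it can differ from one run to the next.
std::uintptr_t hash_by_address(const DeclSketch* d) {
  return reinterpret_cast<std::uintptr_t>(d);
}

// UID-based hash: depends only on the stable identifier.
unsigned hash_by_uid(const DeclSketch* d) { return d->uid; }
```

Two nodes with the same uid at different addresses hash identically under hash_by_uid but not under hash_by_address, which is the property that makes table order independent of memory layout.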


OK for branch?

* varasm.c (record_tm_clone_pair): Use DECL_UID as hash.
(get_tm_clone_pair): Same.

Index: varasm.c
===
--- varasm.c   (revision 181067)
+++ varasm.c   (working copy)
@@ -5875,7 +5875,7 @@ record_tm_clone_pair (tree o, tree n)
 tm_clone_pairs = htab_create_ggc (32, tree_map_hash, tree_map_eq, 0);

   h = ggc_alloc_tree_map ();
-  h->hash = htab_hash_pointer (o);
+  h->hash = DECL_UID (o);
   h->base.from = o;
   h->to = n;

@@ -5892,7 +5892,7 @@ get_tm_clone_pair (tree o)
   struct tree_map *h, in;

   in.base.from = o;
-  in.hash = htab_hash_pointer (o);
+  in.hash = DECL_UID (o);
   h = (struct tree_map *) htab_find_with_hash (tm_clone_pairs,
   &in, in.hash);
   if (h)


Re: Many testsuite failures on x86_64 due recent "fix" about f16cintrin.h header

2011-11-06 Thread Quentin Neill
On Sun, Nov 6, 2011 at 2:13 PM, Dominique Dhumieres  wrote:
> Following http://gcc.gnu.org/ml/gcc-patches/2011-10/msg02901.html, I have applied
> the following patch on x86_64-apple-darwin10
>
> --- ../_clean/gcc/config.gcc    2011-11-05 22:25:37.0 +0100
> +++ gcc/config.gcc      2011-11-06 12:35:57.0 +0100
> @@ -350,7 +350,7 @@ i[34567]86-*-*)
>                       immintrin.h x86intrin.h avxintrin.h xopintrin.h
>                       ia32intrin.h cross-stdarg.h lwpintrin.h popcntintrin.h
>                       lzcntintrin.h bmiintrin.h bmi2intrin.h tbmintrin.h
> -                      avx2intrin.h fmaintrin.h"
> +                      avx2intrin.h fmaintrin.h f16cintrin.h"
>        ;;
>  x86_64-*-*)
>        cpu_type=i386
> @@ -363,7 +363,7 @@ x86_64-*-*)
>                       immintrin.h x86intrin.h avxintrin.h xopintrin.h
>                       ia32intrin.h cross-stdarg.h lwpintrin.h popcntintrin.h
>                       lzcntintrin.h bmiintrin.h tbmintrin.h bmi2intrin.h
> -                      avx2intrin.h fmaintrin.h"
> +                      avx2intrin.h fmaintrin.h f16cintrin.h"
>        need_64bit_hwint=yes
>        ;;
>  ia64-*-*)
> --- ../_clean/gcc/config/i386/f16cintrin.h      2011-11-05 10:03:10.0 +0100
> +++ gcc/config/i386/f16cintrin.h        2011-11-06 16:55:05.0 +0100
> @@ -88,7 +88,8 @@ _mm256_cvtps_ph (__m256 __A, const int _
>
>  #define _mm256_cvtps_ph(A, I) \
>   ((__m128i) __builtin_ia32_vcvtps2ph256 ((__v8sf)(__m256) A, (int) (I)))
> -#endif
> +#endif /* __OPTIMIZE__ */
> +#endif /* _F16CINTRIN_H_INCLUDED */
>
>  #endif /* __F16C__ */
>  #endif
>
> (the second part fixes a missing endif). However, I still have most of the failures:
>
> FAIL: gcc.target/i386/sse-14.c
> Excess errors:
> /opt/gcc/work/gcc/testsuite/gcc.target/i386/sse-14.c:95:1: error: implicit declaration of function '_cvtss_sh' [-Werror=implicit-function-declaration]
> /opt/gcc/work/gcc/testsuite/gcc.target/i386/sse-14.c:96:1: error: implicit declaration of function '_mm_cvtps_ph' [-Werror=implicit-function-declaration]
> /opt/gcc/work/gcc/testsuite/gcc.target/i386/sse-14.c:96:1: error: incompatible types when returning type 'int' but '__m128i' was expected
> /opt/gcc/work/gcc/testsuite/gcc.target/i386/sse-14.c:97:1: error: implicit declaration of function '_mm256_cvtps_ph' [-Werror=implicit-function-declaration]
> /opt/gcc/work/gcc/testsuite/gcc.target/i386/sse-14.c:97:1: error: incompatible types when returning type 'int' but '__m128i' was expected
> ...
> FAIL: gcc.target/i386/testimm-1.c (test for excess errors)
> Excess errors:
> /opt/gcc/work/gcc/testsuite/gcc.target/i386/testimm-1.c:36:6: error: incompatible types when assigning to type '__m128i' from type 'int'
> /opt/gcc/work/gcc/testsuite/gcc.target/i386/testimm-1.c:45:6: error: incompatible types when assigning to type '__m128i' from type 'int'
>
> At this point I have no idea about how to fix those.
>
> Cheers,
>
> Dominique

Thanks for the testing.
I committed the changes to add f16cintrin.h to config.gcc and added
the missing endif.
I'm not sure why my testing did not see the missing endif or the failures above.
My first guess would be that the missing endif was causing ifndef
guards to not work across other intrinsics headers, causing some
builtins not to be defined.
My test machine is down right now, but I will post a patch if I can
figure things out tonight.
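
For reference, the balanced guard structure that the missing endif fix restores looks roughly like the sketch below. The macro and function names are hypothetical and do not come from f16cintrin.h. Each opening conditional is closed by its own commented #endif, so the include guard ends where the header ends rather than swallowing whatever is included afterwards.

```cpp
// Hypothetical header body; macro names are illustrative only.
#ifndef DEMO_INTRIN_H_INCLUDED
#define DEMO_INTRIN_H_INCLUDED

#ifdef DEMO_OPTIMIZE
// Function form, used when the (hypothetical) optimize macro is set.
constexpr int demo_convert(int x) { return x; }
#else
// Macro form, used otherwise.
#define demo_convert(x) (x)
#endif /* DEMO_OPTIMIZE */

#endif /* DEMO_INTRIN_H_INCLUDED */
```

If the last #endif were missing, the preprocessor would treat following text as still being inside the guard, which is one way later definitions could end up skipped.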
-- 
Quentin


Re: increase call_saved_regs[] in caller-save.c

2011-11-06 Thread Jeff Law
On 11/04/11 14:23, Peter Bergner wrote:
> On Fri, 2011-11-04 at 12:25 -0600, Jeff Law wrote:
>> The only way I can think of to have two pseudos assigned the same
>> hard reg at the same point in the insn stream is if the two
>> pseudos are known to have the same value.
> 
> Having the same value is the more common way two overlapping
> pseudos don't conflict, but if one or both of the pseudos are
> undefined (ie, no definition point), then they won't conflict
> either.
However, I'm pretty sure we do not take advantage of the fact that two
pseudos with the same value with overlapping lifetimes do not conflict
and thus can live in the same hard reg.  In fact, if you try to
implement that optimization, you'll find that reload chokes in fun and
unpleasant ways.  I actually tried this myself a while back ;-)


Jeff


[C++ PATCH] PR 50958 and correct lookup of literal operators

2011-11-06 Thread Ed Smith-Rowland
The lookup rules require searching all the literal operators with a given
name for one whose parameters exactly match the arguments implied by the
literal.


Also, an error concerning the length of raw operator strings was fixed.
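
A minimal sketch of both points, using hypothetical suffixes _kind and _rawlen that are not part of the patch: the literal operator is chosen by the kind of literal rather than by implicit conversion, and a raw literal operator receives the literal exactly as spelled.

```cpp
// Two hypothetical literal operators sharing the suffix _kind.
constexpr int operator""_kind(unsigned long long) { return 1; }  // integers
constexpr int operator""_kind(long double) { return 2; }         // floats

// Raw literal operator: receives the spelling of the literal as written.
constexpr int raw_len(const char* s, int i) {
  return s[i] ? raw_len(s, i + 1) : i;
}
constexpr int operator""_rawlen(const char* digits) {
  return raw_len(digits, 0);
}

static_assert(12_kind == 1, "integer literal uses unsigned long long");
static_assert(1.2_kind == 2, "floating literal uses long double");
static_assert(012_rawlen == 3, "raw operator sees the characters 012");
```

If only the long double overload of _kind existed, 12_kind would be an error rather than a converted call, which is the situation the new udlit-implicit-conv-neg.C test exercises.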

Index: gcc/testsuite/g++.dg/cpp0x/udlit-raw-length.C
===
--- gcc/testsuite/g++.dg/cpp0x/udlit-raw-length.C   (revision 0)
+++ gcc/testsuite/g++.dg/cpp0x/udlit-raw-length.C   (revision 0)
@@ -0,0 +1,27 @@
+// { dg-options "-std=c++0x" }
+// PR c++/50958
+
+typedef decltype(sizeof(0)) size_type;
+
+constexpr size_type
+cstrlen_impl(const char* s, size_type i)
+{
+  return s[i] ? cstrlen_impl(s, i + 1) : i;
+}
+
+constexpr size_type
+cstrlen(const char* s)
+{
+  return s ? cstrlen_impl(s, 0) : throw 0;
+}
+
+constexpr size_type
+operator "" _lenraw(const char* digits)
+{
+  return cstrlen(digits);
+}
+
+static_assert(123_lenraw == 3, "Ouch");
+static_assert(1_lenraw == 1, "Ouch");
+static_assert(012_lenraw == 3, "Ouch");
+static_assert(0_lenraw == 1, "Ouch");
Index: gcc/testsuite/g++.dg/cpp0x/udlit-raw-op-string-neg.C
===
--- gcc/testsuite/g++.dg/cpp0x/udlit-raw-op-string-neg.C   (revision 181034)
+++ gcc/testsuite/g++.dg/cpp0x/udlit-raw-op-string-neg.C   (working copy)
@@ -5,4 +5,4 @@
 int operator"" _embedraw(const char*)
 { return 41; };
 
-int k = "Boo!"_embedraw;  //  { dg-error "unable to find valid user-defined string literal operator" }
+int k = "Boo!"_embedraw;  //  { dg-error "unable to find valid string literal operator" }
Index: gcc/testsuite/g++.dg/cpp0x/udlit-member-neg.C
===
--- gcc/testsuite/g++.dg/cpp0x/udlit-member-neg.C   (revision 181034)
+++ gcc/testsuite/g++.dg/cpp0x/udlit-member-neg.C   (working copy)
@@ -8,7 +8,7 @@
 };
 
 int i = operator"" _Bar(U'x');  // { dg-error "was not declared in this scope" }
-int j = U'x'_Bar;  // { dg-error "unable to find user-defined character literal operator" }
+int j = U'x'_Bar;  // { dg-error "unable to find character literal operator" }
 
 int
 Foo::operator"" _Bar(char32_t)  // { dg-error "must be a non-member function" }
Index: gcc/testsuite/g++.dg/cpp0x/udlit-implicit-conv-neg.C
===
--- gcc/testsuite/g++.dg/cpp0x/udlit-implicit-conv-neg.C   (revision 0)
+++ gcc/testsuite/g++.dg/cpp0x/udlit-implicit-conv-neg.C   (revision 0)
@@ -0,0 +1,63 @@
+// { dg-options -std=c++0x }
+
+#include 
+
+int operator"" _bar (long double);
+
+double operator"" _foo (long long unsigned);
+
+int i = 12_bar; // { dg-error "unable to find numeric literal operator|with|argument" }
+
+double d = 1.2_foo; // { dg-error "unable to find numeric literal operator|with|argument" }
+
+int operator"" _char(char);
+
+int operator"" _wchar_t(wchar_t);
+
+int operator"" _char16_t(char16_t);
+
+int operator"" _char32_t(char32_t);
+
+int cwcx = 'c'_wchar_t; // { dg-error "unable to find character literal operator|with|argument" }
+int cc16 = 'c'_char16_t; // { dg-error "unable to find character literal operator|with|argument" }
+int cc32 = 'c'_char32_t; // { dg-error "unable to find character literal operator|with|argument" }
+
+int wccx = L'c'_char; // { dg-error "unable to find character literal operator|with|argument" }
+int wcc16 = L'c'_char16_t; // { dg-error "unable to find character literal operator|with|argument" }
+int wcc32 = L'c'_char32_t; // { dg-error "unable to find character literal operator|with|argument" }
+
+int c16c = u'c'_char; // { dg-error "unable to find character literal operator|with|argument" }
+int c16wc = u'c'_wchar_t; // { dg-error "unable to find character literal operator|with|argument" }
+int c16c32 = u'c'_char32_t; // { dg-error "unable to find character literal operator|with|argument" }
+
+int c32c = U'c'_char; // { dg-error "unable to find character literal operator|with|argument" }
+int c32wc = U'c'_wchar_t; // { dg-error "unable to find character literal operator|with|argument" }
+int c32c16 = U'c'_char16_t; // { dg-error "unable to find character literal operator|with|argument" }
+
+int operator"" _char_str(const char*, std::size_t);
+
+int operator"" _wchar_t_str(const wchar_t*, std::size_t);
+
+int operator"" _char16_t_str(const char16_t*, std::size_t);
+
+int operator"" _char32_t_str(const char32_t*, std::size_t);
+
+int strwstr = "str"_wchar_t_str; // { dg-error "unable to find string literal operator|with|arguments" }
+int strstr16 = "str"_char16_t_str; // { dg-error "unable to find string literal operator|with|arguments" }
+int strstr32 = "str"_char32_t_str; // { dg-error "unable to find string literal operator|with|arguments" }
+
+int str8wstr = u8"str"_wchar_t_str; // { dg-error "unable to find string literal operator|with|arguments" }
+int str8str16 = u8"str"_char16_t_str; // { dg-error "unable to find string 
liter

Re: increase call_saved_regs[] in caller-save.c

2011-11-06 Thread Jeff Law
On 11/04/11 17:37, DJ Delorie wrote:
>> The only way I can think of to have two pseudos assigned the
>> same hard reg at the same point in the insn stream is if the two
>> pseudos are known to have the same value.
> 
> Since all we're doing is figuring out which hard regs need to be
> saved in pro/epilogue, it could be that the two pseudos are not
> live at the same time, despite being live in the same function.
?!?  caller-save emits saves/restores at call sites, not in the
prologue/epilogue.

It looks at the set of registers live, call-clobbered and not
currently saved in the stack at each call site to make this determination.
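
The determination described above amounts to simple set arithmetic at each call site. A minimal sketch, assuming a hypothetical 16-register machine and hypothetical names. This is not the caller-save.c implementation, only the shape of the computation.

```cpp
#include <bitset>
#include <cassert>   // for the checks mentioned in the note below

const int kNumHardRegs = 16;  // hypothetical register count
typedef std::bitset<kNumHardRegs> RegSet;

// Registers that need a save around one call site: live across the
// call, clobbered by calls, and not already saved on the stack.
RegSet regs_to_save(const RegSet& live, const RegSet& call_clobbered,
                    const RegSet& already_saved) {
  return live & call_clobbered & ~already_saved;
}
```

For example, with registers 0 to 3 live, registers 1 and 2 call-clobbered, and register 1 already saved, only register 2 remains in the result.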

So the question I would still ask is precisely how multiple hard
registers are being stored into that array.  I don't think that's
supposed to be happening, but I could be wrong.   Knowing how this is
happening is critical to determining if the patch is correct or just
papering over a problem elsewhere.

jeff


C++/c-common PATCH for c++/35688 (wrong visibility of template instantiation)

2011-11-06 Thread Jason Merrill
The function constrain_visibility_for_template tries to set the 
visibility of a template instantiation properly by giving it the minimum 
visibility of the template itself and the template arguments.  But this 
PR points out that we were failing to do that in the case that the 
template is within a namespace with a visibility attribute, because then 
it gets DECL_VISIBILITY_SPECIFIED.  This patch fixes that by re-using 
some of the C front end's visibility code, so that we can further 
constrain visibility on a decl with DECL_VISIBILITY_SPECIFIED so long as 
it doesn't actually have a visibility attribute.
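
A sketch of the situation from the PR, assuming GCC-style visibility attributes. The names loosely follow the truncated test case, but the bodies are hypothetical, and the hidden attribute on A stands in for the -fvisibility=hidden option that the test uses.

```cpp
#include <cassert>   // for the checks mentioned in the note below

// Namespace with a visibility attribute: declarations inside get
// DECL_VISIBILITY_SPECIFIED without carrying an attribute themselves.
namespace s __attribute__((visibility("default"))) {
  template <class T>
  struct vector {
    vector() : size(0) {}
    int size;
  };
  template <class T> int foo(T) { return 1; }
}

// A type with hidden visibility used as a template argument.
struct __attribute__((visibility("hidden"))) A {};

// With the fix described above, these instantiations are expected to be
// constrained to hidden visibility (the minimum of the template's and
// the argument's visibility) despite the namespace attribute.
int use_hidden_instantiations() {
  s::vector<A> v;
  return v.size + s::foo(A());
}
```

The runtime result (1) says nothing about visibility; checking the symbols requires inspecting the object file, as the scan-hidden directives in the test case do.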


Joseph, I assume you have no objection to the c-common refactoring?

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 3f9fa74f7ee3f2074316bfb49fa1a09b98f78602
Author: Jason Merrill 
Date:   Sun Nov 6 15:44:49 2011 -0500

	PR c++/35688
gcc/c-family/
	* c-common.c (decl_has_visibility_attr): Split out from...
	(c_determine_visibility): ...here.
	* c-common.h: Declare it.
gcc/cp/
	* decl2.c (constrain_visibility): Check decl_has_visibility_attr
	rather than DECL_VISIBILITY_SPECIFIED.

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 5627fe1..71b3721 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -7043,6 +7043,22 @@ handle_visibility_attribute (tree *node, tree name, tree args,
   return NULL_TREE;
 }
 
+/* Returns true iff DECL actually has visibility specified by an attribute.
+   We check for an explicit attribute, rather than just checking
+   DECL_VISIBILITY_SPECIFIED, to distinguish the use of an attribute from
+   the use of a "#pragma GCC visibility push(...)"; in the latter case we
+   still want other considerations to be able to overrule the #pragma.  */
+
+bool
+decl_has_visibility_attr (tree decl)
+{
+  tree attrs = DECL_ATTRIBUTES (decl);
+  return (lookup_attribute ("visibility", attrs)
+	  || (TARGET_DLLIMPORT_DECL_ATTRIBUTES
+	  && (lookup_attribute ("dllimport", attrs)
+		  || lookup_attribute ("dllexport", attrs))));
+}
+
 /* Determine the ELF symbol visibility for DECL, which is either a
variable or a function.  It is an error to use this function if a
definition of DECL is not available in this translation unit.
@@ -7058,15 +7074,8 @@ c_determine_visibility (tree decl)
 
   /* If the user explicitly specified the visibility with an
  attribute, honor that.  DECL_VISIBILITY will have been set during
- the processing of the attribute.  We check for an explicit
- attribute, rather than just checking DECL_VISIBILITY_SPECIFIED,
- to distinguish the use of an attribute from the use of a "#pragma
- GCC visibility push(...)"; in the latter case we still want other
- considerations to be able to overrule the #pragma.  */
-  if (lookup_attribute ("visibility", DECL_ATTRIBUTES (decl))
-  || (TARGET_DLLIMPORT_DECL_ATTRIBUTES
-	  && (lookup_attribute ("dllimport", DECL_ATTRIBUTES (decl))
-	  || lookup_attribute ("dllexport", DECL_ATTRIBUTES (decl)))))
+ the processing of the attribute.  */
+  if (decl_has_visibility_attr (decl))
 return true;
 
   /* Set default visibility to whatever the user supplied with
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 7ecb57e..0b22d3d 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -771,6 +771,7 @@ extern void overflow_warning (location_t, tree);
 extern void warn_logical_operator (location_t, enum tree_code, tree,
    enum tree_code, tree, enum tree_code, tree);
 extern void check_main_parameter_types (tree decl);
+extern bool decl_has_visibility_attr (tree);
 extern bool c_determine_visibility (tree);
 extern bool same_scalar_type_ignoring_signedness (tree, tree);
 extern void mark_valid_location_for_stdc_pragma (bool);
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 32b5c7e..4bc02bd 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -1974,8 +1974,11 @@ constrain_visibility (tree decl, int visibility)
 	DECL_NOT_REALLY_EXTERN (decl) = 1;
 	}
 }
+  /* We check decl_has_visibility_attr rather than
+ DECL_VISIBILITY_SPECIFIED here because we want other considerations
+ to override visibility from a namespace or #pragma.  */
   else if (visibility > DECL_VISIBILITY (decl)
-	   && !DECL_VISIBILITY_SPECIFIED (decl))
+	   && !decl_has_visibility_attr (decl))
 {
   DECL_VISIBILITY (decl) = (enum symbol_visibility) visibility;
   return true;
diff --git a/gcc/testsuite/g++.dg/ext/visibility/template7.C b/gcc/testsuite/g++.dg/ext/visibility/template7.C
new file mode 100644
index 0000000..5197fb1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/visibility/template7.C
@@ -0,0 +1,29 @@
+// PR c++/35688
+// { dg-require-visibility "" }
+// { dg-options "-fvisibility=hidden" }
+
+// { dg-final { scan-hidden "_ZN1s6vectorI1AEC1Ev" } }
+// { dg-final { scan-hidden "_ZN1s3fooI1AEEvT_" } }
+
+namespace s __attribute__((visibility("default"))) {
+  template 
+class vector

Re: cxx-mem-model merge [6 of 9] - libstdc++-v3

2011-11-06 Thread Andrew MacLeod

On 11/06/2011 07:38 PM, Hans-Peter Nilsson wrote:


This (formally a change in the range 181027:181034) got me three
libstdc++ regressions for cris-elf, which has no "atomic"
support whatsoever (well, not the version represented in
"cris-elf"), so something is amiss at the bottom of the default
path:
Yes, I have a final pending patch which didn't make it to the branch
before the merge.  It changes the behaviour of atomic_flag on targets 
with no compare_and_swap.  I *think* it will resolve your problem.


I've attached the early version of the patch which you can try. It's
missing a documentation change I was going to add tomorrow before
submitting, but we can see if it resolves your problem.  Give it a shot 
and let me know.
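
The libstdc++ side of the change, as described in the ChangeLog below, can be sketched as follows. FlagSketch and test_and_set_sketch are hypothetical names, the memory order is fixed for brevity, and the code assumes a compiler providing the GCC __atomic and __sync builtins. It is not the actual atomic_base.h code.

```cpp
#include <cassert>   // for the checks mentioned in the note below

// Hypothetical flag type; the real change is in include/bits/atomic_base.h.
struct FlagSketch { unsigned char value; };

// Sets the flag and returns its previous state.
inline bool test_and_set_sketch(FlagSketch& f) {
  // Use the __atomic exchange only when it is always lock free for an
  // object of this size...
  if (__atomic_always_lock_free(sizeof(f.value), 0))
    return __atomic_exchange_n(&f.value, 1, __ATOMIC_SEQ_CST) != 0;
  // ...otherwise fall back to the older __sync builtin.
  return __sync_lock_test_and_set(&f.value, 1) != 0;
}
```

On a freshly cleared flag the first call returns false and the second returns true. Whether the fallback links on a target with no atomic support at all is exactly the open question in this thread.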


Andrew



gcc
* expr.h (expand_atomic_exchange): Add extra parameter.
* builtins.c (expand_builtin_sync_lock_test_and_set): Call
expand_atomic_exchange with true.
(expand_builtin_atomic_exchange): Call expand_atomic_exchange with
false.
* optabs.c (expand_atomic_exchange): Add use_test_and_set param and
only fall back to __sync_lock_test_and_set when true.
(expand_atomic_store): Add false to expand_atomic_exchange call.

libstdc++-v3
* include/bits/atomic_base.h (test_and_set): Call __atomic_exchange
only if it is always lock free, otherwise __sync_lock_test_and_set.


Index: gcc/expr.h
===
*** gcc/expr.h  (revision 180770)
--- gcc/expr.h  (working copy)
*** rtx emit_conditional_add (rtx, enum rtx_
*** 215,221 
  rtx expand_sync_operation (rtx, rtx, enum rtx_code);
  rtx expand_sync_fetch_operation (rtx, rtx, enum rtx_code, bool, rtx);
  
! rtx expand_atomic_exchange (rtx, rtx, rtx, enum memmodel);
  rtx expand_atomic_load (rtx, rtx, enum memmodel);
  rtx expand_atomic_store (rtx, rtx, enum memmodel);
  rtx expand_atomic_fetch_op (rtx, rtx, rtx, enum rtx_code, enum memmodel, 
--- 215,221 
  rtx expand_sync_operation (rtx, rtx, enum rtx_code);
  rtx expand_sync_fetch_operation (rtx, rtx, enum rtx_code, bool, rtx);
  
! rtx expand_atomic_exchange (rtx, rtx, rtx, enum memmodel, bool);
  rtx expand_atomic_load (rtx, rtx, enum memmodel);
  rtx expand_atomic_store (rtx, rtx, enum memmodel);
  rtx expand_atomic_fetch_op (rtx, rtx, rtx, enum rtx_code, enum memmodel, 
Index: gcc/builtins.c
===
*** gcc/builtins.c  (revision 180789)
--- gcc/builtins.c  (working copy)
*** expand_builtin_sync_lock_test_and_set (e
*** 5221,5227 
mem = get_builtin_sync_mem (CALL_EXPR_ARG (exp, 0), mode);
val = expand_expr_force_mode (CALL_EXPR_ARG (exp, 1), mode);
  
!   return expand_atomic_exchange (target, mem, val, MEMMODEL_ACQUIRE);
  }
  
  /* Expand the __sync_lock_release intrinsic.  EXP is the CALL_EXPR.  */
--- 5221,5227 
mem = get_builtin_sync_mem (CALL_EXPR_ARG (exp, 0), mode);
val = expand_expr_force_mode (CALL_EXPR_ARG (exp, 1), mode);
  
!   return expand_atomic_exchange (target, mem, val, MEMMODEL_ACQUIRE, true);
  }
  
  /* Expand the __sync_lock_release intrinsic.  EXP is the CALL_EXPR.  */
*** expand_builtin_atomic_exchange (enum mac
*** 5284,5290 
mem = get_builtin_sync_mem (CALL_EXPR_ARG (exp, 0), mode);
val = expand_expr_force_mode (CALL_EXPR_ARG (exp, 1), mode);
  
!   return expand_atomic_exchange (target, mem, val, model);
  }
  
  /* Expand the __atomic_compare_exchange intrinsic:
--- 5284,5290 
mem = get_builtin_sync_mem (CALL_EXPR_ARG (exp, 0), mode);
val = expand_expr_force_mode (CALL_EXPR_ARG (exp, 1), mode);
  
!   return expand_atomic_exchange (target, mem, val, model, false);
  }
  
  /* Expand the __atomic_compare_exchange intrinsic:
Index: gcc/optabs.c
===
*** gcc/optabs.c    (revision 180770)
--- gcc/optabs.c    (working copy)
*** expand_compare_and_swap_loop (rtx mem, r
*** 6872,6881 
 atomically store VAL in MEM and return the previous value in MEM.
  
 MEMMODEL is the memory model variant to use.
!TARGET is an option place to stick the return value.  */
  
  rtx
! expand_atomic_exchange (rtx target, rtx mem, rtx val, enum memmodel model)
  {
enum machine_mode mode = GET_MODE (mem);
enum insn_code icode;
--- 6872,6884 
 atomically store VAL in MEM and return the previous value in MEM.
  
 MEMMODEL is the memory model variant to use.
!TARGET is an optional place to stick the return value.  
!USE_TEST_AND_SET indicates whether __sync_lock_test_and_set should be used
!as a fall back if the atomic_exchange pattern does not exist.  */
  
  rtx
! expand_atomic_exchange (rtx target, rtx mem, rtx val, enum memmodel model,
!   bool use_test_and_set)  
  {
enum machine_mode mode = GET_MODE (mem);
e

Re: [trans-mem] XFAIL known failures

2011-11-06 Thread Jeff Law
On 11/03/11 17:40, Aldy Hernandez wrote:
> On 11/03/11 18:30, Andrew Pinski wrote:
>> On Thu, Nov 3, 2011 at 3:12 PM, Aldy Hernandez
>> wrote:
>>> These are known failures, mostly missed optimizations.
>>> XFAILing them.
>> 
>> I think you should file a bug about each missed optimization and 
>> reference the bug # in the testcase.  This is so we don't lose
>> track of the missed optimizations.
>> 
>> Thanks, Andrew Pinski
> 
> Sure, I can do that.
> 
> What do you suggest, a bug report per failure with nothing but the 
> directory/name of the test?
I'd say a bug report for each distinct failure.  It can get awful
confusing when there's multiple bugs in a single PR...



> Do you mind if I do this after the merge (if it gets approved)?
> I'm trying to concentrate on merge blockers.
XFAIL prior to merge, file bugs after is fine.

Thanks,
jeff


Re: [Patch] ARM EABI support for RTEMS

2011-11-06 Thread Ralf Corsepius

On 11/04/2011 03:30 PM, Sebastian Huber wrote:

On 11/04/2011 01:57 PM, Sebastian Huber wrote:

It builds well and the test suite runs currently.


http://gcc.gnu.org/ml/gcc-testresults/2011-11/msg00407.html



The second version of your patch is OK with me and seems to work fine.

Patch committed to svn-trunk.

Ralf


Re: [patch] 19/n: trans-mem: middle end/misc patches (LAST PATCH)

2011-11-06 Thread Aldy Hernandez

On 11/06/11 12:20, Richard Henderson wrote:

-  if (!computed_goto_p (stmt))
+ if (!computed_goto_p (stmt))
 {
- tree new_dest = main_block_label (gimple_goto_dest (stmt));
- gimple_goto_set_dest (stmt, new_dest);
+ label = gimple_goto_dest (stmt);
+ new_label = main_block_label (label);
+ if (new_label != label)
+   gimple_goto_set_dest (stmt, new_label);


What's the reason for this change?  Optimization?


Yes.  Rth can elaborate if you deem necessary.


Really?  I have no idea what this change achieves.
I actually wonder if this is a merge error.


Removing this caused various TM test failures, which I have yet to 
fully investigate.  I found the original patch by you [rth] (attached).  
Perhaps you can elaborate as to its original use.


It may be that we need to remove all of the GIMPLE_GOTO, GIMPLE_COND, 
and GIMPLE_SWITCH hacks in cleanup_dead_labels, but I will wait for a 
double check by you before touching any more of this.


In the meantime, I will commit the patch sans this GIMPLE_GOTO removal 
which may still be used.  That is, after another round of tests.
Index: cgraphbuild.c
===
--- cgraphbuild.c   (revision 141199)
+++ cgraphbuild.c   (revision 141200)
@@ -125,7 +125,7 @@ compute_call_stmt_bb_frequency (basic_bl
 /* Eagerly clone functions so that TM expansion can create
and redirect calls to a transactional clone.  */
 
-static void
+static void ATTRIBUTE_UNUSED
 prepare_tm_clone (struct cgraph_node *node)
 {
   struct cgraph_node *tm_node;
@@ -275,7 +275,7 @@ build_cgraph_edges (void)
 
   build_cgraph_edges_from_node (node);
   
-  prepare_tm_clone (node);
+  /* prepare_tm_clone (node); */
 
   return 0;
 }
Index: tree-pass.h
===
--- tree-pass.h (revision 141199)
+++ tree-pass.h (revision 141200)
@@ -388,7 +388,7 @@ extern struct gimple_opt_pass pass_reass
 extern struct gimple_opt_pass pass_rebuild_cgraph_edges;
 extern struct gimple_opt_pass pass_build_cgraph_edges;
 extern struct gimple_opt_pass pass_reset_cc_flags;
-extern struct gimple_opt_pass pass_expand_tm;
+extern struct gimple_opt_pass pass_lower_tm;
 extern struct gimple_opt_pass pass_checkpoint_tm;
 
 /* IPA Passes */
Index: gtm-low.c
===
--- gtm-low.c   (revision 141199)
+++ gtm-low.c   (revision 141200)
@@ -1,6 +1,6 @@
 /* Lowering pass for transactional memory directives.
Converts markers of transactions into explicit calls to
-   the STM runtime library.
+   the TM runtime library.
 
Copyright (C) 2008 Free Software Foundation, Inc.
 
@@ -18,34 +18,144 @@
 
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3.  If not see
-   .
-
-*/
+   .  */
 
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
 #include "tm.h"
 #include "tree.h"
-#include "rtl.h"
 #include "gimple.h"
-#include "langhooks.h"
-#include "diagnostic.h"
 #include "tree-flow.h"
-#include "timevar.h"
-#include "flags.h"
-#include "function.h"
-#include "expr.h"
-#include "toplev.h"
 #include "tree-pass.h"
-#include "ggc.h"
 #include "except.h"
-#include "splay-tree.h"
-#include "optabs.h"
-#include "cfgloop.h"
-#include "tree-ssa-live.h"
-#include "tree-flow.h"
+#include "diagnostic.h"
+
+
+/* The representation of a transaction changes several times during the
+   lowering process.  In the beginning, in the front-end we have the
+   GENERIC tree TM_ATOMIC.  For example,
+
+   __tm_atomic {
+ local++;
+ if (++global == 10)
+   __tm_abort;
+   }
+
+  is represented as
+
+   TM_ATOMIC {
+ local++;
+ if (++global == 10)
+   __builtin___tm_abort ();
+   }
+
+  During initial gimplification (gimplify.c) the TM_ATOMIC node is
+  trivially replaced with a GIMPLE_TM_ATOMIC node, and we add bits
+  to handle EH cleanup of the transaction:
+
+   GIMPLE_TM_ATOMIC [label=NULL] {
+ try {
+   local = local + 1;
+   t0 [tm_load]= global;
+   t1 = t0 + 1;
+   global [tm_store]= t1;
+   if (t1 == 10)
+ __builtin___tm_abort ();
+ } finally {
+   __builtin___tm_commit ();
+ }
+   }
+
+  During pass_lower_eh, we create EH regions for the transactions,
+  intermixed with the regular EH stuff.  This gives us a nice persistent
+  mapping (all the way through rtl) from transactional memory operation
+  back to the transaction, which allows us to get the abnormal edges
+  correct to model transaction aborts and restarts.
+
+  During pass_lower_tm, we mark the gimple statements that perform
+  transactional memory operations with TM_LOAD/TM_STORE, and swap out
+  function calls with their (

Re: [PATCH] More improvements to sparc VIS vec_init code generation.

2011-11-06 Thread David Miller
From: Richard Henderson 
Date: Sun, 06 Nov 2011 09:55:17 -0800

> On 11/05/2011 07:39 PM, David Miller wrote:
>> Richard, is there a better way to represent this in RTL?  These
>> instructions basically load a single byte or half-word into the bottom
>> of a 64-bit float register, and clear the rest of that register with
>> zeros.  So the v4hi one is essentially loading the vector:
>> 
>>  [(const_int 0) (const_int 0)
>>  (const_int 0) (mem:HI (register:P ...))]
> 
> Try

That works, thanks a lot Richard.


[PATCH] Get rid of sparc's UNSPEC_SHORT_LOAD.

* config/sparc/sparc.md (UNSPEC_SHORT_LOAD): Delete.
(zero_extend_v8qi_vis, zero_extend_v4hi_vis,
*zero_extend_v8qi__insn,
*zero_extend_v4hi__insn): Express using vec_merge
and vec_duplicate instead of using an UNSPEC.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@181063 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog |8 
 gcc/config/sparc/sparc.md |   33 ++---
 2 files changed, 30 insertions(+), 11 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index b3a5be1..59c4ffc 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@
+2011-11-06  David S. Miller  
+
+   * config/sparc/sparc.md (UNSPEC_SHORT_LOAD): Delete.
+   (zero_extend_v8qi_vis, zero_extend_v4hi_vis,
+   *zero_extend_v8qi__insn,
+   *zero_extend_v4hi__insn): Express using vec_merge
+   and vec_duplicate instead of using an UNSPEC.
+
 2011-11-07  Alan Modra  
 
PR target/30282
diff --git a/gcc/config/sparc/sparc.md b/gcc/config/sparc/sparc.md
index 7452f96..56f4dc0 100644
--- a/gcc/config/sparc/sparc.md
+++ b/gcc/config/sparc/sparc.md
@@ -92,7 +92,6 @@
(UNSPEC_MUL886)
(UNSPEC_MUL8SU  87)
(UNSPEC_MULDSU  88)
-   (UNSPEC_SHORT_LOAD  89)
   ])
 
 (define_constants
@@ -7833,8 +7832,11 @@
 
 (define_expand "zero_extend_v8qi_vis"
   [(set (match_operand:V8QI 0 "register_operand" "")
-(unspec:V8QI [(match_operand:QI 1 "memory_operand" "")]
- UNSPEC_SHORT_LOAD))]
+(vec_merge:V8QI
+  (vec_duplicate:V8QI
+(match_operand:QI 1 "memory_operand" ""))
+  (match_dup 2)
+  (const_int 254)))]
   "TARGET_VIS"
 {
   if (! REG_P (XEXP (operands[1], 0)))
@@ -7842,12 +7844,16 @@
   rtx addr = force_reg (Pmode, XEXP (operands[1], 0));
   operands[1] = replace_equiv_address (operands[1], addr);
 }
+  operands[2] = CONST0_RTX (V8QImode);
 })
 
 (define_expand "zero_extend_v4hi_vis"
   [(set (match_operand:V4HI 0 "register_operand" "")
-(unspec:V4HI [(match_operand:HI 1 "memory_operand" "")]
- UNSPEC_SHORT_LOAD))]
+(vec_merge:V4HI
+  (vec_duplicate:V4HI
+(match_operand:HI 1 "memory_operand" ""))
+  (match_dup 2)
+  (const_int 14)))]
   "TARGET_VIS"
 {
   if (! REG_P (XEXP (operands[1], 0)))
@@ -7855,21 +7861,26 @@
   rtx addr = force_reg (Pmode, XEXP (operands[1], 0));
   operands[1] = replace_equiv_address (operands[1], addr);
 }
+  operands[2] = CONST0_RTX (V4HImode);
 })
 
 (define_insn "*zero_extend_v8qi__insn"
   [(set (match_operand:V8QI 0 "register_operand" "=e")
-(unspec:V8QI [(mem:QI
-   (match_operand:P 1 "register_operand" "r"))]
- UNSPEC_SHORT_LOAD))]
+(vec_merge:V8QI
+  (vec_duplicate:V8QI
+(mem:QI (match_operand:P 1 "register_operand" "r")))
+  (match_operand:V8QI 2 "const_zero_operand" "Y")
+  (const_int 254)))]
   "TARGET_VIS"
   "ldda\t[%1] 0xd0, %0")
 
 (define_insn "*zero_extend_v4hi__insn"
   [(set (match_operand:V4HI 0 "register_operand" "=e")
-(unspec:V4HI [(mem:HI
-   (match_operand:P 1 "register_operand" "r"))]
- UNSPEC_SHORT_LOAD))]
+(vec_merge:V4HI
+  (vec_duplicate:V4HI
+(mem:HI (match_operand:P 1 "register_operand" "r")))
+  (match_operand:V4HI 2 "const_zero_operand" "Y")
+  (const_int 14)))]
   "TARGET_VIS"
   "ldda\t[%1] 0xd2, %0")
 
-- 
1.7.6.401.g6a319



Re: [v3] corrections to C++11 status table

2011-11-06 Thread Jonathan Wakely
On 6 November 2011 22:10, Jonathan Wakely wrote:
> We don't provide  or  - the latter is listed as "C
> library dependency" but AFAIK we don't provide  even if
>  is present.
>
>        * doc/xml/manual/status_cxx2011.xml: Document  and
>         as missing.
>
> Committed to trunk.

Joseph has just added  so I'll add  tomorrow.


Re: Implement C1X _Alignas, _Alignof, max_align_t, stdalign.h

2011-11-06 Thread Jonathan Wakely
On 7 November 2011 00:37, Joseph S. Myers wrote:
> On Mon, 7 Nov 2011, Jonathan Wakely wrote:
>
>> On 6 November 2011 23:53, Joseph S. Myers wrote:
>> >
>> > As with stdnoreturn.h, the contents of stdalign.h are conditioned out
>> > for C++; I'll leave it to C++ people to work out what's most useful
>> > there if something nonempty is wanted (stdnoreturn.h is empty for C++,
>> > stdbool.h defines _Bool and bool to bool, true to true etc.).
>>
>> Thanks Joseph, that will allow us to provide  in libstdc++.
>>
>> By my reading of the C++11 standard, the __alignas_is_defined macro
>> should still be defined in C++, and alignof isn't mentioned, was that
>> a late addition to C1X?
>
> The move from alignof as a keyword to _Alignof as a keyword with alignof
> as a macro in stdalign.h was done at the London meeting (March 2011) in
> response to comment US20.  stdnoreturn.h was also added at that meeting.

Thanks for the info.  I think treating alignof identically to alignas
for C++ makes sense then - the __alignof_is_defined macro is in the
reserved namespace so it's OK for a conforming C++11 implementation to
define it.

> It looks like what GCC defines in stdbool.h for C++ goes beyond what C++11
> says it should ("The header  and the header  shall
> not define macros named bool, true, or false.")  stdint.h is another case
> needing adjustment for C++11, though there what I've suggested is that GCC
> should predefine __STDC_LIMIT_MACROS and __STDC_CONSTANT_MACROS in C++11
> mode as the best way to work with all implementations of that header that
> follow what C99 footnotes suggest.

Indeed - I was just noticing that the libstdc++ testsuite doesn't
check that stdbool.h doesn't define those macros.


[patch, win32, applied] Update libgcj version number in win32 crtbegin.

2011-11-06 Thread Dave Korn


Hi list,

  The win32 crtbegin code needs to know the full name of the libgcj DLL,
including in particular the trailing version suffix generated from the libtool
version info, and so far it's not autogenerated but needs to be manually
synced.  This patch updates the two places where we have it defined.  I'll do
a proper fix to parse the info out of the libjava/libtool-version file next
stage 1 but at least this keeps it up-to-date for the forthcoming 4.7.

  Committed revision 181055, with the following changelog:

2011-11-06  Dave Korn

Index: gcc/config/i386/cygwin.h
===
--- gcc/config/i386/cygwin.h	(revision 181051)
+++ gcc/config/i386/cygwin.h	(working copy)
@@ -136,5 +136,5 @@ along with GCC; see the file COPYING3.  If not see
 #define LIBGCC_SONAME "cyggcc_s" LIBGCC_EH_EXTN "-1.dll"
 
 /* We should find a way to not have to update this manually.  */
-#define LIBGCJ_SONAME "cyggcj" /*LIBGCC_EH_EXTN*/ "-12.dll"
+#define LIBGCJ_SONAME "cyggcj" /*LIBGCC_EH_EXTN*/ "-13.dll"
 
Index: gcc/config/i386/mingw32.h
===
--- gcc/config/i386/mingw32.h	(revision 181051)
+++ gcc/config/i386/mingw32.h	(working copy)
@@ -230,4 +230,4 @@ do {		 \
 #define LIBGCC_SONAME "libgcc_s" LIBGCC_EH_EXTN "-1.dll"
 
 /* We should find a way to not have to update this manually.  */
-#define LIBGCJ_SONAME "libgcj" /*LIBGCC_EH_EXTN*/ "-12.dll"
+#define LIBGCJ_SONAME "libgcj" /*LIBGCC_EH_EXTN*/ "-13.dll"
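
As an aside for readers unfamiliar with the macro's shape: the name is
assembled by adjacent string-literal concatenation, and the commented-out
LIBGCC_EH_EXTN contributes nothing.  A minimal standalone sketch (not
part of the patch; libgcj_soname is a hypothetical helper, and the macro
is redefined here only for illustration):

```cpp
#include <cstring>

// Same shape as the macro in the patch: two adjacent string literals
// separated by a comment.  The preprocessor drops the comment and the
// compiler concatenates the literals into one string.
#define LIBGCJ_SONAME "cyggcj" /*LIBGCC_EH_EXTN*/ "-13.dll"

// Hypothetical helper, only here so the result can be inspected.
inline const char* libgcj_soname()
{
  return LIBGCJ_SONAME;
}
```

So the version suffix is just part of one literal, which is why it has
to be bumped by hand until it is derived from libtool-version.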


Re: cxx-mem-model merge [6 of 9] - libstdc++-v3

2011-11-06 Thread Hans-Peter Nilsson
> From: Andrew MacLeod 
> Date: Fri, 4 Nov 2011 00:50:47 +0100

> These are the changes to libstdc++ to make use of the new atomics.  I
> changed the files to use the new atomics, and bkoz did a shuffling of
> the include file layout to better suit the new c++ approach.
> 
> previously, libstdc++ provided a locked implementation in atomic_0.h
> with the theory that eventually it would be used.  The new scheme
> involves leaving non-lock-free implementations to an external library.
> This involved removing the old lock implementation and restructuring
> things now that multiple implementations don't have to be supported.  So
> a lot of this is churn... 2 include files deleted and one merged into
> another one.

This (formally a change in the range 181027:181034) got me three
libstdc++ regressions for cris-elf, which has no "atomic"
support whatsoever (well, not the version represented in
"cris-elf"), so something is amiss at the bottom of the default
path:

Running 
/tmp/hpautotest-gcc1/gcc/libstdc++-v3/testsuite/libstdc++-dg/conformance.exp ...
...
FAIL: 29_atomics/atomic_flag/clear/1.cc (test for excess errors)
WARNING: 29_atomics/atomic_flag/clear/1.cc compilation failed to produce 
executable
FAIL: 29_atomics/atomic_flag/test_and_set/explicit.cc (test for excess errors)
WARNING: 29_atomics/atomic_flag/test_and_set/explicit.cc compilation failed to 
produce executable
FAIL: 29_atomics/atomic_flag/test_and_set/implicit.cc (test for excess errors)

And the linker message is:
Executing on host: /tmp/hpautotest-gcc1/cris-elf/gccobj/./gcc/g++ 
-shared-libgcc -B/tmp/hpautotest-gcc1/cris-elf/gccobj/./gcc -nostdinc++ 
-L/tmp/hpautotest-gcc1/cris-elf/gccobj/cris-elf/libstdc++-v3/src 
-L/tmp/hpautotest-gcc1/cris-elf/gccobj/cris-elf/libstdc++-v3/src/.libs 
-nostdinc -B/tmp/hpautotest-gcc1/cris-elf/gccobj/cris-elf/newlib/ -isystem 
/tmp/hpautotest-gcc1/cris-elf/gccobj/cris-elf/newlib/targ-include -isystem 
/tmp/hpautotest-gcc1/gcc/newlib/libc/include 
-B/tmp/hpautotest-gcc1/cris-elf/gccobj/cris-elf/libgloss/cris 
-L/tmp/hpautotest-gcc1/cris-elf/gccobj/cris-elf/libgloss/libnosys 
-L/tmp/hpautotest-gcc1/gcc/libgloss/cris 
-B/tmp/hpautotest-gcc1/cris-elf/pre/cris-elf/bin/ 
-B/tmp/hpautotest-gcc1/cris-elf/pre/cris-elf/lib/ -isystem 
/tmp/hpautotest-gcc1/cris-elf/pre/cris-elf/include -isystem 
/tmp/hpautotest-gcc1/cris-elf/pre/cris-elf/sys-include 
-B/tmp/hpautotest-gcc1/cris-elf/gccobj/cris-elf/./libgloss/cris/ 
-L/tmp/hpautotest-gcc1/cris-elf/gccobj/cris-elf/./libgloss/cris
  -L/tmp/hpautotest-gcc1/gcc/libgloss/cris 
-B/tmp/hpautotest-gcc1/cris-elf/gccobj/cris-elf/./libstdc++-v3/src/.libs -g -O2 
-D_GLIBCXX_ASSERT -fmessage-length=0 -ffunction-sections -fdata-sections -g -O2 
-g -O2 -DLOCALEDIR="." -nostdinc++ 
-I/tmp/hpautotest-gcc1/cris-elf/gccobj/cris-elf/libstdc++-v3/include/cris-elf 
-I/tmp/hpautotest-gcc1/cris-elf/gccobj/cris-elf/libstdc++-v3/include 
-I/tmp/hpautotest-gcc1/gcc/libstdc++-v3/libsupc++ 
-I/tmp/hpautotest-gcc1/gcc/libstdc++-v3/include/backward 
-I/tmp/hpautotest-gcc1/gcc/libstdc++-v3/testsuite/util 
/tmp/hpautotest-gcc1/gcc/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc
   -std=gnu++0x ./libtestc++.a -isystem 
/tmp/hpautotest-gcc1/cris-elf/gccobj/cris-elf/./newlib/targ-include -isystem 
/tmp/hpautotest-gcc1/gcc/newlib/libc/include 
-B/tmp/hpautotest-gcc1/cris-elf/gccobj/cris-elf/./libgloss/cris/ 
-L/tmp/hpautotest-gcc1/cris-elf/gccobj/cris-elf/./libgloss/cris 
-L/tmp/hpautotest-gcc1/gcc/libgloss/cris 
-B/tmp/hpautotest-gcc1/cris-elf/gccobj/cris-elf/./newlib/ 
-L/tmp/hpautotest-gcc1/cris-elf/gccobj/cris-elf/./newlib -sim3  -lm   -o 
./explicit.exe    (timeout = 600)
/tmp/cc21Ui3S.o: In function `ZNSt11atomic_flag12test_and_setESt12memory_order':
/tmp/hpautotest-gcc1/cris-elf/gccobj/cris-elf/libstdc++-v3/include/bits/atomic_base.h:264:
 undefined reference to `__atomic_exchange_1'
collect2: error: ld returned 1 exit status
compiler exited with status 1
output is:
/tmp/cc21Ui3S.o: In function `ZNSt11atomic_flag12test_and_setESt12memory_order':
/tmp/hpautotest-gcc1/cris-elf/gccobj/cris-elf/libstdc++-v3/include/bits/atomic_base.h:264:
 undefined reference to `__atomic_exchange_1'
collect2: error: ld returned 1 exit status
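
For reference, a minimal sketch of the kind of user code that ends up on
this path (this is not the testsuite file; SpinLock is a hypothetical
class made up for the example).  std::atomic_flag::test_and_set is the
member named in the mangled symbol above; on a target with no native
atomic support its expansion presumably falls back to an external call
such as the __atomic_exchange_1 the linker cannot find.

```cpp
#include <atomic>

// Hypothetical minimal spin lock built on std::atomic_flag.
class SpinLock
{
  std::atomic_flag flag = ATOMIC_FLAG_INIT;

public:
  // test_and_set returns the previous value, so the lock was
  // acquired only if the flag was clear before the call.
  bool try_lock()
  { return !flag.test_and_set(std::memory_order_acquire); }

  // Release the lock by clearing the flag.
  void unlock()
  { flag.clear(std::memory_order_release); }
};
```

On targets with lock-free byte exchange this compiles to inline code and
links without any extra library.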

brgds, H-P


Re: Implement C1X _Alignas, _Alignof, max_align_t, stdalign.h

2011-11-06 Thread Joseph S. Myers
On Mon, 7 Nov 2011, Jonathan Wakely wrote:

> On 6 November 2011 23:53, Joseph S. Myers wrote:
> >
> > As with stdnoreturn.h, the contents of stdalign.h are conditioned out
> > for C++; I'll leave it to C++ people to work out what's most useful
> > there if something nonempty is wanted (stdnoreturn.h is empty for C++,
> > stdbool.h defines _Bool and bool to bool, true to true etc.).
> 
> Thanks Joseph, that will allow us to provide  in libstdc++.
> 
> By my reading of the C++11 standard, the __alignas_is_defined macro
> should still be defined in C++, and alignof isn't mentioned, was that
> a late addition to C1X?

The move from alignof as a keyword to _Alignof as a keyword with alignof 
as a macro in stdalign.h was done at the London meeting (March 2011) in 
response to comment US20.  stdnoreturn.h was also added at that meeting.

It looks like what GCC defines in stdbool.h for C++ goes beyond what C++11 
says it should ("The header  and the header  shall 
not define macros named bool, true, or false.")  stdint.h is another case 
needing adjustment for C++11, though there what I've suggested is that GCC 
should predefine __STDC_LIMIT_MACROS and __STDC_CONSTANT_MACROS in C++11 
mode as the best way to work with all implementations of that header that 
follow what C99 footnotes suggest.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Implement C1X _Alignas, _Alignof, max_align_t, stdalign.h

2011-11-06 Thread Jonathan Wakely
On 6 November 2011 23:53, Joseph S. Myers wrote:
>
> As with stdnoreturn.h, the contents of stdalign.h are conditioned out
> for C++; I'll leave it to C++ people to work out what's most useful
> there if something nonempty is wanted (stdnoreturn.h is empty for C++,
> stdbool.h defines _Bool and bool to bool, true to true etc.).

Thanks Joseph, that will allow us to provide  in libstdc++.

By my reading of the C++11 standard, the __alignas_is_defined macro
should still be defined in C++, and alignof isn't mentioned, was that
a late addition to C1X?


Fix various minor issues in gcse.c

2011-11-06 Thread Eric Botcazou
While working on PR rtl-opt/50904, I ran into long-standing minor issues in 
gcse.c that can easily be addressed:
  - a few outdated comments,
  - poor interface with note_stores,
  - edge_list being sometimes a global variable and sometimes a parameter,
  - gcse_emit_move_after takes source and destination in the opposite order of 
all other emit_move routines,
  - left-overs of aborted encapsulation (first_ls_expr, next_ls_expr),
  - a few long lines and a few typos left and right.

No functional changes.  Tested on i586-suse-linux, applied on mainline.
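
To illustrate the note_stores interface point in general terms, here is
a small standalone sketch of passing one structure through a callback's
void * argument so that it carries both the query and the answer.  It is
an assumption-laden illustration, not the gcse.c code: for_each_store,
conflict_info, conflict_cb and killed_in_block are all hypothetical
names, and plain ints stand in for rtx values.

```cpp
#include <cstddef>

// Hypothetical stand-in for a walker that calls FN once per store.
typedef void (*store_fn)(int dest, void* data);

inline void for_each_store(const int* dests, std::size_t n,
                           store_fn fn, void* data)
{
  for (std::size_t i = 0; i < n; ++i)
    fn(dests[i], data);
}

// Carries the input (the location asked about) and the output
// (whether a conflict was seen) through the single void * slot.
struct conflict_info
{
  int mem;        // location being queried (input)
  bool conflict;  // set by the callback (output)
};

// Callback: record a conflict when a store hits the queried location.
inline void conflict_cb(int dest, void* data)
{
  conflict_info* ci = static_cast<conflict_info*>(data);
  if (dest == ci->mem)
    ci->conflict = true;
}

// Caller: no global state is needed to get the result back.
inline bool killed_in_block(const int* dests, std::size_t n, int mem)
{
  conflict_info ci = { mem, false };
  for_each_store(dests, n, conflict_cb, &ci);
  return ci.conflict;
}
```

The design point is only that the result travels in the same object as
the query, instead of in file-scope variables.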


2011-11-06  Eric Botcazou  

* gcse.c: Adjust outdated comments throughout.
(struct mem_conflict_info): New structure.
(mems_conflict_for_gcse_p): Use it to communicate with caller.
(load_killed_in_block_p): Pass it to note_stores.
(hash_expr): Remove superfluous line break.
(hash_scan_set): Rename PAT parameter into SET.
(hash_scan_insn): Reorder cases.
(canon_list_insert): Fix long line.
(edge_list): Delete.
(prune_expressions): Rename E local variable into EXPR.
(compute_pre_data): Return struct edge_list * object.
(pre_expr_reaches_here_p_work): Fix formatting.
(process_insert_insn): Move around comment.
(pre_edge_insert): Fix long line.
(pre_insert_copies): Likewise.
(gcse_emit_move_after): Swap SRC and DEST parameters.
(pre_delete): Adjust call to gcse_emit_move_after.
(pre_gcse): Take struct edge_list * parameter.  Fix long line.
(one_pre_gcse_pass): Use flag_gcse_lm condition for all routines.
Use a local list of edges.
(hoist_code): Fix long line.  Adjust call to gcse_emit_move_after.
(pre_ldst_expr_hash): Fix long line.
(free_ldst_mems): Rename into...
(free_ld_motion_mems): ...this.
(first_ls_expr): Delete.
(next_ls_expr): Likewise.
(print_ldst_list): Do not use above two functions.
(simple_mem): Adjust interface.
(compute_ld_motion_mems): Fix formatting.
(update_ld_motion_stores): Reuse local variable.


-- 
Eric Botcazou
Index: gcse.c
===
--- gcse.c	(revision 181007)
+++ gcse.c	(working copy)
@@ -23,10 +23,6 @@ along with GCC; see the file COPYING3.
- do rough calc of how many regs are needed in each block, and a rough
  calc of how many regs are available in each class and use that to
  throttle back the code in cases where RTX_COST is minimal.
-   - a store to the same address as a load does not kill the load if the
- source of the store is also the destination of the load.  Handling this
- allows more load motion, particularly out of loops.
-
 */
 
 /* References searched while implementing this.
@@ -267,7 +263,7 @@ struct reg_use {rtx reg_rtx; };
 
 struct expr
 {
-  /* The expression (SET_SRC for expressions, PATTERN for assignments).  */
+  /* The expression.  */
   rtx expr;
   /* Index in the available expression bitmaps.  */
   int bitmap_index;
@@ -346,14 +342,12 @@ static struct hash_table_d expr_hash_tab
 
 /* This is a list of expressions which are MEMs and will be used by load
or store motion.
-   Load motion tracks MEMs which aren't killed by
-   anything except itself. (i.e., loads and stores to a single location).
+   Load motion tracks MEMs which aren't killed by anything except itself,
+   i.e. loads and stores to a single location.
We can then allow movement of these MEM refs with a little special
allowance. (all stores copy the same value to the reaching reg used
for the loads).  This means all values used to store into memory must have
-   no side effects so we can re-issue the setter value.
-   Store Motion uses this structure as an expression table to track stores
-   which look interesting, and might be moveable towards the exit block.  */
+   no side effects so we can re-issue the setter value.  */
 
 struct ls_expr
 {
@@ -454,14 +448,14 @@ static int load_killed_in_block_p (const
 static void canon_list_insert (rtx, const_rtx, void *);
 static void alloc_pre_mem (int, int);
 static void free_pre_mem (void);
-static void compute_pre_data (void);
+static struct edge_list *compute_pre_data (void);
 static int pre_expr_reaches_here_p (basic_block, struct expr *,
 basic_block);
 static void insert_insn_end_basic_block (struct expr *, basic_block);
 static void pre_insert_copy_insn (struct expr *, rtx);
 static void pre_insert_copies (void);
 static int pre_delete (void);
-static int pre_gcse (void);
+static int pre_gcse (struct edge_list *);
 static int one_pre_gcse_pass (void);
 static void add_label_notes (rtx, rtx);
 static void alloc_code_hoist_mem (int, int);
@@ -478,11 +472,9 @@ static int pre_expr_reaches_here_p_work
 	 basic_block, char *);
 static struct ls_expr * ldst_entry (rtx);
 static void free_ldst_entry (struct ls_expr *);
-static void free_ldst_mems

Re: PR lto/50964: [trans-mem] fail gracefully when -flto and -fgnu-tm

2011-11-06 Thread Diego Novillo
On Sun, Nov 6, 2011 at 18:56, Andrew Pinski  wrote:

> Why not just fix the issue instead of erroring out?  No other option
> has issues with LTO other than TM.  In fact I think this should have
> been a merge blocker really.

I disagree.  TM is a new experimental feature.  It is fine if it is
not supported with all optimization options.  There are other features
not supported with LTO after all (optimization option nodes, omp
clauses, etc).

In the context of TM, support of LTO and FDO will certainly be an
important optimization vector for future work, so I don't expect it to
be unsupported for long.


Diego.


Re: [v3] define string::pop_back()

2011-11-06 Thread Jonathan Wakely
I first posted this a month ago; this version adjusts the exports and moves
the tests under separate char and wchar_t directories as requested by
Paolo.

* include/bits/basic_string.h (basic_string::at): Move adjacent to other
overload.
(basic_string::pop_back): Define.
* include/debug/string (__gnu_debug::basic_string::pop_back): Likewise.
* include/ext/vstring.h (__versa_string::pop_back): Likewise.
* config/abi/pre/gnu.ver: Add new symbols.
* testsuite/21_strings/basic_string/modifiers/char/pop_back.cc: New.
* testsuite/21_strings/basic_string/modifiers/wchar_t/pop_back.cc: New.
* testsuite/21_strings/basic_string/range_access.cc: Split to ...
* testsuite/21_strings/basic_string/range_access/char/1.cc: Here and ...
* testsuite/21_strings/basic_string/range_access/wchar_t/1.cc: Here.
* testsuite/ext/vstring/modifiers/char/pop_back.cc: New.
* testsuite/ext/vstring/modifiers/wchar_t/pop_back.cc: New.

Tested x86_64-linux, committed to trunk.
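
A minimal usage sketch of the new member, assuming a C++11 compiler
(chomp is a hypothetical helper, not part of the patch).  Since
pop_back() requires a non-empty string, the caller checks first:

```cpp
#include <string>

// Hypothetical helper: strip one trailing newline, if present.
inline std::string chomp(std::string s)
{
  // pop_back() has a non-empty precondition, so test empty() before
  // looking at back() or removing anything.
  if (!s.empty() && s.back() == '\n')
    s.pop_back();
  return s;
}
```

In debug mode the __glibcxx_check_nonempty() call in the patch would
catch a pop_back() on an empty string; in normal mode it is simply
undefined, hence the explicit check here.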
Index: include/bits/basic_string.h
===
--- include/bits/basic_string.h	(revision 181047)
+++ include/bits/basic_string.h	(working copy)
@@ -865,6 +865,26 @@
 	return _M_data()[__n];
   }
 
+  /**
+   *  @brief  Provides access to the data contained in the %string.
+   *  @param __n The index of the character to access.
+   *  @return  Read/write reference to the character.
+   *  @throw  std::out_of_range  If @a n is an invalid index.
+   *
+   *  This function provides for safer data access.  The parameter is
+   *  first checked that it is in the range of the string.  The function
+   *  throws out_of_range if the check fails.  Success results in
+   *  unsharing the string.
+   */
+  reference
+  at(size_type __n)
+  {
+	if (__n >= size())
+	  __throw_out_of_range(__N("basic_string::at"));
+	_M_leak();
+	return _M_data()[__n];
+  }
+
 #ifdef __GXX_EXPERIMENTAL_CXX0X__
   /**
*  Returns a read/write reference to the data at the first
@@ -899,26 +919,6 @@
   { return operator[](this->size() - 1); }
 #endif
 
-  /**
-   *  @brief  Provides access to the data contained in the %string.
-   *  @param __n The index of the character to access.
-   *  @return  Read/write reference to the character.
-   *  @throw  std::out_of_range  If @a n is an invalid index.
-   *
-   *  This function provides for safer data access.  The parameter is
-   *  first checked that it is in the range of the string.  The function
-   *  throws out_of_range if the check fails.  Success results in
-   *  unsharing the string.
-   */
-  reference
-  at(size_type __n)
-  {
-	if (__n >= size())
-	  __throw_out_of_range(__N("basic_string::at"));
-	_M_leak();
-	return _M_data()[__n];
-  }
-
   // Modifiers:
   /**
*  @brief  Append a string to this string.
@@ -1394,7 +1394,18 @@
   iterator
   erase(iterator __first, iterator __last);
  
+#ifdef __GXX_EXPERIMENTAL_CXX0X__
   /**
+   *  @brief  Remove the last character.
+   *
+   *  The string must be non-empty.
+   */
+  void
+  pop_back()
+  { erase(size()-1, 1); }
+#endif // __GXX_EXPERIMENTAL_CXX0X__
+
+  /**
*  @brief  Replace characters with value from another string.
*  @param __pos  Index of first character to replace.
*  @param __n  Number of characters to be replaced.
Index: include/debug/string
===
--- include/debug/string	(revision 181047)
+++ include/debug/string	(working copy)
@@ -580,6 +580,16 @@
   return iterator(__res, this);
 }
 
+#ifdef __GXX_EXPERIMENTAL_CXX0X__
+void
+pop_back()
+{
+  __glibcxx_check_nonempty();
+  _Base::pop_back();
+  this->_M_invalidate_all();
+}
+#endif // __GXX_EXPERIMENTAL_CXX0X__
+
 basic_string&
 replace(size_type __pos1, size_type __n1, const basic_string& __str)
 {
Index: include/ext/vstring.h
===
--- include/ext/vstring.h	(revision 181047)
+++ include/ext/vstring.h	(working copy)
@@ -1140,7 +1140,18 @@
 	return iterator(this->_M_data() + __pos);
   }
 
+#ifdef __GXX_EXPERIMENTAL_CXX0X__
   /**
+   *  @brief  Remove the last character.
+   *
+   *  The string must be non-empty.
+   */
+  void
+  pop_back()
+  { this->_M_erase(size()-1, 1); }
+#endif // __GXX_EXPERIMENTAL_CXX0X__
+
+  /**
*  @brief  Replace characters with value from another string.
*  @param __pos  Index of first character to replace.
*  @param __n  Number of characters to be replaced.
Index: config/abi/pre/gnu.ver
===
--- config/abi/pre/gnu.ver	(revision 181047)

Re: PR lto/50964: [trans-mem] fail gracefully when -flto and -fgnu-tm

2011-11-06 Thread Andrew Pinski
On Sun, Nov 6, 2011 at 12:03 PM, Aldy Hernandez  wrote:
> As per your comment in the PR...
>
> OK for branch?


Why not just fix the issue instead of erroring out?  No other option
has issues with LTO other than TM.  In fact I think this should have
been a merge blocker really.

Thanks,
Andrew Pinski


Implement C1X _Alignas, _Alignof, max_align_t, stdalign.h

2011-11-06 Thread Joseph S. Myers
This patch adds support for most of the C1X alignment features
(_Alignas, _Alignof, max_align_t, stdalign.h).

_Alignof is essentially GNU __alignof__, but restricted to be applied
to types not expressions and not allowed on function types.  It goes
through the existing code, but with diagnostics for these cases (if
pedantic) added along with such checks for use with pre-C1X standards
(if pedantic).  C++11 alignof also disallows use on function types,
which was wrongly accepted before this patch; the fix is in common
code and I added a C++ testcase for it.

_Alignas is similar to the "aligned" attribute, but the value 0 is
allowed as an operand.  The checking code is shared with that
attribute; a previously missing check that the operand has integer
type (recall that integer constants cast to pointer types are
represented as INTEGER_CSTs with pointer type) is added as it's
required for _Alignas but also seems appropriate for the attribute.

stddef.h gets a new type max_align_t (also in C++11 so enabled for
that as well).  I hope the definition here is appropriate for all
supported targets.

typedef struct {
  long long __max_align_ll __attribute__((__aligned__(__alignof__(long long))));
  long double __max_align_ld __attribute__((__aligned__(__alignof__(long double))));
} max_align_t;

(The attributes there are because some targets may give types lower
alignment inside structures and unions than outside; 32-bit x86 in
particular.)
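
As a rough illustration of the semantics described above, here is a
small sketch using the C++11 spellings (alignas, alignof,
std::max_align_t) rather than the C1X ones.  It is not part of the
patch; OverAligned, is_aligned and the value 16 are made up for the
example.

```cpp
#include <cstddef>   // std::max_align_t, std::size_t
#include <cstdint>   // std::uintptr_t

// Hypothetical type: the member asks for stricter alignment than a
// char array needs, which raises the alignment of the whole struct.
struct OverAligned
{
  alignas(16) char buf[4];
};

// max_align_t is at least as strictly aligned as every scalar type,
// which is what the two members of the typedef above are meant to
// guarantee.
static_assert(alignof(std::max_align_t) >= alignof(long long),
              "max_align_t covers long long");
static_assert(alignof(std::max_align_t) >= alignof(long double),
              "max_align_t covers long double");
static_assert(alignof(OverAligned) == 16,
              "alignas on a member raises the struct alignment");

// Hypothetical helper: true if P is a multiple of A bytes.
inline bool is_aligned(const void* p, std::size_t a)
{
  return reinterpret_cast<std::uintptr_t>(p) % a == 0;
}
```

Note that alignof here takes a type, matching the restriction the
patch places on _Alignof.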

As with stdnoreturn.h, the contents of stdalign.h are conditioned out
for C++; I'll leave it to C++ people to work out what's most useful
there if something nonempty is wanted (stdnoreturn.h is empty for C++,
stdbool.h defines _Bool and bool to bool, true to true etc.).  Also as
with that header, I did nothing about adding it to USER_H in the MIPS
SDE configuration, which I suggested there
 should be
deprecated.

Missing from this patch are support for _Alignas in structures and
unions, and checks that alignments are supported in the context in
which they are used.  Both of these involve issues I've raised with
the WG14 reflector; for the former, the C1X syntax is missing
productions to allow _Alignas in structures although it appears to be
intended to be allowed.  For the latter, the relevant context given
alignment in structures ought to be the declaration of an object using
the structure type rather than the type declaration itself, though
"context" is only mentioned in constraints for _Alignas declarations.

I think the right way of making the checks is unconditional errors if
an object with static storage duration has an alignment greater than
MAX_OFILE_ALIGNMENT (presently just a warning), or an object with
automatic storage duration has an alignment greater than
MAX_STACK_ALIGNMENT (presently the alignment is quietly ignored) -
whether those alignments come from _Alignas or from the existing
attributes; I don't think the existing behavior of compiling code
without following the specified alignment is sensible.  But obviously
changing this does risk breaking existing code that specifies more
alignment than it needs.

The checks for alignof not being used pre-C1X if pedantic are
conditional on the _Alignof spelling being used (rather than the
existing __alignof or __alignof__) because otherwise they break
bootstrap (code such as ansidecl.h expects to use __alignof__ without
__extension__).

Bootstrapped with no regressions on x86_64-unknown-linux-gnu.  Applied
to mainline.

2011-11-06  Joseph Myers  

* c-decl.c (shadow_tag_warned, grokdeclarator): Handle _Alignas
specifiers.
(build_null_declspecs): Initialize align_log and alignas_p fields.
(declspecs_add_alignas): New.
* c-parser.c (c_token_starts_declspecs): Handle RID_ALIGNAS.
(c_parser_declspecs): Handle _Alignas specifiers.
(c_parser_alignas_specifier): New.
(c_parser_alignof_expression): Diagnose alignof use for non-C1X.
Diagnose _Alignof (expression).
* c-tree.h (struct c_declspecs): Add align_log and alignas_p
fields.
(declspecs_add_alignas): Declare.
* ginclude/stddef.h (max_align_t): Define for C1X and C++11.
* ginclude/stdalign.h: New.
* Makefile.in (USER_H): Add stdalign.h.

c-family:
2011-11-06  Joseph Myers  

* c-common.c (c_common_reswords): Add _Alignas and _Alignof.
(c_sizeof_or_alignof_type): Diagnose alignof applied to a function
type.
(check_user_alignment): New.  Split out of
handle_aligned_attribute.  Disallow integer constants with
noninteger types.  Conditionally allow zero.
(handle_aligned_attribute): Use check_user_alignment.
* c-common.h (RID_ALIGNAS, check_user_alignment): New.

testsuite:
2011-11-06  Joseph Myers  

* g++.dg/cpp0x/alignof3.C, gcc.dg/c1x-align-1.c,
gcc.dg/c1x-align-2.c, gcc.dg/c1x-align-3.c, gcc.dg/c1x-align-4.c,
gcc

Re: [patch] 6/n: trans-mem: runtime

2011-11-06 Thread Torvald Riegel
On Sun, 2011-11-06 at 23:04 +, Joseph S. Myers wrote:
> On Sun, 6 Nov 2011, Torvald Riegel wrote:
> 
> > Is the attached patch what you'd like to see? It doesn't yet use the
> 
> It's plausible, but really a build system maintainer should look at it.
> 

So, can we keep this as-is then and fix this after the TM merge
(together with the similar bits in libgfortran, libgomp, and
libstdc++-v3), or is this a stopper?
If the latter, could the build system maintainers please look at this
soon?

Torvald



Re: repost: [DF] Use HARD_REG_SETs instead of bitmaps

2011-11-06 Thread Dimitrios Apostolou

On Mon, 7 Nov 2011, Jakub Jelinek wrote:
> On Mon, Nov 07, 2011 at 12:01:29AM +0200, Dimitrios Apostolou wrote:
> > On Sun, 6 Nov 2011, Joern Rennecke wrote:
> > > But where HARD_REG_SETS make no material difference in speed, and the
> > > compilation unit has no other tight coupling with tm.h, it would really
> > > be cleaner to move from HARD_REG_SETS to a target-independent type,
> > > like sbitmap or bitmap.  Maybe we want something more lightweight than
> > > an sbitmap, like passing in a HARD_REG_ELT_TYPE *base, int n_elements
> > > pair to describe the data - n_elements would be set to HARD_REG_SET_LONGS.
> > > When you are doing anything time-critical with hard register sets,
> > > there is little point having the same size stored in multiple places,
> > > you might as well keep it in a register and pass it around.
> > > Basic operations could presumably be inlined, so the main overhead would
> > > be loop overhead.  With the right loop unrolling applied, these loops
> > > should be well predictable.
> >
> > Working with bitmaps in gcc made me really miss the simple bitmap.h
> > of the linux kernel. I'm not sure if we can borrow from there. More
> > info at:
> >
> > http://lxr.linux.no/#linux+v3.1/include/linux/bitmap.h
>
> Hm, what else is hard-reg-set.h?

It's about the same performance-wise, but:

* much uglier because of all the #ifdef cruft
* less useful since it can't be used as a generic bitmap
** Closest generic bitmap is sbitmap.h, but IMHO it's more complex than it
should be.

I'd love to have the kernel bitmap.h in libiberty.

> When you are dealing with pseudos etc.
> sparse bitmaps are very often better.

Indeed. They are much better memory-wise, but they have a non-negligible
overhead performance-wise. I've noted somewhere to replace all hot bitmaps
with ebitmaps and see what happens, whenever I have the time...



Dimitris



[v3] fix autoconf examples for testing C++11 support

2011-11-06 Thread Jonathan Wakely
The autoconf examples at
http://gcc.gnu.org/onlinedocs/libstdc++/manual/backwards.html#id541888
are incorrect, relying on pre-standard semantics for rvalue
references, and missing out several C++11 headers.  This patch
corrects those examples.

This also tweaks some markup in the file, and adds some xml:id
attributes to ensure stable anchors when the HTML is regenerated, so
the link above will become
http://gcc.gnu.org/onlinedocs/libstdc++/manual/backwards.html#backwards.third.support_cxx11
and stay the same in future.

* doc/xml/manual/backwards_compatibility.xml: Fix autoconf tests for
C++11 compiler features and library headers. Add stable id
attributes. Use  element for headers and surround in angle
brackets. Use  for classes.
* doc/html/*: Regenerate.

Committed to trunk.
Index: doc/xml/manual/backwards_compatibility.xml
===
--- doc/xml/manual/backwards_compatibility.xml	(revision 181041)
+++ doc/xml/manual/backwards_compatibility.xml	(working copy)
@@ -42,24 +42,24 @@ Committee couldn't include everything, a
 
 Portability notes and known implementation limitations are as follows.
 
-No ios_base
+No ios_base
   
 
  At least some older implementations don't have std::ios_base, so you should use std::ios::badbit, std::ios::failbit and std::ios::eofbit and std::ios::goodbit.
 
 
 
-No cout in ostream.h, no cin in istream.h
+No cout in , no cin in 
 
 
 
 	In earlier versions of the standard,
-	fstream.h,
-	ostream.h
-	and istream.h
+	,
+	
+	and 
 	used to define
 	cout, cin and so on. ISO C++ specifies that one needs to include
-	iostream
+	
 	explicitly to get the required definitions.
  
  Some include adjustment may be required.
@@ -96,7 +96,7 @@ considered replaced and rewritten.
   Portability notes and known implementation limitations are as follows.
 
 
-Namespace std:: not supported
+Namespace std:: not supported
   
 
   
@@ -114,7 +114,7 @@ considered replaced and rewritten.
 First, see if the compiler has a flag for this. Namespace
 back-portability-issues are generally not a problem for g++
 compilers that do not have libstdc++ in std::, as the
-compilers use -fno-honor-std (ignore
+compilers use -fno-honor-std (ignore
 std::, :: = std::) by default. That is,
 the responsibility for enabling or disabling std:: is
 on the user; the maintainer does not have to care about it. This
@@ -182,7 +182,7 @@ AC_DEFUN([AC_CXX_NAMESPACE_STD], [
 
 
 
-Illegal iterator usage
+Illegal iterator usage
 
 
   The following illustrate implementation-allowed illegal iterator
@@ -212,12 +212,12 @@ AC_DEFUN([AC_CXX_NAMESPACE_STD], [
 
 
 
-isspace from cctype is a macro
+isspace from  is a macro
   
   
 
   
-Glibc 2.0.x and 2.1.x define ctype.h functionality as macros
+Glibc 2.0.x and 2.1.x define  functionality as macros
 (isspace, isalpha etc.).
   
 
@@ -242,7 +242,7 @@ std:: (__ctype_b[(int) ( ( 'X' ) )] &
 
 
   A solution is to modify a header-file so that the compiler tells
-  ctype.h to define functions
+   to define functions
   instead of macros:
 
 
@@ -254,20 +254,21 @@ std:: (__ctype_b[(int) ( ( 'X' ) )] &
 
 
 
-  Then, include ctype.h
+  Then, include 
 
 
 
   Another problem arises if you put a using namespace
-  std; declaration at the top, and include ctype.h. This will result in
-  ambiguities between the definitions in the global namespace
-  (ctype.h) and the
+  std; declaration at the top, and include
+  . This will
+  result in ambiguities between the definitions in the global namespace
+  () and the
   definitions in namespace std::
   ().
 
 
 
-No vector::at, deque::at, string::at
+No vector::at, deque::at, string::at
 
 
 
@@ -304,7 +305,7 @@ AC_DEFINE(HAVE_CONTAINER_AT)],
 
 
 
-No std::char_traits::eof
+No std::char_traits::eof
 
 
 
@@ -321,7 +322,7 @@ AC_DEFINE(HAVE_CONTAINER_AT)],
 
 
 
-No string::clear
+No string::clear
 
 
 
@@ -351,7 +352,7 @@ erase(size_type __pos = 0, size_type __n
 
 
 
-
+
   Removal of ostream::form and istream::scan
   extensions
 
@@ -362,14 +363,14 @@ erase(size_type __pos = 0, size_type __n
 
 
 
-No basic_stringbuf, basic_stringstream
+No basic_stringbuf, basic_stringstream
 
 
 
   Although the ISO standard i/ostringstream-classes are
-  provided, (sstream), for
+  provided, (), for
   compatibility with older implementations the pre-ISO
-  i/ostrstream (strstream) interface is also provided,
+  i/ostrstream () interface is also provided,
   with these caveats:
 
 
@@ -490,7 +491,7 @@ particular info iostream.
 
 
 
-Little or no wide character support
+Little or no wide character support
   
   
 Classes wstring and
@@ -499,7 +500,7 @@ particular info iostream.
   
 
 
-No templatized iostreams
+No templatized iostreams
   
   
 Classes 

[Patch, Fortran, OOP] PR 50919: Don't use vtable for NON_OVERRIDABLE TBP

2011-11-06 Thread Janus Weil
Hi all,

up to now we call all type-bound procedures in a dynamic way, i.e.
through their entry in the vtable. However, for non-overridable
procedures this is not necessary. Since they can not be overridden, a
call to those can be resolved at compile time to an ordinary function
call, without the need of a 'detour' through the vtable. This is what
the attached patch does, thereby removing unneeded overhead and
improving the possibility of optimization (inlining, etc).

The patch actually consists of two parts:
1) The resolve.c part prevents the conversion to a PPC call via the
_vptr (for functions and subroutines).
2) The class.c part prevents adding the non-overridable TBP to the vtable.
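
The difference between the two kinds of call can be sketched in C as
follows. This is only an analogy under invented names (struct shape,
rect_area, shape_perimeter and so on); it is not the code gfortran
generates and does not reflect its actual vtable layout:

```c
#include <assert.h>

struct shape;

/* Hypothetical vtable: one slot per overridable procedure.  */
struct shape_vtab
{
  int (*area) (const struct shape *);
};

struct shape
{
  const struct shape_vtab *vptr;
  int width, height;
};

/* Overridable procedure: reached through the vtable.  */
int
rect_area (const struct shape *s)
{
  return s->width * s->height;
}

/* Stand-in for a NON_OVERRIDABLE procedure: no vtable slot is needed.  */
int
shape_perimeter (const struct shape *s)
{
  return 2 * (s->width + s->height);
}

static const struct shape_vtab rect_vtab = { rect_area };

/* Dynamic dispatch: load the function pointer, then call indirectly.  */
int
area_via_vtable (int width, int height)
{
  struct shape r = { &rect_vtab, width, height };
  assert (r.vptr != 0 && r.vptr->area != 0);
  return r.vptr->area (&r);
}

/* Static dispatch: the callee is known at compile time, so the call is
   an ordinary one that the compiler is free to inline.  */
int
perimeter_direct (int width, int height)
{
  struct shape r = { &rect_vtab, width, height };
  return shape_perimeter (&r);
}
```

Both calls compute the expected values; only the second avoids the
indirection through the function pointer.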

As noted by Tobias, the second part breaks the ABI, so we might
consider deferring it until other ABI-breaking features will be
implemented (cf. http://gcc.gnu.org/wiki/LibgfortranAbiCleanup). On
the other hand, one could argue that the OOP ABI is still quite young
and hasn't really stabilized yet (it was broken already from 4.5 to
4.6), so we might as well break it again. I know that there are a
couple of real-world codes out there which already make use of gfortran's
OOP features, but I have a hard time estimating how many such
projects exist, or how problematic an ABI break would be for them
(user input welcome).

So, the question is: Should I commit both parts, or only the resolve.c
one for now? The patch was regtested on x86_64-unknown-linux-gnu.

Cheers,
Janus


2011-11-06  Janus Weil  

PR fortran/50919
* class.c (add_proc_comp): Don't add non-overridable procedures to the
vtable.
* resolve.c (resolve_typebound_function,resolve_typebound_subroutine):
Don't generate a dynamic _vptr call for non-overridable procedures.

2011-11-06  Janus Weil  

PR fortran/50919
* gfortran.dg/typebound_call_21.f03: New.
Index: gcc/fortran/class.c
===
--- gcc/fortran/class.c	(revision 181043)
+++ gcc/fortran/class.c	(working copy)
@@ -288,6 +288,10 @@ static void
 add_proc_comp (gfc_symbol *vtype, const char *name, gfc_typebound_proc *tb)
 {
   gfc_component *c;
+
+  if (tb->non_overridable)
+return;
+  
   c = gfc_find_component (vtype, name, true, true);
 
   if (c == NULL)
Index: gcc/fortran/resolve.c
===
--- gcc/fortran/resolve.c	(revision 181044)
+++ gcc/fortran/resolve.c	(working copy)
@@ -5868,11 +5868,13 @@ resolve_typebound_function (gfc_expr* e)
   const char *name;
   gfc_typespec ts;
   gfc_expr *expr;
+  bool overridable;
 
   st = e->symtree;
 
   /* Deal with typebound operators for CLASS objects.  */
   expr = e->value.compcall.base_object;
+  overridable = !e->value.compcall.tbp->non_overridable;
   if (expr && expr->ts.type == BT_CLASS && e->value.compcall.name)
 {
   /* Since the typebound operators are generic, we have to ensure
@@ -5923,22 +5925,26 @@ resolve_typebound_function (gfc_expr* e)
 return FAILURE;
   ts = e->ts;
 
-  /* Then convert the expression to a procedure pointer component call.  */
-  e->value.function.esym = NULL;
-  e->symtree = st;
+  if (overridable)
+{
+  /* Convert the expression to a procedure pointer component call.  */
+  e->value.function.esym = NULL;
+  e->symtree = st;
 
-  if (new_ref)  
-e->ref = new_ref;
+  if (new_ref)  
+	e->ref = new_ref;
 
-  /* '_vptr' points to the vtab, which contains the procedure pointers.  */
-  gfc_add_vptr_component (e);
-  gfc_add_component_ref (e, name);
+  /* '_vptr' points to the vtab, which contains the procedure pointers.  */
+  gfc_add_vptr_component (e);
+  gfc_add_component_ref (e, name);
 
-  /* Recover the typespec for the expression.  This is really only
- necessary for generic procedures, where the additional call
- to gfc_add_component_ref seems to throw the collection of the
- correct typespec.  */
-  e->ts = ts;
+  /* Recover the typespec for the expression.  This is really only
+	necessary for generic procedures, where the additional call
+	to gfc_add_component_ref seems to throw the collection of the
+	correct typespec.  */
+  e->ts = ts;
+}
+
   return SUCCESS;
 }
 
@@ -5957,11 +5963,13 @@ resolve_typebound_subroutine (gfc_code *code)
   const char *name;
   gfc_typespec ts;
   gfc_expr *expr;
+  bool overridable;
 
   st = code->expr1->symtree;
 
   /* Deal with typebound operators for CLASS objects.  */
   expr = code->expr1->value.compcall.base_object;
+  overridable = !code->expr1->value.compcall.tbp->non_overridable;
   if (expr && expr->ts.type == BT_CLASS && code->expr1->value.compcall.name)
 {
   /* Since the typebound operators are generic, we have to ensure
@@ -6006,22 +6014,26 @@ resolve_typebound_subroutine (gfc_code *code)
 return FAILURE;
   ts = code->expr1->ts;
 
-  /* Then convert the expression to a procedure pointer component call.  */
-  co

Re: [patch] 6/n: trans-mem: runtime

2011-11-06 Thread Joseph S. Myers
On Sun, 6 Nov 2011, Torvald Riegel wrote:

> Is the attached patch what you'd like to see? It doesn't yet use the

It's plausible, but really a build system maintainer should look at it.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: repost: [DF] Use HARD_REG_SETs instead of bitmaps

2011-11-06 Thread Jakub Jelinek
On Mon, Nov 07, 2011 at 12:01:29AM +0200, Dimitrios Apostolou wrote:
> On Sun, 6 Nov 2011, Joern Rennecke wrote:
> >But where HARD_REG_SETS make no material difference in speed, and the
> >compilation unit has no other tight coupling with tm.h, it would really
> >be cleaner to move from HARD_REG_SETS to a target-independent type,
> >like sbitmap or bitmap.  Maybe we want something more lightweight than
> >an sbitmap, like passing in a HARD_REG_ELT_TYPE *base, int n_elements
> >pair to describe the data - n_elements would be set to HARD_REG_SET_LONGS.
> >When you are doing anything time-critical with hard register sets,
> >there is little point having the same size stored in multiple places,
> >you might as well keep it in a register and pass it around.
> >Basic operations could presumably be inlined, so the main overhead would
> >be loop overhead.  With the right loop unrolling applied, these loops
> >should be well predictable.
> 
> Working with bitmaps in gcc made me really miss the simple bitmap.h
> of the linux kernel. I'm not sure if we can borrow from there. More
> info at:
> 
> http://lxr.linux.no/#linux+v3.1/include/linux/bitmap.h

Hm, what else is hard-reg-set.h?  When you are dealing with pseudos etc.
sparse bitmaps are very often better.

Jakub


Re: repost: [DF] Use HARD_REG_SETs instead of bitmaps

2011-11-06 Thread Joern Rennecke

Quoting Dimitrios Apostolou:

> Working with bitmaps in gcc made me really miss the simple bitmap.h of
> the linux kernel. I'm not sure if we can borrow from there. More info
> at:
>
> http://lxr.linux.no/#linux+v3.1/include/linux/bitmap.h


You would need the Copyright holders to assign their Copyright in this
code to the FSF.  Or give a GPL-v3-or-later license and get the FSF to
agree to use this code in GCC in spite of not owning the Copyright.

As sad as this is, with two Free Software projects both using (some version
of) the GPL, it is probably easier to re-invent the wheel.
Or find some matching stuff that the FSF already owns.


[patch tree-optimization 2/2]: Branch-cost optimizations

2011-11-06 Thread Kai Tietz
Hello,

the second patch extends the tree-ssa-ifcombine pass so that it chains up
simple if-and/or-if patterns via associative bitwise-and/or operations.  This
allows, for example, optimization of cases like:

if (c == 0) return 2;
if (c == 1) return 2;
if (c == 2) return 2;
...

as the reassociation pass can now optimize them.
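
Written out by hand, the idea looks roughly like this. The function
names are invented for the sketch, and the final range-test form is an
assumption about what a later pass might produce, not something this
patch is claimed to emit:

```c
#include <assert.h>

/* The chained form from the example above.  */
int
chain_of_ifs (int c)
{
  if (c == 0) return 2;
  if (c == 1) return 2;
  if (c == 2) return 2;
  return 0;
}

/* The same conditions joined with bitwise-or into one test.  */
int
chain_as_bitwise_or (int c)
{
  if ((c == 0) | (c == 1) | (c == 2))
    return 2;
  return 0;
}

/* An equivalent single range test (assumed end result).  */
int
chain_as_range_test (int c)
{
  if ((unsigned int) c <= 2u)
    return 2;
  return 0;
}

/* Count inputs on which the three forms disagree.  */
int
count_chain_mismatches (void)
{
  int c, bad = 0;
  for (c = -1000; c <= 1000; c++)
    if (chain_of_ifs (c) != chain_as_bitwise_or (c)
	|| chain_of_ifs (c) != chain_as_range_test (c))
      bad++;
  return bad;
}
```

All three forms agree on every input in the tested range, so
count_chain_mismatches () returns 0.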

ChangeLog

2011-11-06  Kai Tietz  

* tree-ssa-ifcombine.c (remove_stmt_chain): New helper.
(update_gimple_cond_condtion_from_tree): Likewise.
(stmt_no_side_effects_p): Likewise.
(bb_no_side_effects_p): Use stmt_no_side_effects_p.
(bb_no_side_effects_p_2): New helper function.
(same_phi_args_p_2): Likewise.
(recognize_single_bit_test): Allow equal and not-equal
comparison handling.
(ifcombine_ifandif): Handle equal and not-equal
(X & CST) !=/== 0 optimization.
(ifcombine_ifandif_merge): New helper for tree_ssa_ifmerge_bb.
(ifcombine_iforif_merge): Likewise.
(ifcombine_iforif): Simplify routine.
(tree_ssa_ifmerge_bb): New helper for doing if-branch merging.
(tree_ssa_ifcombine_bb): Adjust pattern-searching for iforif
and ifandif.
(tree_ssa_ifcombine): Add if-branch merging and allow
multiple folding for if-combining.

ChangeLog  testsuite

2011-11-06  Kai Tietz  

* gcc.dg/tree-ssa/phi-opt-2.c: Adjust test.
* gcc.dg/tree-ssa/ifcombine-8.c: New test.
* gcc.dg/tree-ssa/ifcombine-9.c: New test.
* gcc.dg/tree-ssa/ifcombine-10.c: New test.
* gcc.dg/tree-ssa/ifcombine-11.c: New test.
* gcc.dg/tree-ssa/ifcombine-12.c: New test.


Bootstrapped and regression-tested for x86_64-unknown-linux-gnu for all 
languages (including Ada and Obj-C++).  OK to apply?
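
For the "(X & CST) !=/== 0" entry in the ChangeLog above, the intended
equivalence can be checked by hand. The sketch below mirrors the shape
of the new ifcombine testcases; the function names are invented and the
merged forms are written manually rather than taken from a dump:

```c
#include <assert.h>

/* Bit 2 clear and bit 3 set, tested with two nested ifs.  */
int
nested_clear4_set8 (int x)
{
  if ((x & 4) == 0)
    if ((x & 8) != 0)
      return 2;
  return 0;
}

/* The same condition as one masked comparison.  */
int
merged_clear4_set8 (int x)
{
  return (x & 12) == 8 ? 2 : 0;
}

/* Bit 2 set and bit 3 clear, tested with two nested ifs.  */
int
nested_set4_clear8 (int x)
{
  if ((x & 4) != 0)
    if ((x & 8) == 0)
      return 2;
  return 0;
}

/* The same condition as one masked comparison.  */
int
merged_set4_clear8 (int x)
{
  return (x & 12) == 4 ? 2 : 0;
}

/* Count inputs on which a nested form and its merged form disagree.  */
int
count_bit_test_mismatches (void)
{
  int x, bad = 0;
  for (x = 0; x < 64; x++)
    if (nested_clear4_set8 (x) != merged_clear4_set8 (x)
	|| nested_set4_clear8 (x) != merged_set4_clear8 (x))
      bad++;
  return bad;
}
```

The mask 12 is simply 4 | 8, and the compared constant keeps only the
bits that are required to be set.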

Regards,
Kai
 
Index: gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-2.c
===
--- gcc-trunk.orig/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-2.c
+++ gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-2.c
@@ -10,7 +10,7 @@ _Bool f1(_Bool a, _Bool b)
  else
   return 0;
}
-  return 0;
+  return 2;
 }
 
 
Index: gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-10.c
===
--- /dev/null
+++ gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-10.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-ifcombine" } */
+
+int foo (int x)
+{
+  if ((x & 4) == 0)
+if ((x & 8) != 0)
+  /* returning 1 causes phiopt to trigger in */
+  return 2;
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump "\\& 12" "ifcombine" } } */
+/* { dg-final { scan-tree-dump "== 8" "ifcombine" } } */
+/* { dg-final { cleanup-tree-dump "ifcombine" } } */
Index: gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-11.c
===
--- /dev/null
+++ gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-11.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-ifcombine" } */
+
+int foo (int x)
+{
+  if ((x & 4) != 0)
+if ((x & 8) == 0)
+  /* returning 1 causes phiopt to trigger in */
+  return 2;
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump "\\& 12" "ifcombine" } } */
+/* { dg-final { scan-tree-dump "== 4" "ifcombine" } } */
+/* { dg-final { cleanup-tree-dump "ifcombine" } } */
Index: gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-12.c
===
--- /dev/null
+++ gcc-trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-12.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */

[patch tree-optimization 1/2]: Branch-cost optimizations

2011-11-06 Thread Kai Tietz
Hello,

With this patch, branch-cost optimization is moved from the tree AST to
cfgexpand, i.e. to the expansion from gimple to RTL.  This lets us optimize
conditionals in a similar way for all targets and do the final branch-cost
transformation late, where it shows the best effect.
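
The kind of choice involved can be sketched as follows (a hand-written
illustration with invented function names; which form is cheaper
depends on the target's branch cost, and the sketch says nothing about
how cfgexpand actually decides):

```c
#include <assert.h>

/* Short-circuit form: may be expanded as two conditional branches.  */
int
in_range_branches (int x, int lo, int hi)
{
  if (x >= lo && x <= hi)
    return 1;
  return 0;
}

/* Bitwise form: both comparisons are evaluated and combined with &.
   This is only valid because the operands have no side effects and
   cannot trap.  */
int
in_range_bitwise (int x, int lo, int hi)
{
  return (x >= lo) & (x <= hi);
}

/* Count inputs on which the two forms disagree for a given range.  */
int
count_range_mismatches (int lo, int hi)
{
  int x, bad = 0;
  for (x = lo - 20; x <= hi + 20; x++)
    if (in_range_branches (x, lo, hi) != in_range_bitwise (x, lo, hi))
      bad++;
  return bad;
}
```

The two functions return the same value for every input; they differ
only in how many branches a straightforward expansion would need.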

This patch is split into two pieces.  The first adds the BC optimization
to cfgexpand and removes the BC-optimization code from 
fold-const.

The second patch adds to tree-ssa-ifcombine pass the feature to merge simple 
if-and/or-if patterns into associative form.

Two tests in the C testsuite fail due to this patch: the uninit-pred-6_b 
and uninit-pred-6_c testcases.  Those failures are caused by the
jump-threading optimization in the vrp pass.  They could be fixed by the second 
patch, if we moved the ifcombine pass before the first vrp pass.

ChangeLog

2011-11-06  Kai Tietz  

* cfgexpand.c (is_bool_op_p): New helper.
(normalize_truth_condition): Likewise.
(cond_assoc_t): New structure type.
(collect_cond_chain): New helper.
(build_cond_expr): Likewise.
(is_bitwise_binary_simple_combine): Likewise.
(preeval_cond_integral): Likewise.
(combine_conds): Likewise.
(branchcost_optimization_on_conditions): Likewise.
(expand_gimple_cond): Use branchcost_optimization_on_condition
function.
* dojump.c (do_jump): Prevent converting bitwise-and/or
to real iffs for branch-cost bigger than zero.
* fold_const.c (simple_operand_p_2): Improve evaluation
of side-effects and trapping for associative truth-bitwise
binary operations.
(fold_range_test): Remove branch-cost specific code.
(fold_truth_andor_1): Likewise.
(fold_truth_andor): Likewise.

ChangeLog testsuite

2011-11-06  Kai Tietz  

* gcc.dg/pr46909.c: Adjust test.
* gcc.dg/tree-ssa/vrp33.c: Likewise.
* gcc.target/i386/branch-cost1.c: Likewise.
* gcc.target/i386/branch-cost2.c: Likewise.
* gcc.target/i386/branch-cost3.c: Likewise.
* gcc.target/i386/branch-cost4.c: Likewise.

Patch was bootstrapped and regression tested for x86_64-unknown-linux-gnu.  OK 
to apply?

Index: gcc-trunk/gcc/cfgexpand.c
===
--- gcc-trunk.orig/gcc/cfgexpand.c
+++ gcc-trunk/gcc/cfgexpand.c
@@ -1650,6 +1650,651 @@ maybe_cleanup_end_of_block (edge e, rtx 
 }
 }
 
+/* Check if statement can be considered as a "simple" one.  Simples are:
+   - minimal invariant
+   - any non-SSA_NAME variant
+   - any SSA_NAME variant without a definition statement
+   - any SSA_NAME with default definition.
+   - an assignment of kind ~X, if X is minimal invariant, or has no
+ definition statement.  We exclude here floating point types for X
+ and Y, as ~ (X cmp Y) can have special meaning on floats.
+   - an assignment of kind X ^ ~0, if X is minimal invariant, or has no
+ definition statement.  */
+
+static bool
+is_bool_op_p (tree op, bool *is_not)
+{
+  gimple s;
+  enum tree_code code;
+
+  *is_not = false;
+
+  /* Reject result types not of boolean kind.  */
+  if (TREE_CODE (TREE_TYPE (op)) != BOOLEAN_TYPE)
+return false;
+
+  if (is_gimple_min_invariant (op)
+  || TREE_CODE (op) != SSA_NAME
+  || SSA_NAME_IS_DEFAULT_DEF (op)
+  || (s = SSA_NAME_DEF_STMT (op)) == NULL)
+return true;
+
+  /* Reject statement which isn't of kind assign.  */
+  if (!is_gimple_assign (s))
+return false;
+
+  code = gimple_assign_rhs_code (s);
+
+  /* See if we have a "simple" logical not.  */
+  if (code == BIT_NOT_EXPR)
+*is_not = true;
+  else if (code

[v3] corrections to C++11 status table

2011-11-06 Thread Jonathan Wakely
We don't provide  or  - the latter is listed as "C
library dependency" but AFAIK we don't provide  even if
 is present.

* doc/xml/manual/status_cxx2011.xml: Document  and
 as missing.

Committed to trunk.
Index: doc/xml/manual/status_cxx2011.xml
===
--- doc/xml/manual/status_cxx2011.xml	(revision 181041)
+++ doc/xml/manual/status_cxx2011.xml	(working copy)
@@ -253,10 +253,11 @@ particular release.
   
 
 
+  
   18.10
   Other runtime support
-  Y
-  
+  Partial
+  Missing  
 
 
   
@@ -1141,10 +1142,13 @@ particular release.
   
 
 
+  
   21.7
   Null-terminated sequence utilities
-  Y
-  C library dependency
+  Partial
+  C library dependency.
+  Missing 
+  
 
 
   


Re: repost: [DF] Use HARD_REG_SETs instead of bitmaps

2011-11-06 Thread Dimitrios Apostolou

On Sun, 6 Nov 2011, Joern Rennecke wrote:

> But where HARD_REG_SETS make no material difference in speed, and the
> compilation unit has no other tight coupling with tm.h, it would really
> be cleaner to move from HARD_REG_SETS to a target-independent type,
> like sbitmap or bitmap.  Maybe we want something more lightweight than
> an sbitmap, like passing in a HARD_REG_ELT_TYPE *base, int n_elements
> pair to describe the data - n_elements would be set to HARD_REG_SET_LONGS.
> When you are doing anything time-critical with hard register sets,
> there is little point having the same size stored in multiple places,
> you might as well keep it in a register and pass it around.
> Basic operations could presumably be inlined, so the main overhead would
> be loop overhead.  With the right loop unrolling applied, these loops
> should be well predictable.


Working with bitmaps in gcc made me really miss the simple bitmap.h of 
the linux kernel. I'm not sure if we can borrow from there. More info at:


http://lxr.linux.no/#linux+v3.1/include/linux/bitmap.h


Dimitris


P.S. we badly need a gxr site :-)


Re: repost: [DF] Use HARD_REG_SETs instead of bitmaps

2011-11-06 Thread Joern Rennecke

Quoting Jakub Jelinek:

> The middle-end uses HARD_REG_SET in lots of places, so supposedly
> when you want to support more than one target, you need to ensure
> that hard-reg-set.h header won't use FIRST_PSEUDO_REGISTER of one randomly
> selected target you want to support, but instead somehow computed
> MAX_FIRST_PSEUDO_REGISTER which is a maximum of FIRST_PSEUDO_REGISTER
> values for all targets you want to support.


This can also be handled by compiling all the affected compilation units
once for each target, using target-specifically mangled function names.
C++ namespaces can be used to make this a fairly mechanical process,
as I have shown earlier.
But where HARD_REG_SETS make no material difference in speed, and the
compilation unit has no other tight coupling with tm.h, it would really
be cleaner to move from HARD_REG_SETS to a target-independent type,
like sbitmap or bitmap.  Maybe we want something more lightweight than
an sbitmap, like passing in a HARD_REG_ELT_TYPE *base, int n_elements
pair to describe the data - n_elements would be set to HARD_REG_SET_LONGS.
When you are doing anything time-critical with hard register sets,
there is little point having the same size stored in multiple places,
you might as well keep it in a register and pass it around.
Basic operations could presumably be inlined, so the main overhead would
be loop overhead.  With the right loop unrolling applied, these loops
should be well predictable.
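
A minimal sketch of such a base/n_elements interface, with invented
names throughout (regset_elt stands in for HARD_REG_ELT_TYPE, and none
of these functions exist in GCC):

```c
#include <assert.h>
#include <limits.h>

/* Stand-in for HARD_REG_ELT_TYPE; the real type is target-configured.  */
typedef unsigned long regset_elt;

#define REGSET_ELT_BITS ((int) (sizeof (regset_elt) * CHAR_BIT))

/* Set bit REGNO in the set described by BASE and N_ELEMENTS.  */
void
regset_set_bit (regset_elt *base, int n_elements, int regno)
{
  assert (regno >= 0 && regno / REGSET_ELT_BITS < n_elements);
  base[regno / REGSET_ELT_BITS]
    |= (regset_elt) 1 << (regno % REGSET_ELT_BITS);
}

/* Return nonzero if bit REGNO is set.  */
int
regset_test_bit (const regset_elt *base, int n_elements, int regno)
{
  assert (regno >= 0 && regno / REGSET_ELT_BITS < n_elements);
  return (base[regno / REGSET_ELT_BITS]
	  >> (regno % REGSET_ELT_BITS)) & 1;
}

/* DST |= SRC.  The size is passed in, not stored with the data, so the
   loop bound can stay in a register.  */
void
regset_ior (regset_elt *dst, const regset_elt *src, int n_elements)
{
  int i;
  for (i = 0; i < n_elements; i++)
    dst[i] |= src[i];
}

/* Small self-check using two-element sets.  */
int
regset_demo (void)
{
  regset_elt a[2] = { 0, 0 }, b[2] = { 0, 0 };
  regset_set_bit (a, 2, 3);
  regset_set_bit (b, 2, 40);
  regset_ior (a, b, 2);
  return regset_test_bit (a, 2, 3)
	 && regset_test_bit (a, 2, 40)
	 && !regset_test_bit (a, 2, 5);
}
```

Because n_elements is an ordinary argument, a caller working with a
fixed target could pass a constant and let the loop be unrolled.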


> In fact perhaps all uses of FIRST_PSEUDO_REGISTER in the middle-end
> should be that value.


That can be wasteful when you have disparate regset sizes.
I'd rather have a first_pseudo_register data member of the target vector.

This might even make some use cases faster, like when you have a processor
architecture family with lots of vector registers, and you compile for
a variant without vector support (or code that should not touch them),
you might have a much smaller range of registers to loop through if
you are using a specific target vector with a smaller
first_pseudo_register.


Re: [Patch, Fortran] Cleanup of gfc_extend_expr

2011-11-06 Thread Janus Weil
>> I think the patch is fine and can be committed.  But, give
>> Steven a chance to respond before committing.
>
> Thanks, Steve. I think three days should be long enough. Will commit
> later today (if no one protests in the meantime).

Committed as r181044.

Cheers,
Janus


Re: [patch] 19/n: trans-mem: middle end/misc patches (LAST PATCH)

2011-11-06 Thread Aldy Hernandez



Actually, the table organization is irrelevant, because upon
registering of the table in the runtime, we qsort the entire thing.


False.  You get the equivalent of bootstrap comparison mismatches.
If we actually used tm during the bootstrap.

The simplest thing to do is to change the hash this table uses.
E.g. use the DECL_UID right from the start, rather than the pointer.
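
The point about the hash can be illustrated with a toy example. The
struct and both hash functions below are invented for the sketch and do
not correspond to the actual table code:

```c
#include <assert.h>
#include <stdint.h>

/* Toy declaration node carrying a stable, sequentially assigned id.  */
struct decl
{
  unsigned int uid;
};

/* Hashing the address: the value depends on where the node happens to
   be allocated, so table order can differ from one run to the next.  */
unsigned int
hash_by_pointer (const struct decl *d)
{
  return (unsigned int) ((uintptr_t) d >> 3);
}

/* Hashing the id: the value depends only on the declaration itself.  */
unsigned int
hash_by_uid (const struct decl *d)
{
  return d->uid;
}

/* The uid hash is unchanged when the same declaration data lives at a
   different address.  */
int
uid_hash_survives_relocation (void)
{
  struct decl first = { 1234 };
  struct decl second = first;
  return hash_by_uid (&first) == hash_by_uid (&second);
}
```

With the uid-based hash the iteration order of the table no longer
depends on allocation addresses, which is what reproducible output
needs.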


Argh, will fix in a followup patch.


-  if (!computed_goto_p (stmt))
+ if (!computed_goto_p (stmt))
 {
- tree new_dest = main_block_label (gimple_goto_dest (stmt));
- gimple_goto_set_dest (stmt, new_dest);
+ label = gimple_goto_dest (stmt);
+ new_label = main_block_label (label);
+ if (new_label != label)
+   gimple_goto_set_dest (stmt, new_label);


What's the reason for this change?  Optimization?


Yes.  Rth can elaborate if you deem necessary.


Really?  I have no idea what this change achieves.
I actually wonder if this is a merge error.


I won't complain :).  I have reverted the original patch and am 
including it in the final (attached) version I will commit.



+case GIMPLE_TRANSACTION:
+  /* The ABORT edge has a stored label associated with it, otherwise
+the edges are simply redirectable.  */
+  /* ??? We don't really need this label after the cfg is created.  */
+  if (e->flags == 0)
+   gimple_transaction_set_label (stmt, gimple_block_label (dest));


So why set it (and thus keep it live)?


This seems like leftovers from a previous incantation.  However, I'm not 100% 
sure, so I have disabled the code, but left it in a comment.  A full 
bootstrap/regtest revealed no regressions.

rth, do you have any objections to removing this?


I think that the comment is wrong.  We need that edge, and the label updated 
until pass_tm_edges, at which point the GIMPLE_TRANSACTION itself goes away.  
Thus that label is live throughout the life of the GIMPLE_TRANSACTION node.

Delete that ??? comment instead.


Done.


Patch is otherwise ok.


Attached is the final revision of the patch.  I will commit once tests
finish.


thank you.
* tree-cfg.c (verify_gimple_transaction): Verify body.  Move down.
(verify_gimple_in_seq_2): Verify the label in a
GIMPLE_TRANSACTION.
(cleanup_dead_labels): Remove GIMPLE_GOTO hiccup from merge.
* function.h (struct function): Move tm_restart field to struct
gimple_df in tree-flow.h
Move tm_restart_node to tree-flow.h
* tree-flow.h (struct gimple_df): New location for tm_restart
field.
New location for tm_restart_node.
(is_transactional_stmt): Remove.
* trans-mem.c (is_transactional_stmt): Remove.
(make_tm_edge): Field tm_restart is now in gimple_df.
* cfgexpand.c (gimple_expand_cfg): Field tm_restart is now in
cfun->gimple_df.
Free tm_restart.
* cfgexpand.c (expand_gimple_stmt): Field tm_restart is now in
gimple_df.
* ipa-inline.c (can_inline_edge_p): Do not check flag_tm.
* trans-mem.c (is_tm_pure): Check flag_tm.
(is_tm_safe): Same.

Index: ipa-inline.c
===
--- ipa-inline.c(revision 181028)
+++ ipa-inline.c(working copy)
@@ -286,8 +286,7 @@ can_inline_edge_p (struct cgraph_edge *e
 }
   /* TM pure functions should not get inlined if the outer function is
  a TM safe function.  */
-  else if (flag_tm
-  && is_tm_pure (callee->decl)
+  else if (is_tm_pure (callee->decl)
   && is_tm_safe (e->caller->decl))
 {
   e->inline_failed = CIF_UNSPECIFIED;
Index: function.h
===
--- function.h  (revision 181028)
+++ function.h  (working copy)
@@ -467,14 +467,6 @@ extern GTY(()) struct rtl_data x_rtl;
want to do differently.  */
 #define crtl (&x_rtl)
 
-/* This structure is used to map a gimple statement to a label,
-   or list of labels to represent transaction restart.  */
-
-struct GTY(()) tm_restart_node {
-  gimple stmt;
-  tree label_or_list;
-};
-
 struct GTY(()) stack_usage
 {
   /* # of bytes of static stack space allocated by the function.  */
@@ -526,10 +518,6 @@ struct GTY(()) function {
   /* Value histograms attached to particular statements.  */
   htab_t GTY((skip)) value_histograms;
 
-  /* Map gimple stmt to tree label (or list of labels) for transaction
- restart and abort.  */
-  htab_t GTY ((param_is (struct tm_restart_node))) tm_restart;
-
   /* For function.c.  */
 
   /* Points to the FUNCTION_DECL of this function.  */
Index: trans-mem.c
===
--- trans-mem.c (revision 181028)
+++ trans-mem.c (working copy)
@@ -172,9 +172,13 @@ get_attrs_for (const_tree x)
 bool
 is_tm_pure (const_tree x)
 {
-  tree attrs = get_attrs_for (x);
-  if (attrs)
-return lookup_

Re: [patch] 19/n: trans-mem: middle end/misc patches (LAST PATCH)

2011-11-06 Thread Richard Henderson
> On 11/04/11 04:14, Richard Guenther wrote:
>>> new_version = cgraph_create_node (new_decl);
>>>
>>> -   new_version->analyzed = true;
>>> +   new_version->analyzed = old_version->analyzed;
>>
>> Hm?  analyzed means "with body", sure you have a body if you clone.

Incidentally, for TM we also clone functions that do NOT have a body.

An external declaration with __attribute__((transaction_callable))
is an assertion by the user that the transactional clone exists
(or alternately, a directive from the user to generate such a clone
in the file that contains the function).

>>> @@ -2294,6 +2294,7 @@ cgraph_copy_node_for_versioning (struct
>>> new_version->rtl = old_version->rtl;
>>> new_version->reachable = true;
>>> new_version->count = old_version->count;
>>> +   new_version->lowered = true;
>>
>> OTOH this isn't necessarily true.  cgraph exists before lowering.

But no clones are created before lowering.


r~


Re: PR lto/50964: [trans-mem] fail gracefully when -flto and -fgnu-tm

2011-11-06 Thread Diego Novillo

On 11-11-06 16:05 , Aldy Hernandez wrote:



"LTO support is currently not supported with transactional memory"

'support' mentioned one too many times. Maybe 'LTO is currently not
supported with transactional memory'?


Diego.


How is this?


OK.  Thanks.


Diego.


Re: PR lto/50964: [trans-mem] fail gracefully when -flto and -fgnu-tm

2011-11-06 Thread Aldy Hernandez



"LTO support is currently not supported with transactional memory"

'support' mentioned one too many times.  Maybe 'LTO is currently not
supported with transactional memory'?


Diego.


How is this?
* opts.c (finish_options): Error out when using -flto and
-fgnu-tm.

Index: opts.c
===
--- opts.c  (revision 181028)
+++ opts.c  (working copy)
@@ -784,6 +784,8 @@ finish_options (struct gcc_options *opts
 #endif
   if (!opts->x_flag_fat_lto_objects && !HAVE_LTO_PLUGIN)
 error_at (loc, "-fno-fat-lto-objects are supported only with linker plugin.");
+  if (opts->x_flag_tm)
+   error_at (loc, "LTO is currently not supported with transactional memory");
 }
   if ((opts->x_flag_lto_partition_balanced != 0) + (opts->x_flag_lto_partition_1to1 != 0)
        + (opts->x_flag_lto_partition_none != 0) >= 1)


Re: repost: [DF] Use HARD_REG_SETs instead of bitmaps

2011-11-06 Thread Jakub Jelinek
On Sun, Nov 06, 2011 at 03:48:16PM -0500, Joern Rennecke wrote:
> >Or keep HARD_REG_SET type as is and just use a new struct type which
> >contains HARD_REG_SET or HARD_REG_SET * in it.
> >struct hard_reg_set_ptr;
> >void (*live_on_entry) (struct hard_reg_set_ptr *);
> >in the target* headers and
> >struct hard_reg_set_ptr { HARD_REG_SET *set; };
> 
> All these variants don't address the interface problem we'd get in
> the target hooks.  A type used in a target hook should be compatible
> with all targets simultaneously.
> A long * or a void * would be OK, but if you point to a HARD_REG_SET
> or something containing it, you have to consider the one-definition
> rule when it eventually comes to linking and/or loading multiple or
> alternative backends into the compiler.

The middle-end uses HARD_REG_SET in lots of places, so supposedly
when you want to support more than one target, you need to ensure
that hard-reg-set.h header won't use FIRST_PSEUDO_REGISTER of one randomly
selected target you want to support, but instead somehow computed
MAX_FIRST_PSEUDO_REGISTER which is a maximum of FIRST_PSEUDO_REGISTER
values for all targets you want to support.
In fact perhaps all uses of FIRST_PSEUDO_REGISTER in the middle-end
should be that value.
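
The idea above can be sketched roughly as follows.  This is only an
illustration, not GCC code: the two per-target register counts, the
MAX_OF helper and the multi_target_reg_set type are hypothetical names
invented for the example.  The point is that the set is sized from the
maximum over all supported targets, so one layout works for every target.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical register counts for two targets; real values would come
   from each target's FIRST_PSEUDO_REGISTER definition.  */
#define TARGET_A_FIRST_PSEUDO_REGISTER 53
#define TARGET_B_FIRST_PSEUDO_REGISTER 114

/* Size the set for the largest of the supported targets.  */
#define MAX_OF(a, b) ((a) > (b) ? (a) : (b))
#define MAX_FIRST_PSEUDO_REGISTER \
  MAX_OF (TARGET_A_FIRST_PSEUDO_REGISTER, TARGET_B_FIRST_PSEUDO_REGISTER)

typedef unsigned long reg_set_word;
#define REG_SET_WORD_BITS (sizeof (reg_set_word) * CHAR_BIT)
#define REG_SET_WORDS \
  ((MAX_FIRST_PSEUDO_REGISTER + REG_SET_WORD_BITS - 1) / REG_SET_WORD_BITS)

/* Hypothetical stand-in for a HARD_REG_SET that is large enough for
   every supported target.  */
typedef struct
{
  reg_set_word words[REG_SET_WORDS];
} multi_target_reg_set;

/* Mark register REGNO as a member of SET.  */
static void
multi_set_bit (multi_target_reg_set *set, unsigned int regno)
{
  set->words[regno / REG_SET_WORD_BITS]
    |= (reg_set_word) 1 << (regno % REG_SET_WORD_BITS);
}

/* Return true if register REGNO is a member of SET.  */
static bool
multi_test_bit (const multi_target_reg_set *set, unsigned int regno)
{
  return (set->words[regno / REG_SET_WORD_BITS]
          >> (regno % REG_SET_WORD_BITS)) & 1;
}
```

With a layout like this, code shared between backends never depends on
which single target's FIRST_PSEUDO_REGISTER happened to be in scope.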

Jakub


Re: repost: [DF] Use HARD_REG_SETs instead of bitmaps

2011-11-06 Thread Joern Rennecke

Jakub Jelinek:

On Sun, Nov 06, 2011 at 01:12:21PM +0100, Paolo Bonzini wrote:
> What about adding a macro indirection to more functions (like you
> did with SET_HARD_REG_BIT and friends), so that pass-by-value can be
> changed to pass-by-reference without affecting all the uses
> throughout the compiler?

Or keep HARD_REG_SET type as is and just use a new struct type which
contains HARD_REG_SET or HARD_REG_SET * in it.
struct hard_reg_set_ptr;
void (*live_on_entry) (struct hard_reg_set_ptr *);
in the target* headers and
struct hard_reg_set_ptr { HARD_REG_SET *set; };


All these variants don't address the interface problem we'd get in
the target hooks.  A type used in a target hook should be compatible
with all targets simultaneously.
A long * or a void * would be OK, but if you point to a HARD_REG_SET
or something containing it, you have to consider the one-definition
rule when it eventually comes to linking and/or loading multiple or
alternative backends into the compiler.


Re: PR lto/50964: [trans-mem] fail gracefully when -flto and -fgnu-tm

2011-11-06 Thread Diego Novillo
On Sun, Nov 6, 2011 at 15:03, Aldy Hernandez  wrote:
> As per your comment in the PR...
>
> OK for branch?

"LTO support is currently not supported with transactional memory"

'support' mentioned one too many times.  Maybe 'LTO is currently not
supported with transactional memory'?


Diego.


Re: [C++ Patch] PR 47695

2011-11-06 Thread Jason Merrill

OK.

Jason


Re: [patch] 19/n: trans-mem: middle end/misc patches (LAST PATCH)

2011-11-06 Thread Richard Henderson
On 11/06/2011 10:53 AM, Aldy Hernandez wrote:
>> Did you check bootstrapping GCC with TM enabled and address-space
>> randomization turned on?
> 
> Actually, the table organization is irrelevant, because upon
> registering of the table in the runtime, we qsort the entire thing.

False.  You get the equivalent of bootstrap comparison mismatches.
If we actually used tm during the bootstrap.

The simplest thing to do is to change the hash this table uses.
E.g. use the DECL_UID right from the start, rather than the pointer.
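
A minimal sketch of the difference between the two hashes, assuming a
simplified node type.  None of this is GCC code: struct decl,
hash_by_pointer and hash_by_uid are hypothetical names, and the uid
field merely stands in for what DECL_UID provides.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a declaration node.  */
struct decl
{
  unsigned int uid;   /* Stable id, assigned in creation order.  */
  const char *name;
};

/* Address-based hash: the value changes whenever the node lands at a
   different address, e.g. under address-space randomization, so the
   traversal order of a table keyed on it can differ between runs.  */
static size_t
hash_by_pointer (const struct decl *d)
{
  return (size_t) ((uintptr_t) d >> 3);
}

/* UID-based hash: depends only on the stable id, so two runs that
   create the same declarations in the same order agree.  */
static size_t
hash_by_uid (const struct decl *d)
{
  return (size_t) d->uid;
}
```

Two nodes with equal uid hash identically under hash_by_uid wherever
they live in memory, while hash_by_pointer tells them apart purely by
address.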

>>> -  if (!computed_goto_p (stmt))
>>> + if (!computed_goto_p (stmt))
>>> {
>>> - tree new_dest = main_block_label (gimple_goto_dest (stmt));
>>> - gimple_goto_set_dest (stmt, new_dest);
>>> + label = gimple_goto_dest (stmt);
>>> + new_label = main_block_label (label);
>>> + if (new_label != label)
>>> +   gimple_goto_set_dest (stmt, new_label);
>>
>> What's the reason for this change?  Optimization?
> 
> Yes.  Rth can elaborate if you deem necessary.

Really?  I have no idea what this change achieves.
I actually wonder if this is a merge error.

>>> +case GIMPLE_TRANSACTION:
>>> +  /* The ABORT edge has a stored label associated with it, otherwise
>>> +the edges are simply redirectable.  */
>>> +  /* ??? We don't really need this label after the cfg is created.  */
>>> +  if (e->flags == 0)
>>> +   gimple_transaction_set_label (stmt, gimple_block_label (dest));
>>
>> So why set it (and thus keep it live)?
> 
> This seems like leftovers from a previous incarnation.  However, I'm not 100%
> sure, so I have disabled the code, but left it in a comment.  A full
> bootstrap/regtest revealed no regressions.
> 
> rth, do you have any objections to removing this?

I think that the comment is wrong.  We need that edge, and the label updated 
until pass_tm_edges, at which point the GIMPLE_TRANSACTION itself goes away.  
Thus that label is live throughout the life of the GIMPLE_TRANSACTION node.

Delete that ??? comment instead.

Patch is otherwise ok.


r~


Re: Many testsuite failures on x86_64 due recent "fix" about f16cintrin.h header

2011-11-06 Thread Dominique Dhumieres
Following http://gcc.gnu.org/ml/gcc-patches/2011-10/msg02901.html, I have
applied the following patch on x86_64-apple-darwin10:

--- ../_clean/gcc/config.gcc2011-11-05 22:25:37.0 +0100
+++ gcc/config.gcc  2011-11-06 12:35:57.0 +0100
@@ -350,7 +350,7 @@ i[34567]86-*-*)
   immintrin.h x86intrin.h avxintrin.h xopintrin.h
   ia32intrin.h cross-stdarg.h lwpintrin.h popcntintrin.h
   lzcntintrin.h bmiintrin.h bmi2intrin.h tbmintrin.h
-  avx2intrin.h fmaintrin.h"
+  avx2intrin.h fmaintrin.h f16cintrin.h"
;;
 x86_64-*-*)
cpu_type=i386
@@ -363,7 +363,7 @@ x86_64-*-*)
   immintrin.h x86intrin.h avxintrin.h xopintrin.h
   ia32intrin.h cross-stdarg.h lwpintrin.h popcntintrin.h
   lzcntintrin.h bmiintrin.h tbmintrin.h bmi2intrin.h
-  avx2intrin.h fmaintrin.h"
+  avx2intrin.h fmaintrin.h f16cintrin.h"
need_64bit_hwint=yes
;;
 ia64-*-*)
--- ../_clean/gcc/config/i386/f16cintrin.h  2011-11-05 10:03:10.0 +0100
+++ gcc/config/i386/f16cintrin.h2011-11-06 16:55:05.0 +0100
@@ -88,7 +88,8 @@ _mm256_cvtps_ph (__m256 __A, const int _
 
 #define _mm256_cvtps_ph(A, I) \
   ((__m128i) __builtin_ia32_vcvtps2ph256 ((__v8sf)(__m256) A, (int) (I)))
-#endif
+#endif /* __OPTIMIZE__ */
+#endif /* _F16CINTRIN_H_INCLUDED */
 
 #endif /* __F16C__ */
 #endif

(The second part fixes a missing #endif.)  However, I still have most of the
failures:

FAIL: gcc.target/i386/sse-14.c
Excess errors:
/opt/gcc/work/gcc/testsuite/gcc.target/i386/sse-14.c:95:1: error: implicit declaration of function '_cvtss_sh' [-Werror=implicit-function-declaration]
/opt/gcc/work/gcc/testsuite/gcc.target/i386/sse-14.c:96:1: error: implicit declaration of function '_mm_cvtps_ph' [-Werror=implicit-function-declaration]
/opt/gcc/work/gcc/testsuite/gcc.target/i386/sse-14.c:96:1: error: incompatible types when returning type 'int' but '__m128i' was expected
/opt/gcc/work/gcc/testsuite/gcc.target/i386/sse-14.c:97:1: error: implicit declaration of function '_mm256_cvtps_ph' [-Werror=implicit-function-declaration]
/opt/gcc/work/gcc/testsuite/gcc.target/i386/sse-14.c:97:1: error: incompatible types when returning type 'int' but '__m128i' was expected
...
FAIL: gcc.target/i386/testimm-1.c (test for excess errors)
Excess errors:
/opt/gcc/work/gcc/testsuite/gcc.target/i386/testimm-1.c:36:6: error: incompatible types when assigning to type '__m128i' from type 'int'
/opt/gcc/work/gcc/testsuite/gcc.target/i386/testimm-1.c:45:6: error: incompatible types when assigning to type '__m128i' from type 'int'

At this point I have no idea about how to fix those.

Cheers,

Dominique


PR lto/50964: [trans-mem] fail gracefully when -flto and -fgnu-tm

2011-11-06 Thread Aldy Hernandez

As per your comment in the PR...

OK for branch?
* opts.c (finish_options): Error out when using -flto and
-fgnu-tm.

Index: opts.c
===
--- opts.c  (revision 181028)
+++ opts.c  (working copy)
@@ -784,6 +784,8 @@ finish_options (struct gcc_options *opts
 #endif
   if (!opts->x_flag_fat_lto_objects && !HAVE_LTO_PLUGIN)
 error_at (loc, "-fno-fat-lto-objects are supported only with linker plugin.");
+  if (opts->x_flag_tm)
+   error_at (loc, "LTO support is currently not supported with transactional memory");
 }
   if ((opts->x_flag_lto_partition_balanced != 0) + (opts->x_flag_lto_partition_1to1 != 0)
        + (opts->x_flag_lto_partition_none != 0) >= 1)


Re: [PATCH] Don't prevent merging of bbs because of user labels (PR tree-optimization/50693)

2011-11-06 Thread Jakub Jelinek
On Sun, Nov 06, 2011 at 06:14:18PM +0100, Eric Botcazou wrote:
> > As has been suggested by Alexandre in the PR, this patch allows merging
> > basic blocks which couldn't be merged before because of user (non-forced)
> > labels at the beginning of the second basic blocks.
> > With this patch the user label is thrown away (for -g0 or -g
> > -fno-var-tracking-assignments) or turned into a debug bind stmt which
> > contains the label.
> 
> Do we really want to do that if !optimize?

You're right, at -O0 we don't do VTA and thus the user labels would be
dropped on the floor.

Fixed thusly, committed to trunk as obvious:

2011-11-06  Jakub Jelinek  

* tree-cfg.c (gimple_can_merge_blocks_p): For -O0 don't remove
any user labels.

--- gcc/tree-cfg.c.jj   2011-11-04 18:01:25.0 +0100
+++ gcc/tree-cfg.c  2011-11-06 20:37:09.0 +0100
@@ -1454,8 +1454,8 @@ gimple_can_merge_blocks_p (basic_block a
break;
   lab = gimple_label_label (stmt);
 
-  /* Do not remove user forced labels.  */
-  if (!DECL_ARTIFICIAL (lab) && FORCED_LABEL (lab))
+  /* Do not remove user forced labels or for -O0 any user labels.  */
+  if (!DECL_ARTIFICIAL (lab) && (!optimize || FORCED_LABEL (lab)))
return false;
 }
 


Jakub
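
For readers following the condition in the patch above, here is a small
stand-alone sketch of the predicate.  The struct label_info type and the
label_blocks_merge function are hypothetical; the real code queries
DECL_ARTIFICIAL (lab) and FORCED_LABEL (lab) on the label and reads the
global optimize level.

```c
#include <stdbool.h>

/* Hypothetical flattened view of a label.  */
struct label_info
{
  bool artificial;  /* Compiler-generated rather than written by the user.  */
  bool forced;      /* Address taken or otherwise must be preserved.  */
};

/* Return true if LAB prevents merging the two basic blocks.  User
   labels always block merging when forced, and at -O0 any user label
   blocks it, since nothing would preserve the label for debug info.  */
static bool
label_blocks_merge (struct label_info lab, int optimize_level)
{
  return !lab.artificial && (optimize_level == 0 || lab.forced);
}
```

Under this reading, a plain user label stops the merge only at -O0, a
forced user label stops it at every level, and artificial labels never
do.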


Re: [patch] 6/n: trans-mem: runtime

2011-11-06 Thread Torvald Riegel
On Thu, 2011-11-03 at 20:15 +, Joseph S. Myers wrote:
> On Thu, 3 Nov 2011, Aldy Hernandez wrote:
> 
> > Index: libitm/acinclude.m4
> > ===
> > --- libitm/acinclude.m4 (.../trunk) (revision 0)
> > +++ libitm/acinclude.m4 (.../branches/transactional-memory) (revision 180773)
> > @@ -0,0 +1,343 @@
> > +dnl --
> > +dnl This whole bit snagged from libgfortran.
> 
> If you need a configure test in more than one library, do not copy it like 
> this; put a common macro in config/ and use it from both libraries.
> 
> > +dnl --
> > +dnl This whole bit snagged from libstdc++-v3.
> 
> Likewise.  There may well be some bits that for whatever reason need to be 
> similar but different, or that are specific to libitm, but if something 
> can be shared then it should be shared.

Is the attached patch what you'd like to see? It doesn't yet use the
generic macros in libgfortran, libgomp, or libstdc++v3, but this could
be added after the TM merge (and somebody who knows more about the build
system than me should have a look first, probably..).
Sync-built-in checks haven't been generalized yet because libitm will
likely switch to the new cxx-mem-model built-ins. LIBITM_ENABLE_SYMVERS
is different from the libstdc++v3 version. Is GCC_CHECK_LINKER_FEATURES
at the right place in exportcontrols.m4, or different name/file?
Bootstrapped on x86_64 and tm.exp/libitm tested. 

OK for branch?
commit b81687945ccb46c860acf66cbb8d391e0062882c
Author: Torvald Riegel 
Date:   Sun Nov 6 20:32:28 2011 +0100

Generalize AC macros.

libitm/
* acinclude.m4: Move export control checks to
../config/exportcontrols.m4. Use GCC_ENABLE; remove LIBITM_ENABLE.
* configure.ac: Likewise.
* aclocal.m4: Include ../config/exportcontrols.m4.
* configure: Regenerate.

config/
* exportcontrols.m4: New file; based on libgfortran's and
libstdc++v3's acinclude.m4.

index 000..1d9a998
--- /dev/null
+++ b/config/ChangeLog.tm-merge
@@ -0,0 +1,6 @@
+2011-11-07  Torvald Riegel  
+
+   Merged from transactional-memory.
+
+   * exportcontrols.m4: New file; based on libgfortran's and
+   libstdc++v3's acinclude.m4.
diff --git a/config/exportcontrols.m4 b/config/exportcontrols.m4
new file mode 100644
index 000..78cf3ec
--- /dev/null
+++ b/config/exportcontrols.m4
@@ -0,0 +1,182 @@
+dnl --
+dnl This whole bit snagged from libgfortran.
+
+dnl Check whether the target supports hidden visibility.
+AC_DEFUN([GCC_CHECK_ATTRIBUTE_VISIBILITY], [
+  AC_CACHE_CHECK([whether the target supports hidden visibility],
+gcc_cv_have_attribute_visibility, [
+  save_CFLAGS="$CFLAGS"
+  CFLAGS="$CFLAGS -Werror"
+  AC_TRY_COMPILE([void __attribute__((visibility("hidden"))) foo(void) { }],
+[], gcc_cv_have_attribute_visibility=yes,
+gcc_cv_have_attribute_visibility=no)
+  CFLAGS="$save_CFLAGS"])
+  if test $gcc_cv_have_attribute_visibility = yes; then
+AC_DEFINE(HAVE_ATTRIBUTE_VISIBILITY, 1,
+  [Define to 1 if the target supports __attribute__((visibility(...))).])
+  fi])
+
+dnl Check whether the target supports dllexport
+AC_DEFUN([GCC_CHECK_ATTRIBUTE_DLLEXPORT], [
+  AC_CACHE_CHECK([whether the target supports dllexport],
+gcc_cv_have_attribute_dllexport, [
+  save_CFLAGS="$CFLAGS"
+  CFLAGS="$CFLAGS -Werror"
+  AC_TRY_COMPILE([void __attribute__((dllexport)) foo(void) { }],
+[], gcc_cv_have_attribute_dllexport=yes,
+gcc_cv_have_attribute_dllexport=no)
+  CFLAGS="$save_CFLAGS"])
+  if test $gcc_cv_have_attribute_dllexport = yes; then
+AC_DEFINE(HAVE_ATTRIBUTE_DLLEXPORT, 1,
+  [Define to 1 if the target supports __attribute__((dllexport)).])
+  fi])
+
+dnl Check whether the target supports symbol aliases.
+AC_DEFUN([GCC_CHECK_ATTRIBUTE_ALIAS], [
+  AC_CACHE_CHECK([whether the target supports symbol aliases],
+gcc_cv_have_attribute_alias, [
+  AC_TRY_LINK([
+void foo(void) { }
+extern void bar(void) __attribute__((alias("foo")));],
+[bar();], gcc_cv_have_attribute_alias=yes, gcc_cv_have_attribute_alias=no)])
+  if test $gcc_cv_have_attribute_alias = yes; then
+AC_DEFINE(HAVE_ATTRIBUTE_ALIAS, 1,
+  [Define to 1 if the target supports __attribute__((alias(...))).])
+  fi])
+
+
+dnl --
+dnl This whole bit snagged from libstdc++v3.
+
+dnl
+dnl If GNU ld is in use, check to see if tricky linker opts can be used.  If
+dnl the native linker is in use, all variables will be defined to something
+dnl safe (like an empty string).
+dnl
+dnl Defines:
+dnl  SECTION_LDFLAGS='-Wl,--gc-sections' if possible
+d

[committed] Update PA libfunc initialization

2011-11-06 Thread John David Anglin
This update is to allow libfunc initialization for other target
features.

Tested on hppa64-hp-hpux11.11.  Committed to trunk.

Dave
-- 
J. David Anglin  dave.ang...@nrc-cnrc.gc.ca
National Research Council of Canada  (613) 990-0752 (FAX: 952-6602)

2011-11-06  John David Anglin  

* config/pa/pa.c (pa_hpux_init_libfuncs): Rename to pa_init_libfuncs.
Remove dependence of declaration and target define on definition of
HPUX_LONG_DOUBLE_LIBRARY.  Update implementation.

Index: config/pa/pa.c
===
--- config/pa/pa.c  (revision 181013)
+++ config/pa/pa.c  (working copy)
@@ -155,9 +155,7 @@
 #ifdef ASM_OUTPUT_EXTERNAL_REAL
 static void pa_hpux_file_end (void);
 #endif
-#if HPUX_LONG_DOUBLE_LIBRARY
-static void pa_hpux_init_libfuncs (void);
-#endif
+static void pa_init_libfuncs (void);
 static rtx pa_struct_value_rtx (tree, int);
 static bool pa_pass_by_reference (cumulative_args_t, enum machine_mode,
  const_tree, bool);
@@ -316,10 +314,8 @@
 #undef TARGET_MACHINE_DEPENDENT_REORG
 #define TARGET_MACHINE_DEPENDENT_REORG pa_reorg
 
-#if HPUX_LONG_DOUBLE_LIBRARY
 #undef TARGET_INIT_LIBFUNCS
-#define TARGET_INIT_LIBFUNCS pa_hpux_init_libfuncs
-#endif
+#define TARGET_INIT_LIBFUNCS pa_init_libfuncs
 
 #undef TARGET_PROMOTE_FUNCTION_MODE
 #define TARGET_PROMOTE_FUNCTION_MODE pa_promote_function_mode
@@ -5542,47 +5538,56 @@
 }
 }
 
-#if HPUX_LONG_DOUBLE_LIBRARY
-/* Initialize optabs to point to HPUX long double emulation routines.  */
+/* Initialize optabs to point to emulation routines.  */
+
 static void
-pa_hpux_init_libfuncs (void)
+pa_init_libfuncs (void)
 {
-  set_optab_libfunc (add_optab, TFmode, "_U_Qfadd");
-  set_optab_libfunc (sub_optab, TFmode, "_U_Qfsub");
-  set_optab_libfunc (smul_optab, TFmode, "_U_Qfmpy");
-  set_optab_libfunc (sdiv_optab, TFmode, "_U_Qfdiv");
-  set_optab_libfunc (smin_optab, TFmode, "_U_Qmin");
-  set_optab_libfunc (smax_optab, TFmode, "_U_Qfmax");
-  set_optab_libfunc (sqrt_optab, TFmode, "_U_Qfsqrt");
-  set_optab_libfunc (abs_optab, TFmode, "_U_Qfabs");
-  set_optab_libfunc (neg_optab, TFmode, "_U_Qfneg");
+  if (HPUX_LONG_DOUBLE_LIBRARY)
+{
+  set_optab_libfunc (add_optab, TFmode, "_U_Qfadd");
+  set_optab_libfunc (sub_optab, TFmode, "_U_Qfsub");
+  set_optab_libfunc (smul_optab, TFmode, "_U_Qfmpy");
+  set_optab_libfunc (sdiv_optab, TFmode, "_U_Qfdiv");
+  set_optab_libfunc (smin_optab, TFmode, "_U_Qmin");
+  set_optab_libfunc (smax_optab, TFmode, "_U_Qfmax");
+  set_optab_libfunc (sqrt_optab, TFmode, "_U_Qfsqrt");
+  set_optab_libfunc (abs_optab, TFmode, "_U_Qfabs");
+  set_optab_libfunc (neg_optab, TFmode, "_U_Qfneg");
 
-  set_optab_libfunc (eq_optab, TFmode, "_U_Qfeq");
-  set_optab_libfunc (ne_optab, TFmode, "_U_Qfne");
-  set_optab_libfunc (gt_optab, TFmode, "_U_Qfgt");
-  set_optab_libfunc (ge_optab, TFmode, "_U_Qfge");
-  set_optab_libfunc (lt_optab, TFmode, "_U_Qflt");
-  set_optab_libfunc (le_optab, TFmode, "_U_Qfle");
-  set_optab_libfunc (unord_optab, TFmode, "_U_Qfunord");
+  set_optab_libfunc (eq_optab, TFmode, "_U_Qfeq");
+  set_optab_libfunc (ne_optab, TFmode, "_U_Qfne");
+  set_optab_libfunc (gt_optab, TFmode, "_U_Qfgt");
+  set_optab_libfunc (ge_optab, TFmode, "_U_Qfge");
+  set_optab_libfunc (lt_optab, TFmode, "_U_Qflt");
+  set_optab_libfunc (le_optab, TFmode, "_U_Qfle");
+  set_optab_libfunc (unord_optab, TFmode, "_U_Qfunord");
 
-  set_conv_libfunc (sext_optab,   TFmode, SFmode, "_U_Qfcnvff_sgl_to_quad");
-  set_conv_libfunc (sext_optab,   TFmode, DFmode, "_U_Qfcnvff_dbl_to_quad");
-  set_conv_libfunc (trunc_optab,  SFmode, TFmode, "_U_Qfcnvff_quad_to_sgl");
-  set_conv_libfunc (trunc_optab,  DFmode, TFmode, "_U_Qfcnvff_quad_to_dbl");
+  set_conv_libfunc (sext_optab, TFmode, SFmode, "_U_Qfcnvff_sgl_to_quad");
+  set_conv_libfunc (sext_optab, TFmode, DFmode, "_U_Qfcnvff_dbl_to_quad");
+  set_conv_libfunc (trunc_optab, SFmode, TFmode, "_U_Qfcnvff_quad_to_sgl");
+  set_conv_libfunc (trunc_optab, DFmode, TFmode, "_U_Qfcnvff_quad_to_dbl");
 
-  set_conv_libfunc (sfix_optab,   SImode, TFmode, TARGET_64BIT
- ? "__U_Qfcnvfxt_quad_to_sgl"
- : "_U_Qfcnvfxt_quad_to_sgl");
-  set_conv_libfunc (sfix_optab,   DImode, TFmode, "_U_Qfcnvfxt_quad_to_dbl");
-  set_conv_libfunc (ufix_optab,   SImode, TFmode, "_U_Qfcnvfxt_quad_to_usgl");
-  set_conv_libfunc (ufix_optab,   DImode, TFmode, "_U_Qfcnvfxt_quad_to_udbl");
+  set_conv_libfunc (sfix_optab, SImode, TFmode,
+   TARGET_64BIT ? "__U_Qfcnvfxt_quad_to_sgl"
+: "_U_Qfcnvfxt_quad_to_sgl");
+  set_conv_libfunc (sfix_optab, DImode, TFmode,
+   "_U_Qfcnvfxt_quad_to_dbl");
+  set_conv_libfunc (ufix

Re: [PATCH] Fix libgcc_tm.h dependency in libgcc Makefile

2011-11-06 Thread Paolo Bonzini

On 11/06/2011 06:43 PM, John David Anglin wrote:

This fixes a trunk build error noticed on hppa-linux at -j4.  Tested on
hppa-unknown-linux-gnu and hppa64-hp-hpux11.11.

Ok?


Ok.

Paolo


Re: [v3] Re: put __stl_prime_list in comdat section

2011-11-06 Thread Ian Lance Taylor
Xinliang David Li  writes:

> Ping .. (the Nov 7 cutoff date is close ..).

The cutoff date is typically not rigorously applied to patches submitted
before the deadline but approved after the deadline.

However, this patch is OK.

Thanks.

Ian


>
> On Sat, Nov 5, 2011 at 12:22 PM, Xinliang David Li  wrote:
>> thanks. The attached is the revised patch.
>>
>> David
>>
>> On Sat, Nov 5, 2011 at 11:52 AM, Paolo Carlini  
>> wrote:
>>> On 11/05/2011 07:32 PM, Xinliang David Li wrote:

 Hi, the following patch is a follow up to the one posted here
 http://gcc.gnu.org/ml/gcc-patches/2009-05/msg01293.html.

 The new patch is a header only change and can greatly reduce rodata
 section size for some programs.

 Ok for trunk after testing?
>>>
>>> As usual for backward/ stuff, I would say largely Ian's call, but please
>>> don't use get_prime_list unuglified, prefer __get_prime_list, or something
>>> with _S_* as prefix, being a static member.
>>>
>>> Thanks,
>>> Paolo.
>>>
>>> PS: also, make sure to post library patches to libstdc++ too, and, possibly,
>>> add [v3] to the Subject, otherwise you can easily fail to get the attention
>>> of the library people.
>>>
>>


Re: [PATCH Atom] Fix for PR target/50962 (bad AGU stall avoidance)

2011-11-06 Thread Richard Henderson
On 11/06/2011 02:12 AM, Ilya Enkovich wrote:
> It would be great to have computed type here but ix86_use_lea_for_mov
> will check types of other instructions and then call
> extract_insn_cached. It will cause infinite loop, right?

Hum.  That does cause a problem.

No objection to the patch approved by Uros then.


r~


Re: [PATCH] More improvements to sparc VIS vec_init code generation.

2011-11-06 Thread Richard Henderson
On 11/05/2011 07:39 PM, David Miller wrote:
> Richard, is there a better way to represent this in RTL?  These
> instructions basically load a single byte or half-word into the bottom
> of a 64-bit float register, and clear the rest of that register with
> zeros.  So the v4hi one is essentially loading the vector:
> 
>   [(const_int 0) (const_int 0)
>  (const_int 0) (mem:HI (register:P ...))]

Try

(define_insn "*zero_extend_v4hi_vis"
  [(set (match_operand:V4HI 0 "register_operand" "=e")
(vec_merge:V4HI
  (vec_duplicate:V4HI
(match_operand:HI 1 "memory_operand" "m"))
  (match_operand:V4HI 2 "const_zero_operand" "")
  (const_int 14)))]
  ...
)

(define_expand "zero_extend_v4hi_vis"
  [(set (match_operand:V4HI 0 "register_operand" "=e")
(vec_merge:V4HI
  (vec_duplicate:V4HI
(match_operand:HI 1 "memory_operand" "m"))
  (match_dup 2)
  (const_int 14)))]
  "TARGET_VIS"
{
  operands[2] = CONST0_RTX (V4HImode);
})


r~


Re: [patch] 19/n: trans-mem: middle end/misc patches (LAST PATCH)

2011-11-06 Thread Aldy Hernandez

[rth, more comments for you below]

On 11/04/11 04:14, Richard Guenther wrote:


new_version = cgraph_create_node (new_decl);

-   new_version->analyzed = true;
+   new_version->analyzed = old_version->analyzed;


Hm?  analyzed means "with body", sure you have a body if you clone.


new_version->local = old_version->local;
new_version->local.externally_visible = false;
new_version->local.local = true;
@@ -2294,6 +2294,7 @@ cgraph_copy_node_for_versioning (struct
new_version->rtl = old_version->rtl;
new_version->reachable = true;
new_version->count = old_version->count;
+   new_version->lowered = true;


OTOH this isn't necessarily true.  cgraph exists before lowering.


I don't understand what you want me to do on either of these two 
comments.  Could you elaborate?



+  /* TM pure functions should not get inlined if the outer function is
+ a TM safe function.  */
+  else if (flag_tm


Please move flag checks into the respective predicates.  Any reason
why the is_tm_pure () predicate wouldn't already do the correct thing
with !flag_tm?


Done.


+  /* Map gimple stmt to tree label (or list of labels) for transaction
+ restart and abort.  */
+  htab_t GTY ((param_is (struct tm_restart_node))) tm_restart;
+


As this maps 'gimple' to tree, shouldn't this go to fn->gimple_df instead?
That way you avoid growing generic struct function.  Or into eh_status,
if that looks like a better fit.


Done.


+  /* Mark all calls that can have a transaction restart.  */


Why isn't this done when we expand the call?  This walking of the
RTL sequence looks like a hack (an easy one, albeit).


+  if (cfun->tm_restart && is_gimple_call (stmt))
+{
+  struct tm_restart_node dummy;
+  void **slot;
+
+  dummy.stmt = stmt;
+  slot = htab_find_slot (cfun->tm_restart, &dummy, NO_INSERT);
+  if (slot)
+   {
+ struct tm_restart_node *n = (struct tm_restart_node *) *slot;
+ tree list = n->label_or_list;
+ rtx insn;
+
+ for (insn = next_real_insn (last); !CALL_P (insn);
+  insn = next_real_insn (insn))
+   continue;
+
+ if (TREE_CODE (list) == LABEL_DECL)
+   add_reg_note (insn, REG_TM, label_rtx (list));
+ else
+   for (; list ; list = TREE_CHAIN (list))
+ add_reg_note (insn, REG_TM, label_rtx (TREE_VALUE (list)));
+   }
+}


I can certainly move this to expand_call_stmt() if you prefer.  Do you 
have an objection to the RTL walk?  This isn't my code, but I'm open to 
suggestions on an alternative to implement.



+  /* After expanding, the tm_restart map is no longer needed.  */
+  cfun->tm_restart = NULL;


You should still free it, to not confuse the statistics code I think.


Done.


+finish_tm_clone_pairs (void)
+{
+  bool switched = false;
+
+  if (tm_clone_pairs == NULL)
+return;
+
+  htab_traverse_noresize (tm_clone_pairs, finish_tm_clone_pairs_1,
+ (void *)&switched);


This makes the generated table dependent on memory layout.  You
need to walk the pairs in some deterministic order.  In fact why not
walk all cgraph_nodes looking for the pairs - they should be still
in the list of clones for a node and you've marked it with DECL_TM_CLONE.
You can then sort them by cgraph node uid.

Did you check bootstrapping GCC with TM enabled and address-space
randomization turned on?


Actually, the table organization is irrelevant, because upon registering 
of the table in the runtime, we qsort the entire thing.  We then do a 
binary search to find items.  See _ITM_registerTMCloneTable() and 
find_clone() in the libitm runtime.
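
The sort-then-search scheme described here can be sketched as below.
This is not the libitm implementation: struct clone_entry,
register_clone_table and find_clone_sketch are hypothetical names, and
the entries are plain integers rather than function addresses.  It only
illustrates the runtime lookup; it says nothing about whether the table
the compiler emits is identical from one compiler run to the next.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical table entry: an original function and its
   transactional clone, both represented as integers.  */
struct clone_entry
{
  uintptr_t orig;
  uintptr_t clone;
};

/* Order entries by the original function's address.  */
static int
clone_entry_cmp (const void *a, const void *b)
{
  const struct clone_entry *x = a;
  const struct clone_entry *y = b;
  return (x->orig > y->orig) - (x->orig < y->orig);
}

/* Sort once at registration, so the order in which the entries were
   emitted does not affect later lookups.  */
static void
register_clone_table (struct clone_entry *table, size_t n)
{
  qsort (table, n, sizeof table[0], clone_entry_cmp);
}

/* Binary search for ORIG; return its clone, or 0 if there is none.  */
static uintptr_t
find_clone_sketch (const struct clone_entry *table, size_t n, uintptr_t orig)
{
  struct clone_entry key = { orig, 0 };
  const struct clone_entry *hit
    = bsearch (&key, table, n, sizeof table[0], clone_entry_cmp);
  return hit ? hit->clone : 0;
}
```

Registering the entries in any order and then searching gives the same
answers, which is the property the paragraph above relies on.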



+/* In gtm-low.c  */
+extern bool is_transactional_stmt (const_gimple);
+


gimple.h please.  looks like a gimple predicate as well, so the implementation
should be in gimple.c?


Woo hoo!  Unused function.  I've removed it altogether.


case GIMPLE_GOTO:
-  if (!computed_goto_p (stmt))
+ if (!computed_goto_p (stmt))
{
- tree new_dest = main_block_label (gimple_goto_dest (stmt));
- gimple_goto_set_dest (stmt, new_dest);
+ label = gimple_goto_dest (stmt);
+ new_label = main_block_label (label);
+ if (new_label != label)
+   gimple_goto_set_dest (stmt, new_label);


What's the reason for this change?  Optimization?


Yes.  Rth can elaborate if you deem necessary.


+/* Verify the contents of a GIMPLE_TRANSACTION.  Returns true if there
+   is a problem, otherwise false.  */
+
+static bool
+verify_gimple_transaction (gimple stmt)
+{
+  tree lab = gimple_transaction_label (stmt);
+  if (lab != NULL && TREE_CODE (lab) != LABEL_DECL)
+return true;


ISTR this has substatements, so you should handle this in
verify_gimple_in_seq_2 and make sure to verify those substatements.


I have added verification for the substatements in 
verify_gimple_transaction()...



@@ -4155,6 +4210,9 @@ verify_gimple_stmt (gimple stmt)
  

[patch] SLP conditions

2011-11-06 Thread Ira Rosen
Hi,

This patch adds support for conditions in SLP.
It also fixes a bug in pattern handling in SLP (we should put pattern
statements instead of original statements in the root), and allows
pattern def-stmts in SLP.

Bootstrapped on powerpc64-suse-linux and tested on
powerpc64-suse-linux and x86_64-suse-linux.
Committed.

Ira

ChangeLog:

* tree-vectorizer.h (vectorizable_condition): Add argument.
* tree-vect-loop.c (vectorizable_reduction): Fail for condition
in SLP.  Update calls to vectorizable_condition.
* tree-vect-stmts.c (vect_is_simple_cond): Add basic block info to
the arguments.  Pass it to vect_is_simple_use_1.
(vectorizable_condition): Add slp_node to the arguments.  Support
vectorization of basic blocks.  Fail for reduction in SLP.  Update
calls to vect_is_simple_cond and vect_is_simple_use.  Support SLP:
call vect_get_slp_defs to get vector operands.
(vect_analyze_stmt): Update calls to vectorizable_condition.
(vect_transform_stmt): Likewise.
* tree-vect-slp.c (vect_create_new_slp_node): Handle COND_EXPR.
(vect_get_and_check_slp_defs): Handle COND_EXPR.  Allow pattern
def stmts.
(vect_build_slp_tree): Handle COND_EXPR.
(vect_analyze_slp_instance): Push pattern statements to root node.
(vect_get_constant_vectors): Fix comments.  Handle COND_EXPR.

testsuite/ChangeLog:

* gcc.dg/vect/bb-slp-cond-1.c: New test.
* gcc.dg/vect/slp-cond-1.c: New test.
* gcc.dg/vect/slp-cond-2.c: New test.
Index: testsuite/gcc.dg/vect/bb-slp-cond-1.c
===
--- testsuite/gcc.dg/vect/bb-slp-cond-1.c   (revision 0)
+++ testsuite/gcc.dg/vect/bb-slp-cond-1.c   (revision 0)
@@ -0,0 +1,46 @@
+/* { dg-require-effective-target vect_condition } */
+
+#include "tree-vect.h"
+
+#define N 128
+
+__attribute__((noinline, noclone)) void
+foo (int *a, int stride)
+{
+  int i;
+
+  for (i = 0; i < N/stride; i++, a += stride)
+   {
+ a[0] = a[0] ? 1 : 5;
+ a[1] = a[1] ? 2 : 6;
+ a[2] = a[2] ? 3 : 7;
+ a[3] = a[3] ? 4 : 8;
+   }
+}
+
+
+int a[N];
+int main ()
+{
+  int i;
+
+  check_vect ();
+
+  for (i = 0; i < N; i++)
+a[i] = i;
+
+  foo (a, 4);
+
+  for (i = 1; i < N; i++)
+if (a[i] != i%4 + 1)
+  abort ();
+
+  if (a[0] != 5)
+abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" { target vect_element_align } } } */
+/* { dg-final { cleanup-tree-dump "slp" } } */
+
Index: testsuite/gcc.dg/vect/slp-cond-1.c
===
--- testsuite/gcc.dg/vect/slp-cond-1.c  (revision 0)
+++ testsuite/gcc.dg/vect/slp-cond-1.c  (revision 0)
@@ -0,0 +1,126 @@
+/* { dg-require-effective-target vect_condition } */
+#include "tree-vect.h"
+
+#define N 32
+int a[N], b[N];
+int d[N], e[N];
+int k[N];
+
+__attribute__((noinline, noclone)) void
+f1 (void)
+{
+  int i;
+  for (i = 0; i < N/4; i++)
+{
+  k[4*i] = a[4*i] < b[4*i] ? 17 : 0;
+  k[4*i+1] = a[4*i+1] < b[4*i+1] ? 17 : 0;
+  k[4*i+2] = a[4*i+2] < b[4*i+2] ? 17 : 0;
+  k[4*i+3] = a[4*i+3] < b[4*i+3] ? 17 : 0;
+}
+}
+
+__attribute__((noinline, noclone)) void
+f2 (void)
+{
+  int i;
+  for (i = 0; i < N/2; ++i)
+{
+  k[2*i] = a[2*i] < b[2*i] ? 0 : 24;
+  k[2*i+1] = a[2*i+1] < b[2*i+1] ? 7 : 4;
+}
+}
+
+__attribute__((noinline, noclone)) void
+f3 (void)
+{
+  int i;
+  for (i = 0; i < N/2; ++i)
+{
+  k[2*i] = a[2*i] < b[2*i] ? 51 : 12;
+  k[2*i+1] = a[2*i+1] > b[2*i+1] ? 51 : 12;
+}
+}
+
+__attribute__((noinline, noclone)) void
+f4 (void)
+{
+  int i;
+  for (i = 0; i < N/2; ++i)
+{
+  int d0 = d[2*i], e0 = e[2*i];
+  int d1 = d[2*i+1], e1 = e[2*i+1];
+  k[2*i] = a[2*i] >= b[2*i] ? d0 : e0;
+  k[2*i+1] = a[2*i+1] >= b[2*i+1] ? d1 : e1;
+}
+}
+
+int
+main ()
+{
+  int i;
+
+  check_vect ();
+
+  for (i = 0; i < N; i++)
+{
+  switch (i % 9)
+   {
+   case 0: asm (""); a[i] = - i - 1; b[i] = i + 1; break;
+   case 1: a[i] = 0; b[i] = 0; break;
+   case 2: a[i] = i + 1; b[i] = - i - 1; break;
+   case 3: a[i] = i; b[i] = i + 7; break;
+   case 4: a[i] = i; b[i] = i; break;
+   case 5: a[i] = i + 16; b[i] = i + 3; break;
+   case 6: a[i] = - i - 5; b[i] = - i; break;
+   case 7: a[i] = - i; b[i] = - i; break;
+   case 8: a[i] = - i; b[i] = - i - 7; break;
+   }
+  d[i] = i;
+  e[i] = 2 * i;
+}
+  f1 ();
+  for (i = 0; i < N; i++)
+if (k[i] != ((i % 3) == 0 ? 17 : 0))
+  abort ();
+
+  f2 ();
+  for (i = 0; i < N; i++)
+{
+  switch (i % 9)
+{
+case 0:
+   case 6:
+ if (k[i] != ((i/9 % 2) == 0 ? 0 : 7))
+   abort ();
+ break;
+case 1:
+case 5:
+case 7:
+ if (k[i] != ((i/9 % 2) == 0 ? 4 : 24))
+abor

Re: [PATCH i386] PR47698 no CMOV for volatile mem

2011-11-06 Thread Sergey Ostanevich
On Wed, Nov 2, 2011 at 3:42 PM, Richard Guenther  wrote:
> On Sat, 29 Oct 2011, Sergey Ostanevich wrote:
>
>> On Fri, Oct 28, 2011 at 7:25 PM, Sergey Ostanevich  
>> wrote:
>> > On Fri, Oct 28, 2011 at 4:52 PM, Richard Guenther  
>> > wrote:
>> >> On Fri, 28 Oct 2011, Sergey Ostanevich wrote:
>> >>
>> >>> On Fri, Oct 28, 2011 at 12:16 PM, Richard Guenther  
>> >>> wrote:
>> >>> > On Thu, 27 Oct 2011, Uros Bizjak wrote:
>> >>> >
>> >>> >> Hello!
>> >>> >>
>> >>> >> > Here's a patch for PR47698, which is about CMOV should not be
>> >>> >> > generated for memory address marked as volatile.
>> >>> >> > Successfully bootstrapped and passed make check on 
>> >>> >> > x86_64-unknown-linux-gnu.
>> >>> >>
>> >>> >>
>> >>> >>       PR rtl-optimization/47698
>> >>> >>       * config/i386/i386.c (ix86_expand_int_movcc) prevent CMOV 
>> >>> >> generation
>> >>> >>       for volatile mem
>> >>> >>
>> >>> >>       PR rtl-optimization/47698
>> >>> >>       * gcc.target/i386/47698.c: New test
>> >>> >>
>> >>> >> Please use punctuation marks and correct capitalization in ChangeLog 
>> >>> >> entries.
>> >>> >>
>> >>> >> OTOH, do we want to fix this per-target, or in the middle-end?
>> >>> >
>> >>> > The middle-end pattern documentation does not say operands 2 and 3
>> >>> > are not evaluated if they do not end up being stored, so a middle-end
>> >>> > fix is more appropriate.
>> >>> >
>> >>> > Richard.
>> >>> >
>> >>>
>> >>> I have two observations:
>> >>>
>> >>> - the code for CMOV is under #ifdef in the middle-end, which is
>> >>> explicitly marked as "have to be removed" (ifcvt.c:1446)
>> >>> - I have no clear evidence all platforms that support conditional move
>> >>> have the same semantics that lead to the PR
>> >>>
>> >>> I think the best way to address both concerns is to implement code
>> >>> that relies on a new hook "volatile-safe CMOV" that is false by
>> >>> default.
>> >>
>> >> I suppose it's never safe for all architectures that support
>> >> memory operands in the source operand.
>> >>
>> >> Richard.
>> >
>> > ok, at least there should be no big problem of missing optimization
>> > around volatile memory.
>> >
>> > apparently the problem is here:
>> >
>> > ifcvt.c:2539 there is a test for side effects of source (which is 'a'
>> > in this case)
>> >
>> > 2539      if (! noce_operand_ok (a) || ! noce_operand_ok (b))
>> > (gdb) p debug_rtx(a)
>> > (mem/v/c/i:DI (symbol_ref:DI ("mmio") [flags 0x40] > > 0x71339140 mmio>) [2 mmio+0 S8 A64])
>> >
>> > but inside noce_operand_ok() there is a wrong order of tests:
>> >
>> > 2332      if (MEM_P (op))
>> > 2333        return ! side_effects_p (XEXP (op, 0));
>> > 2334
>> > 2335      if (side_effects_p (op))
>> > 2336        return FALSE;
>> > 2337
>> >
>> > where XEXP removes the memory reference leaving just symbol reference,
>> > that has no volatile attribute
>> > #0  side_effects_p (x=0x7149c660) at ../../gcc/rtlanal.c:2152
>> > (gdb) p debug_rtx(x)
>> > (symbol_ref:DI ("mmio") [flags 0x40] )
>> >
>> > Is the following fix OK?
>> > I'm testing it so far.
>> >
>> > Sergos
>> >
>> > diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
>> > index 784e2e8..3b05c2a 100644
>> > --- a/gcc/ifcvt.c
>> > +++ b/gcc/ifcvt.c
>> > @@ -2329,12 +2329,12 @@ noce_operand_ok (const_rtx op)
>> >  {
>> >   /* We special-case memories, so handle any of them with
>> >      no address side effects.  */
>> > -  if (MEM_P (op))
>> > -    return ! side_effects_p (XEXP (op, 0));
>> > -
>> >   if (side_effects_p (op))
>> >     return FALSE;
>> >
>> > +  if (MEM_P (op))
>> > +    return ! side_effects_p (XEXP (op, 0));
>> > +
>> >   return ! may_trap_p (op);
>> >  }
>> >
>> > diff --git a/gcc/testsuite/gcc.target/i386/47698.c
>> > b/gcc/testsuite/gcc.target/i386/47698.c
>> > new file mode 100644
>> > index 000..2c75109
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.target/i386/47698.c
>> > @@ -0,0 +1,10 @@
>> > +/* { dg-options "-Os" } */
>> > +/* { dg-final { scan-assembler-not "cmov" } } */
>> > +
>> > +extern volatile unsigned long mmio;
>> > +unsigned long foo(int cond)
>> > +{
>> > +      if (cond)
>> > +              return mmio;
>> > +        return 0;
>> > +}
>> >
>>
>> bootstrapped and passed make check successfully on x86_64-unknown-linux-gnu
>
> Ok.
>
> Thanks,
> Richard.

Could someone please commit it for me?

Sergos

/gcc

 2011-11-06 Sergey Ostanevich sergos@gmail.com

 PR rtl-optimization/47698
 * ifcvt.c (noce_operand_ok): Prevent CMOV generation
 for volatile mem.

/testsuites

 2011-11-06 Sergey Ostanevich sergos@gmail.com

 PR rtl-optimization/47698
 * gcc.target/i386/47698.c: New test.


47698.patch
Description: Binary data
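
The effect of reordering the two tests in noce_operand_ok(), as discussed in the thread above, can be illustrated with a toy model. Nothing here is GCC's real RTL: struct toy_rtx, toy_side_effects_p() and the two operand_ok_* functions are hypothetical stand-ins, and the final may_trap_p check is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an RTL expression; not GCC's real rtx.  */
enum toy_code { TOY_SYMBOL_REF, TOY_MEM };

struct toy_rtx
{
  enum toy_code code;
  bool is_volatile;            /* models the /v flag on a MEM */
  const struct toy_rtx *addr;  /* address operand of a MEM, else NULL */
};

/* Models side_effects_p: a volatile MEM has side effects, and so does
   anything whose address operand has them.  */
static bool
toy_side_effects_p (const struct toy_rtx *x)
{
  assert (x != NULL);
  if (x->code == TOY_MEM && x->is_volatile)
    return true;
  return x->addr != NULL && toy_side_effects_p (x->addr);
}

/* Old order: a MEM returns early after only its address was checked,
   so the volatile flag on the MEM itself is never looked at.  */
static bool
operand_ok_old (const struct toy_rtx *op)
{
  if (op->code == TOY_MEM)
    return !toy_side_effects_p (op->addr);
  if (toy_side_effects_p (op))
    return false;
  return true;  /* the real code returns ! may_trap_p (op) here */
}

/* New order: the whole operand is checked first, so a volatile MEM is
   rejected before the MEM special case is reached.  */
static bool
operand_ok_new (const struct toy_rtx *op)
{
  if (toy_side_effects_p (op))
    return false;
  if (op->code == TOY_MEM)
    return !toy_side_effects_p (op->addr);
  return true;  /* the real code returns ! may_trap_p (op) here */
}
```

In this model a volatile MEM whose address is a plain symbol reference is accepted by the old order and rejected by the new one, while a non-volatile MEM is accepted by both.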


Re: [v3] Re: put __stl_prime_list in comdat section

2011-11-06 Thread Xinliang David Li
Ping .. (the Nov 7 cutoff date is close ..).

thanks,

David

On Sat, Nov 5, 2011 at 12:22 PM, Xinliang David Li  wrote:
> thanks. The attached is the revised patch.
>
> David
>
> On Sat, Nov 5, 2011 at 11:52 AM, Paolo Carlini  
> wrote:
>> On 11/05/2011 07:32 PM, Xinliang David Li wrote:
>>>
>>> Hi, the following patch is a follow up to the one posted here
>>> http://gcc.gnu.org/ml/gcc-patches/2009-05/msg01293.html.
>>>
>>> The new patch is a header only change and can greatly reduce rodata
>>> section size for some programs.
>>>
>>> Ok for trunk after testing?
>>
>> As usual for backward/ stuff, I would say largely Ian's call, but please
>> don't use get_prime_list unuglified, prefer __get_prime_list, or something
>> with _S_* as prefix, being a static member.
>>
>> Thanks,
>> Paolo.
>>
>> PS: also, make sure to post library patches to libstdc++ too, and, possibly,
>> add [v3] to the Subject, otherwise you can easily fail to get the attention
>> of the library people.
>>
>


[C++ Patch] PR 47695

2011-11-06 Thread Paolo Carlini

Hi,

duplicate diagnostics for (a pretty common error, I would guess):

void f() = delete;
void g() { f(); }

47695.C: In function ‘void g()’:
47695.C:2:12: error: use of deleted function ‘void f()’
47695.C:1:6: error: declared here
47695.C:2:14: error: use of deleted function ‘void f()’
47695.C:1:6: error: declared here

I'm fixing it by returning a bool from mark_used and checking it in 
finish_id_expression. Works fine, passes the testsuite. Or shall we do 
something more sophisticated?!?
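
The idea of reporting failure to the caller, so that it can stop before a second diagnostic is issued, can be sketched in plain C as follows. This is only a model under stated assumptions: struct toy_decl, toy_mark_used() and toy_finish_id_expression() are hypothetical, and a bool result stands in for error_mark_node.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a declaration; not GCC's tree type.  */
struct toy_decl
{
  const char *name;
  bool deleted;
};

static int error_count;  /* number of diagnostics emitted so far */

/* Models mark_used after the patch: diagnose once and tell the caller
   that something went wrong.  */
static bool
toy_mark_used (const struct toy_decl *decl)
{
  assert (decl != NULL);
  if (decl->deleted)
    {
      fprintf (stderr, "error: use of deleted function '%s'\n", decl->name);
      ++error_count;
      return false;
    }
  return true;
}

/* Models a caller such as finish_id_expression: bail out on failure so
   that later processing cannot diagnose the same use again.  A false
   result stands in for error_mark_node.  */
static bool
toy_finish_id_expression (const struct toy_decl *decl)
{
  if (!toy_mark_used (decl))
    return false;
  /* Later processing that would otherwise reach mark_used again.  */
  return toy_mark_used (decl);
}
```

With a void return type the caller has no way to know that the first call already failed, which is how the second, identical diagnostic arises in this model.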


Thanks,
Paolo.

PS: sorry if you are receiving this message two or even three times;
today I'm experiencing serious problems with my connection and I have no
idea whether it is going through


///
2011-11-06  Paolo Carlini  

PR c++/47695
* decl2.c (mark_used): Early return false after error or sorry.
* cp-tree.h (mark_used): Adjust declaration.
* semantics.c (finish_id_expression): Check mark_used return value.
Index: semantics.c
===
--- semantics.c (revision 181027)
+++ semantics.c (working copy)
@@ -3286,8 +3286,9 @@ finish_id_expression (tree id_expression,
  if (TREE_CODE (first_fn) == TEMPLATE_DECL)
first_fn = DECL_TEMPLATE_RESULT (first_fn);
 
- if (!really_overloaded_fn (decl))
-   mark_used (first_fn);
+ if (!really_overloaded_fn (decl)
+ && !mark_used (first_fn))
+   return error_mark_node;
 
  if (!template_arg_p
  && TREE_CODE (first_fn) == FUNCTION_DECL
Index: decl2.c
===
--- decl2.c (revision 181027)
+++ decl2.c (working copy)
@@ -4203,9 +4203,10 @@ possibly_inlined_p (tree decl)
 
 /* Mark DECL (either a _DECL or a BASELINK) as "used" in the program.
If DECL is a specialization or implicitly declared class member,
-   generate the actual definition.  */
+   generate the actual definition.  Return false if something goes
+   wrong, true otherwise.  */
 
-void
+bool
 mark_used (tree decl)
 {
   /* If DECL is a BASELINK for a single function, then treat it just
@@ -4216,7 +4217,7 @@ mark_used (tree decl)
 {
   decl = BASELINK_FUNCTIONS (decl);
   if (really_overloaded_fn (decl))
-   return;
+   return true;
   decl = OVL_CURRENT (decl);
 }
 
@@ -4237,13 +4238,13 @@ mark_used (tree decl)
 generate it properly; see maybe_add_lambda_conv_op.  */
  sorry ("converting lambda which uses %<...%> to "
 "function pointer");
- return;
+ return false;
}
}
   error ("use of deleted function %qD", decl);
   if (!maybe_explain_implicit_delete (decl))
error_at (DECL_SOURCE_LOCATION (decl), "declared here");
-  return;
+  return false;
 }
 
   /* We can only check DECL_ODR_USED on variables or functions with
@@ -4252,20 +4253,20 @@ mark_used (tree decl)
   if ((TREE_CODE (decl) != VAR_DECL && TREE_CODE (decl) != FUNCTION_DECL)
   || DECL_LANG_SPECIFIC (decl) == NULL
   || DECL_THUNK_P (decl))
-return;
+return true;
 
   /* We only want to do this processing once.  We don't need to keep trying
  to instantiate inline templates, because unit-at-a-time will make sure
  we get them compiled before functions that want to inline them.  */
   if (DECL_ODR_USED (decl))
-return;
+return true;
 
   /* If within finish_function, defer the rest until that function
  finishes, otherwise it might recurse.  */
   if (defer_mark_used_calls)
 {
   VEC_safe_push (tree, gc, deferred_mark_used_calls, decl);
-  return;
+  return true;
 }
 
   if (TREE_CODE (decl) == FUNCTION_DECL)
@@ -4294,15 +4295,15 @@ mark_used (tree decl)
 
   /* If we don't need a value, then we don't need to synthesize DECL.  */
   if (cp_unevaluated_operand != 0)
-return;
+return true;
 
   if (processing_template_decl)
-return;
+return true;
 
   /* Check this too in case we're within fold_non_dependent_expr.  */
   if (DECL_TEMPLATE_INFO (decl)
   && uses_template_parms (DECL_TI_ARGS (decl)))
-return;
+return true;
 
   DECL_ODR_USED (decl) = 1;
   if (DECL_CLONED_FUNCTION_P (decl))
@@ -4380,6 +4381,8 @@ mark_used (tree decl)
/*expl_inst_class_mem_p=*/false);
   --function_depth;
 }
+
+  return true;
 }
 
 #include "gt-cp-decl2.h"
Index: cp-tree.h
===
--- cp-tree.h   (revision 181027)
+++ cp-tree.h   (working copy)
@@ -5049,7 +5049,7 @@ extern tree build_offset_ref_call_from_tree   (tree,
 extern bool decl_constant_var_p(tree);
 extern bool decl_maybe_constant_var_p  (tree);
 extern void check_default_args (tree);
-extern void mark_used  (tree);
+extern bool mark_used  (tree);
 e

Re: cxx-mem-model merge c-family.patch [1/8]

2011-11-06 Thread Andrew MacLeod

On 11/06/2011 09:58 AM, Andrew MacLeod wrote:


Just checked in the patches.   I will also post the exact patches 
now.  I am also checking out mainline and building it again.


Andrew





2011-11-06  Andrew MacLeod  
Richard Henderson  

Merged from cxx-mem-model.

* c-cppbuiltin.c (c_cpp_builtins): Test both atomic and sync patterns.
* c-common.c (sync_resolve_params, sync_resolve_return): Only tweak 
parameters that are the same type size.
(get_atomic_generic_size): New.  Find size of generic
atomic function parameters and do typechecking.
(add_atomic_size_parameter): New.  Insert size into parameter list.
(resolve_overloaded_atomic_exchange): Restructure __atomic_exchange to
either __atomic_exchange_n or external library call.
(resolve_overloaded_atomic_compare_exchange): Restructure 
__atomic_compare_exchange to either _n variant or external library call.
(resolve_overloaded_atomic_load): Restructure __atomic_load to either 
__atomic_load_n or an external library call.
(resolve_overloaded_atomic_store): Restructure __atomic_store to either
__atomic_store_n or an external library call.
(resolve_overloaded_builtin): Handle new __atomic builtins.
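
For readers unfamiliar with the two call forms named in the entries above, the following sketch contrasts the generic __atomic_exchange builtin, which takes its operands by address, with the __atomic_exchange_n variant, which takes and returns the value directly. It assumes a compiler that provides the __atomic builtins (GCC 4.7 or later, or Clang); the wrapper names exchange_generic() and exchange_n() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Exchange through the generic builtin: the new value and the result
   are both passed by address.  */
static int
exchange_generic (int *ptr, int desired)
{
  int previous;
  assert (ptr != NULL);
  __atomic_exchange (ptr, &desired, &previous, __ATOMIC_SEQ_CST);
  return previous;
}

/* Exchange through the _n variant: the new value is passed directly
   and the previous value is the return value.  */
static int
exchange_n (int *ptr, int desired)
{
  assert (ptr != NULL);
  return __atomic_exchange_n (ptr, desired, __ATOMIC_SEQ_CST);
}
```

Both wrappers perform the same operation on an int. Per the ChangeLog, the front end restructures the generic form into either the _n variant or an external library call.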

Index: gcc/c-family/c-common.c
===
*** gcc/c-family/c-common.c (revision 181026)
--- gcc/c-family/c-common.c (working copy)
*** sync_resolve_size (tree function, VEC(tr
*** 9007,9013 
 was encountered; true on success.  */
  
  static bool
! sync_resolve_params (tree orig_function, tree function, VEC(tree, gc) *params)
  {
function_args_iterator iter;
tree ptype;
--- 9007,9014 
 was encountered; true on success.  */
  
  static bool
! sync_resolve_params (location_t loc, tree orig_function, tree function,
!VEC(tree, gc) *params, bool orig_format)
  {
function_args_iterator iter;
tree ptype;
*** sync_resolve_params (tree orig_function,
*** 9035,9055 
++parmnum;
if (VEC_length (tree, params) <= parmnum)
{
! error ("too few arguments to function %qE", orig_function);
  return false;
}
  
!   /* ??? Ideally for the first conversion we'd use convert_for_assignment
!so that we get warnings for anything that doesn't match the pointer
!type.  This isn't portable across the C and C++ front ends atm.  */
!   val = VEC_index (tree, params, parmnum);
!   val = convert (ptype, val);
!   val = convert (arg_type, val);
!   VEC_replace (tree, params, parmnum, val);
  
function_args_iter_next (&iter);
  }
  
/* The definition of these primitives is variadic, with the remaining
   being "an optional list of variables protected by the memory barrier".
   No clue what that's supposed to mean, precisely, but we consider all
--- 9036,9069 
++parmnum;
if (VEC_length (tree, params) <= parmnum)
{
! error_at (loc, "too few arguments to function %qE", orig_function);
  return false;
}
  
!   /* Only convert parameters if the size is appropriate with new format
!sync routines.  */
!   if (orig_format
! || tree_int_cst_equal (TYPE_SIZE (ptype), TYPE_SIZE (arg_type)))
!   {
! /* Ideally for the first conversion we'd use convert_for_assignment
!so that we get warnings for anything that doesn't match the pointer
!type.  This isn't portable across the C and C++ front ends atm.  */
! val = VEC_index (tree, params, parmnum);
! val = convert (ptype, val);
! val = convert (arg_type, val);
! VEC_replace (tree, params, parmnum, val);
!   }
  
function_args_iter_next (&iter);
  }
  
+   /* __atomic routines are not variadic.  */
+   if (!orig_format && VEC_length (tree, params) != parmnum + 1)
+ {
+   error_at (loc, "too many arguments to function %qE", orig_function);
+   return false;
+ }
+ 
/* The definition of these primitives is variadic, with the remaining
   being "an optional list of variables protected by the memory barrier".
   No clue what that's supposed to mean, precisely, but we consider all
*** sync_resolve_params (tree orig_function,
*** 9064,9076 
 PARAMS.  */
  
  static tree
! sync_resolve_return (tree first_param, tree result)
  {
tree ptype = TREE_TYPE (TREE_TYPE (first_param));
ptype = TYPE_MAIN_VARIANT (ptype);
!   return convert (ptype, result);
  }
  
  /* Some builtin functions are placeholders for other expressions.  This
 function should be called immediately after parsing the call expression
 before surrounding code has committed to the type of the expression.
--- 9078,9465 
 PARAMS.  */
  
  static tree
! sync_resolve_return (tree first_param, tree 

Re: cxx-mem-model merge testsuite part 2 patch 7/8

2011-11-06 Thread Andrew MacLeod

On 11/06/2011 09:58 AM, Andrew MacLeod wrote:


Just checked in the patches.   I will also post the exact patches 
now.  I am also checking out mainline and building it again.


Andrew




Index: gcc/testsuite/gcc.dg/gomp/atomic-14.c
===
*** gcc/testsuite/gcc.dg/gomp/atomic-14.c   (revision 181026)
--- gcc/testsuite/gcc.dg/gomp/atomic-14.c   (working copy)
***
*** 1,43 
- /* PR middle-end/45423 */
- /* { dg-do compile } */
- /* { dg-options "-fopenmp" } */
- 
- #ifdef __cplusplus
- bool *baz ();
- #else
- _Bool *baz ();
- #endif
- int *bar ();
- 
- int
- foo (void)
- {
-   #pragma omp barrier
-   #pragma omp atomic
- (*bar ())++;
-   #pragma omp barrier
-   #pragma omp atomic
- ++(*bar ());
-   #pragma omp barrier
-   #pragma omp atomic
- (*bar ())--;
-   #pragma omp barrier
-   #pragma omp atomic
- --(*bar ());
-   #pragma omp barrier
-   #pragma omp atomic
- (*baz ())++;
-   #pragma omp barrier
-   #pragma omp atomic
- ++(*baz ());
- #ifndef __cplusplus
-   #pragma omp barrier
-   #pragma omp atomic
- (*baz ())--;
-   #pragma omp barrier
-   #pragma omp atomic
- --(*baz ());
-   #pragma omp barrier
- #endif
-   return 0;
- }
--- 0 
Index: gcc/testsuite/gcc.dg/gomp/atomic-4.c
===
*** gcc/testsuite/gcc.dg/gomp/atomic-4.c(revision 181026)
--- gcc/testsuite/gcc.dg/gomp/atomic-4.c(working copy)
***
*** 1,24 
- /* { dg-do compile } */
- 
- int a[4];
- int *p;
- struct S { int x; int y[4]; } s;
- int *bar(void);
- 
- void f1(void)
- {
-   #pragma omp atomic
- a[4] += 1;
-   #pragma omp atomic
- *p += 1;
-   #pragma omp atomic
- s.x += 1;
-   #pragma omp atomic
- s.y[*p] += 1;
-   #pragma omp atomic
- s.y[*p] *= 42;
-   #pragma omp atomic
- *bar() += 1;
-   #pragma omp atomic
- *bar() *= 42;
- }
--- 0 
Index: gcc/testsuite/gcc.dg/gomp/atomic-8.c
===
*** gcc/testsuite/gcc.dg/gomp/atomic-8.c(revision 181026)
--- gcc/testsuite/gcc.dg/gomp/atomic-8.c(working copy)
***
*** 1,21 
- /* { dg-do compile } */
- 
- long double z;
- 
- void f3(void)
- {
-   #pragma omp atomic
- z++;
-   #pragma omp atomic
- z--;
-   #pragma omp atomic
- ++z;
-   #pragma omp atomic
- --z;
-   #pragma omp atomic
- z += 1;
-   #pragma omp atomic
- z *= 3;
-   #pragma omp atomic
- z /= 3;
- }
--- 0 
Index: gcc/testsuite/gcc.dg/gomp/atomic-11.c
===
*** gcc/testsuite/gcc.dg/gomp/atomic-11.c   (revision 181026)
--- gcc/testsuite/gcc.dg/gomp/atomic-11.c   (working copy)
***
*** 1,17 
- /* PR middle-end/36877 */
- /* { dg-do compile } */
- /* { dg-options "-fopenmp" } */
- /* { dg-options "-fopenmp -march=i386" { target { { i?86-*-* x86_64-*-* } && 
ia32 } } } */
- 
- int i;
- float f;
- 
- void foo (void)
- {
- #pragma omp atomic
-   i++;
- #pragma omp atomic
-   f += 1.0;
- }
- 
- /* { dg-final { scan-assembler-not "__sync_(fetch|add|bool|val)" { target 
i?86-*-* x86_64-*-* powerpc*-*-* ia64-*-* s390*-*-* sparc*-*-* } } } */
--- 0 
Index: gcc/testsuite/gcc.dg/gomp/atomic-15.c
===
*** gcc/testsuite/gcc.dg/gomp/atomic-15.c   (revision 181026)
--- gcc/testsuite/gcc.dg/gomp/atomic-15.c   (working copy)
***
*** 1,46 
- /* { dg-do compile } */
- /* { dg-options "-fopenmp" } */
- 
- int x = 6;
- 
- int
- main ()
- {
-   int v;
-   #pragma omp atomic
- x = x * 7 + 6;/* { dg-error "expected" } */
-   #pragma omp atomic
- x = x * 7 ^ 6;/* { dg-error "expected" } */
-   #pragma omp atomic update
- x = x - 8 + 6;/* { dg-error "expected" } */
-   #pragma omp atomic
- x = x ^ 7 | 2;/* { dg-error "expected" } */
-   #pragma omp atomic
- x = x / 7 * 2;/* { dg-error "expected" } */
-   #pragma omp atomic
- x = x / 7 / 2;/* { dg-error "expected" } */
-   #pragma omp atomic capture
- v = x = x | 6;/* { dg-error "invalid operator" } */
-   #pragma omp atomic capture
- { v = x; x = x * 7 + 6; } /* { dg-error "expected" } */
-   #pragma omp atomic capture
- { v = x; x = x * 7 ^ 6; } /* { dg-error "expected" } */
-   #pragma omp atomic capture
- { v = x; x = x - 8 + 6; } /* { dg-error "expected" } */
-   #pragma omp atomic capture
- { v = x; x = x ^ 7 | 2; } /* { dg-error "expected" } */
-   #pragma omp atomic capture
- { v = x; x = x / 7 * 2; } /* { dg-error "expected" } */
-   #pragma omp atomic capture
- { v = x; x = x / 7 / 2; } /* { dg-error "expected" } */
-   #pragma omp atomic capture
- { x = x * 7 + 6; v = x; } /* { dg-error "expected" } */
-   #pragma omp atomic capture
- { x = x * 7 ^ 6; v = x; } /* { dg-error "expected" }

Re: cxx-mem-model merge fortran patch 2/8

2011-11-06 Thread Andrew MacLeod

On 11/06/2011 09:58 AM, Andrew MacLeod wrote:


Just checked in the patches.   I will also post the exact patches 
now.  I am also checking out mainline and building it again.


Andrew





2011-11-06  Andrew MacLeod  
Aldy Hernandez  

Merged from cxx-mem-model.

* types.def: (BT_SIZE, BT_CONST_VOLATILE_PTR, BT_FN_VOID_INT,
BT_FN_I{1,2,4,8,16}_CONST_VPTR_INT, BT_FN_VOID_VPTR_INT,
BT_FN_BOOL_VPTR_INT, BT_FN_BOOL_SIZE_CONST_VPTR,
BT_FN_VOID_VPTR_I{1,2,4,8,16}_INT, BT_FN_VOID_SIZE_VPTR_PTR_INT,
BT_FN_VOID_SIZE_CONST_VPTR_PTR_INT, BT_FN_VOID_SIZE_VPTR_PTR_PTR_INT,
BT_FN_BOOL_VPTR_PTR_I{1,2,4,8,16}_BOOL_INT_INT,
BT_FN_I{1,2,4,8,16}_VPTR_I{1,2,4,8,16}_INT): New types.


Index: gcc/fortran/types.def
===
*** gcc/fortran/types.def   (revision 181026)
--- gcc/fortran/types.def   (working copy)
*** DEF_PRIMITIVE_TYPE (BT_UINT, unsigned_ty
*** 57,62 
--- 57,63 
  DEF_PRIMITIVE_TYPE (BT_LONG, long_integer_type_node)
  DEF_PRIMITIVE_TYPE (BT_ULONGLONG, long_long_unsigned_type_node)
  DEF_PRIMITIVE_TYPE (BT_WORD, (*lang_hooks.types.type_for_mode) (word_mode, 1))
+ DEF_PRIMITIVE_TYPE (BT_SIZE, size_type_node)
  
  DEF_PRIMITIVE_TYPE (BT_I1, builtin_type_for_size (BITS_PER_UNIT*1, 1))
  DEF_PRIMITIVE_TYPE (BT_I2, builtin_type_for_size (BITS_PER_UNIT*2, 1))
*** DEF_PRIMITIVE_TYPE (BT_VOLATILE_PTR,
*** 70,76 
  build_pointer_type
   (build_qualified_type (void_type_node,
  TYPE_QUAL_VOLATILE)))
! 
  DEF_POINTER_TYPE (BT_PTR_LONG, BT_LONG)
  DEF_POINTER_TYPE (BT_PTR_ULONGLONG, BT_ULONGLONG)
  DEF_POINTER_TYPE (BT_PTR_PTR, BT_PTR)
--- 71,80 
  build_pointer_type
   (build_qualified_type (void_type_node,
  TYPE_QUAL_VOLATILE)))
! DEF_PRIMITIVE_TYPE (BT_CONST_VOLATILE_PTR,
!   build_pointer_type
!(build_qualified_type (void_type_node,
! TYPE_QUAL_VOLATILE|TYPE_QUAL_CONST)))
  DEF_POINTER_TYPE (BT_PTR_LONG, BT_LONG)
  DEF_POINTER_TYPE (BT_PTR_ULONGLONG, BT_ULONGLONG)
  DEF_POINTER_TYPE (BT_PTR_PTR, BT_PTR)
*** DEF_FUNCTION_TYPE_1 (BT_FN_VOID_PTRPTR, 
*** 85,90 
--- 89,96 
  DEF_FUNCTION_TYPE_1 (BT_FN_VOID_VPTR, BT_VOID, BT_VOLATILE_PTR)
  DEF_FUNCTION_TYPE_1 (BT_FN_UINT_UINT, BT_UINT, BT_UINT)
  DEF_FUNCTION_TYPE_1 (BT_FN_PTR_PTR, BT_PTR, BT_PTR)
+ DEF_FUNCTION_TYPE_1 (BT_FN_VOID_INT, BT_VOID, BT_INT)
+ 
  
  DEF_POINTER_TYPE (BT_PTR_FN_VOID_PTR, BT_FN_VOID_PTR)
  
*** DEF_FUNCTION_TYPE_2 (BT_FN_I4_VPTR_I4, B
*** 98,103 
--- 104,124 
  DEF_FUNCTION_TYPE_2 (BT_FN_I8_VPTR_I8, BT_I8, BT_VOLATILE_PTR, BT_I8)
  DEF_FUNCTION_TYPE_2 (BT_FN_I16_VPTR_I16, BT_I16, BT_VOLATILE_PTR, BT_I16)
  DEF_FUNCTION_TYPE_2 (BT_FN_VOID_PTR_PTR, BT_VOID, BT_PTR, BT_PTR)
+ DEF_FUNCTION_TYPE_2 (BT_FN_I1_CONST_VPTR_INT, BT_I1, BT_CONST_VOLATILE_PTR,
+BT_INT)
+ DEF_FUNCTION_TYPE_2 (BT_FN_I2_CONST_VPTR_INT, BT_I2, BT_CONST_VOLATILE_PTR,
+BT_INT)
+ DEF_FUNCTION_TYPE_2 (BT_FN_I4_CONST_VPTR_INT, BT_I4, BT_CONST_VOLATILE_PTR,
+BT_INT)
+ DEF_FUNCTION_TYPE_2 (BT_FN_I8_CONST_VPTR_INT, BT_I8, BT_CONST_VOLATILE_PTR,
+BT_INT)
+ DEF_FUNCTION_TYPE_2 (BT_FN_I16_CONST_VPTR_INT, BT_I16, BT_CONST_VOLATILE_PTR,
+BT_INT)
+ DEF_FUNCTION_TYPE_2 (BT_FN_VOID_VPTR_INT, BT_VOID, BT_VOLATILE_PTR, BT_INT)
+ DEF_FUNCTION_TYPE_2 (BT_FN_BOOL_VPTR_INT, BT_BOOL, BT_VOLATILE_PTR, BT_INT)
+ DEF_FUNCTION_TYPE_2 (BT_FN_BOOL_SIZE_CONST_VPTR, BT_BOOL, BT_SIZE,
+BT_CONST_VOLATILE_PTR)
+ 
  
  DEF_POINTER_TYPE (BT_PTR_FN_VOID_PTR_PTR, BT_FN_VOID_PTR_PTR)
  
*** DEF_FUNCTION_TYPE_3 (BT_FN_I16_VPTR_I16_
*** 119,133 
--- 140,170 
 BT_I16, BT_I16)
  DEF_FUNCTION_TYPE_3 (BT_FN_VOID_OMPFN_PTR_UINT, BT_VOID, BT_PTR_FN_VOID_PTR,
   BT_PTR, BT_UINT)
+ DEF_FUNCTION_TYPE_3 (BT_FN_I1_VPTR_I1_INT, BT_I1, BT_VOLATILE_PTR, BT_I1, 
BT_INT)
+ DEF_FUNCTION_TYPE_3 (BT_FN_I2_VPTR_I2_INT, BT_I2, BT_VOLATILE_PTR, BT_I2, 
BT_INT)
+ DEF_FUNCTION_TYPE_3 (BT_FN_I4_VPTR_I4_INT, BT_I4, BT_VOLATILE_PTR, BT_I4, 
BT_INT)
+ DEF_FUNCTION_TYPE_3 (BT_FN_I8_VPTR_I8_INT, BT_I8, BT_VOLATILE_PTR, BT_I8, 
BT_INT)
+ DEF_FUNCTION_TYPE_3 (BT_FN_I16_VPTR_I16_INT, BT_I16, BT_VOLATILE_PTR, BT_I16, 
BT_INT)
+ DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I1_INT, BT_VOID, BT_VOLATILE_PTR, BT_I1, 
BT_INT)
+ DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I2_INT, BT_VOID, BT_VOLATILE_PTR, BT_I2, 
BT_INT)
+ DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I4_INT, BT_VOID, BT_VOLATILE_PTR, BT_I4, 
BT_INT)
+ DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I8_INT, BT_VOID, BT_VOLATILE_PTR, BT_I8, 
BT_INT)
+ DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I16_INT

Re: [Patch,Fortran] PR39427/37829 - implement F2003's constructors

2011-11-06 Thread Tobias Burnus
Last try: Also gzip the release notes - let's see whether the mail server
accepts that email.


Tobias


PS: I really hate that the email simply gets dropped without any
reject email or any other status. Seemingly, my other emails without
patches go through!


I just realized that my patch email did not come through - however,
I did not get any reject email. Let's try first without patch - it's
available at
http://users.physik.fu-berlin.de/~tburnus/tmp/constructor.diff


I wondered whether the patch exceeded the attachment size - I think
it's around 100 kB. However, even the gzipped email (about 20 kB) did
not get through.

Thus, you have to live with the URL above. As I do not know what's
the problem, I cannot really solve it.

PS: The patch for the release notes is attached








releasenotes.diff.gz
Description: GNU Zip compressed data


constructor-v4a.diff.gz
Description: GNU Zip compressed data


[PATCH] Fix libgcc_tm.h dependency in libgcc Makefile

2011-11-06 Thread John David Anglin
This fixes a trunk build error noticed on hppa-linux at -j4.  Tested on
hppa-unknown-linux-gnu and hppa64-hp-hpux11.11.

Ok?

Dave
-- 
J. David Anglin  dave.ang...@nrc-cnrc.gc.ca
National Research Council of Canada  (613) 990-0752 (FAX: 952-6602)

2011-11-06  John David Anglin  

PR other/50991
* Makefile.in: Make EXTRA_PARTS depend on libgcc_tm.h instead of
extra-parts.

Index: Makefile.in
===
--- Makefile.in (revision 181037)
+++ Makefile.in (working copy)
@@ -996,7 +996,7 @@
 $(libgcc-objects) $(libgcc-s-objects) $(libgcc-eh-objects) \
$(libgcov-objects) \
$(libunwind-objects) $(libunwind-s-objects) \
-   $(extra-parts): libgcc_tm.h
+   $(EXTRA_PARTS): libgcc_tm.h
 
 install-unwind_h:
dest=$(gcc_objdir)/include/tmp-unwind.h; \


[patch] 20/n: trans-mem: Unified change logs

2011-11-06 Thread Torvald Riegel
The patch adds unified changelogs. (Before merging, we would remove the
ChangeLog.tm-merge files and add their contents to the respective
ChangeLog files). libitm/ChangeLog would stay as is, I suppose.

OK for branch?
commit d54f6c956fab68f5961f245e02817c6720c0758d
Author: Torvald Riegel 
Date:   Sun Nov 6 02:18:16 2011 +0100

unified changelogs

index 000..4919b87
--- /dev/null
+++ b/ChangeLog.tm-merge
@@ -0,0 +1,11 @@
+2011-11-07  Aldy Hernandez  
+   Richard Henderson  
+
+   Merged from transactional-memory.
+
+   * Makefile.def (lang_env_dependencies): libitm is c++.
+   Add libitm target module.
+   * configure.ac: Likewise.
+   * config/mmap.m4: New file.
+   * contrib/gcc_update: Add libitm to touch data.
+   * Makefile.in, configure: Rebuild.
index 000..cd7f178
--- /dev/null
+++ b/contrib/ChangeLog.tm-merge
@@ -0,0 +1,5 @@
+2011-11-07  Richard Henderson  
+
+   Merged from transactional-memory.
+
+   * contrib/gcc_update: Add libitm to touch data.
index 000..8f0be8d
--- /dev/null
+++ b/gcc/ChangeLog.tm-merge
@@ -0,0 +1,192 @@
+2011-11-07  Richard Henderson  
+   Aldy Hernandez  
+   Andrew MacLeod  
+   Torvald Riegel  
+
+   Merged from transactional-memory.
+
+   * gtm-builtins.def: New file.
+   * trans-mem.c: New file.
+   * trans-mem.h: New file.
+
+   * config/i386/i386.c: Define TARGET_VECTORIZE* transactional variants.
+   (ix86_handle_tm_regparm_attribute, struct bdesc_tm,
+   ix86_builtin_tm_load, ix86_builtin_tm_store,
+   ix86_init_tm_builtins): New.
+   (ix86_init_builtins): Initialize TM builtins.
+   (struct ix86_attribute_table): Add "*tm regparm".
+   * config/i386/i386-builtin-types.def (PV2SI): Define.
+   (PCV2SI): Define.
+   Define V2SI_FTYPE_PCV2SI.
+   Define V4SF_FTYPE_PCV4SF.
+   Define V8SF_FTYPE_PCV8SF.
+   Define VOID_PV2SI_V2SI.
+
+   * doc/invoke.texi (C Dialect Options): Document -fgnu-tm and
+   tm-max-aggregate-size.
+   * doc/tm.texi.in: Add TARGET_VECTORIZE_BUILTIN_TM_LOAD and
+   TARGET_VECTORIZE_BUILTIN_TM_STORE hooks.
+   * doc/tm.texi: Regenerate.
+
+   * attribs.c (apply_tm_attr): New.
+   (init_attributes): Allow '*' prefix for overrides.
+   (register_attribute): Likewise.
+   * builtin-attrs.def (ATTR_TM_TMPURE, ATTR_TM_REGPARM): New.
+   (ATTR_TM_NOTHROW_LIST, ATTR_TM_TMPURE_NOTHROW_LIST,
+   ATTR_TM_PURE_TMPURE_NOTHROW_LIST, ATTR_TM_NORETURN_NOTHROW_LIST,
+   ATTR_TM_CONST_NOTHROW_LIST, ATTR_TMPURE_MALLOC_NOTHROW_LIST,
+   ATTR_TMPURE_NOTHROW_LIST): New.
+   * builtin-types.def (BT_FN_I[1248]_VPTR, BT_FN_FLOAT_VPTR,
+   BT_FN_DOUBLE_VPTR, BT_FN_LDOUBLE_VPTR, BT_FN_VOID_VPTR_I[1248],
+   BT_FN_VOID_VPTR_FLOAT, BT_FN_VOID_VPTR_DOUBLE,
+   BT_FN_VOID_VPTR_LDOUBLE, BT_FN_VOID_VPTR_SIZE): New.
+   * builtins.def: Include gtm-builtins.def. Add comments regarding
+   transactional memory synchronization.
+   (DEF_TM_BUILTIN): New.
+   * c-parser.c (struct c_parser): Add in_transaction.
+   (c_parser_transaction, c_parser_transaction_expression,
+   c_parser_transaction_cancel, c_parser_transaction_attributes): New.
+   (c_parser_attribute_any_word): Split out from c_parser_attributes.
+   (c_parser_statement_after_labels): Handle RID_TRANSACTION*.
+   (c_parser_unary_expression): Same.
+   * c-tree.h (c_finish_transaction): Declare.
+   * c-typeck.c (c_finish_transaction): New.
+   (build_function_call_vec): Call tm_malloc_replacement.
+   * calls.c (is_tm_builtin): New.
+   (flags_from_decl_or_type): Add ECF_TM_OPS for TM clones.
+   * cfgbuild.c (make_edges): Add edges for REG_TM notes.
+   * cfgexpand.c (expand_gimple_stmt): Add REG_TM notes.
+   (gimple_expand_cfg): Free the tm_restart map.
+   * cfgrtl.c (purge_dead_edges): Look for REG_TM notes.
+   * cgraph.c (dump_cgraph_node): Handle tm_clone.
+   * cgraph.h (struct cgraph_node): Add tm_clone field.
+   (decl_is_tm_clone): New.
+   (struct cgraph_local_info): Add tm_may_enter_irr.
+   (cgraph_copy_node_for_versioning): Declare.
+   * cgraphunit.c (cgraph_copy_node_for_versioning): Export;
+   copy analyzed from old version. Move setting lowered to true from ...
+   (cgraph_function_versioning): ... here.
+   * combine.c (distribute_notes): Handle REG_TM notes.
+   * common.opt: Add -fgnu-tm.
+   * crtstuff.c (__TMC_LIST__, __TMC_END__): New.
+   (__do_global_dtors_aux): Deregister clone table.
+   (frame_dummy): Register clone table.
+   * emit-rtl.c (try_split): Handle REG_TM. Early return if no function
+   body.
+   * function.h (struct tm_restart_node): New.
+   (struct function): Add tm_restart member.
+   * gimple-low.c (lower_stmt): Handle GIMPLE_EH_ELSE and
+   GIMPLE_TRANSACTION.
+   (gimple_stmt_may_fallthru): Handle GIMPLE_EH_EL

Re: Enhance performance test

2011-11-06 Thread François Dumont

Attached patch applied.

2011-11-06  François Dumont 

* testsuite/performance/23_containers/insert_erase/41975.cc: Add
tests to check performance with or without cache of hash code and with
string type that has a costlier hash functor than int type.

François


On 11/05/2011 12:56 AM, Paolo Carlini wrote:

On 11/04/2011 10:32 PM, François Dumont wrote:

Hi

Here is a patch to enhance the performance test introduced 
recently for hashtable. It shows the 41975 performance issue more 
clearly. I also introduced a bench using unordered_set so that 
we have a test involving a type with a costlier hash functor than the 
one used for int. Lastly, the benches are run twice, with and 
without the hash code cached.
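
Why caching matters for a costly hash functor can be shown with a small toy. This is only a sketch under stated assumptions, not libstdc++ code: `costly_hash`, `node`, `make_nodes` and `rehash_cost` are hypothetical names, and the "rehash" just recomputes bucket indices the way a real table would have to when it grows.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy, not libstdc++ code: counts how often the hash runs.
static std::size_t hash_calls = 0;

// Simple string hash whose cost grows with the key length.
std::size_t costly_hash(const std::string& s)
{
  ++hash_calls;
  std::size_t h = 5381;
  for (unsigned char c : s)
    h = h * 33 + c;
  return h;
}

struct node
{
  std::string key;
  std::size_t cached_hash; // hash code stored next to the key
};

// Build `count` nodes, hashing each key once at insertion time.
std::vector<node> make_nodes(std::size_t count)
{
  std::vector<node> nodes;
  for (std::size_t i = 0; i != count; ++i)
    {
      std::string key = "key-" + std::to_string(i);
      const std::size_t h = costly_hash(key);
      nodes.push_back(node{key, h});
    }
  return nodes;
}

// Recompute the bucket of every node, as a rehash would have to do.
// Returns the number of hash functor calls this needed.
std::size_t rehash_cost(const std::vector<node>& nodes, std::size_t buckets,
                        bool use_cache)
{
  const std::size_t before = hash_calls;
  std::size_t sink = 0;
  for (const node& n : nodes)
    {
      const std::size_t h = use_cache ? n.cached_hash : costly_hash(n.key);
      sink += h % buckets;
    }
  (void) sink;
  return hash_calls - before;
}
```

With the cache, `rehash_cost(nodes, 17, true)` needs no hash calls at all; without it, every node is hashed again, which is the cost the string bench is meant to expose.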


2011-11-04  François Dumont 

* testsuite/performance/23_containers/insert_erase/41975.cc: Add
tests to check performance with or without cache of hash code and with
string type that has a costlier hash functor than int type.

Ok to commit ?

Looks Ok, but, as usual, watch overlong lines!

Paolo.



Index: testsuite/performance/23_containers/insert_erase/41975.cc
===
--- testsuite/performance/23_containers/insert_erase/41975.cc	(revision 181036)
+++ testsuite/performance/23_containers/insert_erase/41975.cc	(working copy)
@@ -17,56 +17,167 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-#include 
+#include 
 #include 
 #include 
 
-int main()
+namespace
 {
-  using namespace __gnu_test;
+  // Bench using an unordered_set. Hash functor for int is quite
+  // predictable so it helps bench very specific use cases.
+  template
+void bench()
+{
+  using namespace __gnu_test;
+  std::ostringstream ostr;
+  ostr << "unordered_set " << (use_cache ? "with" : "without")
+	   << " cache";
+  const std::string desc = ostr.str();
 
-  time_counter time;
-  resource_counter resource;
+  time_counter time;
+  resource_counter resource;
 
-  start_counters(time, resource);
+  const int nb = 20;
+  start_counters(time, resource);
 
-  std::unordered_set us;
-  for (int i = 0; i != 500; ++i)
-us.insert(i);
+  std::__unordered_set, std::equal_to,
+			   std::allocator, use_cache> us;
+  for (int i = 0; i != nb; ++i)
+	us.insert(i);
 
-  stop_counters(time, resource);
-  report_performance(__FILE__, "Container generation", time, resource);
+  stop_counters(time, resource);
+  ostr.str("");
+  ostr << desc << ": first insert";
+  report_performance(__FILE__, ostr.str().c_str(), time, resource);
 
-  start_counters(time, resource);
+  start_counters(time, resource);
 
-  for (int j = 100; j != 0; --j)
-{
-  auto it = us.begin();
-  while (it != us.end())
+  // Here is the worst erase use case when hashtable implementation was
+  // something like vector>. Erasing from the end was very
+  // costly because we need to return the iterator following the erased
+  // one, as the hashtable is getting emptier at each step there are
+  // more and more empty bucket to loop through to reach the end of the
+  // container and find out that it was in fact the last element.
+  for (int j = nb - 1; j >= 0; --j)
 	{
-	  if ((*it % j) == 0)
-	it = us.erase(it);
-	  else
-	++it;
+	  auto it = us.find(j);
+	  if (it != us.end())
+	us.erase(it);
 	}
+
+  stop_counters(time, resource);
+  ostr.str("");
+  ostr << desc << ": erase from iterator";
+  report_performance(__FILE__, ostr.str().c_str(), time, resource);
+
+  start_counters(time, resource);
+
+  // This is a worst insertion use case for the current implementation as
+  // we insert an element at the begining of the hashtable and then we
+  // insert starting at the end so that each time we need to seek up to the
+  // first bucket to find the first non-empty one.
+  us.insert(0);
+  for (int i = nb - 1; i >= 0; --i)
+	us.insert(i);
+
+  stop_counters(time, resource);
+  ostr.str("");
+  ostr << desc << ": second insert";
+  report_performance(__FILE__, ostr.str().c_str(), time, resource);
+
+  start_counters(time, resource);
+
+  for (int j = nb - 1; j >= 0; --j)
+	us.erase(j);
+
+  stop_counters(time, resource);
+  ostr.str("");
+  ostr << desc << ": erase from key";
+  report_performance(__FILE__, ostr.str().c_str(), time, resource);
 }
 
-  stop_counters(time, resource);
-  report_performance(__FILE__, "Container erase", time, resource);
+  // Bench using unordered_set that show how important it is to cache
+  // hash code as computing string hash code is quite expensive compared to
+  // computing it for int.
+  template
+void bench_str()
+{
+  using namespace __gnu_test;
+  std::ostringstream ostr;
+  ostr << "unordered_set " << (use_cache ? "with" : "without")
+	   << " cache";

Re: [Patch,Fortran] PR39427/37829 - implement F2003's constructors

2011-11-06 Thread Tobias Burnus

On 06.11.2011 17:26, Tobias Burnus wrote:
I just realized that my patch email did not come through - however, I 
did not get any reject email. Let's try first without patch - it's 
available at 
http://users.physik.fu-berlin.de/~tburnus/tmp/constructor.diff


I wondered whether the patch exceeded the attachment size limit - I think 
it's around 100 kB. However, even the gzipped email (about 20 kB) did not 
get through.


Thus, you have to live with the URL above. As I do not know what the 
problem is, I cannot really solve it.


Tobias

PS: The patch for the release notes is attached - let's see whether that 
patch works.
Index: htdocs/gcc-4.7/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.7/changes.html,v
retrieving revision 1.53
diff -u -p -r1.53 changes.html
--- htdocs/gcc-4.7/changes.html	1 Nov 2011 15:15:33 -	1.53
+++ htdocs/gcc-4.7/changes.html	6 Nov 2011 15:11:20 -
@@ -373,6 +373,14 @@ long double pi = 180_degrees;-fno-backtrace. Note: GNU Fortran does
   not support backtracing on all targets.
+Fortran 2003:
+  
+   Generic interface name which have the same name as derived types
+ are now supported, which allows to write constructor functions. Note
+ that Fortran does not support static constructor functions; only
+ default initialization or an explicit structure-constructor
+ initialization are available.
+  
 Fortran 2008:
   
 Support for the DO CONCURRENT construct has been


Re: [patch] 17/n: trans-mem: compiler trans-mem main engine

2011-11-06 Thread Torvald Riegel
On Fri, 2011-11-04 at 22:48 +0100, Torvald Riegel wrote:
> On Thu, 2011-11-03 at 20:38 +, Joseph S. Myers wrote:
> > Make sure that you do need each #include present in this and any other new 
> > file.  Since 2008 a lot of includes of tm.h and toplev.h have been removed 
> > and diagnostic-core.h introduced as an alternative to diagnostic.h for 
> > many users.  If tm.h is needed, it's good to have a comment on the 
> > #include explaining why (which target macros are used, for example), since 
> > we'd like to avoid unnecessary tm.h includes.
> 
> See attached patch. OK for branch?

I should have updated the build dependencies too, which this patch does.
OK for branch?
commit f71658c4f454f8327365891e878d5810102f04da
Author: Torvald Riegel 
Date:   Sun Nov 6 15:25:16 2011 +0100

Fix dependencies for trans-mem.o.

* Makefile.in: Fix dependencies for trans-mem.o.

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 027ea42..625ccbc 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2310,10 +2310,10 @@ gtype-desc.o: gtype-desc.c $(CONFIG_H) $(SYSTEM_H) 
coretypes.h $(TM_H) \
$(CFGLOOP_H) $(TARGET_H) $(IPA_PROP_H) $(LTO_STREAMER_H) \
target-globals.h
 
-trans-mem.o : trans-mem.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
-   $(TREE_H) $(GIMPLE_H) $(TREE_FLOW_H) tree-pass.h except.h \
-   $(DIAGNOSTIC_H) $(TOPLEV_H) $(FLAGS_H) $(DEMANGLE_H) $(TRANS_MEM_H) \
-   $(TREE_DUMP_H) $(PARAMS_H) $(TARGET_DEF_H) \
+trans-mem.o : trans-mem.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
+   $(TREE_H) $(GIMPLE_H) $(TREE_FLOW_H) $(TREE_PASS_H) $(TREE_INLINE_H) \
+   $(DIAGNOSTIC_CORE_H) $(DEMANGLE_H) output.h $(TRANS_MEM_H) \
+   $(PARAMS_H) $(TARGET_H) langhooks.h \
tree-pretty-print.h gimple-pretty-print.h
 
 ggc-common.o: ggc-common.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \


Re: [Patch,Fortran] PR39427/37829 - implement F2003's constructors

2011-11-06 Thread Tobias Burnus
Last try: Also gzip the release notes - let's see whether the mail server 
accepts that email.


Tobias

PS: I really hate that the email simply gets dropped without any 
reject email or any other status. Seemingly, my other emails without 
patches go through!


I just realized that my patch email did not come through - however, 
I did not get any reject email. Let's try first without patch - it's 
available at 
http://users.physik.fu-berlin.de/~tburnus/tmp/constructor.diff


I wondered whether the patch exceeded the attachment size limit - I think 
it's around 100 kB. However, even the gzipped email (about 20 kB) did 
not get through.


Thus, you have to live with the URL above. As I do not know what 
the problem is, I cannot really solve it.


PS: The patch for the release notes is attached







releasenotes.diff.gz
Description: GNU Zip compressed data


constructor-v4a.diff.gz
Description: GNU Zip compressed data


Re: [Patch,Fortran] PR39427/37829 - implement F2003's constructors

2011-11-06 Thread Tobias Burnus

On 06.11.2011 17:26, Tobias Burnus wrote:
I just realized that my patch email did not come through - however, I 
did not get any reject email.


Let's try first without patch - it's available at 
http://users.physik.fu-berlin.de/~tburnus/tmp/constructor.diff


Possibly it is too large - it's below 100 KiB, but maybe 7bit encoding 
makes it larger? Now gzipped. For the intro text, see: 
http://gcc.gnu.org/ml/fortran/2011-11/msg00055.html


Tobias


constructor-v4a.diff.gz
Description: GNU Zip compressed data
Index: htdocs/gcc-4.7/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.7/changes.html,v
retrieving revision 1.53
diff -u -p -r1.53 changes.html
--- htdocs/gcc-4.7/changes.html	1 Nov 2011 15:15:33 -	1.53
+++ htdocs/gcc-4.7/changes.html	6 Nov 2011 15:11:20 -
@@ -373,6 +373,14 @@ long double pi = 180_degrees;-fno-backtrace. Note: GNU Fortran does
   not support backtracing on all targets.
+Fortran 2003:
+  
+   Generic interface name which have the same name as derived types
+ are now supported, which allows to write constructor functions. Note
+ that Fortran does not support static constructor functions; only
+ default initialization or an explicit structure-constructor
+ initialization are available.
+  
 Fortran 2008:
   
 Support for the DO CONCURRENT construct has been


Re: [patch, Fortran] Fix PR 50690

2011-11-06 Thread Thomas Koenig

Hi Tobias,

I'm just back from holiday, which is why it took me a bit longer to reply.


Actually, the test case is *not* OK.

If one compiles the original test case of the PR (or your
workshare2.f90) with "-O" and looks at "-fdump-tree-original", one finds:

 #pragma omp parallel default(shared)
   {
 {
   real(kind=4) __var_1;
   {
 #pragma omp single
   {
 __var_1 = __builtin_cosf (b[0])
   }
...
 #pragma omp for schedule(static) nowait
 for (S.1 = 1; S.1 <= 5; S.1 = S.1 + 1)
   {
 a[S.1 + -1] = a[S.1 + -1] * D.1730 + a[S.1 + -1] *
D.1731;

Thus, __var_1 is a thread-local variable; however, COS() is not executed
in all threads but only in one due to the omp single: "The single
construct specifies that the associated structured block is executed by
only one of the threads in the team" (2.5.3 single Construct, OpenMP 3.1).

Jakub remarks that omp single is what we expand to omp workshare if it
is not simple enough for us.
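
The problem described above can be illustrated without real OpenMP. The following is only a sequential simulation under stated assumptions: each "thread" owns one slot of a vector standing in for its private copy, thread 0 is arbitrarily picked as the one executing the single region, and `threads_with_valid_value` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sequential simulation (no real OpenMP): counts how many "threads" see a
// valid value after a region that only one thread executes.
int threads_with_valid_value(int num_threads, bool variable_is_private)
{
  const double unset = std::nan("");
  std::vector<double> priv(num_threads, unset); // one private copy per thread
  double shared = unset;                        // one copy for the whole team

  // "omp single": only one thread (here thread 0) performs the assignment.
  const int executing_thread = 0;
  if (variable_is_private)
    priv[executing_thread] = std::cos(3.344);
  else
    shared = std::cos(3.344);

  // Work-shared loop afterwards: every thread reads "its" variable.
  int valid = 0;
  for (int t = 0; t != num_threads; ++t)
    {
      const double seen = variable_is_private ? priv[t] : shared;
      if (!std::isnan(seen))
        ++valid;
    }
  return valid;
}
```

With a private temporary only the executing thread ends up with the cosine value; with a shared one all threads do, which is why the placement of the temporary matters here.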


I modified the test case as below

! { dg-do run }
! { dg-options "-ffrontend-optimize" }
! PR 50690 - this used to ICE because workshare could not handle
! BLOCKs.
program foo
  implicit none
  integer, parameter :: n = 1
  real, parameter :: eps = 3e-7
  integer :: i
  real :: A(n), B(5), C(n)
  B(1) = 3.344
  do i=1,10
  call random_number(a)
  c = a
  !$omp parallel default(shared)
  !$omp workshare
   A(:) = A(:)*cos(B(1))+A(:)*cos(B(1))
  !$omp end workshare nowait
  !$omp end parallel ! sync is implied here
  end do
  c = c*cos(b(1)) + c*cos(b(1))
  if (any(abs(a-c) > eps)) call abort
end program foo

and did indeed see an abort.

With the patch below (based on an earlier patch, fiddling with the OMP
clauses), the test case above passes, although the tree dump shows the
same issue that you referred to.

What would be the best strategy now?  Jakub, could you check the patch
for correctness?  Should I combine the workshare-6.diff approach 
(modifying the BLOCKs) with this one?  This will certainly make the
patch more complicated, but is doable.

Thomas
Index: frontend-passes.c
===
--- frontend-passes.c	(revision 180394)
+++ frontend-passes.c	(working copy)
@@ -66,6 +66,13 @@
 
 static int forall_level;
 
+/* Keep track of the OMP blocks, so we can mark variables introduced
+   by optimizations as private.  */
+
+static int omp_level;
+static int omp_size;
+static gfc_code **omp_block;
+
 /* Entry point - run all passes for a namespace.  So far, only an
optimization pass is run.  */
 
@@ -76,12 +83,15 @@
 {
   expr_size = 20;
   expr_array = XNEWVEC(gfc_expr **, expr_size);
+  omp_size = 20;
+  omp_block = XCNEWVEC(gfc_code *, omp_size);
 
   optimize_namespace (ns);
   if (gfc_option.dump_fortran_optimized)
 	gfc_dump_parse_tree (ns, stdout);
 
   XDELETEVEC (expr_array);
+  XDELETEVEC (omp_block);
 }
 }
 
@@ -245,9 +255,17 @@
   gfc_namespace *ns;
   int i;
 
-  /* If the block hasn't already been created, do so.  */
-  if (inserted_block == NULL)
+  /* If the block hasn't already been created, do so.  If we are within
+ an OMP construct, create the temporary variable in the current block.
+ This is to avoid problems with OMP workshare.  */
+
+  if (omp_level > 0)
 {
+  ns = current_ns;
+  changed_statement = current_code;
+}
+  else if (inserted_block == NULL)
+{
   inserted_block = XCNEW (gfc_code);
   inserted_block->op = EXEC_BLOCK;
   inserted_block->loc = (*current_code)->loc;
@@ -309,6 +327,20 @@
   symbol->attr.flavor = FL_VARIABLE;
   symbol->attr.referenced = 1;
   symbol->attr.dimension = e->rank > 0;
+
+  if (omp_level > 0)
+{
+  /* Insert an OMP PRIVATE clause for the new variable.  */
+  gfc_omp_clauses *clauses;
+  gfc_namelist *nn;
+
+  clauses = omp_block[omp_level-1]->ext.omp_clauses;
+  nn = gfc_get_namelist ();
+  nn->sym = symbol;
+  nn->next = clauses->lists[OMP_LIST_PRIVATE];
+  clauses->lists[OMP_LIST_PRIVATE] = nn;
+}
+
   gfc_commit_symbol (symbol);
 
   result = gfc_get_expr ();
@@ -505,6 +537,7 @@
 
   current_ns = ns;
   forall_level = 0;
+  omp_level = 0;
 
   gfc_code_walker (&ns->code, convert_do_while, dummy_expr_callback, NULL);
   gfc_code_walker (&ns->code, cfe_code, cfe_expr_0, NULL);
@@ -1149,11 +1182,13 @@
 	  gfc_actual_arglist *a;
 	  gfc_code *co;
 	  gfc_association_list *alist;
+	  bool in_omp;
 
 	  /* There might be statement insertions before the current code,
 	 which must not affect the expression walker.  */
 
 	  co = *c;
+	  in_omp = false;
 
 	  switch (co->op)
 	{
@@ -1339,6 +1374,18 @@
 	case EXEC_OMP_WORKSHARE:
 	case EXEC_OMP_END_SINGLE:
 	case EXEC_OMP_TASK:
+
+	  in_omp = 1;
+
+	  if (omp_level >= omp_size)
+		{
+		  omp_size += omp_size;
+		  omp_bl

[C++ Patch] PR 47695

2011-11-06 Thread Paolo Carlini
Hi,

duplicate diagnostics for (a pretty common error, I would guess):

void f() = delete;
void g() { f(); }

47695.C: In function ‘void g()’:
47695.C:2:12: error: use of deleted function ‘void f()’
47695.C:1:6: error: declared here
47695.C:2:14: error: use of deleted function ‘void f()’
47695.C:1:6: error: declared here

I'm fixing it by returning a bool from mark_used and checking it in
finish_id_expression. Works fine, passes the testsuite. Or shall we do
something more sophisticated?!?
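
The idea of the fix can be sketched in isolation. This is a toy under stated assumptions, not the GCC front end: `decl`, `mark_used_sketch`, `build_call_sketch` and `process_use` are hypothetical stand-ins, and the "later step" that marks the declaration again is only a model of where the second diagnostic might come from.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins; none of these are the real GCC front-end types.
struct decl
{
  std::string name;
  bool deleted;
};

std::vector<std::string> diagnostics;

// Reports a use of a deleted function. Returns false if an error was
// emitted, true if the declaration may be used.
bool mark_used_sketch(const decl& d)
{
  if (d.deleted)
    {
      diagnostics.push_back("use of deleted function '" + d.name + "'");
      return false;
    }
  return true;
}

// A later processing step that marks the declaration as used again.
void build_call_sketch(const decl& d)
{
  mark_used_sketch(d);
}

// With check_result set, a failed mark_used_sketch stops processing, so
// the later step cannot repeat the diagnostic.
void process_use(const decl& d, bool check_result)
{
  const bool ok = mark_used_sketch(d);
  if (check_result && !ok)
    return;
  build_call_sketch(d);
}
```

Calling `process_use` on a deleted function with `check_result` false records the error twice; with `check_result` true it is recorded once.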

Thanks,
Paolo.

PS: sorry if you are receiving this message twice, today I'm
experiencing problems with my connection




CL_47695
Description: Binary data


patch_47695
Description: Binary data


Re: [PATCH] Don't prevent merging of bbs because of user labels (PR tree-optimization/50693)

2011-11-06 Thread Eric Botcazou
> As has been suggested by Alexandre in the PR, this patch allows merging
> basic blocks which couldn't be merged before because of user (non-forced)
> labels at the beginning of the second basic blocks.
> With this patch the user label is thrown away (for -g0 or -g
> -fno-var-tracking-assignments) or turned into a debug bind stmt which
> contains the label.

Do we really want to do that if !optimize?

-- 
Eric Botcazou


Re: [rs6000] fix PR 30282, load from stack moved past stack update

2011-11-06 Thread David Edelsohn
On Fri, Nov 4, 2011 at 9:43 AM, Alan Modra  wrote:
> This patch fixes PR30282, caused by instructions being scheduled over
> the stack reset.  Note that only for ABI_V4 do we currently have
> frame_reg_rtx != sp_reg_rtx in rs6000_emit_stack_reset, so the patch
> doesn't emit *less* blockages.  I did benchmark this change and saw
> nothing but the usual benchmark noise.
>
> Bootstrapped etc. powerpc-linux.  OK to apply?
>
>        * config/rs6000/rs6000.c (rs6000_emit_stack_reset): Always emit
>        blockage for ABI_V4.

Okay.

Thanks, David


Re: [patch] 20/n: trans-mem: Unified change logs

2011-11-06 Thread Torvald Riegel
On Sun, 2011-11-06 at 06:09 -0800, Aldy Hernandez wrote:
> On 11/06/11 06:05, Torvald Riegel wrote:
> > The patch adds unified changelogs. (Before merging, we would remove the
> > ChangeLog.tm-merge files and add their contents to the respective
> > ChangeLog files). libitm/ChangeLog would stay as is, I suppose.
> >
> > OK for branch?
> 
> Thanks so much.
> 
> OK.

Committed. Please update the affected ChangeLog.tm-merge files from now
on.



Re: [Patch,Fortran] PR39427/37829 - implement F2003's constructors

2011-11-06 Thread Tobias Burnus

Also the following failed. Thus, the patch links again:
- http://users.physik.fu-berlin.de/~tburnus/tmp/constructor.diff
- Release notes: 
http://users.physik.fu-berlin.de/~tburnus/tmp/releasenotes.diff


Tobias

PS: I really hate that the email simply gets dropped without any reject 
email or any other status. Seemingly, my other emails without patches go 
through!


I just realized that my patch email did not come through - however, I 
did not get any reject email. Let's try first without patch - it's 
available at 
http://users.physik.fu-berlin.de/~tburnus/tmp/constructor.diff


I wondered whether the patch exceeded the attachment size limit - I think 
it's around 100 kB. However, even the gzipped email (about 20 kB) did 
not get through.


Thus, you have to live with the URL above. As I do not know what the 
problem is, I cannot really solve it.


PS: The patch for the release notes is attached




Re: [Patch,Fortran] PR39427/37829 - implement F2003's constructors

2011-11-06 Thread Tobias Burnus
I just realized that my patch email did not come through - however, I 
did not get any reject email.


Let's try first without patch - it's available at 
http://users.physik.fu-berlin.de/~tburnus/tmp/constructor.diff


Tobias

Rouson, Damian wrote:

Bravo! Thanks for all the hard work, Tobias.

Although I realize many people will (correctly) label the constructor
capability as syntactic sugar, it supports an idiom that is common across
OOP languages as you point out.  Common idioms have expressive power.

Damian

On 11/6/11 6:29 AM, "Tobias Burnus"  wrote:


Dear all,

this patch fixes as collateral effect PR 37829 (alias PR 45190) where
C_PTR/C_FUNPTR occurred when use associating a module using them, if one
additionally uses iso_c_binding directly.

The main part of this patch, however, is for PR 39427 (alias 45190):
Allowing generic functions to have the same name as a derived type,
which is a Fortran 2003 feature. In expressions, the generic functions
have a higher precedence than the structure constructor. Note that the
functions are not required to return the derived type.

This feature allows one to create something which looks similar to
constructors in other OOP languages, except that static constructor
functions do not exist.

This patch implements them by creating for each derived type two symbols
(symtrees): One for the derived type and one for the generic function,
which links to the derived type. To distinguish them, the derived type
starts with a capital letter in the symtree. In order to facilitate the
error-message handling, the symbol itself remains in lower case.
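
The two-entry scheme described above can be sketched with an ordinary map. This is only an illustration under stated assumptions, not gfortran's symtree code: `symbol`, `symtab`, `derived_key` and `declare_derived_type` are hypothetical names, and names are assumed to arrive already lower-cased, as Fortran identifiers are case-insensitive.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <string>

enum class kind { derived_type, generic_function };

// Hypothetical stand-in for a symbol; not the gfortran data structure.
struct symbol
{
  std::string name;      // stays lower case, e.g. for error messages
  kind what;
  const symbol* derived; // for the generic: link to the derived type
};

using symtab = std::map<std::string, symbol>;

// Lookup key of the derived type: the name with a capital first letter.
std::string derived_key(std::string name)
{
  if (!name.empty())
    name[0] = static_cast<char>(
      std::toupper(static_cast<unsigned char>(name[0])));
  return name;
}

// Enter both symbols for one derived type declaration.
void declare_derived_type(symtab& tab, const std::string& name)
{
  symbol& type = tab[derived_key(name)];
  type = symbol{name, kind::derived_type, nullptr};
  // std::map never moves its elements, so &type stays valid.
  tab[name] = symbol{name, kind::generic_function, &type};
}
```

After `declare_derived_type(tab, "point")`, looking up `"point"` yields the generic function and `"Point"` yields the derived type, while both symbols keep the lower-case name.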

The main challenges were to ensure that one gets the derived type when
needed and to store them properly in the module. The most time consuming
part was to find all the places one had to change, since issues with
module reading could turn up much later; for instance at resolution time
of a scope which had read that module. In total, it took 18 months
between the first draft patch (cf. PR39427 comment 6, 12-14) and the
final patch. Although the patch looked almost working by then, it took
many, many, many hours to fix the issues. Also the RFC patch, posted 6
days ago, had more issues than I had hoped for.

The attached patch was built on x86-64-linux and successfully
regtested (gfortran and libgomp). (A full bootstrap of an almost-ready
version was done as well; I had to rebuild because I found some
left-over commented code blocks.)

Additionally, I tried the previous patches with several programs to
reduce the likelihood that it breaks real-world code. In particular, the
very latest version of the patch was used to compile FLEUR, Elk, Octopus
and the Polyhedron benchmark. Yesterday evening's version was used to
compile the Exciting code (which includes the sensitive FoX Fortran XML
library), CP2K, PSBLAS and FGSL. With a slightly older version, I also
successfully compiled Tonto, Quantum Espresso and Abinit.

OK for the trunk?

Tobias

PS: I have also included a patch for the website, i.e.
http://gcc.gnu.org/gcc-4.7/changes.html#fortran

PPS: As mentioned in the attachment, the patch includes the tree-walking
patch, which was posted before. It's really an independent bug, even
if it is only exposed by the constructor patch. I can either commit it
before or as part of this patch. See also
http://gcc.gnu.org/ml/fortran/2011-11/msg00026.html







Re: [Patch,Fortran] PR39427/37829 - implement F2003's constructors

2011-11-06 Thread Rouson, Damian
Bravo! Thanks for all the hard work, Tobias.

Although I realize many people will (correctly) label the constructor
capability as syntactic sugar, it supports an idiom that is common across
OOP languages as you point out.  Common idioms have expressive power.

Damian 

On 11/6/11 6:29 AM, "Tobias Burnus"  wrote:

>Dear all,
>
>this patch fixes as collateral effect PR 37829 (alias PR 45190) where
>C_PTR/C_FUNPTR occurred when use associating a module using them, if one
>additionally uses iso_c_binding directly.
>
>The main part of this patch, however, is for PR 39427 (alias 45190):
>Allowing generic functions to have the same name as a derived type,
>which is a Fortran 2003 feature. In expressions, the generic functions
>have a higher precedence than the structure constructor. Note that the
>functions are not required to return the derived type.
>
>This feature allows one to create something which looks similar to
>constructors in other OOP languages, except that static constructor
>functions do not exist.
>
>This patch implements them by creating for each derived type two symbols
>(symtrees): One for the derived type and one for the generic function,
>which links to the derived type. To distinguish them, the derived type
>starts with a capital letter in the symtree. In order to facilitate the
>error-message handling, the symbol itself remains in lower case.
>
>The main challenges were to ensure that one gets the derived type when
>needed and to store them properly in the module. The most time consuming
>part was to find all the places one had to change, since issues with
>module reading could turn up much later; for instance at resolution time
>of a scope which had read that module. In total, it took 18 months
>between the first draft patch (cf. PR39427 comment 6, 12-14) and the
>final patch. Although the patch looked almost working by then, it took
>many, many, many hours to fix the issues. Also the RFC patch, posted 6
>days ago, had more issues than I had hoped for.
>
>The attached patch was built on x86-64-linux and successfully
>regtested (gfortran and libgomp). (A full bootstrap of an almost-ready
>version was done as well; I had to rebuild because I found some
>left-over commented code blocks.)
>
>Additionally, I tried the previous patches with several programs to
>reduce the likelihood that it breaks real-world code. In particular, the
>very latest version of the patch was used to compile FLEUR, Elk, Octopus
>and the Polyhedron benchmark. Yesterday evening's version was used to
>compile the Exciting code (which includes the sensitive FoX Fortran XML
>library), CP2K, PSBLAS and FGSL. With a slightly older version, I also
>successfully compiled Tonto, Quantum Espresso and Abinit.
>
>OK for the trunk?
>
>Tobias
>
>PS: I have also included a patch for the website, i.e.
>http://gcc.gnu.org/gcc-4.7/changes.html#fortran
>
>PPS: As mentioned in the attachment, the patch includes the tree-walking
>patch, which was posted before. It's really an independent bug, even
>if it is only exposed by the constructor patch. I can either commit it
>before or as part of this patch. See also
>http://gcc.gnu.org/ml/fortran/2011-11/msg00026.html




Re: [patch] 6/n: trans-mem: runtime

2011-11-06 Thread Torvald Riegel
On Thu, 2011-11-03 at 20:15 +, Joseph S. Myers wrote:
> Do you need a FLAGS_TO_PASS setting as in 
> ?  (The way to 
> test is to do a multilib build and install, passing infodir=/some/where on 
> the "make install" line, and see if the manual ends up installed in the 
> configured directory under $prefix as well or instead of the directory 
> passed on the "make install" line - it should only go in the directory 
> passed to "make install".)

I can't reproduce this in a multilib config. The libitm info file is
correctly installed to the infodir= location only, so I'd assume this
works fine as is. But thanks for the note.

Torvald



Re: [patch] 17/n: trans-mem: compiler trans-mem main engine

2011-11-06 Thread Aldy Hernandez

On 11/06/11 07:28, Torvald Riegel wrote:

On Fri, 2011-11-04 at 22:48 +0100, Torvald Riegel wrote:

On Thu, 2011-11-03 at 20:38 +, Joseph S. Myers wrote:

Make sure that you do need each #include present in this and any other new
file.  Since 2008 a lot of includes of tm.h and toplev.h have been removed
and diagnostic-core.h introduced as an alternative to diagnostic.h for
many users.  If tm.h is needed, it's good to have a comment on the
#include explaining why (which target macros are used, for example), since
we'd like to avoid unnecessary tm.h includes.


See attached patch. OK for branch?


I should have updated the build dependencies too, which this patch does.
OK for branch?


Yes.


cxx-mem-model branch merged to mainline - revision 181031

2011-11-06 Thread Andrew MacLeod


Just checked in the patches.   I will also post the exact patches now.  
I am also checking out mainline and building it again.


Andrew




Re: r181016 - in /trunk: contrib/ChangeLog contrib/...

2011-11-06 Thread Joern Rennecke

Quoting "Joseph S. Myers" :


All these properties look wrong; I see no reason for any of those files to
be executable.


Indeed.  I didn't know svn add would do that.
I've removed these spurious properties.


Re: r181016 - in /trunk: contrib/ChangeLog contrib/...

2011-11-06 Thread Joseph S. Myers
On Sat, 5 Nov 2011, amyl...@gcc.gnu.org wrote:

> Propchange: trunk/gcc/config/epiphany/epiphany-modes.def
> ('svn:executable' added)
> 
> Propchange: trunk/gcc/config/epiphany/epiphany-protos.h
> ('svn:executable' added)
> 
> Propchange: trunk/gcc/config/epiphany/epiphany.c
> ('svn:executable' added)
> 
> Propchange: trunk/gcc/config/epiphany/epiphany.h
> ('svn:executable' added)
> 
> Propchange: trunk/gcc/config/epiphany/epiphany.md
> ('svn:executable' added)
> 
> Propchange: trunk/gcc/config/epiphany/epiphany.opt
> ('svn:executable' added)
> 
> Propchange: trunk/gcc/config/epiphany/predicates.md
> ('svn:executable' added)
> 
> Propchange: trunk/gcc/config/epiphany/t-epiphany
> ('svn:executable' added)

All these properties look wrong; I see no reason for any of those files to 
be executable.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [patch] 20/n: trans-mem: Unified change logs

2011-11-06 Thread Aldy Hernandez

On 11/06/11 06:05, Torvald Riegel wrote:

The patch adds unified changelogs. (Before merging, we would remove the
ChangeLog.tm-merge files and add their contents to the respective
ChangeLog files). libitm/ChangeLog would stay as is, I suppose.

OK for branch?


Thanks so much.

OK.


Re: [patch] 9/n: trans-mem: compiler ChangeLog entries

2011-11-06 Thread Torvald Riegel
On Fri, 2011-11-04 at 15:34 +0100, Michael Matz wrote:
> Hi,
> 
> On Thu, 3 Nov 2011, Aldy Hernandez wrote:
> 
> > +2010-05-28  Aldy Hernandez  
> > +
> > +   * target-def.h (TARGET_VECTORIZE_BUILTIN_TM_LOAD): Define.
> > +   (TARGET_VECTORIZE_BUILTIN_TM_STORE): Same.
> > +   (TARGET_VECTORIZE): Add TM callbacks.
> 
> This actually meanwhile is in target.def, you don't change target-def.h 
> anymore.

This had already been changed a while ago, but was not visible in the
changelog. I will send unified changelogs next.

Torvald



Re: Use of vector instructions in memmov/memset expanding

2011-11-06 Thread Jan Hubicka
> 
> There is rolled loop algorithm, that doesn't use SSE-modes - such
> architectures could use it instead of unrolled_loop. I think the
> performance wouldn't suffer much from that.
> For the most of modern processors, SSE-moves are faster than several
> word-sized moves, so this change in unrolled_loop implementation seems
> reasonable to me, but, of course, if you think introducing
> sse_unrolled_move is worth doing, it could be done.

This doesn't seem to be quite true for AMD chips.  With your change I get on
amdfam10 hardware:
memset
                    libcall     rep1    noalg     rep4    noalg     rep8    noalg     loop    noalg     unrl    noalg     byte profiled  dynamic
block size 8192000  0:00.62  0:00.55  0:00.54  0:00.54  0:00.60  0:00.54  0:00.57  0:00.54  0:00.57  0:00.54  0:00.60  0:03.63  0:00.62  0:00.62  best: 0:00.54 loop
block size  819200  0:00.65  0:00.64  0:00.64  0:00.64  0:00.63  0:00.64  0:00.62  0:00.64  0:00.66  0:00.66  0:00.69  0:04.10  0:00.66  0:00.66  best: 0:00.62 rep8noalign
block size   81920  0:00.21  0:00.21  0:00.21  0:00.21  0:00.27  0:00.21  0:00.21  0:00.20  0:00.27  0:00.20  0:00.25  0:04.18  0:00.21  0:00.21  best: 0:00.20 loop
block size   20480  0:00.18  0:00.18  0:00.18  0:00.18  0:00.24  0:00.18  0:00.20  0:00.20  0:00.28  0:00.17  0:00.23  0:04.29  0:00.18  0:00.18  best: 0:00.17 unrl
block size    8192  0:00.15  0:00.15  0:00.15  0:00.15  0:00.21  0:00.15  0:00.17  0:00.25  0:00.28  0:00.14  0:00.20  0:01.26  0:00.14  0:00.15  best: 0:00.14 unrl
block size    4096  0:00.15  0:00.15  0:00.16  0:00.15  0:00.21  0:00.14  0:00.16  0:00.25  0:00.23  0:00.14  0:00.19  0:01.24  0:00.15  0:00.15  best: 0:00.14 rep8
block size    2048  0:00.16  0:00.18  0:00.18  0:00.17  0:00.23  0:00.16  0:00.18  0:00.26  0:00.21  0:00.17  0:00.21  0:01.25  0:00.16  0:00.16  best: 0:00.16 libcall
block size    1024  0:00.19  0:00.24  0:00.24  0:00.20  0:00.25  0:00.17  0:00.19  0:00.28  0:00.23  0:00.21  0:00.26  0:01.26  0:00.17  0:00.16  best: 0:00.17 rep8
block size     512  0:00.23  0:00.34  0:00.33  0:00.23  0:00.26  0:00.20  0:00.22  0:00.31  0:00.27  0:00.27  0:00.29  0:01.29  0:00.20  0:00.19  best: 0:00.20 rep8
block size     256  0:00.29  0:00.51  0:00.51  0:00.28  0:00.30  0:00.25  0:00.26  0:00.38  0:00.35  0:00.39  0:00.38  0:01.33  0:00.24  0:00.25  best: 0:00.25 rep8
block size     128  0:00.39  0:00.76  0:00.76  0:00.40  0:00.42  0:00.38  0:00.40  0:00.52  0:00.51  0:00.55  0:00.52  0:01.41  0:00.37  0:00.38  best: 0:00.38 rep8
block size      64  0:00.72  0:00.95  0:00.95  0:00.70  0:00.73  0:00.65  0:00.72  0:00.75  0:00.75  0:00.74  0:00.76  0:01.48  0:00.64  0:00.65  best: 0:00.65 rep8
block size      48  0:00.89  0:00.98  0:01.12  0:00.86  0:00.72  0:00.83  0:00.88  0:00.94  0:00.92  0:00.92  0:00.91  0:01.71  0:00.93  0:00.67  best: 0:00.72 rep4noalign
block size      32  0:01.18  0:01.30  0:01.30  0:01.11  0:01.13  0:01.11  0:01.13  0:01.20  0:01.19  0:01.13  0:01.19  0:01.79  0:01.15  0:01.11  best: 0:01.11 rep4
block size      24  0:01.57  0:01.71  0:01.71  0:01.52  0:01.51  0:01.52  0:01.52  0:01.57  0:01.56  0:01.49  0:01.52  0:02.30  0:01.46  0:01.53  best: 0:01.49 unrl
block size      16  0:02.53  0:02.61  0:02.61  0:02.63  0:02.52  0:02.64  0:02.61  0:02.56  0:02.52  0:02.25  0:02.50  0:03.08  0:02.26  0:02.63  best: 0:02.25 unrl
block size      14  0:02.73  0:02.77  0:02.77  0:02.62  0:02.58  0:02.64  0:02.59  0:02.60  0:02.61    sig11    sig11  0:03.58  0:02.48  0:02.67  best: 0:02.58 rep4noalign
block size      12  0:03.29  0:03.09  0:03.08  0:03.02  0:02.98  0:03.06  0:02.96  0:02.89  0:02.96    sig11    sig11  0:03.89  0:02.70  0:03.05  best: 0:02.89 loop
block size      10  0:03.58  0:03.64  0:03.60  0:03.58  0:03.31  0:03.52  0:03.38  0:03.36  0:03.38    sig11    sig11  0:04.42  0:03.10  0:03.43  best: 0:03.31 rep4noalign
block size       8  0:04.19  0:03.76  0:03.75  0:03.98  0:03.83  0:03.82  0:03.70  0:03.70  0:03.80    sig11    sig11  0:04.68    sig11  0:03.87  best: 0:03.70 loop
block size       6  0:06.20  0:05.66  0:05.67  0:05.69  0:05.60  0:05.73  0:05.53  0:05.56  0:05.46    sig11    sig11  0:06.23  0:05.60  0:05.65  best: 0:05.46 loopnoalign
block size       4  0:09.58  0:06.93  0:06.94  0:07.13  0:07.30  0:07.05  0:06.94  0:07.05  0:07.28    sig11    sig11  0:07.37    sig11  0:07.46  best: 0:06.93 rep1
block size       1  0:38.46  0:17.27  0:17.27  0:15.14  0:15.11  0:16.34  0:15.10  0:16.38  0:15.11    sig11  0:15.22  0:16.33  0:14.87  0:16.31  best: 0:15.10 rep8noalign
(sig11 = "Command terminated by signal 11", reported with time 0:00.00)

The ICEs for SSE loop < 16 bytes need to be solved.
memset
                    libcall     rep1    noalg     rep4    noalg     rep8    noalg     loop    noalg     unrl    noalg     byte profiled  dynamic
block size 8192000  0:00.62  0:00.55  0:00.55  0:00.55  0:00.60  0:00.55  0:00.57  0:00.54  0:00.59  0:00.55  0:00.57  0:03.60  0:00

Re: [PATCH] PR target/50038 fix: redundant zero extensions removal

2011-11-06 Thread Ilya Enkovich
2011/11/6 Richard Guenther :
> On Sun, Nov 6, 2011 at 11:50 AM, Ilya Enkovich  wrote:
>> Hello,
>>
>> 2011/11/5 Eric Botcazou :
>>>> Here is a patch which fixes redundant zero extensions problem. Issue
>>>> is resolved by expanding implicit_zee pass functionality to cover zero
>>>> and sign extends of different modes. Could please someone review it?
>>>
>>> Could you explain the underlying idea?  The current strategy of implicit-zee.c
>>> is exposed at length at the beginning of the file, but here's a summary:
>>>
>>>  1. On some architectures (typically x86-64), implicit zero-extensions are
>>> applied when instructions operate in selected sub-word modes (SImode here):
>>>
>>>  addl edi,eax
>>>
>>> has an implicit zero-extension for %rax.
>>>
>>>  2. Because of 1, the second instruction in sequences like:
>>>
>>>  (set (reg:SI x) (plus:SI (reg:SI z1) (reg:SI z2)))
>>>  (set (reg:DI x) (zero_extend:DI (reg:SI x)))
>>>
>>> is redundant.
>>>
>>>  3. The pass recognizes this and transforms the above sequence into:
>>>
>>>  (set (reg:DI x) (zero_extend:DI (plus:SI (reg:SI z1) (reg:SI z2))))
>>>
>>> and the machine description knows how to translate this into an 'addl'.
>>>
>>>
>>> You're proposing extending this to other modes and other architectures, for
>>> example QImode on x86.  But does
>>>
>>>  addb %dl, %al
>>>
>>> modify the entire %eax register on x86?  In other words, are you really after
>>> implicit (zero-)extensions or after something else, like global elimination of
>>> redundant extensions?
>> The initial aim of the pass was to remove zero extensions made redundant
>> by the implicit zero extension on x64. But the implementation actually
>> uses a generic approach and seems like a mini-combiner. The pass may
>> combine two zero extends, or combine a zero extend with a constant as a
>> special case, but in other cases we just try to merge two instructions
>> and then check that we have a corresponding template. It can easily be
>> adapted to remove all redundant extensions. So the byte add in the
>> example will be merged with the zero extend only if we have an explicit
>> template for it in the machine model.
>>
>>>
>>> What's the effect of the patch on the testcase in the PR in terms of insns at
>>> the RTL level?  Why doesn't the combiner already optimize it?
>> The patch helps to remove two zero extends from the RTL in the test from
>> the PR. I believe the zee pass was introduced after the postreload pass
>> because we should have additional memory instructions by that time and
>> therefore more opportunities for optimization after the combiner's work.
>>
>> In this particular test case the combiner could also help, because we
>> have a byte memory load and an extend at the combiner pass. But for some
>> reason it does not merge them. In the combiner dump I see
>>
>> (insn 39 38 40 4 (set (reg/v:QI 81 [ xr ])
>>        (mem:QI (reg/v/f:DI 111 [ ImageInPtr ]) [0 MEM[base:
>> ImageInPtr_29, offset: 0B]+0 S1 A8])) 1.c:9 66 {*movqi_internal}
>>     (nil))
>>
>> (insn 43 42 44 4 (parallel [
>>            (set (reg:SI 116 [ xr ])
>>                (zero_extend:SI (reg/v:QI 81 [ xr ])))
>>            (clobber (reg:CC 17 flags))
>>        ]) 1.c:11 121 {*zero_extendqisi2_movzbl_and}
>>     (expr_list:REG_DEAD (reg/v:QI 81 [ xr ])
>>        (expr_list:REG_UNUSED (reg:CC 17 flags)
>>            (nil))))
>>
>> and
>>
>> Trying 39 -> 43:
>>
>> With no additional information.
>
> Well, I bet it's because of the CC clobber which is there
> because of the use of TARGET_ZERO_EXTEND_WITH_AND.
> Where does that insn get generated?  By combine itself?

This insn is generated by expand:

(insn 43 42 44 6 (parallel [
            (set (reg:SI 116)
                (zero_extend:SI (reg/v:QI 81 [ xr ])))
            (clobber (reg:CC 17 flags))
        ]) 1.c:11 -1
     (nil))

>
> Richard.
>
>>> Enhancing implicit-zee.c to address missed optimizations like the one reported
>>> in target/50038 might well be the best approach, but the strategy shift must be
>>> clearly exposed and discussed.  The reported numbers are certainly impressive.
>>>
>>> --
>>> Eric Botcazou
>>>
>>
>> Ilya
>>
>


[PATCH] Fix fallout from inlinable flag handling change

2011-11-06 Thread Richard Guenther

This fixes ipa-prop to properly re-compute inlinability which
can change from false to true if a mismatched argument is dropped.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2011-11-6  Richard Guenther  

* ipa-prop.c (ipa_modify_call_arguments): Re-compute
inlinable flag.

Index: gcc/ipa-prop.c
===
--- gcc/ipa-prop.c  (revision 181026)
+++ gcc/ipa-prop.c  (working copy)
@@ -2568,8 +2568,11 @@
   gimple_set_block (new_stmt, gimple_block (stmt));
   if (gimple_has_location (stmt))
     gimple_set_location (new_stmt, gimple_location (stmt));
+  gimple_call_set_chain (new_stmt, gimple_call_chain (stmt));
   gimple_call_copy_flags (new_stmt, stmt);
-  gimple_call_set_chain (new_stmt, gimple_call_chain (stmt));
+  if (gimple_call_cannot_inline_p (stmt))
+    gimple_call_set_cannot_inline
+      (new_stmt, !gimple_check_call_matching_types (new_stmt, callee_decl));
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     {


Re: repost: [DF] Use HARD_REG_SETs instead of bitmaps

2011-11-06 Thread Jakub Jelinek
On Sun, Nov 06, 2011 at 01:12:21PM +0100, Paolo Bonzini wrote:
> On 11/06/2011 03:19 AM, Dimitrios Apostolou wrote:
> >
> >I understand major hassle is when the register file is big, too much
> >data is being copied on a function call, when it has a HARD_REG_SET as a
> >pass by value parameter. So I did some testing on SPARC, which has the
> >biggest register file I know of, and there is a small performance
> >regression indeed. On the other hand I really like the code reduction in
> >hard-reg-set.h. So how should I proceed? FWIW I'm already testing
> >passing the parameter by reference in the hottest functions.
> 
> What about adding a macro indirection to more functions (like you
> did with SET_HARD_REG_BIT and friends), so that pass-by-value can be
> changed to pass-by-reference without affecting all the uses
> throughout the compiler?

Or keep HARD_REG_SET type as is and just use a new struct type which
contains HARD_REG_SET or HARD_REG_SET * in it.
struct hard_reg_set_ptr;
void (*live_on_entry) (struct hard_reg_set_ptr *);
in the target* headers and
struct hard_reg_set_ptr { HARD_REG_SET *set; };
as the actual definition.
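
A rough sketch of how that wrapper could look, assuming a stand-in
set type: demo_hard_reg_set and the demo_* functions are hypothetical
and only illustrate the idea, they are not GCC's definitions. The hook
only ever sees a pointer to the wrapper, so no copy of the set is made
at the call:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for HARD_REG_SET; the real type is
   target-dependent and lives in hard-reg-set.h.  */
typedef struct { unsigned long elts[4]; } demo_hard_reg_set;

/* What the target headers would need to see: a forward declaration...  */
struct hard_reg_set_ptr;
/* ...and a hook type that takes the wrapper by pointer.  */
typedef void (*live_on_entry_fn) (struct hard_reg_set_ptr *);

/* The actual definition, needed only by code that touches the set.  */
struct hard_reg_set_ptr { demo_hard_reg_set *set; };

/* Number of bits in one element word.  */
#define DEMO_BITS_PER_WORD (sizeof (unsigned long) * CHAR_BIT)

/* Return nonzero if register REGNO is a member of SET.  */
int
demo_test_hard_reg_bit (const demo_hard_reg_set *set, unsigned int regno)
{
  return (set->elts[regno / DEMO_BITS_PER_WORD]
          >> (regno % DEMO_BITS_PER_WORD)) & 1;
}

/* Example hook: mark register 3 as live on entry, modifying the
   caller's set in place through the wrapper.  */
void
demo_live_on_entry (struct hard_reg_set_ptr *p)
{
  assert (p != NULL && p->set != NULL);
  p->set->elts[3 / DEMO_BITS_PER_WORD] |= 1UL << (3 % DEMO_BITS_PER_WORD);
}
```

A caller would build the wrapper around an existing set
(struct hard_reg_set_ptr w = { &regs };) and pass &w to the hook.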

Jakub


Re: repost: [DF] Use HARD_REG_SETs instead of bitmaps

2011-11-06 Thread Paolo Bonzini

On 11/06/2011 03:19 AM, Dimitrios Apostolou wrote:


> I understand major hassle is when the register file is big, too much
> data is being copied on a function call, when it has a HARD_REG_SET as a
> pass by value parameter. So I did some testing on SPARC, which has the
> biggest register file I know of, and there is a small performance
> regression indeed. On the other hand I really like the code reduction in
> hard-reg-set.h. So how should I proceed? FWIW I'm already testing
> passing the parameter by reference in the hottest functions.


What about adding a macro indirection to more functions (like you did 
with SET_HARD_REG_BIT and friends), so that pass-by-value can be changed 
to pass-by-reference without affecting all the uses throughout the compiler?


Paolo


Re: [Patch, Fortran] Cleanup of gfc_extend_expr

2011-11-06 Thread Janus Weil
2011/11/3 Steve Kargl :
> On Thu, Nov 03, 2011 at 10:56:47PM +0100, Janus Weil wrote:
>> > At least add a comment about the re-use (abuse?) of the
>> > enum.
>>
>> Updated patch attached, which adds a short comment on the usage of 'match'.
>
> Thanks.
>
>> > This should reduce confusion months from when
>> > someone wonders why gfc_extend_expr returns a "match"
>> > for a non-matching function.
>>
>> Well, I think my approach is not as far-fetched as you seem to imply:
>> There are already a good number of procedures which use the 'match'
>> enum, although they're not related to matching at all. Listing only
>> those that occur in gfortran.h (I'm sure there are more):
>>
>>  * match gfc_mod_pointee_as (gfc_array_spec *);
>>  * match gfc_intrinsic_func_interface (gfc_expr *, int);
>>  * match gfc_intrinsic_sub_interface (gfc_code *, int);
>>  * match gfc_iso_c_sub_interface(gfc_code *, gfc_symbol *);
>>
>> The reason for this is of course that the YES/NO/ERROR triple is not
>> only useful in matching, but also in many other situations.
>>
>
> I think the patch is fine and can be committed.  But, give
> Steven a chance to respond before committing.

Thanks, Steve. I think three days should be long enough. Will commit
later today (if no one protests in the meantime).

Cheers,
Janus


Re: [Patch, Fortran] Cleanup of gfc_extend_expr

2011-11-06 Thread Janus Weil
> Sounds like what everything needs is a differently named enum: say 
> three_way_logic.

Well, one might just rename the present enum 'match' (which anyway
lacks the usual 'gfc' prefix), to something like 'gfc_three_way_logic'
(or whatever name you prefer), with appropriately named values. Then
one could apply it in a more general setting without causing
confusion. But that would be a *huge* patch (though completely
mechanical), and I'm not sure if I would be willing to waste my time
writing a ChangeLog for that (unless there is an automated way for
this by now?!?).
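
To make the idea concrete, here is a minimal sketch of such a renamed
enum. The names gfc_three_way_logic, GFC_TRI_* and the demo_* functions
are hypothetical and do not exist in gfortran; they only show the
YES/NO/ERROR triple used outside of matching:

```c
#include <assert.h>

/* Hypothetical replacement for the 'match' enum; names are
   illustrative only.  */
typedef enum
{
  GFC_TRI_NO = 1,
  GFC_TRI_YES,
  GFC_TRI_ERROR
} gfc_three_way_logic;

/* Hypothetical non-matching routine: does RANK fit within MAX_RANK?
   Negative inputs are treated as an error rather than a plain "no".  */
gfc_three_way_logic
demo_check_rank (int rank, int max_rank)
{
  if (rank < 0 || max_rank < 0)
    return GFC_TRI_ERROR;
  return rank <= max_rank ? GFC_TRI_YES : GFC_TRI_NO;
}

/* Small self-check covering all three outcomes.  */
void
demo_self_check (void)
{
  assert (demo_check_rank (3, 7) == GFC_TRI_YES);
  assert (demo_check_rank (8, 7) == GFC_TRI_NO);
  assert (demo_check_rank (-1, 7) == GFC_TRI_ERROR);
}
```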

Cheers,
Janus



> On Nov 3, 2011, at 3:56 PM, Janus Weil wrote:
>
>>> At least add a comment about the re-use (abuse?) of the
>>> enum.
>>
>> Updated patch attached, which adds a short comment on the usage of 'match'.
>>
>>
>>> This should reduce confusion months from when
>>> someone wonders why gfc_extend_expr returns a "match"
>>> for a non-matching function.
>>
>> Well, I think my approach is not as far-fetched as you seem to imply:
>> There are already a good number of procedures which use the 'match'
>> enum, although they're not related to matching at all. Listing only
>> those that occur in gfortran.h (I'm sure there are more):
>>
>> * match gfc_mod_pointee_as (gfc_array_spec *);
>> * match gfc_intrinsic_func_interface (gfc_expr *, int);
>> * match gfc_intrinsic_sub_interface (gfc_code *, int);
>> * match gfc_iso_c_sub_interface(gfc_code *, gfc_symbol *);
>>
>> The reason for this is of course that the YES/NO/ERROR triple is not
>> only useful in matching, but also in many other situations.
>>
>> Cheers,
>> Janus
>> 
>
>


Re: [PATCH] PR target/50038 fix: redundant zero extensions removal

2011-11-06 Thread Richard Guenther
On Sun, Nov 6, 2011 at 11:50 AM, Ilya Enkovich  wrote:
> Hello,
>
> 2011/11/5 Eric Botcazou :
>>> Here is a patch which fixes redundant zero extensions problem. Issue
>>> is resolved by expanding implicit_zee pass functionality to cover zero
>>> and sign extends of different modes. Could please someone review it?
>>
>> Could you explain the underlying idea?  The current strategy of implicit-zee.c
>> is exposed at length at the beginning of the file, but here's a summary:
>>
>>  1. On some architectures (typically x86-64), implicit zero-extensions are
>> applied when instructions operate in selected sub-word modes (SImode here):
>>
>>  addl edi,eax
>>
>> has an implicit zero-extension for %rax.
>>
>>  2. Because of 1, the second instruction in sequences like:
>>
>>  (set (reg:SI x) (plus:SI (reg:SI z1) (reg:SI z2)))
>>  (set (reg:DI x) (zero_extend:DI (reg:SI x)))
>>
>> is redundant.
>>
>>  3. The pass recognizes this and transforms the above sequence into:
>>
>>  (set (reg:DI x) (zero_extend:DI (plus:SI (reg:SI z1) (reg:SI z2))))
>>
>> and the machine description knows how to translate this into an 'addl'.
>>
>>
>> You're proposing extending this to other modes and other architectures, for
>> example QImode on x86.  But does
>>
>>  addb %dl, %al
>>
>> modify the entire %eax register on x86?  In other words, are you really after
>> implicit (zero-)extensions or after something else, like global elimination of
>> redundant extensions?
> The initial aim of the pass was to remove zero extensions made redundant
> by the implicit zero extension on x64. But the implementation actually
> uses a generic approach and seems like a mini-combiner. The pass may
> combine two zero extends, or combine a zero extend with a constant as a
> special case, but in other cases we just try to merge two instructions
> and then check that we have a corresponding template. It can easily be
> adapted to remove all redundant extensions. So the byte add in the
> example will be merged with the zero extend only if we have an explicit
> template for it in the machine model.
>
>>
>> What's the effect of the patch on the testcase in the PR in terms of insns at
>> the RTL level?  Why doesn't the combiner already optimize it?
> The patch helps to remove two zero extends from the RTL in the test from
> the PR. I believe the zee pass was introduced after the postreload pass
> because we should have additional memory instructions by that time and
> therefore more opportunities for optimization after the combiner's work.
>
> In this particular test case the combiner could also help, because we
> have a byte memory load and an extend at the combiner pass. But for some
> reason it does not merge them. In the combiner dump I see
>
> (insn 39 38 40 4 (set (reg/v:QI 81 [ xr ])
>        (mem:QI (reg/v/f:DI 111 [ ImageInPtr ]) [0 MEM[base:
> ImageInPtr_29, offset: 0B]+0 S1 A8])) 1.c:9 66 {*movqi_internal}
>     (nil))
>
> (insn 43 42 44 4 (parallel [
>            (set (reg:SI 116 [ xr ])
>                (zero_extend:SI (reg/v:QI 81 [ xr ])))
>            (clobber (reg:CC 17 flags))
>        ]) 1.c:11 121 {*zero_extendqisi2_movzbl_and}
>     (expr_list:REG_DEAD (reg/v:QI 81 [ xr ])
>        (expr_list:REG_UNUSED (reg:CC 17 flags)
>            (nil))))
>
> and
>
> Trying 39 -> 43:
>
> With no additional information.

Well, I bet it's because of the CC clobber which is there
because of the use of TARGET_ZERO_EXTEND_WITH_AND.
Where does that insn get generated?  By combine itself?

Richard.

>> Enhancing implicit-zee.c to address missed optimizations like the one reported
>> in target/50038 might well be the best approach, but the strategy shift must be
>> clearly exposed and discussed.  The reported numbers are certainly impressive.
>>
>> --
>> Eric Botcazou
>>
>
> Ilya
>


Many testsuite failures on x86_64 due recent "fix" about f16cintrin.h header

2011-11-06 Thread Kai Tietz
Hi,

The ChangeLog mentions a change to config/config.gcc, which doesn't
exist.  I assume the change was meant for config.gcc in the gcc/
directory, but no change was applied to that file either.  As a result,
the f16cintrin.h header isn't installed at all.

The culprit patch is:

2011-11-05  Quentin Neill  

Piledriver f16cintrin.h fix.
* config/i386/f16cintrin.h: Contents moved from immintrin.h.
* config/config.gcc: Add f16cintrin.h.

Regards,
Kai


Re: [PATCH] PR target/50038 fix: redundant zero extensions removal

2011-11-06 Thread Ilya Enkovich
Hello,

2011/11/5 Eric Botcazou :
>> Here is a patch which fixes redundant zero extensions problem. Issue
>> is resolved by expanding implicit_zee pass functionality to cover zero
>> and sign extends of different modes. Could please someone review it?
>
> Could you explain the underlying idea?  The current strategy of implicit-zee.c
> is exposed at length at the beginning of the file, but here's a summary:
>
>  1. On some architectures (typically x86-64), implicit zero-extensions are
> applied when instructions operate in selected sub-word modes (SImode here):
>
>  addl edi,eax
>
> has an implicit zero-extension for %rax.
>
>  2. Because of 1, the second instruction in sequences like:
>
>  (set (reg:SI x) (plus:SI (reg:SI z1) (reg:SI z2)))
>  (set (reg:DI x) (zero_extend:DI (reg:SI x)))
>
> is redundant.
>
>  3. The pass recognizes this and transforms the above sequence into:
>
>  (set (reg:DI x) (zero_extend:DI (plus:SI (reg:SI z1) (reg:SI z2))))
>
> and the machine description knows how to translate this into an 'addl'.
>
>
> You're proposing extending this to other modes and other architectures, for
> example QImode on x86.  But does
>
>  addb %dl, %al
>
> modify the entire %eax register on x86?  In other words, are you really after
> implicit (zero-)extensions or after something else, like global elimination of
> redundant extensions?
The initial aim of the pass was to remove zero extensions made redundant
by the implicit zero extension on x64. But the implementation actually
uses a generic approach and seems like a mini-combiner. The pass may
combine two zero extends, or combine a zero extend with a constant as a
special case, but in other cases we just try to merge two instructions
and then check that we have a corresponding template. It can easily be
adapted to remove all redundant extensions. So the byte add in the
example will be merged with the zero extend only if we have an explicit
template for it in the machine model.
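
For reference, the two cases discussed above correspond roughly to
source code like the following. This is only an illustrative sketch
(the function names are made up, and it is not the testcase from the
PR). On x86-64 a 32-bit addl already clears the upper 32 bits of the
destination register, so the widening in the first function costs
nothing; an 8-bit addb leaves the upper bits untouched, so the second
function still needs a real extension unless a combined pattern exists:

```c
#include <assert.h>
#include <stdint.h>

/* SImode add followed by a zero extension to DImode.  */
uint64_t
add_then_zero_extend (uint32_t a, uint32_t b)
{
  uint32_t sum = a + b;   /* 32-bit add, wraps modulo 2^32 */
  return (uint64_t) sum;  /* zero extension to 64 bits */
}

/* QImode add followed by a zero extension to SImode.  */
uint32_t
byte_add_then_zero_extend (uint8_t a, uint8_t b)
{
  uint8_t sum = (uint8_t) (a + b);  /* 8-bit result, wraps modulo 256 */
  return (uint32_t) sum;            /* zero extension to 32 bits */
}

/* Small self-check of the wrap-around behaviour.  */
void
demo_self_check (void)
{
  assert (add_then_zero_extend (0xFFFFFFFFu, 1u) == 0);
  assert (byte_add_then_zero_extend (200, 100) == 44);
}
```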

>
> What's the effect of the patch on the testcase in the PR in terms of insns at
> the RTL level?  Why doesn't the combiner already optimize it?
The patch helps to remove two zero extends from the RTL in the test from
the PR. I believe the zee pass was introduced after the postreload pass
because we should have additional memory instructions by that time and
therefore more opportunities for optimization after the combiner's work.

In this particular test case the combiner could also help, because we
have a byte memory load and an extend at the combiner pass. But for some
reason it does not merge them. In the combiner dump I see

(insn 39 38 40 4 (set (reg/v:QI 81 [ xr ])
        (mem:QI (reg/v/f:DI 111 [ ImageInPtr ]) [0 MEM[base: ImageInPtr_29, offset: 0B]+0 S1 A8])) 1.c:9 66 {*movqi_internal}
     (nil))

(insn 43 42 44 4 (parallel [
            (set (reg:SI 116 [ xr ])
                (zero_extend:SI (reg/v:QI 81 [ xr ])))
            (clobber (reg:CC 17 flags))
        ]) 1.c:11 121 {*zero_extendqisi2_movzbl_and}
     (expr_list:REG_DEAD (reg/v:QI 81 [ xr ])
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))

and

Trying 39 -> 43:

With no additional information.

> Enhancing implicit-zee.c to address missed optimizations like the one reported
> in target/50038 might well be the best approach, but the strategy shift must be
> clearly exposed and discussed.  The reported numbers are certainly impressive.
>
> --
> Eric Botcazou
>

Ilya


Re: [patch] 3/n: trans-mem: runtime

2011-11-06 Thread Gerald Pfeifer

On Sat, 5 Nov 2011, Aldy Hernandez wrote:

I believe this should become "2009, 2011" or "2009, 2010, 2011"
when it's applied to trunk.

I assume the same thing goes for the rest of similar files.


Yes, that's my understanding, I just didn't want to bother you with
a response to every single patch. :-)

And 2010 if there have been changes in that year as well.

Gerald


Re: [PATCH Atom] Fix for PR target/50962 (bad AGU stall avoidance)

2011-11-06 Thread Ilya Enkovich
Hi,

2011/11/5 Richard Henderson :
>> +  if (!TARGET_OPT_AGU || optimize_function_for_size_p (cfun))
>
> Surely optimize_insn_for_size_p (), so that cold blocks are optimized for 
> size.
OK.

>> +      else if (ix86_use_lea_for_mov(insn, operands))
>> +     return "lea{q}\t{%a1, %0|%0, %a1}";
>
> We're now getting the insn type and thus length wrong.
>
> Seems like a better change is
>
>
>        ...
>            (eq_attr "alternative" "16,17")
>              (const_string "ssecvt")
>            (match_operand 1 "pic_32bit_operand" "")
>              (const_string "lea")
> +           (match_test "ix86_use_lea_for_mov (insn)")
> +             (const_string "lea")
>           ]

It would be great to have a computed type here, but ix86_use_lea_for_mov
will check the types of other instructions and then call
extract_insn_cached. That would cause an infinite loop, right?

> (Or something; can't be bothered to double-check that match_test is
> the right thing to use here.)
>
> Which will automatically use the proper mnemonic, and also get the
> instruction length right.
>
>
> r~
>

Ilya

