C++ PATCHes for c++/50545, 51222

2012-08-30 Thread Jason Merrill
I've been surprised at the number of issues that have come up while I've 
been working on implementing the notion of instantiation-dependent 
expressions, which aren't currently described in the standard other than 
as "expression involving a template parameter".  I checked in fixes for 
several of these issues last week, but now here's another batch.


The first patch fixes an issue with partial ordering whereby we weren't 
keeping processing_template_decl set when instantiating a function using 
dependent template arguments.
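
For readers less familiar with partial ordering, here is a minimal
source-level sketch of what it decides.  This is not the testcase from
the PR, and the function names are made up for illustration only.

```cpp
// Illustrative only: two overloads where partial ordering has to pick
// the more specialized template.  The names are hypothetical.
template <typename T> int rank (T)  { return 1; }  // generic
template <typename T> int rank (T*) { return 2; }  // more specialized

int
rank_of_int_pointer ()
{
  int *p = 0;
  // Both templates are viable with equally good conversions
  // (T = int* versus T = int), so partial ordering selects rank (T*).
  return rank (p);
}
```

The patch itself only concerns how the compiler keeps track of the
still-dependent arguments while it performs this comparison.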


The second patch gives the error about using a parenthesized expression 
list to initialize a non-class variable even if the expressions are 
dependent, to avoid a diagnostic regression as more things become dependent.


The third patch uses coerce_template_parms to make sure that after we've 
substituted the deduced args into the partial specialization argument 
list, we do have arguments of the appropriate type and that constants 
have been folded the way we want.  Without this we could have unresolved 
overloads and variables instead of constants.


The fourth patch implements making a template template parameter a 
friend, which is tested in cpp0x/friend2.C and seems to have worked 
before entirely by accident.


The fifth patch moves the decision to build a SCOPE_REF to express a 
non-type-dependent qualified-id to a different place so that it is 
preserved in partial instantiations.


And finally, the implementation of instantiation_dependent_expression_p.
We really only need to check it in two places: deciding whether a
decltype represents a dependent type, and in checking for 
value-dependence.  I've proposed to the committee that making 
value-dependent a superset of instantiation-dependent, but not doing the 
same for type-dependent, is the best way to handle 
instantiation-dependency, and that's what I've implemented here.  I also 
implemented something that has been a bit controversial on the 
committee: treating member references as instantiation-dependent even 
when they don't actually involve any template parameters, because access 
checking at instantiation time might vary between specializations; see 
my test decltype41.C for an example.
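
To make the distinction concrete, here is a minimal sketch of an
expression that is instantiation-dependent without being type-dependent.
It is not decltype41.C and does not cover the member-access case
discussed above; the struct and function names are made up.

```cpp
#include <cstddef>

struct With    { typedef int type; };
struct Without { };

// The parameter type is always std::size_t, so the sizeof expression is
// not type-dependent.  Forming it still requires substituting T, so
// the declaration has to be treated as dependent on T.
template <typename T>
constexpr bool
has_nested_type (decltype (sizeof (typename T::type)))
{ return true; }

// Fallback that is chosen when substitution into the overload above fails.
template <typename T>
constexpr bool
has_nested_type (...)
{ return false; }

static_assert (has_nested_type<With> (0), "substitution succeeds");
static_assert (!has_nested_type<Without> (0), "substitution fails quietly");
```

If the decltype here were resolved to std::size_t when the template is
defined, the first overload could never be discarded for Without.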


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 894047b122a1676c6f0ba9e94eb79ca812174e6a
Author: Jason Merrill 
Date:   Thu Aug 30 11:54:34 2012 -0400

	* pt.c (instantiate_template_1): Keep processing_template_decl set
	if there are dependent args.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index f8ff1df..54d92df 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -14373,6 +14373,10 @@ instantiate_template_1 (tree tmpl, tree orig_args, tsubst_flags_t complain)
   /* Instantiation of the function happens in the context of the function
  template, not the context of the overload resolution we're doing.  */
   push_to_top_level ();
+  /* If there are dependent arguments, e.g. because we're doing partial
+ ordering, make sure processing_template_decl stays set.  */
+  if (uses_template_parms (targ_ptr))
+++processing_template_decl;
   if (DECL_CLASS_SCOPE_P (gen_tmpl))
 {
   tree ctx = tsubst (DECL_CONTEXT (gen_tmpl), targ_ptr,
commit 764a3a0466f51b1c2c5eeca698cc0cd5b45bcf22
Author: Jason Merrill 
Date:   Wed Aug 29 19:43:20 2012 -0400

	* decl.c (cp_finish_decl): Check for invalid multiple initializers
	even if the initializer is dependent.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 4b2958c..19485fc 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -6123,8 +6123,15 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p,
 	  release_tree_vector (cleanups);
 	}
   else if (!DECL_PRETTY_FUNCTION_P (decl))
-	/* Deduce array size even if the initializer is dependent.  */
-	maybe_deduce_size_from_array_init (decl, init);
+	{
+	  /* Deduce array size even if the initializer is dependent.  */
+	  maybe_deduce_size_from_array_init (decl, init);
+	  /* And complain about multiple initializers.  */
+	  if (init && TREE_CODE (init) == TREE_LIST && TREE_CHAIN (init)
+	  && !MAYBE_CLASS_TYPE_P (type))
+	init = build_x_compound_expr_from_list (init, ELK_INIT,
+		tf_warning_or_error);
+	}
 
   if (init)
 	DECL_INITIAL (decl) = init;
diff --git a/gcc/testsuite/g++.dg/template/static30.C b/gcc/testsuite/g++.dg/template/static30.C
index 01fa5dc..07dafe2 100644
--- a/gcc/testsuite/g++.dg/template/static30.C
+++ b/gcc/testsuite/g++.dg/template/static30.C
@@ -7,4 +7,4 @@ template  struct A
 };
 
 template  const int A::i1(A::i);
-template  const int A::i2(3, A::i);
+template  const int A::i2(3, A::i); // { dg-error "expression list" }
commit e75847b77eea8af35562e41dc5250b9bb4dbc956
Author: Jason Merrill 
Date:   Thu Aug 30 16:29:47 2012 -0400

	* pt.c (get_class_bindings): Call coerce_template_parms.  Add
	main_tmpl parameter.
	(more_specialized_class): Add main_tmpl parameter.
	(most_specialized_class): Adjust calls.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index

[PATCH, PR 54409] Remapping inlining predicates fix

2012-08-30 Thread Martin Jambor
Hi,

this patch fixes PR 54409.  The condition I recently added for dealing
with offset maps when remapping predicates was wrong; fortunately, a
subsequent assert caught this.  We cannot shift stuff by an offset when
it is passed by value.

Conversely, the condition was unnecessarily restrictive: we can still
happily use non-aggregate and by-value conditions when the offset map
entry is negative, since that only means that by-ref stuff is not
guaranteed to survive.
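
For reference, the corrected check can be sketched outside of GCC as
follows.  The struct, the function name and the use of std::vector are
stand-ins invented for this sketch (the real code uses VEC and the
condition structure from ipa-inline-analysis.c), and offset_map is
assumed to be as long as operand_map.

```cpp
#include <vector>

// Hypothetical stand-in for the fields of the condition that are read.
struct condition
{
  int operand_num;
  bool agg_contents;
  bool by_ref;
};

// Returns true when the condition cannot be remapped and the caller
// has to fall back to the always-true predicate.
bool
must_give_up (const condition &c, const std::vector<int> &operand_map,
              const std::vector<int> &offset_map)
{
  if (operand_map.empty ()
      || (int) operand_map.size () <= c.operand_num
      || operand_map[c.operand_num] == -1)
    return true;

  int offset = offset_map[c.operand_num];
  // Anything that is not a by-reference aggregate cannot be shifted by
  // a positive offset.
  if ((!c.agg_contents || !c.by_ref) && offset > 0)
    return true;
  // By-reference aggregate contents do not survive a negative offset.
  if (c.agg_contents && c.by_ref && offset < 0)
    return true;
  return false;
}
```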

Bootstrapped and tested on x86_64-linux.  OK for trunk?

Thanks,

Martin



2012-08-30  Martin Jambor  

PR middle-end/54409
* ipa-inline-analysis.c (remap_predicate): Fix the offset_map
checking condition.

* gcc/testsuite/gcc.dg/torture/pr54409.c: New test.


Index: src/gcc/ipa-inline-analysis.c
===
--- src.orig/gcc/ipa-inline-analysis.c
+++ src/gcc/ipa-inline-analysis.c
@@ -2811,8 +2811,11 @@ remap_predicate (struct inline_summary *
 if (!operand_map
 || (int)VEC_length (int, operand_map) <= c->operand_num
 || VEC_index (int, operand_map, c->operand_num) == -1
-|| (!c->agg_contents
-&& VEC_index (int, offset_map, c->operand_num) != 0)
+/* TODO: For non-aggregate conditions, adding an offset is
+   basically an arithmetic jump function processing which
+   we should support in future.  */
+|| ((!c->agg_contents || !c->by_ref)
+&& VEC_index (int, offset_map, c->operand_num) > 0)
 || (c->agg_contents && c->by_ref
 && VEC_index (int, offset_map, c->operand_num) < 0))
   cond_predicate = true_predicate ();
Index: src/gcc/testsuite/gcc.dg/torture/pr54409.c
===
--- /dev/null
+++ src/gcc/testsuite/gcc.dg/torture/pr54409.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+
+int b;
+
+struct S
+{
+  char *p;
+  struct {
+  } s;
+  int a;
+};
+
+static _Bool
+fn2 (int *p1)
+{
+  if (b)
+{
+  struct S *c = (struct S *) &p1;
+  return c->a;
+}
+}
+
+_Bool
+fn3 (struct S *p1)
+{
+  if (fn2 ((int *) &p1->s))
+return 0;
+}


Re: [PATCH, i386]: Implement atomic_fetch_sub

2012-08-30 Thread Richard Henderson
On 08/23/2012 08:59 AM, Andrew MacLeod wrote:
> 2012-08-23  Andrew MacLeod  
> 
> gcc
>   PR target/54087
>   * optabs.c (expand_atomic_fetch_op_no_fallback): New.  Factored code
>   from expand_atomic_fetch_op.
>   (expand_atomic_fetch_op): Try atomic_{add|sub} operations in terms of
>   the other one if direct opcode fails.
> 
> testsuite
>   * gcc.dg/pr54087.c:  New testcase for atomic_sub -> atomic_add when
>   atomic_sub fails.

Ok.


r~


Re: [PATCH] MIPS16 TLS support for GCC

2012-08-30 Thread Richard Sandiford
Chung-Lin Tang  writes:
> On 2012/8/30 02:44 AM, Richard Sandiford wrote:
>> Chung-Lin Tang  writes:
>>> On 2012/7/6 02:23 PM, Richard Sandiford wrote:
>>>> Richard Sandiford  writes:
>>>>>> (3) Also related to libraries, I edited CRT_CALL_STATIC_FUNCTION to emit
>>>>>> a 32-bit code sequence under both MIPS/MIPS16 mode (under O32).
>>>>>>
>>>>>> As you can see in the original Feb. patch, I had changes to emit a
>>>>>> MIPS16 version of these static calls, but with the changes in (2) above,
>>>>>> they will not work with the usual situation of a 32-bit MIPS built /lib
>>>>>> (.init/.fini will have 32/16-bit code improperly concatenated).
>>>>>>
>>>>>> The CodeSourcery builds use an independent mips16 sysroot for this, so a
>>>>>> MIPS16 CRT_CALL_STATIC_FUNCTION works there. For the usual case, I think
>>>>>> making it 32-bit is the compatible choice.
>>>>>
>>>>> Yeah, I agree that sounds like the right call.  Please do the same
>>>>> for the n32/n64 version (i.e. explicitly make it nomips16 rather
>>>>> than add the #error).
>>>>
>>>> BTW, doing this has removed my main concern about having dead code.
>>>> The original patch had a separate MIPS16 implementation that (as things
>>>> stood) could never be used by stock sources.  That would make it difficult
>>>> to maintain.
>>>>
>>>> Now that the MIPS16 library support is purely adding nomips16 attributes
>>>> to code that is obviously nomips16, those parts are OK on their own, thanks.
>>>> (I.e. the mips.h change, the libgcc change, and the libgomp change.)
>>>> Feel free to drop the multilib thing if you don't want to implement
>>>> --with-multilib-list.
>>>
>>> Hi Richard, just FYI, I just committed the said approved parts.
>>> gcc/config/mips/t-linux64 had one additional change, adding
>>> ../lib/mips16 to the corresponding MULTILIB_OSDIRNAMES, or else we end
>>> with a weird option-named directory for the mips16 libraries.
>> 
>> Sorry, but the t-linux64 stuff wasn't approved.  It was just the mips.h
>> change, the libgcc change and the libgomp change.
>> 
>> Please revert the patch to t-linux64.  My original objection to adding
>> mips16 unconditionally still stands: it isn't correct for people who
>> configure for processors that don't have the MIPS16 ASE (such as Octeon).
>
> I have reverted that part.

Thanks.

> Maybe a list of proper march=XXX/mips16 added to MULTILIB_EXCLUSIONS
> will do what you're mentioning, though I haven't tried testing that for now.

TBH, I'm not sure off-hand whether MULTILIB_EXCLUSIONS takes account
of --with-arch-style defaults.  (As in: it might well do.)

Even if it does, though, I still think --with-multilib-list would be
the right way of adding a mips16 multilib.  It's just that having an
out-of-the-box way of getting a mips16 multilib seems less important
now than it did originally (because the original patch added code that
wouldn't be used without such a multilib, whereas the current patch just
adds obviously-correct nomips16 attributes).

"Not important" doesn't mean "not useful", of course.  Having
--with-multilib-list would still be very nice if anyone
feels suitably inclined.

Richard


Ping^4 Re: Add --no-sysroot-suffix driver option

2012-08-30 Thread Joseph S. Myers
Ping^4.  This patch is still pending review.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [v3] libstdc++/54005

2012-08-30 Thread Benjamin De Kosnik

> Pretty minor change, as per PR. This version seems more appropriate
> for templatized types.

.. this wasn't right, commentary as per bugzilla. 
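
For context, the diff below switches is_lock_free() from the
__atomic_always_lock_free builtin, which has to give a compile-time
answer valid for every object of the given size, to
__atomic_is_lock_free, which may be answered at run time.  The sketch
below shows the same distinction using only the standard interface; the
two function names are made up for illustration.

```cpp
#include <atomic>

// Run-time query on one particular object.
bool
int_is_lock_free ()
{
  std::atomic<int> v (0);
  return v.is_lock_free ();
}

// Compile-time property from the standard macro: the value 2 means
// that atomic<int> is always lock-free on this target.
bool
int_always_lock_free ()
{
  return ATOMIC_INT_LOCK_FREE == 2;
}
```

If the compile-time answer is "always", the run-time query has to agree.
The reverse need not hold, which is why the member function is the
weaker, run-time form.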

tested x86/linux

-benjamin
2012-08-07  Benjamin Kosnik  

	PR libstdc++/54005 continued
	* include/std/atomic: Use __atomic_lock_free with 
	* include/bits/atomic_base.h: Same.

diff --git a/libstdc++-v3/include/bits/atomic_base.h b/libstdc++-v3/include/bits/atomic_base.h
index 598e1f1..de098a3 100644
--- a/libstdc++-v3/include/bits/atomic_base.h
+++ b/libstdc++-v3/include/bits/atomic_base.h
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 namespace std _GLIBCXX_VISIBILITY(default)
@@ -422,11 +423,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   bool
   is_lock_free() const noexcept
-  { return __atomic_always_lock_free(sizeof(_M_i), &_M_i); }
+  { return __atomic_is_lock_free(sizeof(_M_i), NULL); }
 
   bool
   is_lock_free() const volatile noexcept
-  { return __atomic_always_lock_free(sizeof(_M_i), &_M_i); }
+  { return __atomic_is_lock_free(sizeof(_M_i), NULL); }
 
   void
   store(__int_type __i, memory_order __m = memory_order_seq_cst) noexcept
@@ -716,11 +717,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   bool
   is_lock_free() const noexcept
-  { return __atomic_always_lock_free(_M_type_size(1), &_M_p); }
+  { return __atomic_is_lock_free(_M_type_size(1), NULL); }
 
   bool
   is_lock_free() const volatile noexcept
-  { return __atomic_always_lock_free(_M_type_size(1), &_M_p); }
+  { return __atomic_is_lock_free(_M_type_size(1), NULL); }
 
   void
   store(__pointer_type __p,
diff --git a/libstdc++-v3/include/std/atomic b/libstdc++-v3/include/std/atomic
index b5ca606..535a90f 100644
--- a/libstdc++-v3/include/std/atomic
+++ b/libstdc++-v3/include/std/atomic
@@ -184,11 +184,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   bool
   is_lock_free() const noexcept
-  { return __atomic_always_lock_free(sizeof(_M_i), &_M_i); }
+  { return __atomic_is_lock_free(sizeof(_M_i), NULL); }
 
   bool
   is_lock_free() const volatile noexcept
-  { return __atomic_always_lock_free(sizeof(_M_i), &_M_i); }
+  { return __atomic_is_lock_free(sizeof(_M_i), NULL); }
 
   void
   store(_Tp __i, memory_order _m = memory_order_seq_cst) noexcept


Re: Memset/memcpy patch

2012-08-30 Thread H.J. Lu
On Mon, Dec 12, 2011 at 6:02 AM, Jan Hubicka  wrote:
>> Any update?
>
> I will look into it today, but anyway I think it is stage1 material, so we 
> have some time to progress on it.
>
> Honza

Hi Honza,

The old patch was reverted and the new patch was posted at

http://gcc.gnu.org/ml/gcc-patches/2011-12/msg00336.html

Have you got a chance to review it?

Thanks.


-- 
H.J.


Re: faster random number engine

2012-08-30 Thread Benjamin De Kosnik
On Wed, 29 Aug 2012 14:34:40 -0400
Ulrich Drepper  wrote:

> On Wed, Aug 29, 2012 at 11:43 AM, Paolo Carlini wrote:
> > The substance isn't of course. But normally we don't have __gnu_cxx
> > things in the same std header. Can't we have a new ext/random and
> > put it in there? If we can separate the new code to it, I think
> > people would not even object to the target dependency, etc. In ext/
> > we are quite free to do extension / experimental work.
> 
> OK, I moved the definition to ext.  Will check in the result.

Nice! Thanks.

Here's a small patchlet to set the abi version to .18. With this,
check-abi will pass.
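
In simplified form, the two checks that need to learn about the new
version look roughly like this.  This is a sketch with made-up helper
names and a shortened version list, not the code in testsuite_abi.cc.

```cpp
#include <string>

// Shortened list for illustration; the real table is much longer.
static const char *const known_versions[] = {
  "GLIBCXX_3.4.16", "GLIBCXX_3.4.17", "GLIBCXX_3.4.18"
};

// A symbol version is acceptable only if it appears in the table.
bool
is_known_version (const std::string &name)
{
  for (unsigned i = 0; i < sizeof known_versions / sizeof *known_versions; ++i)
    if (name == known_versions[i])
      return true;
  return false;
}

// Newly added symbols must carry the latest pre-release version.
bool
added_symbol_ok (const std::string &version_name)
{
  return (version_name == "GLIBCXX_3.4.18"
          || version_name == "CXXABI_1.3.6"
          || version_name == "CXXABI_TM_1");
}
```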

tested x86/linux

-benjamin
2012-08-30  Benjamin Kosnik  

	* testsuite/util/testsuite_abi.cc (check_version): Add GLIBCXX_3.4.18.

diff --git a/libstdc++-v3/testsuite/util/testsuite_abi.cc b/libstdc++-v3/testsuite/util/testsuite_abi.cc
index 4721ccd..a5066cc 100644
--- a/libstdc++-v3/testsuite/util/testsuite_abi.cc
+++ b/libstdc++-v3/testsuite/util/testsuite_abi.cc
@@ -195,6 +195,7 @@ check_version(symbol& test, bool added)
   known_versions.push_back("GLIBCXX_3.4.15");
   known_versions.push_back("GLIBCXX_3.4.16");
   known_versions.push_back("GLIBCXX_3.4.17");
+  known_versions.push_back("GLIBCXX_3.4.18");
   known_versions.push_back("GLIBCXX_LDBL_3.4");
   known_versions.push_back("GLIBCXX_LDBL_3.4.7");
   known_versions.push_back("GLIBCXX_LDBL_3.4.10");
@@ -222,7 +223,7 @@ check_version(symbol& test, bool added)
 	test.version_status = symbol::incompatible;
 
   // Check that added symbols are added in the latest pre-release version.
-  bool latestp = (test.version_name == "GLIBCXX_3.4.17"
+  bool latestp = (test.version_name == "GLIBCXX_3.4.18"
 		 || test.version_name == "CXXABI_1.3.6"
 		 || test.version_name == "CXXABI_TM_1");
   if (added && !latestp)


Re: out-of-line and arch-specific random_device

2012-08-30 Thread Ulrich Drepper
On Thu, Aug 30, 2012 at 11:52 AM, Hans-Peter Nilsson
 wrote:
>> From: Ulrich Drepper 
>> Date: Tue, 28 Aug 2012 05:57:08 +0200
>
> This patch (commit r190787) broke build for non-_GLIBCXX_USE_RANDOM_TR1
> targets.  (See libstdc++-v3/configure.ac and its crossconfig.m4 for a
> list.)

Should be fixed now.


Re: [PATCH] Set correct source location for deallocator calls

2012-08-30 Thread Richard Henderson
On 08/30/2012 08:20 AM, Andrew Haley wrote:
> Is the problem simply that the logic to
> scan the assembly code isn't present in the libgcj testsuite?

Yes, exactly.


r~


Re: [middle-end] Add machine_mode to address_cost target hook

2012-08-30 Thread Michael Eager

On 08/29/2012 05:46 PM, Oleg Endo wrote:

> Hello,
>
> While experimenting a little bit with an idea for an address mode
> selection RTL pass for SH, I realized that SH's sh_address_cost function
> is quite broken.  When trying to fix it, I ran against a wall, since the
> mode of the MEM is not passed to the target hook function, as it is e.g.
> in legitimate_address.  This circumstance makes it a bit difficult to
> return useful answers in the address_cost hook.  Like on SH,
> displacement address modes for anything < SImode are considered slightly
> more expensive due to increased pressure on R0.
>
> Since everything in the middle-end already seems to pass the mode to the
> 'address_cost' function in rtlanal.c, I'd like to propose to forward the
> mode arg to the target hook.  The change is quite obvious, as it only
> adds one new (mostly) unused argument to the various address_cost
> functions in the targets.
>
> I went through all the targets' code and fixed the hook function.  It
> seems some other targets than SH could also benefit from the mode wisdom
> in their address_cost estimation.
>
> There are a few peculiarities I ran across (respective target
> maintainers CC'ed):
>
> microblaze:
>    The microblaze_address_cost takes the mode of the address rtx.
>    Maybe it is meant to take the mode of the MEM?


The address cost calculation looks OK.  I'm not sure why the mode of
MEM is relevant to this computation.

No objections to the patch.

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077




Re: out-of-line and arch-specific random_device

2012-08-30 Thread Hans-Peter Nilsson
> From: Ulrich Drepper 
> Date: Tue, 28 Aug 2012 05:57:08 +0200

This patch (commit r190787) broke build for non-_GLIBCXX_USE_RANDOM_TR1
targets.  (See libstdc++-v3/configure.ac and its crossconfig.m4 for a
list.)

> Index: libstdc++-v3/include/bits/random.h
> ===
> --- libstdc++-v3/include/bits/random.h(revision 190713)
> +++ libstdc++-v3/include/bits/random.h(working copy)
> @@ -1575,40 +1575,20 @@
>  #ifdef _GLIBCXX_USE_RANDOM_TR1
>  
>  explicit
> -random_device(const std::string& __token = "/dev/urandom")
> +random_device(const std::string& __token = "default")
>  {
> -  if ((__token != "/dev/urandom" && __token != "/dev/random")
> -   || !(_M_file = std::fopen(__token.c_str(), "rb")))
> - std::__throw_runtime_error(__N("random_device::"
> -"random_device(const std::string&)"));
> +  _M_init(__token);
>  }
>  
>  ~random_device()
> -{ std::fclose(_M_file); }
> +{ _M_fini(); }
>  
>  #else
>  
>  explicit
>  random_device(const std::string& __token = "mt19937")
> -: _M_mt(_M_strtoul(__token)) { }
> +{ return _M_init_pretr1(__token); }
>  

make[4]: Entering directory 
`/tmp/hpautotest-gcc0/cris-elf/gccobj/cris-elf/libstdc++-v3/include'
mkdir -p ./cris-elf/bits/stdc++.h.gch
/tmp/hpautotest-gcc0/cris-elf/gccobj/./gcc/xgcc -shared-libgcc 
-B/tmp/hpautotest-gcc0/cris-elf/gccobj/./gcc -nostdinc++ 
-L/tmp/hpautotest-gcc0/cris-elf/gccobj/cris-elf/libstdc++-v3/src 
-L/tmp/hpautotest-gcc0/cris-elf/gccobj/cris-elf/libstdc++-v3/src/.libs 
-nostdinc -B/tmp/hpautotest-gcc0/cris-elf/gccobj/cris-elf/newlib/ -isystem 
/tmp/hpautotest-gcc0/cris-elf/gccobj/cris-elf/newlib/targ-include -isystem 
/tmp/hpautotest-gcc0/gcc/newlib/libc/include 
-B/tmp/hpautotest-gcc0/cris-elf/gccobj/cris-elf/libgloss/cris 
-L/tmp/hpautotest-gcc0/cris-elf/gccobj/cris-elf/libgloss/libnosys 
-L/tmp/hpautotest-gcc0/gcc/libgloss/cris 
-B/tmp/hpautotest-gcc0/cris-elf/pre/cris-elf/bin/ 
-B/tmp/hpautotest-gcc0/cris-elf/pre/cris-elf/lib/ -isystem 
/tmp/hpautotest-gcc0/cris-elf/pre/cris-elf/include -isystem 
/tmp/hpautotest-gcc0/cris-elf/pre/cris-elf/sys-include -x c++-header 
-nostdinc++ -g -O2 
-I/tmp/hpautotest-gcc0/cris-elf/gccobj/cris-elf/libstdc++-v3/include/cris-elf 
-I/tmp/hpautotest-gcc0/cris-elf/gccobj/cris-elf/libstdc++-v3/include 
-I/tmp/hpautotest-gcc0/gcc/libstdc++-v3/libsupc++ -O2 -g -std=gnu++0x 
/tmp/hpautotest-gcc0/gcc/libstdc++-v3/include/precompiled/stdc++.h \
-o cris-elf/bits/stdc++.h.gch/O2ggnu++0x.gch
In file included from 
/tmp/hpautotest-gcc0/cris-elf/gccobj/cris-elf/libstdc++-v3/include/random:50:0,
 from 
/tmp/hpautotest-gcc0/cris-elf/gccobj/cris-elf/libstdc++-v3/include/bits/stl_algo.h:67,
 from 
/tmp/hpautotest-gcc0/cris-elf/gccobj/cris-elf/libstdc++-v3/include/algorithm:63,
 from 
/tmp/hpautotest-gcc0/gcc/libstdc++-v3/include/precompiled/stdc++.h:65:
/tmp/hpautotest-gcc0/cris-elf/gccobj/cris-elf/libstdc++-v3/include/bits/random.h:
 In constructor 'std::random_device::random_device(const string&)':
/tmp/hpautotest-gcc0/cris-elf/gccobj/cris-elf/libstdc++-v3/include/bits/random.h:1590:36:
 error: returning a value from a constructor
 { return _M_init_pretr1(__token); }
^
make[4]: *** [cris-elf/bits/stdc++.h.gch/O2ggnu++0x.gch] Error 1
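
The error comes from the "return" in front of the helper call in the
constructor body.  A minimal sketch of the shape that avoids it is shown
below.  This is a hypothetical miniature with made-up names, not the
libstdc++ code and not necessarily the fix that will be committed.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical miniature of the non-TR1 branch; all names are made up.
class device
{
public:
  explicit
  device (const std::string &token = "mt19937")
  { init_pretr1 (token); }  // plain call, no "return" in a constructor

  unsigned long
  seed () const
  { return seed_; }

private:
  // Stand-in for the out-of-line helper: parse the token as a seed.
  void
  init_pretr1 (const std::string &token)
  { seed_ = std::strtoul (token.c_str (), 0, 0); }

  unsigned long seed_;
};
```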

brgds, H-P


Re: [PATCH] Add counter histogram to fdo summary (issue6465057)

2012-08-30 Thread Jan Hubicka
> On Wed, Aug 29, 2012 at 6:12 AM, Jan Hubicka  wrote:
> >> Index: libgcc/libgcov.c
> >> ===
> >> --- libgcc/libgcov.c  (revision 190736)
> >> +++ libgcc/libgcov.c  (working copy)
> >> @@ -276,6 +276,78 @@ gcov_version (struct gcov_info *ptr, gcov_unsigned
> >>return 1;
> >>  }
> >>
> >> +/* Insert counter VALUE into HISTOGRAM.  */
> >> +
> >> +static void
> >> +gcov_histogram_insert(gcov_bucket_type *histogram, gcov_type value)
> >> +{
> >> +  unsigned i;
> >> +
> >> +  i = gcov_histo_index(value);
> >> +  gcc_assert (i < GCOV_HISTOGRAM_SIZE);
> > Does checking_assert work in libgcov? I do not think internal consistency
> > check should go to --enable-checking=release libgcov. We want to maintain
> > it as lightweight as possible. (I see there are two existing gcc_asserts,
> > since they report file format corruption, I think they should give better
> > diagnostic).
> 
> gcc_checking_assert isn't available, since tsystem.h not system.h is
> included. I could probably just remove the assert (to be safe,
> silently return if i is out of bounds?).

I think just removing the assert is fine here: the only way it can trigger is
when GCOV_HISTOGRAM_SIZE is wrong, and it ought not to be.
> 
> From my understanding of the mode attribute meanings, which I thought
> are defined in terms of the number of smallest addressable units, the
> code in gcov-io.h that sets up the gcov_type typedef will always end
> up with a gcov_type that is 32 or 64 bits? I.e. when BITS_PER_UNIT is
> 8 it will use either SI or DI which will end up either 32 or 64, and
> when BITS_PER_UNIT is 16 it would use either HI or SI which would
> again be either 32 or 64. Is that wrong and we can end up with a 16
> bit gcov_type?

I see, the code got a bit simpler since we dropped support for some of the
more exotic targets.  The type should be either 32-bit or 64-bit.
> 
> The GCOV_TYPE_SIZE was being defined everywhere except when IN_GCOV (so
> it was being defined IN_LIBGCOV), but I wanted it defined
> unconditionally because I am using it to determine the required number
> of histogram entries.
> 
> In any case, I think it will be more straightforward to define the
> number of histogram entries based on the max possible gcov_type size
> which is 64 (so 256 entries). This will make implementing the bit mask
> simpler, since it will always require the same number of gcov_unsigned
> ints (8).
Sounds good to me.  64 bits should be enough for everyone.  Coverage is kind
of useless for smaller types, and for really restricted targets we will more
likely want to disable histogram generation rather than make it a bit smaller.
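
The arithmetic quoted above (256 entries, hence 8 words of bit mask) can
be spelled out as follows.  This is only a sketch: the type and function
names are made up, and it assumes gcov_unsigned is 32 bits wide, as the
figure of 8 words implies.

```cpp
#include <cstdint>

// Sizes taken from the discussion: 256 buckets, 32-bit mask words.
const unsigned histogram_size = 256;
const unsigned bits_per_word = 32;
const unsigned bitvector_words
  = (histogram_size + bits_per_word - 1) / bits_per_word;  // 8

// One bit per histogram bucket, set when the bucket is non-empty.
struct bucket_mask
{
  std::uint32_t words[bitvector_words];
};

void
mark_bucket (bucket_mask &mask, unsigned index)
{
  mask.words[index / bits_per_word]
    |= std::uint32_t (1) << (index % bits_per_word);
}

bool
bucket_marked (const bucket_mask &mask, unsigned index)
{
  return (mask.words[index / bits_per_word] >> (index % bits_per_word)) & 1u;
}
```

With a fixed size the mask always occupies the same number of words, so
the summary layout does not depend on the width of gcov_type.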
> 
> >
> > Patch is OK if it passed profiledbootstrap modulo the comments above.
> 
> Ok, thanks. Working on the fixes above.

OK, thanks!
Honza
> 
> Teresa
> 
> > Thanks!
> > Honza
> 
> 
> 
> -- 
> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


Re: [PATCH] Set correct source location for deallocator calls

2012-08-30 Thread Andrew Haley
On 08/30/2012 03:28 PM, Richard Henderson wrote:
> On 08/17/2012 03:02 PM, Dehao Chen wrote:
>> I spend a whole day working on this, but find it very difficult to add
>> such a java test because:
>>
>> * First, libjava testsuites are all runtime tests, i.e., it compiles
>> the byte code to native code, executes it, and compares the output to
>> expected output. There is no way to scan the assembly.
>> * Though there is a way to derive the line number at runtime in java
>> (using Exception().getStackTrace()), this method only works on VM, and
>> the gcj generated native code does not get the lineno.
>>
>> Any suggestions on this?
> 
> Hmm, not from me, unfortunately.  Cc'ing the java list for clues.
> I won't hang up the main patch for this though.

Fair enough.  As Bryce said, line numbers should work if you have
addr2line installed.

Can't we scan the assembly?  Is the problem simply that the logic to
scan the assembly code isn't present in the libgcj testsuite?

Andrew.



[PATCH 3/3] Compute predicates for phi node results in ipa-inline-analysis.c

2012-08-30 Thread Martin Jambor
Hi,

this is a new version of the patch which makes ipa analysis produce
predicates for PHI node results, at least at the bottom of the
simplest diamond and semi-diamond CFG subgraphs.  This time I also
analyze the conditions again rather than extracting information from
CFG edges, which means I can reason about substantially more PHI
nodes.

This patch makes us produce loop bounds hint for the pr48636.f90
testcase.

Bootstrapped and tested on x86_64-linux.  OK for trunk?

Thanks,

Martin


2012-08-29  Martin Jambor  

* ipa-inline-analysis.c (phi_result_unknown_predicate): New function.
(predicate_for_phi_result): Likewise.
(estimate_function_body_sizes): Use the above two functions.


Index: src/gcc/ipa-inline-analysis.c
===
--- src.orig/gcc/ipa-inline-analysis.c
+++ src/gcc/ipa-inline-analysis.c
@@ -2070,6 +2070,99 @@ param_change_prob (gimple stmt, int i)
   return REG_BR_PROB_BASE;
 }
 
+/* Find whether a basic block BB is the final block of a (half) diamond CFG
+   sub-graph and if the predicate the condition depends on is known.  If so,
+   return true and store the predicate in *P.  */
+
+static bool
+phi_result_unknown_predicate (struct ipa_node_params *info,
+ struct inline_summary *summary, basic_block bb,
+ struct predicate *p,
+ VEC (predicate_t, heap) *nonconstant_names)
+{
+  edge e;
+  edge_iterator ei;
+  basic_block first_bb = NULL;
+  gimple stmt;
+
+  if (single_pred_p (bb))
+{
+  *p = false_predicate ();
+  return true;
+}
+
+  FOR_EACH_EDGE (e, ei, bb->preds)
+{
+  if (single_succ_p (e->src))
+   {
+ if (!single_pred_p (e->src))
+   return false;
+ if (!first_bb)
+   first_bb = single_pred (e->src);
+ else if (single_pred (e->src) != first_bb)
+   return false;
+   }
+  else
+   {
+ if (!first_bb)
+   first_bb = e->src;
+ else if (e->src != first_bb)
+   return false;
+   }
+}
+
+  if (!first_bb)
+return false;
+
+  stmt = last_stmt (first_bb);
+  if (!stmt
+  || gimple_code (stmt) != GIMPLE_COND
+  || !is_gimple_ip_invariant (gimple_cond_rhs (stmt)))
+return false;
+
+  *p = will_be_nonconstant_expr_predicate (info, summary,
+  gimple_cond_lhs (stmt),
+  nonconstant_names);
+  if (true_predicate_p (p))
+return false;
+  else
+return true;
+}
+
+/* Given a PHI statement in a function described by inline properties SUMMARY
+   and *P being the predicate describing whether the selected PHI argument is
+   known, store a predicate for the result of the PHI statement into
+   NONCONSTANT_NAMES, if possible.  */
+
+static void
+predicate_for_phi_result (struct inline_summary *summary, gimple phi,
+ struct predicate *p,
+ VEC (predicate_t, heap) *nonconstant_names)
+{
+  unsigned i;
+
+  for (i = 0; i < gimple_phi_num_args (phi); i++)
+{
+  tree arg = gimple_phi_arg (phi, i)->def;
+  if (!is_gimple_min_invariant (arg))
+   {
+ gcc_assert (TREE_CODE (arg) == SSA_NAME);
+ *p = or_predicates (summary->conds, p,
+ &VEC_index (predicate_t, nonconstant_names,
+ SSA_NAME_VERSION (arg)));
+ if (true_predicate_p (p))
+   return;
+   }
+}
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+{
+  fprintf (dump_file, "\t\tphi predicate: ");
+  dump_predicate (dump_file, summary->conds, p);
+}
+  VEC_replace (predicate_t, nonconstant_names,
+  SSA_NAME_VERSION (gimple_phi_result (phi)), *p);
+}
 
 /* Compute function body size parameters for NODE.
When EARLY is true, we compute only simple summaries without
@@ -2143,7 +2236,30 @@ estimate_function_body_sizes (struct cgr
  fprintf (dump_file, "\n BB %i predicate:", bb->index);
  dump_predicate (dump_file, info->conds, &bb_predicate);
}
-  
+
+  if (parms_info && nonconstant_names)
+   {
+ struct predicate phi_predicate;
+ bool first_phi = true;
+
+ for (bsi = gsi_start_phis (bb); !gsi_end_p (bsi); gsi_next (&bsi))
+   {
+ if (first_phi
+ && !phi_result_unknown_predicate (parms_info, info, bb,
+   &phi_predicate,
+   nonconstant_names))
+   break;
+ first_phi = false;
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   {
+ fprintf (dump_file, "  ");
+ print_gimple_stmt (dump_file, gsi_stmt (bsi), 0, 0);
+   }
+ predicate_for_phi_result (info, gsi_stmt (bsi), &

Re: [PATCH] OpenBSD/amd64 support and OpenBSD/i386 cleanup

2012-08-30 Thread Richard Henderson
On 08/26/2012 08:07 AM, Gerald Pfeifer wrote:
> If anyone wants to approve this, I volunteer to commit the patch.
> 
> Gerald
> 
> 
> libgcc/:
> 
> 2011-12-27  Mark Kettenis  
> 
>   * config.host (x86_64-*-openbsd*): New target.
> 
> gcc/:
> 
> 2011-12-27  Mark Kettenis  
> 
>   * config.gcc (x86_64-*-openbsd*): New target.
>   * config/openbsd.h (TARGET_C99_FUNCTIONS): Define.
>   * config/i386/openbsdelf.h: Remove some superfluous defines and
>   group things together in a more logical fashion.
>   (DBX_REGISTER_NUMBER): Provide a
>   definition that works on both 32-bit and 64-bit targets.
>   (WCHAR_TYPE_SIZE): Hardcode as 32.
>   (NO_DOLLAR_IN_LABEL): Remove undef.
>   (TARGET_DEFAULT): Remove.
>   (SET_ASM_OP): Remove.
>   (DEFAULT_PCC_STRUCT_RETURN): Undef first to prevent warning.
>   (ASM_OUTPUT_MAX_SKIP_ALIGN): Synch with x86-64.h
>   (DWARF2_UNWIND_INFO): Remove define.
>   (HAVE_ENABLE_EXECUTE_STACK): Define.
>   * config/host-openbsd.c: New file.
>   * config/t-openbsd (USER_H): Add EXTRA_HEADERS.
>   * config/x-openbsd: New file.

Looks ok.

Some of the defines in i386/openbsdelf.h look redundant with either
i386/gas.h or i386/x86_64.h or both.  But I won't quibble about that
since there are other files in i386/ that ought to get cleaned up
for the same reasons.

r~


Re: [PATCH] Set correct source location for deallocator calls

2012-08-30 Thread Bryce McKinlay
On Thu, Aug 30, 2012 at 3:28 PM, Richard Henderson  wrote:
> On 08/17/2012 03:02 PM, Dehao Chen wrote:
>> I spend a whole day working on this, but find it very difficult to add
>> such a java test because:
>>
>> * First, libjava testsuites are all runtime tests, i.e., it compiles
>> the byte code to native code, executes it, and compares the output to
>> expected output. There is no way to scan the assembly.
>> * Though there is a way to derive the line number at runtime in java
>> (using Exception().getStackTrace()), this method only works on VM, and
>> the gcj generated native code does not get the lineno.
>>
>> Any suggestions on this?
>
> Hmm, not from me, unfortunately.  Cc'ing the java list for clues.
> I won't hang up the main patch for this though.

libjava calls out to addr2line to get the line number and source file
name for stack traces. As long as it can find addr2line you should get
a line number - but some platforms don't have it.

Ref: libjava/stacktrace.cc and libjava/gnu/gcj/runtime/NameFinder.java


Re: [PATCH] Set correct source location for deallocator calls

2012-08-30 Thread Richard Henderson
On 08/17/2012 03:02 PM, Dehao Chen wrote:
> I spend a whole day working on this, but find it very difficult to add
> such a java test because:
> 
> * First, libjava testsuites are all runtime tests, i.e., it compiles
> the byte code to native code, executes it, and compares the output to
> expected output. There is no way to scan the assembly.
> * Though there is a way to derive the line number at runtime in java
> (using Exception().getStackTrace()), this method only works on VM, and
> the gcj generated native code does not get the lineno.
> 
> Any suggestions on this?

Hmm, not from me, unfortunately.  Cc'ing the java list for clues.
I won't hang up the main patch for this though.

>> BTW, for the future, please fix your mailer to not wrap lines.
> 
> Okay, I'll try. The problem is that we have to send mail in plain txt.
> And in "plain text mode" gmail wraps each line to 80 characters and
> wouldn't allow you change that...

In that case use a text/plain attachment (which, not having tried it myself,
may require you use a .txt suffix on the patch file).  Most mail readers will
show those inline.  It's certainly better than having actually corrupt data
sent to the list.

> +// { dg-options "-O2 -fno-exceptions -g -dA" }
...
> +// { dg-final { scan-assembler "1 28 0" } }

You're still scanning for the .loc line, not the "test.c:28"
comment added by -dA.

To understand the problem, go back to your build tree, edit
auto-host.h and undefine HAVE_AS_DWARF2_DEBUG_LINE.  Then
rerun the testsuite with RUNTESTFLAGS=dwarf2.exp.

> + /* Calls to destructors are generated automatically in FINALL/CATCH
> +block. They should have location as UNKNOWN_LOCATION. However,
> +gimplify_call_expr will reset these call stmts to input_location
> +if it finds stmt's location is unknown. To prevent resetting for
> +destructors, we set the input_location to unknown.
> +Note that this only affects the destructor calls in FINALL/CATCH
> +block, and will automatically reset to its original value by the
> +end of gimplify_expr.  */

s/FINALL/FINALLY/g


r~


Re: VxWorks Patches Back from the Dead!

2012-08-30 Thread Bruce Korb
Hi Robert,

On Thu, Aug 30, 2012 at 6:30 AM, rbmj  wrote:
>> Done, and patch is attached.
>>
>
> OK.  make install doesn't seem to like it as much as I do.  It complains
> because it tries to install macro_list and can't find it.  Proposed
> solutions:
>
> 2. Change line to read test -f ${MACRO_LIST} && rm -f ${MACRO_LIST} && touch
> ${MACRO_LIST}
> Advantages- Will not unnecessarily run machine_name.  Saves 4 bytes of disk
> usage over option 1 :D
> Disadvantages- Looks rather hackish; something about it feels wrong

Agreed.  You would have to add a comment

> 1. Change line to read test -f ${MACRO_LIST} && echo > ${MACRO_LIST}
> Advantages- easy, simple
> Disadvantages- might cause an unnecessary run of the fix.  A very, very
> small potential compile time hit.

"GCC compile time hit"  Anyway, this looks cleaner and I surely don't see
much difference between this and #2 above.

> 3. Make macro_list optional for installation
> Disadvantages- more complex, and I don't really feel like going back into
> the makefiles again.  That's a scary place.

Amen.  Keep it as simple as you can.  Thanks.


Re: [wwwdocs] IA-32/x86-64 Changes for upcoming 4.8.0 series (Broadwell)

2012-08-30 Thread Gerald Pfeifer
On Thu, 30 Aug 2012, H.J. Lu wrote:
> Please drop "for the new Intel processor codename Broadwell".  Just
> 
> Support for the new RDSEED, ADCX, ADOX and PREFETCHW intrinsics

Put ... around each of the four and refer to
"command-line options".

This is fine with those changes.

Thanks,
Gerald


Re: VxWorks Patches Back from the Dead!

2012-08-30 Thread rbmj

On 8/23/2012 7:54 AM, Paolo Bonzini wrote:

Il 23/08/2012 13:46, rbmj ha scritto:

On 8/23/2012 4:24 AM, Paolo Bonzini wrote:

Subject: [PATCH 10/10] Make open() call more compatible in gcc/gcov-io.c

In gcc/gcov-io.c, the call to open() only has two arguments. This
is fine, as long as the system open() is standards compliant.

So you have to add another fixincludes hack, adding a macro indirection
like the one you have for ioctl:

#define open(a, b, ...)  __open(a, b , ##__VA_ARGS__, 0660)
#define __open(a, b, c, ...) (open)(a, b, c)


Again, just not sure about variadic macro compatibility.  If that will
work for both c89 and c99 and c++, then that looks good to me.


Yes, GCC has variadic macros as an extension in C89 mode too.  You need
to experiment a bit with -pedantic and/or -ansi and/or -std=c89, though.



So the variadic macros work for compiling GCC itself.  However, I run 
into problems when compiling libstdc++-v3.  The problem is that 
basic_file.cc defines __basic_file::open(), and the macro rewrites 
that definition as well.  So AFAICT the original solution (just 
passing the third argument unconditionally) is necessary.  I don't see 
any pitfalls associated with this - do we really care *that* much about 
passing one extra int?


Though it looks weird, it's clearly not unprecedented (as you said, it's 
not the rule, but it has certainly been done in other places).  I don't 
see a way to use a macro that will not break the declaration.  Is there 
a way that a macro can work that I'm missing?
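
A minimal sketch of the collision described above, not code from the patch. It assumes a stand-in function sys_open in place of the real open(), renames the quoted __open helper to open_helper, and uses a hypothetical class file_like in place of __basic_file; all three names are illustrative only.

```cpp
#include <cassert>

// Hypothetical stand-in for a system open() that requires a mode argument.
int sys_open(const char *path, int flags, int mode)
{
  (void) path;
  (void) flags;
  return mode;  // echo the mode so callers can see which value was passed
}

// Hypothetical class with a member named open.  It is declared before the
// macro exists, so the preprocessor leaves this declaration alone.
struct file_like
{
  int open(const char *path, int flags) { return sys_open(path, flags, 0); }
};

// The macro pair quoted above, with the helper renamed and the final call
// redirected to sys_open.  ", ##__VA_ARGS__" is a GNU extension.
#define open(a, b, ...) open_helper(a, b , ##__VA_ARGS__, 0660)
#define open_helper(a, b, c, ...) sys_open(a, b, c)

// Call sites behave as intended: a missing mode defaults to 0660.
int two_arg_call()   { return open("f", 0); }
int three_arg_call() { return open("f", 0, 0644); }

// Every later use of the name followed by "(" is rewritten as well,
// declarations included.  Writing
//   struct later { int open(const char *path, int flags); };
// at this point would expand to a declaration of sys_open with a literal
// 0660 in its parameter list, which does not compile.

// A name that is not directly followed by "(" is not expanded, so the
// existing member can still be reached through a parenthesised name.
int member_call()
{
  file_like f;
  return (f.open)("f", 0);
}
```

The sketch only shows why a function-like macro named open cannot coexist with an unmodified declaration of a member called open; it does not suggest a macro that avoids the problem.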


--
rbmj


Re: VxWorks Patches Back from the Dead!

2012-08-30 Thread rbmj

On 8/25/2012 11:35 PM, rbmj wrote:

On 8/24/2012 4:59 PM, Bruce Korb wrote:

Hi Robert,

If you are going to defer, then:

On Fri, Aug 24, 2012 at 1:20 PM, rbmj  wrote:

diff --git a/fixincludes/fixinc.in b/fixincludes/fixinc.in
index e73aed9..de7be35 100755
--- a/fixincludes/fixinc.in
+++ b/fixincludes/fixinc.in
@@ -128,6 +128,18 @@ fi

  # # # # # # # # # # # # # # # # # # # # #
  #
+#  Check to see if the machine_name fix needs to be disabled.
+#
+
+case "${target_canonical}" in
+*-*-vxworks*)
+machine_name_override="OVERRIDE"

 replace this line with:

test -f ${MACRO_LIST} && rm -f ${MACRO_LIST}

The remaining part of the patch to this file is not necessary.



I like it!

Done, and patch is attached.



OK.  make install doesn't seem to like it as much as I do.  It complains 
because it tries to install macro_list and can't find it.  Proposed 
solutions:


1. Change line to read test -f ${MACRO_LIST} && echo > ${MACRO_LIST}
Advantages- easy, simple
Disadvantages- might cause an unnecessary run of the fix.  A very, very 
small potential compile time hit.


2. Change line to read test -f ${MACRO_LIST} && rm -f ${MACRO_LIST} && 
touch ${MACRO_LIST}
Advantages- Will not unnecessarily run machine_name.  Saves 4 bytes of 
disk usage over option 1 :D

Disadvantages- Looks rather hackish; something about it feels wrong

3. Make macro_list optional for installation
Disadvantages- more complex, and I don't really feel like going back 
into the makefiles again.  That's a scary place.


--
rbmj



Re: [wwwdocs] IA-32/x86-64 Changes for upcoming 4.8.0 series (Broadwell)

2012-08-30 Thread H.J. Lu
On Thu, Aug 30, 2012 at 2:13 AM, Kirill Yukhin  wrote:
> Hi,
> Attached patch for changes.html reflecting new CPU codename Broadwell 
> builtins.
>
> Could you please have a look?
>
> Thanks, K

Please drop "for the new Intel processor codename Broadwell".  Just

Support for the new RDSEED, ADCX, ADOX and PREFETCHW intrinsics

-- 
H.J.


[PATCH, libstdc++] Improve slightly __cxa_guard_acquire

2012-08-30 Thread Thiago Macieira
Hello

The attached patch is a simple improvement: a thread that failed to set the 
waiting bit now exits the function earlier if it detects that another thread 
has successfully finished initialising. It matches the CAS code from a few 
lines above.

The change from RELAXED to ACQUIRE is noted in the previous patch I've just 
sent.
-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center
 Intel Sweden AB - Registration Number: 556189-6027
 Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden
2012-08-30  Thiago Macieira 

	* libsupc++/guard.cc (__cxa_guard_acquire): Exit the loop earlier if
	we detect that another thread has had success.
---
 libstdc++-v3/libsupc++/guard.cc | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/libsupc++/guard.cc b/libstdc++-v3/libsupc++/guard.cc
index bfe1a59..73d7221 100644
--- a/libstdc++-v3/libsupc++/guard.cc
+++ b/libstdc++-v3/libsupc++/guard.cc
@@ -269,8 +269,16 @@ namespace __cxxabiv1
 		 int newv = expected | waiting_bit;
 		 if (!__atomic_compare_exchange_n(gi, &expected, newv, false,
 		  __ATOMIC_ACQ_REL, 
-		  __ATOMIC_RELAXED))
-		   continue;
+		  __ATOMIC_ACQUIRE))
+		   {
+		 if (expected == guard_bit)
+		   {
+			 // Already initialized.
+			 return 0;
+		   }
+		 if (expected == 0)
+		   continue;
+		   }
 		 
 		 expected = newv;
 	   }
-- 
1.7.11.4



[PATCH, libstdc++] Use acquire semantics in case of CAS failure

2012-08-30 Thread Thiago Macieira
Hello

I detected this issue as I was updating the patches to send to the mailing 
list. I have not created a bug report.

When the CAS operation fails and expected == guard_bit, __cxa_guard_acquire 
will return immediately indicating that the initialisation has already 
succeeded. However, it's missing the acquire barrier for the changes done on 
the other thread, to match the release barrier from __cxa_guard_release.

That is:
thread A                    thread B
load.acq == 0               load.acq == 0
__cxa_guard_acquire         __cxa_guard_acquire
CAS(0 -> 256) success
__cxa_guard_release
store.rel(1)
                            CAS(0 -> 256) fails

At this point, we must synchronise with the store-release from thread A.
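
A minimal sketch of the pairing described above, written with std::atomic rather than the __atomic builtins used in guard.cc. The type lazy_int and the functions finish_init and begin_init are hypothetical, and the -1 return value for "another thread is still initialising" is an assumption of this sketch (the real function waits instead).

```cpp
#include <atomic>
#include <cassert>

// Hypothetical lazily initialised object.  Guard values follow the message
// above: 0 = uninitialised, 256 = pending, 1 = done.
struct lazy_int
{
  std::atomic<int> guard{0};
  int value = 0;  // plain data published through the guard
};

// What the initialising thread does last: write the data, then
// store-release the "done" state (the role of __cxa_guard_release).
void finish_init(lazy_int &obj, int v)
{
  obj.value = v;
  obj.guard.store(1, std::memory_order_release);
}

// Returns 1 if the caller must run the initialiser, 0 if the object is
// already initialised, and -1 if another thread is still initialising.
int begin_init(lazy_int &obj)
{
  int expected = 0;
  if (obj.guard.compare_exchange_strong(expected, 256,
                                        std::memory_order_acq_rel,
                                        std::memory_order_acquire))
    return 1;

  // The CAS failed and loaded the current guard value into expected.
  // Because the failure order is acquire, observing 1 here synchronises
  // with the store-release in finish_init, so obj.value is safe to read.
  // With a relaxed failure order that guarantee would be missing.
  return expected == 1 ? 0 : -1;
}
```

Called from a single thread, begin_init returns 1 on a fresh object, -1 while the guard is pending, and 0 once finish_init has run.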

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center
 Intel Sweden AB - Registration Number: 556189-6027
 Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden
2012-08-30  Thiago Macieira 

	* libsupc++/guard.cc (__cxa_guard_acquire): Use acquire semantics
	in case of failure, to acquire changes done by the other thread.
---
 libstdc++-v3/libsupc++/guard.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/libsupc++/guard.cc b/libstdc++-v3/libsupc++/guard.cc
index 36352e7..73d7221 100644
--- a/libstdc++-v3/libsupc++/guard.cc
+++ b/libstdc++-v3/libsupc++/guard.cc
@@ -253,7 +253,7 @@ namespace __cxxabiv1
 	int expected(0);
 	if (__atomic_compare_exchange_n(gi, &expected, pending_bit, false,
 	__ATOMIC_ACQ_REL,
-	__ATOMIC_RELAXED))
+	__ATOMIC_ACQUIRE))
 	  {
 		// This thread should do the initialization.
 		return 1;
-- 
1.7.11.4



[PATCH, libstdc++] Fix PR54172

2012-08-30 Thread Thiago Macieira
Hello

The attached patch fixes a race condition in __cxa_guard_acquire, a regression 
from 4.6. When the code was refactored to use the new atomic intrinsics, the 
fact that __atomic_compare_exchange_n updates the "expected" variable with the 
current value was missed.

That causes the loop to possibly overwrite a successful initialisation and 
restart the initialisation phase. The bug report has a gdb trace with a 
watchpoint showing the guard variable transitioning

0 (uninit) -> 256 (pending) -> 1 (done) -> 256 (pending) -> 1 (done)

This situation can happen if both CAS operations fail in a given thread: the 
first because the other thread started the initialisation and the second 
because the initialisation succeeded.
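
A small single-threaded sketch of the mechanism, using std::atomic instead of __atomic_compare_exchange_n (both write the observed value back into the "expected" argument on failure). The function names are hypothetical, and the second function only models the state a thread is in after the two failed CAS operations described above.

```cpp
#include <atomic>
#include <cassert>

constexpr int done = 1;       // guard values as shown in the trace above
constexpr int pending = 256;

// A failed compare-exchange stores the value it found into expected.
int expected_after_failed_cas()
{
  std::atomic<int> guard{pending};  // another thread is initialising
  int expected = 0;
  guard.compare_exchange_strong(expected, pending);  // fails: guard != 0
  return expected;                                   // now holds pending
}

// Models the retry of the first CAS at the top of the loop, after the
// second CAS failed because the other thread finished in the meantime.
// Returns the guard value after the retry.
int guard_after_retry(bool reset_expected)
{
  std::atomic<int> guard{done};
  int expected = done;   // left over from the failed second CAS
  if (reset_expected)
    expected = 0;        // the fix: start every iteration from 0
  guard.compare_exchange_strong(expected, pending);
  return guard.load();
}
```

Without the reset, the retry compares against the stale value 1 and moves the guard from done back to pending, which is the 1 -> 256 transition in the trace. With the reset, the retry fails and the guard stays at 1.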

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center
 Intel Sweden AB - Registration Number: 556189-6027
 Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden
2012-08-30  Thiago Macieira 

	PR libstdc++/54172
	* libsupc++/guard.cc (__cxa_guard_acquire): Don't compare_exchange
	from a finished state back to a waiting state.
---
 libstdc++-v3/libsupc++/guard.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/libsupc++/guard.cc b/libstdc++-v3/libsupc++/guard.cc
index adc9608..4da9035 100644
--- a/libstdc++-v3/libsupc++/guard.cc
+++ b/libstdc++-v3/libsupc++/guard.cc
@@ -244,13 +244,13 @@ namespace __cxxabiv1
 if (__gthread_active_p ())
   {
 	int *gi = (int *) (void *) g;
-	int expected(0);
 	const int guard_bit = _GLIBCXX_GUARD_BIT;
 	const int pending_bit = _GLIBCXX_GUARD_PENDING_BIT;
 	const int waiting_bit = _GLIBCXX_GUARD_WAITING_BIT;
 
 	while (1)
 	  {
+	int expected(0);
 	if (__atomic_compare_exchange_n(gi, &expected, pending_bit, false,
 	__ATOMIC_ACQ_REL,
 	__ATOMIC_RELAXED))
-- 
1.7.11.4



[Patch ARM] Fix PR54252

2012-08-30 Thread Ramana Radhakrishnan

Hi,

This is a fix for a pretty serious regression in GCC 4.7 onwards where
GCC is likely to put out wrong alignment specifiers in case of the neon
intrinsics. These specifiers appear to be much larger than the alignment 
specifiers allowed by the architecture for the memory sizes allowed by 
the instructions.


The part of the backend which was emitting the alignment specifiers
wasn't really wrong in what it was doing; it's just that the MEM_SIZE
information for the memory being accessed was wrong when
neon_dereference_pointer constructed these MEM_REFs in the first place.


There are 2 fundamental problems in the way in which the builtin
expanders and neon_dereference_pointer construct these memory references.

The first problem is that neon_dereference_pointer, in the case where
reg_mode and mem_mode are identical, doesn't take into account the number
of bytes that elem_type actually uses. The logic below in
neon_dereference_pointer essentially specifies that the memory accessed
by the intrinsic is an array of type elem_type with number of elements
equal to the number of elements in the vector.

The second and more fundamental problem in neon_dereference_pointer is
that it attempts to figure out the underlying type of the element being
accessed by looking at the actual parameter for the load or the store.
However, this is not guaranteed to work, as the underlying type could
itself be an array type, causing the logic in neon_dereference_pointer to
end up constructing a multi-dimensional array of the basic type. The way I
spotted this was to construct a testcase from the original PR but using
the vld3q_lane_f32 style intrinsics. In these cases the memory reference
produced appeared to be loading a 2-dimensional array of 6 float values
instead of just 3 float values. Ouch!


The correct method ought to be to use the underlying type from the
formal parameter which is what this patch attempts to do.

Tested cross with no regressions on arm-linux-gnueabi with the relevant 
configury, and tested with a number of handwritten tests; the observed 
sizes of the memory accesses look sane.


Applied on trunk and will wait for a few days before backporting to 4.7 
branch.


regards,
Ramana

2012-08-29  Ramana Radhakrishnan  
Richard Earnshaw  

PR target/54252
* config/arm/arm.c (neon_dereference_pointer): Adjust nelems by
element size. Use elem_type from the formal parameter. New parameter 
fcode.
(neon_expand_args): Adjust call to neon_dereference_pointer.











Re: [PATCH, libstdc++] Make empty std::string storage readonly

2012-08-30 Thread Jonathan Wakely
On 29 August 2012 13:25, Michael Haubenwallner wrote:
>
> On 08/28/2012 08:12 PM, Jonathan Wakely wrote:
>> On 28 August 2012 18:27, Michael Haubenwallner wrote:

>>>> Does it actually produce a segfault? I suppose it might on some
>>>> platforms, but not all, so I'm not sure it's worth changing.
>>>
>>> It does segfault here on (32bit each):
>>>  i686-pc-linux-gnu
>>>  ia64-hp-hpux11.31
>>>  i386-pc-solaris2.10
>>>  sparc-sun-solaris2.10
>>>  powerpc-ibm-aix5.3.0.0
>>>  powerpc-ibm-aix6.1.0.0
>>>  powerpc-ibm-aix7.1.0.0
>>>
>>> It does not segfault here on:
>>>  hppa2.0n-hp-hpux11.31
>>>  i586-pc-interix5.2
>>>  i586-pc-winnt5.2 (using MSVC)
>>>
>>> Maybe it could be made segfault on hppa2.0n-hp-hpux11.31 too using some
>>> linker flag, but that's a deprecated platform anyway.
>>>
>>> As long as the major development platform (Linux) does segfault, it feels
>>> worth changing - especially as string.clear() to write the '\0' back again
>>> won't help as quick'n dirty workaround since gcc-4.4.4 any more.
>>
>> Hmm, I tested it on x86_64-unknown-linux-gnu without getting a
>> segfault - but I might have messed up my test.
>
> Using this patch on my x86_64 Gentoo Linux Desktop with gcc-4.7.1 does
> segfault as expected - when I make sure the correct libstdc++ is used at
> runtime, having the '_S_empty_rep_storage' symbol in the .rodata section
> rather than .bss.

Bah, I did mess up my test, not correctly disabling the extern
template instantiations in the library.

If it works reliably on x86_64 then I think the patch is worth considering.

I'm on holiday for a week, so maybe one of the other maintainers will
deal with it first.


[wwwdocs] IA-32/x86-64 Changes for upcoming 4.8.0 series (Broadwell)

2012-08-30 Thread Kirill Yukhin
Hi,
Attached patch for changes.html reflecting new CPU codename Broadwell builtins.

Could you please have a look?

Thanks, K


bdw-html.patch
Description: Binary data