Re: [PATCH] Fix PR49518

2011-07-04 Thread Ira Rosen


Richard Guenther  wrote on 04/07/2011 03:30:59 PM:
> >
> > Richard Guenther  wrote on 04/07/2011 02:38:50 PM:
> >
> > > Handling of negative steps broke one of the many asserts in
> > > the vectorizer.  The following patch drops one that I can't
> > > make sense of.  I think all asserts need comments - especially
> > > this one would, as I can't see why using vf is correct to
> > > test against and not nelements (and why <= vf and not < vf).
> >
> > There is an explanation 10 rows above the assert. It doesn't make sense to
> > peel more than vf iterations (and not nelements, since for the case of
> > multiple types it may help to align more data-refs - see the comment in the
> > code). IIRC <= is for the case of aligned access, but I am not sure about
> > that, so maybe you are right.
> >
> > I don't see how it is related to negative steps.
> >
> > I think that the real reason for this failure is that the loads are
> > actually irrelevant (hence, vf=4 that doesn't take char loads into
> > account), but we don't check that when we analyze data-refs. So, in my
> > opinion, the proper fix will add such check.
>
> The following also works for me:
>
> Index: tree-vect-data-refs.c
> ===================================================================
> --- tree-vect-data-refs.c   (revision 175802)
> +++ tree-vect-data-refs.c   (working copy)
> @@ -1495,6 +1495,9 @@ vect_enhance_data_refs_alignment (loop_v
>stmt = DR_STMT (dr);
>stmt_info = vinfo_for_stmt (stmt);
>
> +  if (!STMT_VINFO_RELEVANT (stmt_info))
> +   continue;
> +
>/* For interleaving, only the alignment of the first access
>   matters.  */
>if (STMT_VINFO_STRIDED_ACCESS (stmt_info)
>
> does that look better or do you propose to clean the datarefs
> vector from those references?

Well, this is certainly enough to fix the PR.
I am not sure if we can just remove these data-refs from the dependence
checks. After that all the alignment and access checks are at least
redundant.

Thanks,
Ira

>
> Thanks,
> Richard.
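
For readers following the thread, the effect of the proposed check can be
sketched outside GCC. This is only an illustration under assumptions: the
struct and function names below are hypothetical stand-ins, not the
vectorizer's data structures, and the "relevant" flag merely models
STMT_VINFO_RELEVANT.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a data reference; not GCC's data_reference.  */
struct example_data_ref
{
  int relevant;      /* models STMT_VINFO_RELEVANT (stmt_info) */
  int misalignment;  /* bytes off the vector boundary; 0 means aligned */
};

/* Count the data refs that still need alignment handling.  Irrelevant
   refs are skipped first, mirroring the early 'continue' in the patch.  */
static size_t
count_misaligned_relevant (const struct example_data_ref *drs, size_t n)
{
  size_t count = 0;

  assert (drs != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      if (!drs[i].relevant)
        continue;
      if (drs[i].misalignment != 0)
        count++;
    }
  return count;
}
```

With this shape, an irrelevant char load no longer influences any peeling
decision, which is the situation described for the PR above.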



Re: PATCH [6/n]: Prepare x32: PR rtl-optimization/47449: Don't propagate hard register non-local goto save area

2011-07-04 Thread Alan Modra
On Mon, Jul 04, 2011 at 05:09:28PM -0700, H.J. Lu wrote:
> On Mon, Jul 4, 2011 at 4:54 PM, Alan Modra  wrote:
> > I didn't set out to do anything special with hard regs one way or the
> > other, just extended what was already done for paradoxical subregs to
> > sign and zero extended subregs.
> 
> Does your change depend on processing zero/sign-extended
> hard registers?

At the time I wrote the patch I was more interested in pseudos.  I
expect that powerpc64 won't be greatly affected if hard regs were
excluded from this fwprop optimization, but you need to discuss your
patch with maintainers of this code.  My opinion as a one-time
contributor to fwprop doesn't count for much.

-- 
Alan Modra
Australia Development Lab, IBM


RE: [RFC] Add middle end hook for stack red zone size

2011-07-04 Thread Jiangning Liu
PING...

I have just merged with the latest code base and generated a new patch, attached.

Thanks,
-Jiangning

> -----Original Message-----
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Jiangning Liu
> Sent: June 28, 2011 4:38 PM
> To: gcc-patches@gcc.gnu.org
> Subject: [RFC] Add middle end hook for stack red zone size
> 
> This patch fixes PR38644, a long-standing bug in stack red zone
> access; PR30282 is related.
> 
> Originally, the red zone concept was not exposed to the middle end,
> and each back end used special logic to add an extra memory barrier
> RTL so that the middle end would compute the correct dependence.
> This means every back end must handle the red zone problem on its
> own. For example, the X86 target introduced the function
> ix86_using_red_zone() to judge red zone access, while POWER introduced
> offset_below_red_zone_p(). The two have different semantics, but the
> back-end call sites use them to decide whether to add the memory
> barrier RTL. If a back end handles this incorrectly, a bug is
> introduced.
> 
> Therefore, the middle end should handle red zone issues itself and
> relieve the back ends of this burden. Specifically, for PR38644, this
> middle-end problem causes incorrect behavior on the ARM target.
> This patch exposes the red zone concept to the middle end by
> introducing a middle-end/back-end hook, TARGET_STACK_RED_ZONE_SIZE,
> defined in target.def, whose default value is 0. A back end may
> redefine this hook to provide the concrete red zone size required by
> its ABI.
> 
> In the middle end, scheduling dependence is modified to use this hook,
> together with a check for stack frame pointer adjustment instructions,
> to decide whether all memory references need to be flushed. In theory,
> if TARGET_STACK_RED_ZONE_SIZE is defined correctly, a back end no
> longer needs to handle this scheduling dependence issue specially by
> introducing an extra memory barrier RTL.
> 
> In the back end, the following changes define the hook:
> 1) For X86, TARGET_STACK_RED_ZONE_SIZE is redefined to be
> ix86_stack_red_zone_size() in i386.c, which is a newly introduced
> function.
> 2) For POWER, TARGET_STACK_RED_ZONE_SIZE is redefined to be
> rs6000_stack_red_zone_size() in rs6000.c, which is also a newly defined
> function.
> 3) For ARM and others, TARGET_STACK_RED_ZONE_SIZE is defined to be
> default_stack_red_zone_size in targhooks.c. This function returns 0,
> which means ARM EABI and the others don't support red zone access at all.
> 
> In summary, the relationship between ABI and red zone access is as
> follows:
> 
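> +---------------+-------+---------------+----------------+--------+
> | ARCH          | ARM   | X86           | POWER          | others |
> +---------------+-------+-------+-------+---------+------+--------+
> | ABI           | EABI  | MS_64 | other | AIX     | V4   |        |
> +---------------+-------+-------+-------+---------+------+--------+
> | RED ZONE      | No    | YES   | No    | YES     | No   | No     |
> +---------------+-------+-------+-------+---------+------+--------+
> | RED ZONE SIZE | 0     | 128   | 0     | 220/288 | 0    | 0      |
> +---------------+-------+-------+-------+---------+------+--------+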
> 
> Thanks,
> -Jiangning

stack-red-zone-patch-38644-4.patch
Description: Binary data
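
As a rough illustration of the hook design described in the quoted RFC,
here is a minimal sketch of a hook with a default of 0 and a per-target
override. Everything in it is an assumption for demonstration: the struct,
the function names and the query are hypothetical and are not taken from
the attached patch; only the default of 0 and the size 128 come from the
message above.

```c
#include <assert.h>

/* Hypothetical miniature of a target hook table; not GCC's targetm.  */
typedef int (*example_red_zone_size_fn) (void);

struct example_target_hooks
{
  example_red_zone_size_fn stack_red_zone_size;
};

/* Default hook: the target has no red zone.  */
static int
example_default_red_zone_size (void)
{
  return 0;
}

/* A target whose ABI provides a 128-byte red zone.  */
static int
example_128_red_zone_size (void)
{
  return 128;
}

/* Middle-end style query: does an access BYTES_BELOW_SP bytes below the
   stack pointer fall inside the red zone reported by the hook?  */
static int
example_in_red_zone_p (const struct example_target_hooks *t,
                       int bytes_below_sp)
{
  assert (t != 0 && t->stack_red_zone_size != 0);
  return bytes_below_sp > 0
         && bytes_below_sp <= t->stack_red_zone_size ();
}
```

The point of the indirection is that generic code asks one question and
each back end answers it once, instead of each back end adding its own
barrier logic.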


libjava patches for RTEMS

2011-07-04 Thread Jie Liu
Hi,

GCJ is available on RTEMS/pc386. Here is the libjava testsuite result
on RTEMS/pc386:

=== libjava Summary ===
# of expected passes2249
# of unexpected failures94
# of untested testcases 66

As the testsuite result is good enough, I think it's time to get the
patch reviewed and merged into gcc. The patch is attached. :)

Best Regards,
Jie


libjava.patch
Description: Binary data


Re: PATCH [6/n]: Prepare x32: PR rtl-optimization/47449: Don't propagate hard register non-local goto save area

2011-07-04 Thread H.J. Lu
On Mon, Jul 4, 2011 at 4:54 PM, Alan Modra  wrote:
> On Mon, Jul 04, 2011 at 01:57:34PM -0700, H.J. Lu wrote:
>> forward_propagate_subreg issue was introduced by
>>
>> http://gcc.gnu.org/ml/gcc-patches/2009-08/msg01203.html
>>
>> Before that,  fwprop never tries to work on hard registers.
>
> I question this claim.  It seems to me that fwprop did look at
> paradoxical subregs of hard regs before my change.

I should have said " fwprop never tries to work on zero/sign-extended
hard registers."

>>  Alan,
>> is your change to process hard registers intentional?
>
> I didn't set out to do anything special with hard regs one way or the
> other, just extended what was already done for paradoxical subregs to
> sign and zero extended subregs.
>

Does your change depend on processing zero/sign-extended
hard registers?

-- 
H.J.


Re: PATCH [6/n]: Prepare x32: PR rtl-optimization/47449: Don't propagate hard register non-local goto save area

2011-07-04 Thread Alan Modra
On Mon, Jul 04, 2011 at 01:57:34PM -0700, H.J. Lu wrote:
> forward_propagate_subreg issue was introduced by
> 
> http://gcc.gnu.org/ml/gcc-patches/2009-08/msg01203.html
> 
> Before that,  fwprop never tries to work on hard registers.

I question this claim.  It seems to me that fwprop did look at
paradoxical subregs of hard regs before my change.

>  Alan,
> is your change to process hard registers intentional?

I didn't set out to do anything special with hard regs one way or the
other, just extended what was already done for paradoxical subregs to
sign and zero extended subregs.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [Patch 2/3] ARM 64 bit atomic operations

2011-07-04 Thread David Gilbert
On 1 July 2011 20:38, Joseph S. Myers  wrote:

Hi Joseph,
  Thanks for your comments.

> On Fri, 1 Jul 2011, Dr. David Alan Gilbert wrote:
>
>> +/* For write */
>> +#include 
>> +/* For abort */
>> +#include 
>
> Please don't include system headers in libgcc without appropriate
> inhibit_libc checks for bootstrap purposes.  In this case, it would seem
> better just to declare the functions you need.

OK.

>> +/* Check that the kernel has a new enough version at load */
>> +void __check_for_sync8_kernelhelper (void)
>
> Shouldn't this function be static?

Yep.

>> +{
>> +  if (__kernel_helper_version < 5)
>> +    {
>> +      const char err[] = "A newer kernel is required to run this binary. (__kernel_cmpxchg64 helper)\n";
>> +      /* At this point we need a way to crash with some information
>> +      for the user - I'm not sure I can rely on much else being
>> +      available at this point, so do the same as generic-morestack.c
>> +      write() and abort(). */
>> +      write (2 /* stderr */, err, sizeof(err));
>
> "write" is in the user's namespace in ISO C so it's not ideal to have a
> call to it.  If there isn't a reserved-namespace version, using the
> syscall directly (hardcoding the syscall number) might be better.

OK, fair enough.

>> +void (*__sync8_kernelhelper_inithook[]) (void) __attribute__ ((section (".init_array"))) = {
>> +  &__check_for_sync8_kernelhelper
>> +};
>
> Shouldn't this also be static (marked "used" if needed)?  Though I'd have
> thought simply marking the function as a constructor would be simpler and
> better

OK, can do - I wasn't too sure whether constructor would end up later in
the initialisation - I was worried that it might end up after a
C++ constructor that might actually use it (although I'm not actually
sure whether that's more or less likely to happen with constructor vs
init_array).

Dave
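
For reference, the constructor-attribute form suggested in the review
might look roughly like the sketch below. It is an assumption-laden
illustration, not the revised patch: helper_version is a hypothetical
stand-in for __kernel_helper_version, and the sketch includes <stdlib.h>
for abort() purely so that it is self-contained, which libgcc itself would
avoid as noted above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel-provided helper version.  */
static int helper_version = 5;

/* Set by the check so that later code can confirm it ran at load time.  */
static int check_ran;

/* Static function run before main() through the constructor attribute
   (a GCC/Clang extension), instead of a hand-written .init_array entry.  */
static void __attribute__ ((constructor))
check_helper_version (void)
{
  check_ran = 1;
  if (helper_version < 5)
    abort ();
}
```

This sketch does not settle the ordering question raised above; it only
shows that the check needs no exported symbol.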


Re: C++ PATCH to improve pretty-printing of function calls

2011-07-04 Thread Gabriel Dos Reis
Jason Merrill  writes:

| Before this patch, GCC described the candidate as
| 
| template decltype (((TypeC*)this)->TypeC::b.template
| typename TypeA::type TypeB::fn [with int U = U, int N = 10,
| typename TypeA::type = TypeA::type]()) TypeC::fn()

ouch!

| after the patch, it's
| 
| template decltype (((TypeC*)this)->TypeC::b.fn()) TypeC::fn()
| 
| it doesn't make any sense to have the template header or return type
| in the middle of an expression, nor to have the [with ...] template
| bindings.

agreed.  Thanks!

-- Gaby


Re: C++ PATCH for c++/49003 (DR 1207, use of 'this' in trailing return type)

2011-07-04 Thread Jason Merrill

On 06/29/2011 05:15 PM, Jason Merrill wrote:

This patch adds support for use of 'this' (implicitly or explicitly) in
the trailing-return-type of a member function.


The above patch wasn't enough, though.  The following patch fixes some 
issues that arose with real uses, including mangling.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit ef43a979a3f46150f383b9deab70dd412d66f96b
Author: Jason Merrill 
Date:   Sun Jul 3 17:14:56 2011 -0400

	DR 1207
	PR c++/49589
	* mangle.c (write_expression): Handle 'this'.
	* parser.c (cp_parser_postfix_dot_deref_expression): Allow
	incomplete *this.
	* semantics.c (potential_constant_expression_1): Check that
	DECL_CONTEXT is set on 'this'.

diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c
index 134c9ea..81b772f 100644
--- a/gcc/cp/mangle.c
+++ b/gcc/cp/mangle.c
@@ -2495,6 +2495,11 @@ write_expression (tree expr)
   else if (TREE_CODE_CLASS (code) == tcc_constant
 	   || (abi_version_at_least (2) && code == CONST_DECL))
 write_template_arg_literal (expr);
+  else if (code == PARM_DECL && DECL_ARTIFICIAL (expr))
+{
+  gcc_assert (!strcmp ("this", IDENTIFIER_POINTER (DECL_NAME (expr))));
+  write_string ("fpT");
+}
   else if (code == PARM_DECL)
 {
   /* A function parameter used in a late-specified return type.  */
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index d79326d..6bb15ed 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -5281,7 +5281,11 @@ cp_parser_postfix_dot_deref_expression (cp_parser *parser,
 		postfix_expression);
 	  scope = NULL_TREE;
 	}
-  else
+  /* Unlike the object expression in other contexts, *this is not
+	 required to be of complete type for purposes of class member
+	 access (5.2.5) outside the member function body.  */
+  else if (scope != current_class_ref
+	   && !(processing_template_decl && scope == current_class_type))
 	scope = complete_type_or_else (scope, NULL_TREE);
   /* Let the name lookup machinery know that we are processing a
 	 class member access expression.  */
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index e29705c..619c058 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -7791,7 +7791,8 @@ potential_constant_expression_1 (tree t, bool want_rval, tsubst_flags_t flags)
 STRIP_NOPS (x);
 if (is_this_parameter (x))
 	  {
-	if (DECL_CONSTRUCTOR_P (DECL_CONTEXT (x)) && want_rval)
+	if (want_rval && DECL_CONTEXT (x)
+		&& DECL_CONSTRUCTOR_P (DECL_CONTEXT (x)))
 	  {
 		if (flags & tf_error)
 		  sorry ("use of the value of the object being constructed "
diff --git a/gcc/testsuite/g++.dg/abi/mangle48.C b/gcc/testsuite/g++.dg/abi/mangle48.C
new file mode 100644
index 0000000..dc9c492
--- /dev/null
+++ b/gcc/testsuite/g++.dg/abi/mangle48.C
@@ -0,0 +1,23 @@
+// Testcase for 'this' mangling
+// { dg-options -std=c++0x }
+
+struct B
+{
+  template  U f();
+};
+
+struct A
+{
+  B b;
+  // { dg-final { scan-assembler "_ZN1A1fIiEEDTcldtdtdefpT1b1fIT_EEEv" } }
+  template  auto f() -> decltype (b.f());
+  // { dg-final { scan-assembler "_ZN1A1gIiEEDTcldtptfpT1b1fIT_EEEv" } }
+  template  auto g() -> decltype (this->b.f());
+};
+
+int main()
+{
+  A a;
+  a.f();
+  a.g();
+}
commit acbc60694bf95f13f9088ed4d5b3d18780aaf754
Author: Jason Merrill 
Date:   Mon Jul 4 10:44:29 2011 -0400

	* cp-demangle.c (d_expression): Handle 'this'.
	(d_print_comp) [DEMANGLE_COMPONENT_FUNCTION_PARAM]: Likewise.

diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index f136322..29badbb 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -2738,10 +2738,18 @@ d_expression (struct d_info *di)
   /* Function parameter used in a late-specified return type.  */
   int index;
   d_advance (di, 2);
-  index = d_compact_number (di);
-  if (index < 0)
-	return NULL;
-
+  if (d_peek_char (di) == 'T')
+	{
+	  /* 'this' parameter.  */
+	  d_advance (di, 1);
+	  index = 0;
+	}
+  else
+	{
+	  index = d_compact_number (di) + 1;
+	  if (index == 0)
+	return NULL;
+	}
   return d_make_function_param (di, index);
 }
   else if (IS_DIGIT (peek)
@@ -4400,9 +4408,17 @@ d_print_comp (struct d_print_info *dpi, int options,
   return;
 
 case DEMANGLE_COMPONENT_FUNCTION_PARAM:
-  d_append_string (dpi, "{parm#");
-  d_append_num (dpi, dc->u.s_number.number + 1);
-  d_append_char (dpi, '}');
+  {
+	long num = dc->u.s_number.number;
+	if (num == 0)
+	  d_append_string (dpi, "this");
+	else
+	  {
+	d_append_string (dpi, "{parm#");
+	d_append_num (dpi, num);
+	d_append_char (dpi, '}');
+	  }
+  }
   return;
 
 case DEMANGLE_COMPONENT_GLOBAL_CONSTRUCTORS:
diff --git a/libiberty/testsuite/demangle-expected b/libiberty/testsuite/demangle-expected
index 4980cf1..2dc74be 100644
--- a/libiberty/testsuite/demangle-expected
+++ b/libiberty/testsuite/demangle-expected
@@ -3905,6 +3905,10 @@ decltype ({parm#1}+{parm#2}) add(int, doubl
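
The index convention in the demangler change above (0 for 'this', N for
{parm#N}) can be sketched in isolation. This is a simplified model under
assumptions, not libiberty code: the function names are hypothetical, the
input is the text immediately following "fp", and the compact-number rule
("_" is 0, "<n>_" is n + 1) is written from my reading of the mangling
scheme rather than copied from d_compact_number.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Parse the text after "fp".  Returns 0 for 'this', N >= 1 for {parm#N},
   or -1 if the input is malformed.  On success *S is advanced past the
   parsed text.  */
static long
example_parse_function_param (const char **s)
{
  const char *p = *s;
  long num = 0;

  if (*p == 'T')
    {
      /* 'this' parameter.  */
      *s = p + 1;
      return 0;
    }

  if (*p != '_')
    {
      if (!isdigit ((unsigned char) *p))
        return -1;
      while (isdigit ((unsigned char) *p))
        num = num * 10 + (*p++ - '0');
      num += 1;
      if (*p != '_')
        return -1;
    }

  *s = p + 1;
  /* Shift by one so that index 0 stays reserved for 'this'.  */
  return num + 1;
}

/* Render an index the way the patched printer does.  */
static void
example_format_function_param (long index, char *buf, size_t size)
{
  if (index == 0)
    snprintf (buf, size, "this");
  else
    snprintf (buf, size, "{parm#%ld}", index);
}
```

Reserving 0 for 'this' is what lets the printer drop the old "+ 1" when it
emits {parm#N}.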

C++ PATCH to improve pretty-printing of function calls

2011-07-04 Thread Jason Merrill

Before this patch, GCC described the candidate as

template decltype (((TypeC*)this)->TypeC::b.template 
typename TypeA::type TypeB::fn [with int U = U, int N = 10, typename 
TypeA::type = TypeA::type]()) TypeC::fn()


after the patch, it's

template decltype (((TypeC*)this)->TypeC::b.fn()) TypeC::fn()

it doesn't make any sense to have the template header or return type in 
the middle of an expression, nor to have the [with ...] template bindings.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 70816c82793a089f530a0df105c129aa9f6dfa65
Author: Jason Merrill 
Date:   Sun Jul 3 17:25:40 2011 -0400

	* error.c (dump_template_bindings): Don't print typenames
	for a partial instantiation.
	(dump_function_decl): If we aren't printing function arguments,
	print template arguments as  rather than [with ...].
	(dump_expr): Don't print return type or template header.
	[BASELINK]: Use BASELINK_FUNCTIONS rather than get_first_fn.
	* pt.c (dependent_template_arg_p): Handle null arg.

diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index 664b918..b16fce6 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -307,9 +307,12 @@ dump_template_bindings (tree parms, tree args, VEC(tree,gc)* typenames)
   parms = TREE_CHAIN (parms);
 }
 
+  /* Don't bother with typenames for a partial instantiation.  */
+  if (VEC_empty (tree, typenames) || uses_template_parms (args))
+return;
+
   FOR_EACH_VEC_ELT (tree, typenames, i, t)
 {
-  bool dependent = uses_template_parms (args);
   if (need_comma)
 	pp_separate_with_comma (cxx_pp);
   dump_type (t, TFF_PLAIN_IDENTIFIER);
@@ -317,11 +320,7 @@ dump_template_bindings (tree parms, tree args, VEC(tree,gc)* typenames)
   pp_equal (cxx_pp);
   pp_cxx_whitespace (cxx_pp);
   push_deferring_access_checks (dk_no_check);
-  if (dependent)
-	++processing_template_decl;
   t = tsubst (t, args, tf_none, NULL_TREE);
-  if (dependent)
-	--processing_template_decl;
   pop_deferring_access_checks ();
   /* Strip typedefs.  We can't just use TFF_CHASE_TYPEDEF because
 	 pp_simple_type_specifier doesn't know about it.  */
@@ -1379,17 +1378,37 @@ dump_function_decl (tree t, int flags)
 
   if (show_return)
 	dump_type_suffix (TREE_TYPE (fntype), flags);
-}
 
-  /* If T is a template instantiation, dump the parameter binding.  */
-  if (template_parms != NULL_TREE && template_args != NULL_TREE)
+  /* If T is a template instantiation, dump the parameter binding.  */
+  if (template_parms != NULL_TREE && template_args != NULL_TREE)
+	{
+	  pp_cxx_whitespace (cxx_pp);
+	  pp_cxx_left_bracket (cxx_pp);
+	  pp_cxx_ws_string (cxx_pp, M_("with"));
+	  pp_cxx_whitespace (cxx_pp);
+	  dump_template_bindings (template_parms, template_args, typenames);
+	  pp_cxx_right_bracket (cxx_pp);
+	}
+}
+  else if (template_args)
 {
-  pp_cxx_whitespace (cxx_pp);
-  pp_cxx_left_bracket (cxx_pp);
-  pp_cxx_ws_string (cxx_pp, M_("with"));
-  pp_cxx_whitespace (cxx_pp);
-  dump_template_bindings (template_parms, template_args, typenames);
-  pp_cxx_right_bracket (cxx_pp);
+  bool need_comma = false;
+  int i;
+  pp_cxx_begin_template_argument_list (cxx_pp);
+  template_args = INNERMOST_TEMPLATE_ARGS (template_args);
+  for (i = 0; i < TREE_VEC_LENGTH (template_args); ++i)
+	{
+	  tree arg = TREE_VEC_ELT (template_args, i);
+	  if (need_comma)
+	pp_separate_with_comma (cxx_pp);
+	  if (ARGUMENT_PACK_P (arg))
+	pp_cxx_left_brace (cxx_pp);
+	  dump_template_argument (arg, TFF_PLAIN_IDENTIFIER);
+	  if (ARGUMENT_PACK_P (arg))
+	pp_cxx_right_brace (cxx_pp);
+	  need_comma = true;
+	}
+  pp_cxx_end_template_argument_list (cxx_pp);
 }
 }
 
@@ -1724,7 +1743,9 @@ dump_expr (tree t, int flags)
 case OVERLOAD:
 case TYPE_DECL:
 case IDENTIFIER_NODE:
-  dump_decl (t, (flags & ~TFF_DECL_SPECIFIERS) | TFF_NO_FUNCTION_ARGUMENTS);
+  dump_decl (t, ((flags & ~(TFF_DECL_SPECIFIERS|TFF_RETURN_TYPE
+|TFF_TEMPLATE_HEADER))
+		 | TFF_NO_FUNCTION_ARGUMENTS));
   break;
 
 case INTEGER_CST:
@@ -2289,7 +2310,7 @@ dump_expr (tree t, int flags)
   break;
 
 case BASELINK:
-  dump_expr (get_first_fn (t), flags & ~TFF_EXPR_IN_PARENS);
+  dump_expr (BASELINK_FUNCTIONS (t), flags & ~TFF_EXPR_IN_PARENS);
   break;
 
 case EMPTY_CLASS_EXPR:
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 7236e7e..e7be08b 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -18848,7 +18848,7 @@ dependent_template_arg_p (tree arg)
  is dependent. This is consistent with what
  any_dependent_template_arguments_p [that calls this function]
  does.  */
-  if (arg == error_mark_node)
+  if (!arg || arg == error_mark_node)
 return true;
 
   if (TREE_CODE (arg) == ARGUMENT_PACK_SELECT)
diff --git a/gcc/testsuite/g++.dg/cpp0x/diag1.C b/gcc/testsuite/g++.dg/cpp0x/diag1.C
new file mode 100644
index 0000000..b3f30bc
--- /dev/null
+++ b/gcc/testsui

Re: C++ PATCH to improve 'aka's on type printing in diagnostics

2011-07-04 Thread Gabriel Dos Reis
Jason Merrill  writes:

| I thought of a different way to do it that would stay encapsulated in
| type_as_string, so this is the version I'm going to check in.

OK, thanks.

-- Gaby


Re: C++ PATCH to improve 'aka's on type printing in diagnostics

2011-07-04 Thread Jason Merrill
I thought of a different way to do it that would stay encapsulated in 
type_as_string, so this is the version I'm going to check in.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 689a3e58f4eebbcdafec81f06e8af699045fff3a
Author: Jason Merrill 
Date:   Fri Jul 1 00:16:46 2011 -0400

	* error.c (type_to_string): Avoid redundant akas.

diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index 7c90ec4..664b918 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -2634,14 +2634,28 @@ type_to_string (tree typ, int verbose)
 
   reinit_cxx_pp ();
   dump_type (typ, flags);
+  /* If we're printing a type that involves typedefs, also print the
+ stripped version.  But sometimes the stripped version looks
+ exactly the same, so we don't want it after all.  To avoid printing
+ it in that case, we play ugly obstack games.  */
   if (typ && TYPE_P (typ) && typ != TYPE_CANONICAL (typ)
   && !uses_template_parms (typ))
 {
+  int aka_start; char *p;
+  struct obstack *ob = pp_base (cxx_pp)->buffer->obstack;
+  /* Remember the end of the initial dump.  */
+  int len = obstack_object_size (ob);
   tree aka = strip_typedefs (typ);
   pp_string (cxx_pp, " {aka");
   pp_cxx_whitespace (cxx_pp);
+  /* And remember the start of the aka dump.  */
+  aka_start = obstack_object_size (ob);
   dump_type (aka, flags);
   pp_character (cxx_pp, '}');
+  p = (char*)obstack_base (ob);
+  /* If they are identical, cut off the aka with a NUL.  */
+  if (memcmp (p, p+aka_start, len) == 0)
+	p[len] = '\0';
 }
   return pp_formatted_text (cxx_pp);
 }
diff --git a/gcc/testsuite/g++.dg/diagnostic/aka1.C b/gcc/testsuite/g++.dg/diagnostic/aka1.C
new file mode 100644
index 0000000..37f8df9
--- /dev/null
+++ b/gcc/testsuite/g++.dg/diagnostic/aka1.C
@@ -0,0 +1,15 @@
+// Basic test for typedef stripping in diagnostics.
+
+struct A {
+  void f();
+};
+
+void A::f() {
+  // We don't want an aka for the injected-class-name.
+  A a = 0;			// { dg-error "type .A. requested" }
+}
+
+typedef A B;
+
+// We do want an aka for a real typedef.
+B b = 0;			// { dg-error "B .aka A." }
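
The buffer trick in type_to_string can be modelled with a plain character
array instead of an obstack. The sketch below is hypothetical (the function
name and the fixed buffer are mine) and deviates from the patch in one
labeled way: it also compares lengths, whereas the patch as posted compares
only the first len bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write NAME followed by " {aka STRIPPED}" into BUF, then cut the aka
   off again if it merely repeats NAME.  Returns BUF.  */
static const char *
example_type_with_aka (char *buf, size_t size,
                       const char *name, const char *stripped)
{
  /* Remember the end of the initial dump.  */
  size_t len = strlen (name);
  size_t aka_len = strlen (stripped);
  size_t aka_start;
  int n;

  n = snprintf (buf, size, "%s {aka ", name);
  assert (n >= 0 && (size_t) n < size);
  /* And remember the start of the aka dump.  */
  aka_start = (size_t) n;

  n = snprintf (buf + aka_start, size - aka_start, "%s}", stripped);
  assert (n >= 0 && (size_t) n < size - aka_start);

  /* If they are identical, cut off the aka with a NUL.  The length test
     is an addition in this sketch, not part of the posted patch.  */
  if (aka_len == len && memcmp (buf, buf + aka_start, len) == 0)
    buf[len] = '\0';
  return buf;
}
```

Called with ("A", "A") this leaves just "A"; called with ("B", "A") it
keeps "B {aka A}", matching the two cases in the aka1.C testcase.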


Re: PATCH [6/n]: Prepare x32: PR rtl-optimization/47449: Don't propagate hard register non-local goto save area

2011-07-04 Thread H.J. Lu
On Mon, Jul 4, 2011 at 1:52 PM, H.J. Lu  wrote:
> On Mon, Jul 4, 2011 at 12:57 PM, Richard Sandiford
>  wrote:
>> "H.J. Lu"  writes:
>>> RTL-based forward propagation pass shouldn't propagate hard registers.
>>
>> That seems a bit draconian.  Many fixed hard registers ought to be OK.
>> E.g. there doesn't seem to be anything wrong with propagating uses of
>> the stack or frame pointers, subject to the usual availability checks.
>>
>> To play devil's advocate, an alternative might be to
>>
>> (a) make local_ref_killed_between_p return true for non-fixed hard
>>    registers when a call or asm comes between the two instructions
>>
>> (b) make use_killed_between return true for non-fixed hard registers
>>    when the instructions are in different basic blocks
>>
>> Thoughts?
>>
>
> There are a few problems with this suggestion:
>
> 1. The comment says:
>
> /* If USE is a subreg, see if it can be replaced by a pseudo.  */
>
> static bool
> forward_propagate_subreg (df_ref use, rtx def_insn, rtx def_set)
> {
>
> It indicates this function is intended to work on pseudo registers.
>
> 2. propagate_rtx avoids hard registers:
>
> static rtx
> propagate_rtx (rtx x, enum machine_mode mode, rtx old_rtx, rtx new_rtx,
>               bool speed)
> {
>  rtx tem;
>  bool collapsed;
>  int flags;
>
>  if (REG_P (new_rtx) && REGNO (new_rtx) < FIRST_PSEUDO_REGISTER)
>    return NULL_RTX;
>
> It seems that fwprop is intended to deal with pseudo registers.  If we
> want to extend it to hard registers, that should be a separate project.
>
> Thanks.

forward_propagate_subreg issue was introduced by

http://gcc.gnu.org/ml/gcc-patches/2009-08/msg01203.html

Before that,  fwprop never tries to work on hard registers.  Alan,
is your change to process hard registers intentional?

Thanks.


-- 
H.J.


[pph] Split c1meteor-contest.cc (issue4654087)

2011-07-04 Thread Diego Novillo
The test c1meteor-contest.cc had similar issues to c1eabi1.cc.  The
inclusion of system headers that have been PPH'd confuses the compiler.
I split the test so we have one version without additional includes
and another with the standard includes.

The version without additional includes works fine.

Tested on x86_64.  Committed.


Diego.

* g++.dg/pph/c1meteor-contest.cc: Make executable.
Move function main() from c1meteor-contest.h.
* g++.dg/pph/c1meteor-contest.h: Do not include stdlib.h nor
stdio.h.
Add prototype for system functions qsort, printf and atoi.
* g++.dg/pph/c2meteor-contest.cc: New.
* g++.dg/pph/c2meteor-contest.h: New.
* g++.dg/pph/pph.map: Add c2meteor-contest.h

diff --git a/gcc/testsuite/g++.dg/pph/c1meteor-contest.cc 
b/gcc/testsuite/g++.dg/pph/c1meteor-contest.cc
index e745afe..ff1765c 100644
--- a/gcc/testsuite/g++.dg/pph/c1meteor-contest.cc
+++ b/gcc/testsuite/g++.dg/pph/c1meteor-contest.cc
@@ -1,4 +1,17 @@
-/* { dg-timeout 2 { target *-*-* } }  */
-// { dg-xfail-if "INFINITE" { "*-*-*" } { "-fpph-map=pph.map" } }
 /* { dg-options "-w" }  */
+/* { dg-do run } */
+
 #include "c1meteor-contest.h"
+
+int main(int argc, char **argv) {
+   if(argc > 1)
+  max_solutions = atoi(argv[1]);
+   calc_pieces();
+   calc_rows();
+   solve(0, 0);
+   printf("%d solutions found\n\n", solution_count);
+   qsort(solutions, solution_count, 50 * sizeof(signed char), solution_sort);
+   pretty(solutions[0]);
+   pretty(solutions[solution_count-1]);
+   return 0;
+}
diff --git a/gcc/testsuite/g++.dg/pph/c1meteor-contest.h 
b/gcc/testsuite/g++.dg/pph/c1meteor-contest.h
index 3c465ab..698ccf5 100644
--- a/gcc/testsuite/g++.dg/pph/c1meteor-contest.h
+++ b/gcc/testsuite/g++.dg/pph/c1meteor-contest.h
@@ -36,8 +36,16 @@ POSSIBILITY OF SUCH DAMAGE.
  * contributed by Christian Vosteen
  */
 
-#include 
-#include 
+/* Simplified version of c2meteor-contest.h - Do not include other system
+   headers here.  Simply forward declare the library functions used
+   by this header.  */
+extern  "C" {
+  typedef __SIZE_TYPE__ size_t;
+  void qsort(void *, size_t, size_t, int (*)(const void *, const void *));
+  int printf(const char *, ...);
+  int atoi(const char *);
+}
+
 #define TRUE 1
 #define FALSE 0
 
@@ -614,17 +622,4 @@ void pretty(signed char *b) {
}
printf("\n");
 }
-
-int main(int argc, char **argv) {
-   if(argc > 1)
-  max_solutions = atoi(argv[1]);
-   calc_pieces();
-   calc_rows();
-   solve(0, 0);
-   printf("%d solutions found\n\n", solution_count);
-   qsort(solutions, solution_count, 50 * sizeof(signed char), solution_sort);
-   pretty(solutions[0]);
-   pretty(solutions[solution_count-1]);
-   return 0;
-}
 #endif
diff --git a/gcc/testsuite/g++.dg/pph/c2meteor-contest.cc 
b/gcc/testsuite/g++.dg/pph/c2meteor-contest.cc
new file mode 100644
index 0000000..e35cca4
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pph/c2meteor-contest.cc
@@ -0,0 +1,17 @@
+/* { dg-timeout 2 { target *-*-* } }  */
+// { dg-xfail-if "INFINITE" { "*-*-*" } { "-fpph-map=pph.map" } }
+/* { dg-options "-w" }  */
+#include "c2meteor-contest.h"
+
+int main(int argc, char **argv) {
+   if(argc > 1)
+  max_solutions = atoi(argv[1]);
+   calc_pieces();
+   calc_rows();
+   solve(0, 0);
+   printf("%d solutions found\n\n", solution_count);
+   qsort(solutions, solution_count, 50 * sizeof(signed char), solution_sort);
+   pretty(solutions[0]);
+   pretty(solutions[solution_count-1]);
+   return 0;
+}
diff --git a/gcc/testsuite/g++.dg/pph/c2meteor-contest.h 
b/gcc/testsuite/g++.dg/pph/c2meteor-contest.h
new file mode 100644
index 0000000..33a9907
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pph/c2meteor-contest.h
@@ -0,0 +1,617 @@
+/* { dg-options "-w" }  */
+#ifndef __PPH_GUARD_H
+#define __PPH_GUARD_H
+/*
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+* Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in the
+documentation and/or other materials provided with the distribution.
+
+* Neither the name of "The Computer Language Benchmarks Game" nor the
+name of "The Computer Language Shootout Benchmarks" nor the names of
+its contributors may be used to endorse or promote products derived
+from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQU

Re: PATCH [6/n]: Prepare x32: PR rtl-optimization/47449: Don't propagate hard register non-local goto save area

2011-07-04 Thread H.J. Lu
On Mon, Jul 4, 2011 at 12:57 PM, Richard Sandiford
 wrote:
> "H.J. Lu"  writes:
>> RTL-based forward propagation pass shouldn't propagate hard registers.
>
> That seems a bit draconian.  Many fixed hard registers ought to be OK.
> E.g. there doesn't seem to be anything wrong with propagating uses of
> the stack or frame pointers, subject to the usual availability checks.
>
> To play devil's advocate, an alternative might be to
>
> (a) make local_ref_killed_between_p return true for non-fixed hard
>    registers when a call or asm comes between the two instructions
>
> (b) make use_killed_between return true for non-fixed hard registers
>    when the instructions are in different basic blocks
>
> Thoughts?
>

There are a few problems with this suggestion:

1. The comment says:

/* If USE is a subreg, see if it can be replaced by a pseudo.  */

static bool
forward_propagate_subreg (df_ref use, rtx def_insn, rtx def_set)
{

It indicates this function is intended to work on pseudo registers.

2. propagate_rtx avoids hard registers:

static rtx
propagate_rtx (rtx x, enum machine_mode mode, rtx old_rtx, rtx new_rtx,
   bool speed)
{
  rtx tem;
  bool collapsed;
  int flags;

  if (REG_P (new_rtx) && REGNO (new_rtx) < FIRST_PSEUDO_REGISTER)
return NULL_RTX;

It seems that fwprop is intended to deal with pseudo registers.  If we
want to extend it to hard registers, that should be a separate project.

Thanks.

-- 
H.J.


[pph] Tweak some tests (issue4668052)

2011-07-04 Thread Diego Novillo

This patch adds an assertion to x1ten-hellos to make sure that the
loop counter is properly initialized and ends in 10.  It also calls
exit instead of return.

In c1eabi1.h I forgot to surround the system function signatures in
extern "C" {}.

Tested on x86_64.  Committed.


Diego.

* g++.dg/pph/c1eabi1.h: Surround system function prototypes with
extern "C" {}.
* g++.dg/pph/x1ten-hellos.cc (main): Tidy.
Assert that i is 10 at the end of the loop.
Call exit instead of 'return 0'.
* g++.dg/pph/x1ten-hellos.h: Do not include stdio.h.

diff --git a/gcc/testsuite/g++.dg/pph/c1eabi1.h b/gcc/testsuite/g++.dg/pph/c1eabi1.h
index 77ebfa3..f43913f 100644
--- a/gcc/testsuite/g++.dg/pph/c1eabi1.h
+++ b/gcc/testsuite/g++.dg/pph/c1eabi1.h
@@ -33,11 +33,13 @@
 /* Simplified version of c2eabi1.cc - Do not include other system
headers here.  Simply forward declare the library functions used
by this header.  */
-extern void abort(void);
-extern int abs(int);
-extern void exit(int);
-extern double fabs(double);
-extern int printf(const char *, ...);
+extern "C" {
+  extern void abort(void);
+  extern int abs(int);
+  extern void exit(int);
+  extern double fabs(double);
+  extern int printf(const char *, ...);
+}
 
 /* All these functions are defined to use the base ABI, so use the
attribute to ensure the tests use the base ABI to call them even
diff --git a/gcc/testsuite/g++.dg/pph/x1ten-hellos.cc b/gcc/testsuite/g++.dg/pph/x1ten-hellos.cc
index 865b149..704b3fc 100644
--- a/gcc/testsuite/g++.dg/pph/x1ten-hellos.cc
+++ b/gcc/testsuite/g++.dg/pph/x1ten-hellos.cc
@@ -1,10 +1,17 @@
 // { dg-do run }
+
 #include "x1ten-hellos.h"
 
 int main(void)
 {
   A a;
-  for (int i = 0; i < 10; i++)
+  int i;
+
+  for (i = 0; i < 10; i++)
     a.hello();
-  return 0;
+
+  if (i != 10)
+    abort ();
+
+  exit (0);
 }
diff --git a/gcc/testsuite/g++.dg/pph/x1ten-hellos.h b/gcc/testsuite/g++.dg/pph/x1ten-hellos.h
index 2a53b66..c165c01 100644
--- a/gcc/testsuite/g++.dg/pph/x1ten-hellos.h
+++ b/gcc/testsuite/g++.dg/pph/x1ten-hellos.h
@@ -1,6 +1,10 @@
 #ifndef A_H_
 #define A_H_
-#include <stdio.h>
+extern "C" {
+  int printf(const char*, ...);
+  void abort(void);
+  void exit(int);
+};
 
 class A
 {

--
This patch is available for review at http://codereview.appspot.com/4668052


Re: [PATCH] Fix ICE during combine (PR rtl-optimization/49619)

2011-07-04 Thread Eric Botcazou
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> For 4.6, I think safer would be just the first one liner change to pass
> VOIDmode to combine_simplify_rtx.  Is that ok for 4.6?
>
> 2011-07-04  Jakub Jelinek  
>
>   PR rtl-optimization/49619
>   * combine.c (combine_simplify_rtx): In PLUS -> IOR simplification
>   pass VOIDmode as op0_mode to recursive call, and return temp even
>   when different from tor, just if it is not IOR of the original
>   PLUS arguments.
>
>   * gcc.dg/pr49619.c: New test.

OK for mainline, and for 4.6/4.5 branch as far as the first part is concerned.

-- 
Eric Botcazou


Re: Ping: C-family stack check for threads

2011-07-04 Thread Thomas Klein

Richard Henderson wrote:

 On 07/03/2011 08:06 AM, Thomas Klein wrote:
 >  +/*
 >  + * Write prologue part of stack check into asm file.
 >  + * For Thumb this may look like this:
 >  + *   push {rsym,ramn}
 >  + *   ldr rsym, .LSPCHK0
 >  + *   ldr rsym, [rsym]
 >  + *   ldr ramn, .LSPCHK0 + 4
 >  + *   add rsym, rsym, ramn
 >  + *   cmp sp, rsym
 >  + *   bhs .LSPCHK1
 >  + *   push {lr}
 >  + *   bl __thumb_stack_failure
 >  + * .align 2
 >  + * .LSPCHK0:
 >  + *   .word symbol_addr_of(stack_limit_rtx)
 >  + *   .word length_of(amount)
 >  + * .LSPCHK1:
 >  + *   pop {rsym,ramn}
 >  + */
 >  +void
 >  +stack_check_output_function (FILE *f, int reg0, int reg1, unsigned amount,
 >  + unsigned numregs)
 >  +{

 Is there an exceedingly good reason you're emitting this much code
 as text, rather than as rtl?


To me, the stack check is one coherent operation.
This is placed after an initial push, which can't be eliminated, but before a 
major stack adjustment.
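
The comparison at the heart of the quoted sequence (`cmp sp, rsym` followed by `bhs .LSPCHK1`) can be written out in C. This is only a sketch of the arithmetic: `toy_stack_check_passes` is a hypothetical name, and `uintptr_t` stands in for the 32-bit registers used on Thumb.

```c
#include <assert.h>
#include <stdint.h>

/* Models "cmp sp, rsym; bhs .LSPCHK1": an unsigned higher-or-same
   comparison of SP against LIMIT + AMOUNT.  Returns nonzero when the
   check passes and the call to the failure function is skipped.  */
static int
toy_stack_check_passes (uintptr_t sp, uintptr_t limit, uintptr_t amount)
{
  return sp >= limit + amount;
}
```

For example, with a limit of 0x1000 and an amount of 0x800, a stack pointer of 0x1800 still passes, while 0x17ff would reach the failure call.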

I have had some problems with RTL at the prologue stage.
Is there a way to encapsulate an RTL sequence within the prologue?
There is an emit_multi_reg_push, but is there something like emit_multi_reg_pop, 
too?
Are the other operations (compare, branch, ...) still allowed?


 In particular, you adjust the stack but not the unwind info.  So
 if one puts a breakpoint at your __thumb_stack_failure function,
 the unwind information will be incorrect.


Yes, if the failure function is reached, the unwind info will be wrong.
If this is a major problem, do I have to add this info after every push and pop 
operation?
Will the RTL push/pop already do this for me?

Regards
 Thomas Klein




Re: PATCH [6/n]: Prepare x32: PR rtl-optimization/47449: Don't propagate hard register non-local goto save area

2011-07-04 Thread Richard Sandiford
"H.J. Lu"  writes:
> RTL-based forward propagation pass shouldn't propagate hard register.

That seems a bit draconian.  Many fixed hard registers ought to be OK.
E.g. there doesn't seem to be anything wrong with propagating uses of
the stack or frame pointers, subject to the usual availability checks.

To play devil's advocate, an alternative might be to

(a) make local_ref_killed_between_p return true for non-fixed hard
registers when a call or asm comes between the two instructions

(b) make use_killed_between return true for non-fixed hard registers
when the instructions are in different basic blocks

Thoughts?

Richard


Re: C++ PATCH to improve 'aka's on type printing in diagnostics

2011-07-04 Thread Gabriel Dos Reis
Jason Merrill  writes:

| On 06/14/2011 01:38 PM, Jason Merrill wrote:
| > While I was at it, I've also tweaked the compiler to also print the
| > typedef-stripped version of a type when appropriate, which should help
| > with understanding template error messages.
| 
| I noticed that this was sometimes printing an aka that was exactly the
| same, which looks a bit goofy.  So this patch makes sure that the
| typedef-stripped version actually prints out differently before
| appending the {aka}.
| 
| Tested x86_64-pc-linux-gnu.  Gaby: I'm not entirely comfortable
| messing directly with the obstack here, but the pp interface doesn't
| seem to support multiple strings at once.  Does this approach make
| sense to you, or do you have a better idea?
| 

Hi Jason,

Please go ahead with your patch, and open a PR request for a better
interface (assigned to me).  The diagnostic machinery should support
what you want to do without people having to deal directly with the
lower-level storage management.  Thanks! 

-- Gaby


[pph] Split c1eabi (issue4635089)

2011-07-04 Thread Diego Novillo

This test was exposing multiple failures.  To isolate them better, I split
it in two.  I simplified c1eabi1.{cc,h} to test a single header file.
This fails in assembly comparison because we do not emit static
initializers properly out of the pph image.

The original test fails because c2eabi1.h includes another pph image,
which produces a bogus duplicate declaration error that throws the
diagnostic routines into a mutually-recursive infinite call loop.

I am currently working on the c1eabi1 failure.

Tested on x86_64.  Committed.


Diego.

* g++.dg/pph/c1eabi1.cc: Move main from c1eabi1.h
Remove timeout.
Add expected asm difference.
* g++.dg/pph/c1eabi1.h: Do not include stdio.h, stdlib.h nor math.h.
Declare abort, abs, exit, fabs and printf.
* g++.dg/pph/c2eabi1.cc: New.
* g++.dg/pph/c2eabi1.h: New.
* g++.dg/pph/pph.map: Add c2eabi1.h.

diff --git a/gcc/testsuite/g++.dg/pph/c1eabi1.cc b/gcc/testsuite/g++.dg/pph/c1eabi1.cc
index 3f5038a..d676732 100644
--- a/gcc/testsuite/g++.dg/pph/c1eabi1.cc
+++ b/gcc/testsuite/g++.dg/pph/c1eabi1.cc
@@ -1,5 +1,191 @@
-// { dg-timeout 2 { target *-*-* } }
-// { dg-xfail-if "INFINITE" { "*-*-*" } { "-fpph-map=pph.map" } }
 // { dg-options "-w -fpermissive" }
+// pph asm xdiff
 
 #include "c1eabi1.h"
+
+int main () {
+  unsigned char bytes[256];
+  int i, j, k, n;
+  int *result;
+
+  /* Table 2.  Double-precision floating-point arithmetic.  */
+  deq (__aeabi_dadd (dzero, done), done);
+  deq (__aeabi_dadd (done, done), dtwo);
+  deq (__aeabi_ddiv (dminus_four, dminus_two), dtwo);
+  deq (__aeabi_ddiv (dminus_two, dtwo), dminus_one);
+  deq (__aeabi_dmul (dtwo, dtwo), dfour);
+  deq (__aeabi_dmul (dminus_one, dminus_two), dtwo);
+  deq (__aeabi_dneg (dminus_one), done);
+  deq (__aeabi_dneg (dfour), dminus_four);
+  deq (__aeabi_drsub (done, dzero), dminus_one);
+  deq (__aeabi_drsub (dtwo, dminus_two), dminus_four);
+  deq (__aeabi_dsub (dzero, done), dminus_one);
+  deq (__aeabi_dsub (dminus_two, dtwo), dminus_four);
+
+  /* Table 3.  Double-precision floating-point comparisons.  */
+  ieq (__aeabi_dcmpeq (done, done), 1);
+  ieq (__aeabi_dcmpeq (done, dzero), 0);
+  ieq (__aeabi_dcmpeq (dNaN, dzero), 0);
+  ieq (__aeabi_dcmpeq (dNaN, dNaN), 0);
+
+  ieq (__aeabi_dcmplt (dzero, done), 1);
+  ieq (__aeabi_dcmplt (done, dzero), 0);
+  ieq (__aeabi_dcmplt (dzero, dzero), 0);
+  ieq (__aeabi_dcmplt (dzero, dNaN), 0);
+  ieq (__aeabi_dcmplt (dNaN, dNaN), 0);
+
+  ieq (__aeabi_dcmple (dzero, done), 1);
+  ieq (__aeabi_dcmple (done, dzero), 0);
+  ieq (__aeabi_dcmple (dzero, dzero), 1);
+  ieq (__aeabi_dcmple (dzero, dNaN), 0);
+  ieq (__aeabi_dcmple (dNaN, dNaN), 0);
+
+  ieq (__aeabi_dcmpge (dzero, done), 0);
+  ieq (__aeabi_dcmpge (done, dzero), 1);
+  ieq (__aeabi_dcmpge (dzero, dzero), 1);
+  ieq (__aeabi_dcmpge (dzero, dNaN), 0);
+  ieq (__aeabi_dcmpge (dNaN, dNaN), 0);
+
+  ieq (__aeabi_dcmpgt (dzero, done), 0);
+  ieq (__aeabi_dcmpgt (done, dzero), 1);
+  ieq (__aeabi_dcmplt (dzero, dzero), 0);
+  ieq (__aeabi_dcmpgt (dzero, dNaN), 0);
+  ieq (__aeabi_dcmpgt (dNaN, dNaN), 0);
+
+  ieq (__aeabi_dcmpun (done, done), 0);
+  ieq (__aeabi_dcmpun (done, dzero), 0);
+  ieq (__aeabi_dcmpun (dNaN, dzero), 1);
+  ieq (__aeabi_dcmpun (dNaN, dNaN), 1);
+
+  /* Table 4.  Single-precision floating-point arithmetic.  */
+  feq (__aeabi_fadd (fzero, fone), fone);
+  feq (__aeabi_fadd (fone, fone), ftwo);
+  feq (__aeabi_fdiv (fminus_four, fminus_two), ftwo);
+  feq (__aeabi_fdiv (fminus_two, ftwo), fminus_one);
+  feq (__aeabi_fmul (ftwo, ftwo), ffour);
+  feq (__aeabi_fmul (fminus_one, fminus_two), ftwo);
+  feq (__aeabi_fneg (fminus_one), fone);
+  feq (__aeabi_fneg (ffour), fminus_four);
+  feq (__aeabi_frsub (fone, fzero), fminus_one);
+  feq (__aeabi_frsub (ftwo, fminus_two), fminus_four);
+  feq (__aeabi_fsub (fzero, fone), fminus_one);
+  feq (__aeabi_fsub (fminus_two, ftwo), fminus_four);
+
+  /* Table 5.  Single-precision floating-point comparisons.  */
+  ieq (__aeabi_fcmpeq (fone, fone), 1);
+  ieq (__aeabi_fcmpeq (fone, fzero), 0);
+  ieq (__aeabi_fcmpeq (fNaN, fzero), 0);
+  ieq (__aeabi_fcmpeq (fNaN, fNaN), 0);
+
+  ieq (__aeabi_fcmplt (fzero, fone), 1);
+  ieq (__aeabi_fcmplt (fone, fzero), 0);
+  ieq (__aeabi_fcmplt (fzero, fzero), 0);
+  ieq (__aeabi_fcmplt (fzero, fNaN), 0);
+  ieq (__aeabi_fcmplt (fNaN, fNaN), 0);
+
+  ieq (__aeabi_fcmple (fzero, fone), 1);
+  ieq (__aeabi_fcmple (fone, fzero), 0);
+  ieq (__aeabi_fcmple (fzero, fzero), 1);
+  ieq (__aeabi_fcmple (fzero, fNaN), 0);
+  ieq (__aeabi_fcmple (fNaN, fNaN), 0);
+
+  ieq (__aeabi_fcmpge (fzero, fone), 0);
+  ieq (__aeabi_fcmpge (fone, fzero), 1);
+  ieq (__aeabi_fcmpge (fzero, fzero), 1);
+  ieq (__aeabi_fcmpge (fzero, fNaN), 0);
+  ieq (__aeabi_fcmpge (fNaN, fNaN), 0);
+
+  ieq (__aeabi_fcmpgt (fzero, fone), 0);
+  ieq (__aeabi_fcmpgt (fone, fzero), 1);
+  ieq (__aeabi_fcmplt (fzero, fzero), 0);
+  ieq (__aeabi_fcmpgt (fzero, fNaN), 0);
+  i

Re: [PATCH] Fix ICE with gfortran ... -L without argument (PR fortran/49623)

2011-07-04 Thread Paul Richard Thomas
Dear Jakub,

Yes!  OK for trunk and, if you will, for 4.6.

Thanks

Paul

On Mon, Jul 4, 2011 at 7:22 PM, Jakub Jelinek  wrote:
> Hi!
>
> If -L doesn't have an argument, find_spec_file ICEs on it, as
> the argument is NULL.  As suggested by Joseph, this disregards in
> this loop all options which don't have the required argument.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk/4.6?
>
> 2011-07-04  Jakub Jelinek  
>
>        PR fortran/49623
>        * gfortranspec.c (lang_specific_driver): Ignore options with
>        CL_ERR_MISSING_ARG errors.
>
> --- gcc/fortran/gfortranspec.c.jj       2011-07-04 14:58:56.0 +0200
> +++ gcc/fortran/gfortranspec.c  2011-07-04 15:01:58.0 +0200
> @@ -255,6 +255,9 @@ lang_specific_driver (struct cl_decoded_
>
>   for (i = 1; i < argc; ++i)
>     {
> +      if (decoded_options[i].errors & CL_ERR_MISSING_ARG)
> +       continue;
> +
>       switch (decoded_options[i].opt_index)
>        {
>        case OPT_SPECIAL_input_file:
>
>        Jakub
>



-- 
The knack of flying is learning how to throw yourself at the ground and miss.
       --Hitchhikers Guide to the Galaxy


Re: [PATCH] Fix bootstrap on OpenBSD, PR48851

2011-07-04 Thread Mike Stump
On Jul 4, 2011, at 4:04 AM, Richard Guenther wrote:
> It happens that OpenBSD suffers from a bogus fixinclude that changes
> its perfectly valid NULL define from (void *)0 to 0.  The fix itself
> appears to be very old and is completely bogus

I don't agree with the completely bogus part.  Why not replace it with:

#undef NULL
#ifdef __GNUG__
#define NULL __null
#else   /* G++ */
#ifndef __cplusplus
#define NULL ((void *)0)
#else   /* C++ */
#define NULL 0
#endif  /* C++ */
#endif  /* G++ */

?

This is C++ friendly, C friendly and modern.  It should be very safe and should 
work just about everywhere.

> - it replaces
> (void *)0 with 0 under the assumption the former is invalid for C++ - 
> which is true - but 0 is inappropriate for C which is much worse.

A #define to 0 is, last I checked, valid for the C language.  You may not like 
it, but welcome to C.

> Thus, I propose to remove the fix altogether.

Breaking all systems that are broken isn't a good tradeoff.

Now, looking at the PR, in this case, one could add a bypass __GNUG__ to this 
fix, and avoid the change on OpenBSD.  This would also fix the problem.  I do 
not think removing the fix is a good idea.
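
To make the disagreement concrete, here is a small sketch comparing the two spellings in C. `NULL_AS_INT` and `NULL_AS_PTR` are illustrative names, not anything from the fixinclude. Both are valid null pointer constants, but the bare `0` has type int, which matters wherever an argument is passed without conversion (for example to a variadic function).

```c
#include <assert.h>
#include <stddef.h>

/* Two spellings a header might use for NULL; the names are illustrative.  */
#define NULL_AS_INT 0
#define NULL_AS_PTR ((void *) 0)

/* Both spellings convert to a null pointer when assigned to a pointer.  */
static int
both_are_null (void)
{
  char *p = NULL_AS_INT;
  char *q = NULL_AS_PTR;
  return p == q && p == NULL;
}

/* Their own types differ, though: int versus void *.  Variadic
   arguments are passed with that type, unconverted.  */
static int
same_size_as_pointer (size_t n)
{
  return n == sizeof (void *);
}
```

On targets where int is narrower than a pointer, `same_size_as_pointer (sizeof (NULL_AS_INT))` is false, which is the case the objection to a plain `0` is about.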


Re: [patch tree-optimization]: Do bitwise operator optimizations for X op !X patterns

2011-07-04 Thread Kai Tietz
Ok, reworked version.  The folding of X op X and !X op !X indeed seems
not to be necessary, so the function becomes much simpler.

Bootstrapped and regression-tested for all standard languages (plus
Ada and Obj-C++). OK to apply?

Regards,
Kai
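
The identities the patch relies on can be checked directly in plain C. This is an illustrative sketch, separate from the patch; the helper names are made up for the example.

```c
#include <assert.h>

/* Returns 1 if X & !X == 0 for every value in a small sample range.  */
static int
and_not_is_zero (void)
{
  for (int x = -4; x <= 4; x++)
    if ((x & !x) != 0)
      return 0;
  return 1;
}

/* X | !X and X ^ !X fold to 1 only when X is truth-valued (0 or 1).  */
static int
ior_not (int x)
{
  return x | !x;
}

static int
xor_not (int x)
{
  return x ^ !x;
}
```

For X = 2 both `ior_not` and `xor_not` yield 2 rather than 1, which is the (X != 0 ? X : 1) case that the patch leaves unhandled.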

Index: gcc-head/gcc/tree-ssa-forwprop.c
===
--- gcc-head.orig/gcc/tree-ssa-forwprop.c
+++ gcc-head/gcc/tree-ssa-forwprop.c
@@ -1602,6 +1602,129 @@ simplify_builtin_call (gimple_stmt_itera
   return false;
 }

+/* Checks if expression has type of one-bit precision, or is a known
+   truth-valued expression.  */
+static bool
+truth_valued_ssa_name (tree name)
+{
+  gimple def;
+  tree type = TREE_TYPE (name);
+
+  if (!INTEGRAL_TYPE_P (type))
+return false;
+  /* Don't check here for BOOLEAN_TYPE as the precision isn't
+ necessarily one and so ~X is not equal to !X.  */
+  if (TYPE_PRECISION (type) == 1)
+return true;
+  def = SSA_NAME_DEF_STMT (name);
+  if (is_gimple_assign (def))
+return truth_value_p (gimple_assign_rhs_code (def), type);
+  return false;
+}
+
+/* Helper routine for simplify_bitwise_binary_1 function.
+   Return for the SSA name NAME the expression X if it meets the condition
+   NAME = !X.  Otherwise return NULL_TREE.
+   Detected patterns for NAME = !X are:
+ !X and X == 0 for X with integral type.
+ X ^ 1, X != 1, or ~X for X with integral type with precision of one.  */
+static tree
+lookup_logical_inverted_value (tree name)
+{
+  tree op1, op2;
+  enum tree_code code;
+  gimple def;
+
+  /* If name has non-integral type, or isn't an SSA_NAME, then
+ return.  */
+  if (TREE_CODE (name) != SSA_NAME
+  || !INTEGRAL_TYPE_P (TREE_TYPE (name)))
+return NULL_TREE;
+  def = SSA_NAME_DEF_STMT (name);
+  if (!is_gimple_assign (def))
+return NULL_TREE;
+
+  code = gimple_assign_rhs_code (def);
+  op1 = gimple_assign_rhs1 (def);
+  op2 = NULL_TREE;
+
+  /* Get the second operand for an EQ_EXPR, NE_EXPR, or BIT_XOR_EXPR
+ operation.  Any code other than these, TRUTH_NOT_EXPR, or
+ BIT_NOT_EXPR falls through to the default case below.  */
+  if (code == EQ_EXPR || code == NE_EXPR
+  || code == BIT_XOR_EXPR)
+op2 = gimple_assign_rhs2 (def);
+
+  switch (code)
+{
+case TRUTH_NOT_EXPR:
+  return op1;
+case BIT_NOT_EXPR:
+  if (truth_valued_ssa_name (name))
+   return op1;
+  break;
+case EQ_EXPR:
+  /* Check if we have X == 0 and X has an integral type.  */
+  if (!INTEGRAL_TYPE_P (TREE_TYPE (op1)))
+   break;
+  if (integer_zerop (op2))
+   return op1;
+  break;
+case NE_EXPR:
+  /* Check if we have X != 1 and X is truth-valued.  */
+  if (!INTEGRAL_TYPE_P (TREE_TYPE (op1)))
+   break;
+  if (integer_onep (op2) && truth_valued_ssa_name (op1))
+   return op1;
+  break;
+case BIT_XOR_EXPR:
+  /* Check if we have X ^ 1 and X is truth valued.  */
+  if (integer_onep (op2) && truth_valued_ssa_name (op1))
+   return op1;
+  break;
+default:
+  break;
+}
+
+  return NULL_TREE;
+}
+
+/* Optimize ARG1 CODE ARG2 to a constant for bitwise binary
+   operations CODE, if one operand has the logically inverted
+   value of the other.  */
+static tree
+simplify_bitwise_binary_1 (enum tree_code code, tree type,
+  tree arg1, tree arg2)
+{
+  tree anot;
+
+  /* If CODE isn't a bitwise binary operation, return NULL_TREE.  */
+  if (code != BIT_AND_EXPR && code != BIT_IOR_EXPR
+  && code != BIT_XOR_EXPR)
+return NULL_TREE;
+
+  /* First check if operands ARG1 and ARG2 are equal.  If so
+ return NULL_TREE as this optimization is handled by fold_stmt.  */
+  if (arg1 == arg2)
+return NULL_TREE;
+  /* See if we have in arguments logical-not patterns.  */
+  if (((anot = lookup_logical_inverted_value (arg1)) == NULL_TREE
+   || anot != arg2)
+  && ((anot = lookup_logical_inverted_value (arg2)) == NULL_TREE
+ || anot != arg1))
+return NULL_TREE;
+
+  /* X & !X -> 0.  */
+  if (code == BIT_AND_EXPR)
+return fold_convert (type, integer_zero_node);
+  /* X | !X -> 1 and X ^ !X -> 1, if X is truth-valued.  */
+  if (truth_valued_ssa_name (anot))
+return fold_convert (type, integer_one_node);
+
+  /* ??? Otherwise result is (X != 0 ? X : 1).  not handled.  */
+  return NULL_TREE;
+}
+
 /* Simplify bitwise binary operations.
Return true if a transformation applied, otherwise return false.  */

@@ -1769,6 +1892,15 @@ simplify_bitwise_binary (gimple_stmt_ite
   return true;
 }

+  /* Try simple folding for X op !X.  */
+  res = simplify_bitwise_binary_1 (code, TREE_TYPE (arg1), arg1, arg2);
+  if (res != NULL_TREE)
+{
+  gimple_assign_set_rhs_from_tree (gsi, res);
+  update_stmt (gsi_stmt (*gsi));
+  return true;
+}
+
   return false;
 }

Index: gcc-head/gcc/testsuite/gcc.dg/binop-notand1a.c
===
--- /dev/null
+++ gcc-head/gcc/tes

Re: [testsuite, ada] Fix run_acats for shells without type -p

2011-07-04 Thread Rainer Orth
Arnaud Charlet  writes:

>> This patch fixes this by decoupling type/type -p from extracting the
>> last field.
>> 
>> Bootstrapped on i386-pc-solaris2.10 and i386-pc-solaris2.11.
>> 
>> Ok for mainline, 4.6 and 4.5 branches (where the offending patch has
>> been installed)?
>
> OK, but if this new patch introduces new regressions, please revert this
> change and the previous one, thanks.

I will.  The fragility of this stuff suggests that I should revisit and
finish my ACATS via DejaGnu patch ;-)

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [testsuite, ada] Fix run_acats for shells without type -p

2011-07-04 Thread Arnaud Charlet
> This patch fixes this by decoupling type/type -p from extracting the
> last field.
> 
> Bootstrapped on i386-pc-solaris2.10 and i386-pc-solaris2.11.
> 
> Ok for mainline, 4.6 and 4.5 branches (where the offending patch has
> been installed)?

OK, but if this new patch introduces new regressions, please revert this
change and the previous one, thanks.

Arno


Re: [PATCH] Fix an endless recursion during simplification of MULT (PR rtl-optimization/49472)

2011-07-04 Thread Eric Botcazou
> 2011-07-04  Jakub Jelinek  
>
>   PR rtl-optimization/49472
>   * simplify-rtx.c (simplify_unary_operation_1) : When
>   negating MULT, negate the second operand instead of first.
>   (simplify_binary_operation_1) : If one operand is
>   a NEG and the other is MULT, don't attempt to optimize by
>   negation of the MULT operand if it only moves the NEG operation
>   around.
>
>   * gfortran.dg/pr49472.f90: New test.

OK for mainline and 4.6 branch.

-- 
Eric Botcazou


[wwwdocs] Buildstat update for 4.6

2011-07-04 Thread Tom G. Christensen
Latest results for 4.6.x

-tgc

Testresults for 4.6.1:
  hppa2.0w-hp-hpux11.00
  hppa2.0w-hp-hpux11.11
  hppa64-hp-hpux11.11
  i386-pc-solaris2.10
  i686-pc-linux-gnu (2)
  sparc-sun-solaris2.8
  x86_64-unknown-linux-gnu

Testresults for 4.6.0:
  sparc-sun-solaris2.10
  x86_64-unknown-linux-gnu
Index: buildstat.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.6/buildstat.html,v
retrieving revision 1.4
diff -u -r1.4 buildstat.html
--- buildstat.html  4 Jun 2011 20:14:24 -   1.4
+++ buildstat.html  4 Jul 2011 18:38:15 -
@@ -42,9 +42,18 @@
 
 
 
+hppa2.0w-hp-hpux11.00
+ 
+Test results:
+http://gcc.gnu.org/ml/gcc-testresults/2011-07/msg00197.html";>4.6.1
+
+
+
+
 hppa2.0w-hp-hpux11.11
  
 Test results:
+http://gcc.gnu.org/ml/gcc-testresults/2011-06/msg03306.html";>4.6.1,
 http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02630.html";>4.6.0
 
 
@@ -53,6 +62,7 @@
 hppa64-hp-hpux11.11
  
 Test results:
+http://gcc.gnu.org/ml/gcc-testresults/2011-06/msg03440.html";>4.6.1,
 http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02801.html";>4.6.0
 
 
@@ -61,6 +71,7 @@
 i386-pc-solaris2.8
  
 Test results:
+http://gcc.gnu.org/ml/gcc-testresults/2011-07/msg00139.html";>4.6.1,
 http://gcc.gnu.org/ml/gcc-testresults/2011-04/msg00175.html";>4.6.0,
 http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg03106.html";>4.6.0,
 http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02960.html";>4.6.0,
@@ -84,6 +95,7 @@
 i386-pc-solaris2.10
  
 Test results:
+http://gcc.gnu.org/ml/gcc-testresults/2011-07/msg00327.html";>4.6.1,
 http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02738.html";>4.6.0,
 http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02705.html";>4.6.0,
 http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02704.html";>4.6.0,
@@ -106,6 +118,8 @@
 i686-pc-linux-gnu
  
 Test results:
+http://gcc.gnu.org/ml/gcc-testresults/2011-06/msg03400.html";>4.6.1,
+http://gcc.gnu.org/ml/gcc-testresults/2011-06/msg03128.html";>4.6.1,
 http://gcc.gnu.org/ml/gcc-testresults/2011-05/msg03610.html";>4.6.0,
 http://gcc.gnu.org/ml/gcc-testresults/2011-04/msg00440.html";>4.6.0,
 http://gcc.gnu.org/ml/gcc-testresults/2011-04/msg00064.html";>4.6.0,
@@ -154,6 +168,7 @@
 sparc-sun-solaris2.8
  
 Test results:
+http://gcc.gnu.org/ml/gcc-testresults/2011-07/msg00138.html";>4.6.1,
 http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02959.html";>4.6.0,
 http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02933.html";>4.6.0,
 http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02716.html";>4.6.0,
@@ -175,6 +190,7 @@
 sparc-sun-solaris2.10
  
 Test results:
+http://gcc.gnu.org/ml/gcc-testresults/2011-06/msg02863.html";>4.6.0,
 http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02835.html";>4.6.0,
 http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02725.html";>4.6.0,
 http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02714.html";>4.6.0
@@ -203,6 +219,8 @@
 x86_64-unknown-linux-gnu
  
 Test results:
+http://gcc.gnu.org/ml/gcc-testresults/2011-06/msg03135.html";>4.6.1,
+http://gcc.gnu.org/ml/gcc-testresults/2011-06/msg01380.html";>4.6.0,
 http://gcc.gnu.org/ml/gcc-testresults/2011-05/msg03091.html";>4.6.0,
 http://gcc.gnu.org/ml/gcc-testresults/2011-04/msg00445.html";>4.6.0,
 http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg03102.html";>4.6.0,


[PATCH] Fix ICE during combine (PR rtl-optimization/49619)

2011-07-04 Thread Jakub Jelinek
Hi!

The following testcase ICEs, because simplify_gen_binary (IOR, HImode, ...)
simplifies into (subreg:HI (reg:SI ...) 0), but the code still passes
mode (HImode) as the second argument to the recursive combine_simplify_rtx call.
The second argument is op0_mode, so it is supposed to be the real
mode which should be assumed for its first operand.
Passing mode in that case is only safe if simplify_gen_binary doesn't
actually simplify it, but as simplify_gen_binary would simplify constant
arguments anyway into a constant, it doesn't make any sense to hint
combine_simplify_rtx about the original op0_mode.  That is something
only useful when called from subst, which simplifies the operands (which may
turn them from non-VOIDmode into VOIDmode) and then calls
combine_simplify_rtx to simplify the whole operation.

The second part of the patch attempts to optimize more, as
simplify_gen_binary may already simplify the expression, so often (including
the testcase) combine_simplify_rtx doesn't simplify anything, i.e.
tor == temp, yet it is simplified over (ior plus_arg0 plus_arg1).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
For 4.6, I think safer would be just the first one liner change to pass
VOIDmode to combine_simplify_rtx.  Is that ok for 4.6?
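
For background, the PLUS -> IOR rewrite discussed here rests on a simple arithmetic fact: when two operands have no set bits in common, no carries occur, so addition and inclusive-or agree. The sketch below checks that fact by brute force; it is an illustration with made-up helper names, not code from combine.c.

```c
#include <assert.h>

/* Nonzero when A and B have no set bit in common.  */
static int
no_common_bits (unsigned a, unsigned b)
{
  return (a & b) == 0;
}

/* Checks a + b == (a | b) for all disjoint pairs below LIMIT.  */
static int
plus_equals_ior_when_disjoint (unsigned limit)
{
  for (unsigned a = 0; a < limit; a++)
    for (unsigned b = 0; b < limit; b++)
      if (no_common_bits (a, b) && a + b != (a | b))
        return 0;
  return 1;
}
```

The condition is necessary: 3 + 1 is 4, while 3 | 1 is 3, because the operands share bit 0.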

2011-07-04  Jakub Jelinek  

PR rtl-optimization/49619
* combine.c (combine_simplify_rtx): In PLUS -> IOR simplification
pass VOIDmode as op0_mode to recursive call, and return temp even
when different from tor, just if it is not IOR of the original
PLUS arguments.

* gcc.dg/pr49619.c: New test.

--- gcc/combine.c.jj   2011-06-21 16:46:01.0 +0200
+++ gcc/combine.c   2011-07-04 16:05:52.0 +0200
@@ -5681,12 +5681,17 @@ combine_simplify_rtx (rtx x, enum machin
{
  /* Try to simplify the expression further.  */
  rtx tor = simplify_gen_binary (IOR, mode, XEXP (x, 0), XEXP (x, 1));
- temp = combine_simplify_rtx (tor, mode, in_dest, 0);
+ temp = combine_simplify_rtx (tor, VOIDmode, in_dest, 0);
 
  /* If we could, great.  If not, do not go ahead with the IOR
 replacement, since PLUS appears in many special purpose
 address arithmetic instructions.  */
- if (GET_CODE (temp) != CLOBBER && temp != tor)
+ if (GET_CODE (temp) != CLOBBER
+ && (GET_CODE (temp) != IOR
+ || ((XEXP (temp, 0) != XEXP (x, 0)
+  || XEXP (temp, 1) != XEXP (x, 1))
+ && (XEXP (temp, 0) != XEXP (x, 1)
+ || XEXP (temp, 1) != XEXP (x, 0)))))
return temp;
}
   break;
--- gcc/testsuite/gcc.dg/pr49619.c.jj   2011-07-04 16:04:21.0 +0200
+++ gcc/testsuite/gcc.dg/pr49619.c  2011-07-04 16:04:06.0 +0200
@@ -0,0 +1,13 @@
+/* PR rtl-optimization/49619 */
+/* { dg-do compile } */
+/* { dg-options "-O -fno-tree-fre" } */
+
+extern int a, b;
+
+void
+foo (int x)
+{
+  a = 2;
+  b = 0;
+  b = (a && ((a = 1, 0 >= b) || (short) (x + (b & x))));
+}

Jakub


[testsuite, ada] Fix run_acats for shells without type -p

2011-07-04 Thread Rainer Orth
My last run_acats patch broke platforms where CONFIG_SHELL doesn't
support type -p (like Solaris < 11 /bin/ksh): when using awk to extract
the last output field, the exit code is from the last command in the
pipe (always 0), not type, so the which function returns an empty
string.

This patch fixes this by decoupling type/type -p from extracting the
last field.
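
The pitfall is easy to reproduce from C as well, since `system` hands the command to the shell. This is a sketch assuming a POSIX `sh` with `false` and `cat` available; `shell_status` is a hypothetical helper, not part of run_acats.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Runs CMD through the shell and returns its exit status, or -1 if the
   shell could not be run or the command did not exit normally.  */
static int
shell_status (const char *cmd)
{
  int rc = system (cmd);
  if (rc == -1 || !WIFEXITED (rc))
    return -1;
  return WEXITSTATUS (rc);
}
```

Here `shell_status ("false")` is nonzero, but `shell_status ("false | cat")` is 0: the pipeline reports the status of its last command, which is why the failure of type was masked by awk.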

Bootstrapped on i386-pc-solaris2.10 and i386-pc-solaris2.11.

Ok for mainline, 4.6 and 4.5 branches (where the offending patch has
been installed)?

Thanks.
Rainer


2011-07-01  Rainer Orth  

* ada/acats/run_acats (which): Extract last field from type -p,
type output only if command succeeded.

diff --git a/gcc/testsuite/ada/acats/run_acats b/gcc/testsuite/ada/acats/run_acats
--- a/gcc/testsuite/ada/acats/run_acats
+++ b/gcc/testsuite/ada/acats/run_acats
@@ -14,8 +14,8 @@ fi
 # Fall back to whence which ksh88 and ksh93 provide, but bash does not.
 
 which () {
-path=`type -p $* 2>/dev/null | awk '{print $NF}'` && { echo $path; return 0; }
-path=`type $* 2>/dev/null | awk '{print $NF}'` && { echo $path; return 0; }
+path=`type -p $* 2>/dev/null` && { echo $path | awk '{print $NF}'; return 0; }
+path=`type $* 2>/dev/null` && { echo $path | awk '{print $NF}'; return 0; }
 path=`whence $* 2>/dev/null` && { echo $path; return 0; }
 return 1
 }


-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[PATCH] Fix tree_could_trap_p so that weak var accesses are considered trapping (PR tree-optimization/49618)

2011-07-04 Thread Jakub Jelinek
Hi!

Before http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=168951
set_mem_attributes_minus_bitpos would set MEM_NOTRAP_P for decls
based on whether they are DECL_WEAK or not, but now it is set only
from !tree_could_trap_p.

These patches adjust tree_could_trap_p to say that references
to weak vars/functions may trap (for calls it was doing that already).

The first version of the patch is intended for 4.7 and treats that way
only those weak vars/functions that aren't known to be defined somewhere
(either in the current CU, or in the CUs included in an -flto build).
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

The second version is a simplified one which always treats DECL_WEAK
vars as maybe trapping.  Ok for 4.6?
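
As an illustration of why such accesses have to be considered trapping, here is a sketch of the usual guarded access to a weak variable. It assumes a GCC-compatible compiler and an ELF target, where an undefined weak symbol's address resolves to a null pointer; `toy_optional_counter` and `read_counter_or` are hypothetical names. If the load were treated as non-trapping, it could be moved above the address test and fault.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical symbol that no object file in this program defines.
   With GCC-style weak references on ELF, its address then resolves
   to a null pointer instead of causing a link error.  */
extern int toy_optional_counter __attribute__ ((weak));

/* Reads the weak variable only after checking that it exists.  */
static int
read_counter_or (int fallback)
{
  if (&toy_optional_counter != NULL)
    return toy_optional_counter;
  return fallback;
}
```

When nothing defines the symbol, `read_counter_or (42)` returns the fallback and the variable is never dereferenced.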

Jakub
2011-07-04  Jakub Jelinek  

PR tree-optimization/49618
* tree-eh.c (tree_could_trap_p) <case CALL_EXPR>: For DECL_WEAK
t recurse on the decl.
<case FUNCTION_DECL, case VAR_DECL>: For DECL_WEAK decls
return true if expr isn't known to be defined in current
TU or some other LTO partition.

--- gcc/tree-eh.c.jj   2011-06-17 11:02:19.0 +0200
+++ gcc/tree-eh.c   2011-07-04 14:27:01.0 +0200
@@ -2449,8 +2449,42 @@ tree_could_trap_p (tree expr)
 case CALL_EXPR:
   t = get_callee_fndecl (expr);
   /* Assume that calls to weak functions may trap.  */
-  if (!t || !DECL_P (t) || DECL_WEAK (t))
+  if (!t || !DECL_P (t))
return true;
+  if (DECL_WEAK (t))
+   return tree_could_trap_p (t);
+  return false;
+
+case FUNCTION_DECL:
+  /* Assume that accesses to weak functions may trap, unless we know
+they are certainly defined in current TU or in some other
+LTO partition.  */
+  if (DECL_WEAK (expr))
+   {
+ struct cgraph_node *node;
+ if (!DECL_EXTERNAL (expr))
+   return false;
+ node = cgraph_function_node (cgraph_get_node (expr), NULL);
+ if (node && node->in_other_partition)
+   return false;
+ return true;
+   }
+  return false;
+
+case VAR_DECL:
+  /* Assume that accesses to weak vars may trap, unless we know
+they are certainly defined in current TU or in some other
+LTO partition.  */
+  if (DECL_WEAK (expr))
+   {
+ struct varpool_node *node;
+ if (!DECL_EXTERNAL (expr))
+   return false;
+ node = varpool_variable_node (varpool_get_node (expr), NULL);
+ if (node && node->in_other_partition)
+   return false;
+ return true;
+   }
   return false;
 
 default:
2011-07-04  Jakub Jelinek  

PR tree-optimization/49618
* tree-eh.c (tree_could_trap_p) <case VAR_DECL, case FUNCTION_DECL>:
For DECL_WEAK decls return true.

--- gcc/tree-eh.c.jj   2011-05-11 17:01:05.0 +0200
+++ gcc/tree-eh.c   2011-07-04 14:32:54.0 +0200
@@ -2459,6 +2459,13 @@ tree_could_trap_p (tree expr)
return true;
   return false;
 
+case VAR_DECL:
+case FUNCTION_DECL:
+  /* Assume that accesses to weak vars or functions may trap.  */
+  if (DECL_WEAK (expr))
+return true;
+  return false;
+
 default:
   return false;
 }


CFT: Move unwinder to toplevel libgcc

2011-07-04 Thread Rainer Orth
"Joseph S. Myers"  writes:

> On Mon, 20 Jun 2011, Rainer Orth wrote:
>
>> * Move all remaining unwinder-only macros to libgcc: UNW_IVMS_MODE,
>>   MD_UNW_COMPATIBLE_PERSONALITY_P, MD_FROB_UPDATE_CONTEXT.
>
> I don't see any sign of macros being poisoned in system.h.  For macros 
> used in target-independent unwinder code - at least MD_FROB_UPDATE_CONTEXT 
> - that used to be defined in the host tm.h but now no longer should be, I 
> think poisoning in system.h is appropriate.

Done in the updated patch below.  Given that the other two are ia64 only
and not documented in md.texi, I don't think they need to be poisoned.

Otherwise, the patch is unchanged from the original submission:

[build] Move unwinder to toplevel libgcc
http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01452.html

Unfortunately, it hasn't seen much comment.  I'm now looking for testers,
especially on the platforms with the most changes, and for approval of those parts:

* Several IA-64 targets:

ia64*-*-linux*
ia64*-*-hpux*
ia64-hp-*vms*

* AIX:

rs6000-ibm-aix*

Thanks.
Rainer


2011-06-12  Rainer Orth  

gcc:
* Makefile.in (UNWIND_H): Remove.
(LIB2ADDEH, LIB2ADDEHSTATIC, LIB2ADDEHSHARED): Move to
../libgcc/Makefile.in.
(LIBUNWIND, SHLIBUNWIND_LINK, SHLIBUNWIND_INSTALL): Likewise.
(LIBUNWINDDEP): Remove.
(libgcc-support): Remove LIB2ADDEH, $(srcdir)/emutls.c dependencies.
(libgcc.mvars): Remove LIB2ADDEH, LIB2ADDEHSTATIC, LIB2ADDEHSHARED,
LIBUNWIND, SHLIBUNWIND_LINK, SHLIBUNWIND_INSTALL.
(stmp-int-hdrs): Remove $(UNWIND_H) dependency.
Don't copy $(UNWIND_H).
* config.gcc (ia64*-*-linux*): Remove with_system_libunwind
handling.
* configure.ac (GCC_CHECK_UNWIND_GETIPINFO): Remove.
* aclocal.m4: Regenerate.
* configure: Regenerate.
* emutls.c, unwind-c.c, unwind-compat.c, unwind-compat.h,
unwind-dw2-fde-compat.c, unwind-dw2-fde-darwin.c,
unwind-dw2-fde-glibc.c, unwind-dw2-fde.c, unwind-dw2-fde.h,
unwind-dw2.c, unwind-dw2.h, unwind-generic.h, unwind-pe.h,
unwind-sjlj.c, unwind.inc: Move to ../libgcc.
* config/arm/libunwind.S, config/arm/pr-support.c,
config/arm/unwind-arm.c, config/arm/unwind-arm.h: Move to
../libgcc/config/arm.
* config/arm/t-bpabi (UNWIND_H, LIB2ADDEH): Remove.
* config/arm/t-symbian (UNWIND_H, LIB2ADDEH): Remove.
* config/frv/t-frv ($(T)frvbegin$(objext)): Use
$(srcdir)/../libgcc to refer to unwind-dw2-fde.h.
($(T)frvend$(objext)): Likewise.
* config/ia64/t-glibc (LIB2ADDEH): Remove.
* config/ia64/t-glibc-libunwind: Move to ../libgcc/config/ia64.
* config/ia64/fde-glibc.c, config/ia64/fde-vms.c,
config/ia64/unwind-ia64.c, config/ia64/unwind-ia64.h: Move to
../libgcc/config/ia64.
* config/ia64/t-hpux (LIB2ADDEH): Remove.
* config/ia64/t-ia64 (LIB2ADDEH): Remove.
* config/ia64/t-vms (LIB2ADDEH): Remove.
* config/ia64/vms.h (UNW_IVMS_MODE,
MD_UNW_COMPATIBLE_PERSONALITY_P): Remove.
* config/picochip/t-picochip (LIB2ADDEH): Remove.
* config/rs6000/aix.h (R_LR, MD_FROB_UPDATE_CONTEXT): Remove.
* config/rs6000/t-darwin (LIB2ADDEH): Remove.
* config/rs6000/darwin-fallback.c: Move to ../libgcc/config/rs6000.
* config/sh/t-sh ($(T)unwind-dw2-Os-4-200.o): Use
$(srcdir)/../libgcc to refer to unwinder sources.
* config/spu/t-spu-elf (LIB2ADDEH): Remove.
* config/t-darwin (LIB2ADDEH): Remove.
* config/t-freebsd (LIB2ADDEH): Remove.
* config/t-libunwind (LIB2ADDEH, LIB2ADDEHSTATIC): Remove.
* config/t-linux (LIB2ADDEH): Remove.
* config/t-sol2 (LIB2ADDEH): Remove.
* config/xtensa/t-xtensa (LIB2ADDEH): Remove.
* system.h (MD_FROB_UPDATE_CONTEXT): Poison.

gcc/po:
* EXCLUDES (unwind-c.c, unwind-dw2-fde-darwin.c,
unwind-dw2-fde-glibc.c, unwind-dw2-fde.c, unwind-dw2-fde.h,
unwind-dw2.c, unwind-pe.h, unwind-sjlj.c, unwind.h): Remove.

libgcc:
* Makefile.in (LIB2ADDEH, LIB2ADDEHSTATIC, LIB2ADDEHSHARED): New
variables.
(LIBUNWIND, SHLIBUNWIND_LINK, SHLIBUNWIND_INSTALL): New variables.
(LIB2ADDEH, LIB2ADDEHSTATIC, LIB2ADDEHSHARED): Add $(srcdir)/emutls.c.
(install-unwind_h): New target.
(all): Depend on it.
* config.host (unwind_header): New variable.
(*-*-freebsd*): Set tmake_file to t-eh-dw2-dip.
(*-*-linux*, frv-*-*linux*, *-*-kfreebsd*-gnu, *-*-knetbsd*-gnu,
*-*-gnu*): Likewise, also for *-*-kopensolaris*-gnu.
(*-*-solaris2*): Add t-eh-dw2-dip to tmake_file.
(arm*-*-linux-*eabi, arm*-*-uclinux*eabi, arm*-*-eabi*): Add
arm/t-bpabi to tmake_file.
Set unwind_header.
(arm*-*-symbianelf*): Add arm/t-symbian to tmake_file.

Re: Ping #1: [testsuite, AVR]: Add some progmem test cases

2011-07-04 Thread Mike Stump
On Jul 4, 2011, at 4:07 AM, Denis Chertykov  wrote:
>> 
>> testsuite/

>>* gcc.target/avr/torture/progmem-1.cpp: New file.
> 
> I don't know who must approve tests.
> If me then Approved

You!   If there are ugly details more related to the test suite framework, feel 
free to kick it up.


Re: [testsuite, AVR]: Add some progmem test cases

2011-07-04 Thread Mike Stump
On Jun 30, 2011, at 10:38 AM, Georg-Johann Lay  wrote:
> Is
>  ./testsuite/gcc.target/avr/
> realm of avr port maintainers?

I'm fine with the avr people reviewing and approving all they think is ready
for the tree.  If they go out into the weeds, we can rein them in, though I'm
sure that would never happen.  If a port is lacking in review bandwidth, I
might fire up, but I don't think avr fits that description.
> 


[PATCH] Fix an endless recursion during simplification of MULT (PR rtl-optimization/49472)

2011-07-04 Thread Jakub Jelinek
Hi!

On the attached testcase simplify-rtx.c was endlessly oscillating when
trying to simplify a complex debug insn location.  The first
hunk changes oscillation between 3 possible expressions into oscillation
between 2 possible expressions, by preferring to change second argument
instead of first, because swap_commutative_operands_p prefers to put
NEG to the second argument instead of first.

The second hunk fixes the oscillation by not trying to optimize
if we just move the NEG around.  Otherwise, on
(mult (mult (reg A) (reg B)) (neg (reg B)))
those hunks try to move the neg to the first argument to see if it would
simplify things.  That becomes then (mult (mult (reg A) (neg (reg B))) (reg B))
and as MULT is associative and swap_commutative_operands_p prefers to
put NEG last, it optimizes it again into the original form and back
endlessly.  The patch still tries to simplify the negation of the other
argument, but if the other argument is also MULT and it didn't really
simplify it, just moved the negation around, it will stop.
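
A toy model of the oscillation described above, as a minimal sketch only.
It does not use any simplify-rtx.c code: the names neg_pos, push_neg_inward,
canonicalize and simplify_steps are hypothetical, and the "guard" here
rejects any rewrite that merely relocates the NEG, which is a simplification
of the shape check in the patch.

```c
#include <stdbool.h>

/* Where the single NEG sits in a product shaped like
   (mult (mult A B) B): on the outer operand or inside the inner MULT.  */
enum neg_pos { NEG_OUTER, NEG_INNER };

/* Rule 1 (stand-in for the MULT simplification): push an outer NEG
   into the other operand, which is itself a MULT.  */
static enum neg_pos
push_neg_inward (enum neg_pos p)
{
  return p == NEG_OUTER ? NEG_INNER : p;
}

/* Rule 2 (stand-in for associative canonicalization): prefer the NEG
   on the last operand, so pull it back out.  */
static enum neg_pos
canonicalize (enum neg_pos p)
{
  return p == NEG_INNER ? NEG_OUTER : p;
}

/* Apply both rules repeatedly.  Return the number of rule applications
   performed before stopping, or -1 if LIMIT was reached without
   reaching a stable form.  */
int
simplify_steps (bool guard, int limit)
{
  enum neg_pos p = NEG_OUTER;
  for (int steps = 0; steps < limit; steps += 2)
    {
      enum neg_pos q = push_neg_inward (p);
      /* Guard: if rule 1 only moved the NEG around, drop its result
         and stop, instead of handing it to rule 2.  */
      if (guard && q != p)
        return steps;
      p = canonicalize (q);
      if (p == q)
        return steps;
    }
  return -1;
}
```

Without the guard the two rules undo each other until the step limit is hit;
with the guard the first pure relocation is discarded and the loop ends at
once.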

Bootstrapped/regtested on x86_64-linux and i686-linux.  Ok for trunk?
The bug is latent on 4.6 branch, ok for branch as well?

2011-07-04  Jakub Jelinek  

PR rtl-optimization/49472
* simplify-rtx.c (simplify_unary_operation_1) : When
negating MULT, negate the second operand instead of first.
(simplify_binary_operation_1) : If one operand is
a NEG and the other is MULT, don't attempt to optimize by
negation of the MULT operand if it only moves the NEG operation
around.

* gfortran.dg/pr49472.f90: New test.

--- gcc/simplify-rtx.c.jj   2011-06-21 16:46:01.0 +0200
+++ gcc/simplify-rtx.c  2011-07-04 12:14:51.0 +0200
@@ -686,13 +686,13 @@ simplify_unary_operation_1 (enum rtx_cod
  return simplify_gen_binary (MINUS, mode, temp, XEXP (op, 1));
}
 
-  /* (neg (mult A B)) becomes (mult (neg A) B).
+  /* (neg (mult A B)) becomes (mult A (neg B)).
 This works even for floating-point values.  */
   if (GET_CODE (op) == MULT
  && !HONOR_SIGN_DEPENDENT_ROUNDING (mode))
{
- temp = simplify_gen_unary (NEG, mode, XEXP (op, 0), mode);
- return simplify_gen_binary (MULT, mode, temp, XEXP (op, 1));
+ temp = simplify_gen_unary (NEG, mode, XEXP (op, 1), mode);
+ return simplify_gen_binary (MULT, mode, XEXP (op, 0), temp);
}
 
   /* NEG commutes with ASHIFT since it is multiplication.  Only do
@@ -2271,12 +2271,34 @@ simplify_binary_operation_1 (enum rtx_co
   if (GET_CODE (op0) == NEG)
{
  rtx temp = simplify_unary_operation (NEG, mode, op1, mode);
+ /* If op1 is a MULT as well and simplify_unary_operation
+just moved the NEG to the second operand, simplify_gen_binary
+below could through simplify_associative_operation move
+the NEG around again and recurse endlessly.  */
+ if (temp
+ && GET_CODE (op1) == MULT
+ && GET_CODE (temp) == MULT
+ && XEXP (op1, 0) == XEXP (temp, 0)
+ && GET_CODE (XEXP (temp, 1)) == NEG
+ && XEXP (op1, 1) == XEXP (XEXP (temp, 1), 0))
+   temp = NULL_RTX;
  if (temp)
return simplify_gen_binary (MULT, mode, XEXP (op0, 0), temp);
}
   if (GET_CODE (op1) == NEG)
{
  rtx temp = simplify_unary_operation (NEG, mode, op0, mode);
+ /* If op0 is a MULT as well and simplify_unary_operation
+just moved the NEG to the second operand, simplify_gen_binary
+below could through simplify_associative_operation move
+the NEG around again and recurse endlessly.  */
+ if (temp
+ && GET_CODE (op0) == MULT
+ && GET_CODE (temp) == MULT
+ && XEXP (op0, 0) == XEXP (temp, 0)
+ && GET_CODE (XEXP (temp, 1)) == NEG
+ && XEXP (op0, 1) == XEXP (XEXP (temp, 1), 0))
+   temp = NULL_RTX;
  if (temp)
return simplify_gen_binary (MULT, mode, temp, XEXP (op1, 0));
}
--- gcc/testsuite/gfortran.dg/pr49472.f90.jj   2011-07-04 12:23:12.0 +0200
+++ gcc/testsuite/gfortran.dg/pr49472.f90   2011-07-04 12:22:53.0 +0200
@@ -0,0 +1,15 @@
+! PR rtl-optimization/49472
+! { dg-do compile }
+! { dg-options "-O -fcompare-debug -ffast-math" }
+subroutine pr49472
+  integer, parameter :: n = 3
+  real(8) :: a, b, c, d, e (n+1)
+  integer :: i
+  do i=2, (n+1)
+b = 1. / ((i - 1.5d0) * 1.)
+c = b * a
+d = -b * c / (1. + b * b) ** 1.5d0
+e(i) = d
+  end do
+  call dummy (e)
+end subroutine

Jakub


[PATCH] Fix dead_debug_insert_before ICE (PR debug/49522)

2011-07-04 Thread Jakub Jelinek
Hi!

In dead_debug_* we don't immediately rescan insns, because that kills all
the df links we need to use; we only queue them for rescanning.

There are two kinds of changes we do on the debug insns without immediate
rescanning:
1) reset the debug insn
2) replace a reg use with DEBUG_EXPR of the same mode or
   subreg of a larger DEBUG_EXPR with the same outer mode as the reg

In the attached testcase on arm a debug insn is reset, because a multi-reg
register has been used there and as the debug insn location was that
multi-reg register before, it is now VOIDmode after the reset - (clobber
(const_int 0)).  Fixed by disregarding the reset debug insns.  Changes
of kind 2) that needed rescanning don't need this, as the mode doesn't
change in that case.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/4.6?

2011-07-04  Jakub Jelinek  

PR debug/49522
* df-problems.c (dead_debug_insert_before): Ignore uses
where the use debug insn has been reset.

* gcc.dg/debug/pr49522.c: New test.

--- gcc/df-problems.c.jj2011-06-17 11:02:19.0 +0200
+++ gcc/df-problems.c   2011-07-04 10:46:42.0 +0200
@@ -3148,6 +3148,7 @@ dead_debug_insert_before (struct dead_de
   struct dead_debug_use *cur;
   struct dead_debug_use *uses = NULL;
   struct dead_debug_use **usesp = &uses;
+  bool no_reg_ok = false;
   rtx reg = NULL;
   rtx dval;
   rtx bind;
@@ -3161,6 +3162,21 @@ dead_debug_insert_before (struct dead_de
 {
   if (DF_REF_REGNO (cur->use) == uregno)
{
+ /* If cur->use insn has been meanwhile reset, but hasn't been
+rescanned, just ignore that use.  */
+ if (DF_REF_REAL_LOC (cur->use)
+ == &INSN_VAR_LOCATION_LOC (DF_REF_INSN (cur->use))
+ && VAR_LOC_UNKNOWN_P (*DF_REF_REAL_LOC (cur->use)))
+   {
+ gcc_assert (debug->to_rescan != NULL
+ && bitmap_bit_p (debug->to_rescan,
+  INSN_UID (DF_REF_INSN (cur->use))));
+ *tailp = cur->next;
+ XDELETE (cur);
+ if (!reg)
+   no_reg_ok = true;
+ continue;
+   }
  *usesp = cur;
  usesp = &cur->next;
  *tailp = cur->next;
@@ -3174,6 +3190,9 @@ dead_debug_insert_before (struct dead_de
tailp = &(*tailp)->next;
 }
 
+  if (no_reg_ok && !reg)
+return;
+
   gcc_assert (reg);
 
   /* Create DEBUG_EXPR (and DEBUG_EXPR_DECL).  */
--- gcc/testsuite/gcc.dg/debug/pr49522.c.jj   2011-07-04 10:54:23.0 +0200
+++ gcc/testsuite/gcc.dg/debug/pr49522.c   2011-07-04 10:54:02.0 +0200
@@ -0,0 +1,41 @@
+/* PR debug/49522 */
+/* { dg-do compile } */
+/* { dg-options "-fcompare-debug" } */
+
+int val1 = 0L;
+volatile int val2 = 7L;
+long long val3;
+int *ptr = &val1;
+
+static int
+func1 ()
+{
+  return 0;
+}
+
+static short int
+func2 (short int a, unsigned int b)
+{
+  return !b ? a : a >> b;
+}
+
+static unsigned long long
+func3 (unsigned long long a, unsigned long long b)
+{
+  return !b ? a : a % b;
+}
+
+void
+func4 (unsigned short arg1, int arg2)
+{
+  for (arg2 = 0; arg2 < 2; arg2++)
+{
+  *ptr = func3 (func3 (10, func2 (val3, val2)), val3);
+  for (arg1 = -14; arg1 > 14; arg1 = func1 ())
+   {
+ *ptr = -1;
+ if (foo ())
+   ;
+   }
+}
+}

Jakub


[PATCH] Fix ICE with gfortran ... -L without argument (PR fortran/49623)

2011-07-04 Thread Jakub Jelinek
Hi!

If -L doesn't have an argument, find_spec_file ICEs on it, as
the argument is NULL.  As suggested by Joseph, the patch makes this
loop skip all options that lack their required argument.
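
The idea can be sketched outside the driver as follows.  This is an
illustration under assumptions, not gfortranspec.c code: struct option,
ERR_MISSING_ARG and total_L_arg_length are hypothetical stand-ins for the
decoded-option array, CL_ERR_MISSING_ARG and the real loop body.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the CL_ERR_MISSING_ARG error bit.  */
#define ERR_MISSING_ARG 1u

/* Hypothetical, much reduced version of a decoded option.  */
struct option
{
  const char *name;   /* e.g. "-L" */
  const char *arg;    /* NULL when the argument is missing */
  unsigned errors;    /* error bits recorded while decoding */
};

/* Sum the lengths of all -L arguments.  Options flagged as missing
   their argument are skipped before ARG is ever dereferenced.  */
size_t
total_L_arg_length (const struct option *opts, size_t n)
{
  size_t total = 0;
  for (size_t i = 0; i < n; ++i)
    {
      if (opts[i].errors & ERR_MISSING_ARG)
        continue;
      if (strcmp (opts[i].name, "-L") == 0)
        total += strlen (opts[i].arg);
    }
  return total;
}
```

Checking the error bit first means the loop body never has to test the
argument for NULL itself.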

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk/4.6?

2011-07-04  Jakub Jelinek  

PR fortran/49623
* gfortranspec.c (lang_specific_driver): Ignore options with
CL_ERR_MISSING_ARG errors.

--- gcc/fortran/gfortranspec.c.jj   2011-07-04 14:58:56.0 +0200
+++ gcc/fortran/gfortranspec.c  2011-07-04 15:01:58.0 +0200
@@ -255,6 +255,9 @@ lang_specific_driver (struct cl_decoded_
 
   for (i = 1; i < argc; ++i)
 {
+  if (decoded_options[i].errors & CL_ERR_MISSING_ARG)
+   continue;
+
   switch (decoded_options[i].opt_index)
{
case OPT_SPECIAL_input_file:

Jakub


Re: [pph] Fix global variable assembly ordering (issue4627087)

2011-07-04 Thread Diego Novillo
On Fri, Jul 1, 2011 at 21:35, Gabriel Charette  wrote:
> As variables are discovered (while parsing the header) they are added to the 
> varpool and their RTL is built.
>
> We do not stream, nor the varpool, nor the RTL (and I don't think we want to 
> + that wouldn't
> work with multiple pph).

Right.  Additionally, saving RTL makes the PPH target-dependent.  We
don't want that.

>
> We want to rebuild the varpool when streaming the global variables of the pph 
> in so as to
> redefine them in the varpool in the same order they would have been found in 
> a regular
> #include style parse.

Right.

> I'm not sure whether "global variables, not externals" is specific enough or 
> too broad (I can't reuse the caller
> of varpool_finalize_decl (rest_of_decl_compilation) to take care of this 
> logic because it needs some parser
> state which we no longer have). I will create more tests next week with 
> different orderings for functions,
> structs, etc. coming in from the pph.

Hm, I think we actually want to call rest_of_decl_compilation here.
This is also used from the LTO front end when reconstructing
variables.  Your patch is in the right direction, though, so I've
applied it for now.


Diego.


Re: [PATCH] Address lowering [1/3] Main patch

2011-07-04 Thread Michael Matz
Hi,

On Mon, 4 Jul 2011, Richard Guenther wrote:

> I still do not like the implementation of yet another CSE machinery
> given that we already have two.

From reading it, it really seems to be a normal block-local CSE, without 
anything fancy.  Hence, moving the pass just a little earlier (before 
pass_vrp/pass_dominator) should already provide for all optimizations.  If 
not those should be improved.

I see that it is used for also getting rid of the zero-offset statements 
in case non-zero-offsets follow.  I think that's generally worthwhile so 
probably should be done in one of the above optimizers.

You handle NOP_EXPR different from CONVERT_EXPR.  The middle-end doesn't 
distinguish between them (yes, ignore the comment about one generating 
code, the other not).

Your check for small types:

+ if (TYPE_MODE (TREE_TYPE (TREE_OPERAND (expr, 0))) == SImode)
+   ref_found = true;

You probably want != BLKmode .

+  if (changed && is_zero_offset_ref (gimple_assign_lhs (stmt)))
+VEC_safe_push (gimple, heap, zero_offset_refs, stmt);
+
+  rhs1 = gimple_assign_rhs1_ptr (stmt);
+  rhs_changed = tree_ssa_lower_addr_tree (rhs1, gsi, speed, false);
+
+  /* Record zero-offset mem_refs on the RHS.  */
+  if (rhs_changed && is_zero_offset_ref (gimple_assign_rhs1 (stmt)))
+VEC_safe_push (gimple, heap, zero_offset_refs, stmt);

This possibly adds stmt twice to zero_offset_refs.  Do you really want 
this?


Ciao,
Michael.


Re: [1/11] Use targetm.shift_truncation_mask more consistently

2011-07-04 Thread Richard Henderson
On 07/01/2011 10:27 AM, Bernd Schmidt wrote:
>   * simplify-rtx.c (simplify_const_binary_operation): Use the
>   shift_truncation_mask hook instead of performing modulo by
>   width.  Compare against mode precision, not bitsize.
>   * combine.c (combine_simplify_rtx, simplify_shift_const_1):
>   Use shift_truncation_mask instead of constructing the value
>   manually.

Ok.

r~


Re: [PATCH] Fix bootstrap on OpenBSD, PR48851

2011-07-04 Thread David Edelsohn
On Mon, Jul 4, 2011 at 8:51 AM, Richard Guenther  wrote:
> On Mon, 4 Jul 2011, Bruce Korb wrote:
>
>> Hi Richard,
>>
>> On Mon, Jul 4, 2011 at 4:04 AM, Richard Guenther  wrote:
>> >
>> > It happens that OpenBSD suffers from a bogus fixinclude that changes
>> > its perfectly valid NULL define from (void *)0 to 0.  The fix itself
>> > appears to be very old and is completely bogus - it replaces
>> > (void *)0 with 0 under the assumption the former is invalid for C++ -
>> > which is true - but 0 is inappropriate for C which is much worse.
>> >
>> > Thus, I propose to remove the fix altogether.  Platform maintainers
>> > can arrange for a new fix if the platforms still need fixing (which
>> > I seriously doubt after so many years and platform obsoletion).
>> >
>> > This restores bootstrap on OpenBSD.
>> >
>> > Ok for trunk and active branches?
>>
>> Sounds completely reasonable to me, but I think the platform maintainers
>> do need to say, "okay".  Cheers - Bruce
>
> We do not have an Interix maintainer listed, that leaves David for AIX.
> David, is this ok?  If not, can you please work on a better more
> specific fixinclude wrapping the C++ variant inside __GNUG__?

Okay with me.

Thanks, David



Re: [PATCH (6/7)] More widening multiply-and-accumulate pattern matching

2011-07-04 Thread Andrew Stubbs

On 28/06/11 16:30, Andrew Stubbs wrote:

On 23/06/11 15:42, Andrew Stubbs wrote:

This patch fixes the case where widening multiply-and-accumulate operations
were not recognised because the multiplication itself is not actually
widening.

This can happen when you have "DI + SI * SI" - the multiplication will
be done in SImode as a non-widening multiply, and it's only the final
accumulate step that is widening.

This was not recognised for two reasons:

1. is_widening_mult_p inferred the output type from the multiply
statement, which is not useful in this case.

2. The inputs to the multiply instruction may not have been converted at
all (because they're not being widened), so the pattern match failed.

The patch fixes these issues by making the output type explicit, and by
permitting unconverted inputs (the types are still checked, so this is
safe).
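
For readers who want to see the "DI + SI * SI" shape in source form, here is
a small sketch.  The function names madd_narrow and madd_wide are made up
for illustration; madd_narrow has the same shape as the wmul-8.c testcase,
and the two functions agree whenever the 32-bit product does not overflow.

```c
#include <stdint.h>

/* 64-bit accumulate of a product computed in 32 bits: only the final
   addition widens.  */
int64_t
madd_narrow (int64_t acc, int32_t b, int32_t c)
{
  return acc + b * c;
}

/* The same operation with the multiplication explicitly widened, which
   is the form a widening multiply-and-accumulate computes.  */
int64_t
madd_wide (int64_t acc, int32_t b, int32_t c)
{
  return acc + (int64_t) b * (int64_t) c;
}
```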

OK?


This update fixes Janis' testsuite issue.


This updates the context changed by my update to patch 3.

The content of this patch has not changed.

Andrew
2011-07-04  Andrew Stubbs  

	gcc/
	* tree-ssa-math-opts.c (is_widening_mult_rhs_p): Add new argument
	'type'.
	Use 'type' from caller, not inferred from 'rhs'.
	Don't reject non-conversion statements. Do return lhs in this case.
	(is_widening_mult_p): Add new argument 'type'.
	Use 'type' from caller, not inferred from 'stmt'.
	Pass type to is_widening_mult_rhs_p.
	(convert_mult_to_widen): Pass type to is_widening_mult_p.
	(convert_plusminus_to_widen): Likewise.

	gcc/testsuite/
	* gcc.target/arm/wmul-8.c: New file.

--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-8.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, int *b, int *c)
+{
+  return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1963,7 +1963,8 @@ struct gimple_opt_pass pass_optimize_bswap =
  }
 };
 
-/* Return true if RHS is a suitable operand for a widening multiplication.
+/* Return true if RHS is a suitable operand for a widening multiplication,
+   assuming a target type of TYPE.
There are two cases:
 
  - RHS makes some value at least twice as wide.  Store that value
@@ -1973,32 +1974,32 @@ struct gimple_opt_pass pass_optimize_bswap =
but leave *TYPE_OUT untouched.  */
 
 static bool
-is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out)
+is_widening_mult_rhs_p (tree type, tree rhs, tree *type_out,
+			tree *new_rhs_out)
 {
   gimple stmt;
-  tree type, type1, rhs1;
+  tree type1, rhs1;
   enum tree_code rhs_code;
 
   if (TREE_CODE (rhs) == SSA_NAME)
 {
-  type = TREE_TYPE (rhs);
   stmt = SSA_NAME_DEF_STMT (rhs);
   if (!is_gimple_assign (stmt))
 	return false;
 
-  rhs_code = gimple_assign_rhs_code (stmt);
-  if (TREE_CODE (type) == INTEGER_TYPE
-	  ? !CONVERT_EXPR_CODE_P (rhs_code)
-	  : rhs_code != FIXED_CONVERT_EXPR)
-	return false;
-
   rhs1 = gimple_assign_rhs1 (stmt);
   type1 = TREE_TYPE (rhs1);
   if (TREE_CODE (type1) != TREE_CODE (type)
 	  || TYPE_PRECISION (type1) * 2 > TYPE_PRECISION (type))
 	return false;
 
-  *new_rhs_out = rhs1;
+  rhs_code = gimple_assign_rhs_code (stmt);
+  if (TREE_CODE (type) == INTEGER_TYPE
+	  ? !CONVERT_EXPR_CODE_P (rhs_code)
+	  : rhs_code != FIXED_CONVERT_EXPR)
+	*new_rhs_out = gimple_assign_lhs (stmt);
+  else
+	*new_rhs_out = rhs1;
   *type_out = type1;
   return true;
 }
@@ -2013,28 +2014,27 @@ is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out)
   return false;
 }
 
-/* Return true if STMT performs a widening multiplication.  If so,
-   store the unwidened types of the operands in *TYPE1_OUT and *TYPE2_OUT
-   respectively.  Also fill *RHS1_OUT and *RHS2_OUT such that converting
-   those operands to types *TYPE1_OUT and *TYPE2_OUT would give the
-   operands of the multiplication.  */
+/* Return true if STMT performs a widening multiplication, assuming the
+   output type is TYPE.  If so, store the unwidened types of the operands
+   in *TYPE1_OUT and *TYPE2_OUT respectively.  Also fill *RHS1_OUT and
+   *RHS2_OUT such that converting those operands to types *TYPE1_OUT
+   and *TYPE2_OUT would give the operands of the multiplication.  */
 
 static bool
-is_widening_mult_p (gimple stmt,
+is_widening_mult_p (tree type, gimple stmt,
 		tree *type1_out, tree *rhs1_out,
 		tree *type2_out, tree *rhs2_out)
 {
-  tree type;
-
-  type = TREE_TYPE (gimple_assign_lhs (stmt));
   if (TREE_CODE (type) != INTEGER_TYPE
   && TREE_CODE (type) != FIXED_POINT_TYPE)
 return false;
 
-  if (!is_widening_mult_rhs_p (gimple_assign_rhs1 (stmt), type1_out, rhs1_out))
+  if (!is_widening_mult_rhs_p (type, gimple_assign_rhs1 (stmt), type1_out,
+			   rhs1_out))
 return false;
 
-  if (!is_widening_mult_rhs_p (gimple_assign_rhs2 (stmt), type2_out, rhs2_out))
+  if (!is_widening_mult_rhs_p (type, gimple_assign_rhs2 (stmt), type2_out,

Re: [PATCH (5/7)] Widening multiplies for mis-matched mode inputs

2011-07-04 Thread Andrew Stubbs

On 28/06/11 16:08, Andrew Stubbs wrote:

On 23/06/11 15:41, Andrew Stubbs wrote:

This patch removes the restriction that the inputs to a widening
multiply must be of the same mode.

It does this by extending the smaller of the two inputs to match the
larger; therefore, it remains the case that subsequent code (in the
expand pass, for example) can rely on the type of rhs1 being the input
type of the operation, and the gimple verification code is still valid.
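
As a rough source-level picture of that extension (a sketch only; the
function names are hypothetical and nothing here is taken from
tree-ssa-math-opts.c): an 8-bit input is first extended to the 16-bit width
of the other input, and the 16x16->32 multiply then gives the same value as
multiplying the mismatched inputs directly.

```c
#include <stdint.h>

/* Reference: multiply the mismatched inputs directly in 32 bits.  */
uint32_t
mul_mixed_direct (uint8_t b, uint16_t c)
{
  return (uint32_t) b * (uint32_t) c;
}

/* Extend the smaller input to the width of the larger one first, so
   both operands of the widening multiply have the same 16-bit type.  */
uint32_t
mul_mixed_extended (uint8_t b, uint16_t c)
{
  uint16_t b_ext = b;   /* zero-extension, value is unchanged */
  return (uint32_t) b_ext * (uint32_t) c;
}
```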

OK?


This update fixes the testcase issue Janis highlighted.


And this one updates the context changed by my update to patch 3.

The content of the patch has not changed.

Andrew
2011-06-28  Andrew Stubbs  

	gcc/
	* tree-ssa-math-opts.c (is_widening_mult_p): Remove FIXME.
	Ensure the larger type is the first operand.
	(convert_mult_to_widen): Insert cast if type2 is smaller than type1.
	(convert_plusminus_to_widen): Likewise.

	gcc/testsuite/
	* gcc.target/arm/wmul-7.c: New file.

--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-7.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+unsigned long long
+foo (unsigned long long a, unsigned char *b, unsigned short *c)
+{
+  return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "umlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2051,9 +2051,17 @@ is_widening_mult_p (gimple stmt,
   *type2_out = *type1_out;
 }
 
-  /* FIXME: remove this restriction.  */
-  if (TYPE_PRECISION (*type1_out) != TYPE_PRECISION (*type2_out))
-return false;
+  /* Ensure that the larger of the two operands comes first. */
+  if (TYPE_PRECISION (*type1_out) < TYPE_PRECISION (*type2_out))
+{
+  tree tmp;
+  tmp = *type1_out;
+  *type1_out = *type2_out;
+  *type2_out = tmp;
+  tmp = *rhs1_out;
+  *rhs1_out = *rhs2_out;
+  *rhs2_out = tmp;
+}
 
   return true;
 }
@@ -2069,6 +2077,7 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
   enum insn_code handler;
   enum machine_mode to_mode, from_mode;
   optab op;
+  int cast1 = false, cast2 = false;
 
   lhs = gimple_assign_lhs (stmt);
   type = TREE_TYPE (lhs);
@@ -2107,16 +2116,26 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
 	return false;
 
 	  type1 = type2 = lang_hooks.types.type_for_mode (from_mode, 0);
-
-	  rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
-	create_tmp_var (type1, NULL), rhs1, type1);
-	  rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
-	create_tmp_var (type2, NULL), rhs2, type2);
+	  cast1 = cast2 = true;
 	}
   else
 	return false;
 }
 
+  if (TYPE_MODE (type2) != from_mode)
+{
+  type2 = lang_hooks.types.type_for_mode (from_mode,
+	  TYPE_UNSIGNED (type2));
+  cast2 = true;
+}
+
+  if (cast1)
+rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
+  create_tmp_var (type1, NULL), rhs1, type1);
+  if (cast2)
+rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
+  create_tmp_var (type2, NULL), rhs2, type2);
+
   gimple_assign_set_rhs1 (stmt, fold_convert (type1, rhs1));
   gimple_assign_set_rhs2 (stmt, fold_convert (type2, rhs2));
   gimple_assign_set_rhs_code (stmt, WIDEN_MULT_EXPR);
@@ -2215,6 +2234,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
   optab this_optab;
   enum tree_code wmult_code;
   enum insn_code handler;
+  int cast1 = false, cast2 = false;
 
   lhs = gimple_assign_lhs (stmt);
   type = TREE_TYPE (lhs);
@@ -2302,17 +2322,28 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
   if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (TYPE_MODE (type)))
 	{
 	  type1 = type2 = lang_hooks.types.type_for_mode (mode, 0);
-	  mult_rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
-	 create_tmp_var (type1, NULL),
-	 mult_rhs1, type1);
-	  mult_rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
-	 create_tmp_var (type2, NULL),
-	 mult_rhs2, type2);
+	  cast1 = cast2 = true;
 	}
   else
 	return false;
 }
 
+  if (TYPE_MODE (type2) != TYPE_MODE (type1))
+{
+  type2 = lang_hooks.types.type_for_mode (TYPE_MODE (type1),
+	  TYPE_UNSIGNED (type2));
+  cast2 = true;
+}
+
+  if (cast1)
+mult_rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
+   create_tmp_var (type1, NULL),
+   mult_rhs1, type1);
+  if (cast2)
+mult_rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
+   create_tmp_var (type2, NULL),
+   mult_rhs2, type2);
+
   /* Verify that the convertions between the mult and the add doesn't do
  anything unexpected.  */
   if (!valid_types_for_madd_p (type1, type2, mult_rhs))


Re: [PATCH (4/7)] Unsigned multiplies using wider signed multiplies

2011-07-04 Thread Andrew Stubbs

On 28/06/11 15:14, Andrew Stubbs wrote:

On 28/06/11 13:33, Andrew Stubbs wrote:

On 23/06/11 15:41, Andrew Stubbs wrote:

If one or both of the inputs to a widening multiply are of unsigned type
then the compiler will attempt to use usmul_widen_optab or
umul_widen_optab, respectively.

That works fine, but only if the target supports those operations
directly. Otherwise, it just bombs out and reverts to the normal
inefficient non-widening multiply.

This patch attempts to catch these cases and use an alternative signed
widening multiply instruction, if one of those is available.

I believe this should be legal as long as the top bit of both inputs is
guaranteed to be zero. The code achieves this guarantee by
zero-extending the inputs to a wider mode (which must still be narrower
than the output mode).
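
The arithmetic behind that claim can be checked exhaustively for small
widths.  The following is a minimal sketch with made-up function names,
using 8-bit unsigned inputs, a 16-bit intermediate mode and a 32-bit result:
after zero-extension the top bit of each 16-bit value is zero, so a signed
16x16->32 multiply yields the unsigned product.

```c
#include <stdint.h>

/* Reference: unsigned 8x8 multiply with a 32-bit result.  */
uint32_t
umul8 (uint8_t a, uint8_t b)
{
  return (uint32_t) a * (uint32_t) b;
}

/* Same product via a signed multiply on zero-extended inputs.  The
   values 0..255 all fit in int16_t with the sign bit clear.  */
uint32_t
umul8_via_signed16 (uint8_t a, uint8_t b)
{
  int16_t sa = (int16_t) (uint16_t) a;
  int16_t sb = (int16_t) (uint16_t) b;
  int32_t product = (int32_t) sa * (int32_t) sb;
  return (uint32_t) product;
}

/* Count input pairs for which the two routines disagree.  */
int
count_mismatches (void)
{
  int mismatches = 0;
  for (unsigned a = 0; a < 256; ++a)
    for (unsigned b = 0; b < 256; ++b)
      if (umul8 ((uint8_t) a, (uint8_t) b)
          != umul8_via_signed16 ((uint8_t) a, (uint8_t) b))
        ++mismatches;
  return mismatches;
}
```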

OK?


This update fixes the testsuite issue Janis pointed out.


And this one also fixes up the wmul-5.c testcase, whose expected result
the patch has changed.


Here's an update for the context changed by the update to patch 3.

The content of the patch has not changed.

Andrew
2011-07-04  Andrew Stubbs  

	gcc/
	* Makefile.in (tree-ssa-math-opts.o): Add langhooks.h dependency.
	* optabs.c (find_widening_optab_handler): Rename to ...
	(find_widening_optab_handler_and_mode): ... this, and add new
	argument 'found_mode'.
	* optabs.h (find_widening_optab_handler): Rename to ...
	(find_widening_optab_handler_and_mode): ... this.
	(find_widening_optab_handler): New macro.
	* tree-ssa-math-opts.c: Include langhooks.h
	(build_and_insert_cast): New function.
	(convert_mult_to_widen): Add new argument 'gsi'.
	Convert unsupported unsigned multiplies to signed.
	(convert_plusminus_to_widen): Likewise.
	(execute_optimize_widening_mul): Pass gsi to convert_mult_to_widen.

	gcc/testsuite/
	* gcc.target/arm/wmul-5.c: Update expected result.
	* gcc.target/arm/wmul-6.c: New file.

--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2672,7 +2672,8 @@ tree-ssa-loop-im.o : tree-ssa-loop-im.c $(TREE_FLOW_H) $(CONFIG_H) \
 tree-ssa-math-opts.o : tree-ssa-math-opts.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
$(TM_H) $(FLAGS_H) $(TREE_H) $(TREE_FLOW_H) $(TIMEVAR_H) \
$(TREE_PASS_H) alloc-pool.h $(BASIC_BLOCK_H) $(TARGET_H) \
-   $(DIAGNOSTIC_H) $(RTL_H) $(EXPR_H) $(OPTABS_H) gimple-pretty-print.h
+   $(DIAGNOSTIC_H) $(RTL_H) $(EXPR_H) $(OPTABS_H) gimple-pretty-print.h \
+   langhooks.h
 tree-ssa-alias.o : tree-ssa-alias.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \
$(TREE_H) $(TM_P_H) $(EXPR_H) $(GGC_H) $(TREE_INLINE_H) $(FLAGS_H) \
$(FUNCTION_H) $(TIMEVAR_H) convert.h $(TM_H) coretypes.h langhooks.h \
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -232,9 +232,10 @@ add_equal_note (rtx insns, rtx target, enum rtx_code code, rtx op0, rtx op1)
non-widening optabs also.  */
 
 enum insn_code
-find_widening_optab_handler (optab op, enum machine_mode to_mode,
-			 enum machine_mode from_mode,
-			 int permit_non_widening)
+find_widening_optab_handler_and_mode (optab op, enum machine_mode to_mode,
+  enum machine_mode from_mode,
+  int permit_non_widening,
+  enum machine_mode *found_mode)
 {
   for (; (permit_non_widening || from_mode != to_mode)
 	 && GET_MODE_SIZE (from_mode) <= GET_MODE_SIZE (to_mode)
@@ -245,7 +246,11 @@ find_widening_optab_handler (optab op, enum machine_mode to_mode,
 		   from_mode);
 
   if (handler != CODE_FOR_nothing)
-	return handler;
+	{
+	  if (found_mode)
+	*found_mode = from_mode;
+	  return handler;
+	}
 }
 
   return CODE_FOR_nothing;
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -808,8 +808,13 @@ extern void emit_unop_insn (enum insn_code, rtx, rtx, enum rtx_code);
 extern bool maybe_emit_unop_insn (enum insn_code, rtx, rtx, enum rtx_code);
 
 /* Find a widening optab even if it doesn't widen as much as we want.  */
-extern enum insn_code find_widening_optab_handler (optab, enum machine_mode,
-		   enum machine_mode, int);
+#define find_widening_optab_handler(A,B,C,D) \
+  find_widening_optab_handler_and_mode (A, B, C, D, NULL)
+extern enum insn_code find_widening_optab_handler_and_mode (optab,
+			enum machine_mode,
+			enum machine_mode,
+			int,
+			enum machine_mode *);
 
 /* An extra flag to control optab_for_tree_code's behavior.  This is needed to
distinguish between machines with a vector shift that takes a scalar for the
--- a/gcc/testsuite/gcc.target/arm/wmul-5.c
+++ b/gcc/testsuite/gcc.target/arm/wmul-5.c
@@ -7,4 +7,4 @@ foo (long long a, char *b, char *c)
   return a + *b * *c;
 }
 
-/* { dg-final { scan-assembler "umlal" } } */
+/* { dg-final { scan-assembler "smlalbb" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-6.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, unsigned char *b, signed char *c)
+{
+  return a + (long long)*b * (long long)*c;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- a/gcc/tree-ssa-mat

Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching

2011-07-04 Thread Andrew Stubbs

On 01/07/11 13:25, Richard Guenther wrote:

Well - some operations work the same on both signedness if you
just care about the twos-complement result.  This includes
multiplication (but not for example division).  For this special
case I suggest to not bother trying to invent a generic predicate
but do something local in tree-ssa-math-opts.c.
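
A small sketch of that point, with hypothetical function names.  It assumes
the usual two's-complement behaviour for unsigned-to-signed conversion
(implementation-defined in older C standards, but what GCC does): the low
32 bits of a product are the same whether the inputs are read as signed or
unsigned, while the quotient is not.

```c
#include <stdint.h>

/* Low 32 bits of the product, inputs treated as unsigned.  */
uint32_t
mul_low_unsigned (uint32_t a, uint32_t b)
{
  return a * b;
}

/* Low 32 bits of the product, inputs reinterpreted as signed.  The
   multiply is done in 64 bits to avoid signed overflow.  */
uint32_t
mul_low_signed (uint32_t a, uint32_t b)
{
  int64_t product = (int64_t) (int32_t) a * (int64_t) (int32_t) b;
  return (uint32_t) product;
}

/* Quotient with inputs treated as unsigned.  B must be nonzero.  */
uint32_t
div_as_unsigned (uint32_t a, uint32_t b)
{
  return a / b;
}

/* Quotient with inputs reinterpreted as signed.  B must be nonzero and
   the INT32_MIN / -1 case must be avoided by the caller.  */
uint32_t
div_as_signed (uint32_t a, uint32_t b)
{
  return (uint32_t) ((int32_t) a / (int32_t) b);
}
```

With a = 0xFFFFFFFE (that is, -2 when read as signed) the two multiplies
agree bit for bit, but dividing by 2 gives 0x7FFFFFFF in one reading and
0xFFFFFFFF in the other.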


OK, here's my updated patch.

I've taken the view that we *know* what size and signedness the result 
of the multiplication is, and we know what size the input to the 
addition must be, so all the check has to do is make sure it does that 
same conversion, even if by a roundabout means.


What I hadn't grasped before is that when extending a value it's the 
source type that is significant, not the destination, so the checks are 
not as complex as I had thought.


So, this patch adds a test to ensure that:

 1. the type is not truncated so far that we lose any information; and

 2. the type is only ever extended in the proper signedness.
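
The two conditions above can be modelled outside the compiler roughly as
follows. This is a simplified sketch under my own assumptions, not the code
from the patch: struct int_type and conversion_chain_ok are hypothetical
names, the chain is given as an array starting at the multiply's result type
and ending at the addition's input type, and the rule used for "proper
signedness" is one plausible reading of the description.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an integer type: precision and signedness.  */
struct int_type {
    unsigned precision;
    bool is_unsigned;
};

/* Check a chain of conversions applied to the result of a widening multiply
   whose operands have precisions PREC1 and PREC2.  CHAIN[0] is the type of
   the multiply result; CHAIN[N-1] is the type fed to the addition.  */
bool conversion_chain_ok(unsigned prec1, bool uns1, unsigned prec2, bool uns2,
                         const struct int_type *chain, size_t n)
{
    unsigned initial_bits = prec1 + prec2;   /* bits needed for the product */
    bool product_unsigned = uns1 && uns2;

    for (size_t i = 0; i < n; i++) {
        /* Condition 1: never truncate below the size of the product.  */
        if (chain[i].precision < initial_bits)
            return false;

        /* Condition 2: on an extension, the SOURCE type decides whether
           the value is sign- or zero-extended.  */
        if (i + 1 < n && chain[i + 1].precision > chain[i].precision) {
            const struct int_type *src = &chain[i];
            if (product_unsigned) {
                /* A sign extension is harmless only if the source type has
                   spare bits, so that its top bit is known to be zero.  */
                if (!src->is_unsigned && src->precision == initial_bits)
                    return false;
            } else if (src->is_unsigned) {
                /* Zero-extending a possibly negative product is wrong.  */
                return false;
            }
        }
    }
    return true;
}
```

With this model, an unsigned char by unsigned char product held in a signed
32-bit type and then extended to 64 bits is accepted (the wmul-5.c shape),
while a short by short product that passes through a 16-bit type is rejected
(the no-wmla-1.c shape).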

Also, just to be absolutely sure, I've added a little bit of logic 
to permit extends that are then undone by a truncate. I'm really not 
sure what guarantees there are about what sort of cast sequences can 
exist. Is this necessary? I haven't managed to coax it to generate any 
examples of extends followed by truncates myself, but in any case, it's 
hardly any code and it'll make sure it's future-proofed.


OK?

Andrew
2011-06-28  Andrew Stubbs  

	gcc/
	* tree-ssa-math-opts.c (valid_types_for_madd_p): New function.
	(convert_plusminus_to_widen): Use valid_types_for_madd_p to
	identify optimization candidates.

	gcc/testsuite/
	* gcc.target/arm/wmul-5.c: New file.
	* gcc.target/arm/no-wmla-1.c: New file.

---
 .../gcc/testsuite/gcc.target/arm/no-wmla-1.c   |   11 ++
 .../gcc/testsuite/gcc.target/arm/wmul-5.c  |   10 ++
 src/gcc-mainline/gcc/tree-ssa-math-opts.c  |  112 ++--
 3 files changed, 123 insertions(+), 10 deletions(-)
 create mode 100644 src/gcc-mainline/gcc/testsuite/gcc.target/arm/no-wmla-1.c
 create mode 100644 src/gcc-mainline/gcc/testsuite/gcc.target/arm/wmul-5.c

diff --git a/src/gcc-mainline/gcc/testsuite/gcc.target/arm/no-wmla-1.c b/src/gcc-mainline/gcc/testsuite/gcc.target/arm/no-wmla-1.c
new file mode 100644
index 000..17f7427
--- /dev/null
+++ b/src/gcc-mainline/gcc/testsuite/gcc.target/arm/no-wmla-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+int
+foo (int a, short b, short c)
+{
+ int bc = b * c;
+return a + (short)bc;
+}
+
+/* { dg-final { scan-assembler "mul" } } */
diff --git a/src/gcc-mainline/gcc/testsuite/gcc.target/arm/wmul-5.c b/src/gcc-mainline/gcc/testsuite/gcc.target/arm/wmul-5.c
new file mode 100644
index 000..65c43e3
--- /dev/null
+++ b/src/gcc-mainline/gcc/testsuite/gcc.target/arm/wmul-5.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, char *b, char *c)
+{
+  return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "umlal" } } */
diff --git a/src/gcc-mainline/gcc/tree-ssa-math-opts.c b/src/gcc-mainline/gcc/tree-ssa-math-opts.c
index d55ba57..5ef7bb4 100644
--- a/src/gcc-mainline/gcc/tree-ssa-math-opts.c
+++ b/src/gcc-mainline/gcc/tree-ssa-math-opts.c
@@ -2085,6 +2085,78 @@ convert_mult_to_widen (gimple stmt)
   return true;
 }
 
+/* Check the input types, TYPE1 and TYPE2 to a widening multiply,
+   and then the conversions between the output of the multiply, and
+   the input to an addition EXPR, to ensure that they are compatible with
+   a widening multiply-and-accumulate.
+
+   This function assumes that expr is a valid string of conversion expressions
+   terminated by a multiplication.
+
+   This function tries NOT to make any (fragile) assumptions about what
+   sequence of conversions can exist in the input.  */
+
+static bool
+valid_types_for_madd_p (tree type1, tree type2, tree expr)
+{
+  gimple stmt, prev_stmt;
+  enum tree_code code, prev_code;
+  tree prev_expr, type, prev_type;
+  int bitsize, prev_bitsize, initial_bitsize, min_bitsize;
+  bool initial_unsigned;
+
+  initial_bitsize = TYPE_PRECISION (type1) + TYPE_PRECISION (type2);
+  initial_unsigned = TYPE_UNSIGNED (type1) && TYPE_UNSIGNED (type2);
+
+  stmt = SSA_NAME_DEF_STMT (expr);
+  code = gimple_assign_rhs_code (stmt);
+  type = TREE_TYPE (expr);
+  bitsize = TYPE_PRECISION (type);
+  min_bitsize = bitsize;
+
+  if (code == MULT_EXPR || code == WIDEN_MULT_EXPR)
+return true;
+
+  if (!INTEGRAL_TYPE_P (type)
+  || TYPE_PRECISION (type) < initial_bitsize)
+return false;
+
+  /* Step through the conversions backwards.  */
+  while (true)
+{
+  prev_expr = gimple_assign_rhs1 (stmt);
+  prev_stmt = SSA_NAME_DEF_STMT (prev_expr);
+  prev_code = gimple_assign_rhs_code (prev_stmt);
+  prev_type = TREE_TYPE (prev_expr);
+  prev_bitsize = TYPE_PRECISION (prev_type);
+
+  if (prev_code == MULT_EXPR || prev_code == WIDEN

Re: [PATCH] Address lowering [1/3] Main patch

2011-07-04 Thread Richard Guenther
On Thu, Jun 30, 2011 at 4:39 PM, William J. Schmidt
 wrote:
> This is the first of three patches related to lowering addressing
> expressions to MEM_REFs and TARGET_MEM_REFs in late gimple.  This patch
> contains the new pass together with supporting changes in existing
> modules.  The second patch contains an independent change to the RTL
> forward propagator to keep it from undoing an optimization made in the
> first patch.  The third patch contains new test cases and changes to
> existing test cases.
>
> Although I've broken it up into three patches to make the review easier,
> it would be best to commit at least the first and third together to
> avoid regressions.  The second can stand alone.
>
> I've done regression tests on powerpc64 and x86_64, and have asked
> Andreas Krebbel to test against the IBM z (390) platform.  I've done
> performance regression testing on powerpc64.  The only performance
> regression of note is the 2% degradation to 188.ammp due to loss of
> field disambiguation information.  As discussed in another thread,
> fixing this introduces more complexity than it's worth.

Are there also performance improvements?  What about code size?

I tried to get an understanding of what kind of optimizations this patch
produces based on the set of testcases you added, but I have a hard
time here.  Can you outline some please?

I still do not like the implementation of yet another CSE machinery
given that we already have two.  I think most of the need for CSE
comes from the use of the affine combination framework and
force_gimple_operand.  In fact I'd be interested to see cases that
are optimized that could not be handled by a combine-like
pattern matcher?

Thanks,
Richard.


Re: PATCH: PR target/49600: Bad SSE2 int->float split in i386.md

2011-07-04 Thread H.J. Lu
On Mon, Jul 4, 2011 at 3:18 AM, Uros Bizjak  wrote:
> On Mon, Jul 4, 2011 at 7:13 AM, H.J. Lu  wrote:
>
> In one SSE2 int->float split, when TARGET_USE_VECTOR_CONVERTS is true,
> TARGET_INTER_UNIT_MOVES is false and GENERAL_REG_P (op1) is true, we
> will get gcc_unreachable.  This patch removes TARGET_INTER_UNIT_MOVES
> check.  OK for trunk?

 This will result in register allocation failure. Operand 0 of
>>
>> That particular sse2_loadld insn matches:
>>
>> (insn 49 22 50 5 (set (reg:V4SI 21 xmm0 [83])
>>        (vec_merge:V4SI (vec_duplicate:V4SI (reg/v:SI 1 dx [orig:64
>> test ] [64]))
>>            (const_vector:V4SI [
>>                    (const_int 0 [0])
>>                    (const_int 0 [0])
>>                    (const_int 0 [0])
>>                    (const_int 0 [0])
>>                ])
>>            (const_int 1 [0x1]))) x.i:11 1365 {vec_setv4si_0}
>>     (nil))
>>
>
> Yes, but it should not be generated for !TARGET_INTER_UNIT_MOVES. The
> constraint should be Yi, but then we don't shadow other alternatives
> correctly.
>
 sse2_loadld pattern has conditional constraint Yi that depends on
 TARGET_INTER_UNIT_MOVES, so we can't blindly generate sse2_loadld
 after reload.  I'm testing attached patch.

 BTW: Do you perhaps have a testcase for this problem?
>>>
>>> I have a testcase, but it needs a new x86 optimization we are working on.
>>>
 2011-07-03  Uros Bizjak  

        PR target/49600
        * config/i386/i386.md (SSE2 int->float split): Push operand 1 in
        general register to memory for !TARGET_INTER_UNIT_MOVES.

>>>
>>> I will give it a try.
>>>
>>
>> It doesn't work: I still got
>
> Yes, I later noticed that I have changed the wrong pattern (the one
> with memory clobber) ;( . Attached is the correct patch.
>

This works.  Can you check it in?

Thanks.

-- 
H.J.


Re: [PATCH] Handle vectorization of invariant loads (PR46787)

2011-07-04 Thread H.J. Lu
On Wed, Jun 29, 2011 at 4:19 AM, Richard Guenther  wrote:
>
> The following patch makes us handle invariant loads during vectorization.
> Dependence analysis currently isn't clever enough to disambiguate them
> thus we insert versioning-for-alias checks.  For the testcase hoisting
> the load is still always possible though, and for a read-after-write
> dependence it would be possible for the vectorized loop copy as the
> may-aliasing write is varying by the scalar variable size.
>
> The existing code for vectorizing invariant accesses looks very
> suspicious - it generates a vector load at the scalar address
> to then just extract the first vector element.  Huh.  IMHO this
> can be simplified as done, by just re-using the scalar load result.
> But maybe this code was supposed to deal with something entirely
> different?
>
> This patch gives a 33% speedup to the phoronix himeno testcase
> if you bump the maximum alias versioning checks we want to insert.
>
> I'm currently re-bootstrapping & testing this but an earlier version
> was ok on x86_64-unknown-linux-gnu.
>
> 2011-06-29  Richard Guenther  
>
>        PR tree-optimization/46787
>        * tree-data-ref.c (dr_address_invariant_p): Remove.
>        (find_data_references_in_stmt): Invariant accesses are ok now.
>        * tree-vect-stmts.c (vectorizable_load): Handle invariant
>        loads.
>        * tree-vect-data-refs.c (vect_analyze_data_ref_access): Allow
>        invariant loads.
>
>        * gcc.dg/vect/vect-121.c: New testcase.
>

This also caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49628


-- 
H.J.


Re: Improve Solaris mudflap support (PR libmudflap/49550)

2011-07-04 Thread Rainer Orth
Frank,

this patch has remained unreviewed for a week.  Could you please have a
look?

Thanks.
Rainer


Rainer Orth  writes:

> This is the first of two patches to get mudflap fully working on
> Solaris 11, both with Sun ld and GNU ld.
>
> It addresses a couple of testsuite failures:
>
> * Several tests fail with 3 unexpected register violations:
>
> ***
> mudflap violation 1 (register): time=1309356076.070433 ptr=21680 size=16
> pc=7fa07a64
>   
> /var/gcc/regression/trunk/11-gcc/build/sparc-sun-solaris2.11/libmudflap/.libs/libmudflap.so.0.0.0'__mf_register+0x2c
>  [0x7fa07a64]
>   
> /var/gcc/regression/trunk/11-gcc/build/sparc-sun-solaris2.11/libmudflap/.libs/libmudflap.so.0.0.0'__wrap_main+0x194
>  [0x7fa07c18]
>   
> /var/gcc/regression/trunk/11-gcc/build/sparc-sun-solaris2.11/libmudflap/testsuite/heap-scalestress.exe'_start+0x5c
>  [0x10afc]
> Nearby object 1: checked region begins 0B into and ends 15B into
> mudflap object a2aa8: name=`/usr/include/iso/stdio_iso.h:163:15 __iob'
> bounds=[21680,217bf] size=320 area=static check=0r/0w liveness=0
> alloc time=1309356076.069900 pc=7fa07a64
> number of nearby objects: 1
>
>   All 3 are 0, 16, or 32 bytes into __iob[].  The error goes away with
>   -no-heur-stdlib.
>
>   If running the test with -trace-calls, I find:
>
> mf: register ptr=21680 size=16 type=4 name='stdin'
> mf: violation pc=7fa07b94 location= type=3 ptr=21680 size=16
> ***
> mudflap violation 1 (register): time=1309365780.121411 ptr=21680 size=16
> pc=7fa07b94
>   
> /var/gcc/regression/trunk/11-gcc/build/sparc-sun-solaris2.11/libmudflap/.libs/libmudflap.so.0.0.0'__mf_register+0x2c
>  [0x7fa07b94]
>   
> /var/gcc/regression/trunk/11-gcc/build/sparc-sun-solaris2.11/libmudflap/.libs/libmudflap.so.0.0.0'__wrap_main+0x194
>  [0x7fa07d48]
>   
> /var/gcc/regression/trunk/11-gcc/build/sparc-sun-solaris2.11/libmudflap/testsuite/heap-scalestress.exe'_start+0x5c
>  [0x10afc]
> Nearby object 1: checked region begins 0B into and ends 15B into
> mudflap object a2aa8: name=`/usr/include/iso/stdio_iso.h:163:15 __iob'
> bounds=[21680,217bf] size=320 area=static check=0r/0w liveness=0
> alloc time=1309365780.077107 pc=7fa07b94
> number of nearby objects: 1
>
>   The conflict is between
>
> mf: register ptr=21680 size=320 type=4 
> name='/usr/include/iso/stdio_iso.h:163:15 __iob'
>
>   and
>
> mf: register ptr=21680 size=16 type=4 name='stdin'
> mf: violation pc=7fa07260 location= type=3 ptr=21680 size=16
>
>   where the registration of __iob has been done automatically by the
>   compiler.  I avoid this problem by not registering stdin, stdout, and
>   stderr separately on Solaris.
>
> * Some tests were failing while calling unregister in munmap.  It turned
>   out that there had been no corresponding mmap registration before.
>   This occurs because Solaris has mmap64 for largefile-aware programs
>   instead.  Fixed by wrapping mmap64, too.  What I don't know is if
>   mmap64 needs to be added to MFWRAP_SPEC in gcc.c?  If so, I'd rather
>   do it by adding some MFWRAP_OS_SPEC to avoid having to duplicate the
>   whole spec in the Solaris config headers.
>
> * As noted in the last patch, the getmntent signature differs in
>   Solaris.  This patch implements a wrapper for the Solaris version.
>
> * libmudflap.cth/pass37-frag.c would fail like this:
>
> ***
> mudflap violation 1 (unregister): time=1309444614.922185 ptr=7f9e90a4 size=4
> pc=7fa07e78
>   
> /var/gcc/regression/trunk/11-gcc/build/sparc-sun-solaris2.11/libmudflap/.libs/libmudflapth.so.0.0.0'__mf_unregister+0xec
>  [0x7fa07e78]
>   
> /var/gcc/regression/trunk/11-gcc/build/sparc-sun-solaris2.11/libmudflap/.libs/libmudflapth.so.0.0.0'__mf_pthread_cleanup+0x4c
>  [0x7fa277dc]
>   
> /var/gcc/regression/trunk/11-gcc/build/sparc-sun-solaris2.11/libmudflap/.libs/libmudflapth.so.0.0.0'__mf_pthread_spawner+0x14c
>  [0x7fa27938]
> Nearby object 1: checked region begins 0B into and ends 3B into
> mudflap object a43b8: name=`errno area'
> bounds=[7f9e90a4,7f9e90a7] size=4 area=static check=0r/0w liveness=0
> alloc time=1309444614.909941 pc=7fa077e8 thread=1
> number of nearby objects: 1
> FAIL: libmudflap.cth/pass37-frag.c execution test
>
>   Investigating with -trace-calls reveals that all registrations and
>   unregistrations of errno are for the same address, which is wrong for
>   multithreaded programs which access errno via an accessor function.
>   To enable that,  needs to be included with _REENTRANT
>   defined.  It turned out that it suffices to do this in mf-hooks3.c.
>
> * libmudflap.c/heap-scalestress.c always timed out on my SPARC test
>   system: on a 1.2 GHz UltraSPARC-T2, it takes
>
> real  8:47.06
> user    43.12
> sys   8:03.77
>
>   which is way over the limit.  On my laptop (1.6 GHz Core i7), it takes
>
> real  37.35
> user   5.06
> sys   32.23
>
>   I've divided SCALE by 10 to account for this.
>
> * I've replaced all the __FreeBSD__ && ... 

Re: [patch tree-optimization]: Do bitwise operator optimizations for X op !X patterns

2011-07-04 Thread Richard Guenther
On Fri, Jul 1, 2011 at 5:23 PM, Kai Tietz  wrote:
> So updated patch (bootstrapped and tested for all standard languages
> plus Ada and Obj-C++) on x86_64-pc-linux-gnu host.
>
> Index: gcc-head/gcc/tree-ssa-forwprop.c
> ===
> --- gcc-head.orig/gcc/tree-ssa-forwprop.c
> +++ gcc-head/gcc/tree-ssa-forwprop.c
> @@ -1602,6 +1602,156 @@ simplify_builtin_call (gimple_stmt_itera
>   return false;
>  }
>
> +/* Checks if expression has type of one-bit precision, or is a known
> +   truth-valued expression.  */
> +static bool
> +truth_valued_ssa_name (tree name)
> +{
> +  gimple def;
> +  tree type = TREE_TYPE (name);
> +
> +  if (!INTEGRAL_TYPE_P (type))
> +    return false;
> +  /* Don't check here for BOOLEAN_TYPE as the precision isn't
> +     necessarily one and so ~X is not equal to !X.  */
> +  if (TYPE_PRECISION (type) == 1)
> +    return true;
> +  def = SSA_NAME_DEF_STMT (name);
> +  if (is_gimple_assign (def))
> +    return truth_value_p (gimple_assign_rhs_code (def));
> +  return false;
> +}
> +
> +/* Helper routine for simplify_bitwise_binary_1 function.
> +   Return for the SSA name NAME the expression X if it meets condition
> +   NAME = !X. Otherwise return NULL_TREE.
> +   Detected patterns for NAME = !X are:
> +     !X and X == 0 for X with integral type.
> +     X ^ 1, X != 1, or ~X for X with integral type with precision of one.  */
> +static tree
> +lookup_logical_inverted_value (tree name)
> +{
> +  tree op1, op2;
> +  enum tree_code code;
> +  gimple def;
> +
> +  /* If name has non-integral type, or isn't an SSA_NAME, then
> +     return.  */
> +  if (TREE_CODE (name) != SSA_NAME
> +      || !INTEGRAL_TYPE_P (TREE_TYPE (name)))
> +    return NULL_TREE;
> +  def = SSA_NAME_DEF_STMT (name);
> +  if (!is_gimple_assign (def))
> +    return NULL_TREE;
> +
> +  code = gimple_assign_rhs_code (def);
> +  op1 = gimple_assign_rhs1 (def);
> +  op2 = NULL_TREE;
> +
> +  /* Get for EQ_EXPR or BIT_XOR_EXPR operation the second operand.
> +     If CODE isn't an EQ_EXPR, BIT_XOR_EXPR, TRUTH_NOT_EXPR,
> +     or BIT_NOT_EXPR, then return.  */
> +  if (code == EQ_EXPR || code == NE_EXPR
> +      || code == BIT_XOR_EXPR)
> +    op2 = gimple_assign_rhs2 (def);
> +
> +  switch (code)
> +    {
> +    case TRUTH_NOT_EXPR:
> +      return op1;
> +    case BIT_NOT_EXPR:
> +      if (truth_valued_ssa_name (name))
> +       return op1;
> +      break;
> +    case EQ_EXPR:
> +      /* Check if we have X == 0 and X has an integral type.  */
> +      if (!INTEGRAL_TYPE_P (TREE_TYPE (op1)))
> +       break;
> +      if (integer_zerop (op2))
> +       return op1;
> +      break;
> +    case NE_EXPR:
> +      /* Check if we have X != 1 and X is a truth-valued.  */
> +      if (!INTEGRAL_TYPE_P (TREE_TYPE (op1)))
> +       break;
> +      if (integer_onep (op2) && truth_valued_ssa_name (op1))
> +       return op1;
> +      break;
> +    case BIT_XOR_EXPR:
> +      /* Check if we have X ^ 1 and X is truth valued.  */
> +      if (integer_onep (op2) && truth_valued_ssa_name (op1))
> +       return op1;
> +      break;
> +    default:
> +      break;
> +    }
> +
> +  return NULL_TREE;
> +}
> +
> +/* Try to optimize patterns X & !X -> zero, X | !X -> one, and
> +   X ^ !X -> one, if type of X is valid for this.
> +
> +   See for list of detected logical-not patterns the
> +   lookup_logical_inverted_value function.  */

As usual - refer to actual arguments.  I'd do

/* Optimize ARG1 CODE ARG2 to a constant for bitwise binary
   operations CODE if one operand has the logically inverted
   value of the other.  */
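
For reference, the three simplifications under review can be checked by
hand for a truth-valued operand. The helper names below are made up for
this note and are not part of the patch; the last helper shows why the
patch insists on one-bit precision before treating ~X as !X.

```c
#include <assert.h>

/* Evaluate `x OP !x` for a truth-valued x; OP is '&', '|' or '^'.  */
int eval_with_inverse(char op, int x)
{
    assert(x == 0 || x == 1);
    int not_x = !x;
    switch (op) {
    case '&': return x & not_x;
    case '|': return x | not_x;
    default:  return x ^ not_x;
    }
}

/* The constant the expression is expected to simplify to:
   X & !X -> 0, X | !X -> 1, X ^ !X -> 1.  */
int expected_constant(char op)
{
    return op == '&' ? 0 : 1;
}

/* For a type wider than one bit, ~x is NOT the logical inverse of x:
   for int, ~0 is -1 and ~1 is -2.  */
int bitnot_is_lognot(int x)
{
    return (~x) == (!x);
}
```

eval_with_inverse agrees with expected_constant for both x = 0 and x = 1,
and bitnot_is_lognot returns 0 for both inputs when x is a plain int.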

> +static tree
> +simplify_bitwise_binary_1 (enum tree_code code, tree arg1,
> +                          tree arg2)
> +{
> +  tree a1not, a2not;
> +  tree op = NULL_TREE;
> +
> +  /* If CODE isn't a bitwise binary operation, return NULL_TREE.  */
> +  if (code != BIT_AND_EXPR && code != BIT_IOR_EXPR
> +      && code != BIT_XOR_EXPR)
> +    return NULL_TREE;
> +
> +  /* First check if operands ARG1 and ARG2 are equal.  */
> +  if (operand_equal_p (arg1, arg2, 0))
> +    return NULL_TREE;

That's an early out - use arg1 == arg2 instead and mention why
we do not optimize it - it's done by fold_stmt.

> +  /* See if we have in arguments logical-not patterns.  */
> +  a1not = lookup_logical_inverted_value (arg1);
> +  a2not = lookup_logical_inverted_value (arg2);

You didn't re-organize the code to only call one of the lookups if
that succeeded as I requested.

> +  /* If there are no logical-not in arguments,  return NULL_TREE. */
> +  if (!a1not && !a2not)
> +    return NULL_TREE;
> +
> +  /* If both arguments are logical-not patterns, then try to fold
> +     them or return NULL_TREE.  */
> +  if (a1not && a2not)
> +    {
> +      /* If logical-not operands of ARG1 and ARG2 are equal, then fold
> +        them..  */

No double-full-stop please.  Instead of "fold" say "simplify".

> +      if (operand_equal_p (a1not, a2not, 0))

The only case where a1not or a2not

Re: [PATCH] Fix bootstrap on OpenBSD, PR48851

2011-07-04 Thread Richard Guenther
On Mon, 4 Jul 2011, Bruce Korb wrote:

> Hi Richard,
> 
> On Mon, Jul 4, 2011 at 4:04 AM, Richard Guenther  wrote:
> >
> > It happens that OpenBSD suffers from a bogus fixinclude that changes
> > its perfectly valid NULL define from (void *)0 to 0.  The fix itself
> > appears to be very old and is completely bogus - it replaces
> > (void *)0 with 0 under the assumption the former is invalid for C++ -
> > which is true - but 0 is inappropriate for C which is much worse.
> >
> > Thus, I propose to remove the fix altogether.  Platform maintainers
> > can arrange for a new fix if the platforms still need fixing (which
> > I seriously doubt after so many years and platform obsoletion).
> >
> > This restores bootstrap on OpenBSD.
> >
> > Ok for trunk and active branches?
> 
> Sounds completely reasonable to me, but I think the platform maintainers
> do need to say, "okay".  Cheers - Bruce

We do not have an Interix maintainer listed, that leaves David for AIX.
David, is this ok?  If not, can you please work on a better more
specific fixinclude wrapping the C++ variant inside __GNUG__?

Thanks,
Richard.
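
A rough sketch of what "wrapping the C++ variant inside __GNUG__" could look
like, and of why a plain 0 is a problem for C. EXAMPLE_NULL and
count_strings are names invented for this illustration; this is not the
fixinclude itself. __null is the G++ null pointer constant.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical NULL-like macro: only C++ gets the non-pointer form.  */
#ifdef __GNUG__
# define EXAMPLE_NULL __null
#else
# define EXAMPLE_NULL ((void *)0)
#endif

/* Count string arguments up to a null-pointer sentinel.  The sentinel is
   fetched with va_arg as a pointer, so the caller must really pass a
   pointer-sized null.  A bare 0 is an int, which on LP64 targets is
   narrower than a pointer.  */
size_t count_strings(const char *first, ...)
{
    size_t n = 0;
    va_list ap;

    va_start(ap, first);
    for (const char *s = first; s != NULL; s = va_arg(ap, const char *))
        n++;
    va_end(ap);
    return n;
}
```

A call such as count_strings ("a", "b", "c", EXAMPLE_NULL) is well defined
in C because the sentinel has pointer type; with NULL rewritten to 0 the
same call would pass an int where a pointer is read back.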

Re: [PATCH] Fix bootstrap on OpenBSD, PR48851

2011-07-04 Thread Bruce Korb
Hi Richard,

On Mon, Jul 4, 2011 at 4:04 AM, Richard Guenther  wrote:
>
> It happens that OpenBSD suffers from a bogus fixinclude that changes
> its perfectly valid NULL define from (void *)0 to 0.  The fix itself
> appears to be very old and is completely bogus - it replaces
> (void *)0 with 0 under the assumption the former is invalid for C++ -
> which is true - but 0 is inappropriate for C which is much worse.
>
> Thus, I propose to remove the fix altogether.  Platform maintainers
> can arrange for a new fix if the platforms still need fixing (which
> I seriously doubt after so many years and platform obsoletion).
>
> This restores bootstrap on OpenBSD.
>
> Ok for trunk and active branches?

Sounds completely reasonable to me, but I think the platform maintainers
do need to say, "okay".  Cheers - Bruce


Re: [PATCH] Fix PR49518

2011-07-04 Thread Richard Guenther
On Mon, 4 Jul 2011, Ira Rosen wrote:

> 
> 
> Richard Guenther  wrote on 04/07/2011 02:38:50 PM:
> 
> > Handling of negative steps broke one of the many asserts in
> > the vectorizer.  The following patch drops one that I can't
> > make sense of.  I think all asserts need comments - especially
> > this one would, as I can't see why using vf is correct to
> > test against and not nelements (and why <= vf and not < vf).
> 
> There is an explanation 10 rows above the assert. It doesn't make sense to
> peel more than vf iterations (and not nelements, since for the case of
> multiple types it may help to align more data-refs - see the comment in the
> code). IIRC <= is for the case of aligned access, but I am not sure about
> that, so maybe you are right.
> 
> I don't see how it is related to negative steps.
> 
> I think that the real reason for this failure is that the loads are
> actually irrelevant (hence, vf=4 that doesn't take char loads into
> account), but we don't check that when we analyze data-refs. So, in my
> opinion, the proper fix will add such check.

The following also works for me:

Index: tree-vect-data-refs.c
===
--- tree-vect-data-refs.c   (revision 175802)
+++ tree-vect-data-refs.c   (working copy)
@@ -1495,6 +1495,9 @@ vect_enhance_data_refs_alignment (loop_v
   stmt = DR_STMT (dr);
   stmt_info = vinfo_for_stmt (stmt);
 
+  if (!STMT_VINFO_RELEVANT (stmt_info))
+   continue;
+
   /* For interleaving, only the alignment of the first access
  matters.  */
   if (STMT_VINFO_STRIDED_ACCESS (stmt_info)

does that look better or do you propose to clean the datarefs
vector from those references?

Thanks,
Richard.


Re: [PATCH] Fix PR49518

2011-07-04 Thread Ira Rosen


Richard Guenther  wrote on 04/07/2011 02:38:50 PM:

> Handling of negative steps broke one of the many asserts in
> the vectorizer.  The following patch drops one that I can't
> make sense of.  I think all asserts need comments - especially
> this one would, as I can't see why using vf is correct to
> test against and not nelements (and why <= vf and not < vf).

There is an explanation 10 rows above the assert. It doesn't make sense to
peel more than vf iterations (and not nelements, since for the case of
multiple types it may help to align more data-refs - see the comment in the
code). IIRC <= is for the case of aligned access, but I am not sure about
that, so maybe you are right.

I don't see how it is related to negative steps.

I think that the real reason for this failure is that the loads are
actually irrelevant (hence, vf=4 that doesn't take char loads into
account), but we don't check that when we analyze data-refs. So, in my
opinion, the proper fix will add such check.

Thanks,
Ira

>
> Well, ok?
>
> Thanks,
> Richard.
>
> 2011-07-04  Richard Guenther  
>
>PR tree-optimization/49518
>* tree-vect-data-refs.c (vect_enhance_data_refs_alignment):
>Drop assert.
>
>* gcc.dg/torture/pr49518.c: New testcase.
>
> Index: gcc/tree-vect-data-refs.c
> ===
> --- gcc/tree-vect-data-refs.c   (revision 175800)
> +++ gcc/tree-vect-data-refs.c   (working copy)
> @@ -1552,7 +1552,6 @@ vect_enhance_data_refs_alignment (loop_v
>
>for (j = 0; j < possible_npeel_number; j++)
>  {
> -  gcc_assert (npeel_tmp <= vf);
>vect_peeling_hash_insert (loop_vinfo, dr, npeel_tmp);
>npeel_tmp += nelements;
>  }
> Index: gcc/testsuite/gcc.dg/torture/pr49518.c
> ===
> --- gcc/testsuite/gcc.dg/torture/pr49518.c   (revision 0)
> +++ gcc/testsuite/gcc.dg/torture/pr49518.c   (revision 0)
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +
> +int a, b;
> +struct S { unsigned int s, t, u; } c, d = { 0, 1, 0 };
> +
> +void
> +test (unsigned char z)
> +{
> +  char e[] = {0, 0, 0, 0, 1};
> +  for (c.s = 1; c.s; c.s++)
> +{
> +  b = e[c.s];
> +  if (a)
> +   break;
> +  b = z >= c.u;
> +  if (d.t)
> +   break;
> +}
> +}



Re: [Path, AVR]: Implement __builtin_avr_fmul* if no hardware multiplier

2011-07-04 Thread Denis Chertykov
2011/7/4 Georg-Johann Lay :
> The current implementation of __builtin_avr_fmul/fmuls/fmulsu has a
> gap if no hardware multiplier is available.
>
> This patch closes that gap by providing libgcc implementations named
> __fmul, __fmuls resp. __fmulsu.
>
> The implementations yield the same result as respective FMUL*
> instructions and have been tested against these instructions for all
> possible combinations of input values on an atmega88 device.
>
> Johann
>
>
>        * doc/extend.texi (AVR Built-in Functions): Update documentation
>        of __builtin_avr_fmul*.
>        * config/avr/avr.c (avr_init_builtins): Don't depend on
>        AVR_HAVE_MUL.
>        * config/avr/avr-c.c (avr_cpu_cpp_builtins): Ditto.
>        * config/avr/avr.md (fmul): Rename to fmul_insn.
>        (fmuls): Rename to fmuls_insn.
>        (fmulsu): Rename to fmulsu_insn.
>        (fmul,fmuls,fmulsu): New expander.
>        (*fmul.call,*fmuls.call,*fmulsu.call): New Insn.
>        * config/avr/t-avr (LIB1ASMFUNCS): Add _fmul, _fmuls, _fmulsu.
>        * config/avr/libgcc.S (__fmul): New function.
>        (__fmuls): New function.
>        (__fmulsu,__fmulsu_exit): New function.
>

Approved.

Denis.


Re: Ping #1: [Patch, AVR, 4.6+trunk]: PR44643 addendum

2011-07-04 Thread Denis Chertykov
2011/7/4 Georg-Johann Lay :
> Georg-Johann Lay wrote:
>
> http://gcc.gnu.org/ml/gcc-patches/2011-06/msg02318.html
>
>> avr_insert_attributes uses TREE_READONLY to get the readonlyness of a node.
>>
>> That does not work for C++ arrays: it gives false error
>> "variable must be const in order to be put into read-only section by
>> means of '__attribute__((progmem))'".
>>
>> This patch peels arrays and uses TYPE_READONLY.
>>
>> I did not open separate PR for this, tagged it as addendum to PR44643
>> instead.
>>
>> Lightly tested on own code. There is no 'progmem' in testsuite, so
>> from testsuite's perspective that code is dead, anyway...
>>
>> Johann
>>
>>       PR target/44643
>>       * config/avr/avr.c (avr_insert_attributes): Use TYPE_READONLY
>>       instead of TREE_READONLY.

Approved.

Denis.


[PATCH] Fix PR49615

2011-07-04 Thread Richard Guenther

This fixes an oversight in split_bbs_on_noreturn_calls.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied everywhere.

Richard.
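
For readers unfamiliar with the distinction: n_basic_blocks is the number of
live blocks, while last_basic_block is one past the highest block index in
use, so after blocks have been removed a valid index can be >= the count.
The toy model below is only an illustration under that assumption; struct
cfg and the helper names are invented, and the NUM_FIXED_BLOCKS part of the
real check is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BLOCKS 16

struct block { int index; };

struct cfg {
    struct block *slots[MAX_BLOCKS]; /* sparse: removed blocks leave holes */
    int n_blocks;                    /* number of live blocks */
    int last_block;                  /* one past the highest index handed out */
};

/* Append B at the next free index.  */
void cfg_add(struct cfg *g, struct block *b)
{
    assert(g->last_block < MAX_BLOCKS);
    b->index = g->last_block++;
    g->slots[b->index] = b;
    g->n_blocks++;
}

/* Remove B; its slot becomes a hole and indices are not renumbered.  */
void cfg_remove(struct cfg *g, struct block *b)
{
    assert(g->slots[b->index] == b);
    g->slots[b->index] = NULL;
    g->n_blocks--;
}

/* Bound check against the COUNT of blocks: wrong once holes exist.  */
bool in_cfg_by_count(const struct cfg *g, const struct block *b)
{
    return b != NULL && b->index >= 0 && b->index < g->n_blocks
           && g->slots[b->index] == b;
}

/* Bound check against the index limit: the shape the fix uses.  */
bool in_cfg_by_last(const struct cfg *g, const struct block *b)
{
    return b != NULL && b->index >= 0 && b->index < g->last_block
           && g->slots[b->index] == b;
}
```

After adding three blocks and removing the first, the block at index 2 is
still present, but the count-based check rejects it because only two blocks
remain.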

2011-07-04  Richard Guenther  

PR tree-optimization/49615
* tree-cfgcleanup.c (split_bbs_on_noreturn_calls): Fix
basic-block index check.

* g++.dg/torture/pr49615.C: New testcase.

Index: gcc/tree-cfgcleanup.c
===
*** gcc/tree-cfgcleanup.c   (revision 175752)
--- gcc/tree-cfgcleanup.c   (working copy)
*** split_bbs_on_noreturn_calls (void)
*** 599,605 
   BB is present in the cfg.  */
if (bb == NULL
|| bb->index < NUM_FIXED_BLOCKS
!   || bb->index >= n_basic_blocks
|| BASIC_BLOCK (bb->index) != bb
|| !gimple_call_noreturn_p (stmt))
  continue;
--- 599,605 
   BB is present in the cfg.  */
if (bb == NULL
|| bb->index < NUM_FIXED_BLOCKS
!   || bb->index >= last_basic_block
|| BASIC_BLOCK (bb->index) != bb
|| !gimple_call_noreturn_p (stmt))
  continue;
Index: gcc/testsuite/g++.dg/torture/pr49615.C
===
*** gcc/testsuite/g++.dg/torture/pr49615.C  (revision 0)
--- gcc/testsuite/g++.dg/torture/pr49615.C  (revision 0)
***
*** 0 
--- 1,29 
+ /* { dg-do compile } */
+ /* { dg-options "-g" } */
+ 
+ template 
+ static inline bool Dispatch (T* obj, void (T::*func) ())
+ {
+   (obj->*func) ();
+ }
+ class C
+ {
+   bool f (int);
+   void g ();
+ };
+ bool C::f (int n)
+ {
+   bool b;
+   switch (n)
+ {
+   case 0:
+ b = Dispatch (this, &C::g);
+   case 1:
+ b = Dispatch (this, &C::g);
+ }
+ }
+ void C::g ()
+ {
+   for (;;) { }
+ }
+ 


[Path, AVR]: Implement __builtin_avr_fmul* if no hardware multiplier

2011-07-04 Thread Georg-Johann Lay
The current implementation of __builtin_avr_fmul/fmuls/fmulsu has a
gap if no hardware multiplier is available.

This patch closes that gap by providing libgcc implementations named
__fmul, __fmuls resp. __fmulsu.

The implementations yield the same result as respective FMUL*
instructions and have been tested against these instructions for all
possible combinations of input values on an atmega88 device.
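
For readers who want to follow the assembler, here is a C model of what the
shift-and-add loop in __fmul and the sign handling in __fmuls compute. It is
a sketch for explanation only: the function names are made up, the carry
flag of the hardware instructions is not modelled, and the reference
functions assume the usual definition of FMUL/FMULS as the 8x8 product
shifted left by one and truncated to 16 bits.

```c
#include <stdint.h>

/* Reference for FMUL: unsigned product, shifted left by one.  */
uint16_t fmul_ref(uint8_t a, uint8_t b)
{
    return (uint16_t)(((uint32_t)a * b) << 1);
}

/* Model of the __fmul loop: A starts in the high byte of a 16-bit
   register pair and is shifted right while B is shifted left.  */
uint16_t fmul_soft(uint8_t a, uint8_t b)
{
    uint16_t c = 0;                       /* C1:C0, cleared on entry */
    uint16_t addend = (uint16_t)(a << 8); /* A1:A0 with A0 cleared */

    while (b != 0) {
        if (b & 0x80)                     /* bit 7 of B set: C += A */
            c = (uint16_t)(c + addend);
        addend >>= 1;                     /* A >>= 1 */
        b = (uint8_t)(b << 1);            /* B <<= 1 */
    }
    return c;
}

/* Read an 8-bit pattern as a two's-complement value.  */
static int sign_extend8(uint8_t v)
{
    return (int)(v ^ 0x80) - 0x80;
}

/* Reference for FMULS: signed product, doubled, truncated to 16 bits.  */
uint16_t fmuls_ref(uint8_t a, uint8_t b)
{
    return (uint16_t)(sign_extend8(a) * sign_extend8(b) * 2);
}

/* Model of __fmuls: multiply the magnitudes, then negate the result
   if the operand signs differ.  */
uint16_t fmuls_soft(uint8_t a, uint8_t b)
{
    uint8_t sign = (uint8_t)(a ^ b);      /* bit 7: negate result? */

    if (b & 0x80)
        b = (uint8_t)(0u - b);            /* B = |B| */
    if (a & 0x80)
        a = (uint8_t)(0u - a);            /* A = |A| */

    uint16_t c = fmul_soft(a, b);
    if (sign & 0x80)
        c = (uint16_t)(0u - c);           /* C = -C */
    return c;
}
```

Looping over all 65536 operand pairs and comparing fmul_soft with fmul_ref
and fmuls_soft with fmuls_ref is a cheap way to sanity-check the model on a
host machine.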

Johann


* doc/extend.texi (AVR Built-in Functions): Update documentation
of __builtin_avr_fmul*.
* config/avr/avr.c (avr_init_builtins): Don't depend on
AVR_HAVE_MUL.
* config/avr/avr-c.c (avr_cpu_cpp_builtins): Ditto.
* config/avr/avr.md (fmul): Rename to fmul_insn.
(fmuls): Rename to fmuls_insn.
(fmulsu): Rename to fmulsu_insn.
(fmul,fmuls,fmulsu): New expander.
(*fmul.call,*fmuls.call,*fmulsu.call): New Insn.
* config/avr/t-avr (LIB1ASMFUNCS): Add _fmul, _fmuls, _fmulsu.
* config/avr/libgcc.S (__fmul): New function.
(__fmuls): New function.
(__fmulsu,__fmulsu_exit): New function.
Index: doc/extend.texi
===
--- doc/extend.texi	(revision 175800)
+++ doc/extend.texi	(working copy)
@@ -8226,8 +8226,8 @@ or if not a specific built-in is impleme
 The following built-in functions map to the respective machine
 instruction, i.e. @code{nop}, @code{sei}, @code{cli}, @code{sleep},
 @code{wdr}, @code{swap}, @code{fmul}, @code{fmuls}
-resp. @code{fmulsu}. The latter three are only available if the AVR
-device actually supports multiplication.
+resp. @code{fmulsu}. The three @code{fmul*} built-ins are implemented
+as library call if no hardware multiplier is available.
 
 @smallexample
 void __builtin_avr_nop (void)
Index: config/avr/libgcc.S
===
--- config/avr/libgcc.S	(revision 175628)
+++ config/avr/libgcc.S	(working copy)
@@ -1417,3 +1417,91 @@ DEFUN __ashldi3
 ret
 ENDF __ashldi3
 #endif /* defined (L_ashldi3) */
+
+
+/***/
+;;; Softmul versions of FMUL, FMULS and FMULSU to implement
+;;; __builtin_avr_fmul* if !AVR_HAVE_MUL
+/***/
+
+#define A1 24
+#define B1 25
+#define C0 22
+#define C1 23
+#define A0 __tmp_reg__
+
+#ifdef L_fmuls
+;;; r23:r22 = fmuls (r24, r25) like in FMULS instruction
+;;; Clobbers: r24, r25, __tmp_reg__
+DEFUN __fmuls
+;; A0.7 = negate result?
+mov  A0, A1
+eor  A0, B1
+;; B1 = |B1|
+sbrc B1, 7
+neg  B1
+XJMP __fmulsu_exit
+ENDF __fmuls
+#endif /* L_fmuls */
+
+#ifdef L_fmulsu
+;;; r23:r22 = fmulsu (r24, r25) like in FMULSU instruction
+;;; Clobbers: r24, r25, __tmp_reg__
+DEFUN __fmulsu
+;; A0.7 = negate result?
+mov  A0, A1
+;; FALLTHRU
+ENDF __fmulsu
+
+;; Helper for __fmuls and __fmulsu
+DEFUN __fmulsu_exit
+;; A1 = |A1|
+sbrc A1, 7
+neg  A1
+#ifdef __AVR_HAVE_JMP_CALL__
+;; Some cores have problem skipping 2-word instruction
+tst  A0
+brmi 1f
+#else
+sbrs A0, 7
+#endif /* __AVR_HAVE_JMP_CALL__ */
+XJMP  __fmul
+1:  XCALL __fmul
+;; C = -C iff A0.7 = 1
+com  C1
+neg  C0
+sbci C1, -1
+ret
+ENDF __fmulsu_exit
+#endif /* L_fmulsu */
+
+
+#ifdef L_fmul
+;;; r22:r23 = fmul (r24, r25) like in FMUL instruction
+;;; Clobbers: r24, r25, __tmp_reg__
+DEFUN __fmul
+; clear result
+clr   C0
+clr   C1
+clr   A0
+1:  tst   B1
+;; 1.0 = 0x80, so test for bit 7 of B to see if A must be added to C.
+2:  brpl  3f
+;; C += A
+add   C0, A0
+adc   C1, A1
+3:  ;; A >>= 1
+lsr   A1
+ror   A0
+;; B <<= 1
+lsl   B1
+brne  2b
+ret
+ENDF __fmul
+#endif /* L_fmul */
+
+#undef A0
+#undef A1
+#undef B1
+#undef C0
+#undef C1
Index: config/avr/avr.md
===
--- config/avr/avr.md	(revision 175628)
+++ config/avr/avr.md	(working copy)
@@ -3394,7 +3394,27 @@ (define_insn "wdr"
(set_attr "cc" "none")])
   
 ;; FMUL
-(define_insn "fmul"
+(define_expand "fmul"
+  [(set (reg:QI 24)
+(match_operand:QI 1 "register_operand" ""))
+   (set (reg:QI 25)
+(match_operand:QI 2 "register_operand" ""))
+   (parallel [(set (reg:HI 22)
+   (unspec:HI [(reg:QI 24)
+   (reg:QI 25)] UNSPEC_FMUL))
+  (clobber (reg:HI 24))])
+   (set (match_operand:HI 0 "register_operand" "")
+(reg:HI 22))]
+  ""
+  {
+if (AVR_HAVE_MUL)
+  {
+emit_insn (gen_fmul_insn (operand0, operand1, operand2));
+DONE;
+  }
+  })
+
+(define_insn "fmul_insn"
   [(set (match_operand:HI 0 "register_operand" "=r")
 (unspec:HI [(match_operand:QI 1 "register_operand" "a")
 (match_operand:QI 2 "register_operand" "a")]
@@ -3406,8 +3426,38 @@ (define_insn "fmul"
   [(set_attr "len

Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]

2011-07-04 Thread Ulrich Weigand
Julian Brown wrote:

> The most awkward change in the patch is to generic code (expmed.c,
> {store,extract}_bit_field_1): in big-endian mode, the existing behaviour
> (when inserting/extracting a bitfield to a memory location) is
> definitely bogus: "unit" is set to BITS_PER_UNIT for memory locations,
> and if bitsize (the size of the field to insert/extract) is greater than
> BITS_PER_UNIT (which isn't unusual at all), xbitpos becomes negative.
> That can't possibly be intentional; I can only assume that this code
> path is not exercised for machines which have memory alternatives for
> bitfield insert/extract, and BITS_BIG_ENDIAN of 0 in BYTES_BIG_ENDIAN
> mode.
[snip]
> @@ -648,7 +648,7 @@ store_bit_field_1 (rtx str_rtx, unsigned HOST_WIDE_INT 
> bitsize,
>/* On big-endian machines, we count bits from the most significant.
>If the bit field insn does not, we must invert.  */
>  
> -  if (BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN)
> +  if (BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN && !MEM_P (xop0))
>   xbitpos = unit - bitsize - xbitpos;

I agree that the current code cannot possibly be correct.  However, just
disabling the BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN renumbering *completely*
seems wrong to me as well.  

According to the docs, the meaning of the bit position passed to the extv/insv
expanders is determined by BITS_BIG_ENDIAN, both in the cases of register
and memory operands.  Therefore, if BITS_BIG_ENDIAN differs from
BYTES_BIG_ENDIAN, we should need a correction for memory operands as
well.  However, this correction needs to be relative to the size of
the access (i.e. the operand to the extv/insv), not just BITS_PER_UNIT.

From looking at the sources, the simplest way to implement that might
be to swap the order of the two corrections, that is, change this:


  /* On big-endian machines, we count bits from the most significant.
 If the bit field insn does not, we must invert.  */

  if (BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN)
xbitpos = unit - bitsize - xbitpos;

  /* We have been counting XBITPOS within UNIT.
 Count instead within the size of the register.  */
  if (BITS_BIG_ENDIAN && !MEM_P (xop0))
xbitpos += GET_MODE_BITSIZE (op_mode) - unit;

  unit = GET_MODE_BITSIZE (op_mode);


to look instead like:

  /* We have been counting XBITPOS within UNIT.
 Count instead within the size of the register.  */
  if (BYTES_BIG_ENDIAN && !MEM_P (xop0))
xbitpos += GET_MODE_BITSIZE (op_mode) - unit;

  unit = GET_MODE_BITSIZE (op_mode);

  /* On big-endian machines, we count bits from the most significant.
 If the bit field insn does not, we must invert.  */

  if (BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN)
xbitpos = unit - bitsize - xbitpos;


(Note that the condition in the first if must then check BYTES_BIG_ENDIAN
instead of BITS_BIG_ENDIAN.)   This change results in unchanged behaviour
for register operands in all cases, and memory operands if BITS_BIG_ENDIAN
== BYTES_BIG_ENDIAN.  For the problematic case of memory operands with
BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN it should result in the appropriate
correction.
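
To make the effect of swapping the two corrections concrete, here is a small
standalone C model of both orderings.  It is only a sketch: the struct and
function names are invented for illustration, the target macros are replaced
by plain parameters, and the 8/16/32/64-bit sizes are assumed example values.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for BITS_BIG_ENDIAN / BYTES_BIG_ENDIAN (hypothetical).  */
struct target {
    bool bits_big_endian;
    bool bytes_big_endian;
};

/* Current order: invert within UNIT first, then rebase register
   operands to the size of the operand mode.  */
static int xbitpos_current(struct target t, bool is_mem, int unit,
                           int mode_bits, int bitsize, int xbitpos)
{
    if (t.bits_big_endian != t.bytes_big_endian)
        xbitpos = unit - bitsize - xbitpos;
    if (t.bits_big_endian && !is_mem)
        xbitpos += mode_bits - unit;
    return xbitpos;
}

/* Proposed order: rebase first (now keyed on BYTES_BIG_ENDIAN), then
   invert within the size of the operand mode.  */
static int xbitpos_proposed(struct target t, bool is_mem, int unit,
                            int mode_bits, int bitsize, int xbitpos)
{
    if (t.bytes_big_endian && !is_mem)
        xbitpos += mode_bits - unit;
    unit = mode_bits;
    if (t.bits_big_endian != t.bytes_big_endian)
        xbitpos = unit - bitsize - xbitpos;
    return xbitpos;
}

static void check_bitpos_orderings(void)
{
    struct target mixed = { false, true };

    /* Memory operand, unit = 8, 16-bit field at position 0, 32-bit
       access: the current order goes negative, the proposed one does not.  */
    assert(xbitpos_current(mixed, true, 8, 32, 16, 0) == -8);
    assert(xbitpos_proposed(mixed, true, 8, 32, 16, 0) == 16);

    /* Register operands: both orders agree for every macro combination.  */
    for (int bits = 0; bits < 2; bits++)
        for (int bytes = 0; bytes < 2; bytes++)
            for (int pos = 0; pos <= 24; pos++) {
                struct target t = { bits != 0, bytes != 0 };
                assert(xbitpos_current(t, false, 32, 64, 8, pos)
                       == xbitpos_proposed(t, false, 32, 64, 8, pos));
            }
}
```

In this model the problematic memory case moves from -8 to 16, while the
register cases and the memory cases with BITS_BIG_ENDIAN == BYTES_BIG_ENDIAN
come out the same under both orders.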

Note that with that change, the new code your patch introduces to the
ARM back-end will also need to change.  You currently handle bitpos
like this:

  base_addr = adjust_address (operands[1], HImode,
  bitpos / BITS_PER_UNIT);

This implicitly assumes that bitpos counts according to BYTES_BIG_ENDIAN,
not BITS_BIG_ENDIAN -- which exactly cancels out the common code behaviour
introduced by your patch ...

Thoughts?  Am I overlooking something here?

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com


[PATCH] Fix PR49518

2011-07-04 Thread Richard Guenther

Handling of negative steps broke one of the many asserts in
the vectorizer.  The following patch drops one that I can't
make sense of.  I think all asserts need comments - especially
this one would, as I can't see why using vf is correct to
test against and not nelements (and why <= vf and not < vf).

Well, ok?

Thanks,
Richard.

2011-07-04  Richard Guenther  

PR tree-optimization/49518
* tree-vect-data-refs.c (vect_enhance_data_refs_alignment):
Drop assert.

* gcc.dg/torture/pr49518.c: New testcase.

Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   (revision 175800)
+++ gcc/tree-vect-data-refs.c   (working copy)
@@ -1552,7 +1552,6 @@ vect_enhance_data_refs_alignment (loop_v
 
   for (j = 0; j < possible_npeel_number; j++)
 {
-  gcc_assert (npeel_tmp <= vf);
   vect_peeling_hash_insert (loop_vinfo, dr, npeel_tmp);
   npeel_tmp += nelements;
 }
Index: gcc/testsuite/gcc.dg/torture/pr49518.c
===
--- gcc/testsuite/gcc.dg/torture/pr49518.c  (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr49518.c  (revision 0)
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+
+int a, b;
+struct S { unsigned int s, t, u; } c, d = { 0, 1, 0 };
+
+void
+test (unsigned char z)
+{
+  char e[] = {0, 0, 0, 0, 1};
+  for (c.s = 1; c.s; c.s++)
+{
+  b = e[c.s];
+  if (a)
+   break;
+  b = z >= c.u;
+  if (d.t)
+   break;
+}
+}


Re: Ping #1: [testsuite, AVR]: Add some progmem test cases

2011-07-04 Thread Denis Chertykov
2011/7/4 Georg-Johann Lay :
> Georg-Johann Lay wrote:
>> Some runtime tests and checks for errors/warnings for C/C++.
>
> Note that some tests fail because of pending
>
> http://gcc.gnu.org/ml/gcc-patches/2011-06/msg02318.html
>
> Johann
>
> testsuite/
>        * gcc.target/avr/avr.exp: Run over cpp files, too.
>        * gcc.target/avr/torture/avr-torture.exp: Ditto.
>        * gcc.target/avr/progmem.h: New file.
>        * gcc.target/avr/exit-abort.h: New file.
>        * gcc.target/avr/progmem-error-1.c: New file.
>        * gcc.target/avr/progmem-error-1.cpp: New file.
>        * gcc.target/avr/progmem-warning-1.c: New file.
>        * gcc.target/avr/torture/progmem-1.c: New file.
>        * gcc.target/avr/torture/progmem-1.cpp: New file.

I don't know who must approve tests.
If it is me, then approved.

Denis.


[PATCH] Fix bootstrap on OpenBSD, PR48851

2011-07-04 Thread Richard Guenther

It happens that OpenBSD suffers from a bogus fixinclude that changes
its perfectly valid NULL define from (void *)0 to 0.  The fix itself
appears to be very old and is completely bogus - it replaces
(void *)0 with 0 under the assumption the former is invalid for C++ - 
which is true - but 0 is inappropriate for C which is much worse.
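
The message does not spell out why a plain 0 is worse for C.  One commonly
cited reason, assumed here rather than taken from the PR, is variadic calls:
a sentinel written as NULL is fetched as a pointer, so NULL has to have
pointer type and size.  The sketch below uses two hypothetical macros in
place of the two NULL definitions and a made-up helper, count_strings.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical stand-ins for the original and the "fixed" definition.  */
#define NULL_AS_POINTER ((void *)0)
#define NULL_AS_INT 0

/* Count the strings in a list terminated by a null pointer.  Each
   argument is read back as a pointer, so the sentinel must be one.  */
static size_t count_strings(const char *first, ...)
{
    size_t n = 0;
    va_list ap;

    va_start(ap, first);
    for (const char *s = first; s != NULL; s = va_arg(ap, const char *))
        n++;
    va_end(ap);
    return n;
}

static void check_null_sentinel(void)
{
    /* The pointer form always has pointer size; the integer form has
       the size of int, which is smaller on LP64 targets.  */
    assert(sizeof(NULL_AS_POINTER) == sizeof(void *));
    assert(sizeof(NULL_AS_INT) == sizeof(int));

    /* Well-defined: the sentinel is passed as a pointer.  */
    assert(count_strings("a", "b", NULL_AS_POINTER) == 2);
}
```

Passing NULL_AS_INT as the sentinel instead would hand an int to va_arg
where a pointer is read, which is undefined behaviour; that call is
deliberately left out of the sketch.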

Thus, I propose to remove the fix altogether.  Platform maintainers
can arrange for a new fix if the platforms still need fixing (which
I seriously doubt after so many years and platform obsoletion).

This restores bootstrap on OpenBSD.

Ok for trunk and active branches?

Thanks,
Richard.

2011-07-04  Richard Guenther  

PR bootstrap/48851
* inclhack.def (void_null): Remove bogus fix.
* fixincl.x: Regenerated.

Index: fixincludes/inclhack.def
===
--- fixincludes/inclhack.def(revision 175800)
+++ fixincludes/inclhack.def(working copy)
@@ -4399,32 +4399,6 @@ fix = {
 
 
 /*
- *  AIX and Interix headers define NULL to be cast to a void pointer,
- *  which is illegal in ANSI C++.
- */
-fix = {
-hackname  = void_null;
-files = curses.h;
-files = dbm.h;
-files = locale.h;
-files = stdio.h;
-files = stdlib.h;
-files = string.h;
-files = time.h;
-files = unistd.h;
-files = sys/dir.h;
-files = sys/param.h;
-files = sys/types.h;
-/* avoid changing C++ friendly NULL */
-bypass= __cplusplus;
-select= "^#[ \t]*define[ \t]+NULL[ \t]+\\(\\(void[ \t]*\\*\\)0\\)";
-c_fix = format;
-c_fix_arg = "#define NULL 0";
-test_text = "# define\tNULL \t((void *)0)  /* typed NULL */";
-};
-
-
-/*
  *  Make VxWorks header which is almost gcc ready fully gcc ready.
  */
 fix = {
Index: fixincludes/fixincl.x
===
--- fixincludes/fixincl.x   (revision 175800)
+++ fixincludes/fixincl.x   (working copy)
@@ -2,11 +2,11 @@
  * 
  * DO NOT EDIT THIS FILE   (fixincl.x)
  * 
- * It has been AutoGen-ed  Sunday June  5, 2011 at 09:04:54 PM CDT
+ * It has been AutoGen-ed  Monday July  4, 2011 at 12:59:38 PM CEST
  * From the definitionsinclhack.def
  * and the template file   fixincl
  */
-/* DO NOT SVN-MERGE THIS FILE, EITHER Sun Jun  5 21:04:54 CDT 2011
+/* DO NOT SVN-MERGE THIS FILE, EITHER Mon Jul  4 12:59:38 CEST 2011
  *
  * You must regenerate it.  Use the ./genfixes script.
  *
@@ -15,7 +15,7 @@
  * certain ANSI-incompatible system header files which are fixed to work
  * correctly with ANSI C and placed in a directory that GNU C will search.
  *
- * This file contains 211 fixup descriptions.
+ * This file contains 210 fixup descriptions.
  *
  * See README for more information.
  *
@@ -8199,48 +8199,6 @@ static const char* apzVa_I960_MacroPatch
 
 /* * * * * * * * * * * * * * * * * * * * * * * * * *
  *
- *  Description of Void_Null fix
- */
-tSCC zVoid_NullName[] =
- "void_null";
-
-/*
- *  File name selection pattern
- */
-tSCC zVoid_NullList[] =
-  
"curses.h\0dbm.h\0locale.h\0stdio.h\0stdlib.h\0string.h\0time.h\0unistd.h\0sys/dir.h\0sys/param.h\0sys/types.h\0";
-/*
- *  Machine/OS name selection pattern
- */
-#define apzVoid_NullMachs (const char**)NULL
-
-/*
- *  content selection pattern - do fix if pattern found
- */
-tSCC zVoid_NullSelect0[] =
-   "^#[ \t]*define[ \t]+NULL[ \t]+\\(\\(void[ \t]*\\*\\)0\\)";
-
-/*
- *  content bypass pattern - skip fix if pattern found
- */
-tSCC zVoid_NullBypass0[] =
-   "__cplusplus";
-
-#defineVOID_NULL_TEST_CT  2
-static tTestDesc aVoid_NullTests[] = {
-  { TT_NEGREP,   zVoid_NullBypass0, (regex_t*)NULL },
-  { TT_EGREP,zVoid_NullSelect0, (regex_t*)NULL }, };
-
-/*
- *  Fix Command Arguments for Void_Null
- */
-static const char* apzVoid_NullPatch[] = {
-"format",
-"#define NULL 0",
-(char*)NULL };
-
-/* * * * * * * * * * * * * * * * * * * * * * * * * *
- *
  *  Description of Vxworks_Gcc_Problem fix
  */
 tSCC zVxworks_Gcc_ProblemName[] =
@@ -8591,9 +8549,9 @@ static const char* apzX11_SprintfPatch[]
  *
  *  List of all fixes
  */
-#define REGEX_COUNT  250
+#define REGEX_COUNT  248
 #define MACH_LIST_SIZE_LIMIT 181
-#define FIX_COUNT211
+#define FIX_COUNT210
 
 /*
  *  Enumerate the fixes
@@ -8801,7 +8759,6 @@ typedef enum {
 ULTRIX_CONST_FIXIDX,
 ULTRIX_CONST2_FIXIDX,
 VA_I960_MACRO_FIXIDX,
-VOID_NULL_FIXIDX,
 VXWORKS_GCC_PROBLEM_FIXIDX,
 VXWORKS_NEEDS_VXTYPES_FIXIDX,
 VXWORKS_NEEDS_VXWORKS_FIXIDX,
@@ -9823,11 +9780,6 @@ tFixDesc fixDescList[ FIX_COUNT ] = {
  VA_I960_MACRO_TEST_CT, FD_MACH_ONLY | FD_SUBROUTINE,
  aVa_I960_MacroTests,   apzVa_I960_MacroPatch, 0 },
 
-  {  zVoid_NullName,zVoid_NullList,
- apzVoid_NullMachs,
- VOID_NULL_TEST_CT, FD_MACH_ONLY | FD_SUBROUTINE,
- aVoid_NullTests,   apzVoid_NullPatch, 0 },
-
   {  zVxworks_Gcc_ProblemNa

Re: [wwwdocs] Document IRIX 6.5, Tru64 UNIX V5.1 obsoletion

2011-07-04 Thread Gerald Pfeifer
On Fri, 1 Jul 2011, Rainer Orth wrote:
> I don't need approval for the patch, but would be grateful for
> improvements to wording.

I find it quite clear, thanks.  If you'd like, "is not" instead of
"isn't" is the only suggestion I found.

Gerald


Re: [Ada] Fix parallel LTO bootstrap

2011-07-04 Thread Eric Botcazou
> The change is obviously correct, but I wonder how the bootstrap dies w/o
> '+'. It should IMO just prevent the parallelism and take longer.

Same cryptic error as PR driver/46750.  

-- 
Eric Botcazou


Ping #1: [Patch, AVR, 4.6+trunk]: PR44643 addendum

2011-07-04 Thread Georg-Johann Lay
Georg-Johann Lay wrote:

http://gcc.gnu.org/ml/gcc-patches/2011-06/msg02318.html

> avr_insert_attributes uses TREE_READONLY to get readonlyness of node.
> 
> That does not work for C++ arrays: it gives false error
> "variable must be const in order to be put into read-only section by
> means of '__attribute__((progmem))'".
> 
> This patch peels arrays and uses TYPE_READONLY.
> 
> I did not open separate PR for this, tagged it as addendum to PR44643
> instead.
> 
> Lightly tested on own code. There is no 'progmem' in testsuite, so
> from testsuite's perspective that code is dead, anyway...
> 
> Johann
> 
>   PR target/44643
>   * config/avr/avr.c (avr_insert_attributes): Use TYPE_READONLY
>   instead of TREE_READONLY.
> 



Ping #1: [testsuite, AVR]: Add some progmem test cases

2011-07-04 Thread Georg-Johann Lay
Georg-Johann Lay wrote:
> Some runtime tests and checks for errors/warnings for C/C++.

Note that some tests fail because of pending

http://gcc.gnu.org/ml/gcc-patches/2011-06/msg02318.html

Johann

testsuite/
* gcc.target/avr/avr.exp: Run over cpp files, too.
* gcc.target/avr/torture/avr-torture.exp: Ditto.
* gcc.target/avr/progmem.h: New file.
* gcc.target/avr/exit-abort.h: New file.
* gcc.target/avr/progmem-error-1.c: New file.
* gcc.target/avr/progmem-error-1.cpp: New file.
* gcc.target/avr/progmem-warning-1.c: New file.
* gcc.target/avr/torture/progmem-1.c: New file.
* gcc.target/avr/torture/progmem-1.cpp: New file.

Index: gcc.target/avr/avr.exp
===
--- gcc.target/avr/avr.exp	(revision 175628)
+++ gcc.target/avr/avr.exp	(working copy)
@@ -34,7 +34,7 @@ if ![info exists DEFAULT_CFLAGS] then {
 dg-init
 
 # Main loop.
-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cCS\]]] \
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.{\[cCS\],cpp}]] \
 	"" $DEFAULT_CFLAGS
 
 # All done.
Index: gcc.target/avr/torture/avr-torture.exp
===
--- gcc.target/avr/torture/avr-torture.exp	(revision 175628)
+++ gcc.target/avr/torture/avr-torture.exp	(working copy)
@@ -52,7 +52,7 @@ set-torture-options $AVR_TORTURE_OPTIONS
 
 
 # Main loop.
-gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cS\]]] $DEFAULT_CFLAGS
+gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.{\[cS\],cpp}]] $DEFAULT_CFLAGS
 
 # Finalize use of torture lists.
 torture-finish
Index: gcc.target/avr/torture/progmem-1.c
===
--- gcc.target/avr/torture/progmem-1.c	(revision 0)
+++ gcc.target/avr/torture/progmem-1.c	(revision 0)
@@ -0,0 +1,30 @@
+/* { dg-do run } */
+
+#include "../exit-abort.h"
+#include "../progmem.h"
+
+const char strA[] PROGMEM = "@A";
+const char strc PROGMEM = 'c';
+
+unsigned int volatile s = 2;
+
+int main()
+{
+char c;
+
+c = pgm_read_char (&strA[s-1]);
+if (c != 'A')
+abort();
+
+c = pgm_read_char (&PSTR ("@@B")[s]);
+if (c != 'B')
+abort();
+
+c = pgm_read_char (&strc);
+if (c != 'c')
+abort();
+
+exit (0);
+
+return 0;
+}
Index: gcc.target/avr/torture/progmem-1.cpp
===
--- gcc.target/avr/torture/progmem-1.cpp	(revision 0)
+++ gcc.target/avr/torture/progmem-1.cpp	(revision 0)
@@ -0,0 +1,2 @@
+/* { dg-do run } */
+#include "progmem-1.c"
Index: gcc.target/avr/exit-abort.h
===
--- gcc.target/avr/exit-abort.h	(revision 0)
+++ gcc.target/avr/exit-abort.h	(revision 0)
@@ -0,0 +1,8 @@
+#ifdef __cplusplus
+extern "C" {
+#endif
+  extern void exit (int);
+  extern void abort (void);
+#ifdef __cplusplus
+}
+#endif
Index: gcc.target/avr/progmem-warning-1.c
===
--- gcc.target/avr/progmem-warning-1.c	(revision 0)
+++ gcc.target/avr/progmem-warning-1.c	(revision 0)
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-Wuninitialized" } */
+
+#include "progmem.h"
+
+const char c PROGMEM; /* { dg-warning "uninitialized variable 'c' put into program memory area" } */
Index: gcc.target/avr/progmem-error-1.c
===
--- gcc.target/avr/progmem-error-1.c	(revision 0)
+++ gcc.target/avr/progmem-error-1.c	(revision 0)
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+
+#include "progmem.h"
+
+char str[] PROGMEM = "Hallo"; /* { dg-error "must be const" } */
Index: gcc.target/avr/progmem-error-1.cpp
===
--- gcc.target/avr/progmem-error-1.cpp	(revision 0)
+++ gcc.target/avr/progmem-error-1.cpp	(revision 0)
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+
+#include "progmem.h"
+
+char str[] PROGMEM = "Hallo"; /* { dg-error "must be const" } */
Index: gcc.target/avr/progmem.h
===
--- gcc.target/avr/progmem.h	(revision 0)
+++ gcc.target/avr/progmem.h	(revision 0)
@@ -0,0 +1,14 @@
+#define PROGMEM __attribute__((progmem))
+
+#define PSTR(s) \
+(__extension__({\
+static const char __c[] PROGMEM = (s);  \
+&__c[0];}))
+
+#define pgm_read_char(addr) \
+(__extension__({\
+unsigned int __addr16 = (unsigned int)(addr);   \
+char __result;  \
+__asm__ ("lpm %0, %a1"  \
+ : "=r" (__result) : "z" (__addr16));   \
+__result; }))


Re: PATCH: PR target/49600: Bad SSE2 int->float split in i386.md

2011-07-04 Thread Uros Bizjak
On Mon, Jul 4, 2011 at 7:13 AM, H.J. Lu  wrote:

 In one SSE2 int->float split, when TARGET_USE_VECTOR_CONVERTS is true,
 TARGET_INTER_UNIT_MOVES is false and GENERAL_REG_P (op1) is true, we
 will get gcc_unreachable.  This patch removes TARGET_INTER_UNIT_MOVES
 check.  OK for trunk?
>>>
>>> This will result in register allocation failure. Operand 0 of
>
> That particular sse2_loadld insn matches:
>
> (insn 49 22 50 5 (set (reg:V4SI 21 xmm0 [83])
>        (vec_merge:V4SI (vec_duplicate:V4SI (reg/v:SI 1 dx [orig:64
> test ] [64]))
>            (const_vector:V4SI [
>                    (const_int 0 [0])
>                    (const_int 0 [0])
>                    (const_int 0 [0])
>                    (const_int 0 [0])
>                ])
>            (const_int 1 [0x1]))) x.i:11 1365 {vec_setv4si_0}
>     (nil))
>

Yes, but it should not be generated for !TARGET_INTER_UNIT_MOVES. The
constraint should be Yi, but then we don't shadow other alternatives
correctly.

>>> sse2_loadld pattern has conditional constraint Yi that depends on
>>> TARGET_INTER_UNIT_MOVES, so we can't blindly generate sse2_loadld
>>> after reload.  I'm testing attached patch.
>>>
>>> BTW: Do you perhaps have a testcase for this problem?
>>
>> I have a testcase. But it needs a new x86 optimization we are working on.
>>
>>> 2011-07-03  Uros Bizjak  
>>>
>>>        PR target/49600
>>>        * config/i386/i386.md (SSE2 int->float split): Push operand 1 in
>>>        general register to memory for !TARGET_INTER_UNIT_MOVES.
>>>
>>
>> I will give it a try.
>>
>
> It doesn't work: I still got

Yes, I later noticed that I have changed the wrong pattern (the one
with memory clobber) ;( . Attached is the correct patch.
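
As a rough illustration of what the corrected split decides, here is a toy
C model of the control flow before and after the patch.  The enums and
function names are invented for this sketch, and the SSE-register branch is
an assumption about the surrounding code that the hunk only hints at.

```c
#include <assert.h>
#include <stdbool.h>

/* Where operand 1 lives (toy classification, not GCC's).  */
enum operand_kind { OP_GENERAL_REG, OP_SSE_REG, OP_OTHER };

/* What the split ends up doing.  */
enum split_action {
    LOAD_DIRECT,      /* sse2_loadld straight from the general register */
    LOAD_VIA_MEMORY,  /* push to memory, then sse2_loadld from there */
    REUSE_SSE_REG,    /* assumed: operand is already in an SSE register */
    HIT_UNREACHABLE   /* falls through to gcc_unreachable */
};

/* Before: a general register is only handled with inter-unit moves.  */
static enum split_action split_before(enum operand_kind op1,
                                      bool inter_unit_moves)
{
    if (op1 == OP_GENERAL_REG && inter_unit_moves)
        return LOAD_DIRECT;
    if (op1 == OP_SSE_REG)
        return REUSE_SSE_REG;
    return HIT_UNREACHABLE;
}

/* After: a general register is always handled, going through memory
   when inter-unit moves are not available.  */
static enum split_action split_after(enum operand_kind op1,
                                     bool inter_unit_moves)
{
    if (op1 == OP_GENERAL_REG)
        return inter_unit_moves ? LOAD_DIRECT : LOAD_VIA_MEMORY;
    if (op1 == OP_SSE_REG)
        return REUSE_SSE_REG;
    return HIT_UNREACHABLE;
}

static void check_split_model(void)
{
    /* The reported failure: general register, no inter-unit moves.  */
    assert(split_before(OP_GENERAL_REG, false) == HIT_UNREACHABLE);
    assert(split_after(OP_GENERAL_REG, false) == LOAD_VIA_MEMORY);

    /* The existing case is unchanged.  */
    assert(split_before(OP_GENERAL_REG, true)
           == split_after(OP_GENERAL_REG, true));
}
```

The only case whose outcome differs between the two functions is the one
from the report, a general register without inter-unit moves.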

Uros.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 175786)
+++ config/i386/i386.md (working copy)
@@ -5022,11 +5022,20 @@
   if (GET_CODE (op1) == SUBREG)
 op1 = SUBREG_REG (op1);
 
-  if (GENERAL_REG_P (op1) && TARGET_INTER_UNIT_MOVES)
+  if (GENERAL_REG_P (op1))
 {
   operands[4] = simplify_gen_subreg (V4SImode, operands[0], mode, 0);
-  emit_insn (gen_sse2_loadld (operands[4],
- CONST0_RTX (V4SImode), operands[1]));
+  if (TARGET_INTER_UNIT_MOVES)
+   emit_insn (gen_sse2_loadld (operands[4],
+   CONST0_RTX (V4SImode), operands[1]));
+  else
+   {
+ operands[5] = ix86_force_to_memory (GET_MODE (operands[1]),
+ operands[1]);
+ emit_insn (gen_sse2_loadld (operands[4],
+ CONST0_RTX (V4SImode), operands[5]));
+ ix86_free_from_memory (GET_MODE (operands[1]));
+   }
 }
   /* We can ignore possible trapping value in the
  high part of SSE register for non-trapping math. */


Re: [Ada] Fix parallel LTO bootstrap

2011-07-04 Thread Jan Hubicka
> Not clear why this never showed up on the 4.6 branch, but this now prevents a 
> parallel LTO bootstrap with Ada enabled from completing on the mainline.
> 
> Parallel LTO-bootstrapped, applied on the mainline and 4.6 branch.
> 
> 
> 2011-07-01  Eric Botcazou  
> 
>   * gcc-interface/Make-lang.in (gnat1): Prepend '+' to the command.
>   (gnatbind): Likewise.

The change is obviously correct, but I wonder how the bootstrap dies w/o '+'.
It should IMO just prevent the parallelism and take longer.

Honza


Re: [testsuite] ARM test pr42093.c: thumb2 or thumb1

2011-07-04 Thread Richard Earnshaw
On 01/07/11 20:56, Janis Johnson wrote:
> On 07/01/2011 02:02 AM, Richard Earnshaw wrote:
>> On 24/06/11 14:18, Ramana Radhakrishnan wrote:
>>> On 24/06/11 01:40, Janis Johnson wrote:
 Test gcc.target/arm/pr42093.c, added by Ramana, requires support for
 arm_thumb2 but fails for those targets.  The patch for which it was
 added modified support for thumb1.  Should the test instead require
 arm_thumb1_ok, as in this patch?
>>>
>>> No this is for a Thumb2 defect so the test is valid for Thumb2 - we 
>>> shouldn't be generating a tbb / tbh with signed offsets and that's what 
>>> was happening there.
>>>
>>> This test I think ends up being fragile because the generation of tbb / 
>>> tbh depends on how the blocks have been laid out . It would be 
>>> interesting to try and get a test that works reliably in T2 .
>>>
>>> cheers
>>> Ramana
>>>

 Janis
>>>
>>>
>>>
>> Perhaps -fno-reorder-blocks could be used to make it less fragile.
>>
>> R.
>>
> 
> It passes for all thumb2 targets with that option.
> 
> Janis
> 
> 
> 

Ok, so consider a patch to use that option pre-approved.

R.



Re: [PATCH, PR 49495] Cgraph verifier must look through aliases

2011-07-04 Thread Jan Hubicka
> Hi,
> 
> PR 49495 is actually a bug in the verifier that does not look through
> aliases at one point.  Fixed with the patch below (created a special
> function, otherwise I just wasn't able to fit the 80 column limit).
> Bootstrapped and tested on x86_64-linux.  OK for trunk?
> 
> Thanks,
> 
> Martin
> 
> 
> 2011-07-02  Martin Jambor  
> 
>   PR middle-end/49495
>   * cgraphunit.c (verify_edge_corresponds_to_fndecl): New function.
>   (verify_cgraph_node): Some functionality moved to
>   verify_edge_corresponds_to_fndecl, call it.

This is OK.
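
For anyone skimming the thread, a toy C sketch of what "looking through
aliases" means for such a check follows.  It does not use the real cgraph
data structures: struct toy_node and both functions are hypothetical, and
the real verifier has further conditions (clones, inlining) that are left
out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy call graph node: an alias points at the node it stands for.  */
struct toy_node {
    const char *name;
    struct toy_node *alias_target;  /* NULL for a real function */
};

/* Follow the alias chain to the node that is finally called.  */
static struct toy_node *ultimate_target(struct toy_node *n)
{
    while (n != NULL && n->alias_target != NULL)
        n = n->alias_target;
    return n;
}

/* An edge matches a declaration if both resolve to the same node.  */
static bool edge_matches_decl(struct toy_node *callee,
                              struct toy_node *decl_node)
{
    return ultimate_target(callee) == ultimate_target(decl_node);
}

static void check_alias_lookup(void)
{
    struct toy_node foo = { "foo", NULL };
    struct toy_node foo_alias = { "foo_alias", &foo };
    struct toy_node bar = { "bar", NULL };

    /* A direct pointer comparison would call this a mismatch ...  */
    assert(&foo != &foo_alias);
    /* ... while the alias-aware comparison accepts it.  */
    assert(edge_matches_decl(&foo, &foo_alias));
    /* A genuinely different function is still rejected.  */
    assert(!edge_matches_decl(&bar, &foo_alias));
}
```

Comparing the resolved targets on both sides is what stops a call through
an alias from being flagged as pointing to the wrong declaration.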
> 
> 
> Index: src/gcc/cgraphunit.c
> ===
> --- src.orig/gcc/cgraphunit.c
> +++ src/gcc/cgraphunit.c
> @@ -450,6 +450,34 @@ cgraph_debug_gimple_stmt (struct functio
>debug_gimple_stmt (stmt);
>  }
>  
> +/* Verify that call graph edge E corresponds to DECL from the associated
> +   statement.  Return true if the verification should fail.  */
> +
> +static bool
> +verify_edge_corresponds_to_fndecl (struct cgraph_edge *e, tree decl)
> +{
> +  if (!e->callee->global.inlined_to
> +  && decl
> +  && cgraph_get_node (decl)
> +  && (e->callee->former_clone_of
> +   != cgraph_function_or_thunk_node (cgraph_get_node (decl), NULL)->decl)
> +  /* IPA-CP sometimes redirect edge to clone and then back to the former
> +  function.  This ping-pong has to go, eventaully.  */
> +  && (cgraph_function_or_thunk_node (cgraph_get_node (decl), NULL)
> +   != cgraph_function_or_thunk_node (e->callee, NULL))
> +  && !clone_of_p (cgraph_get_node (decl),
> +   e->callee))
> +{
> +  error ("edge points to wrong declaration:");
> +  debug_tree (e->callee->decl);
> +  fprintf (stderr," Instead of:");
> +  debug_tree (decl);
> +  return true;
> +}
> +  else
> +return false;
> +}
> +
>  /* Verify cgraph nodes of given cgraph node.  */
>  DEBUG_FUNCTION void
>  verify_cgraph_node (struct cgraph_node *node)
> @@ -702,24 +730,8 @@ verify_cgraph_node (struct cgraph_node *
> }
>   if (!e->indirect_unknown_callee)
> {
> - if (!e->callee->global.inlined_to
> - && decl
> - && cgraph_get_node (decl)
> - && (e->callee->former_clone_of
> - != cgraph_get_node (decl)->decl)
> - /* IPA-CP sometimes redirect edge to clone and 
> then back to the former
> -function.  This ping-pong has to go, 
> eventaully.  */
> - && (cgraph_function_or_thunk_node 
> (cgraph_get_node (decl), NULL)
> - != cgraph_function_or_thunk_node 
> (e->callee, NULL))
> - && !clone_of_p (cgraph_get_node (decl),
> - e->callee))
> -   {
> - error ("edge points to wrong declaration:");
> - debug_tree (e->callee->decl);
> - fprintf (stderr," Instead of:");
> - debug_tree (decl);
> - error_found = true;
> -   }
> + if (verify_edge_corresponds_to_fndecl (e, decl))
> +   error_found = true;
Could you please move the error output here?  Somehow I like it better when
all the diagnostics are output in a single place...

Honza
> }
>   else if (decl)
> {