Re: RFD: annotate iterator patterns with expanded forms

2016-04-25 Thread Hans-Peter Nilsson
On Mon, 25 Apr 2016, Bernd Schmidt wrote:
> Now that we're in stage1, I thought I'd bring this up again. For reference,
> the patch was here:
>   https://gcc.gnu.org/ml/gcc-patches/2015-12/msg00165.html
>
> So, would you like this for cris and mmix? I could enable it for these, then
> we'd need someone to review/approve the generator parts. I'm still hoping the
> x86 maintainers would also consider it.

Thanks, but I'll only have a chance to look at any of this, including
current and 6-branch breakage, in two weeks at the earliest.

brgds, H-P


Re: [DOC Patch] Add sample for @cc constraint

2016-04-25 Thread David Wohlferd

On 4/25/2016 2:51 AM, Bernd Schmidt wrote:
> On 04/16/2016 01:12 AM, David Wohlferd wrote:
>> There were basically 3 changes I was trying for in that doc patch. Are
>> any of them worth keeping?  Or are we done?
>>
>> 1) "Do not clobber flags if they are being used as outputs."
>> 2) Output flags sample (with #if removed).
>> 3) "On the x86 platform, flags are always treated as clobbered by
>> extended asm whether @code{"cc"} is specified or not."
>>
>> I'm prepared to send an updated patch if there's anything here that
>> might get approved.
>
> I think the updated flags sample would be nice to have.


Attached.

dw
Index: extend.texi
===
--- extend.texi	(revision 235054)
+++ extend.texi	(working copy)
@@ -8135,6 +8135,26 @@
 ``not'' @var{flag}, or inverted versions of those above
 @end table
 
+This example uses the @code{bt} instruction (which sets the carry flag) to
+see if bit 0 of an integer is set.  To see the improvement in the generated
+output, make sure optimizations are enabled.
+
+@example
+void TestEven (int value)
+@{
+  char CarryIsSet;
+
+  asm ("bt $0, %[value]"
+: "=@@ccc" (CarryIsSet)
+: [value] "rm" (value));
+
+  if (CarryIsSet)
+printf ("odd\n");
+  else
+printf ("even\n");
+@}
+@end example
+
 @end table
 
 @anchor{InputOperands}
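
For anyone wanting to try the sample outside the manual, here is a minimal standalone sketch. It assumes an x86 target and a compiler that defines __GCC_ASM_FLAG_OUTPUTS__ (the macro the GCC manual gives for detecting flag output support); everywhere else it falls back to plain C. The name is_odd, the btl spelling and the "r" constraint are choices made for this illustration, not part of the patch.

```c
/* Return 1 if bit 0 of VALUE is set, 0 otherwise.  */
int
is_odd (int value)
{
#if defined (__GCC_ASM_FLAG_OUTPUTS__) \
    && (defined (__i386__) || defined (__x86_64__))
  char carry_is_set;

  /* bt copies the selected bit into the carry flag; "=@ccc" asks the
     compiler to use the carry flag itself as the output operand.  */
  __asm__ ("btl $0, %[value]"
           : "=@ccc" (carry_is_set)
           : [value] "r" (value));
  return carry_is_set != 0;
#else
  /* Fallback for targets without x86 flag output operands.  */
  return value & 1;
#endif
}
```

Built with optimization, the flag-output form should let the compiler branch on the carry flag directly rather than first copying it into a register, which is the improvement the doc text refers to.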


[PATCH, i386]: Small improvements to move patterns

2016-04-25 Thread Uros Bizjak
No functional changes.

2016-04-25  Uros Bizjak  

* config/i386/i386.md (*movxi_internal_avx512f): Use insn type
attribute instead of which_alternative.
* config/i386/sse.md (*mov_internal): Ditto.
Use EXT_REX_SSE_REG_P where appropriate.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline.

Uros.
Index: i386.md
===
--- i386.md (revision 235416)
+++ i386.md (working copy)
@@ -1975,17 +1975,18 @@
&& (register_operand (operands[0], XImode)
|| register_operand (operands[1], XImode))"
 {
-  switch (which_alternative)
+  switch (get_attr_type (insn))
 {
-case 0:
+case TYPE_SSELOG1:
   return standard_sse_constant_opcode (insn, operands[1]);
-case 1:
-case 2:
+
+case TYPE_SSEMOV:
   if (misaligned_operand (operands[0], XImode)
  || misaligned_operand (operands[1], XImode))
return "vmovdqu32\t{%1, %0|%0, %1}";
   else
return "vmovdqa32\t{%1, %0|%0, %1}";
+
 default:
   gcc_unreachable ();
 }
Index: sse.md
===
--- sse.md  (revision 235416)
+++ sse.md  (working copy)
@@ -839,19 +839,18 @@
&& (register_operand (operands[0], mode)
|| register_operand (operands[1], mode))"
 {
-  int mode = get_attr_mode (insn);
-  switch (which_alternative)
+  switch (get_attr_type (insn))
 {
-case 0:
+case TYPE_SSELOG1:
   return standard_sse_constant_opcode (insn, operands[1]);
-case 1:
-case 2:
+
+case TYPE_SSEMOV:
   /* There is no evex-encoded vmov* for sizes smaller than 64-bytes
 in avx512f, so we need to use workarounds, to access sse registers
 16-31, which are evex-only. In avx512vl we don't need workarounds.  */
   if (TARGET_AVX512F && <MODE_SIZE> < 64 && !TARGET_AVX512VL
- && ((REG_P (operands[0]) && EXT_REX_SSE_REGNO_P (REGNO (operands[0])))
- || (REG_P (operands[1]) && EXT_REX_SSE_REGNO_P (REGNO (operands[1])
+ && (EXT_REX_SSE_REG_P (operands[0])
+ || EXT_REX_SSE_REG_P (operands[1])))
{
  if (memory_operand (operands[0], mode))
{
@@ -873,7 +872,7 @@
}
  else
/* Reg -> reg move is always aligned.  Just use wider move.  */
-   switch (mode)
+   switch (get_attr_mode (insn))
  {
  case MODE_V8SF:
  case MODE_V4SF:
@@ -888,7 +887,8 @@
gcc_unreachable ();
  }
}
-  switch (mode)
+
+  switch (get_attr_mode (insn))
{
case MODE_V16SF:
case MODE_V8SF:
@@ -931,6 +931,7 @@
default:
  gcc_unreachable ();
}
+
 default:
   gcc_unreachable ();
 }


Re: RFD: annotate iterator patterns with expanded forms

2016-04-25 Thread Bernd Schmidt

On 01/01/2016 07:02 PM, Hans-Peter Nilsson wrote:
> On Tue, 1 Dec 2015, Bernd Schmidt wrote:
>> The automatic Makefile approach might look something like this. The effect is
>> similar to what happens when you edit tm.texi.in, except the build would not
>> be interrupted every time, only when you modify the iterator expansion of a
>> pattern. There's a new rtx code which can be put into a machine description to
>> enable this feature.
>
> (No-one else chimed in, so:)
>
> I really like this!


Now that we're in stage1, I thought I'd bring this up again. For 
reference, the patch was here:

  https://gcc.gnu.org/ml/gcc-patches/2015-12/msg00165.html

So, would you like this for cris and mmix? I could enable it for these, 
then we'd need someone to review/approve the generator parts. I'm still 
hoping the x86 maintainers would also consider it.



Bernd


[PATCH] PR target/70454: Build x86 libgomp with -march=i486 or better

2016-04-25 Thread H.J. Lu
If x86 libgomp isn't compiled with -march=i486 or better, append
-march=i486 to XCFLAGS for the x86 libgomp build.

Tested on i686 with and without --with-arch=i386.  Tested on
x86-64 with and without --with-arch_32=i386.  OK for trunk?
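
For context on why i486 is the floor: the comment in configure.tgt says this configuration needs cmpxchg, which the original 386 lacks. Below is a rough illustration of the kind of compare-and-swap operation involved, written with C11 <stdatomic.h> rather than taken from libgomp. The function try_lock_word and its 0/1 lock-word convention are hypothetical.

```c
#include <stdatomic.h>

/* Try to move *WORD from 0 (unlocked) to 1 (locked) in one atomic
   step.  Return 1 on success, 0 if the word was not 0.  On x86 a
   compiler would typically implement this with lock cmpxchg, an
   instruction introduced with the 486.  */
int
try_lock_word (atomic_int *word)
{
  int expected = 0;

  return atomic_compare_exchange_strong (word, &expected, 1);
}
```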


H.J.
---
PR target/70454
* configure.tgt (XCFLAGS): Append -march=i486 to compile x86
libgomp if needed.
---
 libgomp/configure.tgt | 36 
 1 file changed, 16 insertions(+), 20 deletions(-)

diff --git a/libgomp/configure.tgt b/libgomp/configure.tgt
index 77e73f0..c876e80 100644
--- a/libgomp/configure.tgt
+++ b/libgomp/configure.tgt
@@ -67,28 +67,24 @@ if test x$enable_linux_futex = xyes; then
;;
 
 # Note that bare i386 is not included here.  We need cmpxchg.
-i[456]86-*-linux*)
+i[456]86-*-linux* | x86_64-*-linux*)
config_path="linux/x86 linux posix"
-   case " ${CC} ${CFLAGS} " in
- *" -m64 "*|*" -mx32 "*)
-   ;;
- *)
-   if test -z "$with_arch"; then
- XCFLAGS="${XCFLAGS} -march=i486 -mtune=${target_cpu}"
+   # Need i486 or better.
+   cat > conftestx.c < /dev/null 2>&1; then
+   if test "${target_cpu}" = x86_64; then
+   XCFLAGS="${XCFLAGS} -march=i486 -mtune=generic"
+   else
+   XCFLAGS="${XCFLAGS} -march=i486 -mtune=${target_cpu}"
fi
-   esac
-   ;;
-
-# Similar jiggery-pokery for x86_64 multilibs, except here we
-# can't rely on the --with-arch configure option, since that
-# applies to the 64-bit side.
-x86_64-*-linux*)
-   config_path="linux/x86 linux posix"
-   case " ${CC} ${CFLAGS} " in
- *" -m32 "*)
-   XCFLAGS="${XCFLAGS} -march=i486 -mtune=generic"
-   ;;
-   esac
+   fi
+   rm -f conftestx.c conftestx.o
;;
 
 # Note that sparcv7 and sparcv8 is not included here.  We need cas.
-- 
2.5.5



Re: C, C++: Fix PR 69733 (bad location for ignored qualifiers warning)

2016-04-25 Thread Joseph Myers
On Fri, 22 Apr 2016, Bernd Schmidt wrote:

> +/* Returns the smallest location != UNKNOWN_LOCATION in LOCATIONS,
> +   considering only those c_declspec_words found in LIST, which
> +   must be terminated by cdw_number_of_elements.  */
> +
> +static location_t
> +smallest_type_quals_location (const location_t* locations,
> +   c_declspec_word *list)

I'd expect list to be a pointer to const...

> @@ -6101,6 +6122,18 @@ grokdeclarator (const struct c_declarato
>  qualify the return type, not the function type.  */
>   if (type_quals)
> {
> + enum c_declspec_word ignored_quals_list[] =
> +   {
> + cdw_const, cdw_volatile, cdw_restrict, cdw_address_space,
> + cdw_number_of_elements
> +   };

 ... and ignored_quals_list to be static const here.
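
Both suggestions can be sketched together as follows. The typedef for location_t, the value of UNKNOWN_LOCATION, the trimmed-down enum and the wrapper first_ignored_qual_location are simplified stand-ins so the example compiles on its own; the real GCC definitions differ, and the loop body is one plausible reading of the comment in the patch rather than the patch's code.

```c
#include <assert.h>

/* Simplified stand-ins for the GCC types.  */
typedef unsigned int location_t;
#define UNKNOWN_LOCATION ((location_t) 0)

enum c_declspec_word
{
  cdw_const,
  cdw_volatile,
  cdw_restrict,
  cdw_address_space,
  cdw_number_of_elements  /* Also serves as the list terminator.  */
};

/* Return the smallest location != UNKNOWN_LOCATION in LOCATIONS,
   considering only the words in LIST, which must be terminated by
   cdw_number_of_elements.  Neither array is modified, so both
   parameters are pointers to const.  */
location_t
smallest_type_quals_location (const location_t *locations,
                              const enum c_declspec_word *list)
{
  location_t loc = UNKNOWN_LOCATION;

  assert (locations != 0 && list != 0);
  for (; *list != cdw_number_of_elements; list++)
    {
      location_t newloc = locations[*list];

      if (loc == UNKNOWN_LOCATION
          || (newloc != UNKNOWN_LOCATION && newloc < loc))
        loc = newloc;
    }
  return loc;
}

/* With a const parameter, the caller can keep the list in read-only
   static storage instead of rebuilding it on every call.  */
location_t
first_ignored_qual_location (const location_t *locations)
{
  static const enum c_declspec_word ignored_quals_list[] =
    {
      cdw_const, cdw_volatile, cdw_restrict, cdw_address_space,
      cdw_number_of_elements
    };

  return smallest_type_quals_location (locations, ignored_quals_list);
}
```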

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] [FIX PR c/48116] -Wreturn-type does not work as advertised

2016-04-25 Thread Prasad Ghangal
On 11 April 2016 at 20:09, Prasad Ghangal  wrote:
>
> Hi!
>
> This is a proposed patch for
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48116 (-Wreturn-type does
> not work as advertised)
>
> Currently gcc doesn't give any warning with -Wreturn-type or -Wall
> option for test cases like :
>
> void x (void) { }
> void y(void) { return x(); }
>
>
> applying this patch gives:
>
> $gcc foo.c -S -Wreturn-type
> foo.c: In function ‘y’:
> foo.c:2:23: warning: ISO C forbids ‘return’ with expression, in
> function returning void [-Wreturn-type]
>  void y(void) { return x(); }
>^~~
> foo.c:2:6: note: declared here
>  void y(void) { return x(); }
>   ^
>
> $gcc foo.c -S -Wall
> foo.c: In function ‘y’:
> foo.c:2:23: warning: ISO C forbids ‘return’ with expression, in
> function returning void [-Wreturn-type]
>  void y(void) { return x(); }
>^~~
> foo.c:2:6: note: declared here
>  void y(void) { return x(); }
>   ^
>
> $gcc foo.c -S -pedantic
> foo.c: In function ‘y’:
> foo.c:2:23: warning: ISO C forbids ‘return’ with expression, in
> function returning void [-Wpedantic]
>  void y(void) { return x(); }
>^~~
> foo.c:2:6: note: declared here
>  void y(void) { return x(); }
>   ^
>
>
> I have fully bootstrapped and tested on x86_64-pc-linux.
>
>
>
> Thanks,
> Prasad Ghangal



*PING*

https://gcc.gnu.org/ml/gcc-patches/2016-04/msg00474.html

Is this patch OK for gcc 7?
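
For reference, here is the construct from the test case next to a spelling that ISO C accepts. This is only an illustration of what the warning is about, not part of the patch; the calls counter and call_count are added so the behaviour can be observed.

```c
static int calls;  /* Counts how often x runs.  */

void
x (void)
{
  calls++;
}

/* The form from the bug report is:
     void y (void) { return x (); }
   ISO C does not allow a return with an expression in a function
   returning void.  The conforming spelling calls x and then returns.  */
void
y (void)
{
  x ();
  return;
}

int
call_count (void)
{
  return calls;
}
```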


Thanks,
Prasad Ghangal


Re: An abridged "Writing C" for the gcc web pages

2016-04-25 Thread Richard Sandiford
Bernd Schmidt  writes:
> (Apologies if you get this twice, the mailing list didn't like the html 
> attachment in the first attempt).
>
> We frequently get malformatted patches, and it's been brought to my 
> attention that some people don't even make the effort to read the GNU 
> coding standards before trying to contribute code. TL;DR seems to be the 
> excuse, and while I find that attitude inappropriate, we could probably 
> improve the situation by spelling out the most basic rules in an 
> abridged document on our webpages. Below is a draft I came up with. 
> Thoughts?

The patch got some slightly negative reactions, so to balance things out:
it looks like a good intro to me FWIW.  A couple of nits:

> +There should be a space before open-parentheses and after commas.
> +We also use spaces around binary operators.

I realise it's trying to be short, but the first rule doesn't account
for things like "((int) a + b) * 3" and "-(int) 3".  Maybe "There should
be a space between a function name and the opening parenthesis"?
Doesn't cover macros of course.
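
A small sketch of the cases in question, using made-up function names, with the spacing as I understand the GNU conventions:

```c
/* A space goes between a function name and its opening parenthesis,
   after commas, and around binary operators.  */
int
scale (int a, int b)
{
  /* No space after the opening parenthesis of a parenthesized
     expression; a cast is followed by a space.  */
  return ((int) a + b) * 3;
}

int
minus_three (void)
{
  /* No space between a unary minus and the cast it applies to.  */
  return -(int) 3;
}

int
use_both (void)
{
  return scale (1, 2) + minus_three ();
}
```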

> +Also note that multi-line comments should always be formatted as in
> +the previous example.  There should not be extra stars at the
> +beginning of new lines, and the comment text should being immediately
> +after the opening /*.

s/being/begin/.

Thanks,
Richard


Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-04-25 Thread Evandro Menezes

On 04/25/16 14:58, Wilco Dijkstra wrote:
> Evandro Menezes wrote:
>> I agree with your assessment, but I'm more curious to understand how
>> this change affects code built with the default -mcpu=generic when run
>> on both A53 and A57, the typical configuration of big.LITTLE machines.
>
> I wouldn't expect the result to be any different as the -mcpu setting makes
> very little difference.

True, but the results when running on A53 could be quite different.

Thank you,

--
Evandro Menezes



Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-04-25 Thread Wilco Dijkstra
Evandro Menezes wrote:
> I agree with your assessment, but I'm more curious to understand how
> this change affects code built with the default -mcpu=generic when run
> on both A53 and A57, the typical configuration of big.LITTLE machines.

I wouldn't expect the result to be any different as the -mcpu setting makes
very little difference.

Wilco



Re: [PATCH] Fix missed DSE opportunity with operator delete.

2016-04-25 Thread Jason Merrill
Hmm, this seems to assume that operator delete itself doesn't do
anything with the object being deleted.  This is true of the default
implementation, but I don't see anything in the standard that
prohibits a user-supplied replacement or class-specific deallocation
function from accessing the memory.

Jason


On Mon, Apr 25, 2016 at 6:08 AM, Richard Biener
 wrote:
> On Fri, Apr 22, 2016 at 11:37 PM, Mikhail Maltsev  wrote:
>> On 04/20/2016 05:12 PM, Richard Biener wrote:
>>> You have
>>>
>>> +static tree
>>> +handle_free_attribute (tree *node, tree name, tree /*args*/, int /*flags*/,
>>> +  bool *no_add_attrs)
>>> +{
>>> +  tree decl = *node;
>>> +  if (TREE_CODE (decl) == FUNCTION_DECL
>>> +  && type_num_arguments (TREE_TYPE (decl)) != 0
>>> +  && POINTER_TYPE_P (TREE_VALUE (TYPE_ARG_TYPES (TREE_TYPE (decl)
>>> +DECL_ALLOC_FN_KIND (decl) = ALLOC_FN_FREE;
>>> +  else
>>> +{
>>> +  warning_at (DECL_SOURCE_LOCATION (decl), OPT_Wattributes,
>>> + "%qE attribute ignored", name);
>>> +  *no_add_attrs = true;
>>> +}
>>>
>>> so one can happily apply the attribute to
>>>
>>>  void foo (void *, void *);
>>>
>>> but then
>>>
>>> @@ -2117,6 +2127,13 @@ call_may_clobber_ref_p_1 (gcall *call, ao_ref *ref)
>>>   /* Fallthru to general call handling.  */;
>>>}
>>>
>>> +  if (callee != NULL_TREE
>>> +  && (flags_from_decl_or_type (callee) & ECF_FREE) != 0)
>>> +{
>>> +  tree ptr = gimple_call_arg (call, 0);
>>> +  return ptr_deref_may_alias_ref_p_1 (ptr, ref);
>>> +}
>>>
>>> will ignore the 2nd argument.  I think it's better to ignore the attribute
>>> if type_num_arguments () != 1.
>>
>> Actually, the C++ standard ([basic.stc.dynamic]/2) defines the following 4
>> deallocation functions implicitly:
>>
>> void operator delete(void*);
>> void operator delete[](void*);
>> void operator delete(void*, std::size_t) noexcept;
>> void operator delete[](void*, std::size_t) noexcept;
>>
>> And the standard library also has:
>>
>> void operator delete(void*, const std::nothrow_t&);
>> void operator delete[](void*, const std::nothrow_t&);
>> void operator delete(void*, std::size_t, const std::nothrow_t&);
>> void operator delete[](void*, std::size_t, const std::nothrow_t&);
>>
>> IIUC, 'delete(void*, std::size_t)' is used by default in C++14
>> (https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01266.html). How should we
>> handle this?
>
> Hmm.  I guess by adjusting the documentation of the attribute to
> explicitly mention the behavior on the rest of the argument pointed-to
> memory (the function is assumed to neither write nor read from that
> memory).  Also explicitly mention that 'this' is always the first
> argument if present.
>
> Richard.
>
>> --
>> Regards,
>> Mikhail Maltsev


Re: [PATCH] Allow all 1s of integer as standard SSE constants

2016-04-25 Thread Uros Bizjak
On Mon, Apr 25, 2016 at 9:45 PM, Richard Sandiford
 wrote:

>>> Can you please investigate, what is wrong with all_ones_operand so it
>>> doesn't accept all (-1) operands?
>>
>> Does the following work:
>>
>> ;; Return true if operand is a (vector) constant with all bits set.
>> (define_predicate "all_ones_operand"
>>   (match_code "const_int,const_wide_int,const_vector")
>> {
>>   if (op == constm1_rtx)
>> return true;
>>
>>   if (CONST_INT_P (op))
>> return INTVAL (op) == HOST_WIDE_INT_M1;
>>
>>   if (mode == VOIDmode)
>> mode = GET_MODE (op);
>>   return op == CONSTM1_RTX (mode);
>> })
>
> const_wide_int isn't necessary here.  An all-1s integer will always
> use CONST_INT, regardless of the mode size.
>
> I think this reduces to:
>
> (define_predicate "all_ones_operand"
>   (match_code "const_int,const_vector")
> {
>   if (CONST_INT_P (op))
> return INTVAL (op) == HOST_WIDE_INT_M1;
>   return op == CONSTM1_RTX (GET_MODE (op));
> })
>
> (which is still more complex than it should be -- roll on CONST_INTs
> with modes. :-))

This is now implemented in a different way, but nevertheless some
const_wide_int codes were removed from other predicates in a follow-up
patch.

Uros.


Re: [PATCH] Allow all 1s of integer as standard SSE constants

2016-04-25 Thread Richard Sandiford
Uros Bizjak  writes:
> On Fri, Apr 22, 2016 at 7:10 PM, Uros Bizjak  wrote:
>> On Fri, Apr 22, 2016 at 4:19 PM, H.J. Lu  wrote:
>>> On Fri, Apr 22, 2016 at 5:11 AM, Uros Bizjak  wrote:
>>>> On Thu, Apr 21, 2016 at 10:58 PM, H.J. Lu  wrote:
>>>>
>>>>> Here is the updated patch with my standard_sse_constant_p change and
>>>>> your SSE/AVX pattern change.  I didn't include your
>>>>> standard_sse_constant_opcode since it didn't compile nor is needed
>>>>> for this purpose.
>>>>
>>>> H.J.,
>>>>
>>>> please test the attached patch that finally rewrites and improves SSE
>>>> constants handling.
>>>>
>>>> This is what I want to commit, a follow-up patch will further clean
>>>> standard_sse_constant_opcode wrt TARGET_AVX512VL.
>>>>
>>>
>>> It doesn't address my problem which is "Allow all 1s of integer as
>>> standard SSE constants".  The key here is "integer".  I'd like to use
>>> SSE/AVX store TI/OI/XI integers with -1.
>>
>> Yes, my patch *should* work for this. Please note that
>> all_ones_operand should catch all cases your additional patch adds.
>>
>> ;; Return true if operand is a (vector) constant with all bits set.
>> (define_predicate "all_ones_operand"
>>   (match_code "const_int,const_wide_int,const_vector")
>> {
>>   if (op == constm1_rtx)
>> return true;
>>
>>   if (mode == VOIDmode)
>> mode = GET_MODE (op);
>>   return op == CONSTM1_RTX (mode);
>> })
>>
>>
>> Can you please investigate, what is wrong with all_ones_operand so it
>> doesn't accept all (-1) operands?
>
> Does the following work:
>
> ;; Return true if operand is a (vector) constant with all bits set.
> (define_predicate "all_ones_operand"
>   (match_code "const_int,const_wide_int,const_vector")
> {
>   if (op == constm1_rtx)
> return true;
>
>   if (CONST_INT_P (op))
> return INTVAL (op) == HOST_WIDE_INT_M1;
>
>   if (mode == VOIDmode)
> mode = GET_MODE (op);
>   return op == CONSTM1_RTX (mode);
> })

const_wide_int isn't necessary here.  An all-1s integer will always
use CONST_INT, regardless of the mode size.

I think this reduces to:

(define_predicate "all_ones_operand"
  (match_code "const_int,const_vector")
{
  if (CONST_INT_P (op))
return INTVAL (op) == HOST_WIDE_INT_M1;
  return op == CONSTM1_RTX (GET_MODE (op));
})

(which is still more complex than it should be -- roll on CONST_INTs
with modes. :-))
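
To illustrate the point about all-ones constants with a sketch outside GCC: a value held as a sign-extended host-wide integer of -1 stays all ones at any wider precision, so no wider representation is needed for it. The two-word struct and both function names are invented for this example; this is not how GCC's own wide-integer code is written.

```c
#include <stdint.h>

/* A hypothetical 128-bit value built from two 64-bit halves.  */
struct wide128
{
  uint64_t lo;
  uint64_t hi;
};

/* Sign-extend the 64-bit value V to 128 bits.  */
struct wide128
sign_extend_to_128 (int64_t v)
{
  struct wide128 r;

  r.lo = (uint64_t) v;
  /* The upper half is all ones for negative V, all zeros otherwise.  */
  r.hi = v < 0 ? UINT64_MAX : 0;
  return r;
}

/* Return 1 if every bit of W is set.  */
int
all_ones_p (struct wide128 w)
{
  return w.lo == UINT64_MAX && w.hi == UINT64_MAX;
}
```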

Thanks,
Richard


Re: [PATCH] Fix bootstrap on powerpc*-aix* (PR bootstrap/70704)

2016-04-25 Thread Jason Merrill
Looks good to me.

Jason

On Mon, Apr 25, 2016 at 3:30 PM, Jakub Jelinek  wrote:
> Hi!
>
> As mentioned in the PR, some checking code, in particular the one
> in C++ FE's build_non_dependent_expr, may affect code generation, as it can
> instantiate templates that aren't instantiated otherwise, which affects
> the various counters like cfun->funcdef_no, DECL_UID etc.
>
> I'd like to commit the attached (shorter, safer) patch to gcc 6 branch,
> which just disables this checking.
>
> The larger, included patch, makes -fchecking a 3 state option, -fno-checking
> aka -fchecking=0, no checking, -fchecking aka -fchecking=1 the previous
> -fchecking except for the build_non_dependent_expr bits, and -fchecking=2
> as checking that might affect code generation and tweaks the configury,
> so that for non-release builds it defaults to this -fchecking=2, while for
> release checking builds stage1 defaults to -fchecking=1 and stage2+
> defaults to -fno-checking.
>
> The shorter patch is currently being bootstrapped on powerpc-ibm-aix7.1.3.0,
> the larger patch I've so far bootstrapped/regtested on x86_64-linux and
> i686-linux on gcc-6 branch with implicit --enable-checking=release (where
> stage1 performs yes but no extra checking), --disable-checking,
> --enable-checking=yes,rtl and on the trunk so far bootstrapped and regtest
> --pending with --enable-checking=yes,extra,rtl.  Will still check
> also the implicit --enable-checking=yes,extra.
>
> Ok for trunk (the included patch) and 6 branch (the attached patch)?
>
> 2016-04-25  Jakub Jelinek  
>
> PR bootstrap/70704
> * configure.ac (--enable-stage1-checking): For --disable-checking or
> implicit --enable-checking, make sure extra flag matches in between
> stage1 and later checking.
> * configure: Regenerated.
> gcc/
> * configure.ac (--enable-checking): Document extra flag, for
> non-release builds default to --enable-checking=yes,extra.
> If misc checking and extra checking, define CHECKING_P to 2 instead
> of 1.
> * common.opt (fchecking=): Add.
> * doc/invoke.texi (-fchecking=): Document.
> * doc/install.texi: Document --enable-checking changes.
> * configure: Regenerated.
> * config.in: Regenerated.
> gcc/cp/
> * pt.c (build_non_dependent_expr): Use flag_checking > 1 instead of
> just flag_checking.
>
> --- configure.ac.jj 2016-03-17 23:58:35.0 +0100
> +++ configure.ac	2016-04-25 18:15:34.703842886 +0200
> @@ -3530,16 +3530,17 @@ AC_ARG_ENABLE(stage1-checking,
># For --disable-checking or implicit --enable-checking=release, avoid
># setting --enable-checking=gc in the default stage1 checking for LTO
># bootstraps.  See PR62077.
> -  stage1_checking=--enable-checking=release,misc,gimple,rtlflag,tree,types
>case $BUILD_CONFIG in
>  *lto*)
> -  if test "x$enable_checking" = x && \
> -test -d ${srcdir}/gcc && \
> -test x"`cat ${srcdir}/gcc/DEV-PHASE`" = xexperimental; then
> -   stage1_checking=--enable-checking=yes,types
> -  fi;;
> -*) stage1_checking=--enable-checking=yes,types;;
> +  stage1_checking=--enable-checking=release,misc,gimple,rtlflag,tree,types;;
> +*)
> +  stage1_checking=--enable-checking=yes,types;;
>esac
> +  if test "x$enable_checking" = x && \
> + test -d ${srcdir}/gcc && \
> + test x"`cat ${srcdir}/gcc/DEV-PHASE`" = xexperimental; then
> +stage1_checking=yes,types,extra
> +  fi
>  else
>stage1_checking=--enable-checking=$enable_checking,types
>  fi])
> --- gcc/configure.ac.jj 2016-01-27 19:47:35.0 +0100
> +++ gcc/configure.ac	2016-04-25 17:56:40.789041032 +0200
> @@ -516,12 +516,12 @@ AC_ARG_ENABLE(checking,
> [enable expensive run-time checks.  With LIST,
>  enable only specific categories of checks.
>  Categories are: yes,no,all,none,release.
> -Flags are: assert,df,fold,gc,gcac,gimple,misc,
> +Flags are: assert,df,extra,fold,gc,gcac,gimple,misc,
>  rtlflag,rtl,runtime,tree,valgrind,types])],
>  [ac_checking_flags="${enableval}"],[
>  # Determine the default checks.
>  if test x$is_release = x ; then
> -  ac_checking_flags=yes
> +  ac_checking_flags=yes,extra
>  else
>ac_checking_flags=release
>  fi])
> @@ -531,32 +531,33 @@ do
> case $check in
> # these set all the flags to specific states
> yes)ac_assert_checking=1 ; ac_checking=1 ; ac_df_checking= ;
> -   ac_fold_checking= ; ac_gc_checking=1 ;
> +   ac_fold_checking= ; ac_gc_checking=1 ; ac_extra_checking= ;
> ac_gc_always_collect= ; ac_gimple_checking=1 ; ac_rtl_checking= ;
> ac_rtlflag_checking=1 ; ac_runtime_checking=1 ;
> ac_tree_checking=1 ; ac_valgrind_checking= ;
> a

Re: [PATCH][AArch64][wwwdocs] Summarise some more AArch64 changes for GCC6

2016-04-25 Thread Jim Wilson
On Thu, Apr 21, 2016 at 1:15 AM, Kyrill Tkachov
 wrote:
> Jim, you added support for the qdf24xx identifier to -mcpu and -mtune.
> Could you please suggest an appropriate entry to describe it?
> I think the same format as the Cortex-A35 entry in this patch would be
> appropriate.

This is tricky, as I'm working under an NDA, and the NDA requires
pre-approval from Qualcomm for each patch I contribute that is related
to this project.  It is actually easier if someone else can add this
text.
 The Cortex-A35 entry does look appropriate, with "ARM Cortex-A35"
replaced with "Qualcomm QDF24xx".  If you want me to write the text, I
will have to go through the approval process, which may take some
time.

Jim


Re: [PATCH] Prevent LTO wrappers to process a recursive execution

2016-04-25 Thread Andi Kleen
Martin Liška  writes:
>  #endif
> +  /* Do not search original location in the same folder.  */
> +  char *exe_folder = lrealpath (av[0]);
> +  exe_folder[strlen (exe_folder) - strlen (lbasename (exe_folder))] = '\0';
> +  char *location = concat (exe_folder, PERSONALITY, NULL);

Does that really work? When the executable is found in $PATH
av[0] does not contain the full path name. But you seem to assume
it does?
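
To make the concern concrete, here is a sketch of just the string arithmetic, leaving lrealpath out of the picture. The helper dir_prefix_len is hypothetical and only understands '/' as a separator; it is a simplified stand-in for the lbasename computation in the patch.

```c
#include <string.h>

/* Return the length of the directory part of PATH, including the
   trailing '/', or 0 if PATH has no directory part.  */
size_t
dir_prefix_len (const char *path)
{
  const char *slash = strrchr (path, '/');

  return slash ? (size_t) (slash - path) + 1 : 0;
}
```

With a bare program name such as "gcc-ar" the directory part is empty, so whether the folder is right depends entirely on what the path was resolved to beforehand.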

-Andi

> +
> +  if (access (location, X_OK) == 0)
> + remove_prefix (exe_folder, &path);



[PATCH] Fix bootstrap on powerpc*-aix* (PR bootstrap/70704)

2016-04-25 Thread Jakub Jelinek
Hi!

As mentioned in the PR, some checking code, in particular the one
in C++ FE's build_non_dependent_expr, may affect code generation, as it can
instantiate templates that aren't instantiated otherwise, which affects
the various counters like cfun->funcdef_no, DECL_UID etc.

I'd like to commit the attached (shorter, safer) patch to gcc 6 branch,
which just disables this checking.

The larger, included patch makes -fchecking a three-state option:
-fno-checking (aka -fchecking=0) means no checking; -fchecking (aka
-fchecking=1) is the previous -fchecking except for the
build_non_dependent_expr bits; and -fchecking=2 adds checking that might
affect code generation.  It also tweaks the configury so that non-release
builds default to -fchecking=2, while for release checking builds stage1
defaults to -fchecking=1 and stage2+ defaults to -fno-checking.
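
As a rough model of the three levels (only flag_checking and the "> 1" test come from the ChangeLog; the enum and function names are made up for illustration):

```c
/* Hypothetical names for the three -fchecking levels.  */
enum checking_level
{
  CHECKING_OFF = 0,    /* -fno-checking, -fchecking=0 */
  CHECKING_BASIC = 1,  /* -fchecking, -fchecking=1 */
  CHECKING_EXTRA = 2   /* -fchecking=2 */
};

/* Checks that must not change generated code run at level 1 and up.  */
int
run_ordinary_checks (int flag_checking)
{
  return flag_checking > 0;
}

/* Checks that may affect code generation, such as the one in
   build_non_dependent_expr, run only at level 2.  */
int
run_codegen_affecting_checks (int flag_checking)
{
  return flag_checking > 1;
}
```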

The shorter patch is currently being bootstrapped on powerpc-ibm-aix7.1.3.0.
The larger patch I've so far bootstrapped/regtested on x86_64-linux and
i686-linux on gcc-6 branch with implicit --enable-checking=release (where
stage1 performs yes but no extra checking), --disable-checking, and
--enable-checking=yes,rtl.  On the trunk it is bootstrapped so far, with
regtest pending, using --enable-checking=yes,extra,rtl.  I will still check
the implicit --enable-checking=yes,extra as well.

Ok for trunk (the included patch) and 6 branch (the attached patch)?

2016-04-25  Jakub Jelinek  

PR bootstrap/70704
* configure.ac (--enable-stage1-checking): For --disable-checking or
implicit --enable-checking, make sure extra flag matches in between
stage1 and later checking.
* configure: Regenerated.
gcc/
* configure.ac (--enable-checking): Document extra flag, for
non-release builds default to --enable-checking=yes,extra.
If misc checking and extra checking, define CHECKING_P to 2 instead
of 1.
* common.opt (fchecking=): Add.
* doc/invoke.texi (-fchecking=): Document.
* doc/install.texi: Document --enable-checking changes.
* configure: Regenerated.
* config.in: Regenerated.
gcc/cp/
* pt.c (build_non_dependent_expr): Use flag_checking > 1 instead of
just flag_checking.

--- configure.ac.jj 2016-03-17 23:58:35.0 +0100
+++ configure.ac	2016-04-25 18:15:34.703842886 +0200
@@ -3530,16 +3530,17 @@ AC_ARG_ENABLE(stage1-checking,
   # For --disable-checking or implicit --enable-checking=release, avoid
   # setting --enable-checking=gc in the default stage1 checking for LTO
   # bootstraps.  See PR62077.
-  stage1_checking=--enable-checking=release,misc,gimple,rtlflag,tree,types
   case $BUILD_CONFIG in
 *lto*)
-  if test "x$enable_checking" = x && \
-test -d ${srcdir}/gcc && \
-test x"`cat ${srcdir}/gcc/DEV-PHASE`" = xexperimental; then
-   stage1_checking=--enable-checking=yes,types
-  fi;;
-*) stage1_checking=--enable-checking=yes,types;;
+  stage1_checking=--enable-checking=release,misc,gimple,rtlflag,tree,types;;
+*)
+  stage1_checking=--enable-checking=yes,types;;
   esac
+  if test "x$enable_checking" = x && \
+ test -d ${srcdir}/gcc && \
+ test x"`cat ${srcdir}/gcc/DEV-PHASE`" = xexperimental; then
+stage1_checking=yes,types,extra
+  fi
 else
   stage1_checking=--enable-checking=$enable_checking,types
 fi])
--- gcc/configure.ac.jj 2016-01-27 19:47:35.0 +0100
+++ gcc/configure.ac	2016-04-25 17:56:40.789041032 +0200
@@ -516,12 +516,12 @@ AC_ARG_ENABLE(checking,
[enable expensive run-time checks.  With LIST,
 enable only specific categories of checks.
 Categories are: yes,no,all,none,release.
-Flags are: assert,df,fold,gc,gcac,gimple,misc,
+Flags are: assert,df,extra,fold,gc,gcac,gimple,misc,
 rtlflag,rtl,runtime,tree,valgrind,types])],
 [ac_checking_flags="${enableval}"],[
 # Determine the default checks.
 if test x$is_release = x ; then
-  ac_checking_flags=yes
+  ac_checking_flags=yes,extra
 else
   ac_checking_flags=release
 fi])
@@ -531,32 +531,33 @@ do
case $check in
# these set all the flags to specific states
yes)ac_assert_checking=1 ; ac_checking=1 ; ac_df_checking= ;
-   ac_fold_checking= ; ac_gc_checking=1 ;
+   ac_fold_checking= ; ac_gc_checking=1 ; ac_extra_checking= ;
ac_gc_always_collect= ; ac_gimple_checking=1 ; ac_rtl_checking= ;
ac_rtlflag_checking=1 ; ac_runtime_checking=1 ;
ac_tree_checking=1 ; ac_valgrind_checking= ;
ac_types_checking=1 ;;
no|none)ac_assert_checking= ; ac_checking= ; ac_df_checking= ;
-   ac_fold_checking= ; ac_gc_checking= ;
+   ac_fold_checking= ; ac_gc_checking= ; ac_extra_checking= ;
ac_gc_always_coll

Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-04-25 Thread Evandro Menezes

On 04/25/16 14:21, Wilco Dijkstra wrote:
> Evandro Menezes wrote:
>> I assume that you mean that such improvements are true for
>> -mcpu=generic, yes?  On which target, A53 or A57 or other?
>
> It's true for any CPU setting. The SPEC results are for Cortex-A57
> however I wrote a microbenchmark that shows improvements on
> all targets I have access to. The GCC switch expansion is awful, so
> even with a good indirect predictor it is better to use conditional
> branches.

I agree with your assessment, but I'm more curious to understand how
this change affects code built with the default -mcpu=generic when run
on both A53 and A57, the typical configuration of big.LITTLE machines.


Thank you,

--
Evandro Menezes



Re: [PATCH] Fix PR c++/70241 (inconsistent access with in-class enumeration)

2016-04-25 Thread Jason Merrill

OK.

Jason


[PATCH, i386]: Do not match (const_int 0) and (const_int 1) with const_wide_int

2016-04-25 Thread Uros Bizjak
Hello!

(const_int 0) and (const_int 1) are never const_wide_int.

2016-04-25  Uros Bizjak  

* config/i386/predicates.md (const0_operand): Do not match
const_wide_int code.
(const1_operand): Ditto.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
Index: predicates.md
===
--- predicates.md   (revision 235411)
+++ predicates.md   (working copy)
@@ -659,7 +659,7 @@
 
 ;; Match exactly zero.
 (define_predicate "const0_operand"
-  (match_code "const_int,const_wide_int,const_double,const_vector")
+  (match_code "const_int,const_double,const_vector")
 {
   if (mode == VOIDmode)
 mode = GET_MODE (op);
@@ -668,7 +668,7 @@
 
 ;; Match one or a vector with all elements equal to one.
 (define_predicate "const1_operand"
-  (match_code "const_int,const_wide_int,const_double,const_vector")
+  (match_code "const_int,const_double,const_vector")
 {
   if (mode == VOIDmode)
 mode = GET_MODE (op);


Re: [PING][PATCH] New plugin event when evaluating a constexpr call

2016-04-25 Thread Jason Merrill

On 04/25/2016 10:08 AM, Andres Tiraboschi wrote:
>  *gcc/cp/constexpr.c (constexpr_fundef): Moved to gcc/cp/cp-tree.h.
>  *gcc/cp/constexpr.c (constexpr_call): Ditto.
>  *gcc/cp/constexpr.c (constexpr_ctx): Ditto.

Let's create a constexpr.h rather than expose constexpr internals to all 
of the front end.  Really, I'd prefer to avoid exposing them at all. 
Why does what you want to do require all this implementation detail?



> bool non_constant_args = false;
> cxx_bind_parameters_in_call (ctx, t, &new_call,
>  non_constant_p, overflow_p, &non_constant_args);
> +
> +  constexpr_call_info call_info;
> +  call_info.function = t;
> +  call_info.lval = lval;
> +  call_info.call = &new_call;
> +  call_info.call_stack = call_stack;
> +  call_info.non_constant_args = &non_constant_args;
> +  call_info.non_const_p = non_constant_p;
> +  call_info.ctx = ctx;
> +  call_info.result = NULL_TREE;
> +  invoke_plugin_callbacks (PLUGIN_EVAL_CALL_CONSTEXPR, &call_info);
> +  if (call_info.result != NULL_TREE)
> +{
> +  return call_info.result;
> +}
> +
> if (*non_constant_p)
>   return t;


This is a curious place to invoke the callback.  Why before the 
*non_constant_p?  More generally, why between evaluating the arguments 
and evaluating the function body?


Jason



match.pd: x+x -> 2*x

2016-04-25 Thread Marc Glisse

Hello,

a simple transform to replace a more complicated one in fold-const.c.

This patch breaks the testcase gcc.dg/gomp/loop-1.c. Indeed, the C 
front-end folds too eagerly

  newrhs = c_fully_fold (newrhs, false, NULL);

in build_modify_expr, and by the time the OMP code checks that the 
increment in the for loop has the right form, it sees i=i*2 instead of 
i=i+i. Since the original code is apparently illegal, I guess it isn't 
that bad... The C++ front-end seems fine.


Testcase no-strict-overflow-6.c also breaks. ivcanon is clever enough to 
count how many iterations there are before i*=2 makes i negative, which I 
guess would be great with -fwrapv, but I find it a bit suspicious with 
just -fno-strict-overflow. I adjusted the testcase assuming the ivcanon 
was doing the right thing (with -fstrict-overflow we generate an infinite 
loop instead, so it is still testing that).
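
For readers who want to see the identity behind the transform, here is a minimal sketch. It is not part of the patch, and the helper names are made up for illustration. It uses an unsigned 32-bit type so that wrap-around is well defined; for signed types the overflow questions discussed above still apply.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helpers, not GCC code: the two forms the match.pd rule relates.
constexpr std::uint32_t add_self (std::uint32_t x) { return x + x; }
constexpr std::uint32_t times_two (std::uint32_t x) { return x * 2u; }

// Returns true when both forms agree for every sampled value,
// including values where the result wraps modulo 2^32.
bool forms_agree ()
{
  const std::uint32_t samples[] = {0u, 1u, 0x7fffffffu, 0x80000000u,
                                   std::numeric_limits<std::uint32_t>::max ()};
  for (std::uint32_t x : samples)
    if (add_self (x) != times_two (x))
      return false;
  return true;
}
```

Both forms wrap identically, which is why the rewrite is safe for wrapping integer types; whether it is equally harmless for signed types with -fno-strict-overflow is exactly the ivcanon question raised above.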


Bootstrap+regtest on powerpc64le-unknown-linux-gnu.

2016-04-26  Marc Glisse  

gcc/
* genmatch.c (write_predicate): Add ATTRIBUTE_UNUSED.
* fold-const.c (fold_binary_loc): Remove 2 transformations
superseded by match.pd.
* match.pd (x+x -> x*2): Generalize to integers.

gcc/testsuite/
* gcc.dg/fold-plusmult.c: Adjust.
* gcc.dg/no-strict-overflow-6.c: Adjust.
* gcc.dg/gomp/loop-1.c: Xfail some tests.

--
Marc Glisse
Index: gcc/fold-const.c
===
--- gcc/fold-const.c(revision 235411)
+++ gcc/fold-const.c(working copy)
@@ -9949,39 +9949,20 @@ fold_binary_loc (location_t loc,
  /* Transform x * -C into -x * C if x is easily negatable.  */
  if (TREE_CODE (op1) == INTEGER_CST
  && tree_int_cst_sgn (op1) == -1
  && negate_expr_p (op0)
  && (tem = negate_expr (op1)) != op1
  && ! TREE_OVERFLOW (tem))
return fold_build2_loc (loc, MULT_EXPR, type,
fold_convert_loc (loc, type,
  negate_expr (op0)), tem);
 
- /* (A + A) * C -> A * 2 * C  */
- if (TREE_CODE (arg0) == PLUS_EXPR
- && TREE_CODE (arg1) == INTEGER_CST
- && operand_equal_p (TREE_OPERAND (arg0, 0),
- TREE_OPERAND (arg0, 1), 0))
-   return fold_build2_loc (loc, MULT_EXPR, type,
-   omit_one_operand_loc (loc, type,
- TREE_OPERAND (arg0, 0),
- TREE_OPERAND (arg0, 1)),
-   fold_build2_loc (loc, MULT_EXPR, type,
-build_int_cst (type, 2) , arg1));
-
- /* ((T) (X /[ex] C)) * C cancels out if the conversion is
-sign-changing only.  */
- if (TREE_CODE (arg1) == INTEGER_CST
- && TREE_CODE (arg0) == EXACT_DIV_EXPR
- && operand_equal_p (arg1, TREE_OPERAND (arg0, 1), 0))
-   return fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
-
  strict_overflow_p = false;
  if (TREE_CODE (arg1) == INTEGER_CST
  && 0 != (tem = extract_muldiv (op0, arg1, code, NULL_TREE,
 &strict_overflow_p)))
{
  if (strict_overflow_p)
fold_overflow_warning (("assuming signed overflow does not "
"occur when simplifying "
"multiplication"),
   WARN_STRICT_OVERFLOW_MISC);
Index: gcc/genmatch.c
===
--- gcc/genmatch.c  (revision 235411)
+++ gcc/genmatch.c  (working copy)
@@ -3549,21 +3549,21 @@ decision_tree::gen (FILE *f, bool gimple
 
 /* Output code to implement the predicate P from the decision tree DT.  */
 
 void
 write_predicate (FILE *f, predicate_id *p, decision_tree &dt, bool gimple)
 {
   fprintf (f, "\nbool\n"
   "%s%s (tree t%s%s)\n"
   "{\n", gimple ? "gimple_" : "tree_", p->id,
   p->nargs > 0 ? ", tree *res_ops" : "",
-  gimple ? ", tree (*valueize)(tree)" : "");
+  gimple ? ", tree (*valueize)(tree) ATTRIBUTE_UNUSED" : "");
   /* Conveniently make 'type' available.  */
   fprintf_indent (f, 2, "tree type = TREE_TYPE (t);\n");
 
   if (!gimple)
 fprintf_indent (f, 2, "if (TREE_SIDE_EFFECTS (t)) return false;\n");
   dt.root->gen_kids (f, 2, gimple);
 
   fprintf_indent (f, 2, "return false;\n"
   "}\n");
 }
Index: gcc/match.pd
===
--- gcc/match.pd(revision 235411)
+++ gcc/match.pd(working copy)
@@ -1621,25 +1621,27 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 /* Canonicalization of binary operations.  */
 
 /* Convert X + -C into X - C.  */
 (simplify
  (plus @0 REAL_C

Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-04-25 Thread Wilco Dijkstra
Evandro Menezes wrote:
> I assume that you mean that such improvements are true for
> -mcpu=generic, yes?  On which target, A53 or A57 or other?

It's true for any CPU setting. The SPEC results are for Cortex-A57;
however, I wrote a microbenchmark that shows improvements on
all targets I have access to. GCC's switch expansion is awful, so
even with a good indirect predictor it is better to use conditional
branches.

Wilco






Re: [PATCH][AArch64] Replace insn to zero up SIMD registers

2016-04-25 Thread Evandro Menezes

On 03/10/16 10:37, James Greenhalgh wrote:

On Thu, Mar 10, 2016 at 10:32:15AM -0600, Evandro Menezes wrote:

I agree to postpone until GCC 7.

[AArch64] Replace insn to zero up SIMD registers

gcc/
* config/aarch64/aarch64.md
(*movhf_aarch64): Add "movi %0, #0" to zero up register.
(*movsf_aarch64): Likewise and add "simd" attributes.
(*movdf_aarch64): Likewise.

This patch removes the FP attributes from the HF, SF, DF, TF moves.

Thanks for sticking with it. This is OK for GCC 7 when development
opens.

Remember to mention the most recent changes in your Changelog entry
(Remove "fp" attribute from *movhf_aarch64 and *movtf_aarch64).


   gcc/
* config/aarch64/aarch64.md
(*movhf_aarch64): Add "movi %0, #0" to zero up register and
remove the "fp" attributes.
(*movsf_aarch64): Add "movi %0, #0" to zero up register and
add the "simd" attributes.
(*movdf_aarch64): Likewise.
(*movtf_aarch64): Remove the "fp" attributes.

OK to commit?

Thank you,

--
Evandro Menezes

From b319dead6f72eb36ebedbf27547b2f86f2f9d41f Mon Sep 17 00:00:00 2001
From: Evandro Menezes 
Date: Mon, 19 Oct 2015 18:31:48 -0500
Subject: [PATCH] [AArch64] Replace insn to zero up SIMD registers

gcc/
	* config/aarch64/aarch64.md
	(*movhf_aarch64): Add "movi %0, #0" to zero up register and
	remove the "fp" attributes.
	(*movsf_aarch64): Add "movi %0, #0" to zero up register and
	add the "simd" attributes.
	(*movdf_aarch64): Likewise.
	(*movtf_aarch64): Remove the "fp" attributes.
---
 gcc/config/aarch64/aarch64.md | 31 +--
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index f423284..9b282f1 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1178,11 +1178,12 @@
 )
 
 (define_insn "*movhf_aarch64"
-  [(set (match_operand:HF 0 "nonimmediate_operand" "=w, ?r,w,w,m,r,m ,r")
-	(match_operand:HF 1 "general_operand"  "?rY, w,w,m,w,m,rY,r"))]
+  [(set (match_operand:HF 0 "nonimmediate_operand" "=w,w  ,?r,w,w,m,r,m ,r")
+	(match_operand:HF 1 "general_operand"  "Y ,?rY, w,w,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], HFmode)
 || aarch64_reg_or_fp_zero (operands[1], HFmode))"
   "@
+   movi\\t%0.4h, #0
mov\\t%0.h[0], %w1
umov\\t%w0, %1.h[0]
mov\\t%0.h[0], %1.h[0]
@@ -1191,18 +1192,18 @@
ldrh\\t%w0, %1
strh\\t%w1, %0
mov\\t%w0, %w1"
-  [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\
+  [(set_attr "type" "neon_move,neon_from_gp,neon_to_gp,neon_move,\
  f_loads,f_stores,load1,store1,mov_reg")
-   (set_attr "simd" "yes,yes,yes,*,*,*,*,*")
-   (set_attr "fp"   "*,*,*,yes,yes,*,*,*")]
+   (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*")]
 )
 
 (define_insn "*movsf_aarch64"
-  [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
-	(match_operand:SF 1 "general_operand"  "?rY, w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=w,w  ,?r,w,w  ,w,m,r,m ,r")
+	(match_operand:SF 1 "general_operand"  "Y ,?rY, w,w,Ufc,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], SFmode)
 || aarch64_reg_or_fp_zero (operands[1], SFmode))"
   "@
+   movi\\t%0.2s, #0
fmov\\t%s0, %w1
fmov\\t%w0, %s1
fmov\\t%s0, %s1
@@ -1212,16 +1213,18 @@
ldr\\t%w0, %1
str\\t%w1, %0
mov\\t%w0, %w1"
-  [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\
- f_loads,f_stores,load1,store1,mov_reg")]
+  [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconsts,\
+ f_loads,f_stores,load1,store1,mov_reg")
+   (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*")]
 )
 
 (define_insn "*movdf_aarch64"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
-	(match_operand:DF 1 "general_operand"  "?rY, w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=w,w  ,?r,w,w  ,w,m,r,m ,r")
+	(match_operand:DF 1 "general_operand"  "Y ,?rY, w,w,Ufc,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], DFmode)
 || aarch64_reg_or_fp_zero (operands[1], DFmode))"
   "@
+   movi\\t%d0, #0
fmov\\t%d0, %x1
fmov\\t%x0, %d1
fmov\\t%d0, %d1
@@ -1231,8 +1234,9 @@
ldr\\t%x0, %1
str\\t%x1, %0
mov\\t%x0, %x1"
-  [(set_attr "type" "f_mcr,f_mrc,fmov,fconstd,\
- f_loadd,f_stored,load1,store1,mov_reg")]
+  [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconstd,\
+ f_loadd,f_stored,load1,store1,mov_reg")
+   (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*")]
 )
 
 (define_insn "*movtf_aarch64"
@@ -1257,7 +1261,6 @@
   [(set_attr "type" "logic_reg,multiple,f_mcr,f_mrc,neon_move_q,f_mcr,\
  f_loadd,f_stored,load2,store2,store2")
(set_attr "length" "4,8,8,8,4,4,4,4,4,4,4")
-   (set_attr "fp" "*,*,yes,yes,*,yes,yes,

[PATCH, i386]: Simplify emission of SSE constant (-1) load.

2016-04-25 Thread Uros Bizjak
2016-04-25  Uros Bizjak  

* config/i386/i386.md (*movoi_internal_avx): Set mode attribute to XI
for SSE constm1 operands and TARGET_AVX512VL.
(*movti_internal): Ditto.
(*mov_or): Use constm1_operand predicate.
* config/i386/sse.md (*mov_internal): Set mode attribute to XI
for SSE vector_all_ones operands and TARGET_AVX512VL.
* config/i386/predicates.md (constm1_operand): New predicate.
* config/i386/i386.c (standard_sse_constant_opcode): Simplify
emission of constant -1 load.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
Index: i386.c
===
--- i386.c  (revision 235411)
+++ i386.c  (working copy)
@@ -10868,30 +10868,24 @@ standard_sse_constant_opcode (rtx_insn *insn, rtx
case MODE_V8DF:
case MODE_V16SF:
  gcc_assert (TARGET_AVX512F);
- break;
+ return "vpternlogd\t{$0xFF, %g0, %g0, %g0|%g0, %g0, %g0, 0xFF}";
+
case MODE_OI:
case MODE_V4DF:
case MODE_V8SF:
  gcc_assert (TARGET_AVX2);
- break;
+ /* FALLTHRU */
case MODE_TI:
case MODE_V2DF:
case MODE_V4SF:
  gcc_assert (TARGET_SSE2);
- break;
+ return (TARGET_AVX
+ ? "vpcmpeqd\t%0, %0, %0"
+ : "pcmpeqd\t%0, %0");
+
default:
  gcc_unreachable ();
}
-
-  if (TARGET_AVX512VL
- || insn_mode == MODE_XI
- || insn_mode == MODE_V8DF
- || insn_mode == MODE_V16SF)
-   return "vpternlogd\t{$0xFF, %g0, %g0, %g0|%g0, %g0, %g0, 0xFF}";
-  else if (TARGET_AVX)
-   return "vpcmpeqd\t%0, %0, %0";
-  else
-   return "pcmpeqd\t%0, %0";
}
 
   gcc_unreachable ();
Index: i386.md
===
--- i386.md (revision 235411)
+++ i386.md (working copy)
@@ -1960,10 +1960,9 @@
 
 (define_insn "*mov_or"
   [(set (match_operand:SWI48 0 "register_operand" "=r")
-   (match_operand:SWI48 1 "const_int_operand"))
+   (match_operand:SWI48 1 "constm1_operand"))
(clobber (reg:CC FLAGS_REG))]
-  "reload_completed
-   && operands[1] == constm1_rtx"
+  "reload_completed"
   "or{}\t{%1, %0|%0, %1}"
   [(set_attr "type" "alu1")
(set_attr "mode" "")
@@ -2039,11 +2038,14 @@
(cond [(ior (match_operand 0 "ext_sse_reg_operand")
(match_operand 1 "ext_sse_reg_operand"))
 (const_string "XI")
-  (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
+  (and (eq_attr "alternative" "0")
+   (and (match_test "TARGET_AVX512VL")
+(match_operand 1 "constm1_operand")))
+(const_string "XI")
+  (ior (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
+   (and (eq_attr "alternative" "3")
+(match_test "TARGET_SSE_TYPELESS_STORES")))
 (const_string "V8SF")
-  (and (eq_attr "alternative" "3")
-   (match_test "TARGET_SSE_TYPELESS_STORES"))
-(const_string "V8SF")
  ]
  (const_string "OI")))])
 
@@ -2099,17 +2101,20 @@
(const_string "maybe_vex")
(const_string "orig")))
(set (attr "mode")
-   (cond [(ior (match_operand 0 "ext_sse_reg_operand")
+   (cond [(eq_attr "alternative" "0,1")
+(const_string "DI")
+  (ior (match_operand 0 "ext_sse_reg_operand")
(match_operand 1 "ext_sse_reg_operand"))
 (const_string "XI")
-  (eq_attr "alternative" "0,1")
-(const_string "DI")
+  (and (eq_attr "alternative" "2")
+   (and (match_test "TARGET_AVX512VL")
+(match_operand 1 "constm1_operand")))
+(const_string "XI")
   (ior (not (match_test "TARGET_SSE2"))
-   (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))
+   (ior (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
+(and (eq_attr "alternative" "5")
+ (match_test "TARGET_SSE_TYPELESS_STORES"
 (const_string "V4SF")
-  (and (eq_attr "alternative" "5")
-   (match_test "TARGET_SSE_TYPELESS_STORES"))
-(const_string "V4SF")
   (match_test "TARGET_AVX")
 (const_string "TI")
   (match_test "optimize_function_for_size_p (cfun)")
Index: predicates.md
===
--- predicates.md   (revision 235411)
+++ predicates.md   (working copy)
@@ -675,6 +675,11 @@
   return op == CONST1_RTX (mode);
 })
 
+;; Match exactly -1.
+(define_predicate "constm1_operand"
+  (and (match_code "const_int")
+   (match_test "op == constm1_rtx"))

Re: [PATCH][AArch64] Adjust SIMD integer preference

2016-04-25 Thread Evandro Menezes

On 04/22/16 10:35, Wilco Dijkstra wrote:

OK for trunk?


LGTM

--
Evandro Menezes



Re: [PATCH][AArch64][wwwdocs] Summarise some more AArch64 changes for GCC6

2016-04-25 Thread Evandro Menezes

On 04/21/16 03:15, Kyrill Tkachov wrote:

Ok to commit?


LGTM

--
Evandro Menezes



Re: [PATCH] Replace old AWK script (utilizing bc) with Python implementation

2016-04-25 Thread Matthias Klose

On 25.04.2016 16:57, Martin Liška wrote:

Hello.

As I've been playing with branch predictions and contrib/analyze_brprob script,
I've decided to replace the old script with a Python implementation.
Improvements:

+ fixed horizontal formatting
+ remove ugly utilization of bc that is used for arithmetics
+ script is a bit faster (tramp3d dump file): real 0m0.670s / 0m0.807s
+ usage of the script is more precisely explained and script comments are 
updated


Could you please make the shebang python3?  I'm not sure it's good to replace one 
old implementation with a soon-to-be-old implementation.


Matthias



Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-04-25 Thread Evandro Menezes

On 04/22/16 11:15, Wilco Dijkstra wrote:

This patch fixes that by setting the default aarch64_case_values_threshold to
16 when the per-CPU tuning is not set.  On SPEC2006 this improves the switch
heavy benchmarks GCC and perlbench both in performance (1-2%) as well as size
(0.5-1% smaller).


I assume that you mean that such improvements are true for 
-mcpu=generic, yes?  On which target, A53 or A57 or other?


Otherwise, it seems to be a sensible change, but I'm trying to 
understand how generally beneficial it is.


Thank you,

--
Evandro Menezes



Re: [PATCH, GCC 5] PR 70613, -fabi-version docs don't match implementation

2016-04-25 Thread Bernd Schmidt

On 04/25/2016 08:44 PM, Jim Wilson wrote:

On 04/18/2016 01:12 PM, Jim Wilson wrote:

On 04/11/2016 01:41 PM, Jim Wilson wrote:

Here is a patch to correct the -fabi-version docs on the GCC 5 branch.



https://gcc.gnu.org/ml/gcc-patches/2016-04/msg00480.html


ping^2


Cc'ing Jason as the most likely person to know whether this is right.


Bernd



Re: [PATCH, GCC 5] PR 70613, -fabi-version docs don't match implementation

2016-04-25 Thread Jim Wilson

On 04/18/2016 01:12 PM, Jim Wilson wrote:

On 04/11/2016 01:41 PM, Jim Wilson wrote:

Here is a patch to correct the -fabi-version docs on the GCC 5 branch.



https://gcc.gnu.org/ml/gcc-patches/2016-04/msg00480.html


ping^2

Jim



Re: C, C++: New warning for memset without multiply by elt size

2016-04-25 Thread Jason Merrill

On 04/25/2016 05:07 AM, Bernd Schmidt wrote:

+   if (TREE_CODE (arg2) == CONST_DECL)
+ arg2 = DECL_INITIAL (arg2);
+   int literal_mask = ((!!integer_zerop (arg1) << 1)
+   | (!!integer_zerop (arg2) << 2));


Are you deliberately treating an enumerator as a literal 0?  I'd think 
we should set literal_mask before stripping CONST_DECL.  OK with that 
change.


Jason



Re: [PATCH] add support for placing variables in shared memory

2016-04-25 Thread Alexander Monakov
On Mon, 25 Apr 2016, Nathan Sidwell wrote:
> On 04/22/16 10:04, Alexander Monakov wrote:
> > echo 'int v __attribute__((section("foo")));' |
> >x86_64-pc-linux-gnu-accel-nvptx-none-gcc -xc - -o /dev/null
> > :1:5: error: section attributes are not supported for this target
> 
> Presumably it's missing a necessary hook?  Couldn't such a hook check the
> section name is acceptable?

No, that really doesn't sound viable.  You'd need to somehow take into account
every instance where the compiler attempts to switch sections internally
(.text/.data/.bss, -ffunction-sections/-fdata-sections etc.).

> > > Why can it not apply to  variables of auto storage?  I.e. function scope,
> > > function lifetime?  That would seem to be a useful property.
> >
> > Because PTX does not support auto storage semantics for .shared data.  It's
> > statically allocated at link time.
> 
> I  suppose it's not worth going through hoops to define such function-scoped
> variables if PTX isn't going to take advantage of that.

It's not even about 'taking advantage', basic correctness expectations would be
violated (with auto storage you get new instances of the variable when
reentering function scope recursively).
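
A small host-side sketch of that expectation (hypothetical function names, plain C++ rather than anything nvptx-specific; `static` is used here only as the nearest host analogue of statically allocated storage):

```cpp
#include <cassert>

// With automatic storage every activation gets its own `slot`.
int depth_auto (int n)
{
  int slot = n;             // fresh instance per call
  if (n > 0)
    depth_auto (n - 1);     // inner calls cannot disturb our `slot`
  return slot;              // still n
}

// With static storage all activations share a single instance.
int depth_static (int n)
{
  static int slot;
  slot = n;
  if (n > 0)
    depth_static (n - 1);   // inner calls overwrite the shared `slot`
  return slot;              // 0 once the recursion has bottomed out
}
```

Calling `depth_auto (3)` yields 3, while `depth_static (3)` yields 0, which is the kind of silent behaviour change a function-scope variable would suffer if it were quietly given static allocation.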

> > > What happens if an initializer is present, is it silently ignored?
> >
> > GCC accepts and reemits it in assembly output (if non-zero), and ptxas
> > rejects
> > it ("syntax error").
> 
> ptx errors are inscrutable.
> 
> It would be better for nvptx_assemble_decl_end to check if an initializer has
> been output and emit an error (you'll need to record the decl itself in the
> initializer structure to do that).  Record the  decl in
> nvptx_assemble_decl_begin if the symbol's data area is .shared, and then check
> in NADE?

Ugh.  Checking DECL_INITIAL in nvptx_encode_section_info would be much
simpler (and that's how other backends perform a similar test).

Note, rejecting zero-initializers is debatable:
C and C++ don't have a concept of uninitialized global-scope data; if
the initializer is missing, it's exactly as if it was 0.  However, GCC has
-fcommon enabled by default (which, btw, shouldn't we change on NVPTX?), and
that makes a difference: 'int v = 0;' is a strong definition, while 'int v;'
becomes a common symbol, and ultimately a weak definition on NVPTX.

So if all-zeros initializers are rejected, to make a strong definition of a
shared variable one would have to write:

  int v __attribute__((shared,nocommon));

With -fno-common enabled by default on this target that wouldn't be an issue.

Thanks.
Alexander


Re: [PATCH] Fix PR c++/70241 (inconsistent access with in-class enumeration)

2016-04-25 Thread Patrick Palka
On Sun, Apr 17, 2016 at 2:01 PM, Patrick Palka  wrote:
> When an in-class unscoped enumeration is defined out-of-line its
> enumerators currently don't inherit the access of the enumeration.  This
> patch makes the access of enumerators defined out-of-line match the
> access of their enumeration.
>
> Also, we currently don't check that redeclarations of in-class
> enumerations have the same access, which this patch fixes as well.
>
> Bootstrapped + regtested on x86_64-pc-linux-gnu, does this look OK to
> commit?
>
> gcc/cp/ChangeLog:
>
> PR c++/70241
> * decl.c (build_enumerator): Set current_access_specifier when
> declaring an enumerator belonging to an in-class enumeration.
> * parser.c (cp_parser_check_access_in_redeclaration): Also
> consider in-class enumerations.
>
> gcc/testsuite/ChangeLog:
>
> PR c++/70241
> * g++.dg/cpp0x/enum32.C: New test.
> * g++.dg/cpp0x/enum33.C: New test.
> ---
>  gcc/cp/decl.c   | 28 
>  gcc/cp/parser.c |  8 +---
>  gcc/testsuite/g++.dg/cpp0x/enum32.C | 25 +
>  gcc/testsuite/g++.dg/cpp0x/enum33.C | 11 +++
>  4 files changed, 65 insertions(+), 7 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp0x/enum32.C
>  create mode 100644 gcc/testsuite/g++.dg/cpp0x/enum33.C
>
> diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
> index f9f12a7..0f217a5 100644
> --- a/gcc/cp/decl.c
> +++ b/gcc/cp/decl.c
> @@ -13694,10 +13694,30 @@ incremented enumerator value is too large for 
> %");
>  cplus_decl_attributes (&decl, attributes, 0);
>
>if (context && context == current_class_type && !SCOPED_ENUM_P (enumtype))
> -/* In something like `struct S { enum E { i = 7 }; };' we put `i'
> -   on the TYPE_FIELDS list for `S'.  (That's so that you can say
> -   things like `S::i' later.)  */
> -finish_member_declaration (decl);
> +{
> +  /* In something like `struct S { enum E { i = 7 }; };' we put `i'
> +on the TYPE_FIELDS list for `S'.  (That's so that you can say
> +things like `S::i' later.)  */
> +
> +  /* The enumerator may be getting declared outside of its enclosing
> +class, like so:
> +
> +  class S { public: enum E : int; }; enum S::E : int { i = 7 };
> +
> +For which case we need to make sure that the access of `S::i'
> +matches the access of `S::E'.  */
> +  tree saved_cas = current_access_specifier;
> +  if (TREE_PRIVATE (TYPE_NAME (enumtype)))
> +   current_access_specifier = access_private_node;
> +  else if (TREE_PROTECTED (TYPE_NAME (enumtype)))
> +   current_access_specifier = access_protected_node;
> +  else
> +   current_access_specifier = access_public_node;
> +
> +  finish_member_declaration (decl);
> +
> +  current_access_specifier = saved_cas;
> +}
>else
>  pushdecl (decl);
>
> diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
> index 5486129..f782d70 100644
> --- a/gcc/cp/parser.c
> +++ b/gcc/cp/parser.c
> @@ -27228,13 +27228,15 @@ cp_parser_check_class_key (enum tag_types 
> class_key, tree type)
>
>  /* Issue an error message if DECL is redeclared with different
> access than its original declaration [class.access.spec/3].
> -   This applies to nested classes and nested class templates.
> -   [class.mem/1].  */
> +   This applies to nested classes, nested class templates and
> +   enumerations [class.mem/1].  */
>
>  static void
>  cp_parser_check_access_in_redeclaration (tree decl, location_t location)
>  {
> -  if (!decl || !CLASS_TYPE_P (TREE_TYPE (decl)))
> +  if (!decl
> +  || (!CLASS_TYPE_P (TREE_TYPE (decl))
> + && TREE_CODE (TREE_TYPE (decl)) != ENUMERAL_TYPE))
>  return;
>
>if ((TREE_PRIVATE (decl)
> diff --git a/gcc/testsuite/g++.dg/cpp0x/enum32.C 
> b/gcc/testsuite/g++.dg/cpp0x/enum32.C
> new file mode 100644
> index 000..9d7a7b5
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp0x/enum32.C
> @@ -0,0 +1,25 @@
> +// PR c++/70241
> +// { dg-do compile { target c++11 } }
> +
> +class A {
> +public:
> +   enum B : int;
> +};
> +
> +enum A::B : int {
> +   x
> +};
> +
> +struct C {
> +private:
> +enum D : int;
> +};
> +
> +enum C::D : int {
> +   y
> +};
> +
> +int main() {
> +   A::x;
> +   C::y; // { dg-error "private" }
> +}
> diff --git a/gcc/testsuite/g++.dg/cpp0x/enum33.C 
> b/gcc/testsuite/g++.dg/cpp0x/enum33.C
> new file mode 100644
> index 000..ac39741
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp0x/enum33.C
> @@ -0,0 +1,11 @@
> +// PR c++/70241
> +// { dg-do compile { target c++11 } }
> +
> +class A {
> +public:
> +  enum B : int;
> +  enum class C : int;
> +private:
> +  enum B : int { }; // { dg-error "different access" }
> +  enum class C : int { }; // { dg-error "different access" }
> +};
> --
> 2.8.1.231.g95ac767
>

Ping.


Re: C++ PATCH for c++/70744 (wrong-code with x ?: y extension)

2016-04-25 Thread Jason Merrill

On 04/25/2016 11:35 AM, Marek Polacek wrote:

On Fri, Apr 22, 2016 at 03:28:27PM -0400, Jason Merrill wrote:

On Fri, Apr 22, 2016 at 2:12 PM, Marek Polacek  wrote:

+cp_stabilize_reference (tree ref)
+{
+  if (TREE_CODE (ref) == PREINCREMENT_EXPR
+  || TREE_CODE (ref) == PREDECREMENT_EXPR)


I think we want to do this for anything stabilize_reference doesn't
handle specifically, not just pre..crement.


Which would mean something along the lines of the following, I hope.


Yes, except let's drop this case:


+case COMPOUND_EXPR:


because stabilize_expr will do the wrong thing for lvalue COMPOUND_EXPR.

OK with that change.

Jason



Re: RFA: PATCH to tell gdb to skip over is-a.h inlines

2016-04-25 Thread Jeff Law

On 04/25/2016 11:28 AM, Jason Merrill wrote:

There doesn't seem to be any need to step through the is-a inline
functions.  OK for trunk?

Yes, please :-)
jeff



RFA: PATCH to tell gdb to skip over is-a.h inlines

2016-04-25 Thread Jason Merrill
There doesn't seem to be any need to step through the is-a inline 
functions.  OK for trunk?
commit 1b74375b17e37ab7c5f96944148ff5a6bff3f8bc
Author: Jason Merrill 
Date:   Wed Apr 20 10:21:02 2016 -0400

	* gdbinit.in: Skip is-a.h.

diff --git a/gcc/gdbinit.in b/gcc/gdbinit.in
index af7d51a..d221130 100644
--- a/gcc/gdbinit.in
+++ b/gcc/gdbinit.in
@@ -246,6 +246,9 @@ set check type off
 # See https://sourceware.org/gdb/current/onlinedocs/gdb/Skipping-Over-Functions-and-Files.html
 skip file tree.h
 
+# Also skip inline functions in is-a.h.
+skip file is-a.h
+
 # Likewise, skip various inline functions in rtl.h.
 skip rtx_expr_list::next
 skip rtx_expr_list::element


C++ PATCH to implement C++17 maybe_unused attribute

2016-04-25 Thread Jason Merrill
The C++17 maybe_unused attribute is mostly equivalent to the GNU unused 
attribute, except that it can also be applied to enumerators.
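
As a rough illustration of where the attribute may appear (a sketch with made-up names, assuming a compiler in C++17 mode; it is not taken from the patch or its testsuite):

```cpp
#include <cassert>

// Hypothetical names throughout.
enum Mode
{
  kFast,
  kSlow [[maybe_unused]]    // attribute on an enumerator (the C++17 addition)
};

// On a function that might never be referenced.
[[maybe_unused]] static int helper () { return 7; }

// On a parameter and on a local variable.
int scaled (int value, [[maybe_unused]] int debug_level)
{
  [[maybe_unused]] int scratch = value * 2;
  return value + kFast;     // kFast is 0, so this returns value
}
```

The attribute only suppresses unused-entity warnings; it does not change what the code computes.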


I was surprised to see that there currently isn't a table of C++ 
standard attributes; all the standard attributes we already support are 
handled by translating them into GNU attributes.  That doesn't work as 
well in this case because the name is different, and it seems to me we 
might as well start a table of standard attributes.  To make that work I 
needed to declare register_scoped_attributes in attribs.h and fix a 
logic error in that function.


Since the semantics are the same, I'm using the same 
handle_unused_attribute function and just allowing it to accept CONST_DECL.


Currently __has_cpp_attribute is implemented using the same hook as 
__has_attribute, so the function doesn't know whether we're asking about 
a GNU attribute or standard attribute.  For now let's leave that alone 
and just handle the special standard values without trying to look up 
the attribute_spec.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 0c140407be1dbcb43fa4e2047d5aa65cb5ac5ff8
Author: Jason Merrill 
Date:   Tue Mar 8 16:27:45 2016 -0500

	Implement C++17 [[maybe_unused]] attribute.

gcc/
	* attribs.c (register_scoped_attributes): Fix logic.
	* attribs.h: Declare register_scoped_attributes.
c-family/
	* c-common.c (handle_unused_attribute): Accept CONST_DECL.
	No longer static.
	* c-common.h: Declare it.
	* c-lex.c (c_common_has_attribute): Add maybe_unused.
cp/
	* tree.c (std_attribute_table): New.
	(init_tree): Register it.

diff --git a/gcc/attribs.c b/gcc/attribs.c
index 16996e9..9a88621 100644
--- a/gcc/attribs.c
+++ b/gcc/attribs.c
@@ -130,7 +130,7 @@ register_scoped_attributes (const struct attribute_spec * attributes,
   /* We don't have any namespace NS yet.  Create one.  */
   scoped_attributes sa;
 
-  if (!attributes_table.is_empty ())
+  if (attributes_table.is_empty ())
 	attributes_table.create (64);
 
   memset (&sa, 0, sizeof (sa));
diff --git a/gcc/attribs.h b/gcc/attribs.h
index 9e64a7a..23d3043 100644
--- a/gcc/attribs.h
+++ b/gcc/attribs.h
@@ -38,4 +38,7 @@ extern tree get_attribute_name (const_tree);
 extern void apply_tm_attr (tree, tree);
 extern tree make_attribute (const char *, const char *, tree);
 
+extern struct scoped_attributes* register_scoped_attributes (const struct attribute_spec *,
+			 const char *);
+
 #endif // GCC_ATTRIBS_H
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index cae2faf..1edc0bc 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -327,7 +327,6 @@ static tree handle_artificial_attribute (tree *, tree, tree, int, bool *);
 static tree handle_flatten_attribute (tree *, tree, tree, int, bool *);
 static tree handle_error_attribute (tree *, tree, tree, int, bool *);
 static tree handle_used_attribute (tree *, tree, tree, int, bool *);
-static tree handle_unused_attribute (tree *, tree, tree, int, bool *);
 static tree handle_externally_visible_attribute (tree *, tree, tree, int,
 		 bool *);
 static tree handle_no_reorder_attribute (tree *, tree, tree, int,
@@ -7033,7 +7032,7 @@ handle_used_attribute (tree *pnode, tree name, tree ARG_UNUSED (args),
 /* Handle a "unused" attribute; arguments as in
struct attribute_spec.handler.  */
 
-static tree
+tree
 handle_unused_attribute (tree *node, tree name, tree ARG_UNUSED (args),
 			 int flags, bool *no_add_attrs)
 {
@@ -7044,6 +7043,7 @@ handle_unused_attribute (tree *node, tree name, tree ARG_UNUSED (args),
   if (TREE_CODE (decl) == PARM_DECL
 	  || VAR_OR_FUNCTION_DECL_P (decl)
 	  || TREE_CODE (decl) == LABEL_DECL
+	  || TREE_CODE (decl) == CONST_DECL
 	  || TREE_CODE (decl) == TYPE_DECL)
 	{
 	  TREE_USED (decl) = 1;
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 663e457..4c43a35 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -790,6 +790,7 @@ extern void check_function_arguments_recurse (void (*)
 	  unsigned HOST_WIDE_INT);
 extern bool check_builtin_function_arguments (tree, int, tree *);
 extern void check_function_format (tree, int, tree *);
+extern tree handle_unused_attribute (tree *, tree, tree, int, bool *);
 extern tree handle_format_attribute (tree *, tree, tree, int, bool *);
 extern tree handle_format_arg_attribute (tree *, tree, tree, int, bool *);
 extern bool attribute_takes_identifier_p (const_tree);
diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index 96da4fc..6b020a4 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -340,23 +340,26 @@ c_common_has_attribute (cpp_reader *pfile)
 		  attr_name = NULL_TREE;
 		}
 	}
-	}
-  if (attr_name)
-	{
-	  init_attributes ();
-	  const struct attribute_spec *attr = lookup_attribute_spec (attr_name);
-	  if (attr)
+	  else
 	{
-	  if (TREE_CODE (attr_name) == TREE_LIST)
-		attr_name = TREE_VALUE (attr_nam

Re: [PATCH, fortran, v3] Use Levenshtein spelling suggestions in Fortran FE

2016-04-25 Thread David Malcolm
On Sat, 2016-04-23 at 20:21 +0200, Bernhard Reutner-Fischer wrote:
> On March 7, 2016 3:57:16 PM GMT+01:00, David Malcolm <
> dmalc...@redhat.com> wrote:
> > On Sat, 2016-03-05 at 23:46 +0100, Bernhard Reutner-Fischer wrote:
> > [...]
> > 
> > > diff --git a/gcc/fortran/misc.c b/gcc/fortran/misc.c
> > > index 405bae0..72ed311 100644
> > > --- a/gcc/fortran/misc.c
> > > +++ b/gcc/fortran/misc.c
> > [...]
> > 
> > > @@ -274,3 +275,41 @@ get_c_kind(const char *c_kind_name, CInteropKind_t kinds_table[])
> > >  
> > >return ISOCBINDING_INVALID;
> > >  }
> > > +
> > > +
> > > +/* For a given name TYPO, determine the best candidate from
> > > CANDIDATES
> > > +   perusing Levenshtein distance.  Frees CANDIDATES before
> > > returning.  */
> > > +
> > > +const char *
> > > +gfc_closest_fuzzy_match (const char *typo, char **candidates)
> > > +{
> > > +  /* Determine closest match.  */
> > > +  const char *best = NULL;
> > > +  char **cand = candidates;
> > > +  edit_distance_t best_distance = MAX_EDIT_DISTANCE;
> > > +
> > > +  while (cand && *cand)
> > > +{
> > > +  edit_distance_t dist = levenshtein_distance (typo, *cand);
> > > +  if (dist < best_distance)
> > > + {
> > > +best_distance = dist;
> > > +best = *cand;
> > > + }
> > > +  cand++;
> > > +}
> > > +  /* If more than half of the letters were misspelled, the
> > > suggestion is
> > > + likely to be meaningless.  */
> > > +  if (best)
> > > +{
> > > +  unsigned int cutoff = MAX (strlen (typo), strlen (best)) /
> > > 2;
> > > +
> > > +  if (best_distance > cutoff)
> > > + {
> > > +   XDELETEVEC (candidates);
> > > +   return NULL;
> > > + }
> > > +  XDELETEVEC (candidates);
> > > +}
> > > +  return best;
> > > +}
> > 
> > FWIW, there are two overloaded variants of levenshtein_distance in
> > gcc/spellcheck.h, the first of which takes a pair of strlen values;
> > your patch uses the second one:
> > 
> > extern edit_distance_t
> > levenshtein_distance (const char *s, int len_s,
> >   const char *t, int len_t);
> > 
> > extern edit_distance_t
> > levenshtein_distance (const char *s, const char *t);
> > 
> > So one minor tweak you may want to consider here is to calculate
> >  strlen (typo)
> > once at the top of gfc_closest_fuzzy_match, and then pass it in to
> > the
> > 4-arg variant of levenshtein_distance, which would avoid
> > recalculating
> > strlen (typo) for every candidate.
> 
> I've pondered this back then but came to the conclusion to use the
> variant without len because to use the 4 argument variant I would
> have stored the candidates strlen in the vector too

Why would you need to do that?  You can simply call strlen inside the
loop instead; something like:

  size_t strlen_typo = strlen (typo);
  while (cand && *cand)
{
  edit_distance_t dist = levenshtein_distance (typo, strlen_typo,
   *cand, strlen (*cand));

etc
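
To make the suggestion above concrete, here is a minimal self-contained sketch.
It does not use GCC's spellcheck.h: edit_distance and closest_match are
hypothetical stand-ins for the 4-argument levenshtein_distance and for
gfc_closest_fuzzy_match, and unlike the quoted patch the sketch does not free
the candidate array.  The point is only that strlen (typo) is computed once,
outside the loop, while each candidate's length is taken inside it.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the 4-argument levenshtein_distance:
// classic two-row dynamic programming over explicit lengths.
static std::size_t
edit_distance (const char *s, std::size_t len_s,
               const char *t, std::size_t len_t)
{
  if (len_s == 0)
    return len_t;
  if (len_t == 0)
    return len_s;

  std::vector<std::size_t> prev (len_t + 1), cur (len_t + 1);
  for (std::size_t j = 0; j <= len_t; j++)
    prev[j] = j;

  for (std::size_t i = 1; i <= len_s; i++)
    {
      cur[0] = i;
      for (std::size_t j = 1; j <= len_t; j++)
        {
          // Cost of substituting s[i-1] by t[j-1] (0 if they are equal).
          std::size_t subst = prev[j - 1] + (s[i - 1] != t[j - 1]);
          cur[j] = std::min ({prev[j] + 1, cur[j - 1] + 1, subst});
        }
      std::swap (prev, cur);
    }
  return prev[len_t];
}

// Returns the closest candidate from a NULL-terminated array, or NULL
// if nothing is close enough.  Does not take ownership of CANDIDATES.
static const char *
closest_match (const char *typo, const char *const *candidates)
{
  const std::size_t len_typo = std::strlen (typo);  // computed once
  const char *best = NULL;
  std::size_t best_distance = 0;

  for (const char *const *cand = candidates; cand && *cand; cand++)
    {
      std::size_t dist = edit_distance (typo, len_typo,
                                        *cand, std::strlen (*cand));
      if (!best || dist < best_distance)
        {
          best_distance = dist;
          best = *cand;
        }
    }

  // Reject the suggestion if more than half of the letters differ.
  if (best
      && best_distance > std::max (len_typo, std::strlen (best)) / 2)
    return NULL;
  return best;
}
```

The "more than half" cutoff mirrors the rule in the quoted patch; everything
else is illustrative.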

>  and was not convinced about the memory footprint for that would be
> justified. Maybe it is, but I would prefer the following tweak in the
> 4 argument variant:
> If you would amend the 4 argument variant with a
> 
>   if (len_t == -1)
> len_t = strlen (t);
> before the
>if (len_s == 0)
>  return len_t;
>if (len_t == 0)
>  return len_s;
> 
> checks then I'd certainly use the 4 arg variant :)
> 
> WDYT?
> > 
> > I can't comment on the rest of the patch (I'm not a Fortran
> > expert),
> > though it seems sane to 
> > 
> > Hope this is constructive
> 
> It is, thanks for your thoughts!
> 
> cheers,
> 


[PATCH GCC]Cleanup tree ifcvt by renaming any_mask_load_store.

2016-04-25 Thread Bin Cheng
Hi,
This is a simple patch for tree ifcvt.  It renames the variable any_mask_load_store 
to any_pred_load_store, and makes the variable visible at file scope.  
The first rationale is that the variable's name is easily confused with masked 
load/store.  In fact, it also covers cases in which a data-race store is 
introduced, and that's not a masked load/store at all.  From the point of view of 
the variable's def/use, it's clear the variable indicates that we introduce a new 
load/store during if-conversion that needs to be predicated by some condition.  
The second rationale is that the variable records global flag information and is 
used in many places.  Together with the patch at 
https://gcc.gnu.org/ml/gcc-patches/2016-04/msg01395.html, this patch resolves 
the ambiguity of the variable and prepares for the next patch, fixing PR56541.

Bootstrapped and tested on x86_64 and AArch64.  Is it OK?

Thanks,
bin

2016-04-22  Bin Cheng  

* tree-if-conv.c (any_pred_load_store): New static variable.
(if_convertible_gimple_assign_stmt_p): Remove parameter.  Use
any_pred_load_store instead of any_mask_load_store.
(if_convertible_stmt_p, if_convertible_loop_p_1): Ditto.
(if_convertible_loop_p, insert_gimplified_predicates): Ditto.
(combine_blocks, tree_if_conversion): Ditto.
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index 744d6f5..32ced16 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -113,6 +113,10 @@ along with GCC; see the file COPYING3.  If not see
 #include "varasm.h"
 #include "builtins.h"
 #include "params.h"
+ 
+/* Indicate if new load/store that needs to be predicated is introduced
+   during if conversion.  */
+static bool any_pred_load_store;
 
 /* Hash for struct innermost_loop_behavior.  It depends on the user to
free the memory.  */
@@ -867,8 +871,7 @@ ifcvt_can_use_mask_load_store (gimple *stmt)
 
 static bool
 if_convertible_gimple_assign_stmt_p (gimple *stmt,
-vec<data_reference_p> refs,
-bool *any_mask_load_store)
+vec<data_reference_p> refs)
 {
   tree lhs = gimple_assign_lhs (stmt);
 
@@ -906,7 +909,7 @@ if_convertible_gimple_assign_stmt_p (gimple *stmt,
   if (ifcvt_can_use_mask_load_store (stmt))
{
  gimple_set_plf (stmt, GF_PLF_2, true);
- *any_mask_load_store = true;
+ any_pred_load_store = true;
  return true;
}
   if (dump_file && (dump_flags & TDF_DETAILS))
@@ -917,7 +920,7 @@ if_convertible_gimple_assign_stmt_p (gimple *stmt,
   /* When if-converting stores force versioning, likewise if we
  ended up generating store data races.  */
   if (gimple_vdef (stmt))
-*any_mask_load_store = true;
+any_pred_load_store = true;
 
   return true;
 }
@@ -930,8 +933,7 @@ if_convertible_gimple_assign_stmt_p (gimple *stmt,
- it is builtins call.  */
 
 static bool
-if_convertible_stmt_p (gimple *stmt, vec<data_reference_p> refs,
-  bool *any_mask_load_store)
+if_convertible_stmt_p (gimple *stmt, vec<data_reference_p> refs)
 {
   switch (gimple_code (stmt))
 {
@@ -941,8 +943,7 @@ if_convertible_stmt_p (gimple *stmt, vec<data_reference_p> refs,
   return true;
 
 case GIMPLE_ASSIGN:
-  return if_convertible_gimple_assign_stmt_p (stmt, refs,
- any_mask_load_store);
+  return if_convertible_gimple_assign_stmt_p (stmt, refs);
 
 case GIMPLE_CALL:
   {
@@ -1248,9 +1249,7 @@ predicate_bbs (loop_p loop)
in if_convertible_loop_p.  */
 
 static bool
-if_convertible_loop_p_1 (struct loop *loop,
-vec<data_reference_p> *refs,
-bool *any_mask_load_store)
+if_convertible_loop_p_1 (struct loop *loop, vec<data_reference_p> *refs)
 {
   unsigned int i;
   basic_block exit_bb = NULL;
@@ -1354,8 +1353,7 @@ if_convertible_loop_p_1 (struct loop *loop,
   /* Check the if-convertibility of statements in predicated BBs.  */
   if (!dominated_by_p (CDI_DOMINATORS, loop->latch, bb))
for (itr = gsi_start_bb (bb); !gsi_end_p (itr); gsi_next (&itr))
- if (!if_convertible_stmt_p (gsi_stmt (itr), *refs,
- any_mask_load_store))
+ if (!if_convertible_stmt_p (gsi_stmt (itr), *refs))
return false;
 }
 
@@ -1389,7 +1387,7 @@ if_convertible_loop_p_1 (struct loop *loop,
- if its basic blocks and phi nodes are if convertible.  */
 
 static bool
-if_convertible_loop_p (struct loop *loop, bool *any_mask_load_store)
+if_convertible_loop_p (struct loop *loop)
 {
   edge e;
   edge_iterator ei;
@@ -1427,7 +1425,7 @@ if_convertible_loop_p (struct loop *loop, bool 
*any_mask_load_store)
   return false;
 
   refs.create (5);
-  res = if_convertible_loop_p_1 (loop, &refs, any_mask_load_store);
+  res = if_convertible_loop_p_1 (loop, &refs);
 
   data_reference_p dr;
   unsigned int i;
@@ -1896,7 +1894,7 @@ predicate_all_scalar_phis (struct loop *loop)
gimplification of the predicates.  */
 
 static void
-insert_gimplified_

Re: [PATCH] AARCH64: Remove spurious attribute __unused__ from NEON intrinsic

2016-04-25 Thread James Greenhalgh
On Mon, Apr 25, 2016 at 05:39:45PM +0200, Wladimir J. van der Laan wrote:
> 
> Thanks for the info with regard to contributing,
> 
> On Fri, Apr 22, 2016 at 09:40:11AM +0100, James Greenhalgh wrote:
> > This patch will need a ChangeLog entry [1], please draft one that I can
> > use when I apply the patch.
> 
> * gcc/config/aarch64/arm_neon.h: Remove spurious attribute __unused__ from 
> parameter of vdupb_laneq_s intrinsic

Close... This should look like:

2016-04-25  Wladimir J. van der Laan  

* config/aarch64/arm_neon.h (vdupb_laneq_s): Remove spurious
attribute __unused__

Can you confirm that this is how you want your name and email address
to appear in the ChangeLog?  If so, I'll commit the patch for you.

> > I'm guessing that you don't have a copyright assignment on file with the
> > FSF. While trivial changes like this don't generally need one, if you plan
> > to contribute more substantial changed to GCC in future, you may want to
> > start the process (see [2]).
> 
> I intend to do this, but indeed let's not hold this two-word change up
> on that.

Thanks,
James



Re: Document OpenACC status for GCC 6

2016-04-25 Thread Jakub Jelinek
On Fri, Apr 22, 2016 at 11:26:11AM +0200, Thomas Schwinge wrote:
> Index: htdocs/gcc-6/changes.html
> ===
> RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/changes.html,v
> retrieving revision 1.75
> diff -u -p -r1.75 changes.html

LGTM.

> --- htdocs/gcc-6/changes.html 21 Apr 2016 15:57:43 -  1.75
> +++ htdocs/gcc-6/changes.html 22 Apr 2016 09:22:19 -
> @@ -124,6 +124,52 @@ For more information, see the
>  
>  New Languages and Language specific improvements
>  
> +Compared to GCC 5, the GCC 6 release series includes a much 
> improved
> +implementation of the <a href="http://www.openacc.org/">OpenACC 2.0a
> +  specification</a>.  Highlights are:
> +
> +  In addition to single-threaded host-fallback execution, offloading 
> is
> + supported for nvptx (Nvidia GPUs) on x86_64 and PowerPC 64-bit
> + little-endian GNU/Linux host systems.  For nvptx offloading, with the
> + OpenACC parallel construct, the execution model allows for an arbitrary
> + number of gangs, up to 32 workers, and 32 vectors.
> +  Initial support for parallelized execution of OpenACC kernels
> + constructs:
> + 
> +   Parallelization of a kernels region is switched on
> + by -fopenacc combined with -O2 or
> + higher.
> +   Code is offloaded onto multiple gangs, but executes with just one
> + worker, and a vector length of 1.
> +   Directives inside a kernels region are not supported.
> +   Loops with reductions can be parallelized.
> +   Only kernels regions with one loop nest are parallelized.
> +   Only the outer-most loop of a loop nest can be parallelized.
> +   Loop nests containing sibling loops are not parallelized.
> + 
> + Typically, using the OpenACC parallel construct gives much better
> + performance, compared to the initial support of the OpenACC kernels
> + construct.
> +  The device_type clause is not supported.
> + The bind and nohost clauses are not
> + supported.  The host_data directive is not supported in
> + Fortran.
> +  Nested parallelism (cf. CUDA dynamic parallelism) is not
> + supported.
> +  Usage of OpenACC constructs inside multithreaded contexts (such as
> + created by OpenMP, or pthread programming) is not supported.
> +  If a call to the acc_on_device function has a
> + compile-time constant argument, the function call evaluates to a
> + compile-time constant value only for C and C++ but not for
> + Fortran.
> +
> +See the <a href="https://gcc.gnu.org/wiki/OpenACC">OpenACC</a>
> +and <a href="https://gcc.gnu.org/wiki/Offloading">Offloading</a> wiki pages
> +for further information.
> +  
> +
>  
>  
>  C family

Jakub
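
For readers unfamiliar with the construct named in the quoted text, a minimal
sketch of an OpenACC parallel loop with a reduction follows.  The function
name sum_elements is hypothetical and the example is not taken from the
patch.  It assumes that, when built without -fopenacc, the compiler ignores
the pragma and the loop simply runs sequentially on the host.

```cpp
#include <cstddef>
#include <vector>

// Sums the elements of V.  With -fopenacc the loop is marked as an
// OpenACC parallel region with a sum reduction; without that option the
// pragma is ignored and the loop runs sequentially on the host.
static double
sum_elements (const std::vector<double> &v)
{
  const double *data = v.data ();
  const std::size_t n = v.size ();
  double sum = 0.0;

#pragma acc parallel loop reduction(+:sum) copyin(data[0:n])
  for (std::size_t i = 0; i < n; i++)
    sum += data[i];

  return sum;
}
```

Whether such a loop is actually offloaded depends on how the compiler was
configured, as described in the quoted changes.html text.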


Re: [PATCH GCC]Improve tree ifconv by handling virtual PHIs which can be degenerated.

2016-04-25 Thread Bin.Cheng
On Fri, Apr 22, 2016 at 11:47 AM, Richard Biener
 wrote:
> On Fri, Apr 22, 2016 at 12:33 PM, Bin.Cheng  wrote:
>> On Fri, Apr 22, 2016 at 11:25 AM, Richard Biener
>>  wrote:
>>> On Fri, Apr 22, 2016 at 12:07 PM, Bin Cheng  wrote:
 Hi,
 Tree if-conv has below code checking on virtual PHI nodes in 
 if_convertible_phi_p:

   if (any_mask_load_store)
 return true;

   /* When there were no if-convertible stores, check
  that there are no memory writes in the branches of the loop to be
  if-converted.  */
   if (virtual_operand_p (gimple_phi_result (phi)))
 {
   imm_use_iterator imm_iter;
   use_operand_p use_p;

   if (bb != loop->header)
 {
   if (dump_file && (dump_flags & TDF_DETAILS))
 fprintf (dump_file, "Virtual phi not on loop->header.\n");
   return false;
 }

   FOR_EACH_IMM_USE_FAST (use_p, imm_iter, gimple_phi_result (phi))
 {
   if (gimple_code (USE_STMT (use_p)) == GIMPLE_PHI
   && USE_STMT (use_p) != phi)
 {
   if (dump_file && (dump_flags & TDF_DETAILS))
 fprintf (dump_file, "Difficult to handle this virtual 
 phi.\n");
   return false;
 }
 }
 }

 After investigation, I think it's to bypass code in the form of:

 
   .MEM_2232 = PHI <.MEM_574(179), .MEM_1247(183)> //  // >>
>>> Doesn't this undo my fix for degenerate non-virtual PHIs?
>> No, since we already support degenerate non-virtual PHIs in
>> predicate_scalar_phi, your fix is also for virtual PHIs handling.
>
> Was it?  I don't remember ;)  I think it was for a non-virtual PHI.
> Anyway, you should
> see the PR70725 testcase fail again if not.
>
>>>
>>> I believe we can just drop virtual PHIs and rely on
>>>
>>>   if (any_mask_load_store)
>>> {
>>>   mark_virtual_operands_for_renaming (cfun);
>>>   todo |= TODO_update_ssa_only_virtuals;
>>> }
>>>
>>> re-creating them from scratch.  To do better than that we'd simply
>> I tried this, simply enable above code for all cases can't resolve
>> verify_ssa issue.  I haven't look into the details, looks like ssa
>> def-use chain is corrupted in if-conversion if we don't process it
>> explicitly.  Maybe it's possible along with your below suggestions,
>> but we need to handle uses outside of loop too.
>
> Yes.  I don't like all the new code to deal with virtual PHIs when doing
> it correctly would also avoid the above virtual SSA update ...
>
> After all the above seems to work for the case of if-converted stores
> (which is where virtual PHIs appear as well, even not degenerate).
> So I don't see exactly how it would break in the other case.  I suppose
> you may need to call mark_virtual_phi_result_for_renaming () on
> all virtual PHIs.
>
Hi Richard,
Here is the updated patch.  It also fixes PR70771 & PR70775.  The root
cause of the ICE is in the fix for PR70725, which forgot to release
single-argument PHI nodes after replacing their uses.  In
combine_blocks, these PHIs are removed from the basic block but are still
live in the IR.  As a result, the SSA def/use chains for these PHIs are
left in a broken state, so an ICE is triggered whenever the SSA use list
is accessed.
In this updated patch, I made the change below to update virtual SSA
unconditionally.  With this change, we don't need to handle virtual
PHIs explicitly, and the single-argument PHI related code in the fix for
PR70725 can also be re

Re: [PATCH] PR target/70155: Use SSE for TImode load/store

2016-04-25 Thread Ilya Enkovich
2016-04-25 18:27 GMT+03:00 H.J. Lu :
>
> Ilya, can you take a look?
>
> Thanks.
>
> --
> H.J.

Hi,

Algorithmic part of the patch looks OK to me except the following piece of code.

+/* Check REF's chain to add new insns into a queue
+   and find registers requiring conversion.  */

Comment is wrong because you don't have any conversions required for
your candidates.

+
+void
+scalar_chain_64::analyze_register_chain (bitmap candidates, df_ref ref)
+{
+  df_link *chain;
+
+  gcc_assert (bitmap_bit_p (insns, DF_REF_INSN_UID (ref))
+ || bitmap_bit_p (candidates, DF_REF_INSN_UID (ref)));
+  add_to_queue (DF_REF_INSN_UID (ref));
+
+  for (chain = DF_REF_CHAIN (ref); chain; chain = chain->next)
+{
+  unsigned uid = DF_REF_INSN_UID (chain->ref);
+
+  if (!NONDEBUG_INSN_P (DF_REF_INSN (chain->ref)))
+   continue;
+
+  if (!DF_REF_REG_MEM_P (chain->ref))
+   continue;

I believe you wrongly jump to the next ref here instead of actually adding it
to a queue.  You may just use

gcc_assert (!DF_REF_REG_MEM_P (chain->ref));

because you shouldn't have a candidate used in an address operand.

+
+  if (bitmap_bit_p (insns, uid))
+   continue;
+
+  if (bitmap_bit_p (candidates, uid))
+   add_to_queue (uid);

Probably gcc_assert (bitmap_bit_p (candidates, uid)) since no uses and defs
out of candidates list are allowed?

+}
+}

Thanks,
Ilya


[PATCH] Prevent LTO wrappers to process a recursive execution

2016-04-25 Thread Martin Liška
Hello.

To make the LTO wrappers (gcc-nm, gcc-ar, gcc-ranlib) smarter, I would like to
prevent these wrappers from executing the same binary again.  For LTO testing I
symlink ar (nm, ranlib) to these wrappers instead of hacking a build system to
respect the NM (AR, RANLIB) environment variables.  The only problem with that
solution is that the wrappers recursively execute themselves, because the first
folder in PATH is the location of the wrappers.

The following patch prevents such recursion.

The patch bootstraps and regtests on x86_64-linux-gnu.

Ready for trunk?
Thanks,
Martin
From dfe0486ad7babe3d6de349001d4790684dc94bfb Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 22 Apr 2016 17:57:23 +0200
Subject: [PATCH] Prevent LTO wrappers to process a recursive execution

gcc/ChangeLog:

2016-04-22  Martin Liska  

	* file-find.c (remove_prefix): New function.
	* file-find.h (remove_prefix): Declare the function.
	* gcc-ar.c (main): Skip a folder of the wrapper if
	a wrapped binary would point to the same file.
---
 gcc/file-find.c | 35 +++
 gcc/file-find.h |  1 +
 gcc/gcc-ar.c|  8 
 3 files changed, 44 insertions(+)

diff --git a/gcc/file-find.c b/gcc/file-find.c
index 289ef28..1066da9 100644
--- a/gcc/file-find.c
+++ b/gcc/file-find.c
@@ -208,3 +208,38 @@ prefix_from_string (const char *p, struct path_prefix *pprefix)
 }
   free (nstore);
 }
+
+void
+remove_prefix (const char *prefix, struct path_prefix *pprefix)
+{
+  struct prefix_list *remove, **prev, **remove_prev = NULL;
+  int max_len = 0;
+
+  if (pprefix->plist)
+{
+  prev = &pprefix->plist;
+  for (struct prefix_list *pl = pprefix->plist; pl->next; pl = pl->next)
+	{
+	  if (strcmp (prefix, pl->prefix) == 0)
+	{
+	  remove = pl;
+	  remove_prev = prev;
+	  continue;
+	}
+
+	  int l = strlen (pl->prefix);
+	  if (l > max_len)
+	max_len = l;
+
+	  prev = &pl;
+	}
+
+  if (remove_prev)
+	{
+	  *remove_prev = remove->next;
+	  free (remove);
+	}
+
+  pprefix->max_len = max_len;
+}
+}
diff --git a/gcc/file-find.h b/gcc/file-find.h
index 5ad9a5f..19a4746 100644
--- a/gcc/file-find.h
+++ b/gcc/file-find.h
@@ -41,6 +41,7 @@ extern void find_file_set_debug (bool);
 extern char *find_a_file (struct path_prefix *, const char *, int);
 extern void add_prefix (struct path_prefix *, const char *);
 extern void add_prefix_begin (struct path_prefix *, const char *);
+extern void remove_prefix (const char *prefix, struct path_prefix *);
 extern void prefix_from_env (const char *, struct path_prefix *);
 extern void prefix_from_string (const char *, struct path_prefix *);
 
diff --git a/gcc/gcc-ar.c b/gcc/gcc-ar.c
index 45ba361..a02dccb 100644
--- a/gcc/gcc-ar.c
+++ b/gcc/gcc-ar.c
@@ -194,6 +194,14 @@ main (int ac, char **av)
 #ifdef CROSS_DIRECTORY_STRUCTURE
   real_exe_name = concat (target_machine, "-", PERSONALITY, NULL);
 #endif
+  /* Do not search original location in the same folder.  */
+  char *exe_folder = lrealpath (av[0]);
+  exe_folder[strlen (exe_folder) - strlen (lbasename (exe_folder))] = '\0';
+  char *location = concat (exe_folder, PERSONALITY, NULL);
+
+  if (access (location, X_OK) == 0)
+	remove_prefix (exe_folder, &path);
+
   exe_name = find_a_file (&path, real_exe_name, X_OK);
   if (!exe_name)
 	{
-- 
2.8.1
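
As a side note on the list manipulation involved, the sketch below shows the
usual pointer-to-pointer idiom for unlinking a node from a singly linked
list.  The node type and the functions remove_first and push_front are
hypothetical and only loosely modelled on a prefix list; this is not the
code from the patch and not GCC's struct prefix_list.  The tracked pointer
always refers to the list head or to the previous node's next field, never
to a loop variable.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical node type, loosely modelled on a prefix list.
struct node
{
  const char *prefix;
  node *next;
};

// Unlinks and frees the first node whose prefix equals PREFIX.
// LINK always points at the pointer that refers to the current node
// (the list head or the previous node's NEXT field), so unlinking is a
// single store and the head needs no special case.
// Returns true if a node was removed.
static bool
remove_first (node **head, const char *prefix)
{
  for (node **link = head; *link; link = &(*link)->next)
    if (std::strcmp ((*link)->prefix, prefix) == 0)
      {
        node *victim = *link;
        *link = victim->next;
        std::free (victim);
        return true;
      }
  return false;
}

// Helper for building a list: pushes a new node at the front.
// Allocation failure is not handled in this sketch.
static void
push_front (node **head, const char *prefix)
{
  node *n = static_cast<node *> (std::malloc (sizeof (node)));
  n->prefix = prefix;
  n->next = *head;
  *head = n;
}
```

Because the loop condition tests *link rather than the next pointer, the last
node of the list is examined as well.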



Re: C++ PATCH for c++/70744 (wrong-code with x ?: y extension)

2016-04-25 Thread Marek Polacek
On Fri, Apr 22, 2016 at 03:28:27PM -0400, Jason Merrill wrote:
> On Fri, Apr 22, 2016 at 2:12 PM, Marek Polacek  wrote:
> > +cp_stabilize_reference (tree ref)
> > +{
> > +  if (TREE_CODE (ref) == PREINCREMENT_EXPR
> > +  || TREE_CODE (ref) == PREDECREMENT_EXPR)
> 
> I think we want to do this for anything stabilize_reference doesn't
> handle specifically, not just pre..crement.

Which would mean something along the lines of the following, I hope.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2016-04-25  Marek Polacek  

PR c++/70744
* call.c (build_conditional_expr_1): Call cp_stabilize_reference
instead of stabilize_reference.
(build_over_call): Likewise.
* cp-tree.h (cp_stabilize_reference): Declare.
* tree.c (cp_stabilize_reference): New function.
* typeck.c (cp_build_unary_op): Call cp_stabilize_reference instead of
stabilize_reference.
(unary_complex_lvalue): Likewise.
(cp_build_modify_expr): Likewise.

* g++.dg/ext/cond2.C: New test.
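
For context, a minimal illustration of the GNU "x ?: y" extension follows.
It is not the contents of cond2.C, and first_nonzero is a hypothetical name.
The GCC manual documents that omitting the middle operand reuses the value
of the first operand without recomputing it, so a side effect in the first
operand should happen exactly once.

```cpp
// GNU extension: "a ?: b" yields a if a is nonzero, else b, and
// evaluates a only once.  Here the first operand is a pre-increment,
// so COUNTER must be incremented exactly one time.
static int
first_nonzero (int &counter, int fallback)
{
  return ++counter ?: fallback;
}
```

Starting from a counter of 0, the call is expected to leave the counter at 1
and return 1; starting from -1, it leaves the counter at 0 and returns the
fallback.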

diff --git gcc/cp/call.c gcc/cp/call.c
index 11f2d42..476e806 100644
--- gcc/cp/call.c
+++ gcc/cp/call.c
@@ -4634,7 +4634,7 @@ build_conditional_expr_1 (location_t loc, tree arg1, tree 
arg2, tree arg3,
 
   /* Make sure that lvalues remain lvalues.  See g++.oliva/ext1.C.  */
   if (real_lvalue_p (arg1))
-   arg2 = arg1 = stabilize_reference (arg1);
+   arg2 = arg1 = cp_stabilize_reference (arg1);
   else
arg2 = arg1 = save_expr (arg1);
 }
@@ -7644,8 +7644,9 @@ build_over_call (struct z_candidate *cand, int flags, 
tsubst_flags_t complain)
   || (TREE_CODE (arg) == TARGET_EXPR
   && !unsafe_copy_elision_p (fa, arg)))
{
- tree to = stabilize_reference (cp_build_indirect_ref (fa, RO_NULL,
-   complain));
+ tree to = cp_stabilize_reference (cp_build_indirect_ref (fa,
+  RO_NULL,
+  complain));
 
  val = build2 (INIT_EXPR, DECL_CONTEXT (fn), to, arg);
  return val;
@@ -7655,7 +7656,7 @@ build_over_call (struct z_candidate *cand, int flags, 
tsubst_flags_t complain)
   && trivial_fn_p (fn)
   && !DECL_DELETED_FN (fn))
 {
-  tree to = stabilize_reference
+  tree to = cp_stabilize_reference
(cp_build_indirect_ref (argarray[0], RO_NULL, complain));
   tree type = TREE_TYPE (to);
   tree as_base = CLASSTYPE_AS_BASE (type);
diff --git gcc/cp/cp-tree.h gcc/cp/cp-tree.h
index ec92718..0e46ae1 100644
--- gcc/cp/cp-tree.h
+++ gcc/cp/cp-tree.h
@@ -6494,6 +6494,7 @@ extern cp_lvalue_kind real_lvalue_p   
(const_tree);
 extern cp_lvalue_kind lvalue_kind  (const_tree);
 extern bool lvalue_or_rvalue_with_address_p(const_tree);
 extern bool xvalue_p   (const_tree);
+extern tree cp_stabilize_reference (tree);
 extern bool builtin_valid_in_constant_expr_p(const_tree);
 extern tree build_min  (enum tree_code, tree, ...);
 extern tree build_min_nt_loc   (location_t, enum tree_code,
diff --git gcc/cp/tree.c gcc/cp/tree.c
index 112c8c7..137186f 100644
--- gcc/cp/tree.c
+++ gcc/cp/tree.c
@@ -296,6 +296,46 @@ xvalue_p (const_tree ref)
   return (lvalue_kind (ref) == clk_rvalueref);
 }
 
+/* C++-specific version of stabilize_reference.  */
+
+tree
+cp_stabilize_reference (tree ref)
+{
+  switch (TREE_CODE (ref))
+{
+/* We need to treat specially anything stabilize_reference doesn't
+   handle specifically.  */
+case VAR_DECL:
+case PARM_DECL:
+case RESULT_DECL:
+CASE_CONVERT:
+case FLOAT_EXPR:
+case FIX_TRUNC_EXPR:
+case INDIRECT_REF:
+case COMPONENT_REF:
+case BIT_FIELD_REF:
+case ARRAY_REF:
+case ARRAY_RANGE_REF:
+case COMPOUND_EXPR:
+case ERROR_MARK:
+  break;
+default:
+  cp_lvalue_kind kind = lvalue_kind (ref);
+  if ((kind & ~clk_class) != clk_none)
+   {
+ tree type = unlowered_expr_type (ref);
+ bool rval = !!(kind & clk_rvalueref);
+ type = cp_build_reference_type (type, rval);
+ /* This inhibits warnings in, eg, cxx_mark_addressable
+(c++/60955).  */
+ warning_sentinel s (extra_warnings);
+ ref = build_static_cast (type, ref, tf_error);
+   }
+}
+
+  return stabilize_reference (ref);
+}
+
 /* Test whether DECL is a builtin that may appear in a
constant-expression. */
 
diff --git gcc/cp/typeck.c gcc/cp/typeck.c
index cef5604..7e12009 100644
--- gcc/cp/typeck.c
+++ gcc/cp/typeck.c
@@ -5912,7 +5912,7 @@ cp_build_unary_op (enum tree_code code, tree xarg, int 
noconvert,
{
  tree real, imag;
 
- arg = stabilize_reference (arg);
+ arg = cp_stabilize_reference (arg);
  r

Re: [PATCH] AARCH64: Remove spurious attribute __unused__ from NEON intrinsic

2016-04-25 Thread Wladimir J. van der Laan

Thanks for the info with regard to contributing,

On Fri, Apr 22, 2016 at 09:40:11AM +0100, James Greenhalgh wrote:
> This patch will need a ChangeLog entry [1], please draft one that I can
> use when I apply the patch.

* gcc/config/aarch64/arm_neon.h: Remove spurious attribute __unused__ from 
parameter of vdupb_laneq_s intrinsic

> I'm guessing that you don't have a copyright assignment on file with the
> FSF. While trivial changes like this don't generally need one, if you plan
> to contribute more substantial changed to GCC in future, you may want to
> start the process (see [2]).

I intend to do this, but indeed let's not hold this two-word change up on that.

Regards,
Wladimir van der Laan


Re: [PATCH] PR target/70155: Use SSE for TImode load/store

2016-04-25 Thread H.J. Lu
On Mon, Apr 25, 2016 at 8:27 AM, H.J. Lu  wrote:
> On Mon, Apr 25, 2016 at 8:10 AM, Uros Bizjak  wrote:
>> On Mon, Apr 25, 2016 at 4:47 PM, H.J. Lu  wrote:
>>> On Mon, Apr 25, 2016 at 7:18 AM, Uros Bizjak  wrote:
 On Mon, Apr 25, 2016 at 2:51 PM, H.J. Lu  wrote:
> Tested on Linux/x86-64.  OK for trunk?

> +  /* FIXME: Since the CSE pass may change dominance info, which isn't
> + expected by the fwprop pass, call free_dominance_info to
> + invalidate dominance info.  Otherwise, the fwprop pass may crash
> + when dominance info is changed.  */
> +  if (TARGET_64BIT)
> +free_dominance_info (CDI_DOMINATORS);
> +

 Please resolve the above problem first, target-dependent sources are
 not the place to apply band-aids for middle-end problems. The thread
 with the proposed fix died in [1].

 [1] https://gcc.gnu.org/ml/gcc/2016-03/msg00143.html
>>>
>>> free_dominance_info (CDI_DOMINATORS) has been called in other
>>> places to avoid this middle-end issue.   I don't know when the middle-end
>>> will be fixed.  I don't think this target optimization should be penalized 
>>> by
>>> the middle-end issue.
>>
>> Let's ask Richard if he is OK with the workaround...
>>
>> One more thing:
>>
>> @@ -3551,6 +3874,7 @@ convert_scalars_to_vector ()
>>basic_block bb;
>>bitmap candidates;
>>int converted_insns = 0;
>> +  rtx zero, minus_one;
>>
>>bitmap_obstack_initialize (NULL);
>>candidates = BITMAP_ALLOC (NULL);
>> @@ -3585,22 +3909,40 @@ convert_scalars_to_vector ()
>>  if (dump_file)
>>fprintf (dump_file, "There are no candidates for optimization.\n");
>>
>> +  if (TARGET_64BIT)
>> +{
>> +  zero = gen_reg_rtx (V1TImode);
>> +  minus_one = gen_reg_rtx (V1TImode);
>> +}
>> +  else
>> +{
>> +  zero = NULL_RTX;
>> +  minus_one = NULL_RTX;
>> +}
>> +
>>
>> Do we *really* need to crate registers here? They are only used as a
>> temporary in scalar_chain_64::convert_insn, and I think that any CSE
>> worth its name should find out when the same immediate is loaded to
>> different temporaries.
>
> I will double check. Last time when I tried,  CSE didn't work on this for a
> reason.  For
>
> [hjl@gnu-6 pr67400]$ cat z.i
> extern void bar (void);
>
> void *
> foo (void)
> {
>   return &bar;
> }
> [hjl@gnu-6 pr67400]$ gcc -S -fPIC -O2 z.i
> [hjl@gnu-6 pr67400]$
>
> CSE sees:
>
> (insn 5 2 6 2 (set (reg:DI 89)
> (mem/u/c:DI (const:DI (unspec:DI [
> (symbol_ref:DI ("bar") [flags 0x41]
> )
> ] UNSPEC_GOTPCREL)) [1  S8 A8])) z.i:6 89 
> {*movdi_internal}
>  (nil))
> (insn 6 5 10 2 (set (reg:DI 87 [  ])
> (reg:DI 89)) z.i:6 89 {*movdi_internal}
>  (expr_list:REG_EQUAL (symbol_ref:DI ("bar") [flags 0x41]
> )
> (nil)))
>
> CSE will not change it to
>
> (insn 6 5 10 2 (set (reg:DI 89 [  ])
> (reg:DI 89)) z.i:6 89 {*movdi_internal}
>  (expr_list:REG_EQUAL (symbol_ref:DI ("bar") [flags 0x41]
> )
> (nil)))

I meant CSE won't change it to

(insn 5 2 9 2 (set (reg:DI 87 [  ])
(symbol_ref:DI ("bar") [flags 0x41]  )) z.i:6 89 {*movdi_internal}
 (nil))

>> BTW: I'd really like if Ilya can review the functionality of the
>> patch, he is an expert in this conversion stuff.
>>
>> Uros.
>
> Ilya, can you take a look?
>
> Thanks.
>
> --
> H.J.



-- 
H.J.


Re: [PATCH] PR target/70155: Use SSE for TImode load/store

2016-04-25 Thread H.J. Lu
On Mon, Apr 25, 2016 at 8:10 AM, Uros Bizjak  wrote:
> On Mon, Apr 25, 2016 at 4:47 PM, H.J. Lu  wrote:
>> On Mon, Apr 25, 2016 at 7:18 AM, Uros Bizjak  wrote:
>>> On Mon, Apr 25, 2016 at 2:51 PM, H.J. Lu  wrote:
 Tested on Linux/x86-64.  OK for trunk?
>>>
 +  /* FIXME: Since the CSE pass may change dominance info, which isn't
 + expected by the fwprop pass, call free_dominance_info to
 + invalidate dominance info.  Otherwise, the fwprop pass may crash
 + when dominance info is changed.  */
 +  if (TARGET_64BIT)
 +free_dominance_info (CDI_DOMINATORS);
 +
>>>
>>> Please resolve the above problem first, target-dependent sources are
>>> not the place to apply band-aids for middle-end problems. The thread
>>> with the proposed fix died in [1].
>>>
>>> [1] https://gcc.gnu.org/ml/gcc/2016-03/msg00143.html
>>
>> free_dominance_info (CDI_DOMINATORS) has been called in other
>> places to avoid this middle-end issue.   I don't know when the middle-end
>> will be fixed.  I don't think this target optimization should be penalized by
>> the middle-end issue.
>
> Let's ask Richard if he is OK with the workaround...
>
> One more thing:
>
> @@ -3551,6 +3874,7 @@ convert_scalars_to_vector ()
>basic_block bb;
>bitmap candidates;
>int converted_insns = 0;
> +  rtx zero, minus_one;
>
>bitmap_obstack_initialize (NULL);
>candidates = BITMAP_ALLOC (NULL);
> @@ -3585,22 +3909,40 @@ convert_scalars_to_vector ()
>  if (dump_file)
>fprintf (dump_file, "There are no candidates for optimization.\n");
>
> +  if (TARGET_64BIT)
> +{
> +  zero = gen_reg_rtx (V1TImode);
> +  minus_one = gen_reg_rtx (V1TImode);
> +}
> +  else
> +{
> +  zero = NULL_RTX;
> +  minus_one = NULL_RTX;
> +}
> +
>
> Do we *really* need to crate registers here? They are only used as a
> temporary in scalar_chain_64::convert_insn, and I think that any CSE
> worth its name should find out when the same immediate is loaded to
> different temporaries.

I will double check. Last time when I tried,  CSE didn't work on this for a
reason.  For

[hjl@gnu-6 pr67400]$ cat z.i
extern void bar (void);

void *
foo (void)
{
  return &bar;
}
[hjl@gnu-6 pr67400]$ gcc -S -fPIC -O2 z.i
[hjl@gnu-6 pr67400]$

CSE sees:

(insn 5 2 6 2 (set (reg:DI 89)
(mem/u/c:DI (const:DI (unspec:DI [
(symbol_ref:DI ("bar") [flags 0x41]
)
] UNSPEC_GOTPCREL)) [1  S8 A8])) z.i:6 89 {*movdi_internal}
 (nil))
(insn 6 5 10 2 (set (reg:DI 87 [  ])
(reg:DI 89)) z.i:6 89 {*movdi_internal}
 (expr_list:REG_EQUAL (symbol_ref:DI ("bar") [flags 0x41]
)
(nil)))

CSE will not change it to

(insn 6 5 10 2 (set (reg:DI 89 [  ])
(reg:DI 89)) z.i:6 89 {*movdi_internal}
 (expr_list:REG_EQUAL (symbol_ref:DI ("bar") [flags 0x41]
)
(nil)))

> BTW: I'd really like if Ilya can review the functionality of the
> patch, he is an expert in this conversion stuff.
>
> Uros.

Ilya, can you take a look?

Thanks.

-- 
H.J.


Re: [PATCH, rs6000] Add support for vector element-reversal built-ins

2016-04-25 Thread Jakub Jelinek
On Mon, Apr 25, 2016 at 09:09:03AM -0500, Bill Schmidt wrote:
> Hi Segher,
> 
> Here's the fix for the obvious pasto separated out.  CCing Richi and
> Jakub as I'd appreciate release manager approval to include this in
> gcc-6-branch.  This fixes some cases where built-in functions are
> connected to the wrong expanders because of copy-paste issues.  These
> tend not to be used anyway because the vec_st interface is friendlier,
> but we should clean this up.  Is that ok?

Ok for 6.2 (i.e. after 6.1 is released), from what I can see, already 5.x
has the same bug.

Jakub


Re: [PATCH] PR target/70155: Use SSE for TImode load/store

2016-04-25 Thread Uros Bizjak
On Mon, Apr 25, 2016 at 4:47 PM, H.J. Lu  wrote:
> On Mon, Apr 25, 2016 at 7:18 AM, Uros Bizjak  wrote:
>> On Mon, Apr 25, 2016 at 2:51 PM, H.J. Lu  wrote:
>>> Tested on Linux/x86-64.  OK for trunk?
>>
>>> +  /* FIXME: Since the CSE pass may change dominance info, which isn't
>>> + expected by the fwprop pass, call free_dominance_info to
>>> + invalidate dominance info.  Otherwise, the fwprop pass may crash
>>> + when dominance info is changed.  */
>>> +  if (TARGET_64BIT)
>>> +free_dominance_info (CDI_DOMINATORS);
>>> +
>>
>> Please resolve the above problem first, target-dependent sources are
>> not the place to apply band-aids for middle-end problems. The thread
>> with the proposed fix died in [1].
>>
>> [1] https://gcc.gnu.org/ml/gcc/2016-03/msg00143.html
>
> free_dominance_info (CDI_DOMINATORS) has been called in other
> places to avoid this middle-end issue.   I don't know when the middle-end
> will be fixed.  I don't think this target optimization should be penalized by
> the middle-end issue.

Let's ask Richard if he is OK with the workaround...

One more thing:

@@ -3551,6 +3874,7 @@ convert_scalars_to_vector ()
   basic_block bb;
   bitmap candidates;
   int converted_insns = 0;
+  rtx zero, minus_one;

   bitmap_obstack_initialize (NULL);
   candidates = BITMAP_ALLOC (NULL);
@@ -3585,22 +3909,40 @@ convert_scalars_to_vector ()
 if (dump_file)
   fprintf (dump_file, "There are no candidates for optimization.\n");

+  if (TARGET_64BIT)
+{
+  zero = gen_reg_rtx (V1TImode);
+  minus_one = gen_reg_rtx (V1TImode);
+}
+  else
+{
+  zero = NULL_RTX;
+  minus_one = NULL_RTX;
+}
+

Do we *really* need to create registers here? They are only used as
temporaries in scalar_chain_64::convert_insn, and I think that any CSE
worth its name should find out when the same immediate is loaded into
different temporaries.

BTW: I'd really like Ilya to review the functionality of the
patch; he is an expert in this conversion stuff.

Uros.


Re: [PATCH, rs6000] Add support for vector element-reversal built-ins

2016-04-25 Thread Bill Schmidt
On Mon, 2016-04-25 at 17:06 +0200, Jakub Jelinek wrote:
> On Mon, Apr 25, 2016 at 09:09:03AM -0500, Bill Schmidt wrote:
> > Hi Segher,
> > 
> > Here's the fix for the obvious pasto separated out.  CCing Richi and
> > Jakub as I'd appreciate release manager approval to include this in
> > gcc-6-branch.  This fixes some cases where built-in functions are
> > connected to the wrong expanders because of copy-paste issues.  These
> > tend not to be used anyway because the vec_st interface is friendlier,
> > but we should clean this up.  Is that ok?
> 
> Ok for 6.2 (i.e. after 6.1 is released), from what I can see, already 5.x
> has the same bug.

Makes sense, thanks!  Yes, 5.x and 4.9.x are both borked as well.
Preparing backports for them also.

Bill

> 
>   Jakub
> 




Re: [PATCH, i386, AVX-512] Fix PR target/70728.

2016-04-25 Thread Jakub Jelinek
On Mon, Apr 25, 2016 at 05:06:34PM +0300, Kirill Yukhin wrote:
> On 21 Apr 18:29, Kirill Yukhin wrote:
> > Hello,
> > On 21 Apr 14:50, Kirill Yukhin wrote:
> > > Hello,
> > > Patch in the bottom fixes mentioned PR by separating
> > > AVX and AVX-512BW constraints.
> > > 
> > > gcc/
> > >   * gcc/config/i386/sse.md (define_insn "3"):
> > >   Extract AVX-512BW constraint from AVX.
> > > gcc/testsuite/
> > >   * gcc.target/i386/pr70728.c: New test.
> > > 
> > > Bootstrap and regtest are in progress for i?86|x86_64.
> > > 
> > > I'll check it into main trunk if it'll pass.
> > Checked into main trunk.
> > 
> > Is it OK to check into gcc-6?
> Ping?

Ok for 6.2 (i.e. after 6.1 is released).

Jakub


Re: [Patch] Fix PR 60040

2016-04-25 Thread Bernd Schmidt

On 04/15/2016 02:52 PM, Senthil Kumar Selvaraj wrote:


For both testcases in the PR, reload fails to take into account that
FP-SP elimination can no longer be performed, and tries to find reload
regs for an rtx generated when FP-SP elimination was valid.

1. reload initializes elim table with FP->SP elimination enabled.
2. alter_reg for a pseudo allocates a stack slot for the pseudo, and sets
reg_equiv_memory_loc to frame_pointer_rtx plus offset. It also sets
something_was_spilled to true.
3. The main reload loop starts, and it resets something_was_spilled to false.
4. reload calls eliminate_regs for the pseudo and sets reg_equiv_address to
(mem(SP + offset)).
5. calculate_needs_all_insns pushes a reload for SP (for the AVR target,
SP cannot be a pointer reg - it needs to be reloaded into X Y or Z regs).
6. update_eliminables_and_spill calls targetm.frame_pointer_required,
which returns true. That causes can_eliminate for FP->SP to be reset
to zero, and FP to be added to bad_spill_regs_global. For the AVR,
FP is Y, one of the 3 pointer regs. reload also notes that something
has changed, and that the loop needs to run again.
7. reload still calls select_reload_regs, and find_regs fails to find a
pointer reg to reload SP, which is unnecessary as FP->SP elimination
had been disabled anyway in (6).

IOW, reload fails to find pointer regs for an RTL expression that was
created when FP->SP elimination was true, even after it turns out that
the elimination can't be done after all. The patch tries to detect that
- if it knows the loop is going to run again, it silences the failure.

Also note that at a different point in the loop, the reload loop starts
over if something_was_spilled (line 982-986). If set outside the reload
loop by alter_reg, it gets reset at (3) - not sure why. I'd think a
"continue" after update_eliminables_and_spill (line 1019-1022) would
also work - haven't tested it though.
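
To make the control flow in the description above concrete, here is a toy model of the loop. It is only a sketch under stated assumptions: none of these names (toy_state, toy_update_eliminables, toy_reload_loop) exist in reload1.c, and the model keeps just one flag for "FP->SP elimination is still valid". It shows why restarting the iteration as soon as the elimination status changes avoids selecting registers for needs that were computed under a stale assumption.

```cpp
#include <cassert>

// Hypothetical toy state, not a GCC structure: whether FP->SP elimination
// is still assumed valid, and whether the target needs a frame pointer.
struct toy_state
{
  bool can_eliminate = true;
  bool frame_pointer_required = false;
};

// Stand-in for the "update eliminables" step.  Returns true if an
// assumption changed, i.e. needs computed earlier are now stale.
bool toy_update_eliminables (toy_state &s)
{
  if (s.frame_pointer_required && s.can_eliminate)
    {
      s.can_eliminate = false;
      return true;
    }
  return false;
}

// Runs the toy loop.  Returns true if register selection was attempted
// with needs computed under a stale elimination assumption.
bool toy_reload_loop (toy_state s, bool restart_on_change)
{
  for (;;)
    {
      // Needs are computed under the current assumption.
      bool needs_assume_elimination = s.can_eliminate;

      bool changed = toy_update_eliminables (s);
      if (changed && restart_on_change)
        continue;  // Recompute needs before selecting registers.

      // Selecting registers for stale needs is the step that can fail.
      if (needs_assume_elimination != s.can_eliminate)
        return true;

      if (!changed)
        return false;
    }
}
```

With frame_pointer_required set, toy_reload_loop (s, false) reports the stale selection, while toy_reload_loop (s, true) does not, because the second pass recomputes the needs first.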


That's what I was going to ask next. If that works for you, I think
that's an approach that would make more sense.


However, it looks like we should probably be calling finish_spills after
this call, and after the other one in the same loop. It looks like an
oversight in the existing code that this doesn't happen for the existing
continue. Please try adding it in both places.



Bernd


[PATCH] Replace old AWK script (utilizing bc) with Python implementation

2016-04-25 Thread Martin Liška
Hello.

As I've been playing with branch predictions and contrib/analyze_brprob script,
I've decided to replace the old script with a Python implementation.
Improvements:

+ fixed horizontal formatting
+ removed the ugly use of bc for arithmetic
+ script is a bit faster (tramp3d dump file): real 0m0.670s / 0m0.807s
+ usage of the script is more precisely explained and script comments are 
updated

Ready for trunk?
Thanks,
Martin
From e8965e67e9a6ce90fc0cb97704540501eb0df85f Mon Sep 17 00:00:00 2001
From: marxin 
Date: Mon, 25 Apr 2016 16:42:42 +0200
Subject: [PATCH] Replace AWK script with the python script.

contrib/ChangeLog:

2016-04-25  Martin Liska  

	* analyze_brprob: Remove.
	* analyze_brprob.py: New file.
---
 contrib/analyze_brprob| 147 --
 contrib/analyze_brprob.py | 128 
 2 files changed, 128 insertions(+), 147 deletions(-)
 delete mode 100755 contrib/analyze_brprob
 create mode 100755 contrib/analyze_brprob.py

diff --git a/contrib/analyze_brprob b/contrib/analyze_brprob
deleted file mode 100755
index 5702834..000
--- a/contrib/analyze_brprob
+++ /dev/null
@@ -1,147 +0,0 @@
-#!/usr/bin/awk -f
-# Script to analyze experimental results of our branch prediction heuristics
-# Contributed by Jan Hubicka, SuSE Inc.
-# Copyright (C) 2001, 2003 Free Software Foundation, Inc.
-#
-# This file is part of GCC.
-#
-# GCC is free software; you can redistribute it and/or modify
-# it under the terms of the GNU General Public License as published by
-# the Free Software Foundation; either version 3, or (at your option)
-# any later version.
-#
-# GCC is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with GCC; see the file COPYING.  If not, write to
-# the Free Software Foundation, 51 Franklin Street, Fifth Floor,
-# Boston, MA 02110-1301, USA.
-#
-#
-# This script is used to calculate two basic properties of the branch prediction
-# heuristics - coverage and hitrate.  Coverage is number of executions of a given
-# branch matched by the heuristics and hitrate is probability that once branch is
-# predicted as taken it is really taken.
-#
-# These values are useful to determine the quality of given heuristics.  Hitrate
-# may be directly used in predict.c.
-#
-# Usage:
-#  Step 1: Compile and profile your program.  You need to use -fprofile-arcs
-#flag to get the profiles
-#  Step 2: Generate log files.  The information about given heuristics are
-#saved into ipa-profile dumps.  You need to pass the -fdimp-ipa-profile switch
-#to the compiler as well
-#as -fbranch-probabilities to get the results of profiling noted in the dumps.
-#Ensure that there are no "Arc profiling: some edge counts were bad." warnings.
-#  Step 3: Run this script to concatenate all *.profile files:
-#analyze_brprob `find . -name *.profile`
-#the information is collected and print once all files are parsed.  This
-#may take a while.
-#Note that the script does use bc to perform long arithmetic.
-#  Step 4: Read the results.  Basically the following table is printed:
-#  (this is just an example from a very early stage of branch prediction pass
-#   development, so please don't take these numbers seriously)
-#
-#HEURISTICS  BRANCHES  (REL)  HITRATE COVERAGE  (REL)
-#opcode  2889  83.7%  94.96%/ 97.62%  7516383  75.3%
-#pointer  246   7.1%  99.69%/ 99.86%   118791   1.2%
-#loop header  449  13.0%  98.32%/ 99.07%43553   0.4%
-#first match 3450 100.0%  89.92%/ 97.27%  9979782 100.0%
-#loop exit924  26.8%  88.95%/ 95.58%  9026266  90.4%
-#error return 150   4.3%  64.48%/ 86.81%   453542   4.5%
-#call 803  23.3%  51.66%/ 98.61%  3614037  36.2%
-#loop branch   51   1.5%  99.26%/ 99.27%26854   0.3%
-#noreturn call951  27.6% 100.00%/100.00%  1759809  17.6%
-#
-#  The heuristic called "first match" is a heuristic used by GCC branch
-#  prediction pass and it predicts 89.92% branches correctly.
-#
-#  The quality of heuristics can be rated using both, coverage and hitrate
-#  parameters.  For example "loop branch" heuristics (predicting loopback edge
-#  as taken) have both very high hitrate and coverage, so it is very useful.
-#  On the other hand, "exit block" heuristics (predicting exit edges as not
-#  taken) have good hitrate, but poor coverage, so only 3 branches have been
-#  predicted.  The "loop header" heuristic has problems, since it tends to
-#  misspredict.
-#
-#  The imp

Re: [PATCH GCC]Refactor IVOPT.

2016-04-25 Thread Bin.Cheng
On Mon, Apr 25, 2016 at 3:49 PM, Martin Liška  wrote:
> Hello.
>
> Please consider application of the following patch, it fixes
> a coding style issue and a memory leak.
Hi Martin,
Will do, thanks very much for the help.

Thanks,
bin
>
> Thanks,
> Martin


Re: [PATCH GCC]Refactor IVOPT.

2016-04-25 Thread Martin Liška
Hello.

Please consider application of the following patch, it fixes
a coding style issue and a memory leak.

Thanks,
Martin
From 6afc975de0b6de76aa51b8c2ef741cd72c76dc75 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Mon, 25 Apr 2016 13:50:41 +0200
Subject: [PATCH 1/4] Fix coding style and a memory leak in IVOPTS

gcc/ChangeLog:

2016-04-25  Martin Liska  

	* tree-ssa-loop-ivopts.c (iv_ca_dump): Fix level of indentation.
	(free_loop_data): Release vuses of groups.
---
 gcc/tree-ssa-loop-ivopts.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 18c1773..9314363 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -6311,15 +6311,15 @@ iv_ca_dump (struct ivopts_data *data, FILE *file, struct iv_ca *ivs)
ivs->cand_cost, ivs->cand_use_cost.cost, ivs->cand_use_cost.complexity);
   bitmap_print (file, ivs->cands, "  candidates: ","\n");
 
-   for (i = 0; i < ivs->upto; i++)
+  for (i = 0; i < ivs->upto; i++)
 {
   struct iv_group *group = data->vgroups[i];
   struct cost_pair *cp = iv_ca_cand_for_group (ivs, group);
   if (cp)
-fprintf (file, "   group:%d --> iv_cand:%d, cost=(%d,%d)\n",
- group->id, cp->cand->id, cp->cost.cost, cp->cost.complexity);
+	fprintf (file, "   group:%d --> iv_cand:%d, cost=(%d,%d)\n",
+		 group->id, cp->cand->id, cp->cost.cost, cp->cost.complexity);
   else
-fprintf (file, "   group:%d --> ??\n", group->id);
+	fprintf (file, "   group:%d --> ??\n", group->id);
 }
 
   for (i = 1; i <= data->max_inv_id; i++)
@@ -7503,6 +7503,7 @@ free_loop_data (struct ivopts_data *data)
 
   for (j = 0; j < group->vuses.length (); j++)
 	free (group->vuses[j]);
+  group->vuses.release ();
 
   BITMAP_FREE (group->related_cands);
   for (j = 0; j < group->n_map_members; j++)
-- 
2.8.1



Re: [PATCH] add support for placing variables in shared memory

2016-04-25 Thread Nathan Sidwell

On 04/22/16 10:04, Alexander Monakov wrote:


echo 'int v __attribute__((section("foo")));' |
   x86_64-pc-linux-gnu-accel-nvptx-none-gcc -xc - -o /dev/null
<stdin>:1:5: error: section attributes are not supported for this target


Presumably it's missing a necessary hook?  Couldn't such a hook check the 
section name is acceptable?




Why can it not apply to  variables of auto storage?  I.e. function scope,
function lifetime?  That would seem to be a useful property.


Because PTX does not support auto storage semantics for .shared data.  It's
statically allocated at link time.


I suppose it's not worth jumping through hoops to define such function-scoped
variables if PTX isn't going to take advantage of that.



What happens if an initializer is present, is it silently ignored?


GCC accepts and reemits it in assembly output (if non-zero), and ptxas rejects
it ("syntax error").


ptx errors are inscrutable.

It would be better for nvptx_assemble_decl_end to check whether an initializer
has been output and emit an error (you'll need to record the decl itself in the
initializer structure to do that).  Record the decl in
nvptx_assemble_decl_begin if the symbol's data area is .shared, and then check
in nvptx_assemble_decl_end?


nathan


Re: [PATCH] PR target/70155: Use SSE for TImode load/store

2016-04-25 Thread H.J. Lu
On Mon, Apr 25, 2016 at 7:18 AM, Uros Bizjak  wrote:
> On Mon, Apr 25, 2016 at 2:51 PM, H.J. Lu  wrote:
>> Tested on Linux/x86-64.  OK for trunk?
>
>> +  /* FIXME: Since the CSE pass may change dominance info, which isn't
>> + expected by the fwprop pass, call free_dominance_info to
>> + invalidate dominance info.  Otherwise, the fwprop pass may crash
>> + when dominance info is changed.  */
>> +  if (TARGET_64BIT)
>> +free_dominance_info (CDI_DOMINATORS);
>> +
>
> Please resolve the above problem first, target-dependent sources are
> not the place to apply band-aids for middle-end problems. The thread
> with the proposed fix died in [1].
>
> [1] https://gcc.gnu.org/ml/gcc/2016-03/msg00143.html

free_dominance_info (CDI_DOMINATORS) has been called in other
places to avoid this middle-end issue.   I don't know when the middle-end
will be fixed.  I don't think this target optimization should be penalized by
the middle-end issue.

> Also, I find _32 and _64 suffixes confusing, maybe better would be to
> use timode_ and dimode_ prefixes everywhere?
>

I will make the change.

-- 
H.J.


Re: [PATCH, rs6000] Add support for vector element-reversal built-ins

2016-04-25 Thread Segher Boessenkool
On Mon, Apr 25, 2016 at 09:09:03AM -0500, Bill Schmidt wrote:
> Here's the fix for the obvious pasto separated out.  CCing Richi and
> Jakub as I'd appreciate release manager approval to include this in
> gcc-6-branch.  This fixes some cases where built-in functions are
> connected to the wrong expanders because of copy-paste issues.  These
> tend not to be used anyway because the vec_st interface is friendlier,
> but we should clean this up.  Is that ok?

> 2016-04-25  Bill Schmidt  
> 
>   * rs6000-builtin.def: Correct pasto error for stxvd2x and stxvw4x
>   built-in functions.

Hi Bill,

Approved for trunk.  Thanks!


Segher


Re: [PATCH 07/18] loop-iv.c: make cond_list a vec

2016-04-25 Thread Bernd Schmidt

On 04/25/2016 04:21 PM, Bernd Schmidt wrote:

On 04/25/2016 03:30 PM, Trevor Saunders wrote:

On Mon, Apr 25, 2016 at 02:28:51PM +0200, Bernd Schmidt wrote:

On 04/20/2016 08:22 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 



+  unsigned int len = cond_list.length ();
+  for (unsigned int i = len - 1; i < len; i--)


This is a really icky way to write a loop, the i < len condition makes it
look like a forward one. We have FOR_EACH_VEC_ELT{,_REVERSE}, any reason not
to use these?


I'll agree that depending on unsigned wrapping is a tad weird, but
personally I think FOR_EACH_VEC_* are pretty icky, and I just forget to
think about them before writing a loop.


They're standard inside gcc though, and readability-wise much
preferable to the above IMO.

I noticed this pattern in a lot of these patches; at this point I think
the best thing to do would be for you to go through all of them, address
review comments across the whole set, and then start a new thread with
v2 patches of all of them so we can retire this thread.

Oh, and also - for each such nontrivial loop, I would prefer some sort of
analysis of the order in which elements were inserted/processed before
this change, and the order afterwards. This could be a source of subtle
errors with this patch series. When removing elements in a loop, we also
need to pay attention to whether that's still safe.



Bernd



Re: match.pd patch: min(-x, -y), min(~x, ~y)

2016-04-25 Thread Kyrill Tkachov


On 22/04/16 12:20, Kyrill Tkachov wrote:


On 22/04/16 11:34, Marc Glisse wrote:

On Fri, 22 Apr 2016, Kyrill Tkachov wrote:



On 22/04/16 10:43, Kyrill Tkachov wrote:


On 22/04/16 10:42, Marc Glisse wrote:

On Fri, 22 Apr 2016, Kyrill Tkachov wrote:


2016-04-21  Marc Glisse 

gcc/
* match.pd (min(-x, -y), max(-x, -y), min(~x, ~y), max(~x, ~y)):
New transformations.

gcc/testsuite/
* gcc.dg/tree-ssa/minmax-2.c: New testcase.




I see the new testcase failing on aarch64:
FAIL: gcc.dg/tree-ssa/minmax-2.c scan-tree-dump optimized "__builtin_fmin"


Strange, it seems to work in 
https://gcc.gnu.org/ml/gcc-testresults/2016-04/msg02120.html

Is that on some freestanding kind of setup where the builtin might be disabled?



Ah, this is aarch64-none-elf which uses newlib as the C library.
Let me check on aarch64-none-linux-gnu and get back to you.



Yeah, I see it passing on aarch64-none-linux-gnu.
Do we have an appropriate effective target check to gate this test on?


I don't know, I have a hard time finding something related. I am not even convinced the test should be skipped. It looks like __builtin_fmax was recognized, otherwise you would get a warning and a conversion int-double. Maybe 
gimple_call_combined_fn rejects it? Ah, builtins.def declares it with DEF_C99_BUILTIN, which checks targetm.libc_has_function (function_c99_misc). I assume newlib fails that check? That would make c99_runtime a relevant target check.




Yeah, adding the below makes this test UNSUPPORTED on aarch64-none-elf.
/* { dg-add-options c99_runtime } */
/* { dg-require-effective-target c99_runtime } */

I'll prepare a patch.



Sorry for the delay, here it is.
Ok to commit?

Thanks,
Kyrill

2016-04-25  Kyrylo Tkachov  

* gcc.dg/tree-ssa/minmax-2.c: Require c99_runtime and add the
associated options.


Thanks,
Kyrill


diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-2.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-2.c
index 98c38b1aa773d04a5d7cb36df73db8924d83ed65..87ff94cef1f3b178b315358f246c9a3f32383945 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/minmax-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-2.c
@@ -1,5 +1,7 @@
 /* { dg-do compile } */
 /* { dg-options "-O -fstrict-overflow -fdump-tree-optimized" } */
+/* { dg-add-options c99_runtime } */
+/* { dg-require-effective-target c99_runtime } */
 
 static int max(int a,int b){return (a

Re: [PATCH 06/18] move reg_equivs out of gc memory

2016-04-25 Thread Bernd Schmidt

On 04/20/2016 08:22 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

It used the gc vector type, but isn't marked as a GC root, and appears
to be manually managed, so it should be safe to use the normal heap
vector.

gcc/ChangeLog:

2016-04-19  Trevor Saunders  

* ira-emit.c (emit_move_list): Adjust.
* ira.c (fix_reg_equiv_init): Likewise.
(update_equiv_regs): Likewise.
(ira): Likewise.
(do_reload): Likewise.
* reload.c (subst_reloads): Likewise.
* reload.h (reg_equivs): Move to the normal heap.
* reload1.c (grow_reg_equivs): Adjust.
(reload): Likewise.
(eliminate_regs_1): Likewise.
(elimination_effects): Likewise.
(init_eliminable_invariants): Likewise.
(free_reg_equiv): Likewise.


I wonder whether this is really worth it. The improvement is marginal, 
and there is a real cost associated with patches such as this one when 
people need to work across multiple branches.



Bernd



Re: [PATCH 07/18] loop-iv.c: make cond_list a vec

2016-04-25 Thread Bernd Schmidt

On 04/25/2016 03:30 PM, Trevor Saunders wrote:

On Mon, Apr 25, 2016 at 02:28:51PM +0200, Bernd Schmidt wrote:

On 04/20/2016 08:22 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 



+ unsigned int len = cond_list.length ();
+ for (unsigned int i = len - 1; i < len; i--)


This is a really icky way to write a loop, the i < len condition makes it
look like a forward one. We have FOR_EACH_VEC_ELT{,_REVERSE}, any reason not
to use these?


I'll agree that depending on unsigned wrapping is a tad weird, but
personally I think FOR_EACH_VEC_* are pretty icky, and I just forget to
think about them before writing a loop.


They're standard inside gcc though, and readability-wise much
preferable to the above IMO.


I noticed this pattern in a lot of these patches; at this point I think 
the best thing to do would be for you to go through all of them, address 
review comments across the whole set, and then start a new thread with 
v2 patches of all of them so we can retire this thread.



Bernd


Re: [PATCH] PR target/70155: Use SSE for TImode load/store

2016-04-25 Thread Uros Bizjak
On Mon, Apr 25, 2016 at 2:51 PM, H.J. Lu  wrote:
> Tested on Linux/x86-64.  OK for trunk?

> +  /* FIXME: Since the CSE pass may change dominance info, which isn't
> + expected by the fwprop pass, call free_dominance_info to
> + invalidate dominance info.  Otherwise, the fwprop pass may crash
> + when dominance info is changed.  */
> +  if (TARGET_64BIT)
> +free_dominance_info (CDI_DOMINATORS);
> +

Please resolve the above problem first, target-dependent sources are
not the place to apply band-aids for middle-end problems. The thread
with the proposed fix died in [1].

[1] https://gcc.gnu.org/ml/gcc/2016-03/msg00143.html

Also, I find _32 and _64 suffixes confusing, maybe better would be to
use timode_ and dimode_ prefixes everywhere?

Uros.


[PING][PATCH] New plugin event when evaluating a constexpr call

2016-04-25 Thread Andres Tiraboschi
Hi
 This patch adds a plugin event when evaluating a call expression in constexpr.
 The goal of this patch is to allow plugins to analyze and/or
modify the evaluation of constant expressions.


Changelog 2016-4-25  Andres Tiraboschi

*gcc/plugin.c (PLUGIN_EVAL_CALL_CONSTEXPR): New event.
*gcc/plugin.def (PLUGIN_EVAL_CALL_CONSTEXPR): New event.
*gcc/cp/constexpr.c (constexpr_fundef): Moved to gcc/cp/cp-tree.h.
*gcc/cp/constexpr.c (constexpr_call): Ditto.
*gcc/cp/constexpr.c (constexpr_ctx): Ditto.
*gcc/cp/constexpr.c (cxx_eval_constant_expression): Not static anymore.
*gcc/cp/cp-tree.h (constexpr_call_info): New Type.
*gcc/cp/cp-tree.h (constexpr_fundef): Moved type from gcc/cp/constexpr.c.
*gcc/cp/cp-tree.h (constexpr_call): Ditto.
*gcc/cp/cp-tree.h (constexpr_ctx): Ditto.
*gcc/cp/cp-tree.h (cxx_eval_constant_expression): Declared.




diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 5f97c9d..5562e44 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -31,6 +31,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "builtins.h"
 #include "tree-inline.h"
 #include "ubsan.h"
+#include "plugin-api.h"
+#include "plugin.h"

 static bool verify_constant (tree, bool, bool *, bool *);
 #define VERIFY_CONSTANT(X)\
@@ -112,13 +114,6 @@ ensure_literal_type_for_constexpr_object (tree decl)
   return decl;
 }

-/* Representation of entries in the constexpr function definition table.  */
-
-struct GTY((for_user)) constexpr_fundef {
-  tree decl;
-  tree body;
-};
-
 struct constexpr_fundef_hasher : ggc_ptr_hash
 {
   static hashval_t hash (constexpr_fundef *);
@@ -856,62 +851,12 @@ explain_invalid_constexpr_fn (tree fun)
   input_location = save_loc;
 }

-/* Objects of this type represent calls to constexpr functions
-   along with the bindings of parameters to their arguments, for
-   the purpose of compile time evaluation.  */
-
-struct GTY((for_user)) constexpr_call {
-  /* Description of the constexpr function definition.  */
-  constexpr_fundef *fundef;
-  /* Parameter bindings environment.  A TREE_LIST where each TREE_PURPOSE
- is a parameter _DECL and the TREE_VALUE is the value of the parameter.
- Note: This arrangement is made to accommodate the use of
- iterative_hash_template_arg (see pt.c).  If you change this
- representation, also change the hash calculation in
- cxx_eval_call_expression.  */
-  tree bindings;
-  /* Result of the call.
-   NULL means the call is being evaluated.
-   error_mark_node means that the evaluation was erroneous;
-   otherwise, the actuall value of the call.  */
-  tree result;
-  /* The hash of this call; we remember it here to avoid having to
- recalculate it when expanding the hash table.  */
-  hashval_t hash;
-};
-
 struct constexpr_call_hasher : ggc_ptr_hash
 {
   static hashval_t hash (constexpr_call *);
   static bool equal (constexpr_call *, constexpr_call *);
 };

-/* The constexpr expansion context.  CALL is the current function
-   expansion, CTOR is the current aggregate initializer, OBJECT is the
-   object being initialized by CTOR, either a VAR_DECL or a _REF.  VALUES
-   is a map of values of variables initialized within the expression.  */
-
-struct constexpr_ctx {
-  /* The innermost call we're evaluating.  */
-  constexpr_call *call;
-  /* Values for any temporaries or local variables within the
- constant-expression. */
-  hash_map *values;
-  /* SAVE_EXPRs that we've seen within the current LOOP_EXPR.  NULL if we
- aren't inside a loop.  */
-  hash_set *save_exprs;
-  /* The CONSTRUCTOR we're currently building up for an aggregate
- initializer.  */
-  tree ctor;
-  /* The object we're building the CONSTRUCTOR for.  */
-  tree object;
-  /* Whether we should error on a non-constant expression or fail quietly.  */
-  bool quiet;
-  /* Whether we are strictly conforming to constant expression rules or
- trying harder to get a constant value.  */
-  bool strict;
-};
-
 /* A table of all constexpr calls that have been evaluated by the
compiler in this translation unit.  */

@@ -1303,6 +1248,22 @@ cxx_eval_call_expression (const constexpr_ctx
*ctx, tree t,
   bool non_constant_args = false;
   cxx_bind_parameters_in_call (ctx, t, &new_call,
non_constant_p, overflow_p, &non_constant_args);
+
+  constexpr_call_info call_info;
+  call_info.function = t;
+  call_info.lval = lval;
+  call_info.call = &new_call;
+  call_info.call_stack = call_stack;
+  call_info.non_constant_args = &non_constant_args;
+  call_info.non_const_p = non_constant_p;
+  call_info.ctx = ctx;
+  call_info.result = NULL_TREE;
+  invoke_plugin_callbacks (PLUGIN_EVAL_CALL_CONSTEXPR, &call_info);
+  if (call_info.result != NULL_TREE)
+{
+  return call_info.result;
+}
+
   if (*non_constant_p)
 return t;

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 15b004d..00856ec 100644
--- a/gcc/cp/cp-tree.h
+++ b

Re: [PATCH, i386, AVX-512] Fix PR target/70728.

2016-04-25 Thread Kirill Yukhin
On 21 Apr 18:29, Kirill Yukhin wrote:
> Hello,
> On 21 Apr 14:50, Kirill Yukhin wrote:
> > Hello,
> > Patch in the bottom fixes mentioned PR by separating
> > AVX and AVX-512BW constraints.
> > 
> > gcc/
> > * gcc/config/i386/sse.md (define_insn "3"):
> > Extract AVX-512BW constraint from AVX.
> > gcc/testsuite/
> > * gcc.target/i386/pr70728.c: New test.
> > 
> > Bootstrap and regtest are in progress for i?86|x86_64.
> > 
> > I'll check it into main trunk if it'll pass.
> Checked into main trunk.
> 
> Is it OK to check into gcc-6?
Ping?
> 
> --
> Thanks, K


Re: [PATCH 12/18] haifa-sched.c: make insn_queue[] a vec

2016-04-25 Thread Trevor Saunders
On Mon, Apr 25, 2016 at 03:55:15PM +0200, Bernd Schmidt wrote:
> On 04/20/2016 08:22 AM, tbsaunde+...@tbsaunde.org wrote:
> >-/* Remove INSN from queue.  */
> >+/* Remove INSN at idx from queue.  */
> >+static void
> >+queue_remove (unsigned int q, unsigned int idx)
> >+{
> >+  QUEUE_INDEX (insn_queue[q][idx]) = QUEUE_NOWHERE;
> >+  insn_queue[q].ordered_remove (idx);
> >+  q_size--;
> 
> I think I'm nacking this one, sorry. I don't think ordered_removes in the
> scheduler queues are going to fly.

So, we're going from a linear walk through a linked list to linear scan
of part of a vector and memcpy of the rest.  That's certainly not great,
but the linked list walk doesn't seem great either.  Is unordered_remove
here safe?  Or do you have some other idea?  I guess we could write a
forward_list.
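
To make the trade-off concrete, here is a minimal sketch of the two removal strategies. It assumes std::vector as a stand-in for GCC's vec; the free functions below only borrow the names ordered_remove and unordered_remove for illustration and are not the vec.h implementations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Order-preserving removal: every element after idx shifts down by one
// slot, so the cost is linear in the number of trailing elements.
void ordered_remove (std::vector<int> &v, std::size_t idx)
{
  assert (idx < v.size ());
  v.erase (v.begin () + static_cast<std::ptrdiff_t> (idx));
}

// Constant-time removal: overwrite slot idx with the last element and
// drop the last slot.  The relative order of elements is not preserved.
void unordered_remove (std::vector<int> &v, std::size_t idx)
{
  assert (idx < v.size ());
  v[idx] = v.back ();
  v.pop_back ();
}
```

Removing index 1 from {10, 20, 30, 40} leaves {10, 30, 40} with the ordered variant and {10, 40, 30} with the unordered one, so the unordered variant is only safe where nothing depends on the sequence of the remaining entries.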

Trev

> 
> 
> Bernd
> 


Re: [PATCH, rs6000] Add support for vector element-reversal built-ins

2016-04-25 Thread Bill Schmidt
Hi Segher,

Here's the fix for the obvious pasto separated out.  CCing Richi and
Jakub as I'd appreciate release manager approval to include this in
gcc-6-branch.  This fixes some cases where built-in functions are
connected to the wrong expanders because of copy-paste issues.  These
tend not to be used anyway because the vec_st interface is friendlier,
but we should clean this up.  Is that ok?

Thanks,
Bill


2016-04-25  Bill Schmidt  

* rs6000-builtin.def: Correct pasto error for stxvd2x and stxvw4x
built-in functions.

Index: gcc/config/rs6000/rs6000-builtin.def
===
--- gcc/config/rs6000/rs6000-builtin.def(revision 235411)
+++ gcc/config/rs6000/rs6000-builtin.def(working copy)
@@ -1391,13 +1391,13 @@ BU_VSX_X (LXVW4X_V4SI,"lxvw4x_v4si",MEM)
 BU_VSX_X (LXVW4X_V8HI,"lxvw4x_v8hi",   MEM)
 BU_VSX_X (LXVW4X_V16QI,  "lxvw4x_v16qi",   MEM)
 BU_VSX_X (STXSDX,"stxsdx", MEM)
-BU_VSX_X (STXVD2X_V1TI,  "stxsdx_v1ti",MEM)
-BU_VSX_X (STXVD2X_V2DF,  "stxsdx_v2df",MEM)
-BU_VSX_X (STXVD2X_V2DI,  "stxsdx_v2di",MEM)
-BU_VSX_X (STXVW4X_V4SF,  "stxsdx_v4sf",MEM)
-BU_VSX_X (STXVW4X_V4SI,  "stxsdx_v4si",MEM)
-BU_VSX_X (STXVW4X_V8HI,  "stxsdx_v8hi",MEM)
-BU_VSX_X (STXVW4X_V16QI,  "stxsdx_v16qi",  MEM)
+BU_VSX_X (STXVD2X_V1TI,  "stxvd2x_v1ti",   MEM)
+BU_VSX_X (STXVD2X_V2DF,  "stxvd2x_v2df",   MEM)
+BU_VSX_X (STXVD2X_V2DI,  "stxvd2x_v2di",   MEM)
+BU_VSX_X (STXVW4X_V4SF,  "stxvw4x_v4sf",   MEM)
+BU_VSX_X (STXVW4X_V4SI,  "stxvw4x_v4si",   MEM)
+BU_VSX_X (STXVW4X_V8HI,  "stxvw4x_v8hi",   MEM)
+BU_VSX_X (STXVW4X_V16QI,  "stxvw4x_v16qi", MEM)
 BU_VSX_X (XSABSDP,   "xsabsdp",CONST)
 BU_VSX_X (XSADDDP,   "xsadddp",FP)
 BU_VSX_X (XSCMPODP,  "xscmpodp",   FP)



On Sun, 2016-04-24 at 15:52 -0500, Segher Boessenkool wrote:
> On Sun, Apr 24, 2016 at 02:06:47PM -0500, Bill Schmidt wrote:
> > ISA 3.0 adds the lvxh8x, lvxb16x, stvxh8x, and stvxb16x instructions,
> 
> lxvh8x etc.  It looks like you only swapped things in this message,
> not in the actual patch :-)
> 
> > (While working on this patch, I happened to notice that the existing
> > entries in rs6000-builtin.def for STXVD2X_ and STXVW4X_ are
> > mapped to stxsdx instead of stxvd2x/stxvw4x.  I took the opportunity to
> > correct that as an obvious bug.)
> 
> Does that part need backporting?
> 
> Should the new builtins be documented?
> 
> Looks fine otherwise.
> 
> 
> Segher
> 




Re: [PATCH 12/18] haifa-sched.c: make insn_queue[] a vec

2016-04-25 Thread Bernd Schmidt

On 04/20/2016 08:22 AM, tbsaunde+...@tbsaunde.org wrote:

-/* Remove INSN from queue.  */
+/* Remove INSN at idx from queue.  */
+static void
+queue_remove (unsigned int q, unsigned int idx)
+{
+  QUEUE_INDEX (insn_queue[q][idx]) = QUEUE_NOWHERE;
+  insn_queue[q].ordered_remove (idx);
+  q_size--;


I think I'm nacking this one, sorry. I don't think ordered_removes in 
the scheduler queues are going to fly.



Bernd



Re: [PATCH 07/18] loop-iv.c: make cond_list a vec

2016-04-25 Thread Trevor Saunders
On Mon, Apr 25, 2016 at 02:28:51PM +0200, Bernd Schmidt wrote:
> On 04/20/2016 08:22 AM, tbsaunde+...@tbsaunde.org wrote:
> >From: Trevor Saunders 
> 
> >+  unsigned int len = cond_list.length ();
> >+  for (unsigned int i = len - 1; i < len; i--)
> 
> This is a really icky way to write a loop, the i < len condition makes it
> look like a forward one. We have FOR_EACH_VEC_ELT{,_REVERSE}, any reason not
> to use these?

I'll agree that depending on unsigned wrapping is a tad weird, but
personally I think FOR_EACH_VEC_* are pretty icky, and I just forget to
think about them before writing a loop.  The vec::iterate () methods are
a little slower than they need to be, since they check the vector length
each iteration instead of caching it (and I'm pretty sure the compiler
can't save you in a bunch of these places).  It's unfortunate that you
need to declare a temporary for the vector item, and worse that the
temporary can't be scoped by the loop.  Finally, it seems like more work
to remember the order of arguments to FOR_EACH_VEC_* than just to write
the loop.  That all said, if people really feel strongly (and I'll grant
that consistency matters) I can try to change the loops to use
FOR_EACH_VEC_*.
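
As a rough illustration of the two hand-written styles under discussion, the sketch below walks a std::vector backwards, first with the wrapping form from the patch and then with a countdown form. Both helper names are made up for this example, and std::vector stands in for GCC's vec.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reverse walk that relies on unsigned wraparound, as in the patch:
// when i is 0, i-- wraps to UINT_MAX, which is not < len, so the loop ends.
// An empty vector gives len - 1 == UINT_MAX, so the body never runs.
std::vector<int> reverse_by_wrapping (const std::vector<int> &v)
{
  std::vector<int> out;
  unsigned int len = static_cast<unsigned int> (v.size ());
  for (unsigned int i = len - 1; i < len; i--)
    out.push_back (v[i]);
  return out;
}

// The same walk with a test-then-decrement condition, which reads as a
// countdown and does not depend on wraparound to terminate.
std::vector<int> reverse_by_countdown (const std::vector<int> &v)
{
  std::vector<int> out;
  for (std::size_t i = v.size (); i-- > 0; )
    out.push_back (v[i]);
  return out;
}
```

Both functions visit the same elements in the same order; the difference is only in how obviously the loop header says "backwards".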

Trev

> 
> 
> Bernd


Re: [PATCH 15/18] make nonlocal_goto_handler_labels a vec

2016-04-25 Thread Trevor Saunders
On Mon, Apr 25, 2016 at 02:43:27PM +0200, Bernd Schmidt wrote:
> On 04/20/2016 08:22 AM, tbsaunde+...@tbsaunde.org wrote:
> >-  remove_node_from_insn_list (insn, &nonlocal_goto_handler_labels);
> >+
> >+  unsigned int len = vec_safe_length (nonlocal_goto_handler_labels);
> >+  for (unsigned int i = 0; i < len; i++)
> >+if ((*nonlocal_goto_handler_labels)[i] == insn)
> >+  {
> >+nonlocal_goto_handler_labels->ordered_remove (i);
> >+break;
> >+  }
> 
> Why not unordered_remove?

I believe it was to keep the same behavior as what was done with lists.
I'm not sure that matters, but I don't think that I understand
everything involved well enough to say it definitely does not matter so
I wanted to be safe and avoid introducing bugs.

> Should there be a vec-based version of remove-node_from_*_list?

Maybe, but I will echo Richard's concern about making O(N) operations
easy.

> >+  rtx_insn *temp;
> >+  unsigned int i;
> >+  FOR_EACH_VEC_SAFE_ELT_REVERSE (nonlocal_goto_handler_labels, i, temp)
> >+  BLOCK_FOR_INSN (temp)->flags |= BB_NON_LOCAL_GOTO_TARGET;
> 
> Indentation looks wrong.

Sorry

> >
> >/* Process non-local goto edges.  */
> >if (can_nonlocal_goto (insn))
> >-for (rtx_insn_list *lab = nonlocal_goto_handler_labels;
> >- lab;
> >- lab = lab->next ())
> >-  maybe_record_trace_start_abnormal (lab->insn (), insn);
> >+{
> 
> Unnecessary brace?

Will check

> >/* Re-insert the EH_REGION notes.  */
> >-  if (eh_note || (was_call && nonlocal_goto_handler_labels))
> >+  if (eh_note || (was_call && vec_safe_length 
> >(nonlocal_goto_handler_labels)))
> 
> I'm not a big fan of omitting the > 0 and using the integer as a boolean.
> Multiple occurrences.

ok, I spend a lot of time working on code where that is the style, so
it's a habit, sorry.

Trev



Re: [PATCH 01/18] stop using rtx_insn_list in reorg.c

2016-04-25 Thread Trevor Saunders
On Mon, Apr 25, 2016 at 11:35:16AM +0200, Bernd Schmidt wrote:
> On 04/20/2016 08:22 AM, tbsaunde+...@tbsaunde.org wrote:
> >-  rtx_insn_list *merged_insns = 0;
> >+  auto_vec<std::pair<rtx_insn *, bool>, 10> merged_insns;
> 
> I see Jeff has already acked this, but some of the expressions here are
> getting unwieldy. can we maybe shorten some of this using typedefs?

I guess you could shorten std::pair<rtx_insn *, bool> to insn_bool_pair
or something, I don't have a great descriptive name.  It was only used 3
times I think, so I wasn't terribly concerned, but I'd be happy to
change it.

Trev

> 
> 
> Bernd


Re: [PATCH 09/18] make pattern_regs a vec

2016-04-25 Thread Trevor Saunders
On Mon, Apr 25, 2016 at 02:56:07PM +0200, Bernd Schmidt wrote:
> On 04/20/2016 08:22 AM, tbsaunde+...@tbsaunde.org wrote:
> >
> >-static rtx_expr_list *
> >+static vec<rtx>
> >  extract_mentioned_regs (rtx x)
> >  {
> >-  rtx_expr_list *mentioned_regs = NULL;
> >+  vec<rtx> mentioned_regs = vNULL;
> >subrtx_var_iterator::array_type array;
> >FOR_EACH_SUBRTX_VAR (iter, array, x, NONCONST)
> >  {
> >rtx x = *iter;
> >if (REG_P (x))
> >-mentioned_regs = alloc_EXPR_LIST (0, x, mentioned_regs);
> >+mentioned_regs.safe_push (x);
> >  }
> >return mentioned_regs;
> >  }
> 
> Is it really such a great idea to return a vec by value? I'd rather pass a
> pointer to it into the function and operate on that.

Well, at the moment it's harmless since vec is POD and so you just copy
the pointer to the data (note there are other places that do this).
However I agree it would be nice to stop returning vec<>.
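
To make the trade-off concrete, here is a small sketch using a toy POD handle. The struct and all three helpers are hypothetical and are not GCC's vec; they only show that returning such a handle by value copies a pointer and a length, so the copy aliases the same storage, while the out-parameter form keeps ownership visibly with the caller.

```cpp
#include <cassert>
#include <cstddef>

// Toy POD handle (hypothetical, for illustration only): a pointer to heap
// storage plus a length.  Copying it copies these two fields, not the data.
struct int_handle
{
  int *data;
  std::size_t length;
};

// Return-by-value style: the caller receives a shallow copy of the handle.
int_handle make_squares (std::size_t n)
{
  int_handle h = { new int[n], n };
  for (std::size_t i = 0; i < n; i++)
    h.data[i] = static_cast<int> (i * i);
  return h;
}

// Out-parameter style: the caller owns the handle and passes a pointer in.
void fill_squares (int_handle *out, std::size_t n)
{
  assert (out != nullptr && out->data == nullptr);
  out->data = new int[n];
  out->length = n;
  for (std::size_t i = 0; i < n; i++)
    out->data[i] = static_cast<int> (i * i);
}

// Free the storage exactly once, through whichever handle owns it.
void release (int_handle *h)
{
  delete[] h->data;
  h->data = nullptr;
  h->length = 0;
}
```

With the by-value form, any second copy of the handle points at the same array, so releasing through one copy leaves the other dangling; that is the hazard that makes the out-parameter form attractive.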

Trev

> 
> 
> Bernd


[PATCH] Verify that context of local DECLs is the current function

2016-04-25 Thread Martin Jambor
Hi,

the patch below moves an assert from expand_expr_real_1 to gimple
verification.  It triggers when we do a sloppy job outlining stuff
from one function to another (or perhaps inlining too) and leave in
the IL of a function a local declaration that belongs to a different
function.
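
The condition being verified can be modelled in isolation as follows. This is a toy sketch: the two structs and the predicate name are invented for illustration and do not correspond to GCC's tree API, although the predicate follows the same shape as the check in the patch below (a non-static, non-external declaration whose function context is neither file scope nor the current function).

```cpp
#include <cassert>

// Toy stand-ins, hypothetical and unrelated to GCC's tree data structures.
struct toy_function
{
  const char *name;
};

struct toy_decl
{
  const toy_function *context;  // owning function, or nullptr for file scope
  bool is_static;
  bool is_external;
};

// Return true when DECL is a local declaration that belongs to a function
// other than CURRENT, i.e. "true" signals an error, as in the patch.
bool local_decl_from_other_function_p (const toy_decl &decl,
                                       const toy_function &current)
{
  return decl.context != &current
         && decl.context != nullptr
         && !decl.is_static
         && !decl.is_external;
}
```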

Like I wrote above, such cases usually ICE in expand anyway, but I
think it is worth bailing out sooner, if for no other reason than that
bugs like PR 70348 would not be assigned to me ;-) ...well, actually, I
found this helpful when working on OpenMP gridification.

In the process, I think that the verifier would not catch a
SSA_NAME_IN_FREE_LIST when such an SSA_NAME is a base of a MEM_REF so
I added that check too.

Bootstrapped and tested on x86_64-linux, OK for trunk?

Thanks,

Martin



2016-04-21  Martin Jambor  

* tree-cfg.c (verify_var_parm_result_decl): New function.
(verify_address): Call it on PARM_DECL bases.
(verify_expr): Likewise, also verify SSA_NAME bases of MEM_REFs.
---
 gcc/tree-cfg.c | 47 +++
 1 file changed, 47 insertions(+)

diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 3385164..c917967 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -2764,6 +2764,23 @@ gimple_split_edge (edge edge_in)
   return new_bb;
 }
 
+/* Verify that a VAR, PARM_DECL or RESULT_DECL T is from the current function,
+   and if not, return true.  If it is, return false.  */
+
+static bool
+verify_var_parm_result_decl (tree t)
+{
+  tree context = decl_function_context (t);
+  if (context != cfun->decl
+  && !SCOPE_FILE_SCOPE_P (context)
+  && !TREE_STATIC (t)
+  && !DECL_EXTERNAL (t))
+{
+  error ("Local declaration from a different function");
+  return true;
+}
+  return false;
+}
 
 /* Verify properties of the address expression T with base object BASE.  */
 
@@ -2798,6 +2815,8 @@ verify_address (tree t, tree base)
|| TREE_CODE (base) == RESULT_DECL))
 return NULL_TREE;
 
+  if (verify_var_parm_result_decl (base))
+return base;
   if (DECL_GIMPLE_REG_P (base))
 {
   error ("DECL_GIMPLE_REG_P set on a variable with address taken");
@@ -2834,6 +2853,13 @@ verify_expr (tree *tp, int *walk_subtrees, void *data 
ATTRIBUTE_UNUSED)
}
   break;
 
+case PARM_DECL:
+case VAR_DECL:
+case RESULT_DECL:
+  if (verify_var_parm_result_decl (t))
+   return t;
+  break;
+
 case INDIRECT_REF:
   error ("INDIRECT_REF in gimple IL");
   return t;
@@ -2852,9 +2878,25 @@ verify_expr (tree *tp, int *walk_subtrees, void *data 
ATTRIBUTE_UNUSED)
  error ("invalid offset operand of MEM_REF");
  return TREE_OPERAND (t, 1);
}
+  if (TREE_CODE (x) == SSA_NAME)
+   {
+ if (SSA_NAME_IN_FREE_LIST (x))
+   {
+ error ("SSA name in freelist but still referenced");
+ return x;
+   }
+ if (SSA_NAME_VAR (x))
+   x = SSA_NAME_VAR (x);
+   }
+  if ((TREE_CODE (x) == PARM_DECL
+  || TREE_CODE (x) == VAR_DECL
+  || TREE_CODE (x) == RESULT_DECL)
+ && verify_var_parm_result_decl (x))
+   return x;
   if (TREE_CODE (x) == ADDR_EXPR
  && (x = verify_address (x, TREE_OPERAND (x, 0))))
return x;
+
   *walk_subtrees = 0;
   break;
 
@@ -3010,6 +3052,11 @@ verify_expr (tree *tp, int *walk_subtrees, void *data 
ATTRIBUTE_UNUSED)
 
  t = TREE_OPERAND (t, 0);
}
+  if ((TREE_CODE (t) == PARM_DECL
+  || TREE_CODE (t) == VAR_DECL
+  || TREE_CODE (t) == RESULT_DECL)
+ && verify_var_parm_result_decl (t))
+   return t;
 
   if (!is_gimple_min_invariant (t) && !is_gimple_lvalue (t))
{
-- 
2.8.1



Re: [PATCH 09/18] make pattern_regs a vec

2016-04-25 Thread Bernd Schmidt

On 04/20/2016 08:22 AM, tbsaunde+...@tbsaunde.org wrote:


-static rtx_expr_list *
+static vec<rtx>
  extract_mentioned_regs (rtx x)
  {
-  rtx_expr_list *mentioned_regs = NULL;
+  vec<rtx> mentioned_regs = vNULL;
subrtx_var_iterator::array_type array;
FOR_EACH_SUBRTX_VAR (iter, array, x, NONCONST)
  {
rtx x = *iter;
if (REG_P (x))
-   mentioned_regs = alloc_EXPR_LIST (0, x, mentioned_regs);
+   mentioned_regs.safe_push (x);
  }
return mentioned_regs;
  }


Is it really such a great idea to return a vec by value? I'd rather pass 
a pointer to it into the function and operate on that.



Bernd


[PATCH] PR target/70155: Use SSE for TImode load/store

2016-04-25 Thread H.J. Lu
Tested on Linux/x86-64.  OK for trunk?

BTW, I have a followup patch to use SSE for TImode bitwise operation.

H.J.

128-bit SSE load and store instructions can be used for load and store
of 128-bit integers if they are the only operations on 128-bit integers.
To convert load and store of 128-bit integers to 128-bit SSE load and
store, the original STV pass, which is designed to convert 64-bit integer
operations to SSE2 operations in 32-bit mode, is extended to 64-bit mode
in the following ways:

1. Class scalar_chain is turned into base class.  The 32-bit specific
member functions are moved to the new derived class, scalar_chain_32.
The new derived class, scalar_chain_64, is added to convert load and
store of 128-bit integers to 128-bit SSE load and store.
2. Add the 64-bit version of scalar_to_vector_candidate_p and
remove_non_convertible_regs.  Only TImode load and store are allowed
for conversion.  If any instruction on the chain of dependent
instructions isn't a TImode load or store, the chain of instructions
won't be converted.
3. In 64-bit, we only convert from TImode to V1TImode, which have the
same size.  The only difference is that only vector registers are
allowed in V1TImode, so that 128-bit SSE load and store instructions
will be used for load and store of 128-bit integers.
4. Put the 64-bit STV pass before the CSE pass so that instructions
changed or generated by the STV pass can be CSEed.
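
The class split in item 1 and the all-or-nothing rule in item 2 can be pictured with a toy model like the one below. Every name here (toy_insn, toy_chain, toy_chain_64, convertible_p) is invented for this sketch; the real scalar_chain classes in i386.c have a different and much larger interface.

```cpp
#include <cassert>
#include <vector>

// Toy instruction kinds, hypothetical and for illustration only.
enum class toy_insn { timode_load, timode_store, other };

// Base class: shared chain storage, with the mode-specific decision left
// to the derived class through a pure virtual function.
class toy_chain
{
public:
  virtual ~toy_chain () {}
  void add (toy_insn insn) { insns.push_back (insn); }
  virtual bool convertible_p () const = 0;

protected:
  std::vector<toy_insn> insns;
};

// 64-bit flavour: the chain is converted only if every instruction on it
// is a TImode load or store; a single other instruction rejects the chain.
class toy_chain_64 : public toy_chain
{
public:
  bool convertible_p () const override
  {
    for (toy_insn insn : insns)
      if (insn != toy_insn::timode_load && insn != toy_insn::timode_store)
        return false;
    return true;
  }
};
```

A 32-bit counterpart would derive from the same base and supply its own rule, which is what lets the pass driver hold a base-class pointer and pick the derived class by target mode.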

gcc/

PR target/70155
* config/i386/i386.c (scalar_to_vector_candidate_p): Renamed
to ...
(scalar_to_vector_candidate_p_32): This.
(scalar_to_vector_candidate_p_64): New function.
(scalar_to_vector_candidate_p): Likewise.
(check_non_convertible_regs_64): Likewise.
(remove_non_convertible_regs_64): Likewise.
(remove_non_convertible_regs): Likewise.
(remove_non_convertible_regs): Renamed to ...
(remove_non_convertible_regs_32): This.
(scalar_chain::~scalar_chain): Make it virtual.
(scalar_chain::compute_convert_gain): Make it pure virtual.
(scalar_chain::convert_insn): Likewise.
(scalar_chain::convert_registers): Likewise.
(scalar_chain::analyze_register_chain): Likewise.
(scalar_chain::add_to_queue): Make it protected.
(scalar_chain::emit_conversion_insns): Likewise.
(scalar_chain::mark_dual_mode_def): Moved to scalar_chain_32.
(scalar_chain::replace_with_subreg): Likewise.
(scalar_chain::replace_with_subreg_in_insn): Likewise.
(scalar_chain::convert_op): Likewise.
(scalar_chain::convert_reg): Likewise.
(scalar_chain::make_vector_copies): Likewise.
(scalar_chain::convert_registers): New pure virtual function.
(class scalar_chain_32): New class.
(class scalar_chain_64): Likewise.
(scalar_chain::mark_dual_mode_def): Renamed to ...
(scalar_chain_32::mark_dual_mode_def): This.
(scalar_chain::analyze_register_chain): Renamed to ...
(scalar_chain_32::analyze_register_chain ): This.
(scalar_chain_64::analyze_register_chain): New function.
(scalar_chain::compute_convert_gain): Renamed to ...
(scalar_chain_32::compute_convert_gain): This.
(scalar_chain::replace_with_subreg): Renamed to ...
(scalar_chain_32::replace_with_subreg): This.
(scalar_chain::replace_with_subreg_in_insn): Renamed to ...
(scalar_chain_32::replace_with_subreg_in_insn): This.
(scalar_chain::make_vector_copies): Renamed to ...
(scalar_chain_32::make_vector_copies): This.
(scalar_chain::convert_reg): Renamed to ...
(scalar_chain_32::convert_reg ): This.
(scalar_chain::convert_op): Renamed to ...
(scalar_chain_32::convert_op): This.
(scalar_chain::convert_insn): Renamed to ...
(scalar_chain_32::convert_insn): This.
(scalar_chain_64::convert_insn): New function.
(scalar_chain_32::convert_registers): Likewise.
(scalar_chain::convert): Call convert_registers.
(convert_scalars_to_vector): Change to scalar_chain pointer to
use scalar_chain_64 in 64-bit mode and scalar_chain_32 in 32-bit
mode.  Delete scalar_chain pointer.  Call free_dominance_info in
64-bit mode.
(pass_stv::gate): Remove TARGET_64BIT check.
(ix86_option_override): Put the 64-bit STV pass before the CSE
pass.

gcc/testsuite/

PR target/70155
* gcc.target/i386/pr55247-2.c: Updated to check movti_internal
and movv1ti_internal patterns
* gcc.target/i386/pr70155-1.c: New test.
* gcc.target/i386/pr70155-2.c: Likewise.
* gcc.target/i386/pr70155-3.c: Likewise.
* gcc.target/i386/pr70155-4.c: Likewise.
* gcc.target/i386/pr70155-5.c: Likewise.
* gcc.target/i386/pr70155-6.c: Likewise.
* gcc.target/i386/pr70155-7.c: Likewise.
* gcc.target/i386/pr70155-8.c: Likewise.
* gcc.targ

Re: [PATCH 15/18] make nonlocal_goto_handler_labels a vec

2016-04-25 Thread Bernd Schmidt

On 04/20/2016 08:22 AM, tbsaunde+...@tbsaunde.org wrote:

-  remove_node_from_insn_list (insn, &nonlocal_goto_handler_labels);
+
+  unsigned int len = vec_safe_length (nonlocal_goto_handler_labels);
+  for (unsigned int i = 0; i < len; i++)
+   if ((*nonlocal_goto_handler_labels)[i] == insn)
+ {
+   nonlocal_goto_handler_labels->ordered_remove (i);
+   break;
+ }


Why not unordered_remove?

Should there be a vec-based version of remove-node_from_*_list?


+  rtx_insn *temp;
+  unsigned int i;
+  FOR_EACH_VEC_SAFE_ELT_REVERSE (nonlocal_goto_handler_labels, i, temp)
+  BLOCK_FOR_INSN (temp)->flags |= BB_NON_LOCAL_GOTO_TARGET;


Indentation looks wrong.



/* Process non-local goto edges.  */
if (can_nonlocal_goto (insn))
-   for (rtx_insn_list *lab = nonlocal_goto_handler_labels;
-lab;
-lab = lab->next ())
- maybe_record_trace_start_abnormal (lab->insn (), insn);
+   {


Unnecessary brace?


/* Re-insert the EH_REGION notes.  */
-  if (eh_note || (was_call && nonlocal_goto_handler_labels))
+  if (eh_note || (was_call && vec_safe_length (nonlocal_goto_handler_labels)))


I'm not a big fan of omitting the > 0 and using the integer as a 
boolean. Multiple occurrences.



+  FOR_EACH_VEC_SAFE_ELT_REVERSE (nonlocal_goto_handler_labels, i, insn)
+  set_label_offsets (insn, NULL, 1);


Indentation.


Bernd


Re: [PATCH 00/18] towards removing rtx_insn_list and rtx_expr_list

2016-04-25 Thread Bernd Schmidt

On 04/21/2016 01:24 AM, Trevor Saunders wrote:

On Wed, Apr 20, 2016 at 06:03:01AM -0700, Andi Kleen wrote:



A vector can have very different performance than a list, depending how
it is used. Do your patches cause any measurable performance difference for
the compiler?


I haven't measured, but I am aware of that and did consider it when
writing these patches.  I expect they'll help perf some since I went
through some hoops to not move elements around the vector unnecessarily.
I'm not really sure what workload is most affected by each of these
patches, and they don't really seem that risky to me, so I'd rather not do
tons of testing on the off chance they slow something down; in the worst
case we can always revert something to a list without using rtx.


Well, at least post before/after numbers (with all patches applied for 
the "after") for compiling a large source file, including time and 
memory reports.



Bernd


Re: [PATCH 07/18] loop-iv.c: make cond_list a vec

2016-04-25 Thread Bernd Schmidt

On 04/20/2016 08:22 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 



+ unsigned int len = cond_list.length ();
+ for (unsigned int i = len - 1; i < len; i--)


This is a really icky way to write a loop, the i < len condition makes 
it look like a forward one. We have FOR_EACH_VEC_ELT{,_REVERSE}, any 
reason not to use these?



Bernd


Commit: MSP430: Optimize 1 bit shifts

2016-04-25 Thread Nick Clifton
Hi Guys,

  I am applying this patch, developed by DJ, to improve the code
  generated for the MSP430 when performing a shift by a single bit.
  Normally a helper function is used to perform N-bit shifts, but
  for one bit we can save time, and not use up any more space, by
  performing the shift inline.
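
  For reference, the three 16-bit operations touched by the patch look like
  this at the source level.  This is only a sketch: the helper names are
  made up, and whether a given compiler version expands these through the
  ashlhi3, ashrhi3 and lshrhi3 patterns inline is an assumption, not
  something the code below can show.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers showing 16-bit shifts by exactly one bit.

// Left shift by one; the result is truncated back to 16 bits.
std::uint16_t shl1 (std::uint16_t x)
{
  return static_cast<std::uint16_t> (x << 1);
}

// Arithmetic right shift by one.  Right-shifting a negative value is
// implementation-defined before C++20; GCC sign-extends.
std::int16_t sar1 (std::int16_t x)
{
  return static_cast<std::int16_t> (x >> 1);
}

// Logical right shift by one; zero is shifted in from the top.
std::uint16_t shr1 (std::uint16_t x)
{
  return static_cast<std::uint16_t> (x >> 1);
}
```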

Cheers
  Nick

gcc/ChangeLog
2016-04-25  DJ Delorie  

* config/msp430/msp430.md (ashlhi3): Optimize one bit shifts.
(ashrhi3): Likewise.
(lshrhi3): Likewise.

Index: gcc/config/msp430/msp430.md
===
--- gcc/config/msp430/msp430.md (revision 235409)
+++ gcc/config/msp430/msp430.md (working copy)
@@ -756,6 +756,9 @@
 && REG_P (operands[1])
 && CONST_INT_P (operands[2]))
   emit_insn (gen_430x_shift_left (operands[0], operands[1], operands[2]));
+else if (CONST_INT_P (operands[2])
+&& INTVAL (operands[2]) == 1)
+  emit_insn (gen_slli_1 (operands[0], operands[1]));
 else
   msp430_expand_helper (operands, \"__mspabi_slli\", true);
 DONE;
@@ -825,6 +828,9 @@
 && REG_P (operands[1])
 && CONST_INT_P (operands[2]))
   emit_insn (gen_430x_arithmetic_shift_right (operands[0], operands[1], operands[2]));
+else if (CONST_INT_P (operands[2])
+&& INTVAL (operands[2]) == 1)
+  emit_insn (gen_srai_1 (operands[0], operands[1]));
 else
msp430_expand_helper (operands, \"__mspabi_srai\", true);
DONE;
@@ -910,6 +916,9 @@
 && REG_P (operands[1])
 && CONST_INT_P (operands[2]))
   emit_insn (gen_430x_logical_shift_right (operands[0], operands[1], operands[2]));
+else if (CONST_INT_P (operands[2])
+&& INTVAL (operands[2]) == 1)
+  emit_insn (gen_srli_1 (operands[0], operands[1]));
 else
   msp430_expand_helper (operands, \"__mspabi_srli\", true);
 DONE;


Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size

2016-04-25 Thread Prathamesh Kulkarni
On 6 April 2016 at 14:54, Richard Biener  wrote:
> On Wed, 6 Apr 2016, Richard Biener wrote:
>
>> On Wed, 6 Apr 2016, Prathamesh Kulkarni wrote:
>>
>> > On 6 April 2016 at 13:44, Richard Biener  wrote:
>> > > On Wed, 6 Apr 2016, Prathamesh Kulkarni wrote:
>> > >
>> > >> On 5 April 2016 at 18:28, Richard Biener  wrote:
>> > >> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
>> > >> >
>> > >> >> On 5 April 2016 at 16:58, Richard Biener  wrote:
>> > >> >> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
>> > >> >> >
>> > >> >> >> On 4 April 2016 at 19:44, Jan Hubicka  wrote:
>> > >> >> >> >
>> > >> >> >> >> diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
>> > >> >> >> >> index 9eb63c2..bc0c612 100644
>> > >> >> >> >> --- a/gcc/lto/lto-partition.c
>> > >> >> >> >> +++ b/gcc/lto/lto-partition.c
>> > >> >> >> >> @@ -511,9 +511,20 @@ lto_balanced_map (int n_lto_partitions)
>> > >> >> >> >>varpool_order.qsort (varpool_node_cmp);
>> > >> >> >> >>
>> > >> >> >> >>/* Compute partition size and create the first partition.  
>> > >> >> >> >> */
>> > >> >> >> >> +  if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE 
>> > >> >> >> >> (MAX_PARTITION_SIZE))
>> > >> >> >> >> +fatal_error (input_location, "min partition size cannot 
>> > >> >> >> >> be greater than max partition size");
>> > >> >> >> >> +
>> > >> >> >> >>partition_size = total_size / n_lto_partitions;
>> > >> >> >> >>if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
>> > >> >> >> >>  partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
>> > >> >> >> >> +  else if (partition_size > PARAM_VALUE (MAX_PARTITION_SIZE))
>> > >> >> >> >> +{
>> > >> >> >> >> +  n_lto_partitions = total_size / PARAM_VALUE 
>> > >> >> >> >> (MAX_PARTITION_SIZE);
>> > >> >> >> >> +  if (total_size % PARAM_VALUE (MAX_PARTITION_SIZE))
>> > >> >> >> >> + n_lto_partitions++;
>> > >> >> >> >> +  partition_size = total_size / n_lto_partitions;
>> > >> >> >> >> +}
>> > >> >> >> >
>> > >> >> >> > lto_balanced_map actually works in a way that looks for the
>> > >> >> >> > cheapest cutpoint in the range 3/4*partition_size to
>> > >> >> >> > 2*partition_size and picks the cheapest range.
>> > >> >> >> > Setting partition_size to this value will thus not cause the
>> > >> >> >> > partitioner to produce smaller partitions only.  I suppose
>> > >> >> >> > you could modify the conditional:
>> > >> >> >> >
>> > >> >> >> >   /* Partition is too large, unwind into step when best 
>> > >> >> >> > cost was reached and
>> > >> >> >> >  start new partition.  */
>> > >> >> >> >   if (partition->insns > 2 * partition_size)
>> > >> >> >> >
>> > >> >> >> > and/or in the code above set the partition_size to half of 
>> > >> >> >> > total_size/max_size.
>> > >> >> >> >
>> > >> >> >> > I know this is somewhat sloppy.  This was really just a
>> > >> >> >> > first-cut implementation many years ago.  I expected to
>> > >> >> >> > reimplement it smarter soon, but then there was never really
>> > >> >> >> > a need for it (I am trying to avoid late IPA optimizations so
>> > >> >> >> > the partitioning decisions should mostly affect compile time
>> > >> >> >> > performance only).
>> > >> >> >> > If ARM is more sensitive to partitioning, perhaps it would
>> > >> >> >> > make sense to try to look for something smarter.
>> > >> >> >> >
>> > >> >> >> >> +
>> > >> >> >> >>npartitions = 1;
>> > >> >> >> >>partition = new_partition ("");
>> > >> >> >> >>if (symtab->dump_file)
>> > >> >> >> >> diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
>> > >> >> >> >> index 9dd513f..294b8a4 100644
>> > >> >> >> >> --- a/gcc/lto/lto.c
>> > >> >> >> >> +++ b/gcc/lto/lto.c
>> > >> >> >> >> @@ -3112,6 +3112,12 @@ do_whole_program_analysis (void)
>> > >> >> >> >>timevar_pop (TV_WHOPR_WPA);
>> > >> >> >> >>
>> > >> >> >> >>timevar_push (TV_WHOPR_PARTITIONING);
>> > >> >> >> >> +
>> > >> >> >> >> +  if (flag_lto_partition != LTO_PARTITION_BALANCED
>> > >> >> >> >> +  && PARAM_VALUE (MAX_PARTITION_SIZE) != INT_MAX)
>> > >> >> >> >> +fatal_error (input_location, "--param max-lto-partition 
>> > >> >> >> >> should only"
>> > >> >> >> >> +  " be used with balanced partitioning\n");
>> > >> >> >> >> +
>> > >> >> >> >
>> > >> >> >> > I think we should wire in a reasonable MAX_PARTITION_SIZE
>> > >> >> >> > default.  The value you found experimentally may be a good
>> > >> >> >> > start.  For that reason we can't really refuse a value when
>> > >> >> >> > !LTO_PARTITION_BALANCED.  Just document it as a parameter for
>> > >> >> >> > balanced partitioning only and add a parameter to
>> > >> >> >> > lto_balanced_map specifying whether this param should be
>> > >> >> >> > honored (because the same path is used for partitioning to
>> > >> >> >> > one partition)
>> > >> >> >> >
>> > >> >> >> > Otherwise the patch looks good to me modulo missing 
>> > >> >> >> > docume

Re: An abridged "Writing C" for the gcc web pages

2016-04-25 Thread Bernd Schmidt

On 04/22/2016 09:45 PM, Sandra Loosemore wrote:

On 04/22/2016 10:42 AM, paul_kon...@dell.com wrote:



Would you expect people to conform to the abridged version or the
full standard?  If the full standard, then publishing an abridged
version is not a good idea, it will just cause confusion.  Let the
full standard be the rule, make people read it, and if they didn't
bother that's their problem.


I agree; let's not have two documents that can conflict or get out of
sync with each other, unless you can figure out how to extract the
abridged document automatically from the full version.


Hmm. As for being out-of-date, I'd say the official document is guilty: 
it talks about how we can use C89 now that it is prevalent enough. Now 
that I've looked at it again after many years, the document really does 
seem poorly organized and contains lots of irrelevant information (a 
huge table of long option names?)


I think if we limit our local document to just the basics (and maybe 
call it a Getting Started guide rather than an abridged form of the GNU 
coding standards), there's little danger of it going out of date, and I 
still think having it would improve our documentation.



Bernd


Re: [PATCH, www] Fix typo in htdocs/develop.html

2016-04-25 Thread Bernd Schmidt

On 04/21/2016 02:16 PM, Kirill Yukhin wrote:

Hello,
This looks like a typo to me.

   GCC 6 Stage 4 (starts 2016-01-20)GCC 5.3 release (2015-12-04)
|
+-- GCC 5 branch created +
| \
v  v

Patch in the bottom. Is it ok to install?


Sure.


Bernd



Commit: MSP430: Update prototypes in libgcc

2016-04-25 Thread Nick Clifton
Hi Guys,

  I am applying the following patch to update the prototypes in the
  MSP430 specific part of libgcc.  It adds missing prototypes for
  exported ABI functions, and it changes the prototypes for the
  arithmetic shift functions so that they explicitly take a signed
  char parameter.
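
  As general background on why spelling out "signed char" can matter: the
  signedness of plain char is implementation-defined, so a negative count
  passed through a plain char parameter is not portable, while signed char
  always preserves it.  The sketch below uses two made-up helpers and does
  not reproduce the libgcc routines; that this is the motivation for the
  patch is an assumption.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helpers, not the libgcc shift routines.

// With an explicit signed char, a negative value always stays negative
// when widened to int.
int widen_signed (signed char bit)
{
  return bit;
}

// With plain char, the widened value depends on the target: it is negative
// where char is signed and a large positive value where char is unsigned.
int widen_plain (char bit)
{
  return bit;
}

// True on targets where plain char is a signed type.
constexpr bool plain_char_is_signed = (CHAR_MIN < 0);
```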

Cheers
  Nick

libgcc/ChangeLog
2016-04-25  Nick Clifton  

* config/msp430/cmpd.c (__mspabi_cmpf): Add prototype.
(__mspabi_cmpd): Likewise.
* config/msp430/floathidf.c (__floathidf): Likewise.
* config/msp430/floathisf.c (__floathisf): Likewise.
* config/msp430/floatunhidf.c (__floatunhidf): Likewise.
* config/msp430/floatunhisf.c (__floatunhisf): Likewise.
* config/msp430/lib2shift.c (__ashlsi3): Take a signed char as the
second parameter.
(__ashrsi3): Likewise.

Index: libgcc/config/msp430/cmpd.c
===
--- libgcc/config/msp430/cmpd.c (revision 235408)
+++ libgcc/config/msp430/cmpd.c (working copy)
@@ -1,4 +1,7 @@
 /* Public domain.  */
+
+int __mspabi_cmpf (float, float);
+
 int
 __mspabi_cmpf (float x, float y)
 {
@@ -8,6 +11,9 @@
 return 1;
   return 0;
 }
+
+int __mspabi_cmpd (double, double);
+
 int
 __mspabi_cmpd (double x, double y)
 {
Index: libgcc/config/msp430/floathidf.c
===
--- libgcc/config/msp430/floathidf.c(revision 235408)
+++ libgcc/config/msp430/floathidf.c(working copy)
@@ -1,6 +1,8 @@
 /* Public domain.  */
 extern double __floatsidf (long);
 
+double __floathidf (int);
+
 double
 __floathidf (int u)
 {
Index: libgcc/config/msp430/floathisf.c
===
--- libgcc/config/msp430/floathisf.c(revision 235408)
+++ libgcc/config/msp430/floathisf.c(working copy)
@@ -4,6 +4,8 @@
 
 extern SFtype __floatsisf (unsigned long);
 
+SFtype __floathisf (HItype);
+
 SFtype
 __floathisf (HItype u)
 {
Index: libgcc/config/msp430/floatunhidf.c
===
--- libgcc/config/msp430/floatunhidf.c  (revision 235408)
+++ libgcc/config/msp430/floatunhidf.c  (working copy)
@@ -5,6 +5,8 @@
 
 extern DFtype __floatunsidf (unsigned long);
 
+DFtype __floatunhidf (UHItype);
+
 DFtype
 __floatunhidf (UHItype u)
 {
Index: libgcc/config/msp430/floatunhisf.c
===
--- libgcc/config/msp430/floatunhisf.c  (revision 235408)
+++ libgcc/config/msp430/floatunhisf.c  (working copy)
@@ -5,6 +5,8 @@
 
 extern SFtype __floatunsisf (unsigned long);
 
+SFtype __floatunhisf (UHItype);
+
 SFtype
 __floatunhisf (UHItype u)
 {
Index: libgcc/config/msp430/lib2shift.c
===
--- libgcc/config/msp430/lib2shift.c(revision 235408)
+++ libgcc/config/msp430/lib2shift.c(working copy)
@@ -28,10 +28,10 @@
 typedef  int  sint16_type   __attribute__ ((mode (HI)));
 typedef unsigned int  uint16_type   __attribute__ ((mode (HI)));
 
-uint32_type __ashlsi3 (uint32_type in, char bit);
-sint32_type __ashrsi3 (sint32_type in, char bit);
-int __clrsbhi2 (sint16_type x);
-extern int __clrsbsi2 (sint32_type x);
+uint32_type __ashlsi3 (uint32_type, signed char);
+sint32_type __ashrsi3 (sint32_type, signed char);
+int __clrsbhi2 (sint16_type);
+extern int  __clrsbsi2 (sint32_type);
 
 typedef struct
 {
@@ -43,7 +43,7 @@
 } dd;
 
 uint32_type
-__ashlsi3 (uint32_type in, char bit)
+__ashlsi3 (uint32_type in, signed char bit)
 {
   uint16_type h, l;
   dd d;
@@ -77,7 +77,7 @@
 }
 
 sint32_type
-__ashrsi3 (sint32_type in, char bit)
+__ashrsi3 (sint32_type in, signed char bit)
 {
   sint16_type h;
   uint16_type l;


Re: [PATCH][GCC7] Remove scaling of COMPONENT_REF/ARRAY_REF ops 2/3

2016-04-25 Thread Eric Botcazou
> Did you manage to do this yet?  I'm flushing my stage1 queue of
> "simple cleanups" right now.

No, I'm going to have a look this week.

-- 
Eric Botcazou


[PATCH] Fix PR70780

2016-04-25 Thread Richard Biener

The following patch fixes PR70780 uncovered by a mistake I made when
updating the iteration scheme in PRE antic compute.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2016-04-25  Richard Biener  

PR tree-optimization/70780
* tree-ssa-pre.c (compute_antic_aux): Also return true if the block
wasn't visited yet.
(compute_antic): Mark blocks with abnormal preds as visited as
they have a final empty antic-in solution already.

* gcc.dg/torture/pr70780.c: New testcase.

Index: gcc/tree-ssa-pre.c
===
*** gcc/tree-ssa-pre.c  (revision 235404)
--- gcc/tree-ssa-pre.c  (working copy)
*** compute_antic_aux (basic_block block, bo
*** 2081,2086 
--- 2081,2087 
unsigned int bii;
edge e;
edge_iterator ei;
+   bool was_visited = BB_VISITED (block);
  
old = ANTIC_OUT = S = NULL;
BB_VISITED (block) = 1;
*** compute_antic_aux (basic_block block, bo
*** 2171,2177 
  
clean (ANTIC_IN (block));
  
!   if (!bitmap_set_equal (old, ANTIC_IN (block)))
  changed = true;
  
   maybe_dump_sets:
--- 2172,2178 
  
clean (ANTIC_IN (block));
  
!   if (!was_visited || !bitmap_set_equal (old, ANTIC_IN (block)))
  changed = true;
  
   maybe_dump_sets:
*** compute_antic (void)
*** 2349,2363 
  
FOR_ALL_BB_FN (block, cfun)
  {
FOR_EACH_EDGE (e, ei, block->preds)
if (e->flags & EDGE_ABNORMAL)
  {
bitmap_set_bit (has_abnormal_preds, block->index);
break;
  }
  
-   BB_VISITED (block) = 0;
- 
/* While we are here, give empty ANTIC_IN sets to each block.  */
ANTIC_IN (block) = bitmap_set_new ();
PA_IN (block) = bitmap_set_new ();
--- 2350,2367 
  
FOR_ALL_BB_FN (block, cfun)
  {
+   BB_VISITED (block) = 0;
+ 
FOR_EACH_EDGE (e, ei, block->preds)
if (e->flags & EDGE_ABNORMAL)
  {
bitmap_set_bit (has_abnormal_preds, block->index);
+ 
+   /* We also anticipate nothing.  */
+   BB_VISITED (block) = 1;
break;
  }
  
/* While we are here, give empty ANTIC_IN sets to each block.  */
ANTIC_IN (block) = bitmap_set_new ();
PA_IN (block) = bitmap_set_new ();
Index: gcc/testsuite/gcc.dg/torture/pr70780.c
===
*** gcc/testsuite/gcc.dg/torture/pr70780.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr70780.c  (working copy)
***
*** 0 
--- 1,26 
+ /* { dg-do run } */
+ 
+ int a, b, c, *d, e;
+ 
+ static int
+ fn1 () 
+ {
+   if (a)
+ goto l1;
+ l2: while (b)
+   if (*d)
+   return c;
+ for (e = 0; e; e++)
+   {
+   goto l2;
+ l1:;
+   }
+ return 0; 
+ }
+ 
+ int
+ main ()
+ {
+   fn1 ();
+   return 0; 
+ }


Re: Please include ada-hurd.diff upstream (try2)

2016-04-25 Thread Arnaud Charlet
> Attaching the modified ada-hurd.diff. Maybe it is ready for inclusion in
> upstream now?

Patch is OK, go ahead and commit it, thanks.

> 2016-03-31  Svante Signell  
> 
> * gcc-interface/Makefile.in: Add support for x86 GNU/Hurd.
> * s-osinte-gnu.ads: New specification file.
> * s-osinte-gnu.adb: New body file, basically s-osinte-posix.adb
>   adding dummy implementation of functions not yet implemented.


Re: Please include ada-hurd.diff upstream (try2)

2016-04-25 Thread Svante Signell
On Mon, 2016-04-25 at 11:50 +0200, Arnaud Charlet wrote:
> > 
> > Is replacing
> > 
> > +-- Copyright (C) 1991-1994, Florida State University --
> > +-- Copyright (C) 1995-2014, AdaCore --
> > +--  Copyright (C) 2015-2016, Free Software Foundation, Inc. --
> > 
> > with only
> > 
> > +--  Copyright (C) 2015-2016, Free Software Foundation, Inc. --
> > 
> > OK?
> For this specific new file, yes that's fine.

Attaching the modified ada-hurd.diff. Maybe it is ready for inclusion in
upstream now?

2016-03-31  Svante Signell  

* gcc-interface/Makefile.in: Add support for x86 GNU/Hurd.
* s-osinte-gnu.ads: New specification file.
* s-osinte-gnu.adb: New body file, basically s-osinte-posix.adb
  adding dummy implementation of functions not yet implemented.
---
Index: gcc-5-5.3.1/src/gcc/ada/s-osinte-gnu.adb
===
--- /dev/null
+++ gcc-5-5.3.1/src/gcc/ada/s-osinte-gnu.adb
@@ -0,0 +1,144 @@
+--
+--  --
+-- GNAT RUN-TIME LIBRARY (GNARL) COMPONENTS --
+--  --
+--   S Y S T E M . O S _ I N T E R F A C E  --
+--  --
+--   B o d y--
+--  --
+--  Copyright (C) 2015-2016, Free Software Foundation, Inc. --
+--  --
+-- GNAT is free software;  you can  redistribute it  and/or modify it under --
+-- terms of the  GNU General Public License as published  by the Free Soft- --
+-- ware  Foundation;  either version 3,  or (at your option) any later ver- --
+-- sion.  GNAT is distributed in the hope that it will be useful, but WITH- --
+-- OUT ANY WARRANTY;  without even the  implied warranty of MERCHANTABILITY --
+-- or FITNESS FOR A PARTICULAR PURPOSE. --
+--  --
+-- As a special exception under Section 7 of GPL version 3, you are granted --
+-- additional permissions described in the GCC Runtime Library Exception,   --
+-- version 3.1, as published by the Free Software Foundation.   --
+--  --
+-- You should have received a copy of the GNU General Public License and--
+-- a copy of the GCC Runtime Library Exception along with this program; --
+-- see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see--
+-- .  --
+--  --
+-- GNARL was developed by the GNARL team at Florida State University.   --
+-- Extensive contributions were provided by Ada Core Technologies, Inc. --
+--  --
+--
+
+--  This is the GNU/Hurd version of this package.
+
+pragma Polling (Off);
+--  Turn off polling, we do not want ATC polling to take place during
+--  tasking operations. It causes infinite loops and other problems.
+
+--  This package encapsulates all direct interfaces to OS services
+--  that are needed by children of System.
+
+package body System.OS_Interface is
+
+   
+   -- Get_Stack_Base --
+   
+
+   function Get_Stack_Base (thread : pthread_t) return Address is
+  pragma Warnings (Off, thread);
+
+   begin
+  return Null_Address;
+   end Get_Stack_Base;
+
+   --
+   -- pthread_init --
+   --
+
+   procedure pthread_init is
+   begin
+  null;
+   end pthread_init;
+
+   --
+   -- pthread_mutexattr_setprioceiling --
+   --
+
+   function pthread_mutexattr_setprioceiling
+ (attr : access pthread_mutexattr_t;
+  prioceiling : int) return int is
+  pragma Unreferenced (attr, prioceiling);
+   begin
+  return 0;
+   end pthread_mutexattr_setprioceiling;
+
+   --
+   -- pthread_mutexattr_getprioceiling --
+   --
+
+   function pthread_mutexattr_getprioceiling
+ (attr : access pthread_mutexattr_t;
+  prioceiling : access int) return int is
+  pragma Unreferenced (attr, prioceiling);
+   begin
+  return 0;
+  

Re: [Patch] Fix PR 60040

2016-04-25 Thread Senthil Kumar Selvaraj

Ping!

Regards
Senthil

Senthil Kumar Selvaraj writes:

> Bernd Schmidt writes:
>
>> On 04/07/2016 01:52 PM, Senthil Kumar Selvaraj wrote:
>>>The below patch fixes PR 60040 by not halting with a hard error on
>>>a spill failure, if reload knows that it has to run again anyway.
>>
>> Some additional information as to how this situation creates a spill 
>> failure would be useful. It's hard to tell whether this patch just 
>> papers over a problem that can still trigger in other circumstances.
>
> For both testcases in the PR, reload fails to take into account that
> FP-SP elimination can no longer be performed, and tries to find reload
> regs for an rtx generated when FP-SP elimination was valid.
>
> 1. reload initializes elim table with FP->SP elimination enabled.
> 2. alter_reg for a pseudo allocates a stack slot for the pseudo, and sets
>    reg_equiv_memory_loc to frame_pointer_rtx plus offset. It also sets
>    something_was_spilled to true.
> 3. The main reload loop starts, and it resets something_was_spilled to false.
> 4. reload calls eliminate_regs for the pseudo and sets reg_equiv_address to
>    (mem(SP + offset)).
> 5. calculate_needs_all_insns pushes a reload for SP (for the AVR target,
>    SP cannot be a pointer reg - it needs to be reloaded into X Y or Z regs).
> 6. update_eliminables_and_spill calls targetm.frame_pointer_required,
>    which returns true. That causes can_eliminate for FP->SP to be reset
>    to zero, and FP to be added to bad_spill_regs_global. For the AVR,
>    FP is Y, one of the 3 pointer regs. reload also notes that something
>    has changed, and that the loop needs to run again.
> 7. reload still calls select_reload_regs, and find_regs fails to find a
>    pointer reg to reload SP, which is unnecessary as FP->SP elimination
>    had been disabled anyway in (6).
>
> IOW, reload fails to find pointer regs for an RTL expression that was
> created when FP->SP elimination was true, even after it turns out that
> the elimination can't be done after all. The patch tries to detect that
> - if it knows the loop is going to run again, it silences the failure.
>
> Also note that at a different point in the loop, the reload loop starts
> over if something_was_spilled (line 982-986). If set outside the reload
> loop by alter_reg, it gets reset at (3) - not sure why. I'd think a
> "continue" after update_eliminables_and_spill (line 1019-1022) would
> also work - haven't tested it though.
>
> What do you think?
>
>
>>
>>> -   spill_failure (chain->insn, rld[r].rclass);
>>> -   failure = 1;
>>> -   return;
>>> +   if (!tentative)
>>> +   {
>>> +   spill_failure (chain->insn, rld[r].rclass);
>>> +   failure = 1;
>>> +   return;
>>> +   }
>>>   }
>>
>> The indentation looks all wrong.
>>
>
> Fixed now - mixed up tabs and spaces.
>
> gcc/ChangeLog
>
> 2016-04-07  Joern Rennecke  
> Senthil Kumar Selvaraj  
>
> PR target/60040
> * reload1.c (find_reload_regs): Add tentative parameter.
> and don't report spill failure if param set.
> (reload): Propagate something_changed to
> select_reload_regs.
> (select_reload_regs): Add tentative parameter.
>
> gcc/testsuite/ChangeLog
>
> 2016-04-07  Sebastian Huber  
> Matthijs Kooijman  
> Senthil Kumar Selvaraj  
>
> PR target/60040
> * gcc.target/avr/pr60040-1.c: New.
> * gcc.target/avr/pr60040-2.c: Likewise.
>
> diff --git gcc/reload1.c gcc/reload1.c
> index c2800f8..58993a3 100644
> --- gcc/reload1.c
> +++ gcc/reload1.c
> @@ -346,8 +346,8 @@ static void maybe_fix_stack_asms (void);
>  static void copy_reloads (struct insn_chain *);
>  static void calculate_needs_all_insns (int);
>  static int find_reg (struct insn_chain *, int);
> -static void find_reload_regs (struct insn_chain *);
> -static void select_reload_regs (void);
> +static void find_reload_regs (struct insn_chain *, bool);
> +static void select_reload_regs (bool);
>  static void delete_caller_save_insns (void);
>  
>  static void spill_failure (rtx_insn *, enum reg_class);
> @@ -1022,7 +1022,7 @@ reload (rtx_insn *first, int global)
> something_changed = 1;
>   }
>  
> -  select_reload_regs ();
> +  select_reload_regs (something_changed);
>if (failure)
>   goto failed;
>  
> @@ -1960,10 +1960,13 @@ find_reg (struct insn_chain *chain, int order)
> is given by CHAIN.
> Do it by ascending class number, since otherwise a reg
> might be spilled for a big class and might fail to count
> -   for a smaller class even though it belongs to that class.  */
> +   for a smaller class even though it belongs to that class.
> +   TENTATIVE means that we had some changes that might have invalidated
> +   the reloads and that we are going to loop again anyway, so don't give
> +   a hard error on failure to find a reload reg. */
>  
>  static v

Re: Allow embedded timestamps by C/C++ macros to be set externally (3)

2016-04-25 Thread Bernd Schmidt

On 04/18/2016 02:26 PM, Dhole wrote:
> A few months ago I submitted a patch to allow the embedded timestamps by
> C/C++ macros to be set externally [2], which was already an improvement
> over [1].  I was told to wait until the GCC 7 stage 1 started to send
> this patch again.

> +/* Read SOURCE_DATE_EPOCH from environment to have a deterministic
> +   timestamp to replace embedded current dates to get reproducible
> +   results. Returns -1 if SOURCE_DATE_EPOCH is not defined.  */
> +long long
> +get_source_date_epoch()

Always have a space before open-paren. Maybe this should return time_t.
See below.

> +/* Read SOURCE_DATE_EPOCH from environment to have a deterministic
> +   timestamp to replace embedded current dates to get reproducible
> +   results. Returns -1 if SOURCE_DATE_EPOCH is not defined.  */
> +extern long long get_source_date_epoch();

Double space after the end of a sentence. Space before open paren.

> +  source_date_epoch = get_source_date_epoch();
> +  cpp_init_source_date_epoch(parse_in, source_date_epoch);

Spaces.

> +/* Initialize the source_date_epoch value.  */
> +extern void cpp_init_source_date_epoch (cpp_reader *, long long);

Also thinking we should be using time_t here.

>   /* Sanity-checks are dependent on command-line options, so it is
>      called as a subroutine of cpp_read_main_file ().  */

We don't write () to mark function names.

> +tb = gmtime ((time_t*) &pfile->source_date_epoch);

Space before the "*". But this cast looks ugly and unreliable (think
big-endian). This is why I would prefer to move to a time_t
representation sooner.

> 2016-04-18  Eduard Sanou
> Matthias Klose
> * c-common.c (get_source_date_epoch): New function, gets the environment
> variable SOURCE_DATE_EPOCH and parses it as long long with error
> handling.
> * c-common.h (get_source_date_epoch): Prototype.
> * c-lex.c (c_lex_with_flags): set parse_in->source_date_epoch.

Add blank lines after the end of the names in ChangeLogs.


Bernd
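
The time_t point in the review above can be illustrated with a small sketch. This is not the code from the patch: the helper names parse_source_date_epoch and epoch_to_utc are hypothetical, and the error policy (return -1 on anything strtoll does not fully accept) is an assumption. It only shows how a value parsed as long long can be converted to time_t by value, so that gmtime never sees a pointer cast from long long * to time_t *.

```c
#include <errno.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper, not the function from the patch: parse a
   SOURCE_DATE_EPOCH-style string.  Returns (time_t) -1 if TEXT is NULL,
   empty, not fully consumed by strtoll, negative, or out of range.  */
static time_t
parse_source_date_epoch (const char *text)
{
  char *end;
  long long value;

  if (text == NULL || *text == '\0')
    return (time_t) -1;

  errno = 0;
  value = strtoll (text, &end, 10);
  if (errno != 0 || *end != '\0' || value < 0)
    return (time_t) -1;

  /* Reject values that do not survive the conversion to time_t.  */
  if ((long long) (time_t) value != value)
    return (time_t) -1;

  return (time_t) value;
}

/* Convert by value: gmtime receives the address of a real time_t object,
   so the result does not depend on the size or byte order of time_t.
   Returns 0 on success, -1 if gmtime fails.  */
static int
epoch_to_utc (time_t epoch, struct tm *out)
{
  struct tm *tb = gmtime (&epoch);

  if (tb == NULL)
    return -1;
  *out = *tb;
  return 0;
}
```

A caller would pass getenv ("SOURCE_DATE_EPOCH") to the first helper and treat (time_t) -1 as "not set or invalid".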


Re: [PATCH] Fix missed DSE opportunity with operator delete.

2016-04-25 Thread Richard Biener
On Fri, Apr 22, 2016 at 11:37 PM, Mikhail Maltsev  wrote:
> On 04/20/2016 05:12 PM, Richard Biener wrote:
>> You have
>>
>> +static tree
>> +handle_free_attribute (tree *node, tree name, tree /*args*/, int /*flags*/,
>> +  bool *no_add_attrs)
>> +{
>> +  tree decl = *node;
>> +  if (TREE_CODE (decl) == FUNCTION_DECL
>> +  && type_num_arguments (TREE_TYPE (decl)) != 0
>> +  && POINTER_TYPE_P (TREE_VALUE (TYPE_ARG_TYPES (TREE_TYPE (decl)
>> +DECL_ALLOC_FN_KIND (decl) = ALLOC_FN_FREE;
>> +  else
>> +{
>> +  warning_at (DECL_SOURCE_LOCATION (decl), OPT_Wattributes,
>> + "%qE attribute ignored", name);
>> +  *no_add_attrs = true;
>> +}
>>
>> so one can happily apply the attribute to
>>
>>  void foo (void *, void *);
>>
>> but then
>>
>> @@ -2117,6 +2127,13 @@ call_may_clobber_ref_p_1 (gcall *call, ao_ref *ref)
>>   /* Fallthru to general call handling.  */;
>>}
>>
>> +  if (callee != NULL_TREE
>> +  && (flags_from_decl_or_type (callee) & ECF_FREE) != 0)
>> +{
>> +  tree ptr = gimple_call_arg (call, 0);
>> +  return ptr_deref_may_alias_ref_p_1 (ptr, ref);
>> +}
>>
>> will ignore the 2nd argument.  I think it's better to ignore the attribute
>> if type_num_arguments () != 1.
>
> Actually, the C++ standard ([basic.stc.dynamic]/2) defines the following 4
> deallocation functions implicitly:
>
> void operator delete(void*);
> void operator delete[](void*);
> void operator delete(void*, std::size_t) noexcept;
> void operator delete[](void*, std::size_t) noexcept;
>
> And the standard library also has:
>
> void operator delete(void*, const std::nothrow_t&);
> void operator delete[](void*, const std::nothrow_t&);
> void operator delete(void*, std::size_t, const std::nothrow_t&);
> void operator delete[](void*, std::size_t, const std::nothrow_t&);
>
> IIUC, 'delete(void*, std::size_t)' is used by default in C++14
> (https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01266.html). How should we
> handle this?

Hmm.  I guess by adjusting the documentation of the attribute to explicitly
mention the behavior on the rest of the argument pointed-to memory (the
function is assumed to neither write nor read from that memory).  Also
explicitly mention that 'this' is always the first argument if present.

Richard.

> --
> Regards,
> Mikhail Maltsev


Re: [PATCH] Fix missed DSE opportunity with operator delete.

2016-04-25 Thread Richard Biener
On Mon, Apr 25, 2016 at 11:02 AM, Bernd Schmidt  wrote:
> On 04/19/2016 10:48 PM, Mikhail Maltsev wrote:
>>
>> On 04/18/2016 12:14 PM, Richard Biener wrote:
>>>
>>>
>>> Enlarging tree_function_decl is bad.
>>
>> Probably using 3 bits for malloc_flag, operator_new_flag and free_flag is
>> redundant. I packed the state into 2 bits.
>>>
>>>
>>> Passes should get at the info via flags_from_decl_or_type () and a new
>>> ECF_FREE.
>>
>> Fixed.
>
>
> I think we should also have a testcase that verifies that no DSE happens for
> something that has a destructor.

Well, we should verify that the FE emits the call to the attribute-annotated
delete _after_ emitting the call to the destructor.  Which it already should
do.

Richard.

>
> Bernd
>


Re: [PATCH] Don't build 32-bit libgomp with -march=i486 on x86-64

2016-04-25 Thread Jakub Jelinek
On Wed, Apr 20, 2016 at 07:56:16AM -0700, H.J. Lu wrote:
> On Wed, Apr 20, 2016 at 7:53 AM, Jakub Jelinek  wrote:
> > On Wed, Apr 20, 2016 at 07:43:27AM -0700, H.J. Lu wrote:
> >> From 12c6ddcf67593ed7137764ca74043f1a9c2d8fda Mon Sep 17 00:00:00 2001
> >> From: "H.J. Lu" 
> >> Date: Wed, 30 Mar 2016 05:56:08 -0700
> >> Subject: [PATCH 2/3] Don't build 32-bit libgomp with -march=i486 on x86-64
> >>
> >> Gcc uses the same -march= for both -m32 and -m64 on x86-64 unless
> >> --with-arch-32= is used.  There is no need for -march=i486 to compile
> >> 32-bit libgomp on x86-64.
> >>
> >>   PR target/70454
> >>   * configure.tgt (XCFLAGS): Don't add -march=i486 to compile
> >>   32-bit target library on x86-64.
> >
> > That is wrong.  It could be --with-arch-32=i386 build.
> 
> libgomp/configure.tgt has
> 
># Note that bare i386 is not included here.  We need cmpxchg.
> i[456]86-*-linux*)
> config_path="linux/x86 linux posix"
> case " ${CC} ${CFLAGS} " in
>  *" -m64 "*|*" -mx32 "*)
>;;
>  *)
>if test -z "$with_arch"; then
> ^^^
> 
> --with-arch overrides everything.  I just follow the same practice.
> 
>  XCFLAGS="${XCFLAGS} -march=i486 -mtune=${target_cpu}"
>fi
> esac
> ;;

Yes, and even the -m32 practice is not good.
We should do a preprocessor and/or compile time test in each of these cases
to find out if the default needs to be tweaked and tweak only in that case.

Jakub
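
The preprocessor test suggested above could look roughly like the following sketch. It assumes that the GCC-predefined macro __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 is an acceptable stand-in for "the default -march already provides cmpxchg"; the function names are hypothetical and nothing here is taken from configure.tgt.

```c
/* Hypothetical probe: GCC predefines __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4
   when the selected target can do a 4-byte compare-and-swap inline.
   A configure-time check could preprocess a file like this and add
   -march=i486 only when the macro is missing.  */
static int
have_native_cas4 (void)
{
#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4
  return 1;
#else
  return 0;
#endif
}

/* The operation libgomp cares about: atomically replace *P with DESIRED
   if it currently holds EXPECTED.  Returns nonzero on success.  */
static int
cas_int (int *p, int expected, int desired)
{
  return __sync_bool_compare_and_swap (p, expected, desired);
}
```

With a probe of this kind the XCFLAGS tweak would depend on what the compiler can actually do with its configured defaults, rather than on whether -m32 or --with-arch appears on the command line.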


Re: [PATCH] Verify __builtin_unreachable and __builtin_trap are not called with arguments

2016-04-25 Thread Richard Biener
On Fri, Apr 22, 2016 at 9:40 PM, Martin Jambor  wrote:
> Hi,
>
> On Fri, Apr 22, 2016 at 09:24:31PM +0200, Richard Biener wrote:
>> On April 22, 2016 7:04:31 PM GMT+02:00, Martin Jambor wrote:
>> >Hi,
>> >
>> >this patch adds verification that __builtin_unreachable and
>> >__builtin_trap are not called with arguments.  The problem with calls
>> >to them with arguments is that functions like gimple_call_builtin_p
>> >return false on them, because they return true only when
>> >gimple_builtin_call_types_compatible_p does.  One manifestation of
>> >that was PR 61591 where undefined behavior sanitizer did not replace
>> >such calls with its thing as it should, but there might be others.
>> >
>> >I have included __builtin_trap in the verification because they often
>> >seem to be handled together but can either remove it or add more
>> >builtins if people think it better.  I concede it is a bit arbitrary.
>> >
>> >Honza said he has seen __builtin_unreachable calls with parameters in
>> >LTO builds of Firefox, so it seems this might actually trigger, but I
>> >also think we do not want such calls in the IL.
>> >
>> >I have bootstrapped and tested this on x86_64-linux (with all
>> >languages and Ada) and have also run a C, C++ and Fortran LTO
>> >bootstrap with the patch on the same architecture.  OK for trunk?
>>
>> Shouldn't we simply error in the FEs for this given the builtins
>> essentially have a prototype?  That is, error for non-matching args
>> for the __builtin_ variant of _all_ builtins (treat them as
>> prototyped)?
>>
>
> We do that.  It is just that at times we produce a call to
> __builtin_unreachable internally.  The only instance I know of is IPA
> figuring out a call cannot happen in a legal program (for example
> because it would lead to a call of abstract virtual functions) but
> perhaps there are other places where we do it.

Ah, I see...

> I thought we had fixed the issue of IPA leaving behind arguments in
> the calls to __builtin_unreachable it produced and this verification
> would simply make sure the bug does not come back, but Honza's
> observation suggests that it still sometimes happens.

... so the patch is ok if you put a comment before it resembling the above.

Thanks,
Richard.

> Martin
>
>> Richard.
>>
>> >Thanks,
>> >
>> >Martin
>> >
>> >
>> >2016-04-20  Martin Jambor  
>> >
>> > * tree-cfg.c (verify_gimple_call): Check that calls to
>> > __builtin_unreachable or __builtin_trap do not have actual arguments.
>> >---
>> > gcc/tree-cfg.c | 20 
>> > 1 file changed, 20 insertions(+)
>> >
>> >diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
>> >index 04e46fd..3385164 100644
>> >--- a/gcc/tree-cfg.c
>> >+++ b/gcc/tree-cfg.c
>> >@@ -3414,6 +3414,26 @@ verify_gimple_call (gcall *stmt)
>> >   return true;
>> > }
>> >
>> >+  if (fndecl && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL)
>> >+{
>> >+  switch (DECL_FUNCTION_CODE (fndecl))
>> >+{
>> >+case BUILT_IN_UNREACHABLE:
>> >+case BUILT_IN_TRAP:
>> >+  if (gimple_call_num_args (stmt) > 0)
>> >+{
>> >+  /* Built-in unreachable with parameters might not be caught by
>> >+ undefined behavior sanitizer. */
>> >+  error ("__builtin_unreachable or __builtin_trap call with "
>> >+ "arguments");
>> >+  return true;
>> >+}
>> >+  break;
>> >+default:
>> >+  break;
>> >+}
>> >+}
>> >+
>> >   /* ???  The C frontend passes unpromoted arguments in case it
>> >  didn't see a function declaration before the call.  So for now
>> >  leave the call arguments mostly unverified.  Once we gimplify
>>
>>
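
For reference, the form the verifier expects is the ordinary user-level one: both __builtin_unreachable and __builtin_trap take no arguments. The following is only a minimal sketch of that form in C (the enum and function name are made up for illustration); the calls the patch guards against are ones the compiler creates internally, which source code like this cannot produce.

```c
/* Hypothetical example type and function, for illustration only.  */
enum color { RED, GREEN, BLUE };

static int
color_bit (enum color c)
{
  switch (c)
    {
    case RED:
      return 1;
    case GREEN:
      return 2;
    case BLUE:
      return 4;
    }

  /* Every enumerator is handled above, so control cannot get here for a
     valid value.  Note the empty argument list.  */
  __builtin_unreachable ();
}
```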


Re: [PATCH] Don't build 32-bit libatomic with -march=i486 on x86-64

2016-04-25 Thread Uros Bizjak
On Mon, Apr 25, 2016 at 11:50 AM, Uros Bizjak  wrote:
> On Mon, Apr 25, 2016 at 11:46 AM, Bernd Schmidt  wrote:
>> On 04/20/2016 04:57 PM, H.J. Lu wrote:
>>>
>>> On Wed, Apr 20, 2016 at 7:54 AM, Jakub Jelinek  wrote:
>>>>>
>>>>> https://gcc.gnu.org/ml/gcc-patches/2016-04/msg01080.html
>>>>
>>>>
>>>> This is wrong, see my other comment on the libgomp patch.
>>>>
>>> See my reply to your reply on the libgomp patch.
>>
>>
>> Since Jakub has said it is wrong, please revert.
>
> I agree.

(sent the message too fast...)

These patches obviously need some more discussion.

Uros.


Re: [DOC Patch] Add sample for @cc constraint

2016-04-25 Thread Bernd Schmidt

On 04/16/2016 01:12 AM, David Wohlferd wrote:
> There were basically 3 changes I was trying for in that doc patch. Are
> any of them worth keeping?  Or are we done?
>
> 1) "Do not clobber flags if they are being used as outputs."
> 2) Output flags sample (with #if removed).
> 3) "On the x86 platform, flags are always treated as clobbered by
> extended asm whether @code{"cc"} is specified or not."
>
> I'm prepared to send an updated patch if there's anything here that
> might get approved.

I think the updated flags sample would be nice to have.


Bernd



Re: [PATCH] Don't build 32-bit libatomic with -march=i486 on x86-64

2016-04-25 Thread Uros Bizjak
On Mon, Apr 25, 2016 at 11:46 AM, Bernd Schmidt  wrote:
> On 04/20/2016 04:57 PM, H.J. Lu wrote:
>>
>> On Wed, Apr 20, 2016 at 7:54 AM, Jakub Jelinek  wrote:

>>>> https://gcc.gnu.org/ml/gcc-patches/2016-04/msg01080.html
>>>
>>>
>>> This is wrong, see my other comment on the libgomp patch.
>>>
>> See my reply to your reply on the libgomp patch.
>
>
> Since Jakub has said it is wrong, please revert.

I agree.

Uros.


Re: Please include ada-hurd.diff upstream (try2)

2016-04-25 Thread Arnaud Charlet
> Is replacing
> 
> +-- Copyright (C) 1991-1994, Florida State University--
> +-- Copyright (C) 1995-2014, AdaCore --
> +--  Copyright (C) 2015-2016, Free Software Foundation, Inc. --
> 
> with only
> 
> +--  Copyright (C) 2015-2016, Free Software Foundation, Inc. --
> 
> OK?

For this specific new file, yes that's fine.


Re: [PATCH] Don't build 32-bit libatomic with -march=i486 on x86-64

2016-04-25 Thread Bernd Schmidt

On 04/20/2016 04:57 PM, H.J. Lu wrote:
> On Wed, Apr 20, 2016 at 7:54 AM, Jakub Jelinek  wrote:
>>> https://gcc.gnu.org/ml/gcc-patches/2016-04/msg01080.html
>>
>> This is wrong, see my other comment on the libgomp patch.
>
> See my reply to your reply on the libgomp patch.

Since Jakub has said it is wrong, please revert.


Bernd

