[patch, doc] Document that new Perl version breaks required automake

2017-11-08 Thread Thomas Koenig

Hello world,

while PR 82856 remains unsolved so far, this documentation patch at
least points people into the right direction if --enable-maintainer-mode
fails due to the incompatibility of the latest Perl version with
the required automkake version.

Tested with "make dvi" and "make pdf". OK for trunk?

2017-11-09  Thomas Koenig  

PR bootstrap/82856
doc/install.texi: Document incompatibility of Perl >=5.6.26
with the required version of automake 1.11.6.

Index: doc/install.texi
===
--- doc/install.texi	(Revision 254408)
+++ doc/install.texi	(Arbeitskopie)
@@ -324,7 +324,7 @@ Necessary (only on some platforms) to untar the so
 systems' @command{tar} programs will also work, only try GNU
 @command{tar} if you have problems.
 
-@item Perl version 5.6.1 (or later)
+@item Perl version between 5.6.1 and 5.6.24
 
 Necessary when targeting Darwin, building @samp{libstdc++},
 and not using @option{--disable-symvers}.
@@ -338,6 +338,8 @@ Necessary when generating manpages from Texinfo ma
 Used by various scripts to generate some files included in SVN (mainly
 Unicode-related and rarely changing) from source tables.
 
+Used by @command{automake}.
+
 @end table
 
 Several support libraries are necessary to build GCC, some are required,
@@ -420,6 +422,9 @@ the 1.11 series, which is currently 1.11.6.  When
 to a newer version, please update all the directories using an older 1.11
 to the latest released version.
 
+Note that @command{automake} 1.11.6 is incompatible with
+@command{perl} version 5.6.26.
+
 @item gettext version 0.14.5 (or later)
 
 Needed to regenerate @file{gcc.pot}.


Re: [001/nnn] poly_int: add poly-int.h

2017-11-08 Thread Martin Sebor

On 11/08/2017 11:28 AM, Richard Sandiford wrote:

Martin Sebor  writes:

On 11/08/2017 09:51 AM, Richard Sandiford wrote:

Martin Sebor  writes:

On 11/08/2017 02:32 AM, Richard Sandiford wrote:

Martin Sebor  writes:

I haven't done nearly a thorough review but the dtor followed by
the placement new in the POLY_SET_COEFF() macro caught my eye so
I thought I'd ask sooner rather than later.  Given the macro
definition:

+   The dummy comparison against a null C * is just a way of checking
+   that C gives the right type.  */
+#define POLY_SET_COEFF(C, RES, I, VALUE) \
+  ((void) (&(RES).coeffs[0] == (C *) 0), \
+   wi::int_traits::precision_type == wi::FLEXIBLE_PRECISION \
+   ? (void) ((RES).coeffs[I] = VALUE) \
+   : (void) ((RES).coeffs[I].~C (), new (&(RES).coeffs[I]) C (VALUE)))

is the following use well-defined?

+template
+inline poly_int_pod&
+poly_int_pod::operator <<= (unsigned int a)
+{
+  POLY_SET_COEFF (C, *this, 0, this->coeffs[0] << a);

It looks to me as though the VALUE argument in the ctor invoked
by the placement new expression is evaluated after the dtor has
destroyed the very array element the VALUE argument expands to.


Good catch!  It should simply have been doing <<= on each coefficient --
I must have got carried away when converting to POLY_SET_COEFF.

I double-checked the other uses and think that's the only one.


Whether or not is, in fact, a problem, it seems to me that using
a function template rather than a macro would be a clearer and
safer way to do the same thing.  (Safer in that the macro also
evaluates its arguments multiple times, which is often a source
of subtle bugs.)


That would slow down -O0 builds though, by introducing an extra
function call and set of temporaries even when the coefficients
are primitive integers.


Would decorating the function template with attribute always_inline
help?


It would remove the call itself, but we'd still have the extra temporary
objects that were the function argument and return value.


Sorry, I do not want to get into another long discussion about
trade-offs between safety and efficiency but I'm not sure I see
what extra temporaries it would create.  It seems to me that
an inline function template that took arguments of user-defined
types by reference and others by value should be just as efficient
as a macro.

 From GCC's own manual:

   6.43 An Inline Function is As Fast As a Macro
   https://gcc.gnu.org/onlinedocs/gcc/Inline.html


You can see the difference with something like:

  inline
  void __attribute__((always_inline))
  f(int , const int ) { dst = src; }

  int g1(const int ) { int x; f(x, y); return x; }
  int g2(const int ) { int x; x = y; return x; }


Let me say at the outset that I struggle to comprehend that a few
instructions is even a consideration when not optimizing, especially
in light of the bug the macro caused that would have been prevented
by using a function instead.  But...

...I don't think your example above is representative of using
the POLY_SET_COEFF macro.  The function template I'm suggesting
might look something to this:

  template 
  inline void __attribute__ ((always_inline))
  poly_set_coeff (poly_int_pod *p, unsigned idx, C val)
  {
((void) (&(*p).coeffs[0] == (C *) 0), 
wi::int_traits::precision_type == wi::FLEXIBLE_PRECISION ? (void) 
((*p).coeffs[0] = val) : (void) ((*p).coeffs[0].~C (), new 
(&(*p).coeffs[0]) C (val)));


if (N >= 2)
  for (unsigned int i = 1; i < N; i++)
((void) (&(*p).coeffs[0] == (C *) 0), 
wi::int_traits::precision_type == wi::FLEXIBLE_PRECISION ? (void) 
((*p).coeffs[i] = val) : (void) ((*p).coeffs[i].~C (), new 
(&(*p).coeffs[i]) C (val)));

  }

To compare apples to apples I suggest to instead compare the shift
operator (or any other poly_int function that uses the macro) that
doesn't suffer from the bug vs one that makes use of the function
template.  I see a difference of 2 instructions on x86_64 (21 vs
23) for operator<<=.

Are two assembly instructions even worth talking about?


If that's not the case and there is a significant performance
penalty associated with inline functions at -O0 then GCC should
be fixed to avoid it.


I think those docs are really talking about inline functions being as
fast as macros when optimisation is enabled.  I don't think we make
any guarantees about -O0 code quality.


Sure, but you are using unsafe macros in preference to a safer
inline function even with optimization, introducing a bug as
a result, and making an argument that the performance impact
of a few instructions when not using optimization is what should
drive the decision between one and the other in all situations.
With all respect, I fail to see the logic in this like of
reasoning.  By that argument we would never be able to define
any inline functions.

That being said, if the performance implications of using inline
functions with no optimization are so serious 

Ping: [PATCH], Add rounding built-ins to the _Float and _FloatX built-in functions

2017-11-08 Thread Michael Meissner
I suspect this patch got lost among the FMA patches at the same time.  This
patch enables the rounding functions.  Segher has already approved the rs6000
bits.

https://gcc.gnu.org/ml/gcc-patches/2017-10/msg02124.html

On Fri, Oct 27, 2017 at 06:39:21PM -0400, Michael Meissner wrote:
> The power9 (running PowerPC ISA 3.0) has a round to integer instruction
> (XSRQPI) that does various flavors of round an IEEE 128-bit floating point to
> integeral values.  This patch adds the support to the machine independent
> portion of the compiler, and adds the necessary support for ceilf128,
> roundf128, truncf128, and roundf128 to the PowerPC backend when you use
> -mcpu=power9.
> 
> I have done bootstrap builds on both x86-64 and a little endian power8 system.
> Can I install these patches to the trunk?
> 
> [gcc]
> 2017-10-27  Michael Meissner  
> 
>   * builtins.def: (_Float and _FloatX BUILT_IN_CEIL): Add
>   _Float and _FloatX variants for rounding built-in
>   functions.
>   (_Float and _FloatX BUILT_IN_FLOOR): Likewise.
>   (_Float and _FloatX BUILT_IN_NEARBYINT): Likewise.
>   (_Float and _FloatX BUILT_IN_RINT): Likewise.
>   (_Float and _FloatX BUILT_IN_ROUND): Likewise.
>   (_Float and _FloatX BUILT_IN_TRUNC): Likewise.
>   * builtins.c (mathfn_built_in_2): Likewise.
>   * internal-fn.def (CEIL): Likewise.
>   (FLOOR): Likewise.
>   (NEARBYINT): Likewise.
>   (RINT): Likewise.
>   (ROUND): Likewise.
>   (TRUNC): Likewise.
>   * fold-const.c (tree_call_nonnegative_warnv_p): Likewise.
>   (integer_valued_real_call_p): Likewise.
>   * fold-const-call.c (fold_const_call_ss): Likewise.
>   * config/rs6000/rs6000.md (floor2): Add support for IEEE
>   128-bit round to integer instructions.
>   (ceil2): Likewise.
>   (btrunc2): Likewise.
>   (round2): Likewise.
> 
> [gcc/c]
> 2017-10-27  Michael Meissner  
> 
>   * c-decl.c (header_for_builtin_fn): Add integer rounding _Float
>   and _FloatX built-in functions.
> 
> [gcc/testsuite]
> 2017-10-27  Michael Meissner  
> 
>   * gcc.target/powerpc/float128-hw2.c: Add tests for ceilf128,
>   floorf128, truncf128, and roundf128.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: [PATCH] Fix dwarf2out ICE with UNSPEC_GOTOFF (PR debug/82837)

2017-11-08 Thread Jeff Law
On 11/06/2017 02:35 PM, Jakub Jelinek wrote:
> Hi!
> 
> My recent changes to const_ok_for_output_1 to allow UNSPEC if target hook
> says it is ok for debug regressed the following testcase, where creative
> simplify-rtx.c changes result in (const (neg (unspec ... UNSPEC_GOTOFF)))
> being emitted and the backend not being able to assemble that (assembler
> has no such relocations).
> We already have a hack to avoid ICEing on NOT in similar cases, this patch
> adds NEG to that.  And, in mem_loc_descriptor tries harder to handle these
> cases right - while if we have say (const (not (symbol_ref))) the current
> code would handle it right already, if there is the unspec, we really need
> it to be wrapped inside of CONST, otherwise it is dropped on the floor.
> 
> The patch in that case emits (const (neg (unspec ... UNSPEC_GOTOFF)))
> as (neg (const (unspec ... UNSPEC_GOTOFF))).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2017-11-06  Jakub Jelinek  
> 
>   PR debug/82837
>   * dwarf2out.c (const_ok_for_output_1): Reject NEG in addition to NOT.
>   (mem_loc_descriptor): Handle (const (neg (...))) as (neg (const (...)))
>   and similarly for not instead of neg.
> 
>   * gcc.dg/debug/dwarf2/pr82837.c: New test.
OK.
jeff


Re: [Diagnostic Patch] don't print column zero

2017-11-08 Thread Jeff Law
On 11/08/2017 10:36 AM, Nathan Sidwell wrote:
> On 11/02/2017 04:33 PM, Nathan Sidwell wrote:
>> On 10/26/2017 10:34 AM, David Malcolm wrote:
>>> [CCing Rainer and Mike for the gcc-dg.exp part]
>>
>>> My Tcl skills aren't great, so hopefully someone else can review this;
>>> CCing Rainer and Mike.
>>
>> Ping?
>>
>> https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01911.html
>>
> Ping? A change to the column-checking logic the diagnostic processing.
I'm not sure anyone's Tcl skills are great these days.  I think most of
us bash on it until it works the way we want and then try to return to
productive work :-)

If David's happy with the changes to the diagnostic machinery, then
let's go ahead and move forward with this patch.

jeff


RE: [PATCH 14/22] Enable building libsanitizer with Intel CET

2017-11-08 Thread Tsimbalist, Igor V
The revised patch is attached. The differences are in what options are defined 
and propagated to Makefiles for CET enabling.
 
Ok for trunk?

Igor

> -Original Message-
> From: Jeff Law [mailto:l...@redhat.com]
> Sent: Wednesday, October 18, 2017 1:37 AM
> To: Tsimbalist, Igor V ; gcc-
> patc...@gcc.gnu.org
> Cc: ja...@redhat.com
> Subject: Re: [PATCH 14/22] Enable building libsanitizer with Intel CET
> 
> On 10/12/2017 02:27 PM, Tsimbalist, Igor V wrote:
> > Enable building libsanitizer with Intel CET options.
> >
> > libsanitizer/
> > * acinclude.m4: Add enable.m4 and cet.m4.
> > * Makefile.in: Regenerate.
> > * asan/Makefile.am: Update AM_CXXFLAGS.
> > * asan/Makefile.in: Regenerate.
> > * configure: Likewise.
> > * configure.ac: Set CET_FLAGS. Update EXTRA_CFLAGS,
> > * EXTRA_CXXFLAGS.
> > * interception/Makefile.am: Update AM_CXXFLAGS.
> > * interception/Makefile.in: Regenerate.
> > * libbacktrace/Makefile.am: Update AM_CFLAGS, AM_CXXFLAGS.
> > * libbacktrace/Makefile.in: Regenerate.
> > * lsan/Makefile.am: Update AM_CXXFLAGS.
> > * lsan/Makefile.in: Regenerate.
> > * sanitizer_common/Makefile.am: Update AM_CXXFLAGS.
> > * sanitizer_common/Makefile.in: Regenerate.
> > * tsan/Makefile.am: Update AM_CXXFLAGS.
> > * tsan/Makefile.in: Regenerate.
> > * ubsan/Makefile.am: Update AM_CXXFLAGS.
> > * ubsan/Makefile.in: Regenerate.
> >
> 
> Same comments as with libcilkrts.
> Jeff


0014-Enable-building-libsanitizer-with-Intel-CET.PATCH
Description: 0014-Enable-building-libsanitizer-with-Intel-CET.PATCH


Re: [PATCH 07/22] Enable building libgcc with CET options.

2017-11-08 Thread H.J. Lu
On Wed, Nov 8, 2017 at 3:04 PM, Jeff Law  wrote:
> On 11/08/2017 03:06 PM, Tsimbalist, Igor V wrote:
> So the question I have WRT this patch is the default setting.  If I
>>> understand it correctly, if the assembler supports the appropriate
>>> insns, then we enable building target libraries with CET by default.
>>
>> That's right.
>>
>>> These libraries continue to work on older systems without CET
>>> capabilities because the CET specific instructions are interpreted as
>>> NOPs on older hardware, right?
>>
>> That's correct. One specific note though. The endbr and rdssp instructions
>> will be treated as NOPs. Incssp instruction generated by the compiler or
>> used in the library will be guarded not to be executed if CET features are
>> not enabled.
> OK.
>
>>
>>> What about cases where we're running on CET capable hardware, the main
>>> program gets compiled without CET, but links against a libgcc with CET.
>>> What happens in that case?
>>
>> All object files and libraries must have CET property set to make the whole
>> application to be CET capable. In your case the program will not be CET
>> capable.
> Doesn't this imply that other components (linker, dynamic linker) are
> working together to verify that the entire application and DSO are
> compiled with CET?  What happens when a CET capable application dl-opens
> a DSO which is not CET safe?  Does the dynamic linker disable CET at
> that point?

GNU_PROPERTY_X86_FEATURE_1_IBT is added to GNU program property to
indicate that all executable sections are compatible with IBT when
ENDBR instruction starts each valid target where an indirect branch
instruction can land.  GNU_PROPERTY_X86_FEATURE_1_IBT is set on output
only if it is set on all relocatable inputs.

On an IBT capable processor, the following steps should be taken:

1. When loading an executable without an interpreter, enable IBT and
lock IBT if GNU_PROPERTY_X86_FEATURE_1_IBT is set on the executable.
2. When loading an executable with an interpreter, enable IBT if
GNU_PROPERTY_X86_FEATURE_1_IBT is set on the interpreter.
  a. If GNU_PROPERTY_X86_FEATURE_1_IBT isn't set on the executable,
 disable IBT.
  b. Lock IBT.
3. If IBT is enabled, when loading a shared object without
GNU_PROPERTY_X86_FEATURE_1_IBT:
  a. If legacy interwork is allowed, then mark all pages in executable
 PT_LOAD segments in legacy code page bitmap.  Failure of legacy code
 page bitmap allocation causes an error.
  b. If legacy interwork isn't allowed, it causes an error.

GNU_PROPERTY_X86_FEATURE_1_SHSTK is added to GNU program property to
indicate that all executable sections are compatible with SHSTK where
return address popped from shadow stack always matches return address
popped from normal stack.  GNU_PROPERTY_X86_FEATURE_1_SHSTK is set on
output only if it is set on all relocatable inputs.

On a SHSTK capable processor, the following steps should be taken:

1. When loading an executable without an interpreter, enable SHSTK if
GNU_PROPERTY_X86_FEATURE_1_SHSTK is set on the executable.
2. When loading an executable with an interpreter, enable SHSTK if
GNU_PROPERTY_X86_FEATURE_1_SHSTK is set on interpreter.
  a. If GNU_PROPERTY_X86_FEATURE_1_SHSTK isn't set on the executable
 or any shared objects loaded via the DT_NEEDED tag, disable SHSTK.
  b. Otherwise lock SHSTK.
3. After SHSTK is enabled, it is an error to load a shared object
without GNU_PROPERTY_X86_FEATURE_1_SHSTK.

When glibc is built with a CET-enabled compiler, CET is enabled by
default, unless --disable-cet is used to configure glibc.  When CET is
enabled, both compiler and assembler must support CET.  Otherwise, it
is a configure-time error.




-- 
H.J.


Re: [PATCH] Set default to -fomit-frame-pointer

2017-11-08 Thread Jeff Law
On 11/08/2017 11:16 AM, Wilco Dijkstra wrote:
> Jeff Law wrote:
> 
>> I'd actually prefer to deprecate the H8 and M68k.  But assuming that's
>> not going to happen in the immediate future I think dropping frame
>> pointers on those targets is appropriate as long as we're generating
>> dwarf frame info.
> 
> Is there a way to check a target does not generate dwarf? I thought that
> was the default.
Hmm, looks like H8 is ELF only at this point and unconditionally turns
on dwarf2 debug records and has bits to enable dwarf2 CFI records.

I suspect -gstabs would likely turn all that off, but well, the more I
think about it, I don't care :-)


> 
>>> I believe in those cases targets already force the frame pointer as 
>>> required,
>>> for example msp430 sets the frame pointer if unwind tables are emitted
>>> irrespectively of the command-line or default setting. Various other targets
>>> don't even use frame_pointer_needed and just do their own thing.
>> I've had the "pleasure" of going round and round on this repeatedly
>> through the years with the kernel teams on this.  Essentially they
>> didn't want to embed the dwarf2 unwinder in the kernel or have all those
>> pages of unwind data.  Instead they strongly preferred to have a frame
>> pointer to facilitate easy and fast unwinding.
> 
> A frame pointer does not facilitate unwinding, it can give a backtrace at 
> best.
Sorry.  I was being imprecise in my choice of words.   A backtrace is
what the kernel guys need for various reasons.  In some cases the
backtrace is generated at interrupt time (handling of profiling events),
so it can't depend on the dwarf interpreter or the dwarf tables.

>> So  my concern is to make sure the kernel folks, particularly in the
>> ia32 world aren't going to get hosed by this change.  If we're changing
>> the default it needs to be signaled to them so that they can ensure that
>> if they want frame pointers that they still get them.
> 
> x86/x64 is not affected since it already omits the frame pointer by default 
> (like
> almost all targets in a target specific way). This patch is about making that
> the global default precisely because pretty much every target already has it
> as the default.
Yea, I guess we fixed 32bit x86 eons ago.

So I think the final conclusion is to go with your change.  If there's
any fallout on h8 or m68k we'll deal with it.

Jeff


Re: [PATCH 07/22] Enable building libgcc with CET options.

2017-11-08 Thread Jeff Law
On 11/08/2017 03:06 PM, Tsimbalist, Igor V wrote:
So the question I have WRT this patch is the default setting.  If I
>> understand it correctly, if the assembler supports the appropriate
>> insns, then we enable building target libraries with CET by default.
> 
> That's right.
> 
>> These libraries continue to work on older systems without CET
>> capabilities because the CET specific instructions are interpreted as
>> NOPs on older hardware, right?
> 
> That's correct. One specific note though. The endbr and rdssp instructions
> will be treated as NOPs. Incssp instruction generated by the compiler or
> used in the library will be guarded not to be executed if CET features are
> not enabled.
OK.

> 
>> What about cases where we're running on CET capable hardware, the main
>> program gets compiled without CET, but links against a libgcc with CET.
>> What happens in that case?
> 
> All object files and libraries must have CET property set to make the whole
> application to be CET capable. In your case the program will not be CET
> capable.
Doesn't this imply that other components (linker, dynamic linker) are
working together to verify that the entire application and DSO are
compiled with CET?  What happens when a CET capable application dl-opens
a DSO which is not CET safe?  Does the dynamic linker disable CET at
that point?

Jeff



[PATCH,committed] Fix PR Fortran 82841

2017-11-08 Thread Steve Kargl
Committed as obvious.  ChangeLog explains change.

2017-11-08  Steven G. Kargl  

PR Fortran/82841
* simplify.c(gfc_simplify_transfer): Do not dereference a NULL pointer.
Unwrap a short line.
 
2017-11-08  Steven G. Kargl  

PR Fortran/82841
* gfortran.dg/transfer_simplify_11.f90: new test.

Index: gcc/fortran/simplify.c
===
--- gcc/fortran/simplify.c  (revision 254552)
+++ gcc/fortran/simplify.c  (working copy)
@@ -6576,8 +6576,7 @@ gfc_simplify_transfer (gfc_expr *source, gfc_expr *mol
 return NULL;
 
   /* Calculate the size of the source.  */
-  if (source->expr_type == EXPR_ARRAY
-  && !gfc_array_size (source, ))
+  if (source->expr_type == EXPR_ARRAY && !gfc_array_size (source, ))
 gfc_internal_error ("Failure getting length of a constant array.");
 
   /* Create an empty new expression with the appropriate characteristics.  */
@@ -6585,7 +6584,7 @@ gfc_simplify_transfer (gfc_expr *source, gfc_expr *mol
  >where);
   result->ts = mold->ts;
 
-  mold_element = mold->expr_type == EXPR_ARRAY
+  mold_element = (mold->expr_type == EXPR_ARRAY && mold->value.constructor)
 ? gfc_constructor_first (mold->value.constructor)->expr
 : mold;
 
Index: gcc/testsuite/gfortran.dg/transfer_simplify_11.f90
===
--- gcc/testsuite/gfortran.dg/transfer_simplify_11.f90  (nonexistent)
+++ gcc/testsuite/gfortran.dg/transfer_simplify_11.f90  (working copy)
@@ -0,0 +1,8 @@
+! { dg-do run }
+! PR Fortran/82841
+!
+   integer, parameter :: N = 2
+   character(len=1) :: chr(N)
+   chr = transfer(repeat("x",ncopies=N),[character(len=1) ::], N)
+   if (chr(1) /= 'x' .and. chr(2) /= 'x') call abort
+end

-- 
Steve


Re: [PATCH 21/22] Add extra field to gtm_jmpbuf on x86 only

2017-11-08 Thread H.J. Lu
On Wed, Nov 8, 2017 at 2:26 PM, Tsimbalist, Igor V
 wrote:
>> -Original Message-
>> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
>> ow...@gcc.gnu.org] On Behalf Of Jeff Law
>> Sent: Wednesday, November 8, 2017 7:31 PM
>> To: Tsimbalist, Igor V ; gcc-
>> patc...@gcc.gnu.org
>> Cc: trie...@redhat.com; Jakub Jelinek 
>> Subject: Re: [PATCH 21/22] Add extra field to gtm_jmpbuf on x86 only
>>
>> On 11/07/2017 09:22 AM, Tsimbalist, Igor V wrote:
>> > I decided to split my previous patch "Enable building libitm with Intel 
>> > CET "
>> > into two different patches. The first patch will add a new field to sjlj.S 
>> > and
>> > target.h  files. The second one will add Intel CET support on the top of 
>> > the
>> > first one. In this case the further changes for adding Intel CET support 
>> > are
>> > seen clearly.
>> >
>> > Ok for trunk?
>> >
>>
>> [ ... snip ... ]
>>
>> >
>> >
>> > 0021-Add-extra-field-to-gtm_jmpbuf-on-x86-only.patch
>> >
>> >
>> > From a6361c78bf774f2b4dbeeaf4147c286cff4ae5a4 Mon Sep 17 00:00:00
>> 2001
>> > From: Igor Tsimbalist 
>> > Date: Tue, 7 Nov 2017 17:00:24 +0300
>> > Subject: [PATCH 21/22] Add extra field to gtm_jmpbuf on x86 only
>> >
>> > Expand the gtm_jmpbuf structure by one word field to add
>> > Intel CET support further. The code in sjlj.S already
>> > allocates more space on the stack then gtm_jmpbuf needs.
>> > Use this extra space to absorb the new field.
>> >
>> > The structure is allocated on the stack in such a way
>> > that eip/rsp field is overlapped with return address on
>> > the stack. Locate the new field right before eip/rsp so
>> > code that accesses buffer fields relative to address of
>> > gtm_jmpbuf has its offsets unchanged.
>> >
>> > The libtool_VERSION is updated for x86 due to extending
>> > the gtm_jmpbuf structure.
>> >
>> > * libitm/config/x86/target.h: Add new field (ssp).
>> > * libitm/config/x86/sjlj.S: Change offsets.
>> > * libitm/configure.tgt: Update libtool_VERSION.
>> So if I understand correctly, given the desire to to have the eip/rip
>> field overlap with the return address on the stack offset changes are
>> inevitable if we add fields.
>
> Yes, that's exactly the case.
>
>> >  esac
>> > +
>> > +# Update libtool_VERSION since the size of struct gtm_jmpbuf is
>> > +# changed for x86.
>> > +case "${host}" in
>> > +
>> > +  # For x86, we use slots in the TCB head for most of our TLS.
>> > +  # The setup of those slots in beginTransaction can afford to
>> > +  # use the global-dynamic model.
>> > +  i[456]86-*-* | x86_64-*-*)
>> > +   libtool_VERSION=2:0:0
>> What's the plan for supporting existing code that may have linked
>> dynamically against libitm?
>
> This should just work.
>
>> One approach is to force the distros to carry the old libitm DSO.
>>
>> THe other would be to try and support both within the same DSO using
>> symbol versioning.  That would seem to imply that we'd need to the
>> before/after code to build that single library that supported both.
>>
>> Thoughts?  Jakub, any interest in chiming in here?
>
> My thought is that the buffer is encapsulated in the library, only sjlj.S
> functions allocate the buffer and access the fields of the buffer, it's
> sort of a black box. If an app loads the library it will work with the
> buffer through the library's functions from sjlj.S , which are compiled
> together.

It isn't the black box since gtm_jmpbuf is used in:

struct gtm_transaction_cp
{
  gtm_jmpbuf jb;
  size_t undolog_size;

If we don't want to change libtool_VERSION, we need to add symbol
versioning to libitm.so.  But from what I can see,  libitm.so wasn't designed
with symbol versioning.  Instead, it changes libtool_VERSION when
ABI is changed.

-- 
H.J.


Re: Protect against min_profitable_iters going negative

2017-11-08 Thread Jeff Law
On 11/08/2017 09:49 AM, Richard Sandiford wrote:
> We had:
> 
>   if (vec_outside_cost <= 0)
> min_profitable_iters = 0;
>   else
> {
> min_profitable_iters = ((vec_outside_cost - scalar_outside_cost)
> * assumed_vf
> - vec_inside_cost * peel_iters_prologue
> - vec_inside_cost * peel_iters_epilogue)
>/ ((scalar_single_iter_cost * assumed_vf)
>   - vec_inside_cost);
> 
> which can lead to negative min_profitable_iters when the *_outside_costs
> are the same and peel_iters_epilogue is nonzero (e.g. if we're peeling
> for gaps).
> 
> This is tested as part of the patch that adds support for fully-predicated
> loops.
> 
> Tested on aarch64-linux-gnu (both with and without SVE), x86_64-linux-gnu
> and powerpc64le-linux-gnu.  OK to install?
> 
> Thanks,
> Richard
> 
> 
> 2017-11-08  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * tree-vect-loop.c (vect_estimate_min_profitable_iters): Make sure
>   min_profitable_iters doesn't go negative.
OK.
jeff


Re: Base subreg rules on REGMODE_NATURAL_SIZE rather than UNITS_PER_WORD

2017-11-08 Thread Jeff Law
On 11/06/2017 07:53 AM, Richard Sandiford wrote:
>>> 2017-09-18  Richard Sandiford  
>>> Alan Hayward  
>>> David Sherwood  
>>>
>>> gcc/
>>> * doc/rtl.texi: Rewrite the subreg rules so that they partition
>>> the inner register into REGMODE_NATURAL_SIZE bytes rather than
>>> UNITS_PER_WORD bytes.
>>> * emit-rtl.c (validate_subreg): Divide subregs into blocks
>>> based on REGMODE_NATURAL_SIZE of the inner mode.
>>> (gen_lowpart_common): Split the SCALAR_FLOAT_MODE_P and
>>> !SCALAR_FLOAT_MODE_P cases.  Use REGMODE_NATURAL_SIZE for the latter.
>>> * expr.c (store_constructor): Use REGMODE_NATURAL_SIZE to test
>>> whether something is likely to occupy more than one register.
>> OK.  Though I must admit, I expected a larger change after reading the
>> intro.
> 
> Heh.  As it turned out I did find one more place that needed to change
> (lowpart_bit_field_p).  Maybe there'll be more -- e.g. it'd probably
> make sense to make lower-subreg.c split based on this instead of
> UNITS_PER_WORD.
> 
> How about this?  Tested as before.
> 
> Thanks,
> Richard
> 
> 
> 2017-11-06  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * doc/rtl.texi: Rewrite the subreg rules so that they partition
>   the inner register into REGMODE_NATURAL_SIZE bytes rather than
>   UNITS_PER_WORD bytes.
>   * emit-rtl.c (validate_subreg): Divide subregs into blocks
>   based on REGMODE_NATURAL_SIZE of the inner mode.
>   (gen_lowpart_common): Split the SCALAR_FLOAT_MODE_P and
>   !SCALAR_FLOAT_MODE_P cases.  Use REGMODE_NATURAL_SIZE for the latter.
>   * expmed.c (lowpart_bit_field_p): Divide the value up into
>   chunks of REGMODE_NATURAL_SIZE rather than UNITS_PER_WORD.
>   * expr.c (store_constructor): Use REGMODE_NATURAL_SIZE to test
>   whether something is likely to occupy more than one register.
OK.  I still expect more fallout though :-)

Jeff


Re: Be stricter about CONST_VECTOR operands

2017-11-08 Thread Jeff Law
On 11/06/2017 02:10 AM, Richard Sandiford wrote:
> The recent gen_vec_duplicate patches used CONST_VECTOR for all
> constants, but the documentation says:
> 
>   @findex const_vector
>   @item (const_vector:@var{m} [@var{x0} @var{x1} @dots{}])
>   Represents a vector constant.  The square brackets stand for the vector
>   containing the constant elements.  @var{x0}, @var{x1} and so on are
>   the @code{const_int}, @code{const_double} or @code{const_fixed} elements.
> 
> Both the AArch32 and AArch64 ports relied on the elements having
> this form and would ICE if the element was something like a CONST
> instead.  This showed up as a failure in vect-102.c for both arm-eabi
> and aarch64-elf (but not aarch64-linux-gnu, which is what the series
> was tested on).
> 
> The two obvious options were to redefine CONST_VECTOR to accept all
> constants or make gen_vec_duplicate honour the existing documentation.
> It looks like other code also assumes that integer CONST_VECTORs contain
> CONST_INTs, so the patch does the latter.
> 
> I deliberately didn't add an assert to gen_const_vec_duplicate
> because it looks like the SPU port *does* expect to be able to create
> CONST_VECTORs of symbolic constants.
> 
> Also, I think the list above should include const_wide_int for vectors
> of TImode and wider.
> 
> The new routine takes a mode for consistency with the generators,
> and because I think it does make sense to accept all constants for
> variable-length:
> 
> (const (vec_duplicate ...))
> 
> rather than have some rtxes for which we instead use:
> 
> (vec_duplicate (const ...))
> 
> Tested on aarch64-elf, aarch64-linux-gnu, x86_64-linux-gnu and
> powerpc64-linux-gnu.  OK to install?
> 
> Richard
> 
> 
> 2017-11-06  Richard Sandiford  
> 
> gcc/
>   * doc/rtl.texi (const_vector): Say that elements can be
>   const_wide_ints too.
>   * emit-rtl.h (valid_for_const_vec_duplicate_p): Declare.
>   * emit-rtl.c (valid_for_const_vec_duplicate_p): New function.
>   (gen_vec_duplicate): Use it instead of CONSTANT_P.
>   * optabs.c (expand_vector_broadcast): Likewise.
OK.
jeff


Re: Add support for adjusting the number of units in a mode

2017-11-08 Thread Jeff Law
On 10/25/2017 09:57 AM, Richard Sandiford wrote:
> We already allow the target to change the size and alignment of a mode.
> This patch does the same thing for the number of units, which is needed
> to give command-line control of the SVE vector length.
> 
> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu,
> on top of the poly_int series.  I think I can approve this under the
> gen* maintainership, so if there are no comments in the menatime,
> I'll apply it if the prerequisites are approved.
> 
> Thanks,
> Richard
> 
> 
> 2017-10-25  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * machmode.h (mode_precision): Prefix with CONST_MODE_PRECISION.
>   (mode_nunits): Likewise CONST_MODE_NUNITS.
>   * machmode.def (ADJUST_NUNITS): Document.
>   * genmodes.c (mode_data::need_nunits_adj): New field.
>   (blank_mode): Update accordingly.
>   (adj_nunits): New variable.
>   (print_maybe_const_decl): Replace CATEGORY with a NEEDS_ADJ
>   parameter.
>   (emit_mode_size_inline): Set need_bytesize_adj for all modes
>   listed in adj_nunits.
>   (emit_mode_nunits_inline): Set need_nunits_adj for all modes
>   listed in adj_nunits.  Don't emit case statements for such modes.
>   (emit_insn_modes_h): Emit definitions of CONST_MODE_NUNITS
>   and CONST_MODE_PRECISION.  Make CONST_MODE_SIZE expand to
>   nothing if adj_nunits is nonnull.
>   (emit_mode_precision, emit_mode_nunits): Use print_maybe_const_decl.
>   (emit_mode_unit_size, emit_mode_base_align, emit_mode_ibit)
>   (emit_mode_fbit): Update use of print_maybe_const_decl.
>   (emit_move_size): Likewise.  Treat the array as non-const
>   if adj_nunits.
>   (emit_mode_adjustments): Handle adj_nunits.
Were all the prereqs here approved?  Or does this depend on the poly_int
stuff?

jeff



Re: [10/10] Add a vect_masked_store target selector

2017-11-08 Thread Jeff Law
On 11/03/2017 10:23 AM, Richard Sandiford wrote:
> This patch adds a target selector that says whether the target
> supports IFN_MASK_STORE.
> 
> 
> 2017-11-03  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * doc/sourcebuild.texi (vect_masked_store): Document.
> 
> gcc/testsuite/
>   * lib/target-supports.exp (check_effective_target_vect_masked_store):
>   New proc.
>   * gcc.dg/vect/vect-cselim-1.c (foo): Mention that the second loop
>   is vectorizable with masked stores.  Update scan-tree-dump-times
>   accordingly.
> 
OK.
jeff


Re: [9/10] Add a vect_align_stack_vars target selector

2017-11-08 Thread Jeff Law
On 11/03/2017 10:22 AM, Richard Sandiford wrote:
> This patch adds a target selector to say whether it's possible to
> align a local variable to the target's preferred vector alignment.
> This can be false for large vectors if the alignment is only
> a preference and not a hard requirement (and thus if there is no
> need to support a stack realignment mechanism).
> 
> 
> 2017-11-03  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * doc/sourcebuild.texi (vect_align_stack_vars): Document.
> 
> gcc/testsuite/
>   * lib/target-supports.exp
>   (check_effective_target_vect_align_stack_vars): New proc.
>   * gcc.dg/vect/vect-23.c: Only expect the array to be aligned if
>   vect_align_stack_vars.
>   * gcc.dg/vect/vect-24.c: Likewise.
>   * gcc.dg/vect/vect-25.c: Likewise.
>   * gcc.dg/vect/vect-26.c: Likewise.
>   * gcc.dg/vect/vect-32-big-array.c: Likewise.
>   * gcc.dg/vect/vect-32.c: Likewise.
>   * gcc.dg/vect/vect-40.c: Likewise.
>   * gcc.dg/vect/vect-42.c: Likewise.
>   * gcc.dg/vect/vect-46.c: Likewise.
>   * gcc.dg/vect/vect-48.c: Likewise.
>   * gcc.dg/vect/vect-52.c: Likewise.
>   * gcc.dg/vect/vect-54.c: Likewise.
>   * gcc.dg/vect/vect-62.c: Likewise.
>   * gcc.dg/vect/vect-67.c: Likewise.
>   * gcc.dg/vect/vect-75-big-array.c: Likewise.
>   * gcc.dg/vect/vect-75.c: Likewise.
>   * gcc.dg/vect/vect-77-alignchecks.c: Likewise.
>   * gcc.dg/vect/vect-78-alignchecks.c: Likewise.
>   * gcc.dg/vect/vect-89-big-array.c: Likewise.
>   * gcc.dg/vect/vect-89.c: Likewise.
>   * gcc.dg/vect/vect-96.c: Likewise.
>   * gcc.dg/vect/vect-multitypes-3.c: Likewise.
>   * gcc.dg/vect/vect-multitypes-6.c: Likewise.
OK.
jeff


Re: [8/10] Add a vect_variable_length target selector

2017-11-08 Thread Jeff Law
On 11/03/2017 10:21 AM, Richard Sandiford wrote:
> This patch adds a target selector for variable-length vectors.
> Initially it's always false, but the SVE patch provides a case
> in which it's true.
> 
> 
> 2017-11-03  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * doc/sourcebuild.texi (vect_variable_length): Document.
> 
> gcc/testsuite/
>   * lib/target-supports.exp
>   (check_effective_target_vect_variable_length): New proc.
>   * gcc.dg/vect/pr60482.c: XFAIL test for no epilog loop if
>   vect_variable_length.
>   * gcc.dg/vect/slp-reduc-6.c: XFAIL two-operation SLP if
>   vect_variable_length.
>   * gcc.dg/vect/vect-alias-check-5.c: XFAIL alias optimization if
>   vect_variable_length.
>   * gfortran.dg/vect/fast-math-mgrid-resid.f: XFAIL predictive
>   commoning optimization if vect_variable_length.
OK.
jeff


Re: [7/10] Add a vect_unaligned_possible target selector

2017-11-08 Thread Jeff Law
On 11/03/2017 10:20 AM, Richard Sandiford wrote:
> This patch adds a target selector that says whether we can ever
> generate an "unaligned" accesses, where "unaligned" is relative
> to the target's preferred vector alignment.  This is already true if:
> 
>vect_no_align && { ! vect_hw_misalign }
> 
> i.e. if the target doesn't have any alignment mechanism and also
> doesn't allow unaligned accesses.  It is also true (for the things
> tested by gcc.dg/vect) if the target only wants things to be aligned
> to an element; in that case every normal scalar access is "vector aligned".
> 
> 
> 2017-11-03  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * doc/sourcebuild.texi (vect_unaligned_possible): Document.
> 
> gcc/testsuite/
>   * lib/target-supports.exp
>   (check_effective_target_vect_unaligned_possible): New proc.
>   * gcc.dg/vect/slp-25.c: Extend XFAIL of peeling for alignment from
>   vect_no_align && { ! vect_hw_misalign } to ! vect_unaligned_possible.
>   * gcc.dg/vect/vect-multitypes-1.c: Likewise.
>   * gcc.dg/vect/vect-109.c: XFAIL vectorisation of an unaligned
>   access to ! vect_unaligned_possible.
>   * gcc.dg/vect/vect-33.c: Likewise.
>   * gcc.dg/vect/vect-42.c: Likewise.
>   * gcc.dg/vect/vect-56.c: Likewise.
>   * gcc.dg/vect/vect-60.c: Likewise.
>   * gcc.dg/vect/vect-96.c: Likewise.
>   * gcc.dg/vect/vect-peel-1.c: Likewise.
>   * gcc.dg/vect/vect-27.c: Extend XFAIL of unaligned vectorization from
>   vect_no_align && { ! vect_hw_misalign } to ! vect_unaligned_possible.
>   * gcc.dg/vect/vect-29.c: Likewise.
>   * gcc.dg/vect/vect-44.c: Likewise.
>   * gcc.dg/vect/vect-48.c: Likewise.
>   * gcc.dg/vect/vect-50.c: Likewise.
>   * gcc.dg/vect/vect-52.c: Likewise.
>   * gcc.dg/vect/vect-72.c: Likewise.
>   * gcc.dg/vect/vect-75-big-array.c: Likewise.
>   * gcc.dg/vect/vect-75.c: Likewise.
>   * gcc.dg/vect/vect-77-alignchecks.c: Likewise.
>   * gcc.dg/vect/vect-77-global.c: Likewise.
>   * gcc.dg/vect/vect-78-alignchecks.c: Likewise.
>   * gcc.dg/vect/vect-78-global.c: Likewise.
>   * gcc.dg/vect/vect-multitypes-3.c: Likewise.
>   * gcc.dg/vect/vect-multitypes-4.c: Likewise.
>   * gcc.dg/vect/vect-multitypes-6.c: Likewise.
>   * gcc.dg/vect/vect-peel-4.c: Likewise.
>   * gcc.dg/vect/vect-peel-3.c: Likewise, and also for peeling
>   for alignment.
OK.  Though I could see a need to one day provide a more concrete test
than "in some circumstances".

jeff


Re: [6/10] Add a vect_element_align_preferred target selector

2017-11-08 Thread Jeff Law
On 11/03/2017 10:20 AM, Richard Sandiford wrote:
> This patch adds a target selector for targets whose
> preferred_vector_alignment is the alignment of one element.  We'll never
> peel in that case, and the step of a loop that operates on normal (as
> opposed to packed) elements will always divide the preferred alignment.
> 
> 
> 2017-11-03  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * doc/sourcebuild.texi (vect_element_align_preferred): Document.
> 
> gcc/testsuite/
>   * lib/target-supports.exp
>   (check_effective_target_vect_element_align_preferred): New proc.
>   (check_effective_target_vect_peeling_profitable): Test it.
>   * gcc.dg/vect/no-section-anchors-vect-31.c: Don't expect peeling
>   if vect_element_align_preferred.
>   * gcc.dg/vect/no-section-anchors-vect-64.c: Likewise.
>   * gcc.dg/vect/pr65310.c: Likewise.
>   * gcc.dg/vect/vect-26.c: Likewise.
>   * gcc.dg/vect/vect-54.c: Likewise.
>   * gcc.dg/vect/vect-56.c: Likewise.
>   * gcc.dg/vect/vect-58.c: Likewise.
>   * gcc.dg/vect/vect-60.c: Likewise.
>   * gcc.dg/vect/vect-89-big-array.c: Likewise.
>   * gcc.dg/vect/vect-89.c: Likewise.
>   * gcc.dg/vect/vect-92.c: Likewise.
>   * gcc.dg/vect/vect-peel-1.c: Likewise.
>   * gcc.dg/vect/vect-outer-3a-big-array.c: Expect the step to
>   divide the alignment if vect_element_align_preferred.
>   * gcc.dg/vect/vect-outer-3a.c: Likewise.
OK.
jeff


RE: [PATCH 21/22] Add extra field to gtm_jmpbuf on x86 only

2017-11-08 Thread Tsimbalist, Igor V
> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Jeff Law
> Sent: Wednesday, November 8, 2017 7:31 PM
> To: Tsimbalist, Igor V ; gcc-
> patc...@gcc.gnu.org
> Cc: trie...@redhat.com; Jakub Jelinek 
> Subject: Re: [PATCH 21/22] Add extra field to gtm_jmpbuf on x86 only
> 
> On 11/07/2017 09:22 AM, Tsimbalist, Igor V wrote:
> > I decided to split my previous patch "Enable building libitm with Intel CET 
> > "
> > into two different patches. The first patch will add a new field to sjlj.S 
> > and
> > target.h  files. The second one will add Intel CET support on the top of the
> > first one. In this case the further changes for adding Intel CET support are
> > seen clearly.
> >
> > Ok for trunk?
> >
> 
> [ ... snip ... ]
> 
> >
> >
> > 0021-Add-extra-field-to-gtm_jmpbuf-on-x86-only.patch
> >
> >
> > From a6361c78bf774f2b4dbeeaf4147c286cff4ae5a4 Mon Sep 17 00:00:00
> 2001
> > From: Igor Tsimbalist 
> > Date: Tue, 7 Nov 2017 17:00:24 +0300
> > Subject: [PATCH 21/22] Add extra field to gtm_jmpbuf on x86 only
> >
> > Expand the gtm_jmpbuf structure by one word field to add
> > Intel CET support further. The code in sjlj.S already
> > allocates more space on the stack then gtm_jmpbuf needs.
> > Use this extra space to absorb the new field.
> >
> > The structure is allocated on the stack in such a way
> > that eip/rsp field is overlapped with return address on
> > the stack. Locate the new field right before eip/rsp so
> > code that accesses buffer fields relative to address of
> > gtm_jmpbuf has its offsets unchanged.
> >
> > The libtool_VERSION is updated for x86 due to extending
> > the gtm_jmpbuf structure.
> >
> > * libitm/config/x86/target.h: Add new field (ssp).
> > * libitm/config/x86/sjlj.S: Change offsets.
> > * libitm/configure.tgt: Update libtool_VERSION.
> So if I understand correctly, given the desire to to have the eip/rip
> field overlap with the return address on the stack offset changes are
> inevitable if we add fields.

Yes, that's exactly the case.

> >  esac
> > +
> > +# Update libtool_VERSION since the size of struct gtm_jmpbuf is
> > +# changed for x86.
> > +case "${host}" in
> > +
> > +  # For x86, we use slots in the TCB head for most of our TLS.
> > +  # The setup of those slots in beginTransaction can afford to
> > +  # use the global-dynamic model.
> > +  i[456]86-*-* | x86_64-*-*)
> > +   libtool_VERSION=2:0:0
> What's the plan for supporting existing code that may have linked
> dynamically against libitm?

This should just work.

> One approach is to force the distros to carry the old libitm DSO.
> 
> THe other would be to try and support both within the same DSO using
> symbol versioning.  That would seem to imply that we'd need to the
> before/after code to build that single library that supported both.
> 
> Thoughts?  Jakub, any interest in chiming in here?

My thought is that the buffer is encapsulated in the library, only sjlj.S
functions allocate the buffer and access the fields of the buffer, it's
sort of a black box. If an app loads the library it will work with the
buffer through the library's functions from sjlj.S , which are compiled
together.

Igor

> jeff
> > +   ;;
> > +esac
> > -- 1.8.3.1
> >



RE: [PATCH 07/22] Enable building libgcc with CET options.

2017-11-08 Thread Tsimbalist, Igor V
> -Original Message-
> From: Jeff Law [mailto:l...@redhat.com]
> Sent: Wednesday, November 8, 2017 6:21 PM
> To: Tsimbalist, Igor V ; Joseph Myers
> ; gcc-patches@gcc.gnu.org; i...@airs.com
> Subject: Re: [PATCH 07/22] Enable building libgcc with CET options.
> 
> On 10/31/2017 05:29 AM, Tsimbalist, Igor V wrote:
> > The revised patch is attached. The differences are in what options are
> defined and propagated to Makefile for CET enabling, also needed asm files
> are updated.
> >
> [ ... ]
> 
> >
> > 0007-Enable-building-libgcc-with-CET-options.patch
> >
> >
> > From df923f7e0ebee1f10136bb64f9c723f2d58f8f2a Mon Sep 17 00:00:00
> 2001
> > From: Igor Tsimbalist 
> > Date: Fri, 27 Oct 2017 15:44:56 +0300
> > Subject: [PATCH 07/21] Enable building libgcc with CET options
> >
> > Enable building libgcc with CET options by default on Linux/x86 if
> > binutils supports CET v2.0.  It can be disabled with --disable-cet.
> > It is an error to configure GCC with --enable-cet if bintuiils doesn't
> > support CET v2.0.
> >
> > ENDBR is added to __morestack_large_model since it is called indirectly.
> >
> > config/
> > * cet.m4: New file.
> >
> > gcc/
> > * config.gcc (extra_headers): Add cet.h for x86 targets.
> > * config/i386/cet.h: New file.
> > * doc/install.texi: Add --enable-cet/--disable-cet.
> >
> > libgcc/
> > * Makefile.in (configure_deps): Add $(srcdir)/../config/cet.m4.
> > (CET_FLAGS): New.
> > * config/i386/morestack.S: Include .
> > (__morestack_large_model): Add _CET_ENDBR at function entrance.
> > * config/i386/resms64.h: Include .
> > * config/i386/resms64f.h: Likewise.
> > * config/i386/resms64fx.h: Likewise.
> > * config/i386/resms64x.h: Likewise.
> > * config/i386/savms64.h: Likewise.
> > * config/i386/savms64f.h: Likewise.
> > * config/i386/t-linux (HOST_LIBGCC2_CFLAGS): Add $(CET_FLAGS).
> > (CRTSTUFF_T_CFLAGS): Likewise.
> > * configure.ac: Include ../config/cet.m4.
> > Set and substitute CET_FLAGS.
> > * configure: Regenerated.
> So the question I have WRT this patch is the default setting.  If I
> understand it correctly, if the assembler supports the appropriate
> insns, then we enable building target libraries with CET by default.

That's right.

> These libraries continue to work on older systems without CET
> capabilities because the CET specific instructions are interpreted as
> NOPs on older hardware, right?

That's correct. One specific note though. The endbr and rdssp instructions
will be treated as NOPs. Incssp instruction generated by the compiler or
used in the library will be guarded not to be executed if CET features are
not enabled.

> What about cases where we're running on CET capable hardware, the main
> program gets compiled without CET, but links against a libgcc with CET.
> What happens in that case?

All object files and libraries must have CET property set to make the whole
application to be CET capable. In your case the program will not be CET
capable.

Igor

> What triggers the use of CET vs interpreting those instructions as NOPs
>
> I don't doubt y'all have already thought about these cases.  I just want
> to make sure that I understand them and the implications before I ack
> this patch.
> 
> Jeff


Re: [PATCH] RISC-V: Fix build error

2017-11-08 Thread Palmer Dabbelt
Committed.  Thanks, Kito :).

On Tue, 07 Nov 2017 15:20:05 PST (-0800), Palmer Dabbelt wrote:
> From: Kito Cheng 
>
>   - This build error was indroduced by "RISC-V: Implement movmemsi"
> and "RISC-V: Support -mpreferred-stack-boundary flag"
>
> gcc/ChangeLog
>
> 2017-11-07  Kito Cheng  
>
> * config/riscv/riscv-protos.h (riscv_slow_unaligned_access_p):
> New extern.
> (MOVE_RATIO): Use riscv_slow_unaligned_access_p.
> config/riscv/riscv.c (predict.h): New include.
> (riscv_slow_unaligned_access_p): No longer static.
> (riscv_block_move_straight): Add require.
> config/riscv/riscv-protos.h (riscv_hard_regno_nregs): Delete.
> ---
>  gcc/config/riscv/riscv-protos.h | 1 -
>  gcc/config/riscv/riscv.c| 5 +++--
>  gcc/config/riscv/riscv.h| 4 +++-
>  3 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> index 34f9859928e2..5f65b20e792e 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -68,7 +68,6 @@ extern void riscv_expand_prologue (void);
>  extern void riscv_expand_epilogue (bool);
>  extern bool riscv_can_use_return_insn (void);
>  extern rtx riscv_function_value (const_tree, const_tree, enum machine_mode);
> -extern unsigned int riscv_hard_regno_nregs (int, enum machine_mode);
>  extern bool riscv_expand_block_move (rtx, rtx, rtx);
>
>  /* Routines implemented in riscv-c.c.  */
> diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
> index e9783e920ef6..279af909a694 100644
> --- a/gcc/config/riscv/riscv.c
> +++ b/gcc/config/riscv/riscv.c
> @@ -51,6 +51,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "df.h"
>  #include "diagnostic.h"
>  #include "builtins.h"
> +#include "predict.h"
>
>  /* True if X is an UNSPEC wrapper around a SYMBOL_REF or LABEL_REF.  */
>  #define UNSPEC_ADDRESS_P(X)  \
> @@ -217,7 +218,7 @@ struct riscv_cpu_info {
>  /* Global variables for machine-dependent things.  */
>
>  /* Whether unaligned accesses execute very slowly.  */
> -static bool riscv_slow_unaligned_access_p;
> +bool riscv_slow_unaligned_access_p;
>
>  /* Which tuning parameters to use.  */
>  static const struct riscv_tune_info *tune_info;
> @@ -2657,7 +2658,7 @@ riscv_block_move_straight (rtx dest, rtx src, 
> HOST_WIDE_INT length)
>bits = MAX (BITS_PER_UNIT,
> MIN (BITS_PER_WORD, MIN (MEM_ALIGN (src), MEM_ALIGN (dest;
>
> -  mode = mode_for_size (bits, MODE_INT, 0);
> +  mode = mode_for_size (bits, MODE_INT, 0).require ();
>delta = bits / BITS_PER_UNIT;
>
>/* Allocate a buffer for the temporary registers.  */
> diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
> index c0901a093033..91a9c33543d2 100644
> --- a/gcc/config/riscv/riscv.h
> +++ b/gcc/config/riscv/riscv.h
> @@ -824,7 +824,7 @@ while (0)
> case, movmem or libcall is more efficient.  */
>
>  #define MOVE_RATIO(speed)\
> -  (!STRICT_ALIGNMENT && riscv_slow_unaligned_access ? 1 :\
> +  (!STRICT_ALIGNMENT && riscv_slow_unaligned_access_p ? 1 :  \
> (speed) ? RISCV_MAX_MOVE_BYTES_PER_LOOP_ITER / UNITS_PER_WORD :   \
> CLEAR_RATIO (speed) / 2)
>
> @@ -841,6 +841,8 @@ while (0)
>
>  #ifndef USED_FOR_TARGET
>  extern const enum reg_class riscv_regno_to_class[];
> +extern bool riscv_slow_unaligned_access_p;
> +extern unsigned riscv_stack_boundary;
>  #endif
>
>  #define ASM_PREFERRED_EH_DATA_FORMAT(CODE,GLOBAL) \


RE: [PATCH 08/22] Add Intel CET support for EH in libgcc.

2017-11-08 Thread Tsimbalist, Igor V
> -Original Message-
> From: Jeff Law [mailto:l...@redhat.com]
> Sent: Wednesday, November 8, 2017 8:06 PM
> To: Tsimbalist, Igor V ; gcc-
> patc...@gcc.gnu.org
> Cc: i...@airs.com
> Subject: Re: [PATCH 08/22] Add Intel CET support for EH in libgcc.
> 
> On 11/04/2017 06:43 AM, Tsimbalist, Igor V wrote:
> >> -Original Message-
> >> From: Jeff Law [mailto:l...@redhat.com]
> >> Sent: Tuesday, October 31, 2017 5:49 AM
> >> To: Tsimbalist, Igor V ; gcc-
> >> patc...@gcc.gnu.org
> >> Cc: i...@airs.com
> >> Subject: Re: [PATCH 08/22] Add Intel CET support for EH in libgcc.
> >>
> >> On 10/12/2017 01:56 PM, Tsimbalist, Igor V wrote:
> >>> Control-flow Enforcement Technology (CET), published by Intel,
> Introduces
> >>> the Shadow Stack feature, which ensures a return from a function is
> done
> >>> to exactly the same location from where the function was called. When
> EH
> >>> is present the control-flow transfer may skip some stack frames and the
> >>> shadow stack has to be adjusted not to signal a violation of a
> >>> control-flow transfer. It's done by counting a number of skipping frames
> >>> and adjusting shadow stack pointer by this number.
> >>>
> >>> gcc/
> >>>   * config/i386/i386.c (ix86_expand_epilogue): Change simple
> >>>   return to indirect jump for EH return. Change explicit 'false'
> >>>   argument in pro_epilogue_adjust_stack with a value of
> >>>   flag_cf_protection.
> >>>   * config/i386/i386.md (simple_return_indirect_internal): Remove
> >>>   SImode restriction to support 64-bit.
> >>>
> >>> libgcc/
> >>>   * config/i386/linux-unwind.h: Include
> >>>   config/i386/shadow-stack-unwind.h.
> >>>   * config/i386/shadow-stack-unwind.h: New file.
> >>>   * unwind-dw2.c: (uw_install_context): Add a FRAMES argument and
> >>>   pass it to _Unwind_Frames_Extra.
> >>>   * unwind-generic.h (FRAMES_P_DECL): New.
> >>>   (FRAMES_VAR): Likewise.
> >>>   (FRAMES_VAR_P): Likewise.
> >>>   (FRAMES_VAR_DECL): Likewise.
> >>>   (FRAMES_VAR_DECL_1): Likewise.
> >>>   (FRAMES_VAR_INC): Likewise.
> >>>   (FRAMES_P_UPDATE): Likewise.
> >>>   (_Unwind_Frames_Extra): Likewise.
> >>>   * unwind.inc (_Unwind_RaiseException_Phase2): Use
> >> FRAMES_P_DECL,
> >>>   FRAMES_VAR_DECL_1, FRAMES_VAR_INC and FRAMES_P_UPDATE.
> >>>   (_Unwind_RaiseException): Use FRAMES_VAR_DECL,
> >> FRAMES_VAR_P and
> >>>   FRAMES_VAR.
> >>>   (_Unwind_ForcedUnwind_Phase2): Use FRAMES_P_DECL,
> >>>   FRAMES_VAR_DECL_1, FRAMES_VAR_INC, FRAMES_P_UPDATE.
> >>>   (_Unwind_ForcedUnwind): Use FRAMES_VAR_DECL,
> >> FRAMES_VAR_P and
> >>>   FRAMES_VAR.
> >>>   (_Unwind_Resume): Use FRAMES_VAR_DECL, FRAMES_VAR_P and
> >>>   FRAMES_VAR.
> >>>   (_Unwind_Resume_or_Rethrow): Use FRAMES_VAR_DECL,
> >> FRAMES_VAR_P
> >>>   and FRAMES_VAR.
> >>>
> >>> Igor
> >>>
> >>>
> >>>
> >>> 0008-Add-Intel-CET-support-for-EH-in-libgcc.patch
> >>>
> >>>
> >>> From 16eb1d0d9239e039fba28f1ae71762f19061b157 Mon Sep 17
> 00:00:00
> >> 2001
> >>> From: Igor Tsimbalist 
> >>> Date: Wed, 19 Jul 2017 03:04:46 +0300
> >>> Subject: [PATCH 08/22] Add Intel CET support for EH in libgcc.
> >>>
> >>> Control-flow Enforcement Technology (CET), published by Intel,
> >>> introduces
> >>> the Shadow Stack feature, which ensures a return from a function is
> done
> >>> to exactly the same location from where the function was called. When
> EH
> >>> is present the control-flow transfer may skip some stack frames and the
> >>> shadow stack has to be adjusted not to signal a violation of a
> >>> control-flow transfer. It's done by counting a number of skiping frames
> >>> and adjasting shadow stack pointer by this number.
> >>>
> >>> Having new semantic of the 'ret' instruction if CET is supported in HW
> >>> the 'ret' instruction cannot be generated in ix86_expand_epilogue when
> >>> we are returning after EH is processed. Added a code in
> >>> ix86_expand_epilogue to adjust Shadow Stack pointer and to generate
> an
> >>> indirect jump instead of 'ret'. As sp register is used during this
> >>> adjustment thus the argument in pro_epilogue_adjust_stack is changed
> >>> to update cfa_reg based on whether control-flow instrumentation is set.
> >>> Without updating the cfa_reg field there is an assert later in dwarf2
> >>> pass related to mismatch the stack register and cfa_reg value.
> >>>
> >>> gcc/
> >>>   * config/i386/i386.c (ix86_expand_epilogue): Change simple
> >>>   return to indirect jump for EH return. Change explicit 'false'
> >>>   argument in pro_epilogue_adjust_stack with a value of
> >>>   flag_cf_protection.
> >>>   * config/i386/i386.md (simple_return_indirect_internal): Remove
> >>>   SImode restriction to support 64-bit.
> >>>
> >>> libgcc/
> >>>   * config/i386/linux-unwind.h: Include
> >>>   config/i386/shadow-stack-unwind.h.
> >>>   * config/i386/shadow-stack-unwind.h: New file.
> >>>   * unwind-dw2.c: (uw_install_context): Add a FRAMES argument and
> >>>   pass it 

[PATCH, rs6000] Add __builtin_altivec_vsumsws

2017-11-08 Thread Carl Love


GCC Maintainers:

The following patch add support for the builtin:
  
  vector signed int __builtin_altivec_vsumsws_be (vector signed int, vector 
signed int)
 

The patch has been tested on

 powerpc64le-unknown-linux-gnu (Power 8 LE),   
 powerpc64le-unknown-linux-gnu (Power 9 LE)

without regressions.  

Please let me know if the following patch is acceptable.  Thanks.

   Carl Love
--

gcc/ChangeLog:

2017-11-08  Carl Love  

* config/rs6000/rs6000-c.c (ALTIVEC_BUILTIN_VEC_SUMS): Add macro 
expansion.
* config/rs6000/altivec.md (altivec_vsumsws_be): Add define_expand.
* config/rs6000/rs6000-builtin.def (VSUMSWS_BE): Add macro expansion.

gcc/testsuite/ChangeLog:

2017-11-08 Carl Love  

* gcc.target/powerpc/builtin-vec-sums-be-int.c: New test file.
---
 gcc/config/rs6000/altivec.md | 11 +++
 gcc/config/rs6000/rs6000-builtin.def |  1 +
 .../gcc.target/powerpc/builtin-vec-sums-be-int.c | 16 
 3 files changed, 28 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/builtin-vec-sums-be-int.c

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 651f6c9..706001c 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -1905,6 +1905,17 @@
   DONE;
 })
 
+(define_expand "altivec_vsumsws_be"
+  [(use (match_operand:V4SI 0 "register_operand"))
+   (use (match_operand:V4SI 1 "register_operand"))
+   (use (match_operand:V4SI 2 "register_operand"))]
+  "TARGET_ALTIVEC"
+{
+  emit_insn (gen_altivec_vsumsws_direct (operands[0], operands[1],
+operands[2]));
+  DONE;
+})
+
 ; FIXME: This can probably be expressed without an UNSPEC.
 (define_insn "altivec_vsumsws_direct"
   [(set (match_operand:V4SI 0 "register_operand" "=v")
diff --git a/gcc/config/rs6000/rs6000-builtin.def 
b/gcc/config/rs6000/rs6000-builtin.def
index 9dddc11..6addfdc 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -1079,6 +1079,7 @@ BU_ALTIVEC_2 (VSUM4SBS, "vsum4sbs",   CONST,  
altivec_vsum4sbs)
 BU_ALTIVEC_2 (VSUM4SHS,  "vsum4shs",   CONST,  
altivec_vsum4shs)
 BU_ALTIVEC_2 (VSUM2SWS,  "vsum2sws",   CONST,  
altivec_vsum2sws)
 BU_ALTIVEC_2 (VSUMSWS,   "vsumsws",CONST,  altivec_vsumsws)
+BU_ALTIVEC_2 (VSUMSWS_BE, "vsumsws_be",CONST,  altivec_vsumsws_be)
 BU_ALTIVEC_2 (VXOR,  "vxor",   CONST,  xorv4si3)
 BU_ALTIVEC_2 (COPYSIGN_V4SF,  "copysignfp",CONST,  vector_copysignv4sf3)
 
diff --git a/gcc/testsuite/gcc.target/powerpc/builtin-vec-sums-be-int.c 
b/gcc/testsuite/gcc.target/powerpc/builtin-vec-sums-be-int.c
new file mode 100644
index 000..b4dfd06
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/builtin-vec-sums-be-int.c
@@ -0,0 +1,16 @@
+/* Test for the __builtin_altivec_vsumsws_be() builtin.
+   It produces just the instruction vsumsws in LE and BE modes.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-options "-maltivec -O2" } */
+
+#include 
+
+vector signed int
+test_vec_sums (vector signed int vsi2, vector signed int vsi3)
+{
+  return  __builtin_altivec_vsumsws_be (vsi2, vsi3);
+}
+
+/* { dg-final { scan-assembler-times "vsumsws" 1 } } */
-- 
2.7.4





RE: [PATCH 21/22] Add extra field to gtm_jmpbuf on x86 only

2017-11-08 Thread Tsimbalist, Igor V


Igor


> -Original Message-
> From: H.J. Lu [mailto:hjl.to...@gmail.com]
> Sent: Wednesday, November 8, 2017 7:18 PM
> To: Tsimbalist, Igor V 
> Cc: Jeff Law ; gcc-patches@gcc.gnu.org;
> trie...@redhat.com
> Subject: Re: [PATCH 21/22] Add extra field to gtm_jmpbuf on x86 only
> 
> On Tue, Nov 7, 2017 at 8:22 AM, Tsimbalist, Igor V
>  wrote:
> > I decided to split my previous patch "Enable building libitm with Intel CET 
> > "
> > into two different patches. The first patch will add a new field to sjlj.S 
> > and
> > target.h  files. The second one will add Intel CET support on the top of the
> > first one. In this case the further changes for adding Intel CET support are
> > seen clearly.
> >
> > Ok for trunk?
> >
> 
> libitm/configure.tgt should check ${target} like the other places:
> 
> +# Update libtool_VERSION since the size of struct gtm_jmpbuf is
> +# changed for x86.
> +case "${host}" in
> 
> Did these come from cut and paste?
> 
> +  # For x86, we use slots in the TCB head for most of our TLS.
> +  # The setup of those slots in beginTransaction can afford to
> +  # use the global-dynamic model.
> 
> I think the whole thing should be:
> 
> case "${target}" in
>   # Update libtool_VERSION since the size of struct gtm_jmpbuf is
>   # changed for x86.
>   i[456]86-*-* | x86_64-*-*)
> libtool_VERSION=2:0:0
> ;;
> esac

There was a feedback from Joseph (email attached) with the comment about
similar case in cet.m4:

> This file is checking $target.  That's only ever appropriate in 
directories
> building compilers and similar tools; target library directories 
should check
> $host, as the host for target libraries is the target for the 
compiler.

Igor

> 
> 
> 
> --
> H.J.
--- Begin Message ---
> -Original Message-
> From: Joseph Myers [mailto:jos...@codesourcery.com]
> Sent: Thursday, October 12, 2017 10:36 PM
> To: Tsimbalist, Igor V 
> Cc: gcc-patches@gcc.gnu.org; Jeff Law ; i...@airs.com
> Subject: Re: [PATCH 07/22] Enable building libgcc with CET options.
>
> On Thu, 12 Oct 2017, Tsimbalist, Igor V wrote:
>
> > Enable building libgcc with CET options by default on Linux/x86 if
> > binutils supports CET v2.0.
> > It can be disabled with --disable-cet.  It is an error to configure
> > GCC with --enable-cet if bintuiils doesn't support CET v2.0.
> >
> > config/
> > * cet.m4: New file
>
> This file is checking $target.  That's only ever appropriate in directories
> building compilers and similar tools; target library directories should check
> $host, as the host for target libraries is the target for the compiler.

Fixed.

> This file has a comment
>
> > +dnl GCC_CET_LIBRARY
> > +dnl(SHELL-CODE_HANDLER)
>
> which doesn't seem to match the subsequent definition of GCC_CET_FLAGS.

Fixed.

> I don't see any documentation of the new configure option.  I'd expect the
> first patch adding such an option to document it in install.texi, and then
> subsequent patches to update that documentation if those patches extend
> the option to cover more things.

Added the description of this configure option to install.texi.

The updated patch is attached.

Igor

> --
> Joseph S. Myers
> jos...@codesourcery.com


0007-Enable-building-libgcc-with-CET-options.patch
Description: 0007-Enable-building-libgcc-with-CET-options.patch
--- End Message ---


[PATCH, committed] Fix PR fortran/82884

2017-11-08 Thread Steve Kargl
Once the bug is found, the fix is obvious.   Briefly,
in a typespec, ts.u.cl and ts.u.pad are in the same
union.  When parsing a Hollerith, ts.u.pad is is set
to a nonzero value.  Later, when resolving the array
constructor with Hollerith entities, a reference to
ts.u.cl is made which is a mangled non-NULL pointer.
The fix is to clear ts.u.pad in the hollerith to 
character conversion function.

Regression tested on x86_64-*-freebsd.

2017-11-08  Steven G. Kargl  

PR fortran/82884
* arith.c (gfc_hollerith2character): Clear pad.

2017-11-08  Steven G. Kargl  

PR fortran/82884
* gfortran.dg/hollerith_character_array_constructor.f90: New test.

Index: gcc/fortran/arith.c
===
--- gcc/fortran/arith.c (revision 254461)
+++ gcc/fortran/arith.c (working copy)
@@ -2604,6 +2604,7 @@ gfc_hollerith2character (gfc_expr *src, int kind)
   result = gfc_copy_expr (src);
   result->ts.type = BT_CHARACTER;
   result->ts.kind = kind;
+  result->ts.u.pad = 0;
 
   result->value.character.length = result->representation.length;
   result->value.character.string
Index: gcc/testsuite/gfortran.dg/hollerith_character_array_constructor.f90
===
--- gcc/testsuite/gfortran.dg/hollerith_character_array_constructor.f90 
(nonexistent)
+++ gcc/testsuite/gfortran.dg/hollerith_character_array_constructor.f90 
(working copy)
@@ -0,0 +1,11 @@
+! { dg-do run }
+! { dg-options "-w" }
+! PR fortran/82884
+! Original code contributed by Gerhard Steinmetz
+program p
+   character :: c(4) = [1h(, 1hi, 1h4, 1h)]
+   if (c(1) /= '(') call abort
+   if (c(2) /= 'i') call abort
+   if (c(3) /= '4') call abort
+   if (c(4) /= ')') call abort
+end

-- 
Steve


[PATCH 2/2] Reimplementation of param-manipulation

2017-11-08 Thread Martin Jambor
Hi,

this patch is a substantial rewrite of function parameter
manipulations currently used by IPA-SRA and OpenMP SIMD cloning.  I
started it with the aim to cleanup the code but also to make the data
structures as small as possible so that, for real IPA-SRA, I can
attach them to call graph nodes instead of args_to_skip bitmaps.

That is what the ipa_param_adjustments is intended to be, although it
is not attached to a cgraph node just yet.  This class is also capable
of modifying call-sites.  Additionally, there is a new class
ipa_param_body_adjustments for modifying signatures, types and bodies
of callees, which is not expected to live long and so code clarity was
emphasized over data structure compactness.

Furthermore, the interface is cleaner, the user simply describes what
new parameters should look like and does not need to care about
various data fields required to make the changes happen like today.
Also, some functionality like SSA name remapping is now shared among
the two users while now both implement it themselves.

Although I think the patch is a clear improvement, there may be
reasons to wait with committing.  one drawback is that
ipa_param_adjustments et. al. are now in GC memory, which will be
necessary once they are attached to call graph but now it is just
useless.  It is not a lot of memory though.

In any event, this patch also passes bootstrap and testing on
x86_64-linux.

Thanks,

Martin


2017-11-07  Martin Jambor  

* ipa-param-manipulation.h (IPA_PARAM_PREFIX_SYNTH): New.
(IPA_PARAM_PREFIX_ISRA): Likewise.
(IPA_PARAM_PREFIX_SIMD): Likewise.
(IPA_PARAM_PREFIX_MASK): Likewise.
(IPA_PARAM_MAX_INDEX_BITS): Likewise.
(ipa_parm_op): Changed element prefixes to IPA_PARAM_OP, renamed
IPA_PARM_OP_NONE to IPA_PARAM_OP_UNDEFINED, reordered elements.
(ipa_parm_adjustment): Removed.
(ipa_parm_adjustment_vec): Likewise.
(ipa_modify_formal_parameters): Removed declaration.
(ipa_modify_call_arguments): Likewise.
(ipa_combine_adjustments): Likewise.
(ipa_dump_param_adjustments): Likewise.
(ipa_modify_expr): Likewise.
(ipa_get_adjustment_candidate): Likewise.
(ipa_adjusted_param): Likewise.
(ipa_param_adjustments): Likewise.
(ipa_param_body_replacement): Likewise.
(ipa_param_body_adjustments): Likewise.
* Makefile.in (GTFILES): Added ipa-param-manipulation.h.
* ipa-param-manipulation.c: Included tree-eh.h and tree-cfg.h.
(ipa_modify_formal_parameters): Removed.
(ipa_modify_call_arguments): Likewise.
(index_in_adjustments_multiple_times_p): Likewise.
(ipa_combine_adjustments): Likewise.
(ipa_get_adjustment_candidate): Likewise.
(ipa_modify_expr): Likewise.
(ipa_dump_param_adjustments): Likewise.
(ipa_param_prefixes): New.
(ipa_param_adjustments::ipa_param_adjustments): Likewise.
(ipa_param_adjustments::init): Likewise.
(ipa_param_adjustments::modify_call_arguments): Likewise.
(modify_call_arguments): Likewise.
(ipa_param_op_names): Likewise.
(ipa_dump_adjusted_parameters): Likewise.
(ipa_param_body_adjustments::ipa_param_body_adjustments): Likewise.
(register_replacement): Likewise.
(modify_formal_parameters): Likewise.
(lookup_replacement): Likewise.
(get_expr_replacement): Likewise.
(get_replacement_ssa_base): Likewise.
(replace_removed_params_ssa_names): Likewise.
(modify_expr): Likewise.
(modify_assignment): Likewise.
(modify_cfun_body): Likewise.
(reset_debug_stmts): Likewise.
(perform_cfun_body_modifications): Likewise.
* omp-simd-clone.c (simd_clone_adjust_argument_types): Adjusted to
use new parameter manipulation structures.
(simd_clone_init_simd_arrays): Likewise.
(modify_stmt_info): Likewise.
(ipa_simd_modify_stmt_ops): Likewise.
(ipa_simd_modify_function_body): Likewise.
(simd_clone_adjust): Likewise.
* tree-sra.c (turn_representatives_into_adjustments): Likewise.
(analyze_all_param_acesses): Likewise.
(replace_removed_params_ssa_names): Removed.
(get_adjustment_for_base): Likewise.
(replace_removed_params_ssa_names): Likewise.
(sra_ipa_modify_assign): Likewise.
(ipa_sra_modify_function_body): Likewise.
(sra_ipa_reset_debug_stmts): Likewise.
(convert_callers_data): New type.
(convert_callers_for_node): Adjusted to use new parameter
manipulation structures.
(convert_callers): Likewise.
(modify_function): Likewise.  Flip order of modifying callers and
callees.
(ipa_early_sra): Adjusted to use new parameter manipulation
structures.

* testsuite/gcc.dg/ipa/ipa-sra-3.c: Modified dump scan.
---
 gcc/Makefile.in

[PATCH 1/2] Moving parameter manipulation into its own file

2017-11-08 Thread Martin Jambor
Hi,

the following patch moves all function and call parameter manipulation
(as opposed to analysis) data structures and functions from ipa-prop.h
and ipa-prop.c to new files ipa-param-manipulation.h and
ipa-param-manipulation.c respectively.  It does no functional change.

Please look at the followup patch if you'd like to see where I am
heading with this.  While I am willing to hold up the followup patch
for GCC 9, I would like to commit this now because it increases
modularity in an area where it is needed.

Bootstrapped and teste on x86_64-linux.  OK for trunk?

Martin


2017-08-23  Martin Jambor  

* ipa-param-manipulation.c: New file.
* ipa-param-manipulation.h: Likewise.
* Makefile.in (OBJS): Add ipa-param-manipulation.o.
(PLUGIN_HEADERS): Addded ipa-param-manipulation.h
* ipa-param.h (ipa_parm_op): Moved to ipa-param-manipulation.h.
(ipa_parm_adjustment): Likewise.
(ipa_parm_adjustment_vec): Likewise.
(ipa_get_vector_of_formal_parms): Moved declaration to
ipa-param-manipulation.h.
(ipa_get_vector_of_formal_parm_types): Likewise.
(ipa_modify_formal_parameters): Likewise.
(ipa_modify_call_arguments): Likewise.
(ipa_combine_adjustments): Likewise.
(ipa_dump_param_adjustments): Likewise.
(ipa_modify_expr): Likewise.
(ipa_get_adjustment_candidate): Likewise.
* ipa-prop.c (ipa_get_vector_of_formal_parms): Moved to
ipa-param-manipulation.c.
(ipa_get_vector_of_formal_parm_types): Likewise.
(ipa_modify_formal_parameters): Likewise.
(ipa_modify_call_arguments): Likewise.
(ipa_modify_expr): Likewise.
(get_ssa_base_param): Likewise.
(ipa_get_adjustment_candidate): Likewise.
(index_in_adjustments_multiple_times_p): Likewise.
(ipa_combine_adjustments): Likewise.
(ipa_dump_param_adjustments): Likewise.
* tree-sra.c: Also include ipa-param-manipulation.h
* omp-simd-clone.c: Include ipa-param-manipulation.h instead of
ipa-param.h.
---
 gcc/Makefile.in  |   3 +-
 gcc/ipa-param-manipulation.c | 767 +++
 gcc/ipa-param-manipulation.h | 120 +++
 gcc/ipa-prop.c   | 725 
 gcc/ipa-prop.h   |  94 --
 gcc/omp-simd-clone.c |   2 +-
 gcc/tree-sra.c   |   1 +
 7 files changed, 891 insertions(+), 821 deletions(-)
 create mode 100644 gcc/ipa-param-manipulation.c
 create mode 100644 gcc/ipa-param-manipulation.h

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 51968e4a66f..1bb1d6ec0ff 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1354,6 +1354,7 @@ OBJS = \
ipa-predicate.o \
ipa-profile.o \
ipa-prop.o \
+   ipa-param-manipulation.o \
ipa-pure-const.o \
ipa-icf.o \
ipa-icf-gimple.o \
@@ -3467,7 +3468,7 @@ PLUGIN_HEADERS = $(TREE_H) $(CONFIG_H) $(SYSTEM_H) 
coretypes.h $(TM_H) \
   $(C_COMMON_H) c-family/c-objc.h $(C_PRETTY_PRINT_H) \
   tree-iterator.h $(PLUGIN_H) $(TREE_SSA_H) langhooks.h incpath.h debug.h \
   $(EXCEPT_H) tree-ssa-sccvn.h real.h output.h $(IPA_UTILS_H) \
-  $(C_PRAGMA_H)  $(CPPLIB_H)  $(FUNCTION_H) \
+  ipa-param-manipulation.h $(C_PRAGMA_H)  $(CPPLIB_H)  $(FUNCTION_H) \
   cppdefault.h flags.h $(MD5_H) params.def params.h params-enum.h \
   prefix.h tree-inline.h $(GIMPLE_PRETTY_PRINT_H) realmpfr.h \
   $(IPA_PROP_H) $(TARGET_H) $(RTL_H) $(TM_P_H) $(CFGLOOP_H) $(EMIT_RTL_H) \
diff --git a/gcc/ipa-param-manipulation.c b/gcc/ipa-param-manipulation.c
new file mode 100644
index 000..bedd201f6dd
--- /dev/null
+++ b/gcc/ipa-param-manipulation.c
@@ -0,0 +1,767 @@
+/* Manipulation of formal and actual parameters of functions and function
+   calls.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "rtl.h"
+#include "tree.h"
+#include "gimple.h"
+#include "ssa.h"
+#include "cgraph.h"
+#include "fold-const.h"
+#include "stor-layout.h"
+#include "gimplify.h"
+#include "gimple-iterator.h"
+#include "gimplify-me.h"
+#include "tree-dfa.h"
+#include "ipa-param-manipulation.h"
+#include "print-tree.h"
+#include 

Re: [Patch, fortran] PR78619 - [6/7/8 Regression] ICE in copy_reference_ops_from_ref, at tree-ssa-sccvn.c:889

2017-11-08 Thread Steve Kargl
On Wed, Nov 08, 2017 at 07:51:54PM +, Paul Richard Thomas wrote:
> This regression arose from the patch for PR66465, in which the type
> check for the associated intrinsic was failing when testing the
> association of a procedure pointer component with a procedure pointer.
> See the comment in the patch for an explanation as to why this is an
> issue. The fix is to isolate the fix for PR66465 to calls from
> gfc_check_associated.
> 
> Bootstrapped and regtested on FC23/x86_64. OK for all three branches?
> 

Yes.  Thanks for the patch.

--   
Steve


Re: [PATCH] Set default to -fomit-frame-pointer

2017-11-08 Thread Andreas Schwab
On Nov 08 2017, Wilco Dijkstra  wrote:

> Joseph Myers wrote:
>> On Fri, 3 Nov 2017, Wilco Dijkstra wrote:
>>
>> > Almost all targets add an explict -fomit-frame-pointer in the target 
>> > specific
>> > options.  Rather than doing this in a target-specific way, do this in the
>>
>> Which targets do not?  You should explicitly list them and CC their 
>> maintainers and seek confirmation that such a change is appropriate for 
>> them.
>
> The targets that don't explicitly enable -fomit-frame-pointer in the target
> options or force it internally are bfin, ft32, h8300, m68k - I've CCd the
> maintainers (it seems there is no-one for h8300).

For m68k, adding -fomit-frame-pointer by default is ok.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [PATCH], Generate XXBR{H,W,D} for bswap{16,32,64} on PowerPC ISA 3.0 (power9)

2017-11-08 Thread Michael Meissner
On Wed, Nov 08, 2017 at 08:01:06AM -0600, Segher Boessenkool wrote:
> Hi Mike,
> 
> On Wed, Nov 08, 2017 at 08:14:31AM -0500, Michael Meissner wrote:
> > PowerPC ISA 3.0 does not have a byte-reverse instruction that operates on 
> > the
> > GPRs, but it does have vector byte swap half-word, word, double-word 
> > operations
> > in the VSX registers.  The enclosed patch enables generation of the byte
> > revseral instructions for register-register operations.  It still prefers to
> > generate the load with byte reverse (L{H,W,D}BRX) or store with byte reverse
> > (ST{H,W,D}BRX) instructions over the register sequence.
> > 
> > For 16-bit and 32-bit byte swaps, it typically does the tradational 
> > operation
> > in GPR registers, but it will generate XXBR{H,W} if the values are in vector
> > registers.
> 
> You can do the byteswaps with just VMX, too, with some rotate instructions
> (vrlh 8 for HI, vrlh 8 followed by vrlw 16 for SI, etc.)  For a future
> date, I suppose.
> 
> > For 64-bit swaps, it no longer generates the 9 instruction sequence in 
> > favor of
> > XXBRD.  I did some timing runs on a prototype power9 system, and it was
> > slightly faster to do direct move to the vecter unit, XXBRD, and direct move
> > back to a GPR than the traditional sequence.
> > 
> > I did bootstraps on little endian Power8 and Power9 systems (with the 
> > default
> > cpu set to power8 and power9 respectively).  There were no regressions.  
> > Can I
> > check this patch into the trunk?
> > 
> > [gcc]
> > 2017-11-08  Michael Meissner  
> > 
> > * config/rs6000/rs6000.md (bswaphi2_reg): On ISA 3.0 systems,
> > enable generating XXBR{H,W} if the value is in a vector
> > register.
> 
> Join the last two lines?  And you only mean XXBRH here.
> 
> > (bswapsi2_reg): Likewise.
> 
> And XXBRW here.

No, I meant the comment for both insns, but I will separate them.

> > (bswapdi2_reg): On ISA 3.0 systems, use XXBRD to do bswap64
> > instead of doing the GPR sequence used on previoius machines.
> 
> Typo ("previous").

Ok.

> > (bswapdi2_xxbrd): Likewise.
> 
> Not "Likewise"; just "New define_insn."

Ok.

> Should this somehow be joined with p9_xxbrd_?  Or maybe you should
> generate that, instead.

No, since p9_xxbrd_ doesn't include DImode.  We could add yet another
mode iterator to include DI + V2DI/V2DF modes, but that seems to be overkill.

> > (bswapdi2_reg splitters): Use int_reg_operand instead of
> > gpc_reg_operand to not match when XXBRD is generated.
> 
> Please see if you can merge the define_splits to their corresponding
> define_insns (making define_insn_and_split).  Shorter, no mismatch
> possible between the two anymore, and you get a name for the patterns
> too :-)

No, there are two different patterns that generate the register to register
bswap64.  One is the fall through path on bswapdi2 if the current target is not
64-bit or is 64-bit and doesn't support LDBRX.  The second is from bswapdi2_reg
doing the split.

I simplified it to only change the one insn that would normally match the
register to register case to skip if ISA 3.0.  I put a test on bswapdi2_reg as
well.

> > @@ -2507,6 +2513,8 @@ (define_expand "bswapdi2"
> > emit_insn (gen_bswapdi2_load (dest, src));
> >else if (MEM_P (dest))
> > emit_insn (gen_bswapdi2_store (dest, src));
> > +  else if (TARGET_P9_VECTOR)
> > +   emit_insn (gen_bswapdi2_xxbrd (dest, src));
> >else
> > emit_insn (gen_bswapdi2_reg (dest, src));
> >DONE;
> 
> Pity that you cannot easily do similar for HI and SI.

Not really.  Bswap64 generates 9 separate insns, while bswap32 and bswap16 only
generate 3 insns.  So, having to add the move direct to/from the vector
registers would mean it would be slower than the normal case that is currently
generated.  But if the value happens to be in a VSX register, then we can do it
in one instruction.

Of course if we had a bswap insn in GPR registers, that would simplify things.
But we don't currently.

> > @@ -2544,7 +2559,8 @@ (define_insn "bswapdi2_reg"
> > (clobber (match_scratch:DI 3 "="))]
> >"TARGET_POWERPC64 && TARGET_LDBRX"
> >"#"
> > -  [(set_attr "length" "36")])
> > +  [(set_attr "length" "36")
> > +   (set_attr "type" "*")])
> 
> Explicitly setting "type" (or any other attr) to the default value is
> pretty useless; just leave it out.

Yep.  I missed that in simplifying the patch.  The first version, kept in the
GPR bswap patterns, and it needed to set the type vecperm on the VSX case.
After doing some testing, I yanked out the GPR support on ISA 3.0, but didn't
delete that line.

Here is the latest patch (tested and no regressions).  Can I check this into
the trunk?

After a burn-in period, I plan to backport a reduced version of the patch
(XXBRD only) to gcc 7, unless you think it shouldn't go into gcc 7.

[gcc]
2017-11-09  Michael Meissner  

* config/rs6000/rs6000.md 

[Patch, fortran] PR78619 - [6/7/8 Regression] ICE in copy_reference_ops_from_ref, at tree-ssa-sccvn.c:889

2017-11-08 Thread Paul Richard Thomas
This regression arose from the patch for PR66465, in which the type
check for the associated intrinsic was failing when testing the
association of a procedure pointer component with a procedure pointer.
See the comment in the patch for an explanation as to why this is an
issue. The fix is to isolate the fix for PR66465 to calls from
gfc_check_associated.

Bootstrapped and regtested on FC23/x86_64. OK for all three branches?

Cheers

Paul

2017-11-08  Paul Thomas  

PR fortran/78619
* check.c (same_type_check): Introduce a new argument 'assoc'
with default value false. If this is true, use the symbol type
spec of BT_PROCEDURE expressions.
(gfc_check_associated): Set 'assoc' true in the call to
'same_type_check'.

2017-11-08  Paul Thomas  

PR fortran/78619
* gfortran.dg/pr78619.f90: New test.
Index: gcc/fortran/check.c
===
*** gcc/fortran/check.c (revision 254440)
--- gcc/fortran/check.c (working copy)
*** less_than_bitsize2 (const char *arg1, gf
*** 427,441 
  /* Make sure two expressions have the same type.  */
  
  static bool
! same_type_check (gfc_expr *e, int n, gfc_expr *f, int m)
  {
gfc_typespec *ets = >ts;
gfc_typespec *fts = >ts;
  
!   if (e->ts.type == BT_PROCEDURE && e->symtree->n.sym)
! ets = >symtree->n.sym->ts;
!   if (f->ts.type == BT_PROCEDURE && f->symtree->n.sym)
! fts = >symtree->n.sym->ts;
  
if (gfc_compare_types (ets, fts))
  return true;
--- 427,448 
  /* Make sure two expressions have the same type.  */
  
  static bool
! same_type_check (gfc_expr *e, int n, gfc_expr *f, int m, bool assoc = false)
  {
gfc_typespec *ets = >ts;
gfc_typespec *fts = >ts;
  
!   if (assoc)
! {
!   /* Procedure pointer component expressions have the type of the 
interface
!procedure. If they are being tested for association with a procedure
!pointer (ie. not a component), the type of the procedure must be
!determined.  */
!   if (e->ts.type == BT_PROCEDURE && e->symtree->n.sym)
!   ets = >symtree->n.sym->ts;
!   if (f->ts.type == BT_PROCEDURE && f->symtree->n.sym)
!   fts = >symtree->n.sym->ts;
! }
  
if (gfc_compare_types (ets, fts))
  return true;
*** gfc_check_associated (gfc_expr *pointer,
*** 1002,1008 
  }
  
t = true;
!   if (!same_type_check (pointer, 0, target, 1))
  t = false;
if (!rank_check (target, 0, pointer->rank))
  t = false;
--- 1009,1015 
  }
  
t = true;
!   if (!same_type_check (pointer, 0, target, 1, true))
  t = false;
if (!rank_check (target, 0, pointer->rank))
  t = false;
Index: gcc/testsuite/gfortran.dg/pr78619.f90
===
*** gcc/testsuite/gfortran.dg/pr78619.f90   (nonexistent)
--- gcc/testsuite/gfortran.dg/pr78619.f90   (working copy)
***
*** 0 
--- 1,21 
+ ! { dg-do compile }
+ ! { dg-options "-Werror -O3" }
+ !
+ ! Tests the fix for PR78619, in which the recursive use of 'f' at line 13
+ ! caused an ICE.
+ !
+ ! Contributed by Gerhard Steinmetz  
+ !
+   print *, g(1.0) ! 'g' is OK
+ contains
+   function f(x) result(z)
+ real :: x, z
+ z = sign(1.0, f) ! { dg-error "calling itself recursively|must be the 
same type" }
+   end
+   real function g(x)
+ real :: x
+ g = -1
+ g = -sign(1.0, g) ! This is OK.
+   end
+ end
+ ! { dg-message "all warnings being treated as errors" "" { target *-*-* } 0 }


Re: [PATCH] Add option to force indirect calls for x86

2017-11-08 Thread Uros Bizjak
> gcc/:
> 2017-11-08  Andi Kleen  
>
> * config/i386/i386.opt: Add -mforce-indirect-call.
> * config/i386/predicates.md: Check for flag_force_indirect_call.
> * doc/invoke.texi: Document -mforce-indirect-call
>
> gcc/testsuite/:
> 2017-11-08  Andi Kleen  
>
> * gcc.target/i386/force-indirect-call-1.c: New test.
> * gcc.target/i386/force-indirect-call-2.c: New test.
> * gcc.target/i386/force-indirect-call-3.c: New test.

LGTM.

Thanks,
Uros.


Re: [x86][patch] Add -march=cannonlake.

2017-11-08 Thread Uros Bizjak
On Wed, Nov 8, 2017 at 9:02 AM, Koval, Julia  wrote:
> Attachment got lost.
>
>> -Original Message-
>> From: Koval, Julia
>> Sent: Wednesday, November 08, 2017 9:01 AM
>> To: 'GCC Patches' 
>> Cc: 'Uros Bizjak' ; 'Kirill Yukhin' 
>> 
>> Subject: RE: [x86][patch] Add -march=cannonlake.
>>
>> Hi, this patch adds new option -march=cannonlake. Isasets defined in:
>> https://software.intel.com/sites/default/files/managed/c5/15/architecture-
>> instruction-set-extensions-programming-reference.pdf
>>
>> Ok for trunk?
>>
>> gcc/
>>   * config.gcc: Add -march=cannonlake.
>>   * config/i386/driver-i386.c (host_detect_local_cpu): Detect cannonlake.
>>   * config/i386/i386-c.c (ix86_target_macros_internal): Handle
>> cannonlake.
>>   * config/i386/i386.c (processor_costs): Add m_CANNONLAKE.
>>   (PTA_CANNONLAKE): New.
>>   (processor_target_table): Add cannonlake.
>>   (ix86_option_override_internal): Ditto.
>>   (fold_builtin_cpu): Ditto.
>>   (get_builtin_code_for_version): Handle cannonlake.
>>   (M_INTEL_CANNONLAKE): New.
>>   * config/i386/i386.h (TARGET_CANNONLAKE,
>> PROCESSOR_CANNONLAKE): New.
>>   * doc/invoke.texi: Add -march=cannonlake.
>> gcc/testsuite/
>>   * gcc.target/i386/funcspec-56.inc: Handle new march.

--- a/gcc/config/i386/driver-i386.c
+++ b/gcc/config/i386/driver-i386.c
@@ -803,9 +803,11 @@ const char *host_detect_local_cpu (int argc,
const char **argv)
 default:
   if (arch)
 {
+  if (has_avx512vbmi)
+cpu = "cannonlake";
   /* This is unknown family 0x6 CPU.  */
   /* Assume Knights Landing.  */
-  if (has_avx512f)
+  else if (has_avx512f)
 cpu = "knl";
   /* Assume Knights Mill */
   else if (has_avx5124vnniw)

You should add correct model numbers under . The above is for the unknown case (which should
not happen), and it should read (note that "knl" is already
misplaced):

default:
  /* This is unknown family 0x6 CPU.  */
  if (arch)
{
  /* Assume Cannonlake.  */
  if (has_avx512vbmi)
cpu = "cannonlake";
  /* Assume Knights Mill */
  else if (has_avx5124vnniw)
cpu = "knm";
  /* Assume Skylake.  */
  else if (has_clflushopt)
cpu = "skylake";
  /* Assume Knights Landing.  */
  else if (has_avx512f)
cpu = "knl";
  /* Assume Broadwell.  */
  else if (has_adx)
...


@@ -31832,7 +31839,8 @@ fold_builtin_cpu (tree fndecl, tree *args)
 M_INTEL_COREI7_HASWELL,
 M_INTEL_COREI7_BROADWELL,
 M_INTEL_COREI7_SKYLAKE,
-M_INTEL_COREI7_SKYLAKE_AVX512
+M_INTEL_COREI7_SKYLAKE_AVX512,
+M_INTEL_CANNONLAKE
   };

Please also update libgcc/config/i386/cpuinfo.h, enum processor_features.

diff --git a/gcc/testsuite/gcc.target/i386/funcspec-56.inc
b/gcc/testsuite/gcc.target/i386/funcspec-56.inc
index 9ae74cb..ed0748b 100644
--- a/gcc/testsuite/gcc.target/i386/funcspec-56.inc
+++ b/gcc/testsuite/gcc.target/i386/funcspec-56.inc
@@ -144,6 +144,7 @@ extern void test_arch_core_avx2 (void)
__attribute__((__target__("arch=core-avx
 extern void test_arch_knl (void)
__attribute__((__target__("arch=knl")));
 extern void test_arch_knm (void)
__attribute__((__target__("arch=knm")));
 extern void test_arch_skylake_avx512 (void)
__attribute__((__target__("arch=skylake-avx512")));
+extern void test_arch_cannonlake (void)
__attribute__((__target__("arch=cannonlake")));
 extern void test_arch_k8 (void)
__attribute__((__target__("arch=k8")));
 extern void test_arch_k8_sse3 (void)
__attribute__((__target__("arch=k8-sse3")));
 extern void test_arch_opteron (void)
__attribute__((__target__("arch=opteron")));

Please also add new architecture to multiversioning testcases, see
gcc/testsuite/g++.dg/ext/mv*.C

Uros.


Re: [5/10] Add vect_perm3_* target selectors

2017-11-08 Thread Jeff Law
On 11/03/2017 10:19 AM, Richard Sandiford wrote:
> SLP load permutation fails if any individual permutation requires more
> than two vector inputs.  For 128-bit vectors, it's possible to permute
> 3 contiguous loads of 32-bit and 8-bit elements, but not 16-bit elements
> or 64-bit elements.  The results are reversed for 256-bit vectors,
> and so on for wider vectors.
> 
> This patch adds a routine that tests whether a permute will require
> three vectors for a given vector count and element size, then adds
> vect_perm3_* target selectors for the cases that we currently use.
> 
> 
> 2017-11-03  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * doc/sourcebuild.texi (vect_perm_short, vect_perm_byte): Document
>   previously undocumented selectors.
>   (vect_perm3_byte, vect_perm3_short, vect_perm3_int): Document.
> 
> gcc/testsuite/
>   * lib/target-supports.exp (vect_perm_supported): New proc.
>   (check_effective_target_vect_perm3_int): Likewise.
>   (check_effective_target_vect_perm3_short): Likewise.
>   (check_effective_target_vect_perm3_byte): Likewise.
>   * gcc.dg/vect/slp-perm-1.c: Expect SLP load permutation to
>   succeed if vect_perm3_int.
>   * gcc.dg/vect/slp-perm-5.c: Likewise.
>   * gcc.dg/vect/slp-perm-6.c: Likewise.
>   * gcc.dg/vect/slp-perm-7.c: Likewise.
>   * gcc.dg/vect/slp-perm-8.c: Likewise vect_perm3_byte.
>   * gcc.dg/vect/slp-perm-9.c: Likewise vect_perm3_short.
>   Use vect_perm_short instead of vect_perm.  Add a scan-tree-dump-not
>   test for vect_perm3_short targets.
Going to take your word on the correctness of vect_perm_supported. :-)

OK for the trunk.

jeff


[GCC-6.4][ARM][PATCH v2] enable FL_LPAE flag for armv7ve cores

2017-11-08 Thread Andre McCurdy
The following commit added the FL_LPAE flag to FL_FOR_ARCH7VE, but
neglected to also add it to the armv7ve compatible cores defined in
arm-cores.def.

  
https://github.com/gcc-mirror/gcc/commit/af2d9b9e58e8be576c53d94f30c48c68146b0c98

The result is that gcc 6.4 now refuses to allow -march=armv7ve and
-mcpu=XXX to be used together, even when -mcpu is set to an armv7ve
compatible core:

  arm-linux-gnueabi-gcc -march=armv7ve -mcpu=cortex-a7 -Werror ...
  error: switch -mcpu=cortex-a7 conflicts with -march=armv7ve switch [-Werror]

This is a regression relative to gcc 6.3.

Fix by defining flags for armv7ve compatible cores directly from
FL_FOR_ARCH7VE, rather than re-creating the armv7ve flags
independently by combining FL_FOR_ARCH7A with the armv7ve specific
FL_THUMB_DIV and FL_ARM_DIV flags.

Signed-off-by: Andre McCurdy 
---
 gcc/ChangeLog|  5 +
 gcc/config/arm/arm-cores.def | 12 ++--
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 8c980ab..7be9e67 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2017-11-08  Andre McCurdy  
+
+   * config/arm/arm-cores.def: Fix missing FL_LPAE flag for armv7ve
+   compatible cores.
+
 2017-10-31  Uros Bizjak  
 
PR target/82772
diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index 829b839..ca37e6f 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -145,12 +145,12 @@ 
ARM_CORE("cortex-m0plus.small-multiply",cortexm0plussmallmultiply, cortexm0plus,
 /* V7 Architecture Processors */
 ARM_CORE("generic-armv7-a",genericv7a, genericv7a, 7A, 
ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_FOR_ARCH7A), cortex)
 ARM_CORE("cortex-a5",  cortexa5, cortexa5, 7A, 
ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_FOR_ARCH7A), cortex_a5)
-ARM_CORE("cortex-a7",  cortexa7, cortexa7, 7A, 
ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV | FL_FOR_ARCH7A), 
cortex_a7)
+ARM_CORE("cortex-a7",  cortexa7, cortexa7, 7A, 
ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_FOR_ARCH7VE), cortex_a7)
 ARM_CORE("cortex-a8",  cortexa8, cortexa8, 7A, 
ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_FOR_ARCH7A), cortex_a8)
 ARM_CORE("cortex-a9",  cortexa9, cortexa9, 7A, 
ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_FOR_ARCH7A), cortex_a9)
-ARM_CORE("cortex-a12", cortexa12, cortexa17,   7A, 
ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV | FL_FOR_ARCH7A), 
cortex_a12)
-ARM_CORE("cortex-a15", cortexa15, cortexa15,   7A, 
ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV | FL_FOR_ARCH7A), 
cortex_a15)
-ARM_CORE("cortex-a17", cortexa17, cortexa17,   7A, 
ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV | FL_FOR_ARCH7A), 
cortex_a12)
+ARM_CORE("cortex-a12", cortexa12, cortexa17,   7A, 
ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_FOR_ARCH7VE), cortex_a12)
+ARM_CORE("cortex-a15", cortexa15, cortexa15,   7A, 
ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_FOR_ARCH7VE), cortex_a15)
+ARM_CORE("cortex-a17", cortexa17, cortexa17,   7A, 
ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_FOR_ARCH7VE), cortex_a12)
 ARM_CORE("cortex-r4",  cortexr4, cortexr4, 7R, 
ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_FOR_ARCH7R), cortex)
 ARM_CORE("cortex-r4f", cortexr4f, cortexr4f,   7R, 
ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_FOR_ARCH7R), cortex)
 ARM_CORE("cortex-r5",  cortexr5, cortexr5, 7R, 
ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_ARM_DIV | FL_FOR_ARCH7R), cortex)
@@ -162,8 +162,8 @@ ARM_CORE("cortex-m3",   cortexm3, cortexm3, 
7M, ARM_FSET_MAKE_CPU1 (FL_LDSCHED |
 ARM_CORE("marvell-pj4",marvell_pj4, marvell_pj4,   7A, 
ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_FOR_ARCH7A), marvell_pj4)
 
 /* V7 big.LITTLE implementations */
-ARM_CORE("cortex-a15.cortex-a7", cortexa15cortexa7, cortexa7,  7A, 
ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV | FL_FOR_ARCH7A), 
cortex_a15)
-ARM_CORE("cortex-a17.cortex-a7", cortexa17cortexa7, cortexa7,  7A, 
ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV | FL_FOR_ARCH7A), 
cortex_a12)
+ARM_CORE("cortex-a15.cortex-a7", cortexa15cortexa7, cortexa7,  7A, 
ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_FOR_ARCH7VE), cortex_a15)
+ARM_CORE("cortex-a17.cortex-a7", cortexa17cortexa7, cortexa7,  7A, 
ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_FOR_ARCH7VE), cortex_a12)
 
 /* V8 Architecture Processors */
 ARM_CORE("cortex-a32", cortexa32, cortexa53,   8A, ARM_FSET_MAKE_CPU1 
(FL_LDSCHED | FL_CRC32 | FL_FOR_ARCH8A), cortex_a35)
-- 
1.9.1



Re: [GCC-6.4][ARM][PATCH] enable FL_LPAE flag for armv7ve cores

2017-11-08 Thread Andre McCurdy
On Wed, Nov 8, 2017 at 2:03 AM, Kyrill  Tkachov
 wrote:
> Hi Andre,
>
> On 08/11/17 05:12, Andre McCurdy wrote:
>>
>> The following commit added the FL_LPAE flag to FL_FOR_ARCH7VE, but
>> neglected to also add it to the armv7ve compatible cores defined in
>> arm-cores.def.
>>
>> https://github.com/gcc-mirror/gcc/commit/af2d9b9e58e8be576c53d94f30c48c68146b0c98
>>
>> The result is that gcc 6.4 now refuses to allow -march=armv7ve and
>> -mcpu=XXX to be used together, even when -mcpu is set to an armv7ve
>> compatible core:
>>
>>   arm-linux-gnueabi-gcc -march=armv7ve -mcpu=cortex-a7 -Werror ...
>>   error: switch -mcpu=cortex-a7 conflicts with -march=armv7ve switch
>> [-Werror]
>>
>> This is a regression relative to gcc 6.3.
>>
>> Fix by defining flags for armv7ve compatible cores directly from
>> FL_FOR_ARCH7VE, rather than re-creating the armv7ve flags
>> independently by combining FL_FOR_ARCH7A with the armv7ve specific
>> FL_THUMB_DIV and FL_ARM_DIV flags.
>>
>
> Thank you for the patch. The change looks reasonable to me.
> The way CPUs are defined internally was changed for GCC 7
> so later branches won't have this problem.
>
> How has this patch been tested?

I've tested to confirm that the error when running the above test
command (ie combining -march=armv7ve and -mcpu=cortex-a7) goes away
and I used the patched gcc to build glibc. I didn't try to run any
test suite, etc.

I didn't try to confirm that code which uses LPAE can now be
successfully built when only -mcpu=XXX is specified.

> This should be ready for committing with a proper
> ChangeLog entry [1].

I'll send a v2 patch with a ChangeLog entry.

> Thanks,
> Kyrill
>
> [1] https://gcc.gnu.org/codingconventions.html#ChangeLogs or have a look in
> the ChangeLog files in the gcc/ directory for examples
>


Re: [4/10] Don't assume vect_multiple_sizes means 2 sizes

2017-11-08 Thread Jeff Law
On 11/03/2017 10:18 AM, Richard Sandiford wrote:
> Some tests assumed that there would only be 2 vector sizes if
> vect_multiple_sizes, whereas for SVE there are three (SVE, 128-bit
> and 64-bit).  This patch replaces scan-tree-dump-times with
> scan-tree-dump for vect_multiple_sizes but keeps it for
> !vect_multiple_sizes.
> 
> 
> 2017-11-03  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/testsuite/
>   * gcc.dg/vect/no-vfa-vect-101.c: Use scan-tree-dump rather than
>   scan-tree-dump-times for vect_multiple_sizes.
>   * gcc.dg/vect/no-vfa-vect-102.c: Likewise.
>   * gcc.dg/vect/no-vfa-vect-102a.c: Likewise.
>   * gcc.dg/vect/no-vfa-vect-37.c: Likewise.
>   * gcc.dg/vect/no-vfa-vect-79.c: Likewise.
>   * gcc.dg/vect/vect-104.c: Likewise.
OK.
jeff


Re: [3/10] Add available_vector_sizes to target-supports.exp

2017-11-08 Thread Jeff Law
On 11/03/2017 10:18 AM, Richard Sandiford wrote:
> This patch adds a routine that lists the available vector sizes
> for a target and uses it for some existing target conditions.
> Later patches add more uses.
> 
> The cases are taken from multiple_sizes.
> 
> 
> 2017-11-03  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/testsuite/
>   * lib/target-supports.exp (available_vector_sizes): New proc.
>   (check_effective_target_vect_multiple_sizes): Use it.
>   (check_effective_target_vect64): Likewise.
>   (check_effective_target_vect_sizes_32B_16B): Likewise.
OK.
jeff


Re: [2/10] Add VECTOR_BITS to tree-vect.h

2017-11-08 Thread Jeff Law
On 11/03/2017 10:17 AM, Richard Sandiford wrote:
> Several vector tests are sensitive to the vector size.  This patch adds
> a VECTOR_BITS macro to tree-vect.h to select the expected vector size
> and uses it to influence iteration counts and array sizes.  The tests
> keep the original values if the vector size is small enough.
> 
> For now VECTOR_BITS is always 128, but the SVE patches add other values.
> 
> 
> 2017-11-03  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/testsuite/
>   * gcc.dg/vect/tree-vect.h (VECTOR_BITS): Define.
>   * gcc.dg/vect/bb-slp-pr69907.c: Include tree-vect.h.
>   (N): New macro.
>   (foo): Use it instead of hard-coded 320.
>   * gcc.dg/vect/no-scevccp-outer-7.c (N): Redefine if the default
>   value is too small for VECTOR_BITS.
>   * gcc.dg/vect/no-scevccp-vect-iv-3.c (N): Likewise.
>   * gcc.dg/vect/no-section-anchors-vect-31.c (N): Likewise.
>   * gcc.dg/vect/no-section-anchors-vect-36.c (N): Likewise.
>   * gcc.dg/vect/slp-perm-9.c (N): Likewise.
>   * gcc.dg/vect/vect-32.c (N): Likewise.
>   * gcc.dg/vect/vect-75.c (N, OFF): Likewise.
>   * gcc.dg/vect/vect-77-alignchecks.c (N, OFF): Likewise.
>   * gcc.dg/vect/vect-78-alignchecks.c (N, OFF): Likewise.
>   * gcc.dg/vect/vect-89.c (N): Likewise.
>   * gcc.dg/vect/vect-96.c (N): Likewise.
>   * gcc.dg/vect/vect-multitypes-3.c (N): Likewise.
>   * gcc.dg/vect/vect-multitypes-6.c (N): Likewise.
>   * gcc.dg/vect/vect-over-widen-1.c (N): Likewise.
>   * gcc.dg/vect/vect-over-widen-4.c (N): Likewise.
>   * gcc.dg/vect/vect-reduc-pattern-1a.c (N): Likewise.
>   * gcc.dg/vect/vect-reduc-pattern-1b.c (N): Likewise.
>   * gcc.dg/vect/vect-reduc-pattern-2a.c (N): Likewise.
>   * gcc.dg/vect/no-section-anchors-vect-64.c (NINTS): New macro.
>   (N): Redefine in terms of NINTS.
>   (ia, ib, ic): Use NINTS instead of hard-coded constants in the
>   array bounds.
>   * gcc.dg/vect/no-section-anchors-vect-69.c (NINTS): New macro.
>   (N): Redefine in terms of NINTS.
>   (test1): Replace a and b fields with NINTS - 2 ints of padding.
>   (main1): Use NINTS instead of hard-coded constants.
>   * gcc.dg/vect/section-anchors-vect-69.c (NINTS): New macro.
>   (N): Redefine in terms of NINTS.
>   (test1): Replace a and b fields with NINTS - 2 ints of padding.
>   (test2): Remove incorrect comments about alignment.
>   (main1): Use NINTS instead of hard-coded constants.
>   * gcc.dg/vect/pr45752.c (N): Redefine if the default value is
>   too small for VECTOR_BITS.
>   (main): Continue to use canned results for the default value of N,
>   but compute the expected results from scratch for other values.
>   * gcc.dg/vect/slp-perm-1.c (N, main): As for pr45752.c.
>   * gcc.dg/vect/slp-perm-4.c (N, main): Likewise.
>   * gcc.dg/vect/slp-perm-5.c (N, main): Likewise.
>   * gcc.dg/vect/slp-perm-6.c (N, main): Likewise.
>   * gcc.dg/vect/slp-perm-7.c (N, main): Likewise.
>   * gcc.dg/vect/pr65518.c (NINTS, N, RESULT): New macros.
>   (giga): Use NINTS as the array bound.
>   (main): Use NINTS, N and RESULT.
>   * gcc.dg/vect/pr65947-5.c (N): Redefine if the default value is
>   too small for VECTOR_BITS.
>   (main): Fill in any remaining elements of A programmatically.
>   * gcc.dg/vect/pr81136.c: Include tree-vect.h.
>   (a): Use VECTOR_BITS to set the alignment of the target structure.
>   * gcc.dg/vect/slp-19c.c (N): Redefine if the default value is
>   too small for VECTOR_BITS.
>   (main1): Continue to use the canned input for the default value of N,
>   but compute the input from scratch for other values.
>   * gcc.dg/vect/slp-28.c (N): Redefine if the default value is
>   too small for VECTOR_BITS.
>   (in1, in2, in3): Remove initialization.
>   (check1, check2): Delete.
>   (main1): Initialize in1, in2 and in3 here.  Check every element
>   of the vectors and compute the expected values directly instead
>   of using an array.
>   * gcc.dg/vect/slp-perm-8.c (N): Redefine if the default value is
>   too small for VECTOR_BITS.
>   (foo, main): Change type of "i" to int.
>   * gcc.dg/vect/vect-103.c (NINTS): New macro.
>   (N): Redefine in terms of N.
>   (c): Delete.
>   (main1): Use NINTS.  Check the result from a and b directly.
>   * gcc.dg/vect/vect-67.c (NINTS): New macro.
>   (N): Redefine in terms of N.
>   (main1): Use NINTS for the inner array bounds.
>   * gcc.dg/vect/vect-70.c (NINTS, OUTERN): New macros.
>   (N): Redefine in terms of NINTS.
>   (s): Keep the outer dimensions as 4 even if N is larger than 24.
>   (tmp1): New variable.
>   (main1): Only define a local tmp1 if NINTS is relatively small.
>   Use 

Re: [PATCH 08/22] Add Intel CET support for EH in libgcc.

2017-11-08 Thread H.J. Lu
>>
>>> Has this been tested anywhere other than x86/x86_64 linux?
>> Yes, I tested it on arm64 system. I applied 2 patches, previous 07/22 and 
>> this one 08/22. Everything
>> was built successfully. Further to the building I did testing also. No new 
>> fails.
> So how does that reconcile with H-P's note about the calls to
> uw_install_context when FRAMES_VAR is defined as nothinng?
>
> +  uw_install_context (_context, _context, FRAMES_VAR);
>
> Doesn't that create a syntax error when FRAMES_VAR is defined, but with
> no content?
>

From:

https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01832.html



It is done on purpose.   There are

/* Install TARGET into CURRENT so that we can return to it.  This is a
   macro because __builtin_eh_return must be invoked in the context of
   our caller.  */

#define uw_install_context(CURRENT, TARGET, FRAMES) \
  do\
{   \
  long offset = uw_install_context_1 ((CURRENT), (TARGET)); \
  void *handler = uw_frob_return_addr ((CURRENT), (TARGET));\
  _Unwind_DebugHook ((TARGET)->cfa, handler);   \
  _Unwind_Frames_Extra (FRAMES);\
  __builtin_eh_return (offset, handler);\
}   \
  while (0)

For non-CET targets, we have

#define FRAMES_VAR

[hjl@gnu-6 tmp]$ cat bar.c
#define FRAMES_VAR

#define foo(x, FRAMES) _Unwind_Frames_Extra (FRAMES)

foo (xxx, FRAMES_VAR);
[hjl@gnu-6 tmp]$ gcc -E bar.c
# 1 "bar.c"
# 1 ""
# 1 ""
# 31 ""
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 32 "" 2
# 1 "bar.c"




_Unwind_Frames_Extra ();
[hjl@gnu-6 tmp]$


-- 
H.J.


Re: [1/10] Consistently use asm volatile ("" ::: "memory") in vect tests

2017-11-08 Thread Jeff Law
On 11/03/2017 10:15 AM, Richard Sandiford wrote:
> The vectoriser tests used a combination of:
> 
> 1) if (impossible condition) abort ();
> 2) volatile int x; ... *x = ...;
> 3) asm volatile ("" ::: "memory");
> 
> to prevent vectorisation of a set-up loop.  The problem with 1) is that
> the compiler can often tell that the condition is false and optimise
> it away before vectorisation.
> 
> This was already happening in slp-perm-9.c, which is why the test was
> expecting one loop to be vectorised even when the required permutes
> weren't supported.  It becomes a bigger problem with SVE, which is
> able to vectorise more set-up loops.
> 
> The point of this patch is therefore to replace 1) with something else.
> 2) should work most of the time, but we don't usually treat non-volatile
> accesses as aliasing unrelated volatile accesses, so I think in principle
> we could split the loop into one that does the set-up and one that does
> the volatile accesses.  3) seems more robust because it's also a wild
> read and write.
> 
> The patch therefore tries to replace all instances of 1) and 2) with 3).
> 
> 
> 2017-11-03  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/testsuite/
>   * gcc.dg/vect/bb-slp-cond-1.c (main): Add an asm volatile
>   to the set-up loop.
>   * gcc.dg/vect/slp-perm-7.c (main): Prevent vectorisation with
>   asm volatile ("" ::: "memory") instead of a conditional abort.
>   Update the expected vector loop count accordingly.
>   * gcc.dg/vect/slp-perm-9.c (main): Likewise.
>   * gcc.dg/vect/bb-slp-1.c (main1): Prevent vectorisation with
>   asm volatile ("" ::: "memory") instead of a conditional abort.
>   * gcc.dg/vect/slp-23.c (main): Likewise,
>   * gcc.dg/vect/slp-35.c (main): Likewise,
>   * gcc.dg/vect/slp-37.c (main): Likewise,
>   * gcc.dg/vect/slp-perm-4.c (main): Likewise.
>   * gcc.dg/vect/bb-slp-24.c (foo): Likewise.  Remove dummy argument.
>   (main): Update call accordingly.
>   * gcc.dg/vect/bb-slp-25.c (foo, main): As for bb-slp-24.c.
>   * gcc.dg/vect/bb-slp-26.c (foo, main): Likewise.
>   * gcc.dg/vect/bb-slp-29.c (foo, main): Likewise.
>   * gcc.dg/vect/no-vfa-vect-102.c (foo): Delete.
>   (main): Don't initialize it.
>   (main1): Prevent vectorisation with asm volatile ("" ::: "memory")
>   instead of a conditional abort.
>   * gcc.dg/vect/no-vfa-vect-102a.c (foo, main1, main): As for
>   no-vfa-vect-102.c
>   * gcc.dg/vect/vect-103.c (foo, main1, main): Likewise.
>   * gcc.dg/vect/vect-104.c (foo, main1, main): Likewise.
>   * gcc.dg/vect/pr42709.c (main1): Remove dummy argument.
>   Prevent vectorisation with asm volatile ("" ::: "memory")
>   instead of a conditional abort.
>   * gcc.dg/vect/slp-13-big-array.c (y): Delete.
>   (main1): Prevent vectorisation with asm volatile ("" ::: "memory")
>   instead of a conditional abort.
>   * gcc.dg/vect/slp-3-big-array.c (y, main1): As for slp-13-big-array.c.
>   * gcc.dg/vect/slp-34-big-array.c (y, main1): Likewise.
>   * gcc.dg/vect/slp-4-big-array.c (y, main1): Likewise.
>   * gcc.dg/vect/slp-multitypes-11-big-array.c (y, main1): Likewise.
>   * gcc.dg/vect/vect-105.c (y, main1): Likewise.
>   * gcc.dg/vect/vect-105-big-array.c (y, main1): Likewise.
>   * gcc.dg/vect/vect-112-big-array.c (y, main1): Likewise.
>   * gcc.dg/vect/vect-15-big-array.c (y, main1): Likewise.
>   * gcc.dg/vect/vect-2-big-array.c (y, main1): Likewise.
>   * gcc.dg/vect/vect-34-big-array.c (y, main1): Likewise.
>   * gcc.dg/vect/vect-6-big-array.c (y, main1): Likewise.
>   * gcc.dg/vect/vect-73-big-array.c (y, main1): Likewise.
>   * gcc.dg/vect/vect-74-big-array.c (y, main1): Likewise.
>   * gcc.dg/vect/vect-75-big-array.c (y, main1): Likewise.
>   * gcc.dg/vect/vect-76-big-array.c (y, main1): Likewise.
>   * gcc.dg/vect/vect-80-big-array.c (y, main1): Likewise.
>   * gcc.dg/vect/vect-97-big-array.c (y, main1): Likewise.
>   * gcc.dg/vect/vect-all-big-array.c (y, main1): Likewise.
>   * gcc.dg/vect/vect-reduc-1char-big-array.c (y, main1): Likewise.
>   * gcc.dg/vect/vect-reduc-2char-big-array.c (y, main1): Likewise.
>   * gcc.dg/vect/vect-strided-a-mult.c (y, main1): Likewise.
>   * gcc.dg/vect/vect-strided-a-u16-i2.c (y, main1): Likewise.
>   * gcc.dg/vect/vect-strided-a-u16-i4.c (y, main1): Likewise.
>   * gcc.dg/vect/vect-strided-a-u16-mult.c (y, main1): Likewise.
>   * gcc.dg/vect/vect-strided-a-u8-i2-gap.c (y, main1): Likewise.
>   * gcc.dg/vect/vect-strided-a-u8-i8-gap2-big-array.c (y, main1):
>   Likewise.
>   * gcc.dg/vect/vect-strided-a-u8-i8-gap2.c (y, main1): Likewise.
>   * gcc.dg/vect/vect-strided-a-u8-i8-gap7-big-array.c (y, main1):
>   Likewise.
>   * 

Re: [PATCH 08/22] Add Intel CET support for EH in libgcc.

2017-11-08 Thread Jeff Law
On 11/04/2017 06:43 AM, Tsimbalist, Igor V wrote:
>> -Original Message-
>> From: Jeff Law [mailto:l...@redhat.com]
>> Sent: Tuesday, October 31, 2017 5:49 AM
>> To: Tsimbalist, Igor V ; gcc-
>> patc...@gcc.gnu.org
>> Cc: i...@airs.com
>> Subject: Re: [PATCH 08/22] Add Intel CET support for EH in libgcc.
>>
>> On 10/12/2017 01:56 PM, Tsimbalist, Igor V wrote:
>>> Control-flow Enforcement Technology (CET), published by Intel, Introduces
>>> the Shadow Stack feature, which ensures a return from a function is done
>>> to exactly the same location from where the function was called. When EH
>>> is present the control-flow transfer may skip some stack frames and the
>>> shadow stack has to be adjusted not to signal a violation of a
>>> control-flow transfer. It's done by counting a number of skipping frames
>>> and adjusting shadow stack pointer by this number.
>>>
>>> gcc/
>>> * config/i386/i386.c (ix86_expand_epilogue): Change simple
>>> return to indirect jump for EH return. Change explicit 'false'
>>> argument in pro_epilogue_adjust_stack with a value of
>>> flag_cf_protection.
>>> * config/i386/i386.md (simple_return_indirect_internal): Remove
>>> SImode restriction to support 64-bit.
>>>
>>> libgcc/
>>> * config/i386/linux-unwind.h: Include
>>> config/i386/shadow-stack-unwind.h.
>>> * config/i386/shadow-stack-unwind.h: New file.
>>> * unwind-dw2.c: (uw_install_context): Add a FRAMES argument and
>>> pass it to _Unwind_Frames_Extra.
>>> * unwind-generic.h (FRAMES_P_DECL): New.
>>> (FRAMES_VAR): Likewise.
>>> (FRAMES_VAR_P): Likewise.
>>> (FRAMES_VAR_DECL): Likewise.
>>> (FRAMES_VAR_DECL_1): Likewise.
>>> (FRAMES_VAR_INC): Likewise.
>>> (FRAMES_P_UPDATE): Likewise.
>>> (_Unwind_Frames_Extra): Likewise.
>>> * unwind.inc (_Unwind_RaiseException_Phase2): Use
>> FRAMES_P_DECL,
>>> FRAMES_VAR_DECL_1, FRAMES_VAR_INC and FRAMES_P_UPDATE.
>>> (_Unwind_RaiseException): Use FRAMES_VAR_DECL,
>> FRAMES_VAR_P and
>>> FRAMES_VAR.
>>> (_Unwind_ForcedUnwind_Phase2): Use FRAMES_P_DECL,
>>> FRAMES_VAR_DECL_1, FRAMES_VAR_INC, FRAMES_P_UPDATE.
>>> (_Unwind_ForcedUnwind): Use FRAMES_VAR_DECL,
>> FRAMES_VAR_P and
>>> FRAMES_VAR.
>>> (_Unwind_Resume): Use FRAMES_VAR_DECL, FRAMES_VAR_P and
>>> FRAMES_VAR.
>>> (_Unwind_Resume_or_Rethrow): Use FRAMES_VAR_DECL,
>> FRAMES_VAR_P
>>> and FRAMES_VAR.
>>>
>>> Igor
>>>
>>>
>>>
>>> 0008-Add-Intel-CET-support-for-EH-in-libgcc.patch
>>>
>>>
>>> From 16eb1d0d9239e039fba28f1ae71762f19061b157 Mon Sep 17 00:00:00
>> 2001
>>> From: Igor Tsimbalist 
>>> Date: Wed, 19 Jul 2017 03:04:46 +0300
>>> Subject: [PATCH 08/22] Add Intel CET support for EH in libgcc.
>>>
>>> Control-flow Enforcement Technology (CET), published by Intel,
>>> introduces
>>> the Shadow Stack feature, which ensures a return from a function is done
>>> to exactly the same location from where the function was called. When EH
>>> is present the control-flow transfer may skip some stack frames and the
>>> shadow stack has to be adjusted not to signal a violation of a
>>> control-flow transfer. It's done by counting a number of skiping frames
>>> and adjasting shadow stack pointer by this number.
>>>
>>> Having new semantic of the 'ret' instruction if CET is supported in HW
>>> the 'ret' instruction cannot be generated in ix86_expand_epilogue when
>>> we are returning after EH is processed. Added a code in
>>> ix86_expand_epilogue to adjust Shadow Stack pointer and to generate an
>>> indirect jump instead of 'ret'. As sp register is used during this
>>> adjustment thus the argument in pro_epilogue_adjust_stack is changed
>>> to update cfa_reg based on whether control-flow instrumentation is set.
>>> Without updating the cfa_reg field there is an assert later in dwarf2
>>> pass related to mismatch the stack register and cfa_reg value.
>>>
>>> gcc/
>>> * config/i386/i386.c (ix86_expand_epilogue): Change simple
>>> return to indirect jump for EH return. Change explicit 'false'
>>> argument in pro_epilogue_adjust_stack with a value of
>>> flag_cf_protection.
>>> * config/i386/i386.md (simple_return_indirect_internal): Remove
>>> SImode restriction to support 64-bit.
>>>
>>> libgcc/
>>> * config/i386/linux-unwind.h: Include
>>> config/i386/shadow-stack-unwind.h.
>>> * config/i386/shadow-stack-unwind.h: New file.
>>> * unwind-dw2.c: (uw_install_context): Add a FRAMES argument and
>>> pass it to _Unwind_Frames_Extra.
>>> * unwind-generic.h (FRAMES_P_DECL): New.
>>> (FRAMES_VAR): Likewise.
>>> (FRAMES_VAR_P): Likewise.
>>> (FRAMES_VAR_DECL): Likewise.
>>> (FRAMES_VAR_DECL_1): Likewise.
>>> (FRAMES_VAR_INC): Likewise.
>>> (FRAMES_P_UPDATE): Likewise.
>>> (_Unwind_Frames_Extra): Likewise.
>>> * unwind.inc (_Unwind_RaiseException_Phase2): Use
>> FRAMES_P_DECL,
>>> 

Re: [PATCH][AArch64] Add STP pattern to store a vec_concat of two 64-bit registers

2017-11-08 Thread Kyrill Tkachov


On 06/06/17 14:17, James Greenhalgh wrote:

On Tue, Jun 06, 2017 at 09:40:44AM +0100, Kyrill Tkachov wrote:

Hi all,

On top of the previous vec_merge simplifications [1] we can add this pattern to 
perform
a store of a vec_concat of two 64-bit values in distinct registers as an STP.
This avoids constructing such a vector explicitly in a register and storing it 
as
a Q register.
This way for the code in the testcase we can generate:

construct_lane_1:
 ldp d1, d0, [x0]
 fmovd3, 1.0e+0
 fmovd2, 2.0e+0
 faddd4, d1, d3
 faddd5, d0, d2
 stp d4, d5, [x1, 32]
 ret

construct_lane_2:
 ldp x2, x0, [x0]
 add x3, x2, 1
 add x4, x0, 2
 stp x3, x4, [x1, 32]
 ret

instead of the current:
construct_lane_1:
 ldp d0, d1, [x0]
 fmovd3, 1.0e+0
 fmovd2, 2.0e+0
 faddd0, d0, d3
 faddd1, d1, d2
 dup v0.2d, v0.d[0]
 ins v0.d[1], v1.d[0]
 str q0, [x1, 32]
 ret

construct_lane_2:
 ldp x2, x3, [x0]
 add x0, x2, 1
 add x2, x3, 2
 dup v0.2d, x0
 ins v0.d[1], x2
 str q0, [x1, 32]
 ret

Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for GCC 8?

OK.


Thanks, I've committed this and the other patches in this series after 
rebasing and rebootstrapping and testing on aarch64-none-linux-gnu.
The only conflict from updating the patch was that I had to use the 
store_16 attribute rather than
the old store2 for the new define_insn. This is what I've committed with 
r254551.


Sorry for the delay in committing.

Kyrill


Thanks,
James


2017-06-06  Kyrylo Tkachov  

 * config/aarch64/aarch64-simd.md (store_pair_lanes):
 New pattern.
 * config/aarch64/constraints.md (Uml): New constraint.
 * config/aarch64/predicates.md (aarch64_mem_pair_lanes_operand): New
 predicate.

2017-06-06  Kyrylo Tkachov  

 * gcc.target/aarch64/store_v2vec_lanes.c: New test.




diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index b78affe9b06ffc97322a4fcf1ec8e80ecdf6..0847e7aff230782874565e565929c10053807839 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -2817,6 +2817,18 @@ (define_insn "load_pair_lanes"
   [(set_attr "type" "neon_load1_1reg_q")]
 )
 
+(define_insn "store_pair_lanes"
+  [(set (match_operand: 0 "aarch64_mem_pair_lanes_operand" "=Uml, Uml")
+	(vec_concat:
+	   (match_operand:VDC 1 "register_operand" "w, r")
+	   (match_operand:VDC 2 "register_operand" "w, r")))]
+  "TARGET_SIMD"
+  "@
+   stp\\t%d1, %d2, %0
+   stp\\t%x1, %x2, %0"
+  [(set_attr "type" "neon_stp, store_16")]
+)
+
 ;; In this insn, operand 1 should be low, and operand 2 the high part of the
 ;; dest vector.
 
diff --git a/gcc/config/aarch64/constraints.md b/gcc/config/aarch64/constraints.md
index ab607b9f7488e903a14fe93e88d4c4e1fad762b3..6bdb93f07a7c8d19675971ccfc71e681fd2dae66 100644
--- a/gcc/config/aarch64/constraints.md
+++ b/gcc/config/aarch64/constraints.md
@@ -154,6 +154,15 @@ (define_memory_constraint "Ump"
(match_test "aarch64_legitimate_address_p (GET_MODE (op), XEXP (op, 0),
 		  PARALLEL, 1)")))
 
+;; Used for storing two 64-bit values in an AdvSIMD register using an STP
+;; as a 128-bit vec_concat.
+(define_memory_constraint "Uml"
+  "@internal
+  A memory address suitable for a load/store pair operation."
+  (and (match_code "mem")
+   (match_test "aarch64_legitimate_address_p (DFmode, XEXP (op, 0),
+		   PARALLEL, 1)")))
+
 (define_memory_constraint "Utv"
   "@internal
An address valid for loading/storing opaque structure
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index 16e864765cde3bf8f54a11e5fc8db5a53606db30..c175f9d9a5f969817782bede858b1834ac642e28 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -173,6 +173,13 @@ (define_predicate "aarch64_mem_pair_operand"
(match_test "aarch64_legitimate_address_p (mode, XEXP (op, 0), PARALLEL,
 	   0)")))
 
+;; Used for storing two 64-bit values in an AdvSIMD register using an STP
+;; as a 128-bit vec_concat.
+(define_predicate "aarch64_mem_pair_lanes_operand"
+  (and (match_code "mem")
+   (match_test "aarch64_legitimate_address_p (DFmode, XEXP (op, 0),
+		   PARALLEL, 1)")))
+
 (define_predicate "aarch64_prefetch_operand"
   (match_test "aarch64_address_valid_for_prefetch_p (op, false)"))
 
diff --git a/gcc/testsuite/gcc.target/aarch64/store_v2vec_lanes.c b/gcc/testsuite/gcc.target/aarch64/store_v2vec_lanes.c
new file mode 100644
index ..6810db3c54dce81777caf062177facedb464d1d6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/store_v2vec_lanes.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */

Re: [PATCH 21/22] Add extra field to gtm_jmpbuf on x86 only

2017-11-08 Thread Jeff Law
On 11/07/2017 09:22 AM, Tsimbalist, Igor V wrote:
> I decided to split my previous patch "Enable building libitm with Intel CET "
> into two different patches. The first patch will add a new field to sjlj.S and
> target.h  files. The second one will add Intel CET support on the top of the
> first one. In this case the further changes for adding Intel CET support are
> seen clearly.
> 
> Ok for trunk?
> 

[ ... snip ... ]

> 
> 
> 0021-Add-extra-field-to-gtm_jmpbuf-on-x86-only.patch
> 
> 
> From a6361c78bf774f2b4dbeeaf4147c286cff4ae5a4 Mon Sep 17 00:00:00 2001
> From: Igor Tsimbalist 
> Date: Tue, 7 Nov 2017 17:00:24 +0300
> Subject: [PATCH 21/22] Add extra field to gtm_jmpbuf on x86 only
> 
> Expand the gtm_jmpbuf structure by one word field to add
> Intel CET support further. The code in sjlj.S already
> allocates more space on the stack then gtm_jmpbuf needs.
> Use this extra space to absorb the new field.
> 
> The structure is allocated on the stack in such a way
> that eip/rsp field is overlapped with return address on
> the stack. Locate the new field right before eip/rsp so
> code that accesses buffer fields relative to address of
> gtm_jmpbuf has its offsets unchanged.
> 
> The libtool_VERSION is updated for x86 due to extending
> the gtm_jmpbuf structure.
>   
> * libitm/config/x86/target.h: Add new field (ssp).
> * libitm/config/x86/sjlj.S: Change offsets.
> * libitm/configure.tgt: Update libtool_VERSION.
So if I understand correctly, given the desire to to have the eip/rip
field overlap with the return address on the stack offset changes are
inevitable if we add fields.


>  esac
> +
> +# Update libtool_VERSION since the size of struct gtm_jmpbuf is
> +# changed for x86.
> +case "${host}" in
> +
> +  # For x86, we use slots in the TCB head for most of our TLS.
> +  # The setup of those slots in beginTransaction can afford to
> +  # use the global-dynamic model.
> +  i[456]86-*-* | x86_64-*-*)
> + libtool_VERSION=2:0:0
What's the plan for supporting existing code that may have linked
dynamically against libitm?

One approach is to force the distros to carry the old libitm DSO.

THe other would be to try and support both within the same DSO using
symbol versioning.  That would seem to imply that we'd need to the
before/after code to build that single library that supported both.

Thoughts?  Jakub, any interest in chiming in here?

jeff
> + ;;
> +esac
> -- 1.8.3.1
> 



Re: [001/nnn] poly_int: add poly-int.h

2017-11-08 Thread Richard Sandiford
Martin Sebor  writes:
> On 11/08/2017 09:51 AM, Richard Sandiford wrote:
>> Martin Sebor  writes:
>>> On 11/08/2017 02:32 AM, Richard Sandiford wrote:
 Martin Sebor  writes:
> I haven't done nearly a thorough review but the dtor followed by
> the placement new in the POLY_SET_COEFF() macro caught my eye so
> I thought I'd ask sooner rather than later.  Given the macro
> definition:
>
> +   The dummy comparison against a null C * is just a way of checking
> +   that C gives the right type.  */
> +#define POLY_SET_COEFF(C, RES, I, VALUE) \
> +  ((void) (&(RES).coeffs[0] == (C *) 0), \
> +   wi::int_traits::precision_type == wi::FLEXIBLE_PRECISION \
> +   ? (void) ((RES).coeffs[I] = VALUE) \
> +   : (void) ((RES).coeffs[I].~C (), new (&(RES).coeffs[I]) C (VALUE)))
>
> is the following use well-defined?
>
> +template
> +inline poly_int_pod&
> +poly_int_pod::operator <<= (unsigned int a)
> +{
> +  POLY_SET_COEFF (C, *this, 0, this->coeffs[0] << a);
>
> It looks to me as though the VALUE argument in the ctor invoked
> by the placement new expression is evaluated after the dtor has
> destroyed the very array element the VALUE argument expands to.

 Good catch!  It should simply have been doing <<= on each coefficient --
 I must have got carried away when converting to POLY_SET_COEFF.

 I double-checked the other uses and think that's the only one.

> Whether or not is, in fact, a problem, it seems to me that using
> a function template rather than a macro would be a clearer and
> safer way to do the same thing.  (Safer in that the macro also
> evaluates its arguments multiple times, which is often a source
> of subtle bugs.)

 That would slow down -O0 builds though, by introducing an extra
 function call and set of temporaries even when the coefficients
 are primitive integers.
>>>
>>> Would decorating the function template with attribute always_inline
>>> help?
>>
>> It would remove the call itself, but we'd still have the extra temporary
>> objects that were the function argument and return value.
>
> Sorry, I do not want to get into another long discussion about
> trade-offs between safety and efficiency but I'm not sure I see
> what extra temporaries it would create.  It seems to me that
> an inline function template that took arguments of user-defined
> types by reference and others by value should be just as efficient
> as a macro.
>
>  From GCC's own manual:
>
>6.43 An Inline Function is As Fast As a Macro
>https://gcc.gnu.org/onlinedocs/gcc/Inline.html

You can see the difference with something like:

  inline
  void __attribute__((always_inline))
  f(int , const int ) { dst = src; }

  int g1(const int ) { int x; f(x, y); return x; }
  int g2(const int ) { int x; x = y; return x; }

where *.optimized from GCC 7.1 at -O0 is:

int g1(const int&) (const int & y)
{
  int & dst;
  const int & src;
  int x;
  int D.2285;
  int _3;
  int _6;

   [0.00%]:
  src_5 = y_2(D);
  _6 = *src_5;
  x = _6;
  _3 = x;
  x ={v} {CLOBBER};

 [0.00%]:
  return _3;

}

vs:

int g2(const int&) (const int & y)
{
  int x;
  int D.2288;
  int _4;

   [0.00%]:
  x_3 = *y_2(D);
  _4 = x_3;

 [0.00%]:
  return _4;

}

> If that's not the case and there is a significant performance
> penalty associated with inline functions at -O0 then GCC should
> be fixed to avoid it.

I think those docs are really talking about inline functions being as
fast as macros when optimisation is enabled.  I don't think we make
any guarantees about -O0 code quality.

Thanks,
Richard


Re: [PATCH, rs6000] Fix scheduling description for quad-precision multiply instructions

2017-11-08 Thread Pat Haugen
On 11/08/2017 11:30 AM, Segher Boessenkool wrote:
>> --- gcc/config/rs6000/power9.md  (revision 254377)
>> +++ gcc/config/rs6000/power9.md  (working copy)
>> @@ -436,6 +436,12 @@ (define_insn_reservation "power9-qpdiv"
>> (eq_attr "cpu" "power9"))
>>"DU_super_power9,dfu_power9")
>>  
>> +(define_insn_reservation "power9-qpmul" 24
>> +  (and (eq_attr "type" "qmul")
>> +   (eq_attr "size" "128")
>> +   (eq_attr "cpu" "power9"))
>> +  "DU_super_power9,dfu_power9*12")
> All other p9 reservations (other than integer div) do not reserve a
> unit more than one cycle.  Will this blow up size of the automaton?
> If not, should the other qp ops not reserve for more cycles, too?
> 

You are observant. :)  I should have just included the change for qp
divide in this patch since it's so simple, but didn't. The qp divide
also blocks the pipe for a number of cycles, so yes should be changed.
The other qp insns (add/sub/etc) do not, so are correct as is.


> Rest looks fine; okay for trunk with the reservation thing taken care
> of one way or the other.  Thanks!

Following is additional change I'll include along with updated
ChangeLog. Testing was fine.

@@ -434,8 +434,14 @@
   (and (eq_attr "type" "vecdiv")
(eq_attr "size" "128")
(eq_attr "cpu" "power9"))
-  "DU_super_power9,dfu_power9")
+  "DU_super_power9,dfu_power9*44")



Re: [PATCH] Set default to -fomit-frame-pointer

2017-11-08 Thread Wilco Dijkstra
Jeff Law wrote:

> I'd actually prefer to deprecate the H8 and M68k.  But assuming that's
> not going to happen in the immediate future I think dropping frame
> pointers on those targets is appropriate as long as we're generating
> dwarf frame info.

Is there a way to check a target does not generate dwarf? I thought that
was the default.

> > I believe in those cases targets already force the frame pointer as 
> > required,
> > for example msp430 sets the frame pointer if unwind tables are emitted
> > irrespectively of the command-line or default setting. Various other targets
> > don't even use frame_pointer_needed and just do their own thing.
> I've had the "pleasure" of going round and round on this repeatedly
> through the years with the kernel teams on this.  Essentially they
> didn't want to embed the dwarf2 unwinder in the kernel or have all those
> pages of unwind data.  Instead they strongly preferred to have a frame
> pointer to facilitate easy and fast unwinding.

A frame pointer does not facilitate unwinding, it can give a backtrace at best.

But indeed, frame pointers, unwinding and stack chains are often confused
despite being completely orthogonal concepts...

> So  my concern is to make sure the kernel folks, particularly in the
> ia32 world aren't going to get hosed by this change.  If we're changing
> the default it needs to be signaled to them so that they can ensure that
> if they want frame pointers that they still get them.

x86/x64 is not affected since it already omits the frame pointer by default 
(like
almost all targets in a target specific way). This patch is about making that
the global default precisely because pretty much every target already has it
as the default.

Wilco

Re: [PATCH 21/22] Add extra field to gtm_jmpbuf on x86 only

2017-11-08 Thread H.J. Lu
On Tue, Nov 7, 2017 at 8:22 AM, Tsimbalist, Igor V
 wrote:
> I decided to split my previous patch "Enable building libitm with Intel CET "
> into two different patches. The first patch will add a new field to sjlj.S and
> target.h  files. The second one will add Intel CET support on the top of the
> first one. In this case the further changes for adding Intel CET support are
> seen clearly.
>
> Ok for trunk?
>

libitm/configure.tgt should check ${target} like the other places:

+# Update libtool_VERSION since the size of struct gtm_jmpbuf is
+# changed for x86.
+case "${host}" in

Did these come from cut and paste?

+  # For x86, we use slots in the TCB head for most of our TLS.
+  # The setup of those slots in beginTransaction can afford to
+  # use the global-dynamic model.

I think the whole thing should be:

case "${target}" in
  # Update libtool_VERSION since the size of struct gtm_jmpbuf is
  # changed for x86.
  i[456]86-*-* | x86_64-*-*)
libtool_VERSION=2:0:0
;;
esac




-- 
H.J.


Re: [PATCH 13/22] Enable building libstdc++-v3 with Intel CET

2017-11-08 Thread Jeff Law
On 10/31/2017 06:03 AM, Tsimbalist, Igor V wrote:
> The revised patch is attached. The differences are in what options are 
> defined and propagated to Makefile for CET enabling.
>  
> Ok for trunk?
> 
> Igor
> 
> 
>> -Original Message-
>> From: Tsimbalist, Igor V
>> Sent: Friday, October 13, 2017 2:09 PM
>> To: gcc-patches@gcc.gnu.org; libstd...@gcc.gnu.org
>> Cc: Jeff Law ; Tsimbalist, Igor V
>> 
>> Subject: RE: [PATCH 13/22] Enable building libstdc++-v3 with Intel CET
>>
>> Added libstd...@gcc.gnu.org
>>
>>
>>> -Original Message-
>>> From: Tsimbalist, Igor V
>>> Sent: Thursday, October 12, 2017 10:24 PM
>>> To: gcc-patches@gcc.gnu.org
>>> Cc: Jeff Law ; jwak...@redhat.com; Tsimbalist, Igor V
>>> 
>>> Subject: [PATCH 13/22] Enable building libstdc++-v3 with Intel CET
>>>
>>> Enable building libstdc++v3 with CET options.
>>>
>>> libstdc++-v3/
>>> * acinclude.m4: Add cet.m4.
>>> * configure.ac: Set CET_FLAGS. Update EXTRA_CFLAGS.
>>> * libsupc++/Makefile.am: Add EXTRA_CFLAGS.
>>> * Makefile.in: Regenerate.
>>> * configure: Likewise.
>>> * doc/Makefile.in: Likewise.
>>> * include/Makefile.in: Likewise.
>>> * libsupc++/Makefile.in: Likewise.
>>> * po/Makefile.in: Likewise.
>>> * python/Makefile.in: Likewise.
>>> * src/Makefile.in: Likewise.
>>> * src/c++11/Makefile.in: Likewise.
>>> * src/c++98/Makefile.in: Likewise.
>>> * src/filesystem/Makefile.in: Likewise.
>>> * testsuite/Makefile.in: Likewise.
> 
> 
> 0013-Enable-building-libstdc-v3-with-Intel-CET.PATCH
> 
> 
> From 83ef2dcf9f09600245e2162fdc945374d0ad808b Mon Sep 17 00:00:00 2001
> From: Igor Tsimbalist 
> Date: Mon, 14 Aug 2017 18:41:40 +0300
> Subject: [PATCH 13/21] Enable building libstdc++-v3 with Intel CET
> 
> libstdc++-v3/
>   * acinclude.m4: Add cet.m4.
>   * configure.ac: Set CET_FLAGS. Update EXTRA_CFLAGS,
>   EXTRA_CXX_FLAGS.
>   * libsupc++/Makefile.am: Use Add EXTRA_CFLAGS.
>   * Makefile.in: Regenerate.
>   * configure: Likewise.
>   * doc/Makefile.in: Likewise.
>   * include/Makefile.in: Likewise.
>   * libsupc++/Makefile.in: Likewise.
>   * po/Makefile.in: Likewise.
>   * python/Makefile.in: Likewise.
>   * src/Makefile.in: Likewise.
>   * src/c++11/Makefile.in: Likewise.
>   * src/c++98/Makefile.in: Likewise.
>   * src/filesystem/Makefile.in: Likewise.
>   * testsuite/Makefile.in: Likewise.
OK for the trunk.  Please wait to commit until entire series is ACK'd.

jeff


Re: [PATCH] Set default to -fomit-frame-pointer

2017-11-08 Thread Jeff Law
On 11/08/2017 10:47 AM, Wilco Dijkstra wrote:
> Joseph Myers wrote:
>> On Fri, 3 Nov 2017, Wilco Dijkstra wrote:
>>
>>> Almost all targets add an explict -fomit-frame-pointer in the target 
>>> specific
>>> options.  Rather than doing this in a target-specific way, do this in the
>>
>> Which targets do not?  You should explicitly list them and CC their 
>> maintainers and seek confirmation that such a change is appropriate for 
>> them.
> 
> The targets that don't explicitly enable -fomit-frame-pointer in the target
> options or force it internally are bfin, ft32, h8300, m68k - I've CCd the
> maintainers (it seems there is no-one for h8300).
Which means it's probably myself or Alex for the H8 :(  Some things you
can never manage to get rid of.

I'd actually prefer to deprecate the H8 and M68k.  But assuming that's
not going to happen in the immediate future I think dropping frame
pointers on those targets is appropriate as long as we're generating
dwarf frame info.

For deeply embedded targets that don't want the overhead of dwarf, well,
IMHO, they can add -fno-omit-frame-pointer explicitly :-)


> 
>> The addition of -fomit-frame-pointer through this mechanism was a 
>> replacement for the old target macro CAN_DEBUG_WITHOUT_FP.  It may now be 
>> the cases that with DWARF debug info, having or not having a frame pointer 
>> is not particularly relevant to debugging.  But since there are other 
>> reasons people may want a frame pointer (e.g. light-weight backtraces that 
>> don't depend on debug / unwind info), it's at least possible there are 
>> architecture-specific choices regarding keeping frame pointers involved 
>> here.
> 
> I believe in those cases targets already force the frame pointer as required,
> for example msp430 sets the frame pointer if unwind tables are emitted
> irrespectively of the command-line or default setting. Various other targets
> don't even use frame_pointer_needed and just do their own thing.
I've had the "pleasure" of going round and round on this repeatedly
through the years with the kernel teams on this.  Essentially they
didn't want to embed the dwarf2 unwinder in the kernel or have all those
pages of unwind data.  Instead they strongly preferred to have a frame
pointer to facilitate easy and fast unwinding.

So  my concern is to make sure the kernel folks, particularly in the
ia32 world aren't going to get hosed by this change.  If we're changing
the default it needs to be signaled to them so that they can ensure that
if they want frame pointers that they still get them.

jeff



Re: [PATCH] Add LVAL argument to c_fully_fold* and propagate it through (PR c/66618, PR c/69960)

2017-11-08 Thread Jakub Jelinek
On Wed, Nov 08, 2017 at 06:51:34PM +0100, Marek Polacek wrote:
> > Ok, so like this if it passes bootstrap/regtest?
> > 
> > Changes from the last patch:
> > 1) false instead of lval for COMPOUND_EXPR and *COND_EXPR op1/op2
> 
> So...

Oops, I've hand-edited it in the patch and then regenerated the patch
after doing there more changes.

Here is updated patch with that in again:

2017-11-08  Jakub Jelinek  

PR c/66618
PR c/69960
c-family/
* c-common.h (c_fully_fold): Add LVAL argument defaulted to false.
c/
* c-parser.c (c_parser_omp_atomic): Pass true as LVAL to c_fully_fold
where needed.
* c-typeck.c (build_unary_op, build_modify_expr, build_asm_expr,
handle_omp_array_sections): Likewise.
(digest_init): Don't call decl_constant_value_for_optimization.
* c-tree.h (decl_constant_value_for_optimization): Removed.
* c-fold.c (c_fold_array_ref): New function.
(c_fully_fold_internal): Add LVAL argument, propagate it through
recursive calls.  For VAR_P call decl_constant_value and
unshare if not LVAL and either optimizing or IN_INIT.  Remove
decl_constant_value_for_optimization calls.  If IN_INIT and not LVAL,
fold ARRAY_REF with STRING_CST and INTEGER_CST operands.
(c_fully_fold): Add LVAL argument, pass it through to
c_fully_fold_internal.
(decl_constant_value_for_optimization): Removed.
cp/
* cp-gimplify.c (c_fully_fold): Add LVAL argument, call
cp_fold_maybe_rvalue instead of cp_fold_rvalue and pass it !LVAL.
testsuite/
* gcc.dg/pr69960.c: New test.
* gcc.dg/pr66618.c: New test.
* gcc.dg/pr66618-2.c: New test.

--- gcc/c-family/c-common.h.jj  2017-11-08 16:19:30.309488815 +0100
+++ gcc/c-family/c-common.h 2017-11-08 17:49:00.548178246 +0100
@@ -827,7 +827,7 @@ extern tree c_build_bitfield_integer_typ
 extern enum conversion_safety unsafe_conversion_p (location_t, tree, tree, 
tree,
   bool);
 extern bool decl_with_nonnull_addr_p (const_tree);
-extern tree c_fully_fold (tree, bool, bool *);
+extern tree c_fully_fold (tree, bool, bool *, bool = false);
 extern tree c_wrap_maybe_const (tree, bool);
 extern tree c_common_truthvalue_conversion (location_t, tree);
 extern void c_apply_type_quals_to_decl (int, tree);
--- gcc/c/c-parser.c.jj 2017-11-08 16:19:30.569485632 +0100
+++ gcc/c/c-parser.c2017-11-08 17:49:00.552178196 +0100
@@ -14748,7 +14748,7 @@ c_parser_omp_atomic (location_t loc, c_p
 case NOP_EXPR: /* atomic write */
   v = c_parser_cast_expression (parser, NULL).value;
   non_lvalue_p = !lvalue_p (v);
-  v = c_fully_fold (v, false, NULL);
+  v = c_fully_fold (v, false, NULL, true);
   if (v == error_mark_node)
goto saw_error;
   if (non_lvalue_p)
@@ -14767,7 +14767,7 @@ c_parser_omp_atomic (location_t loc, c_p
{
  lhs = c_parser_cast_expression (parser, NULL).value;
  non_lvalue_p = !lvalue_p (lhs);
- lhs = c_fully_fold (lhs, false, NULL);
+ lhs = c_fully_fold (lhs, false, NULL, true);
  if (lhs == error_mark_node)
goto saw_error;
  if (non_lvalue_p)
@@ -14793,7 +14793,7 @@ c_parser_omp_atomic (location_t loc, c_p
{
  v = c_parser_cast_expression (parser, NULL).value;
  non_lvalue_p = !lvalue_p (v);
- v = c_fully_fold (v, false, NULL);
+ v = c_fully_fold (v, false, NULL, true);
  if (v == error_mark_node)
goto saw_error;
  if (non_lvalue_p)
@@ -14814,7 +14814,7 @@ restart:
   lhs = expr.value;
   expr = default_function_array_conversion (eloc, expr);
   unfolded_lhs = expr.value;
-  lhs = c_fully_fold (lhs, false, NULL);
+  lhs = c_fully_fold (lhs, false, NULL, true);
   orig_lhs = lhs;
   switch (TREE_CODE (lhs))
 {
@@ -14954,15 +14954,19 @@ restart:
  if (c_tree_equal (TREE_OPERAND (rhs1, 0), unfolded_lhs))
{
  opcode = TREE_CODE (rhs1);
- rhs = c_fully_fold (TREE_OPERAND (rhs1, 1), false, NULL);
- rhs1 = c_fully_fold (TREE_OPERAND (rhs1, 0), false, NULL);
+ rhs = c_fully_fold (TREE_OPERAND (rhs1, 1), false, NULL,
+ true);
+ rhs1 = c_fully_fold (TREE_OPERAND (rhs1, 0), false, NULL,
+  true);
  goto stmt_done;
}
  if (c_tree_equal (TREE_OPERAND (rhs1, 1), unfolded_lhs))
{
  opcode = TREE_CODE (rhs1);
- rhs = c_fully_fold (TREE_OPERAND (rhs1, 0), false, NULL);
- rhs1 = c_fully_fold (TREE_OPERAND (rhs1, 1), false, NULL);
+ rhs = c_fully_fold (TREE_OPERAND (rhs1, 0), false, NULL,
+ true);
+ rhs1 = c_fully_fold (TREE_OPERAND 

Re: [PATCH] Add LVAL argument to c_fully_fold* and propagate it through (PR c/66618, PR c/69960)

2017-11-08 Thread Marek Polacek
On Wed, Nov 08, 2017 at 06:38:21PM +0100, Jakub Jelinek wrote:
> On Wed, Nov 08, 2017 at 05:17:50PM +, Joseph Myers wrote:
> > On Wed, 8 Nov 2017, Jakub Jelinek wrote:
> > 
> > > of course only if LVAL is false.  Additionally, I've added folding of
> > > "foo"[2] into 'o'.  We have it in gimple-fold.c or so, so that one
> > 
> > Note that if the 2 there comes from an overflowing expression that's not 
> > valid as an extension to constant expressions in initializers (that is,
> > 
> > static char c = "foo"[INT_MAX * -2];
> > 
> > should result in an error with -pedantic-errors because of the overflow, 
> > just as INT_MAX * -2 by itself wouldn't be a valid initializer in that 
> > case).
> > 
> > > Not sure about the COND_EXPR/VEC_COND_EXPR cases, right now I'm passing
> > > false as LVAL for the first operand (condition) and lval as LVAL for the
> > > other two (i.e. if called with lval == true on the whole *_COND_EXPR
> > > decl_constant_value_for_optimization etc. isn't performed on op1/op2, 
> > > while
> > > without it it is).  Can one take address of the whole COND_EXPR, or
> > > have it on LHS of anything in C?  If not, perhaps I could just pass false
> > > in all the 3 calls.  Though, then likely it would be called with lval == 
> > > false
> > > anyway.
> > 
> > Conditional and compound expressions are never valid lvalues in C.
> 
> Ok, so like this if it passes bootstrap/regtest?
> 
> Changes from the last patch:
> 1) false instead of lval for COMPOUND_EXPR and *COND_EXPR op1/op2

So...

> @@ -278,21 +339,16 @@ c_fully_fold_internal (tree expr, bool i
>orig_op0 = op0 = TREE_OPERAND (expr, 0);
>orig_op1 = op1 = TREE_OPERAND (expr, 1);
>op0 = c_fully_fold_internal (op0, in_init, maybe_const_operands,
> -maybe_const_itself, for_int_const);
> +maybe_const_itself, for_int_const,
> +op0_lval);
>STRIP_TYPE_NOPS (op0);
> -  if (code != MODIFY_EXPR
> -   && code != PREDECREMENT_EXPR
> -   && code != PREINCREMENT_EXPR
> -   && code != POSTDECREMENT_EXPR
> -   && code != POSTINCREMENT_EXPR)
> - op0 = decl_constant_value_for_optimization (op0);
>/* The RHS of a MODIFY_EXPR was fully folded when building that
>expression for the sake of conversion warnings.  */
>if (code != MODIFY_EXPR)
>   op1 = c_fully_fold_internal (op1, in_init, maybe_const_operands,
> -  maybe_const_itself, for_int_const);
> +  maybe_const_itself, for_int_const,
> +  code == COMPOUND_EXPR ? lval : false);

...shouldn't we be passing just false here?

Marek


Re: [PATCH] Set default to -fomit-frame-pointer

2017-11-08 Thread Wilco Dijkstra
Joseph Myers wrote:
> On Fri, 3 Nov 2017, Wilco Dijkstra wrote:
>
> > Almost all targets add an explict -fomit-frame-pointer in the target 
> > specific
> > options.  Rather than doing this in a target-specific way, do this in the
>
> Which targets do not?  You should explicitly list them and CC their 
> maintainers and seek confirmation that such a change is appropriate for 
> them.

The targets that don't explicitly enable -fomit-frame-pointer in the target
options or force it internally are bfin, ft32, h8300, m68k - I've CCd the
maintainers (it seems there is no-one for h8300).

> The addition of -fomit-frame-pointer through this mechanism was a 
> replacement for the old target macro CAN_DEBUG_WITHOUT_FP.  It may now be 
> the cases that with DWARF debug info, having or not having a frame pointer 
> is not particularly relevant to debugging.  But since there are other 
> reasons people may want a frame pointer (e.g. light-weight backtraces that 
> don't depend on debug / unwind info), it's at least possible there are 
> architecture-specific choices regarding keeping frame pointers involved 
> here.

I believe in those cases targets already force the frame pointer as required,
for example msp430 sets the frame pointer if unwind tables are emitted
irrespectively of the command-line or default setting. Various other targets
don't even use frame_pointer_needed and just do their own thing.

Wilco


Re: [PATCH 22/22] Enable building libitm with Intel CET

2017-11-08 Thread Jeff Law
On 11/07/2017 09:28 AM, Tsimbalist, Igor V wrote:
> Enable building libitm with Intel CET. The patch is based on previous patch 
> [PATCH 21/22] Add extra field to gtm_jmpbuf on x86 only.
> 
> Ok for trunk?
> 
> Igor
> 
> 
> 0022-Enable-building-libitm-with-Intel-CET.PATCH
> 
> 
> From 2a83369b28b279aed2d6fd110bb35ef65761677a Mon Sep 17 00:00:00 2001
> From: Igor Tsimbalist 
> Date: Tue, 7 Nov 2017 17:07:33 +0300
> Subject: [PATCH 22/22] Enable building libitm with Intel CET
> 
> libitm/
>   * Makefile.in: Regenerate.
>   * acinclude.m4: Add enable.m4 and cet.m4.
>   * config/x86/sjlj.S: Include cet.h.
>   (_ITM_beginTransaction): Add _CET_ENDBR.
>   Save Shadow Stack pointer.
>   (GTM_longjmp): Add _CET_ENDBR. Restore Shadow Stack pointer.
>   * config/x86/target.h (struct gtm_jmpbuf):
>   Add new field for Shadow Stack pointer.
>   * configure: Regenerate.
>   * configure.ac: Set CET_FLAGS. Update XCFLAGS.
>   * configure.ac: Update libtool_VERSION for x86.
>   * testsuite/Makefile.in: Regenerate.
I'm going to assume the bits to compute the number of skipped frames is
reasonble.

OK to commit when the series as a whole is approved.

jeff


Re: [PATCH] Add LVAL argument to c_fully_fold* and propagate it through (PR c/66618, PR c/69960)

2017-11-08 Thread Jakub Jelinek
On Wed, Nov 08, 2017 at 05:17:50PM +, Joseph Myers wrote:
> On Wed, 8 Nov 2017, Jakub Jelinek wrote:
> 
> > of course only if LVAL is false.  Additionally, I've added folding of
> > "foo"[2] into 'o'.  We have it in gimple-fold.c or so, so that one
> 
> Note that if the 2 there comes from an overflowing expression that's not 
> valid as an extension to constant expressions in initializers (that is,
> 
> static char c = "foo"[INT_MAX * -2];
> 
> should result in an error with -pedantic-errors because of the overflow, 
> just as INT_MAX * -2 by itself wouldn't be a valid initializer in that 
> case).
> 
> > Not sure about the COND_EXPR/VEC_COND_EXPR cases, right now I'm passing
> > false as LVAL for the first operand (condition) and lval as LVAL for the
> > other two (i.e. if called with lval == true on the whole *_COND_EXPR
> > decl_constant_value_for_optimization etc. isn't performed on op1/op2, while
> > without it it is).  Can one take address of the whole COND_EXPR, or
> > have it on LHS of anything in C?  If not, perhaps I could just pass false
> > in all the 3 calls.  Though, then likely it would be called with lval == 
> > false
> > anyway.
> 
> Conditional and compound expressions are never valid lvalues in C.

Ok, so like this if it passes bootstrap/regtest?

Changes from the last patch:
1) false instead of lval for COMPOUND_EXPR and *COND_EXPR op1/op2
2) outlined the "foo"[2] folding into c_fold_array_ref
3) added !TREE_OVERFLOW check there
4) added another testcase

2017-11-08  Jakub Jelinek  

PR c/66618
PR c/69960
c-family/
* c-common.h (c_fully_fold): Add LVAL argument defaulted to false.
c/
* c-parser.c (c_parser_omp_atomic): Pass true as LVAL to c_fully_fold
where needed.
* c-typeck.c (build_unary_op, build_modify_expr, build_asm_expr,
handle_omp_array_sections): Likewise.
(digest_init): Don't call decl_constant_value_for_optimization.
* c-tree.h (decl_constant_value_for_optimization): Removed.
* c-fold.c (c_fold_array_ref): New function.
(c_fully_fold_internal): Add LVAL argument, propagate it through
recursive calls.  For VAR_P call decl_constant_value and
unshare if not LVAL and either optimizing or IN_INIT.  Remove
decl_constant_value_for_optimization calls.  If IN_INIT and not LVAL,
fold ARRAY_REF with STRING_CST and INTEGER_CST operands.
(c_fully_fold): Add LVAL argument, pass it through to
c_fully_fold_internal.
(decl_constant_value_for_optimization): Removed.
cp/
* cp-gimplify.c (c_fully_fold): Add LVAL argument, call
cp_fold_maybe_rvalue instead of cp_fold_rvalue and pass it !LVAL.
testsuite/
* gcc.dg/pr69960.c: New test.
* gcc.dg/pr66618.c: New test.
* gcc.dg/pr66618-2.c: New test.

--- gcc/c-family/c-common.h.jj  2017-11-08 16:19:30.309488815 +0100
+++ gcc/c-family/c-common.h 2017-11-08 17:49:00.548178246 +0100
@@ -827,7 +827,7 @@ extern tree c_build_bitfield_integer_typ
 extern enum conversion_safety unsafe_conversion_p (location_t, tree, tree, 
tree,
   bool);
 extern bool decl_with_nonnull_addr_p (const_tree);
-extern tree c_fully_fold (tree, bool, bool *);
+extern tree c_fully_fold (tree, bool, bool *, bool = false);
 extern tree c_wrap_maybe_const (tree, bool);
 extern tree c_common_truthvalue_conversion (location_t, tree);
 extern void c_apply_type_quals_to_decl (int, tree);
--- gcc/c/c-parser.c.jj 2017-11-08 16:19:30.569485632 +0100
+++ gcc/c/c-parser.c2017-11-08 17:49:00.552178196 +0100
@@ -14748,7 +14748,7 @@ c_parser_omp_atomic (location_t loc, c_p
 case NOP_EXPR: /* atomic write */
   v = c_parser_cast_expression (parser, NULL).value;
   non_lvalue_p = !lvalue_p (v);
-  v = c_fully_fold (v, false, NULL);
+  v = c_fully_fold (v, false, NULL, true);
   if (v == error_mark_node)
goto saw_error;
   if (non_lvalue_p)
@@ -14767,7 +14767,7 @@ c_parser_omp_atomic (location_t loc, c_p
{
  lhs = c_parser_cast_expression (parser, NULL).value;
  non_lvalue_p = !lvalue_p (lhs);
- lhs = c_fully_fold (lhs, false, NULL);
+ lhs = c_fully_fold (lhs, false, NULL, true);
  if (lhs == error_mark_node)
goto saw_error;
  if (non_lvalue_p)
@@ -14793,7 +14793,7 @@ c_parser_omp_atomic (location_t loc, c_p
{
  v = c_parser_cast_expression (parser, NULL).value;
  non_lvalue_p = !lvalue_p (v);
- v = c_fully_fold (v, false, NULL);
+ v = c_fully_fold (v, false, NULL, true);
  if (v == error_mark_node)
goto saw_error;
  if (non_lvalue_p)
@@ -14814,7 +14814,7 @@ restart:
   lhs = expr.value;
   expr = default_function_array_conversion (eloc, expr);
   unfolded_lhs = expr.value;
-  lhs = c_fully_fold (lhs, false, NULL);
+  lhs = c_fully_fold (lhs, false, NULL, true);
   

Re: [Diagnostic Patch] don't print column zero

2017-11-08 Thread Nathan Sidwell

On 11/02/2017 04:33 PM, Nathan Sidwell wrote:

On 10/26/2017 10:34 AM, David Malcolm wrote:

[CCing Rainer and Mike for the gcc-dg.exp part]



My Tcl skills aren't great, so hopefully someone else can review this;
CCing Rainer and Mike.


Ping?

https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01911.html


Ping? A change to the column-checking logic the diagnostic processing.

nathan

--
Nathan Sidwell


Re: [PATCH 20/22] Enable building libobjc with Intel CET

2017-11-08 Thread Jeff Law
On 10/31/2017 06:18 AM, Tsimbalist, Igor V wrote:
> The revised patch is attached. The differences are in what options are 
> defined and propagated to Makefile for CET enabling.
>  
> Ok for trunk?
> 
> Igor
> 
> 
>> -Original Message-
>> From: Jeff Law [mailto:l...@redhat.com]
>> Sent: Wednesday, October 18, 2017 1:43 AM
>> To: Tsimbalist, Igor V ; gcc-
>> patc...@gcc.gnu.org
>> Subject: Re: [PATCH 20/22] Enable building libobjc with Intel CET
>>
>> On 10/12/2017 03:19 PM, Tsimbalist, Igor V wrote:
>>> Enable building libobjc with Intel CET options.
>>>
>>> libobjc/
>>> * Makefile.in: Regenerate.
>>> * aclocal.m4: Likeiwse.
>>> * configure: Likewise.
>>> * configure.ac: Set CET_FLAGS. Update XCFLAGS.
>>>
>>
>> Same comments as the libcilkrts changes.
>>
>> Jeff
> 
> 0020-Enable-building-libobjc-with-Intel-CET.PATCH
> 
> 
> From edb63547af1626d7d9dc8383661234bf81e73216 Mon Sep 17 00:00:00 2001
> From: "H.J. Lu" 
> Date: Thu, 17 Aug 2017 09:44:54 -0700
> Subject: [PATCH 20/21] Enable building libobjc with Intel CET
> 
> libobjc/
>   * Makefile.in: Regenerate.
>   * aclocal.m4: Likeiwse.
>   * configure: Likewise.
>   * configure.ac: Set CET_FLAGS. Update XCFLAGS.

OK.  Please wait to commit until entire set is ACK'd.

Jeff


Re: [PATCH 18/22] Enable building libmpx with Intel CET

2017-11-08 Thread Jeff Law
On 10/31/2017 06:15 AM, Tsimbalist, Igor V wrote:
> The revised patch is attached. The differences are in what options are 
> defined and propagated to Makefile for CET enabling.
>  
> Ok for trunk?
> 
> Igor
> 
> 
>> -Original Message-
>> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
>> ow...@gcc.gnu.org] On Behalf Of Jeff Law
>> Sent: Wednesday, October 18, 2017 1:42 AM
>> To: Tsimbalist, Igor V ; gcc-
>> patc...@gcc.gnu.org
>> Cc: enkovich@gmail.com
>> Subject: Re: [PATCH 18/22] Enable building libmpx with Intel CET
>>
>> On 10/12/2017 02:36 PM, Tsimbalist, Igor V wrote:
>>> Enable building libmpx with Intel CET options.
>>>
>>> libmpx/
>>> * Makefile.in: Regenerate.
>>> * acinclude.m4: Add enable.m4 and cet.m4.
>>> * configure: Regenerate.
>>> * configure.ac: Set CET_FLAGS. Update XCFLAGS.
>>> * mpxrt/Makefile.am: Update libmpx_la_CFLAGS.
>>> * mpxrt/Makefile.in: Regenerate.
>>> * mpxwrap/Makefile.am: Add AM_CFLAGS. Update
>>> * libmpxwrappers_la_CFLAGS.
>>> * mpxwrap/Makefile.in: Regenerate.
>>>
>>
>> Same comments as the libcilkrts changes.
>>
>> Jeff
> 
> 0018-Enable-building-libmpx-with-Intel-CET.PATCH
> 
> 
> From 6485e27ad096f520b749fbfd8ae7922f09a31879 Mon Sep 17 00:00:00 2001
> From: Igor Tsimbalist 
> Date: Tue, 15 Aug 2017 21:43:04 +0300
> Subject: [PATCH 18/21] Enable building libmpx with Intel CET
> 
> libmpx/
>   * Makefile.in: Regenerate.
>   * acinclude.m4: Add enable.m4 and cet.m4.
>   * configure: Regenerate.
>   * configure.ac: Set CET_FLAGS. Update XCFLAGS.
>   * mpxrt/Makefile.am: Update libmpx_la_CFLAGS.
>   * mpxrt/Makefile.in: Regenerate.
>   * mpxwrap/Makefile.am: Add AM_CFLAGS. Update
>   libmpxwrappers_la_CFLAGS.
>   * mpxwrap/Makefile.in: Regenerate.

OK.  Please wait to commit until entire set is ACK'd.
Jeff


Re: [PATCH 19/22] Enable building libgfortran with Intel CET

2017-11-08 Thread Jeff Law
On 10/31/2017 06:16 AM, Tsimbalist, Igor V wrote:
> The revised patch is attached. The differences are in what options are 
> defined and propagated to Makefile for CET enabling.
>  
> Ok for trunk?
> 
> Igor
> 
> 
>> -Original Message-
>> From: Jeff Law [mailto:l...@redhat.com]
>> Sent: Wednesday, October 18, 2017 1:43 AM
>> To: Tsimbalist, Igor V ; gcc-
>> patc...@gcc.gnu.org
>> Subject: Re: [PATCH 19/22] Enable building libgfortran with Intel CET
>>
>> On 10/12/2017 03:17 PM, Tsimbalist, Igor V wrote:
>>> Enable building libgfortran with Intel CET options.
>>>
>>> libgfortran/
>>> * acinclude.m4: Add enable.m4, cet.m4.
>>> * configure: Regenerate.
>>> * configure.ac: Set CET_FLAGS. Update AM_FCFLAGS, AM_CFLAGS,
>>> CFLAGS.
>>>
>> Same comments as the libcilkrts changes.
>>
>> Jeff
> 
> 0019-Enable-building-libgfortran-with-Intel-CET.PATCH
> 
> 
> From ee560ffcd8014d5c39914a5548b7e13c20a619af Mon Sep 17 00:00:00 2001
> From: "H.J. Lu" 
> Date: Wed, 16 Aug 2017 10:06:00 -0700
> Subject: [PATCH 19/21] Enable building libgfortran with Intel CET
> 
> libgfortran/
>   * acinclude.m4: Add enable.m4, cet.m4.
>   * Makefile.in: Regenerate.
>   * configure: Likewise.
>   * configure.ac: Set CET_FLAGS. Update AM_FCFLAGS,
>   AM_CFLAGS, CFLAGS.
OK.  Please wait to commit until entire set is ACK'd.



Re: [PATCH 17/22] Enable building libquadmath with Intel CET

2017-11-08 Thread Jeff Law
On 10/31/2017 06:13 AM, Tsimbalist, Igor V wrote:
> The revised patch is attached. The differences are in what options are 
> defined and propagated to Makefile for CET enabling.
>  
> Ok for trunk?
> 
> Igor
> 
> 
>> -Original Message-
>> From: Jeff Law [mailto:l...@redhat.com]
>> Sent: Wednesday, October 18, 2017 1:41 AM
>> To: Tsimbalist, Igor V ; gcc-
>> patc...@gcc.gnu.org
>> Cc: ja...@redhat.com
>> Subject: Re: [PATCH 17/22] Enable building libquadmath with Intel CET
>>
>> On 10/12/2017 02:34 PM, Tsimbalist, Igor V wrote:
>>> Enable building libquadmath with Intel CET options.
>>>
>>> libquadmath/
>>> * Makefile.am: Update AM_CFLAGS.
>>> * Makefile.in: Regenerate:
>>> * acinclude.m4: Add enable.m4 and cet.m4.
>>> * configure: Regenerate.
>>> * configure.ac: Set CET_FLAGS. Update XCFLAGS.
>>>
>> Same comments as the libcilkrts changes.
>>
>> Jeff
> 
> 0017-Enable-building-libquadmath-with-Intel-CET.PATCH
> 
> 
> From 038f4ea88505a59911f48074fb358f5a64133125 Mon Sep 17 00:00:00 2001
> From: Igor Tsimbalist 
> Date: Tue, 15 Aug 2017 20:09:42 +0300
> Subject: [PATCH 17/21] Enable building libquadmath with Intel CET
> 
> libquadmath/
>   * Makefile.am: Update AM_CFLAGS.
>   * Makefile.in: Regenerate:
>   * acinclude.m4: Add enable.m4 and cet.m4.
>   * configure: Regenerate.
>   * configure.ac: Set CET_FLAGS. Update XCFLAGS.
OK.  Please wait to commit until entire set is ACK'd.

Jeff


Re: [001/nnn] poly_int: add poly-int.h

2017-11-08 Thread Martin Sebor

On 11/08/2017 09:51 AM, Richard Sandiford wrote:

Martin Sebor  writes:

On 11/08/2017 02:32 AM, Richard Sandiford wrote:

Martin Sebor  writes:

I haven't done nearly a thorough review but the dtor followed by
the placement new in the POLY_SET_COEFF() macro caught my eye so
I thought I'd ask sooner rather than later.  Given the macro
definition:

+   The dummy comparison against a null C * is just a way of checking
+   that C gives the right type.  */
+#define POLY_SET_COEFF(C, RES, I, VALUE) \
+  ((void) (&(RES).coeffs[0] == (C *) 0), \
+   wi::int_traits::precision_type == wi::FLEXIBLE_PRECISION \
+   ? (void) ((RES).coeffs[I] = VALUE) \
+   : (void) ((RES).coeffs[I].~C (), new (&(RES).coeffs[I]) C (VALUE)))

is the following use well-defined?

+template
+inline poly_int_pod&
+poly_int_pod::operator <<= (unsigned int a)
+{
+  POLY_SET_COEFF (C, *this, 0, this->coeffs[0] << a);

It looks to me as though the VALUE argument in the ctor invoked
by the placement new expression is evaluated after the dtor has
destroyed the very array element the VALUE argument expands to.


Good catch!  It should simply have been doing <<= on each coefficient --
I must have got carried away when converting to POLY_SET_COEFF.

I double-checked the other uses and think that's the only one.


Whether or not is, in fact, a problem, it seems to me that using
a function template rather than a macro would be a clearer and
safer way to do the same thing.  (Safer in that the macro also
evaluates its arguments multiple times, which is often a source
of subtle bugs.)


That would slow down -O0 builds though, by introducing an extra
function call and set of temporaries even when the coefficients
are primitive integers.


Would decorating the function template with attribute always_inline
help?


It would remove the call itself, but we'd still have the extra temporary
objects that were the function argument and return value.


Sorry, I do not want to get into another long discussion about
trade-offs between safety and efficiency but I'm not sure I see
what extra temporaries it would create.  It seems to me that
an inline function template that took arguments of user-defined
types by reference and others by value should be just as efficient
as a macro.

From GCC's own manual:

  6.43 An Inline Function is As Fast As a Macro
  https://gcc.gnu.org/onlinedocs/gcc/Inline.html

If that's not the case and there is a significant performance
penalty associated with inline functions at -O0 then GCC should
be fixed to avoid it.

Martin


Re: [001/nnn] poly_int: add poly-int.h

2017-11-08 Thread Martin Sebor

On 11/08/2017 09:51 AM, Richard Sandiford wrote:

Martin Sebor  writes:

On 11/08/2017 02:32 AM, Richard Sandiford wrote:

Martin Sebor  writes:

I haven't done nearly a thorough review but the dtor followed by
the placement new in the POLY_SET_COEFF() macro caught my eye so
I thought I'd ask sooner rather than later.  Given the macro
definition:

+   The dummy comparison against a null C * is just a way of checking
+   that C gives the right type.  */
+#define POLY_SET_COEFF(C, RES, I, VALUE) \
+  ((void) (&(RES).coeffs[0] == (C *) 0), \
+   wi::int_traits::precision_type == wi::FLEXIBLE_PRECISION \
+   ? (void) ((RES).coeffs[I] = VALUE) \
+   : (void) ((RES).coeffs[I].~C (), new (&(RES).coeffs[I]) C (VALUE)))

is the following use well-defined?

+template
+inline poly_int_pod&
+poly_int_pod::operator <<= (unsigned int a)
+{
+  POLY_SET_COEFF (C, *this, 0, this->coeffs[0] << a);

It looks to me as though the VALUE argument in the ctor invoked
by the placement new expression is evaluated after the dtor has
destroyed the very array element the VALUE argument expands to.


Good catch!  It should simply have been doing <<= on each coefficient --
I must have got carried away when converting to POLY_SET_COEFF.

I double-checked the other uses and think that's the only one.


Whether or not is, in fact, a problem, it seems to me that using
a function template rather than a macro would be a clearer and
safer way to do the same thing.  (Safer in that the macro also
evaluates its arguments multiple times, which is often a source
of subtle bugs.)


That would slow down -O0 builds though, by introducing an extra
function call and set of temporaries even when the coefficients
are primitive integers.


Would decorating the function template with attribute always_inline
help?


It would remove the call itself, but we'd still have the extra temporary
objects that were the function argument and return value.


Sorry, I do not want to get into another long discussion about
trade-offs between safety and efficiency but I'm not sure I see
what extra temporaries it would create.  It seems to me that
an inline function template that took arguments of user-defined
types by reference and others by value should be just as efficient
as a macro.

From GCC's own manual:

  6.43 An Inline Function is As Fast As a Macro
  https://gcc.gnu.org/onlinedocs/gcc/Inline.html

If that's not the case and there is a significant performance
penalty associated with inline functions at -O0 then GCC should
be fixed to avoid it.

Martin


Re: [PATCH 16/22] Enable building libssp with Intel CET

2017-11-08 Thread Jeff Law
On 10/31/2017 06:12 AM, Tsimbalist, Igor V wrote:
> The revised patch is attached. The differences are in what options are 
> defined and propagated to Makefile for CET enabling.
>  
> Ok for trunk?
> 
> Igor
> 
> 
>> -Original Message-
>> From: Jeff Law [mailto:l...@redhat.com]
>> Sent: Wednesday, October 18, 2017 1:38 AM
>> To: Tsimbalist, Igor V ; gcc-
>> patc...@gcc.gnu.org
>> Subject: Re: [PATCH 16/22] Enable building libssp with Intel CET
>>
>> On 10/12/2017 02:31 PM, Tsimbalist, Igor V wrote:
>>> Enable building libssp with Intel CET options.
>>>
>>> libssp/
>>> * Makefile.am: Update AM_CFLAGS.
>>> * Makefile.in: Regenerate.
>>> * configure: Likewise.
>>> * aclocal.m4: Likewise.
>>> * configure.ac: Set CET_FLAGS. Update XCFLAGS.
>>>
>> Same comments as with libcilkrts.
>> Jeff
> 
> 0016-Enable-building-libssp-with-Intel-CET.PATCH
> 
> 
> From f60e80874cab302d11d877a2d362f65714612bc8 Mon Sep 17 00:00:00 2001
> From: Igor Tsimbalist 
> Date: Tue, 15 Aug 2017 19:55:42 +0300
> Subject: [PATCH 16/21] Enable building libssp with Intel CET
> 
> libssp/
>   * Makefile.am: Update AM_CFLAGS, update
>   libssp_nonshared_la_CFLAGS.
>   * Makefile.in: Regenerate.
>   * configure: Likewise.
>   * aclocal.m4: Likewise.
>   * configure.ac: Set CET_FLAGS. Update XCFLAGS.
OK.  Please wait to commit until entire set is ACK'd.

Jeff


[Ada] Implementation of AI12-0127 : delta aggregate

2017-11-08 Thread Pierre-Marie de Rodat
This patch updates the implementation of Ada2020 delta aggregatesa to
conform to the latest version of AI12-0127. this patch adds checks to
reject statically delta aggregates that specify values for components
that appear in different variants of a record type.

Compiling test2.adb in Ada2020 mode must yield:

   test2.adb:31:51: "F3" and "F4" appear in different variants
   test2.adb:32:51: "F3" and "F2" appear in different variants
   test2.adb:33:39: "F3" and "F2" appear in different variants
   test2.adb:34:39: type subtype of "T" has no component with this name

---
procedure Test2 is
  type T (Disc1, Disc2 : Boolean := True) is
 record
F1 : Integer;
case Disc1 is
   when False =>
   F4 : Float;
   case Disc2 is
 when False =>
null;
 when True =>
F2 : Integer;
  end case;
   when True =>
   case Disc2 is
 when False =>
null;
 when True =>
F3 : Integer;
  end case;
end case;
 end record;

 procedure Munge (X : in out T) is
 begin
-- illegal aggregate; F2 and F3 are declared within (albeit not
 -- immediately within) different variants of the same variant
-- part.
X := (X with delta F2 => 123, F1 => 456);  --  OK
X := (X with delta F2 => 123, F4 => 3.14);  --  OK
X := (X with delta F4 => 3.14, F2 => 444, F3 => 0);  --  ERROR
X := (X with delta F2 => 314, F4 => 4.44, F3 => 0);  --  ERROR
X := (X with delta F2 => 123, F3 => 456);  --  ERROR
X := (X with delta F2 => 123, F5 => 456);  --  ERROR
 end Munge;

 X : T;
  begin
 Munge (X);
  end Test2;

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-11-08  Ed Schonberg  

* sem_aggr.adb (Resolve_Delta_Aggregate): Divide into the
following separate procedures.
(Resolve_Delta_Array_Aggregate): Previous code form
Resolve_Delta_Aggregate.
(Resolve_Delta_Record_Aggregate): Extend previous code to cover latest
ARG decisions on the legality rules for delta aggregates for records:
in the case of a variant record, components from different variants
cannot be specified in the delta aggregate, and this must be checked
statically.

Index: sem_aggr.adb
===
--- sem_aggr.adb(revision 254542)
+++ sem_aggr.adb(working copy)
@@ -418,6 +418,13 @@
--  array of characters is expected. This procedure simply rewrites the
--  string as an aggregate, prior to resolution.
 
+   -
+   --  Delta aggregate processing --
+   -
+
+   procedure Resolve_Delta_Array_Aggregate  (N : Node_Id; Typ : Entity_Id);
+   procedure Resolve_Delta_Record_Aggregate (N : Node_Id; Typ : Entity_Id);
+

-- Array_Aggr_Subtype --

@@ -2759,143 +2766,278 @@
 
procedure Resolve_Delta_Aggregate (N : Node_Id; Typ : Entity_Id) is
   Base   : constant Node_Id := Expression (N);
+
+   begin
+  if not Is_Composite_Type (Typ) then
+ Error_Msg_N ("not a composite type", N);
+  end if;
+
+  Analyze_And_Resolve (Base, Typ);
+
+  if Is_Array_Type (Typ) then
+ Resolve_Delta_Array_Aggregate (N, Typ);
+  else
+ Resolve_Delta_Record_Aggregate (N, Typ);
+  end if;
+
+  Set_Etype (N, Typ);
+   end Resolve_Delta_Aggregate;
+
+   ---
+   -- Resolve_Delta_Array_Aggregate --
+   ---
+
+   procedure Resolve_Delta_Array_Aggregate (N : Node_Id; Typ : Entity_Id) is
   Deltas : constant List_Id := Component_Associations (N);
+  Assoc  : Node_Id;
+  Choice : Node_Id;
+  Index_Type : Entity_Id;
 
-  function Get_Component_Type (Nam : Node_Id) return Entity_Id;
+   begin
+  Index_Type := Etype (First_Index (Typ));
+  Assoc := First (Deltas);
+  while Present (Assoc) loop
+ if Nkind (Assoc) = N_Iterated_Component_Association then
+Choice := First (Choice_List (Assoc));
+while Present (Choice) loop
+   if Nkind (Choice) = N_Others_Choice then
+  Error_Msg_N
+("others not allowed in delta aggregate", Choice);
 
-  
-  -- Get_Component_Type --
-  
+   else
+  Analyze_And_Resolve (Choice, Index_Type);
+   end if;
 
-  function Get_Component_Type (Nam : Node_Id) return Entity_Id is
- Comp : Entity_Id;
+   Next (Choice);
+end loop;
 
-  begin
- 

[Ada] Warn on missing deallocation of coextension

2017-11-08 Thread Pierre-Marie de Rodat
This patch adds an informational warning to alert the user to the fact that
GNAT currently mishandles coextensions and that they will not be finalized or
deallocated with their respective owners in some as they should according
to RM 13.11.2 (9/3).


-- Source --


--  types.ads

with Ada.Finalization; use Ada.Finalization;

package Types is
   type Ctrl_Discr is new Controlled with record
  Id : Natural;
   end record;

   type Ctrl_Discr_Ptr is access all Ctrl_Discr;

   procedure Finalize (Obj : in out Ctrl_Discr);
   procedure Initialize (Obj : in out Ctrl_Discr);

   type Discr_B is null record;

   type Discr_B_Ptr is access all Discr_B;

   type Ctrl_Owner_B (Discr : access Discr_B) is new Controlled with record
  Id : Natural;
   end record;

   type Ctrl_Owner_B_Ptr is access all Ctrl_Owner_B;

   procedure Finalize (Obj : in out Ctrl_Owner_B);
   procedure Initialize (Obj : in out Ctrl_Owner_B);

   type Ctrl_Owner (Discr : access Ctrl_Discr) is new Controlled with record
  Id : Natural;
   end record;

   type Ctrl_Owner_Ptr is access all Ctrl_Owner;

   procedure Finalize (Obj : in out Ctrl_Owner);
   procedure Initialize (Obj : in out Ctrl_Owner);

   type Owner (Discr : access Ctrl_Discr) is null record;

   type Owner_Ptr is access all Owner;

   type Owner_B (Discr : access Discr_B) is null record;

   type Owner_B_Ptr is access all Owner_B;

   function New_Id return Natural;
end Types;

--  types.adb

with Ada.Text_IO; use Ada.Text_IO;

package body Types is
   Id_Gen : Natural := 0;

   procedure Finalize (Obj : in out Ctrl_Discr) is
   begin
  Put_Line ("  fin Discr:" & Obj.Id'Img);
  Obj.Id := 0;
   end Finalize;

   procedure Finalize (Obj : in out Ctrl_Owner) is
   begin
  Put_Line ("  fin Ctrl_Owner:" & Obj.Id'Img);
  Obj.Id := 0;
   end Finalize;

   procedure Finalize (Obj : in out Ctrl_Owner_B) is
   begin
  Put_Line ("  fin Ctrl_Owner_B:" & Obj.Id'Image);
  Obj.Id := 0;
   end;

   procedure Initialize (Obj : in out Ctrl_Discr) is
   begin
  Obj.Id := New_Id;
  Put_Line ("  ini Discr:" & Obj.Id'Img);
   end Initialize;

   procedure Initialize (Obj : in out Ctrl_Owner) is
   begin
  Obj.Id := New_Id;
  Put_Line ("  ini Ctrl_Owner:" & Obj.Id'Img);
   end Initialize;

   procedure Initialize (Obj : in out Ctrl_Owner_B) is
   begin
  Obj.Id := New_Id;
  Put_Line ("  ini Ctrl_Owner_B:" & Obj.Id'Img);
   end Initialize;

   function New_Id return Natural is
   begin
  Id_Gen := Id_Gen + 1;
  return Id_Gen;
   end New_Id;
end Types;

--  main.adb

with Ada.Finalization; use Ada.Finalization;
with Ada.Text_IO;  use Ada.Text_IO;
with Ada.Unchecked_Deallocation;
with Types;use Types;

procedure Main is
   procedure Free is
 new Ada.Unchecked_Deallocation (Ctrl_Owner, Ctrl_Owner_Ptr);
   procedure Free is
 new Ada.Unchecked_Deallocation (Owner, Owner_Ptr);
   procedure Free is
 new Ada.Unchecked_Deallocation (Ctrl_Owner_B, Ctrl_Owner_B_Ptr);
   procedure Free is
 new Ada.Unchecked_Deallocation (Owner_B, Owner_B_Ptr);

begin
   Put_Line ("Ctrl_Owner named access - non-controlled discr");

   declare
  D_Ptr_1 : constant Discr_B_Ptr:= new Discr_B;
  D_Ptr_2 : constant access Discr_B := new Discr_B;

  O_Ptr_1 : Ctrl_Owner_B_Ptr :=
  new Ctrl_Owner_B'(Controlled with Discr => new Discr_B,
Id=> New_Id);

  O_Ptr_2 : Ctrl_Owner_B_Ptr :=
  new Ctrl_Owner_B'(Controlled with Discr => D_Ptr_1,
Id=> New_Id);

  O_Ptr_3 : Ctrl_Owner_B_Ptr :=
  new Ctrl_Owner_B'(Controlled with Discr => D_Ptr_2,
Id=> New_Id);
   begin
  Free (O_Ptr_1);
  Free (O_Ptr_2);
  Free (O_Ptr_3);
   end;

   Put_Line ("Ctrl_Owner anonymous access - non-controlled discr");

   declare
  D_Ptr_1 : constant Discr_B_Ptr:= new Discr_B;
  D_Ptr_2 : constant access Discr_B := new Discr_B;

  O_Ptr_1 : access Ctrl_Owner_B :=
  new Ctrl_Owner_B'(Controlled with Discr => new Discr_B,
Id=> New_Id);

  O_Ptr_2 : access Ctrl_Owner_B :=
  new Ctrl_Owner_B'(Controlled with Discr => D_Ptr_1,
Id=> New_Id);

  O_Ptr_3 : access Ctrl_Owner_B :=
  new Ctrl_Owner_B'(Controlled with Discr => D_Ptr_2,
Id=> New_Id);
   begin
  Free (O_Ptr_1);
  Free (O_Ptr_2);
  Free (O_Ptr_3);
   end;

   Put_Line ("Owner named access - non-controlled discr");

   declare
  D_Ptr_1 : constant Discr_B_Ptr:= new Discr_B;
  D_Ptr_2 : constant access Discr_B := new Discr_B;

  O_Ptr_1 : Owner_B_Ptr := new Owner_B'(Discr => new Discr_B);
 

[Ada] Crash on access-to-object in SPARK

2017-11-08 Thread Pierre-Marie de Rodat
This patch corrects the light expansion of object renamings for SPARK to
prevent a crash by querying the subtype mark when the renaming carries an
access definition.

-
-- Sources --
-

--  p.ads

package P with SPARK_Mode is
   type T is record
  Ptr : access constant Integer;
   end record;

   function Get (X : T) return Integer;
end P;

--  p.adb

package body P with SPARK_Mode is
   function Get (X : T) return Integer is
  Proxy : access constant Integer renames X.Ptr;
   begin
  return Proxy.all;
   end;
end P;

-
-- Compilation --
-

$ gcc -c p.adb
$ gcc -c p.adb -gnatd.F

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-11-08  Hristian Kirtchev  

* exp_spark.adb (Expand_SPARK_N_Object_Renaming_Declaration): Obtain
the type of the renaming from its defining entity, rather then the
subtype mark as there may not be a subtype mark.

Index: exp_spark.adb
===
--- exp_spark.adb   (revision 254542)
+++ exp_spark.adb   (working copy)
@@ -349,7 +349,7 @@
   Loc: constant Source_Ptr := Sloc (N);
   Obj_Id : constant Entity_Id  := Defining_Entity (N);
   Nam: constant Node_Id:= Name (N);
-  Typ: constant Entity_Id  := Etype (Subtype_Mark (N));
+  Typ: constant Entity_Id  := Etype (Obj_Id);
 
begin
   --  Transform a renaming of the form


Re: [PATCH 15/22] Enable building libvtv with Intel CET

2017-11-08 Thread Jeff Law
On 10/31/2017 06:10 AM, Tsimbalist, Igor V wrote:
> The revised patch is attached. The differences are in what options are 
> defined and propagated to Makefile for CET enabling.
>  
> Ok for trunk?
> 
> Igor
> 
> 
>> -Original Message-
>> From: Jeff Law [mailto:l...@redhat.com]
>> Sent: Wednesday, October 18, 2017 1:38 AM
>> To: Tsimbalist, Igor V ; gcc-
>> patc...@gcc.gnu.org
>> Cc: cmt...@google.com
>> Subject: Re: [PATCH 15/22] Enable building libvtv with Intel CET
>>
>> On 10/12/2017 02:29 PM, Tsimbalist, Igor V wrote:
>>> Enable building libvtv with Intel CET options.
>>>
>>> libvtv/
>>> * acinclude.m4: Add enable.m4 and cet.m4.
>>> * libvtv/configure: Regenerate.
>>> * libvtv/configure.ac: Set CET_FLAGS. Update XCFLAGS.
>> Same comments as with libcilkrts.
>> Jeff
>>
> 
> 0015-Enable-building-libvtv-with-Intel-CET.PATCH
> 
> 
> From cf1746f369de4fcd78bba197b05ba78f8b4f0b4e Mon Sep 17 00:00:00 2001
> From: Igor Tsimbalist 
> Date: Tue, 15 Aug 2017 19:43:24 +0300
> Subject: [PATCH 15/21] Enable building libvtv with Intel CET
> 
> libvtv/
>   * acinclude.m4: Add enable.m4 and cet.m4.
>   * Makefile.in: Regenerate.
>   * testsuite/Makefile.in: Likewise.
>   * configure: Likewise.
>   * configure.ac: Set CET_FLAGS. Update XCFLAGS.
OK.  Please wait to commit until entire set is ACK'd.

Jeff


Re: [PATCH, rs6000] Fix scheduling description for quad-precision multiply instructions

2017-11-08 Thread Segher Boessenkool
Hi Pat,

On Wed, Nov 08, 2017 at 10:59:23AM -0600, Pat Haugen wrote:
>  The following patch creates a new insn type to annotate quad precision
> multiply instructions, updates the appropriate insns to use the new type
> and creates an entry in the Power9 machine description which describes
> the correct latency/resources. Bootstrap/regtest on powerpc64le-linux
> with no new regressions. Ok for trunk?

One question.

> --- gcc/config/rs6000/power9.md   (revision 254377)
> +++ gcc/config/rs6000/power9.md   (working copy)
> @@ -436,6 +436,12 @@ (define_insn_reservation "power9-qpdiv"
> (eq_attr "cpu" "power9"))
>"DU_super_power9,dfu_power9")
>  
> +(define_insn_reservation "power9-qpmul" 24
> +  (and (eq_attr "type" "qmul")
> +   (eq_attr "size" "128")
> +   (eq_attr "cpu" "power9"))
> +  "DU_super_power9,dfu_power9*12")

All other p9 reservations (other than integer div) do not reserve a
unit more than one cycle.  Will this blow up size of the automaton?
If not, should the other qp ops not reserve for more cycles, too?

Rest looks fine; okay for trunk with the reservation thing taken care
of one way or the other.  Thanks!


Segher


Re: [PATCH 12/22] Enable building libgomp with Intel CET

2017-11-08 Thread Jeff Law
On 10/31/2017 06:00 AM, Tsimbalist, Igor V wrote:
> The revised patch is attached. The differences are in what options are 
> defined and propagated to Makefile for CET enabling.
>  
> Ok for trunk?
> 
> Igor
> 
> 
>> -Original Message-
>> From: Jeff Law [mailto:l...@redhat.com]
>> Sent: Wednesday, October 18, 2017 1:36 AM
>> To: Tsimbalist, Igor V ; gcc-
>> patc...@gcc.gnu.org
>> Cc: ja...@redhat.com
>> Subject: Re: [PATCH 12/22] Enable building libgomp with Intel CET
>>
>> On 10/12/2017 02:20 PM, Tsimbalist, Igor V wrote:
>>> Enable building libgomp with CET options.
>>>
>>> libgomp/
>>> * configure.ac: Set CET_FLAGS, update XCFLAGS and FCFLAGS.
>>> * acinclude.m4: Add cet.m4.
>>> * configure: Regenerate.
>>> * Makefile.in: Likewise.
>>> * testsuite/Makefile.in: Likewise
>>>
>> Same comments as with libcilkrts.
>> Jeff
> 
> 0012-Enable-building-libgomp-with-Intel-CET.PATCH
> 
> 
> From cbca6dd666d914ba223f4561e136ce29d8c9d705 Mon Sep 17 00:00:00 2001
> From: Igor Tsimbalist 
> Date: Thu, 10 Aug 2017 15:54:41 +0300
> Subject: [PATCH 12/21] Enable building libgomp with Intel CET
> 
> libgomp/
>   * configure.ac: Set CET_FLAGS, update XCFLAGS and FCFLAGS.
>   * acinclude.m4: Add cet.m4.
>   * configure: Regenerate.
>   * Makefile.in: Likewise.
>   * testsuite/Makefile.in: Likewise.
OK.  Please wait to commit until entire set is ACK'd.
Jeff


Re: [PATCH 11/22] Enable building libatomic with Intel CET

2017-11-08 Thread Jeff Law
On 10/31/2017 05:58 AM, Tsimbalist, Igor V wrote:
> The revised patch is attached. The differences are in what options are 
> defined and propagated to Makefile for CET enabling.
>  
> Ok for trunk?
> 
> Igor
> 
> 
>> -Original Message-
>> From: Jeff Law [mailto:l...@redhat.com]
>> Sent: Wednesday, October 18, 2017 1:35 AM
>> To: Tsimbalist, Igor V ; gcc-
>> patc...@gcc.gnu.org
>> Subject: Re: [PATCH 11/22] Enable building libatomic with Intel CET
>>
>> On 10/12/2017 02:18 PM, Tsimbalist, Igor V wrote:
>>> Enable building libatomic with CET options.
>>>
>>> libatomic/
>>> * configure.ac: Set CET_FLAGS, update XCFLAGS.
>>> * acinclude.m4: Add cet.m4 and enable.m4.
>>> * configure: Regenerate.
>>> * Makefile.in: Likewise.
>>> * testsuite/Makefile.in: Likewise.
>>>
>> Same comments as with libcilkrts.
>> Jeff
> 
> 0011-Enable-building-libatomic-with-Intel-CET.PATCH
> 
> 
> From 7c93c6bad933e408e6ed6642843ea2556b23df3c Mon Sep 17 00:00:00 2001
> From: Igor Tsimbalist 
> Date: Thu, 10 Aug 2017 14:36:17 +0300
> Subject: [PATCH 11/21] Enable building libatomic with Intel CET
> 
> libatomic/
>   * configure.ac: Set CET_FLAGS, update XCFLAGS.
>   * acinclude.m4: Add cet.m4 and enable.m4.
>   * configure: Regenerate.
>   * Makefile.in: Likewise.
>   * testsuite/Makefile.in: Likewise.
OK.  Please wait to commit until entire set is ACK'd.

Jeff


Re: [PATCH 10/22] Enable building libcilkrts with Intel CET

2017-11-08 Thread Jeff Law
On 10/31/2017 05:54 AM, Tsimbalist, Igor V wrote:
> The revised patch is attached. The differences are in what options are 
> defined and propagated to Makefile for CET enabling.
> 
> Ok for trunk?
> 
> Igor
> 
> 
>> -Original Message-
>> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
>> ow...@gcc.gnu.org] On Behalf Of Jeff Law
>> Sent: Wednesday, October 18, 2017 1:34 AM
>> To: Tsimbalist, Igor V ; gcc-
>> patc...@gcc.gnu.org
>> Subject: Re: [PATCH 10/22] Enable building libcilkrts with Intel CET
>>
>> On 10/12/2017 02:13 PM, Tsimbalist, Igor V wrote:
>>> Enable building libcilkrts with CET options.
>>>
>>> libcilkrts/
>>> * Makefile.am: Add AM_CXXFLAGS and XCXXFLAGS.
>>> * configure.ac: Set CET_FLAGS, update XCFLAGS, XCXXFLAGS.
>>> * Makefile.in: Regenerate.
>>> * aclocal.m4: Likewise.
>>> * configure: Likewise.
>>>
>> So like the other patches in this space, the inclusion of cet.h seems
>> wrong.  I don't see why this should be needed here.
>>
>> It's OK with that bit removed and once any prereqs are OK'd.
>>
>> jeff
> 
> 0010-Enable-building-libcilkrts-with-Intel-CET.PATCH
> 
> 
> From db5978749148cd35c97145ea52e7a5e07121b27e Mon Sep 17 00:00:00 2001
> From: Igor Tsimbalist 
> Date: Thu, 10 Aug 2017 14:20:31 +0300
> Subject: [PATCH 10/21] Enable building libcilkrts with Intel CET
> 
> libcilkrts/
>   * Makefile.am: Add AM_CXXFLAGS.
>   * configure.ac: Set CET_FLAGS, update XCFLAGS, XCXXFLAGS.
>   * Makefile.in: Regenerate.
>   * aclocal.m4: Likewise.
>   * configure: Likewise.
OK.  Please wait to commit until entire set is ACK'd.

jeff


Re: [PATCH 09/22] Enable building libbacktrace with Intel CET

2017-11-08 Thread Jeff Law
On 10/31/2017 05:47 AM, Tsimbalist, Igor V wrote:
> The revised patch is attached. The differences are in what options are 
> defined and propagated to Makefile for CET enabling. -iclude option is 
> dropped, each needed asm file is processed separately.
> 
> Igor
> 
> 
[ ... ]


> 
> 
> From d41da8a9f48e5b98d5e8e0e43805734c822e0caa Mon Sep 17 00:00:00 2001
> From: Igor Tsimbalist 
> Date: Fri, 27 Oct 2017 15:56:39 +0300
> Subject: [PATCH 09/21] Enable building libbacktrace with Intel CET
> 
> libbacktrace/
>   * configure.ac: Add CET_FLAGS to EXTRA_FLAGS.
>   * aclocal.m4: Regenerate.
>   * Makefile.in: Likewise.
>   * configure: Likewise.
OK.  However, please wait to commit until the entire set is ACK'd.

Jeff


Re: [PATCH 07/22] Enable building libgcc with CET options.

2017-11-08 Thread Jeff Law
On 10/31/2017 05:29 AM, Tsimbalist, Igor V wrote:
> The revised patch is attached. The differences are in what options are 
> defined and propagated to Makefile for CET enabling, also needed asm files 
> are updated.
> 
[ ... ]

> 
> 0007-Enable-building-libgcc-with-CET-options.patch
> 
> 
> From df923f7e0ebee1f10136bb64f9c723f2d58f8f2a Mon Sep 17 00:00:00 2001
> From: Igor Tsimbalist 
> Date: Fri, 27 Oct 2017 15:44:56 +0300
> Subject: [PATCH 07/21] Enable building libgcc with CET options
> 
> Enable building libgcc with CET options by default on Linux/x86 if
> binutils supports CET v2.0.  It can be disabled with --disable-cet.
> It is an error to configure GCC with --enable-cet if bintuiils doesn't
> support CET v2.0.
> 
> ENDBR is added to __morestack_large_model since it is called indirectly.
> 
> config/
>   * cet.m4: New file.
> 
> gcc/
>   * config.gcc (extra_headers): Add cet.h for x86 targets.
>   * config/i386/cet.h: New file.
>   * doc/install.texi: Add --enable-cet/--disable-cet.
> 
> libgcc/
>   * Makefile.in (configure_deps): Add $(srcdir)/../config/cet.m4.
>   (CET_FLAGS): New.
>   * config/i386/morestack.S: Include .
>   (__morestack_large_model): Add _CET_ENDBR at function entrance.
>   * config/i386/resms64.h: Include .
>   * config/i386/resms64f.h: Likewise.
>   * config/i386/resms64fx.h: Likewise.
>   * config/i386/resms64x.h: Likewise.
>   * config/i386/savms64.h: Likewise.
>   * config/i386/savms64f.h: Likewise.
>   * config/i386/t-linux (HOST_LIBGCC2_CFLAGS): Add $(CET_FLAGS).
>   (CRTSTUFF_T_CFLAGS): Likewise.
>   * configure.ac: Include ../config/cet.m4.
>   Set and substitute CET_FLAGS.
>   * configure: Regenerated.
So the question I have WRT this patch is the default setting.  If I
understand it correctly, if the assembler supports the appropriate
insns, then we enable building target libraries with CET by default.

These libraries continue to work on older systems without CET
capabilities because the CET specific instructions are interpreted as
NOPs on older hardware, right?

What about cases where we're running on CET capable hardware, the main
program gets compiled without CET, but links against a libgcc with CET.
What happens in that case?

What triggers the use of CET vs interpreting those instructions as NOPs

I don't doubt y'all have already thought about these cases.  I just want
to make sure that I understand them and the implications before I ack
this patch.

Jeff


Re: [PATCH] Add LVAL argument to c_fully_fold* and propagate it through (PR c/66618, PR c/69960)

2017-11-08 Thread Joseph Myers
On Wed, 8 Nov 2017, Jakub Jelinek wrote:

> of course only if LVAL is false.  Additionally, I've added folding of
> "foo"[2] into 'o'.  We have it in gimple-fold.c or so, so that one

Note that if the 2 there comes from an overflowing expression that's not 
valid as an extension to constant expressions in initializers (that is,

static char c = "foo"[INT_MAX * -2];

should result in an error with -pedantic-errors because of the overflow, 
just as INT_MAX * -2 by itself wouldn't be a valid initializer in that 
case).

> Not sure about the COND_EXPR/VEC_COND_EXPR cases, right now I'm passing
> false as LVAL for the first operand (condition) and lval as LVAL for the
> other two (i.e. if called with lval == true on the whole *_COND_EXPR
> decl_constant_value_for_optimization etc. isn't performed on op1/op2, while
> without it it is).  Can one take address of the whole COND_EXPR, or
> have it on LHS of anything in C?  If not, perhaps I could just pass false
> in all the 3 calls.  Though, then likely it would be called with lval == false
> anyway.

Conditional and compound expressions are never valid lvalues in C.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Add LVAL argument to c_fully_fold* and propagate it through (PR c/66618, PR c/69960)

2017-11-08 Thread Jakub Jelinek
On Wed, Nov 08, 2017 at 05:42:07PM +0100, Marek Polacek wrote:
> > Not sure about the COND_EXPR/VEC_COND_EXPR cases, right now I'm passing
> > false as LVAL for the first operand (condition) and lval as LVAL for the
> > other two (i.e. if called with lval == true on the whole *_COND_EXPR
> > decl_constant_value_for_optimization etc. isn't performed on op1/op2, while
> > without it it is).  Can one take address of the whole COND_EXPR, or
> > have it on LHS of anything in C?  
> 
> I don't think so, the ?: operator does not yield an lvalue.

Ok, I'll pass false then.

> > @@ -218,12 +235,51 @@ c_fully_fold_internal (tree expr, bool i
> >op2 = TREE_OPERAND (expr, 2);
> >op3 = TREE_OPERAND (expr, 3);
> >op0 = c_fully_fold_internal (op0, in_init, maybe_const_operands,
> > -  maybe_const_itself, for_int_const);
> > +  maybe_const_itself, for_int_const, lval);
> >STRIP_TYPE_NOPS (op0);
> >op1 = c_fully_fold_internal (op1, in_init, maybe_const_operands,
> > -  maybe_const_itself, for_int_const);
> > +  maybe_const_itself, for_int_const, false);
> >STRIP_TYPE_NOPS (op1);
> > -  op1 = decl_constant_value_for_optimization (op1);
> > +  /* Fold "foo"[2] in initializers.  */
> > +  if (!lval
> > + && in_init
> > + && TREE_CODE (op0) == STRING_CST
> > + && TREE_CODE (op1) == INTEGER_CST
> > + && TREE_CODE (TREE_TYPE (op0)) == ARRAY_TYPE
> 
> Does this also handle 2["foobar"]?

Yes, because build_array_ref has:
2607  if (TREE_CODE (TREE_TYPE (array)) != ARRAY_TYPE
2608  && TREE_CODE (TREE_TYPE (array)) != POINTER_TYPE
2609  /* Allow vector[index] but not index[vector].  */
2610  && !VECTOR_TYPE_P (TREE_TYPE (array)))
2611{
2612  if (TREE_CODE (TREE_TYPE (index)) != ARRAY_TYPE
2613  && TREE_CODE (TREE_TYPE (index)) != POINTER_TYPE)
2614{
2615  error_at (loc,
2616"subscripted value is neither array nor pointer nor 
vector");
2617
2618  return error_mark_node;
2619}
2620  std::swap (array, index);
2621  swapped = true;
2622}


> > + && tree_fits_uhwi_p (op1))
> > +   {
> > + tree ary = op0;
> > + tree index = op1;
> > + unsigned len = 0;
> > + tree elem_type = TREE_TYPE (TREE_TYPE (ary));
> > + unsigned elem_nchars = (TYPE_PRECISION (elem_type)
> > + / TYPE_PRECISION (char_type_node));
> > + len = (unsigned) TREE_STRING_LENGTH (ary) / elem_nchars;
> > +
> > + tree nelts = array_type_nelts (TREE_TYPE (ary));
> > + bool dummy1 = true, dummy2 = true;
> > + nelts = c_fully_fold_internal (nelts, in_init, , ,
> > +for_int_const, false);
> > + unsigned HOST_WIDE_INT i = tree_to_uhwi (index);
> > + if (tree_int_cst_le (index, nelts)
> > + && i < len
> > + && i + elem_nchars <= len)
> > +   {
> > + tree type = TREE_TYPE (expr);
> > + if (elem_nchars == 1)
> > +   ret = build_int_cst (type,
> > +TREE_STRING_POINTER (ary)[i]);
> > + else
> > +   {
> > + const unsigned char *ptr
> > +   = ((const unsigned char *)TREE_STRING_POINTER (ary)
> > +  + i * elem_nchars);
> > + ret = native_interpret_expr (type, ptr, elem_nchars);
> > +   }
> > + if (ret)
> > +   goto out;
> > +   }
> > +   }
> 
> So this code comes from gimple-fold.c?  I would probably move it to a new
> function.

No, this comes from very simplified cp/constexpr.c (extract_string_elt,
cxx_eval_array_reference).  Can outline it into a separate function, sure.

Jakub


[PATCH, rs6000] Fix scheduling description for quad-precision multiply instructions

2017-11-08 Thread Pat Haugen
 The following patch creates a new insn type to annotate quad precision
multiply instructions, updates the appropriate insns to use the new type
and creates an entry in the Power9 machine description which describes
the correct latency/resources. Bootstrap/regtest on powerpc64le-linux
with no new regressions. Ok for trunk?

-Pat


2017-11-08  Pat Haugen  

* rs6000/power9.md (power9-qpmul): New.
* rs6000/rs6000.md ("type" attr): Add qmul.
(mul3, fma4_hw, *fms4_hw, *nfma4_hw,
*nfms4_hw, mul3_odd, fma4_odd, *fms4_odd,
*nfma4_odd, *nfms4_odd): Change type to qmul.


Index: gcc/config/rs6000/power9.md
===
--- gcc/config/rs6000/power9.md	(revision 254377)
+++ gcc/config/rs6000/power9.md	(working copy)
@@ -436,6 +436,12 @@ (define_insn_reservation "power9-qpdiv"
(eq_attr "cpu" "power9"))
   "DU_super_power9,dfu_power9")
 
+(define_insn_reservation "power9-qpmul" 24
+  (and (eq_attr "type" "qmul")
+   (eq_attr "size" "128")
+   (eq_attr "cpu" "power9"))
+  "DU_super_power9,dfu_power9*12")
+
 (define_insn_reservation "power9-mffgpr" 2
   (and (eq_attr "type" "mffgpr")
(eq_attr "cpu" "power9"))
Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md	(revision 254377)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -182,7 +182,7 @@ (define_attr "type"
cmp,
branch,jmpreg,mfjmpr,mtjmpr,trap,isync,sync,load_l,store_c,
cr_logical,delayed_cr,mfcr,mfcrf,mtcr,
-   fpcompare,fp,fpsimple,dmul,sdiv,ddiv,ssqrt,dsqrt,
+   fpcompare,fp,fpsimple,dmul,qmul,sdiv,ddiv,ssqrt,dsqrt,
vecsimple,veccomplex,vecdiv,veccmp,veccmpsimple,vecperm,
vecfloat,vecfdiv,vecdouble,mffgpr,mftgpr,crypto,
veclogical,veccmpfx,vecexts,vecmove,
@@ -14230,7 +14230,7 @@ (define_insn "mul3"
 	 (match_operand:IEEE128 2 "altivec_register_operand" "v")))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)"
   "xsmulqp %0,%1,%2"
-  [(set_attr "type" "vecfloat")
+  [(set_attr "type" "qmul")
(set_attr "size" "128")])
 
 (define_insn "div3"
@@ -14332,7 +14332,7 @@ (define_insn "fma4_hw"
 	 (match_operand:IEEE128 3 "altivec_register_operand" "0")))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)"
   "xsmaddqp %0,%1,%2"
-  [(set_attr "type" "vecfloat")
+  [(set_attr "type" "qmul")
(set_attr "size" "128")])
 
 (define_insn "*fms4_hw"
@@ -14344,7 +14344,7 @@ (define_insn "*fms4_hw"
 	  (match_operand:IEEE128 3 "altivec_register_operand" "0"]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)"
   "xsmsubqp %0,%1,%2"
-  [(set_attr "type" "vecfloat")
+  [(set_attr "type" "qmul")
(set_attr "size" "128")])
 
 (define_insn "*nfma4_hw"
@@ -14356,7 +14356,7 @@ (define_insn "*nfma4_hw"
 	  (match_operand:IEEE128 3 "altivec_register_operand" "0"]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)"
   "xsnmaddqp %0,%1,%2"
-  [(set_attr "type" "vecfloat")
+  [(set_attr "type" "qmul")
(set_attr "size" "128")])
 
 (define_insn "*nfms4_hw"
@@ -14369,7 +14369,7 @@ (define_insn "*nfms4_hw"
 	   (match_operand:IEEE128 3 "altivec_register_operand" "0")]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)"
   "xsnmsubqp %0,%1,%2"
-  [(set_attr "type" "vecfloat")
+  [(set_attr "type" "qmul")
(set_attr "size" "128")])
 
 (define_insn "extend2_hw"
@@ -14644,7 +14644,7 @@ (define_insn "mul3_odd"
 	 UNSPEC_MUL_ROUND_TO_ODD))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)"
   "xsmulqpo %0,%1,%2"
-  [(set_attr "type" "vecfloat")
+  [(set_attr "type" "qmul")
(set_attr "size" "128")])
 
 (define_insn "div3_odd"
@@ -14677,7 +14677,7 @@ (define_insn "fma4_odd"
 	 UNSPEC_FMA_ROUND_TO_ODD))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)"
   "xsmaddqpo %0,%1,%2"
-  [(set_attr "type" "vecfloat")
+  [(set_attr "type" "qmul")
(set_attr "size" "128")])
 
 (define_insn "*fms4_odd"
@@ -14690,7 +14690,7 @@ (define_insn "*fms4_odd"
 	 UNSPEC_FMA_ROUND_TO_ODD))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)"
   "xsmsubqpo %0,%1,%2"
-  [(set_attr "type" "vecfloat")
+  [(set_attr "type" "qmul")
(set_attr "size" "128")])
 
 (define_insn "*nfma4_odd"
@@ -14703,7 +14703,7 @@ (define_insn "*nfma4_odd"
 	  UNSPEC_FMA_ROUND_TO_ODD)))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)"
   "xsnmaddqpo %0,%1,%2"
-  [(set_attr "type" "vecfloat")
+  [(set_attr "type" "qmul")
(set_attr "size" "128")])
 
 (define_insn "*nfms4_odd"
@@ -14717,7 +14717,7 @@ (define_insn "*nfms4_odd"
 	  UNSPEC_FMA_ROUND_TO_ODD)))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)"
   "xsnmsubqpo %0,%1,%2"
-  [(set_attr "type" "vecfloat")
+  [(set_attr "type" "qmul")
(set_attr "size" "128")])
 
 (define_insn "truncdf2_odd"


[PING] [PATCH] Add a warning for invalid function casts

2017-11-08 Thread Bernd Edlinger
Ping...

for the C++ part of this patch:

https://gcc.gnu.org/ml/gcc-patches/2017-10/msg00559.html


Thanks
Bernd.

> On 10/10/17 00:30, Bernd Edlinger wrote:
>> On 10/09/17 20:34, Martin Sebor wrote:
>>> On 10/09/2017 11:50 AM, Bernd Edlinger wrote:
 On 10/09/17 18:44, Martin Sebor wrote:
> On 10/07/2017 10:48 AM, Bernd Edlinger wrote:
>> Hi!
>>
>> I think I have now something useful, it has a few more heuristics
>> added, to reduce the number of false-positives so that it
>> is able to find real bugs, for instance in openssl it triggers
>> at a function cast which has already a TODO on it.
>>
>> The heuristics are:
>> - handle void (*)(void) as a wild-card function type.
>> - ignore volatile, const qualifiers on parameters/return.
>> - handle any pointers as equivalent.
>> - handle integral types, enums, and booleans of same precision
>>     and signedness as equivalent.
>> - stop parameter validation at the first "...".
>
> These sound quite reasonable to me.  I have a reservation about
> just one of them, and some comments about other aspects of the
> warning.  Sorry if this seems like a lot.  I'm hoping you'll
> find the feedback constructive.
>
> I don't think using void(*)(void) to suppress the warning is
> a robust solution because it's not safe to call a function that
> takes arguments through such a pointer (especially not if one
> or more of the arguments is a pointer).  Depending on the ABI,
> calling a function that expects arguments with none could also
> mess up the stack as the callee may pop arguments that were
> never passed to it.
>

 This is of course only a heuristic, and if there is no warning
 that does not mean any guarantee that there can't be a problem
 at runtime.  The heuristic is only meant to separate the
 bad from the very bad type-cast.  In my personal opinion there
 is not a single good type cast.
>>>
>>> I agree.  Since the warning uses one kind of a cast as an escape
>>> mechanism from the checking it should be one whose result can
>>> the most likely be used to call the function without undefined
>>> behavior.
>>>
>>> Since it's possible to call any function through a pointer to
>>> a function with no arguments (simply by providing arguments of
>>> matching types) it's a reasonable candidate.
>>>
>>> On the other hand, since it is not safe to call an arbitrary
>>> function through void (*)(void), it's not as good a candidate.
>>>
>>> Another reason why I think a protoype-less function is a good
>>> choice is because the alias and ifunc attributes already use it
>>> as an escape mechanism from their type incompatibility warning.
>>>
>>
>> I know of pre-existing code-bases where a type-cast to type:
>> void (*) (void);
>>
>> .. is already used as a generic function pointer: libffi and
>> libgo, I would not want to break these.
>>
>> Actually when I have a type:
>> X (*) (...);
>>
>> I would like to make sure that the warning checks that
>> only functions returning X are assigned.
>>
>> and for X (*) (Y, );
>>
>> I would like to check that anything returning X with
>> first argument of type Y is assigned.
>>
>> There are code bases where such a scheme is used.
>> For instance one that I myself maintain: the OPC/UA AnsiC Stack,
>> where I have this type definition:
>>
>> typedef OpcUa_StatusCode (OpcUa_PfnInvokeService)(OpcUa_Endpoint
>> hEndpoint, ...);
>>
>> And this plays well together with this warning, because only
>> functions are assigned that match up to the ...);
>> Afterwards this pointer is cast back to the original signature,
>> so everything is perfectly fine.
>>
>> Regarding the cast from pointer to member to function, I see also a
>> warning without -Wpedantic:
>> Warnung: converting from »void (S::*)(int*)« to »void (*)(int*)«
>> [-Wpmf-conversions]
>>  F *pf = (F*)::foo;
>>  ^~~
>>
>> And this one is even default-enabled, so I think that should be
>> more than sufficient.
>>
>> I also changed the heuristic, so that your example with the enum should
>> now work.  I did not add it to the test case, because it would
>> break with -fshort-enums :(
>>
>> Attached I have an updated patch that extends this warning to the
>> pointer-to-member function cast, and relaxes the heuristic on the
>> benign integral type differences a bit further.
>>
>>
>> Is it OK for trunk after bootstrap and reg-testing?
>>
>>
>> Thanks
>> Bernd.
>>


[Ada] Accessibility violation flagged on anonymous access component

2017-11-08 Thread Pierre-Marie de Rodat
The compiler was incorrectly generating an unconditional raise of
Program_Error (and associated warnings) for returning a component with an
anonymous access type within a function of a generic instance.  The code
that was performing this check was intended to apply only to anonymous
access discriminants (as indicated by comments) but neglected to test
whether the selected component was a discriminant. This was ancient code
that was implemented prior to Ada 2005's additions of anonymous access
types for components and stand-alone objects. The fix refines the test
so that it only applies to access discriminants.

The following test must compile quietly:

$ gcc -c anonymous_instance.ads


generic
package Anonymous_Generic is
   type Holder is tagged limited private;

   function Get_Ref_Named (This : in Holder) return not null access Integer;
   function Get_Ref_Anon  (This : in Holder) return not null access Integer;

private
   type Integer_Access is access all Integer;

   type Holder is tagged limited record
  M_Named : Integer_Access := new Integer'(1);
  M_Anon  : access Integer := new Integer'(2);
   end record;
end Anonymous_Generic;


package body Anonymous_Generic is
   function Get_Ref_Named (This : in Holder) return not null access Integer is
   begin
  return This.M_Named;
   end Get_Ref_Named;

   function Get_Ref_Anon (This : in Holder) return not null access Integer is
   begin
  return This.M_Anon;
   end Get_Ref_Anon;
end Anonymous_Generic;


with Anonymous_Generic;

package Anonymous_Instance is new Anonymous_Generic;

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-11-08  Gary Dismukes  

* exp_ch4.adb (Expand_N_Type_Conversion): Add test that the selector
name is a discriminant in check for unconditional accessibility
violation within instances.

Index: exp_ch4.adb
===
--- exp_ch4.adb (revision 254542)
+++ exp_ch4.adb (working copy)
@@ -11279,6 +11279,7 @@
  elsif In_Instance_Body
and then Ekind (Operand_Type) = E_Anonymous_Access_Type
and then Nkind (Operand) = N_Selected_Component
+   and then Ekind (Entity (Selector_Name (Operand))) = E_Discriminant
and then Object_Access_Level (Operand) >
   Type_Access_Level (Target_Type)
  then


Re: [001/nnn] poly_int: add poly-int.h

2017-11-08 Thread Richard Sandiford
Martin Sebor  writes:
> On 11/08/2017 02:32 AM, Richard Sandiford wrote:
>> Martin Sebor  writes:
>>> I haven't done nearly a thorough review but the dtor followed by
>>> the placement new in the POLY_SET_COEFF() macro caught my eye so
>>> I thought I'd ask sooner rather than later.  Given the macro
>>> definition:
>>>
>>> +   The dummy comparison against a null C * is just a way of checking
>>> +   that C gives the right type.  */
>>> +#define POLY_SET_COEFF(C, RES, I, VALUE) \
>>> +  ((void) (&(RES).coeffs[0] == (C *) 0), \
>>> +   wi::int_traits::precision_type == wi::FLEXIBLE_PRECISION \
>>> +   ? (void) ((RES).coeffs[I] = VALUE) \
>>> +   : (void) ((RES).coeffs[I].~C (), new (&(RES).coeffs[I]) C (VALUE)))
>>>
>>> is the following use well-defined?
>>>
>>> +template
>>> +inline poly_int_pod&
>>> +poly_int_pod::operator <<= (unsigned int a)
>>> +{
>>> +  POLY_SET_COEFF (C, *this, 0, this->coeffs[0] << a);
>>>
>>> It looks to me as though the VALUE argument in the ctor invoked
>>> by the placement new expression is evaluated after the dtor has
>>> destroyed the very array element the VALUE argument expands to.
>>
>> Good catch!  It should simply have been doing <<= on each coefficient --
>> I must have got carried away when converting to POLY_SET_COEFF.
>>
>> I double-checked the other uses and think that's the only one.
>>
>>> Whether or not is, in fact, a problem, it seems to me that using
>>> a function template rather than a macro would be a clearer and
>>> safer way to do the same thing.  (Safer in that the macro also
>>> evaluates its arguments multiple times, which is often a source
>>> of subtle bugs.)
>>
>> That would slow down -O0 builds though, by introducing an extra
>> function call and set of temporaries even when the coefficients
>> are primitive integers.
>
> Would decorating the function template with attribute always_inline
> help?

It would remove the call itself, but we'd still have the extra temporary
objects that were the function argument and return value.

Thanks,
Richard


Protect against min_profitable_iters going negative

2017-11-08 Thread Richard Sandiford
We had:

  if (vec_outside_cost <= 0)
min_profitable_iters = 0;
  else
{
  min_profitable_iters = ((vec_outside_cost - scalar_outside_cost)
  * assumed_vf
  - vec_inside_cost * peel_iters_prologue
  - vec_inside_cost * peel_iters_epilogue)
 / ((scalar_single_iter_cost * assumed_vf)
- vec_inside_cost);

which can lead to negative min_profitable_iters when the *_outside_costs
are the same and peel_iters_epilogue is nonzero (e.g. if we're peeling
for gaps).

This is tested as part of the patch that adds support for fully-predicated
loops.

Tested on aarch64-linux-gnu (both with and without SVE), x86_64-linux-gnu
and powerpc64le-linux-gnu.  OK to install?

Thanks,
Richard


2017-11-08  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* tree-vect-loop.c (vect_estimate_min_profitable_iters): Make sure
min_profitable_iters doesn't go negative.

Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c2017-11-08 16:35:04.770241799 +
+++ gcc/tree-vect-loop.c2017-11-08 16:47:06.386264822 +
@@ -3651,23 +3651,24 @@ vect_estimate_min_profitable_iters (loop
 
   if ((scalar_single_iter_cost * assumed_vf) > (int) vec_inside_cost)
 {
-  if (vec_outside_cost <= 0)
+  min_profitable_iters = ((vec_outside_cost - scalar_outside_cost)
+ * assumed_vf
+ - vec_inside_cost * peel_iters_prologue
+ - vec_inside_cost * peel_iters_epilogue);
+
+  if (min_profitable_iters <= 0)
 min_profitable_iters = 0;
   else
-{
- min_profitable_iters = ((vec_outside_cost - scalar_outside_cost)
- * assumed_vf
- - vec_inside_cost * peel_iters_prologue
- - vec_inside_cost * peel_iters_epilogue)
-/ ((scalar_single_iter_cost * assumed_vf)
-   - vec_inside_cost);
+   {
+ min_profitable_iters /= ((scalar_single_iter_cost * assumed_vf)
+  - vec_inside_cost);
 
  if ((scalar_single_iter_cost * assumed_vf * min_profitable_iters)
  <= (((int) vec_inside_cost * min_profitable_iters)
  + (((int) vec_outside_cost - scalar_outside_cost)
 * assumed_vf)))
min_profitable_iters++;
-}
+   }
 }
   /* vector version will never be profitable.  */
   else


Re: [001/nnn] poly_int: add poly-int.h

2017-11-08 Thread Martin Sebor

On 11/08/2017 02:32 AM, Richard Sandiford wrote:

Martin Sebor  writes:

I haven't done nearly a thorough review but the dtor followed by
the placement new in the POLY_SET_COEFF() macro caught my eye so
I thought I'd ask sooner rather than later.  Given the macro
definition:

+   The dummy comparison against a null C * is just a way of checking
+   that C gives the right type.  */
+#define POLY_SET_COEFF(C, RES, I, VALUE) \
+  ((void) (&(RES).coeffs[0] == (C *) 0), \
+   wi::int_traits::precision_type == wi::FLEXIBLE_PRECISION \
+   ? (void) ((RES).coeffs[I] = VALUE) \
+   : (void) ((RES).coeffs[I].~C (), new (&(RES).coeffs[I]) C (VALUE)))

is the following use well-defined?

+template
+inline poly_int_pod&
+poly_int_pod::operator <<= (unsigned int a)
+{
+  POLY_SET_COEFF (C, *this, 0, this->coeffs[0] << a);

It looks to me as though the VALUE argument in the ctor invoked
by the placement new expression is evaluated after the dtor has
destroyed the very array element the VALUE argument expands to.


Good catch!  It should simply have been doing <<= on each coefficient --
I must have got carried away when converting to POLY_SET_COEFF.

I double-checked the other uses and think that's the only one.


Whether or not is, in fact, a problem, it seems to me that using
a function template rather than a macro would be a clearer and
safer way to do the same thing.  (Safer in that the macro also
evaluates its arguments multiple times, which is often a source
of subtle bugs.)


That would slow down -O0 builds though, by introducing an extra
function call and set of temporaries even when the coefficients
are primitive integers.


Would decorating the function template with attribute always_inline
help?

Martin


[Ada] Implementation of AI12-0127 : delta aggregate

2017-11-08 Thread Pierre-Marie de Rodat
This patch updates the implementation of Ada2020 delta aggregates, so they
can be used in the context of a private extension of a record type.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

2017-11-08  Ed Schonberg  

* sem_ch4.adb (Analyze_Aggregate): For Ada2020 delta aggregates, use
the type of the base of the construct to determine the type (or
candidate interpretations) of the delta aggregate. This allows the
construct to appear in a context that expects a private extension.
* sem_res.adb (Resolve): Handle properly a delta aggregate with an
overloaded base.

gcc/testsuite/

2017-11-08  Ed Schonberg  

* gnat.dg/delta_aggr.adb: New testcase.
Index: sem_ch4.adb
===
--- sem_ch4.adb (revision 254542)
+++ sem_ch4.adb (working copy)
@@ -414,12 +414,44 @@
---
 
--  Most of the analysis of Aggregates requires that the type be known,
-   --  and is therefore put off until resolution.
+   --  and is therefore put off until resolution of the context.
+   --  Delta aggregates have a base component that determines the type of the
+   --  enclosing aggregate so its type can be ascertained earlier. This also
+   --  allows delta aggregates to appear in the context of a record type with
+   --  a private extension, as per the latest update of AI2-0127.
 
procedure Analyze_Aggregate (N : Node_Id) is
begin
   if No (Etype (N)) then
- Set_Etype (N, Any_Composite);
+ if Nkind (N) = N_Delta_Aggregate then
+declare
+   Base : constant Node_Id := Expression (N);
+   I  : Interp_Index;
+   It : Interp;
+
+begin
+   Analyze (Base);
+
+   --  If the base is overloaded, propagate interpretations
+   --  to the enclosing aggregate.
+
+   if Is_Overloaded (Base) then
+  Get_First_Interp (Base, I, It);
+  Set_Etype (N, Any_Type);
+
+  while Present (It.Nam) loop
+ Add_One_Interp (N, It.Typ, It.Typ);
+ Get_Next_Interp (I, It);
+  end loop;
+
+   else
+  Set_Etype (N, Etype (Base));
+   end if;
+end;
+
+ else
+Set_Etype (N, Any_Composite);
+ end if;
   end if;
end Analyze_Aggregate;
 
Index: sem_res.adb
===
--- sem_res.adb (revision 254542)
+++ sem_res.adb (working copy)
@@ -2439,15 +2439,13 @@
   Set_Entity (N, Seen);
   Generate_Reference (Seen, N);
 
-   elsif Nkind (N) = N_Case_Expression then
+   elsif Nkind_In (N, N_Case_Expression,
+  N_Character_Literal,
+  N_If_Expression,
+  N_Delta_Aggregate)
+   then
   Set_Etype (N, Expr_Type);
 
-   elsif Nkind (N) = N_Character_Literal then
-  Set_Etype (N, Expr_Type);
-
-   elsif Nkind (N) = N_If_Expression then
-  Set_Etype (N, Expr_Type);
-
--  AI05-0139-2: Expression is overloaded because type has
--  implicit dereference. If type matches context, no implicit
--  dereference is involved.
Index: ../testsuite/gnat.dg/delta_aggr.adb
===
--- ../testsuite/gnat.dg/delta_aggr.adb (revision 0)
+++ ../testsuite/gnat.dg/delta_aggr.adb (revision 0)
@@ -0,0 +1,51 @@
+--  { dg-do compile }
+--  { dg-options "-gnat2020" }
+
+procedure Delta_Aggr is
+   type T1 is tagged record
+  F1, F2, F3 : Integer := 0;
+   end record;
+
+   function Make (X : Integer)  return T1 is
+   begin
+  return (10, 20, 30);
+   end Make;
+
+   package Pkg is
+  type T2 is new T1 with private;
+  X, Y : constant T2;
+  function Make (X : Integer) return T2;
+   private
+  type T2 is new T1 with
+ record
+F4 : Integer := 0;
+ end record;
+  X : constant T2 := (0, 0, 0, 0);
+  Y : constant T2 := (1, 2, 0, 0);
+   end Pkg;
+
+   package body Pkg is
+  function Make (X : Integer) return T2 is
+  begin
+ return (X, X ** 2, X ** 3, X ** 4);
+  end Make;
+   end Pkg;
+
+   use Pkg;
+
+   Z : T2 := (Y with delta F1 => 111);
+
+   -- a legal delta aggregate whose type is a private extension
+   pragma Assert (Y = (X with delta F1 => 1, F2 => 2));
+   pragma assert (Y.F2 = X.F1);
+
+begin
+   Z := (X with delta F1 => 1);
+
+   --  The base of the delta aggregate can be overloaded, in which case
+   --  the candidate interpretations for the aggregate are those of the
+   --  base, to be resolved from context.
+
+   Z 

Re: [PATCH] Add LVAL argument to c_fully_fold* and propagate it through (PR c/66618, PR c/69960)

2017-11-08 Thread Marek Polacek
On Wed, Nov 08, 2017 at 05:22:45PM +0100, Jakub Jelinek wrote:
> Hi!
> 
> Here is an attempt to fix these two PRs.  The C++ FE already has an LVAL

Nice!

> bool that it propagates through constexpr.c functions, or in
> cp-gimplify.c through calling cp_fold_{maybe_,}rvalue where appropriate.
> The C c_fully_fold was instead just calling 
> decl_constant_value_for_optimization
> in some spots where we've previously called c_fully_fold{_internal,} on
> some rvalue.
> decl_constant_value_for_optimization wasn't doing anything when -O0,
> so we got different behavior between -O0 and -O1+ on what is accepted
> in static initializers.  Furthermore, we needed the hack to ignore
> vars with ARRAY_TYPE or BLKmode, so that we actually wouldn't fold lvalues
> to rvalues.
> 
> This patch adds LVAL to c_fully_fold{_internal,} and does what
> decl_constant_value_for_optimization did inside of it (without the hacks,
> furthermore, and additionally for -O0 when in initializers for consistency),
> of course only if LVAL is false.  Additionally, I've added folding of
> "foo"[2] into 'o'.  We have it in gimple-fold.c or so, so that one
> isn't performed when not in_init.
> 
> Not sure about the COND_EXPR/VEC_COND_EXPR cases, right now I'm passing
> false as LVAL for the first operand (condition) and lval as LVAL for the
> other two (i.e. if called with lval == true on the whole *_COND_EXPR
> decl_constant_value_for_optimization etc. isn't performed on op1/op2, while
> without it it is).  Can one take address of the whole COND_EXPR, or
> have it on LHS of anything in C?  

I don't think so, the ?: operator does not yield an lvalue.

> @@ -218,12 +235,51 @@ c_fully_fold_internal (tree expr, bool i
>op2 = TREE_OPERAND (expr, 2);
>op3 = TREE_OPERAND (expr, 3);
>op0 = c_fully_fold_internal (op0, in_init, maybe_const_operands,
> -maybe_const_itself, for_int_const);
> +maybe_const_itself, for_int_const, lval);
>STRIP_TYPE_NOPS (op0);
>op1 = c_fully_fold_internal (op1, in_init, maybe_const_operands,
> -maybe_const_itself, for_int_const);
> +maybe_const_itself, for_int_const, false);
>STRIP_TYPE_NOPS (op1);
> -  op1 = decl_constant_value_for_optimization (op1);
> +  /* Fold "foo"[2] in initializers.  */
> +  if (!lval
> +   && in_init
> +   && TREE_CODE (op0) == STRING_CST
> +   && TREE_CODE (op1) == INTEGER_CST
> +   && TREE_CODE (TREE_TYPE (op0)) == ARRAY_TYPE

Does this also handle 2["foobar"]?

> +   && tree_fits_uhwi_p (op1))
> + {
> +   tree ary = op0;
> +   tree index = op1;
> +   unsigned len = 0;
> +   tree elem_type = TREE_TYPE (TREE_TYPE (ary));
> +   unsigned elem_nchars = (TYPE_PRECISION (elem_type)
> +   / TYPE_PRECISION (char_type_node));
> +   len = (unsigned) TREE_STRING_LENGTH (ary) / elem_nchars;
> +
> +   tree nelts = array_type_nelts (TREE_TYPE (ary));
> +   bool dummy1 = true, dummy2 = true;
> +   nelts = c_fully_fold_internal (nelts, in_init, , ,
> +  for_int_const, false);
> +   unsigned HOST_WIDE_INT i = tree_to_uhwi (index);
> +   if (tree_int_cst_le (index, nelts)
> +   && i < len
> +   && i + elem_nchars <= len)
> + {
> +   tree type = TREE_TYPE (expr);
> +   if (elem_nchars == 1)
> + ret = build_int_cst (type,
> +  TREE_STRING_POINTER (ary)[i]);
> +   else
> + {
> +   const unsigned char *ptr
> + = ((const unsigned char *)TREE_STRING_POINTER (ary)
> ++ i * elem_nchars);
> +   ret = native_interpret_expr (type, ptr, elem_nchars);
> + }
> +   if (ret)
> + goto out;
> + }
> + }

So this code comes from gimple-fold.c?  I would probably move it to a new
function.

Marek


Add support for masked load/store_lanes

2017-11-08 Thread Richard Sandiford
This patch adds support for vectorising groups of IFN_MASK_LOADs
and IFN_MASK_STOREs using conditional load/store-lanes instructions.
This requires new internal functions to represent the result
(IFN_MASK_{LOAD,STORE}_LANES), as well as associated optabs.

The normal IFN_{LOAD,STORE}_LANES functions are const operations
that logically just perform the permute: the load or store is
encoded as a MEM operand to the call statement.  In contrast,
the IFN_MASK_{LOAD,STORE}_LANES functions use the same kind of
interface as IFN_MASK_{LOAD,STORE}, since the memory is only
conditionally accessed.

The AArch64 patterns were added as part of the main LD[234]/ST[234] patch.

Tested on aarch64-linux-gnu (both with and without SVE), x86_64-linux-gnu
and powerpc64le-linux-gnu.  OK to install?

Thanks,
Richard


2017-11-08  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* optabs.def (vec_mask_load_lanes_optab): New optab.
(vec_mask_store_lanes_optab): Likewise.
* internal-fn.def (MASK_LOAD_LANES): New internal function.
(MASK_STORE_LANES): Likewise.
* internal-fn.c (mask_load_lanes_direct): New macro.
(mask_store_lanes_direct): Likewise.
(expand_mask_load_optab_fn): Handle masked operations.
(expand_mask_load_lanes_optab_fn): New macro.
(expand_mask_store_optab_fn): Handle masked operations.
(expand_mask_store_lanes_optab_fn): New macro.
(direct_mask_load_lanes_optab_supported_p): Likewise.
(direct_mask_store_lanes_optab_supported_p): Likewise.
* tree-vectorizer.h (vect_store_lanes_supported): Take a masked_p
parameter.
(vect_load_lanes_supported): Likewise.
* tree-vect-data-refs.c (strip_conversion): New function.
(can_group_stmts_p): Likewise.
(vect_analyze_data_ref_accesses): Use it instead of checking
for a pair of assignments.
(vect_store_lanes_supported): Take a masked_p parameter.
(vect_load_lanes_supported): Likewise.
* tree-vect-loop.c (vect_analyze_loop_2): Update calls to
vect_store_lanes_supported and vect_load_lanes_supported.
* tree-vect-slp.c (vect_analyze_slp_instance): Likewise.
* tree-vect-stmts.c (replace_mask_load): New function, split
out from vectorizable_mask_load_store.  Keep the group information
up-to-date.
(get_store_op): New function.
(get_group_load_store_type): Take a masked_p parameter.  Don't
allow gaps for masked accesses.  Use get_store_op.  Update calls
to vect_store_lanes_supported and vect_load_lanes_supported.
(get_load_store_type): Take a masked_p parameter and update
call to get_group_load_store_type.
(init_stored_values, advance_stored_values): New functions,
split out from vectorizable_store.
(do_load_lanes, do_store_lanes): New functions.
(get_masked_group_alias_ptr_type): New function.
(vectorizable_mask_load_store): Update call to get_load_store_type.
Handle masked VMAT_LOAD_STORE_LANES.  Update GROUP_STORE_COUNT
when vectorizing a group of stores and only vectorize when we
reach the last statement in the group.  Vectorize the first
statement in a group of loads.  Use an array aggregate type
rather than a vector type for load/store_lanes.  Use
init_stored_values, advance_stored_values, do_load_lanes,
do_store_lanes, get_masked_group_alias_ptr_type and replace_mask_load.
(vectorizable_store): Update call to get_load_store_type.
Use init_stored_values, advance_stored_values and do_store_lanes.
(vectorizable_load): Update call to get_load_store_type.
Use do_load_lanes.
(vect_transform_stmt): Set grouped_store for grouped IFN_MASK_STOREs.
Only set is_store for the last element in the group.

gcc/testsuite/
* gcc.dg/vect/vect-ooo-group-1.c: New test.
* gcc.target/aarch64/sve_mask_struct_load_1.c: Likewise.
* gcc.target/aarch64/sve_mask_struct_load_1_run.c: Likewise.
* gcc.target/aarch64/sve_mask_struct_load_2.c: Likewise.
* gcc.target/aarch64/sve_mask_struct_load_2_run.c: Likewise.
* gcc.target/aarch64/sve_mask_struct_load_3.c: Likewise.
* gcc.target/aarch64/sve_mask_struct_load_3_run.c: Likewise.
* gcc.target/aarch64/sve_mask_struct_load_4.c: Likewise.
* gcc.target/aarch64/sve_mask_struct_load_5.c: Likewise.
* gcc.target/aarch64/sve_mask_struct_load_6.c: Likewise.
* gcc.target/aarch64/sve_mask_struct_load_7.c: Likewise.
* gcc.target/aarch64/sve_mask_struct_load_8.c: Likewise.
* gcc.target/aarch64/sve_mask_struct_store_1.c: Likewise.
* gcc.target/aarch64/sve_mask_struct_store_1_run.c: Likewise.
* gcc.target/aarch64/sve_mask_struct_store_2.c: Likewise.
 

Re: [PATCH] Fix UBSAN errors in dse.c (PR rtl-optimization/82044).

2017-11-08 Thread Jeff Law
On 11/02/2017 07:15 AM, Martin Liška wrote:
> PING^1
I don't see an updated patch in this thread?  THe last message I see is
this one where you indicate you're going to tweak the patch and re-test.

Jeff

> 
> On 10/19/2017 01:36 PM, Martin Liška wrote:
>> On 09/20/2017 10:15 AM, Jakub Jelinek wrote:
>>> On Wed, Sep 20, 2017 at 09:50:32AM +0200, Martin Liška wrote:
 Hello.

 Following patch handles UBSAN (overflow) in dce.c.
>>>
>>> dse.c ;)
>>>
 --- a/gcc/dse.c
 +++ b/gcc/dse.c
 @@ -929,7 +929,9 @@ set_usage_bits (group_info *group, HOST_WIDE_INT 
 offset, HOST_WIDE_INT width,
  {
HOST_WIDE_INT i;
bool expr_escapes = can_escape (expr);
 -  if (offset > -MAX_OFFSET && offset + width < MAX_OFFSET)
 +  if (offset > -MAX_OFFSET
 +  && offset < MAX_OFFSET
 +  && offset + width < MAX_OFFSET)
>>>
>>> This can still overflow if width is close to HOST_WIDE_INT_MAX.
>>> Anyway, I don't remember this code too much, but wonder if either offset or
>>> width or their sum is outside of the -MAX_OFFSET, MAX_OFFSET range if we
>>> still don't want to record usage bits at least in the intersection of
>>> -MAX_OFFSET, MAX_OFFSET and offset, offset + width (the latter performed
>>> with infinite precision; though, if record_store is changed as suggested
>>> below, offset + width shouldn't overflow).
>>>
  for (i=offset; igroup_id = group_id;
store_info->begin = offset;
 -  store_info->end = offset + width;
 +  if (offset > HOST_WIDE_INT_MAX - width)
 +store_info->end = HOST_WIDE_INT_MAX;
 +  else
 +store_info->end = offset + width;
>>>
>>> If offset + width overflows, I think we risk wrong-code by doing this, plus
>>> there are 3 other offset + width computations earlier in record_store
>>> before we reach this.  I think instead we should treat such cases as wild
>>> stores early, i.e.:
>>>if (!canon_address (mem, _id, , ))
>>>  {
>>>clear_rhs_from_active_local_stores ();
>>>return 0;
>>>  }
>>>  
>>>if (GET_MODE (mem) == BLKmode)
>>>  width = MEM_SIZE (mem);
>>>else
>>>  width = GET_MODE_SIZE (GET_MODE (mem));
>>>
>>> +  if (offset > HOST_WIDE_INT_MAX - width)
>>> +{
>>> +  clear_rhs_from_active_local_stores ();
>>> +  return 0;
>>> +}
>>>
>>> or so.
>>>
 +
store_info->is_set = GET_CODE (body) == SET;
store_info->rhs = rhs;
store_info->const_rhs = const_rhs;
 @@ -1976,6 +1982,14 @@ check_mem_read_rtx (rtx *loc, bb_info_t bb_info)
return;
  }
  
 +  if (offset > MAX_OFFSET)
 +{
 +  if (dump_file && (dump_flags & TDF_DETAILS))
 +  fprintf (dump_file, " reaches MAX_OFFSET.\n");
 +  add_wild_read (bb_info);
 +  return;
 +}
 +
>>
>> Hi.
>>
>> The later one works for me. I'm going to regtest that.
>>
>> Ready after it survives regression tests?
>>
>> Thanks,
>> Martin
>>
>>>
>>> Is offset > MAX_OFFSET really problematic (and not just the width != -1 &&
>>> offset + width overflowing case)?
>>>
if (GET_MODE (mem) == BLKmode)
  width = -1;
else

>>>
>>>
>>> Jakub
>>>
>>
> 



[PATCH] Add LVAL argument to c_fully_fold* and propagate it through (PR c/66618, PR c/69960)

2017-11-08 Thread Jakub Jelinek
Hi!

Here is an attempt to fix these two PRs.  The C++ FE already has an LVAL
bool that it propagates through constexpr.c functions, or in
cp-gimplify.c through calling cp_fold_{maybe_,}rvalue where appropriate.
The C c_fully_fold was instead just calling decl_constant_value_for_optimization
in some spots where we've previously called c_fully_fold{_internal,} on
some rvalue.
decl_constant_value_for_optimization wasn't doing anything when -O0,
so we got different behavior between -O0 and -O1+ on what is accepted
in static initializers.  Furthermore, we needed the hack to ignore
vars with ARRAY_TYPE or BLKmode, so that we actually wouldn't fold lvalues
to rvalues.

This patch adds LVAL to c_fully_fold{_internal,} and does what
decl_constant_value_for_optimization did inside of it (without the hacks,
furthermore, and additionally for -O0 when in initializers for consistency),
of course only if LVAL is false.  Additionally, I've added folding of
"foo"[2] into 'o'.  We have it in gimple-fold.c or so, so that one
isn't performed when not in_init.

Not sure about the COND_EXPR/VEC_COND_EXPR cases, right now I'm passing
false as LVAL for the first operand (condition) and lval as LVAL for the
other two (i.e. if called with lval == true on the whole *_COND_EXPR
decl_constant_value_for_optimization etc. isn't performed on op1/op2, while
without it it is).  Can one take address of the whole COND_EXPR, or
have it on LHS of anything in C?  If not, perhaps I could just pass false
in all the 3 calls.  Though, then likely it would be called with lval == false
anyway.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-11-08  Jakub Jelinek  

PR c/66618
PR c/69960
c-family/
* c-common.h (c_fully_fold): Add LVAL argument defaulted to false.
c/
* c-parser.c (c_parser_omp_atomic): Pass true as LVAL to c_fully_fold
where needed.
* c-typeck.c (build_unary_op, build_modify_expr, build_asm_expr,
handle_omp_array_sections): Likewise.
(digest_init): Don't call decl_constant_value_for_optimization.
* c-tree.h (decl_constant_value_for_optimization): Removed.
* c-fold.c (c_fully_fold_internal): Add LVAL argument, propagate
it through recursive calls.  For VAR_P call decl_constant_value and
unshare if not LVAL and either optimizing or IN_INIT.  Remove
decl_constant_value_for_optimization calls.  If IN_INIT and not LVAL,
fold ARRAY_REF with STRING_CST and INTEGER_CST operands.
(c_fully_fold): Add LVAL argument, pass it through to
c_fully_fold_internal.
(decl_constant_value_for_optimization): Removed.
cp/
* cp-gimplify.c (c_fully_fold): Add LVAL argument, call
cp_fold_maybe_rvalue instead of cp_fold_rvalue and pass it !LVAL.
testsuite/
* gcc.dg/pr69960.c: New test.
* gcc.dg/pr66618.c: New test.

--- gcc/c-family/c-common.h.jj  2017-10-12 20:51:30.0 +0200
+++ gcc/c-family/c-common.h 2017-11-08 12:20:59.874481318 +0100
@@ -827,7 +827,7 @@ extern tree c_build_bitfield_integer_typ
 extern enum conversion_safety unsafe_conversion_p (location_t, tree, tree, 
tree,
   bool);
 extern bool decl_with_nonnull_addr_p (const_tree);
-extern tree c_fully_fold (tree, bool, bool *);
+extern tree c_fully_fold (tree, bool, bool *, bool = false);
 extern tree c_wrap_maybe_const (tree, bool);
 extern tree c_common_truthvalue_conversion (location_t, tree);
 extern void c_apply_type_quals_to_decl (int, tree);
--- gcc/c/c-parser.c.jj 2017-11-08 12:29:02.0 +0100
+++ gcc/c/c-parser.c2017-11-08 13:53:39.756889500 +0100
@@ -14748,7 +14748,7 @@ c_parser_omp_atomic (location_t loc, c_p
 case NOP_EXPR: /* atomic write */
   v = c_parser_cast_expression (parser, NULL).value;
   non_lvalue_p = !lvalue_p (v);
-  v = c_fully_fold (v, false, NULL);
+  v = c_fully_fold (v, false, NULL, true);
   if (v == error_mark_node)
goto saw_error;
   if (non_lvalue_p)
@@ -14767,7 +14767,7 @@ c_parser_omp_atomic (location_t loc, c_p
{
  lhs = c_parser_cast_expression (parser, NULL).value;
  non_lvalue_p = !lvalue_p (lhs);
- lhs = c_fully_fold (lhs, false, NULL);
+ lhs = c_fully_fold (lhs, false, NULL, true);
  if (lhs == error_mark_node)
goto saw_error;
  if (non_lvalue_p)
@@ -14793,7 +14793,7 @@ c_parser_omp_atomic (location_t loc, c_p
{
  v = c_parser_cast_expression (parser, NULL).value;
  non_lvalue_p = !lvalue_p (v);
- v = c_fully_fold (v, false, NULL);
+ v = c_fully_fold (v, false, NULL, true);
  if (v == error_mark_node)
goto saw_error;
  if (non_lvalue_p)
@@ -14814,7 +14814,7 @@ restart:
   lhs = expr.value;
   expr = default_function_array_conversion (eloc, expr);
   unfolded_lhs = expr.value;
-  lhs = c_fully_fold (lhs, false, 

[Ada] Ignore file and unit names when collecting SPARK cross-references

2017-11-08 Thread Pierre-Marie de Rodat
Human-readable file and unit names in SPARK cross-references were only
needed to make the ALI file human-redable. They are now removed (but can
be added to the debug routine dspark if needed).

Modified code is only executed as part of GNATprove, so no impact on the
frontend. Behaviour unaffected, so no test provided.

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-11-08  Piotr Trojanek  

* spark_xrefs.ads (SPARK_File_Record): Remove string components.
* spark_xrefs.adb (dspark): Remove pretty-printing of removed
SPARK_File_Record components.
* lib-xref-spark_specific.adb (Add_SPARK_File): Do not store string
representation of files/units.

Index: lib-xref-spark_specific.adb
===
--- lib-xref-spark_specific.adb (revision 254540)
+++ lib-xref-spark_specific.adb (working copy)
@@ -208,11 +208,6 @@
   procedure Traverse_Scopes is new
 Traverse_Compilation_Unit (Detect_And_Add_SPARK_Scope);
 
-  --  Local variables
-
-  File_Name  : String_Ptr;
-  Unit_File_Name : String_Ptr;
-
--  Start of processing for Add_SPARK_File
 
begin
@@ -240,29 +235,10 @@
  Traverse_Scopes (CU => Cunit (Ubody));
   end if;
 
-  --  Make entry for new file in file table
-
-  Get_Name_String (Reference_Name (File));
-  File_Name := new String'(Name_Buffer (1 .. Name_Len));
-
-  --  For subunits, also retrieve the file name of the unit. Only do so if
-  --  unit has an associated compilation unit.
-
-  if Present (Cunit (Unit (File)))
-and then Nkind (Unit (Cunit (Unit (File = N_Subunit
-  then
- Get_Name_String (Reference_Name (Main_Source_File));
- Unit_File_Name := new String'(Name_Buffer (1 .. Name_Len));
-  else
- Unit_File_Name := null;
-  end if;
-
   SPARK_File_Table.Append (
-(File_Name  => File_Name,
- Unit_File_Name => Unit_File_Name,
- File_Num   => Dspec,
- From_Scope => From,
- To_Scope   => SPARK_Scope_Table.Last));
+(File_Num   => Dspec,
+ From_Scope => From,
+ To_Scope   => SPARK_Scope_Table.Last));
end Add_SPARK_File;
 
-
Index: spark_xrefs.adb
===
--- spark_xrefs.adb (revision 254539)
+++ spark_xrefs.adb (working copy)
@@ -48,13 +48,6 @@
 Write_Int (Int (Index));
 Write_Str (".  File_Num = ");
 Write_Int (Int (AFR.File_Num));
-Write_Str ("  File_Name = """);
-
-if AFR.File_Name /= null then
-   Write_Str (AFR.File_Name.all);
-end if;
-
-Write_Char ('"');
 Write_Str ("  From = ");
 Write_Int (Int (AFR.From_Scope));
 Write_Str ("  To = ");
Index: spark_xrefs.ads
===
--- spark_xrefs.ads (revision 254539)
+++ spark_xrefs.ads (working copy)
@@ -154,13 +154,6 @@
--  entries have been constructed.
 
type SPARK_File_Record is record
-  File_Name : String_Ptr;
-  --  Pointer to file name in ALI file
-
-  Unit_File_Name : String_Ptr;
-  --  Pointer to file name for unit in ALI file, when File_Name refers to a
-  --  subunit; otherwise null.
-
   File_Num : Nat;
   --  Dependency number in ALI file
 


Re: [PATCH] Remove non needed check in bmp_iter_set_init (PR tree-optimization/82669).

2017-11-08 Thread Jeff Law
On 11/08/2017 12:13 AM, Martin Liška wrote:
> Hello.
> 
> Assert removal is logical as it's used in iteration and if sbitmap is empty, 
> iteration
> macro will not touch any element of a bitmap.
> 
> Ready after it survives regression tests?
> 
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
> 2017-11-08  Martin Liska  
> 
>   PR tree-optimization/82669
>   * sbitmap.h (bmp_iter_set_init): Remove non needed check.
OK.
jeff


[Ada] Continue decontruction of SPARK frontend cross-references

2017-11-08 Thread Pierre-Marie de Rodat
While traversing the AST to collect SPARK cross-references (which are used
to synthesize Global contracts for code with SPARK_Mode => Off), we always
traverse body stubs. There is no need to configure this by a parameter.

The modified code is only executed as part of GNATprove. Behaviour unchanged,
so no test provided.

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-11-08  Piotr Trojanek  

* lib-xref.ads, lib-xref-spark_specific.adb (Traverse_Declarations):
Remove Inside_Stubs parameter.

Index: lib-xref-spark_specific.adb
===
--- lib-xref-spark_specific.adb (revision 254539)
+++ lib-xref-spark_specific.adb (working copy)
@@ -230,14 +230,14 @@
  return;
   end if;
 
-  Traverse_Scopes (CU => Cunit (Uspec), Inside_Stubs => True);
+  Traverse_Scopes (CU => Cunit (Uspec));
 
   --  When two units are present for the same compilation unit, as it
   --  happens for library-level instantiations of generics, then add all
   --  scopes to the same SPARK file.
 
   if Ubody /= No_Unit then
- Traverse_Scopes (CU => Cunit (Ubody), Inside_Stubs => True);
+ Traverse_Scopes (CU => Cunit (Ubody));
   end if;
 
   --  Make entry for new file in file table
@@ -1156,10 +1156,7 @@
-- Traverse_Compilation_Unit --
---
 
-   procedure Traverse_Compilation_Unit
- (CU   : Node_Id;
-  Inside_Stubs : Boolean)
-   is
+   procedure Traverse_Compilation_Unit (CU : Node_Id) is
   procedure Traverse_Block  (N : Node_Id);
   procedure Traverse_Declaration_Or_Statement   (N : Node_Id);
   procedure Traverse_Declarations_And_HSS   (N : Node_Id);
@@ -1195,7 +1192,7 @@
 N_Subprogram_Body_Stub,
 N_Task_Body_Stub));
 
-return Inside_Stubs and then Present (Library_Unit (N));
+return Present (Library_Unit (N));
  end Traverse_Stub;
 
   --  Start of processing for Traverse_Declaration_Or_Statement
Index: lib-xref.ads
===
--- lib-xref.ads(revision 254539)
+++ lib-xref.ads(working copy)
@@ -647,12 +647,9 @@
 
   generic
  with procedure Process (N : Node_Id) is <>;
-  procedure Traverse_Compilation_Unit
-(CU   : Node_Id;
- Inside_Stubs : Boolean);
-  --  Call Process on all declarations within compilation unit CU. If
-  --  Inside_Stubs is True, then the body of stubs is also traversed.
-  --  Generic declarations are ignored.
+  procedure Traverse_Compilation_Unit (CU : Node_Id);
+  --  Call Process on all declarations within compilation unit CU. Bodies
+  --  of stubs are also traversed, but generic declarations are ignored.
 
end SPARK_Specific;
 


[Ada] Missing categorization check on generic subprogram body

2017-11-08 Thread Pierre-Marie de Rodat
This patch adds a categorization check on a generic subprogram body, so
that the compiler can reject a generic subprogram marked Pure if its body
depends on an impure unit.

Compiling gf.adb must yield:

   gf.adb:2:06: cannot depend on "Impure" (wrong categorization)
   gf.adb:2:06: pure unit cannot depend on non-pure unit

--
generic
function GF return String with Pure;
---
with Impure;
function GF return String is
begin
   return Impure.Think_Rotten_Thoughts;
end GF;
---
package Impure is
   function Think_Rotten_Thoughts return String;
end;
---
package body Impure is
   Count : Natural := 0;
   function Think_Rotten_Thoughts return String is
   begin
  Count := Count + 1;
  return "Rotten thought" & Natural'Image (Count);
   end;
end;

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-11-08  Ed Schonberg  

* sem_ch6.adb (Analyze_Generic_Subprobram_Body): Validate
categorization dependency of the body, as is done for non-generic
units.
(New_Overloaded_Entity, Visible_Part_Type): Remove linear search
through declarations (Simple optimization, no behavior change).

Index: sem_ch6.adb
===
--- sem_ch6.adb (revision 254535)
+++ sem_ch6.adb (working copy)
@@ -1510,6 +1510,7 @@
 
   Process_End_Label (Handled_Statement_Sequence (N), 't', Current_Scope);
   Update_Use_Clause_Chain;
+  Validate_Categorization_Dependency (N, Gen_Id);
   End_Scope;
   Check_Subprogram_Order (N);
 
@@ -10118,7 +10119,6 @@
 
  function Visible_Part_Type (T : Entity_Id) return Boolean is
 P : constant Node_Id := Unit_Declaration_Node (Scope (T));
-N : Node_Id;
 
  begin
 --  If the entity is a private type, then it must be declared in a
@@ -10126,34 +10126,19 @@
 
 if Ekind (T) in Private_Kind then
return True;
-end if;
 
---  Otherwise, we traverse the visible part looking for its
---  corresponding declaration. We cannot use the declaration
---  node directly because in the private part the entity of a
---  private type is the one in the full view, which does not
---  indicate that it is the completion of something visible.
+elsif Is_Type (T) and then Has_Private_Declaration (T) then
+   return True;
 
-N := First (Visible_Declarations (Specification (P)));
-while Present (N) loop
-   if Nkind (N) = N_Full_Type_Declaration
- and then Present (Defining_Identifier (N))
- and then T = Defining_Identifier (N)
-   then
-  return True;
+elsif Is_List_Member (Declaration_Node (T))
+  and then List_Containing (Declaration_Node (T)) =
+ Visible_Declarations (Specification (P))
+then
+   return True;
 
-   elsif Nkind_In (N, N_Private_Type_Declaration,
-  N_Private_Extension_Declaration)
- and then Present (Defining_Identifier (N))
- and then T = Full_View (Defining_Identifier (N))
-   then
-  return True;
-   end if;
-
-   Next (N);
-end loop;
-
-return False;
+else
+   return False;
+end if;
  end Visible_Part_Type;
 
   --  Start of processing for Check_For_Primitive_Subprogram


[Ada] Spurious ineffective use_clause warning on class-wide type

2017-11-08 Thread Pierre-Marie de Rodat
This patch corrects an issue whereby the use of a class-wide type's primitive
did not lead to its base type being recognized as effective - causing to
spurious use_clause warnings. Additionally, class-wide types used as generic
actuals were not checked in certain cases due to not being flagged as
potentially visible.


-- Source --


--  pck.ads

package Pck is
   type R (V : Positive) is abstract tagged private;
private
   type R (V : Positive) is abstract tagged null record;
end Pck;

--  proc.adb

with Ada.Containers.Indefinite_Vectors;
with Pck;
procedure Proc is
   use type Pck.R;
   package V is new Ada.Containers.Indefinite_Vectors (Positive, Pck.R'Class);
begin
   null;
end Proc;


-- Compilation and output --


& gcc -c -gnatwu proc.adb

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-11-08  Justin Squirek  

* sem_ch8.adb (Mark_Use_Clauses): Add condition to always mark the
primitives of generic actuals.
(Mark_Use_Type): Add recursive call to properly mark class-wide type's
base type clauses as per ARM 8.4 (8.2/3).

Index: sem_ch8.adb
===
--- sem_ch8.adb (revision 254535)
+++ sem_ch8.adb (working copy)
@@ -8320,6 +8320,7 @@
 
   procedure Mark_Use_Type (E : Entity_Id) is
  Curr : Node_Id;
+ Base : Entity_Id;
 
   begin
  --  Ignore void types and unresolved string literals and primitives
@@ -8331,12 +8332,22 @@
 return;
  end if;
 
+ --  Primitives with class-wide operands might additionally render
+ --  their base type's use_clauses effective - so do a recursive check
+ --  here.
+
+ Base := Base_Type (Etype (E));
+
+ if Ekind (Base) = E_Class_Wide_Type then
+Mark_Use_Type (Base);
+ end if;
+
  --  The package containing the type or operator function being used
  --  may be in use as well, so mark any use_package_clauses for it as
  --  effective. There are also additional sanity checks performed here
  --  for ignoring previous errors.
 
- Mark_Use_Package (Scope (Base_Type (Etype (E;
+ Mark_Use_Package (Scope (Base));
 
  if Nkind (E) in N_Op
and then Present (Entity (E))
@@ -8345,7 +8356,7 @@
 Mark_Use_Package (Scope (Entity (E)));
  end if;
 
- Curr := Current_Use_Clause (Base_Type (Etype (E)));
+ Curr := Current_Use_Clause (Base);
  while Present (Curr)
 and then not Is_Effective_Use_Clause (Curr)
  loop
@@ -8397,7 +8408,9 @@
  or else Ekind_In (Id, E_Generic_Function,
E_Generic_Procedure))
and then (Is_Potentially_Use_Visible (Id)
-  or else Is_Intrinsic_Subprogram (Id))
+  or else Is_Intrinsic_Subprogram (Id)
+  or else (Ekind_In (Id, E_Function, E_Procedure)
+and then Is_Generic_Actual_Subprogram (Id)))
  then
 Mark_Parameters (Id);
  end if;


[Ada] Store SPARK cross-references as Entity_Ids, not strings

2017-11-08 Thread Pierre-Marie de Rodat
GNATprove now picks frontend cross-references directly from memory and
not from an ALI file), so there is no need to convert them to strings;
it is cleaner and more efficient to store them as Entity_Ids. No test
provided, because the behaviour is not affected.

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-11-08  Piotr Trojanek  

* spark_xrefs.ads (SPARK_Xref_Record): Referenced object is now
represented by Entity_Id.
(SPARK_Scope_Record): Referenced scope (e.g. subprogram) is now
represented by Entity_Id; this information is not repeated as
Scope_Entity.
(Heap): Moved from lib-xref-spark_specific.adb, to reside next to
Name_Of_Heap_Variable.
* spark_xrefs.adb (dspark): Adapt debug routine to above changes in
data types.
* lib-xref-spark_specific.adb: Adapt routines for populating SPARK
scope and xrefs tables to above changes in data types.

Index: lib-xref-spark_specific.adb
===
--- lib-xref-spark_specific.adb (revision 254538)
+++ lib-xref-spark_specific.adb (working copy)
@@ -66,9 +66,6 @@
-- Local Variables --
-
 
-   Heap : Entity_Id := Empty;
-   --  A special entity which denotes the heap object
-
package Drefs is new Table.Table (
  Table_Component_Type => Xref_Entry,
  Table_Index_Type => Xref_Entry_Number,
@@ -164,14 +161,13 @@
  --  range.
 
  SPARK_Scope_Table.Append
-   ((Scope_Name => new String'(Unique_Name (E)),
+   ((Scope_Id   => E,
  File_Num   => Dspec,
  Scope_Num  => Scope_Id,
  Spec_File_Num  => 0,
  Spec_Scope_Num => 0,
  From_Xref  => 1,
- To_Xref=> 0,
- Scope_Entity   => E));
+ To_Xref=> 0));
 
  Scope_Id := Scope_Id + 1;
   end Add_SPARK_Scope;
@@ -351,7 +347,7 @@
 
   function Entity_Of_Scope (S : Scope_Index) return Entity_Id is
   begin
- return SPARK_Scope_Table.Table (S).Scope_Entity;
+ return SPARK_Scope_Table.Table (S).Scope_Id;
   end Entity_Of_Scope;
 
   ---
@@ -423,7 +419,7 @@
  function Is_Past_Scope_Entity return Boolean is
  begin
 for Index in SPARK_Scope_Table.First .. S - 1 loop
-   if SPARK_Scope_Table.Table (Index).Scope_Entity = E then
+   if SPARK_Scope_Table.Table (Index).Scope_Id = E then
   return True;
end if;
 end loop;
@@ -435,7 +431,7 @@
 
   begin
  for Index in S .. SPARK_Scope_Table.Last loop
-if SPARK_Scope_Table.Table (Index).Scope_Entity = E then
+if SPARK_Scope_Table.Table (Index).Scope_Id = E then
return True;
 end if;
  end loop;
@@ -634,7 +630,7 @@
  declare
 S : SPARK_Scope_Record renames SPARK_Scope_Table.Table (Index);
  begin
-Set_Scope_Num (S.Scope_Entity, S.Scope_Num);
+Set_Scope_Num (S.Scope_Id, S.Scope_Num);
  end;
   end loop;
 
@@ -800,10 +796,10 @@
 end if;
 
 SPARK_Xref_Table.Append (
-  (Entity_Name => new String'(Unique_Name (Ref.Ent)),
-   File_Num=> Dependency_Num (Ref.Lun),
-   Scope_Num   => Get_Scope_Num (Ref.Ref_Scope),
-   Rtype   => Typ));
+  (Entity=> Unique_Entity (Ref.Ent),
+   File_Num  => Dependency_Num (Ref.Lun),
+   Scope_Num => Get_Scope_Num (Ref.Ref_Scope),
+   Rtype => Typ));
  end;
   end loop;
 
@@ -948,7 +944,7 @@
 declare
Srec : SPARK_Scope_Record renames SPARK_Scope_Table.Table (S);
 begin
-   Entity_Hash_Table.Set (Srec.Scope_Entity, S);
+   Entity_Hash_Table.Set (Srec.Scope_Id, S);
 end;
  end loop;
 
@@ -959,14 +955,14 @@
Srec : SPARK_Scope_Record renames SPARK_Scope_Table.Table (S);
 
Spec_Entity : constant Entity_Id :=
-   Unique_Entity (Srec.Scope_Entity);
+   Unique_Entity (Srec.Scope_Id);
Spec_Scope  : constant Scope_Index :=
Entity_Hash_Table.Get (Spec_Entity);
 
 begin
--  Generic spec may be missing in which case Spec_Scope is zero
 
-   if Spec_Entity /= Srec.Scope_Entity
+   if Spec_Entity /= Srec.Scope_Id
  and then Spec_Scope /= 0
then
   Srec.Spec_File_Num :=
Index: spark_xrefs.adb
===
--- spark_xrefs.adb (revision 254538)
+++ spark_xrefs.adb (working copy)
@@ -23,7 +23,8 @@
 --   

[Ada] Don't collect inessential data about SPARK cross-references

2017-11-08 Thread Pierre-Marie de Rodat
Part of deconstructing SPARK cross-references, which are used to synthesize
Global contracts for code annotated with SPARK_Mode => Off.

Data like line and column numbers was only needed to make the xrefs in the
ALI file more readable. Now that the xrefs are not written to the ALI file
at all, there is no need to collect this data.

Removed code was only executed as part of GNATprove (and its behaviour
remains as it was), so no test.

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-11-08  Piotr Trojanek  

* spark_xrefs.ads (SPARK_Xref_Record): Remove inessential components.
(SPARK_Scope_Record): Remove inessential components.
* spark_xrefs.adb (dspark): Remove pretty-printing of removed record
components.
* lib-xref-spark_specific.adb (Add_SPARK_Scope): Remove setting of
removed record components.
(Add_SPARK_Xrefs): Remove setting of removed record components.

Index: lib-xref-spark_specific.adb
===
--- lib-xref-spark_specific.adb (revision 254535)
+++ lib-xref-spark_specific.adb (working copy)
@@ -120,15 +120,8 @@
   -
 
   procedure Add_SPARK_Scope (N : Node_Id) is
- E   : constant Entity_Id  := Defining_Entity (N);
- Loc : constant Source_Ptr := Sloc (E);
+ E : constant Entity_Id := Defining_Entity (N);
 
- --  The character describing the kind of scope is chosen to be the
- --  same as the one describing the corresponding entity in cross
- --  references, see Xref_Entity_Letters in lib-xrefs.ads
-
- Typ : Character;
-
   begin
  --  Ignore scopes without a proper location
 
@@ -144,18 +137,15 @@
| E_Generic_Package
| E_Generic_Procedure
| E_Package
+   | E_Package_Body
| E_Procedure
+   | E_Protected_Body
| E_Protected_Type
+   | E_Task_Body
| E_Task_Type
-=>
-   Typ := Xref_Entity_Letters (Ekind (E));
-
-when E_Package_Body
-   | E_Protected_Body
| E_Subprogram_Body
-   | E_Task_Body
 =>
-   Typ := Xref_Entity_Letters (Ekind (Unique_Entity (E)));
+   null;
 
 when E_Void =>
 
@@ -179,9 +169,6 @@
  Scope_Num  => Scope_Id,
  Spec_File_Num  => 0,
  Spec_Scope_Num => 0,
- Line   => Nat (Get_Logical_Line_Number (Loc)),
- Stype  => Typ,
- Col=> Nat (Get_Column_Number (Loc)),
  From_Xref  => 1,
  To_Xref=> 0,
  Scope_Entity   => E));
@@ -290,9 +277,6 @@
   function Entity_Of_Scope (S : Scope_Index) return Entity_Id;
   --  Return the entity which maps to the input scope index
 
-  function Get_Entity_Type (E : Entity_Id) return Character;
-  --  Return a character representing the type of entity
-
   function Get_Scope_Num (E : Entity_Id) return Nat;
   --  Return the scope number associated with the entity E
 
@@ -370,20 +354,6 @@
  return SPARK_Scope_Table.Table (S).Scope_Entity;
   end Entity_Of_Scope;
 
-  -
-  -- Get_Entity_Type --
-  -
-
-  function Get_Entity_Type (E : Entity_Id) return Character is
-  begin
- case Ekind (E) is
-when E_Out_Parameter=> return '<';
-when E_In_Out_Parameter => return '=';
-when E_In_Parameter => return '>';
-when others => return '*';
- end case;
-  end Get_Entity_Type;
-
   ---
   -- Get_Scope_Num --
   ---
@@ -651,9 +621,7 @@
 
   --  Local variables
 
-  Col: Nat;
   From_Index : Xref_Index;
-  Line   : Nat;
   Prev_Loc   : Source_Ptr;
   Prev_Typ   : Character;
   Ref_Count  : Nat;
@@ -817,14 +785,6 @@
pragma Assert (Scope_Id <= SPARK_Scope_Table.Last);
 end loop;
 
-if Ref.Ent = Heap then
-   Line := 0;
-   Col  := 0;
-else
-   Line := Nat (Get_Logical_Line_Number (Ref_Entry.Def));
-   Col  := Nat (Get_Column_Number (Ref_Entry.Def));
-end if;
-
 --  References to constant objects without variable inputs (see
 --  SPARK RM 3.3.1) are considered specially in SPARK section,
 --  because these will be translated as constants in the
@@ -841,14 +801,9 @@
 
 SPARK_Xref_Table.Append (
   (Entity_Name => new String'(Unique_Name (Ref.Ent)),
-   Entity_Line => Line,
-   Etype   => Get_Entity_Type (Ref.Ent),
-   Entity_Col  => Col,
File_Num=> 

Re: [PATCH] Replace has_single_use guards in store-merging

2017-11-08 Thread Jakub Jelinek
On Wed, Nov 08, 2017 at 04:20:15PM +0100, Richard Biener wrote:
> Can't you simply use
> 
>unsigned ret = 0;
>FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_USE)
>  if (!has_single_use (op))
>++ret;
>return ret;
> 
> ?  Not sure if the bit_not_p handling is required.

Consider the case with most possible statements:
  _1 = load1;
  _2 = load2;
  _3 = ~_1;
  _4 = ~_2;
  _5 = _3 ^ _4;
  store0 = _5;
All these statements construct the value for the store, and if we remove
the old stores and add new stores we'll need similar statements for each
of the new stores.  What the function is attempting to compute how many
of these original statements will be not DCEd.
If _5 has multiple uses, then we'll need all of them, so 5 stmts not
being DCEd.  If _5 has single use, but _3 (and/or _4) has multiple uses,
we'll need the corresponding loads in addition to the BIT_NOT_EXPR
statement(s).  If only _1 (and/or _2) has multiple uses, we'll need
the load(s) but nothing else.
So, there isn't a single stmt I could FOR_EACH_SSA_TREE_OPERAND on.
For BIT_{AND,IOR,XOR}_EXPR doing it just on that stmt would be too rough
approximation and would miss the case when the bitwise binary op result
is used.

> It doesn't seem you handle multi-uses of the BIT_*_EXPR results
> itself?  Or does it only handle multi-uses of the BIT_*_EXPR
> but not the feeding loads?

I believe I handle all those precisely above (the only reason I've talked
about aproximation is that bit field loads/stores are counted as one stmt
and the masking added for handling multiple semi-adjacent bitfield
loads/stores aren't counted either).

Given the above example:
  if (!has_single_use (gimple_assign_rhs1 (stmt)))
{
  ret += 1 + info->ops[0].bit_not_p;
  if (info->ops[1].base_addr)
ret += 1 + info->ops[1].bit_not_p;
  return ret + 1;
}
Above should handle the _5 multiple uses case (the first operand is guaranteed
by the discovery code to be a possibly negated load).
  stmt = SSA_NAME_DEF_STMT (gimple_assign_rhs1 (stmt));
  /* stmt is now the BIT_*_EXPR.  */
  if (!has_single_use (gimple_assign_rhs1 (stmt)))
ret += 1 + info->ops[0].bit_not_p;
Above should handle the _3 multiple uses.
  else if (info->ops[0].bit_not_p)
{
  gimple *stmt2 = SSA_NAME_DEF_STMT (gimple_assign_rhs1 (stmt));
  if (!has_single_use (gimple_assign_rhs1 (stmt2)))
++ret;
}
Above should handle multiple uses of _1.
  if (info->ops[1].base_addr == NULL_TREE)
return ret;
is an early return for the case when second argument to BIT_*_EXPR is
constant.
  if (!has_single_use (gimple_assign_rhs2 (stmt)))
ret += 1 + info->ops[1].bit_not_p;
Above should handle the _4 multiple uses.
  else if (info->ops[1].bit_not_p)
{
  gimple *stmt2 = SSA_NAME_DEF_STMT (gimple_assign_rhs2 (stmt));
  if (!has_single_use (gimple_assign_rhs1 (stmt2)))
++ret;
}
Above should handle the _2 multiple uses.

And for another example like:
  _6 = load1;
  _7 = ~_6;
  store0 = _7;
  if (!has_single_use (gimple_assign_rhs1 (stmt)))
return 1 + info->ops[0].bit_not_p;
Above should handle _7 multiple uses
  else if (info->ops[0].bit_not_p)
{
  stmt = SSA_NAME_DEF_STMT (gimple_assign_rhs1 (stmt));
  if (!has_single_use (gimple_assign_rhs1 (stmt)))
return 1;
}
Above should handle _6 multiple uses.

Jakub


[Ada] Remove dead check in collecting SPARK cross-references

2017-11-08 Thread Pierre-Marie de Rodat
GNATprove never collects cross-references to empty entities. Removed code
most likely became dead at some point and this was not noticed. No test,
as the removed code was only executed as part of GNATprove and its behaviour
appears not affected.

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-11-08  Piotr Trojanek  

* lib-xref-spark_specific.adb (Add_SPARK_Xrefs): Remove dead check for
empty entities.

Index: lib-xref-spark_specific.adb
===
--- lib-xref-spark_specific.adb (revision 254532)
+++ lib-xref-spark_specific.adb (working copy)
@@ -657,7 +657,6 @@
   Prev_Loc   : Source_Ptr;
   Prev_Typ   : Character;
   Ref_Count  : Nat;
-  Ref_Name   : String_Ptr;
   Scope_Id   : Scope_Index;
 
--  Start of processing for Add_SPARK_Xrefs
@@ -818,10 +817,6 @@
pragma Assert (Scope_Id <= SPARK_Scope_Table.Last);
 end loop;
 
-if Present (Ref.Ent) then
-   Ref_Name := new String'(Unique_Name (Ref.Ent));
-end if;
-
 if Ref.Ent = Heap then
Line := 0;
Col  := 0;
@@ -845,7 +840,7 @@
 end if;
 
 SPARK_Xref_Table.Append (
-  (Entity_Name => Ref_Name,
+  (Entity_Name => new String'(Unique_Name (Ref.Ent)),
Entity_Line => Line,
Etype   => Get_Entity_Type (Ref.Ent),
Entity_Col  => Col,


Re: [PATCH] Replace has_single_use guards in store-merging

2017-11-08 Thread Richard Biener
On Mon, 6 Nov 2017, Jakub Jelinek wrote:

> Hi!
> 
> As mentioned earlier, the !has_single_use checks disable store merging
> in many cases, it is enough to have a single multiple-use somewhere and
> all of sudden we break the group.
> 
> The following patch replaces it by heuristics, it is GIMPLE statement count
> based, but I think it should work pretty fine in general.
> The code counts how many statements there are for the group without
> store-merging (i.e. stores, if not storing constant, then associated loads
> as well, and bitwise stmts), and then counts how many are needed
> if store merging is performed and expected DCE gets rid of dead stmts
> - i.e. we count what we'll emit in the store merging sequences for the group
> (without the masking for bitfields with padding bits in between, as that
> is even normal stores to bitfields have that implicitly and only expansion
> makes it explicit into the IL) and whatever original statements had multiple
> uses (and stmts they use if any).
> So, e.g. if we have _1 = t.a; s.a = _1; _2 = t.b; s.b = _2; use (_1); we
> can still store merge it, as there are 4 original stmts and we'd replace it
> with 3 new stmts: _1 = t.a; _3 = [t.a;t.b]; [s.a;s.b] = _3; use (_1);
> while if we have _1 = t.a; s.a = _1; _2 = t.b; s.b = _2; use (_1, _2);
> we don't replace it anymore, as that would be trading 4 original for 4 new
> ones.
> 
> During the combined {x86_64,i686}-linux bootstraps/regtests, without this
> and the today posted patch the counts were:
> integer_cst  215621   442752
> mem_ref  12115   24919
> bit_and_expr  37   88
> bit_ior_expr  19   46
> bit_xor_expr  27   58
> and with the two patches:
> integer_cst  215605   442817
> mem_ref  17320   36133
> bit_and_expr  93   224
> bit_ior_expr  66   153
> bit_xor_expr  56   153
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2017-11-06  Jakub Jelinek  
> 
>   * gimple-ssa-store-merging.c (count_multiple_uses): New function.
>   (split_group): Add total_orig and total_new arguments, estimate the
>   number of statements related to the store group without store merging
>   and with store merging.
>   (imm_store_chain_info::output_merged_store): Adjust split_group
>   callers, punt if estimated number of statements with store merging
>   is not smaller than estimated number of statements without it.
>   Formatting fix.
>   (handled_load): Remove has_single_use checks.
>   (pass_store_merging::process_store): Likewise.
> 
> --- gcc/gimple-ssa-store-merging.c.jj 2017-11-06 08:57:07.0 +0100
> +++ gcc/gimple-ssa-store-merging.c2017-11-06 16:35:38.122437951 +0100
> @@ -1370,6 +1370,65 @@ find_constituent_stores (struct merged_s
>return ret;
>  }
>  
> +/* Return how many SSA_NAMEs used to compute value to store in the INFO
> +   store have multiple uses.  If any SSA_NAME has multiple uses, also
> +   count statements needed to compute it.  */
> +
> +static unsigned
> +count_multiple_uses (store_immediate_info *info)
> +{
> +  gimple *stmt = info->stmt;
> +  unsigned ret = 0;
> +  switch (info->rhs_code)
> +{
> +case INTEGER_CST:
> +  return 0;
> +case BIT_AND_EXPR:
> +case BIT_IOR_EXPR:
> +case BIT_XOR_EXPR:
> +  if (!has_single_use (gimple_assign_rhs1 (stmt)))
> + {
> +   ret += 1 + info->ops[0].bit_not_p;
> +   if (info->ops[1].base_addr)
> + ret += 1 + info->ops[1].bit_not_p;
> +   return ret + 1;
> + }
> +  stmt = SSA_NAME_DEF_STMT (gimple_assign_rhs1 (stmt));
> +  /* stmt is now the BIT_*_EXPR.  */
> +  if (!has_single_use (gimple_assign_rhs1 (stmt)))
> + ret += 1 + info->ops[0].bit_not_p;
> +  else if (info->ops[0].bit_not_p)
> + {
> +   gimple *stmt2 = SSA_NAME_DEF_STMT (gimple_assign_rhs1 (stmt));
> +   if (!has_single_use (gimple_assign_rhs1 (stmt2)))
> + ++ret;
> + }
> +  if (info->ops[1].base_addr == NULL_TREE)
> + return ret;
> +  if (!has_single_use (gimple_assign_rhs2 (stmt)))
> + ret += 1 + info->ops[1].bit_not_p;
> +  else if (info->ops[1].bit_not_p)
> + {
> +   gimple *stmt2 = SSA_NAME_DEF_STMT (gimple_assign_rhs2 (stmt));
> +   if (!has_single_use (gimple_assign_rhs1 (stmt2)))
> + ++ret;
> + }
> +  return ret;
> +case MEM_REF:
> +  if (!has_single_use (gimple_assign_rhs1 (stmt)))
> + return 1 + info->ops[0].bit_not_p;
> +  else if (info->ops[0].bit_not_p)
> + {
> +   stmt = SSA_NAME_DEF_STMT (gimple_assign_rhs1 (stmt));
> +   if (!has_single_use (gimple_assign_rhs1 (stmt)))
> + return 1;
> + }
> +  return 0;
> +default:
> +  gcc_unreachable ();
> +}

Can't you simply use

   unsigned ret = 0;
   FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_USE)
 if (!has_single_use (op))
   ++ret;
   return ret;

?  Not sure if the bit_not_p handling is required.

It doesn't seem you handle multi-uses 

[Ada] Missing finalization during deallocation

2017-11-08 Thread Pierre-Marie de Rodat
This patch corrects the expansion of Unchecked_Deallocation calls to operate
with the available view of the designated type. This ensures that if the type
is visible through a limited with clause, the expansion properly detects the
case where the designated type requires finalization actions.


-- Source --


--  lib.ads

package Lib with Pure is
end Lib;

--  lib-holder.ads

limited private with Lib.Holder.Impl;

package Lib.Holder is
   type Holder is private;
   function Create return Holder;
   procedure Destroy (H : in out Holder);
private
   type Holder is access all Lib.Holder.Impl.Holder_Type;
end Lib.Holder;

--  lib-holder.adb

with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Unchecked_Deallocation;

with Lib.Holder.Impl; use Lib.Holder.Impl;

package body Lib.Holder is
   procedure Free is new Ada.Unchecked_Deallocation (Holder_Type, Holder);

   function Create return Holder is
   begin
  return new Holder_Type'(S => To_Unbounded_String ("Hello world"));
   end Create;

   procedure Destroy (H : in out Holder) is
   begin
  Free (H);
   end Destroy;
end Lib.Holder;

--  lib-holder-impl.ads

with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;

package Lib.Holder.Impl is
   type Holder_Type is record
  S : Unbounded_String;
   end record;
end Lib.Holder.Impl;

--  main.adb

with Lib.Holder; use Lib.Holder;

procedure Main is
   H : Holder := Create;
begin
   Destroy (H);
end Main;


-- Compilation and output --


$ gnatmake -q main.adb -largs -lgmem
$ ./main
$ gnatmem main >& main.leaks
$ grep -c "non freed allocations" main.leaks
0

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-11-08  Hristian Kirtchev  

* exp_ch4.adb (Apply_Accessibility_Check): Do not finalize the object
when the associated access type is subject to pragma
No_Heap_Finalization.
* exp_intr.adb (Expand_Unc_Deallocation): Use the available view of the
designated type in case it comes from a limited withed unit.

Index: exp_ch4.adb
===
--- exp_ch4.adb (revision 254523)
+++ exp_ch4.adb (working copy)
@@ -630,7 +630,9 @@
 
 --[Deep_]Finalize (Obj_Ref.all);
 
-if Needs_Finalization (DesigT) then
+if Needs_Finalization (DesigT)
+  and then not No_Heap_Finalization (PtrT)
+then
Fin_Call :=
  Make_Final_Call
(Obj_Ref =>
Index: exp_intr.adb
===
--- exp_intr.adb(revision 254523)
+++ exp_intr.adb(working copy)
@@ -924,7 +924,8 @@
   Arg   : constant Node_Id:= First_Actual (N);
   Loc   : constant Source_Ptr := Sloc (N);
   Typ   : constant Entity_Id  := Etype (Arg);
-  Desig_Typ : constant Entity_Id  := Designated_Type (Typ);
+  Desig_Typ : constant Entity_Id  :=
+Available_View (Designated_Type (Typ));
   Needs_Fin : constant Boolean:= Needs_Finalization (Desig_Typ);
   Root_Typ  : constant Entity_Id  := Underlying_Type (Root_Type (Typ));
   Pool  : constant Entity_Id  := Associated_Storage_Pool (Root_Typ);


[Ada] Deconstruct storing SPARK cross-references in the ALI files

2017-11-08 Thread Pierre-Marie de Rodat
GNATprove relied on frontend writing cross-references data into the ALI
files to synthesize Global contracts. Now this is done by the GNATprove
itself. This patch deconstructs the frontend code, that is no longer needed.

No test, as only dead code removed.

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-11-08  Piotr Trojanek  

* lib-xref-spark_specific.adb (Add_SPARK_Scope): Remove detection of
protected operations.
(Add_SPARK_Xrefs): Simplify detection of empty entities.
* get_spark_xrefs.ads, get_spark_xrefs.adb, put_spark_xrefs.ads,
put_spark_xrefs.adb, spark_xrefs_test.adb: Remove code for writing,
reading and testing SPARK cross-references stored in the ALI files.
* lib-xref.ads (Output_SPARK_Xrefs): Remove.
* lib-writ.adb (Write_ALI): Do not write SPARK cross-references to the
ALI file.
* spark_xrefs.ads, spark_xrefs.adb (pspark): Remove, together
with description of the SPARK xrefs ALI format.
* gcc-interface/Make-lang.in (GNAT_ADA_OBJS): Remove get_spark_refs.o
and put_spark_refs.o.

Index: get_spark_xrefs.adb
===
--- get_spark_xrefs.adb (revision 254523)
+++ get_spark_xrefs.adb (working copy)
@@ -1,493 +0,0 @@
---
---  --
--- GNAT COMPILER COMPONENTS --
---  --
---   G E T _ S P A R K _ X R E F S  --
---  --
--- B o d y  --
---  --
---  Copyright (C) 2011-2016, Free Software Foundation, Inc. --
---  --
--- GNAT is free software;  you can  redistribute it  and/or modify it under --
--- terms of the  GNU General Public License as published  by the Free Soft- --
--- ware  Foundation;  either version 3,  or (at your option) any later ver- --
--- sion.  GNAT is distributed in the hope that it will be useful, but WITH- --
--- OUT ANY WARRANTY;  without even the  implied warranty of MERCHANTABILITY --
--- or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License --
--- for  more details.  You should have  received  a copy of the GNU General --
--- Public License  distributed with GNAT; see file COPYING3.  If not, go to --
--- http://www.gnu.org/licenses for a complete copy of the license.  --
---  --
--- GNAT was originally developed  by the GNAT team at  New York University. --
--- Extensive contributions were provided by Ada Core Technologies Inc.  --
---  --
---
-
-with SPARK_Xrefs; use SPARK_Xrefs;
-with Types;   use Types;
-
-with Ada.IO_Exceptions; use Ada.IO_Exceptions;
-
-procedure Get_SPARK_Xrefs is
-   C : Character;
-
-   use ASCII;
-   --  For CR/LF
-
-   Cur_File : Nat;
-   --  Dependency number for the current file
-
-   Cur_Scope : Nat;
-   --  Scope number for the current scope entity
-
-   Cur_File_Idx : File_Index;
-   --  Index in SPARK_File_Table of the current file
-
-   Cur_Scope_Idx : Scope_Index;
-   --  Index in SPARK_Scope_Table of the current scope
-
-   Name_Str : String (1 .. 32768);
-   Name_Len : Natural := 0;
-   --  Local string used to store name of File/entity scanned as
-   --  Name_Str (1 .. Name_Len).
-
-   File_Name : String_Ptr;
-   Unit_File_Name : String_Ptr;
-
-   ---
-   -- Local Subprograms --
-   ---
-
-   function At_EOL return Boolean;
-   --  Skips any spaces, then checks if at the end of a line. If so, returns
-   --  True (but does not skip the EOL sequence). If not, then returns False.
-
-   procedure Check (C : Character);
-   --  Checks that file is positioned at given character, and if so skips past
-   --  it, If not, raises Data_Error.
-
-   function Get_Nat return Nat;
-   --  On entry the file is positioned to a digit. On return, the file is
-   --  positioned past the last digit, and the returned result is the decimal
-   --  value read. Data_Error is raised for overflow (value greater than
-   --  Int'Last), or if the initial character is not a digit.
-
-   procedure Get_Name;
-   --  On entry the file is positioned to a name. On return, the file is
-   --  positioned past the last character, and the name scanned is returned
-   --  in Name_Str (1 .. Name_Len).
-
-   

  1   2   >