Re: [Mingw-w64-public] Fwd: [patch] Reimplement GNU threads library on native Windows

2019-06-28 Thread Liu Hao
在 2019/6/29 上午12:10, Jacek Caban 写道:
> 
> You don't really need to store the whole __gthr_win32_thr_desc in TLS.
> If you stored just the handle, this wouldn't need a destructor.
> 
> 

The handle to be stored in the TLS ('the Handle' for short hereinafter)
should be a real handle, so there are a few scenarios that we should
consider:

0) the Handle should be closed upon the spawned thread's exit; in this
   case a destructor is still necessary to prevent handle leaks.
1) the Handle is closed by the creator via either `*_join()` or
   `*_detach()`; in the latter case the Handle becomes invalid while
   the thread is running, so `*_self()` would return an invalid handle.

It seems inappropriate to use handles as thread identifiers (as handles
imply resource ownership and are not unique identifiers); thread IDs (as
`DWORD` or `unsigned long`) would be a better alternative.


-- 
Best regards,
LH_Mouse



signature.asc
Description: OpenPGP digital signature


Re: [PATCH 26/30] Changes to sh

2019-06-28 Thread Oleg Endo
On Tue, 2019-06-25 at 15:22 -0500, acsaw...@linux.ibm.com wrote:
> From: Aaron Sawdey 
> 
>   * config/sh/sh.md (movmemsi): Change name to cpymemsi.
> ---
>  gcc/config/sh/sh.md | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/config/sh/sh.md b/gcc/config/sh/sh.md
> index 8354377..ed70e34 100644
> --- a/gcc/config/sh/sh.md
> +++ b/gcc/config/sh/sh.md
> @@ -8906,7 +8906,7 @@
>  
>  ;; String/block move insn.
>  
> -(define_expand "movmemsi"
> +(define_expand "cpymemsi"
>[(parallel [(set (mem:BLK (match_operand:BLK 0))
>  (mem:BLK (match_operand:BLK 1)))
> (use (match_operand:SI 2 "nonmemory_operand"))

Looks like a trivial change.  It's OK.

Cheers,
Oleg



Re: [PATCH][debug] Fix handling of vlas in lto

2019-06-28 Thread Eric Botcazou
> 2018-08-17  Tom de Vries  
> 
>   * dwarf2out.c (add_scalar_info): Don't add reference to existing die
>   unless the referenced die describes the added property using
>   DW_AT_location or DW_AT_const_value.  Fall back to exprloc case.
>   Otherwise, add a DW_AT_location to the referenced die.

This breaks Ada though, i.e. any array type whose bound depends on a 
discriminant is affected:

   type Array_Type is array (Integer range <>) of Integer;
   type Record_Type (N : Integer) is record
  A : Array_Type (1 .. N);
   end record;

.uleb128 0x6# (DIE (0x66) DW_TAG_array_type)
.long   .LASF5  # DW_AT_name: "p__record_type__T4s"
.long   0x34# DW_AT_type
.long   0x79# DW_AT_sibling
.uleb128 0x7# (DIE (0x73) DW_TAG_subrange_type)
.long   0x2d# DW_AT_type
.byte   0   # end of children of DIE 0x66

Testcase attached, compile with -fgnat-encodings=minimal.

-- 
Eric Botcazoupackage P is

   type Array_Type is array (Integer range <>) of Integer;
   type Record_Type (N : Integer := 16) is record
  A : Array_Type (1 .. N);
   end record;

   R : Record_Type;

end P;


Re: [PATCH 2/2] [ARC] Fix emitting TLS symbols.

2019-06-28 Thread Jeff Law
On 6/28/19 7:39 AM, Claudiu Zissulescu wrote:
> This is another issue found during smoke testing glibc.
> 
> When storing a TLS symbol to memory, always use an intermediate
> register to load it.
> 
> Ok to apply?
> Claudiu
> 
> gcc/
> -xx-xx  Claudiu Zissulescu  
> 
>   * config/arc/arc.c (prepare_move_operands): Always use an
>   intermediate register when storing a TLS symbols.
OK.  If it's dependent on 1/2 of this series, then obviously wait for
1/2 to be approved before applying.

jeff


Re: [PATCH 32/30] Document movmem/cpymem changes in gcc-10/changes.html

2019-06-28 Thread Jeff Law
On 6/27/19 12:39 PM, Aaron Sawdey wrote:
> On 6/25/19 4:43 PM, Jeff Law wrote:
>> On 6/25/19 2:22 PM, acsaw...@linux.ibm.com wrote:
>>> From: Aaron Sawdey 
>>>
>>> * builtins.c (get_memory_rtx): Fix comment.
>>> * optabs.def (movmem_optab): Change to cpymem_optab.
>>> * expr.c (emit_block_move_via_cpymem): Change movmem to cpymem.
>>> (emit_block_move_hints): Change movmem to cpymem.
>>> * defaults.h: Change movmem to cpymem.
>>> * targhooks.c (get_move_ratio): Change movmem to cpymem.
>>> (default_use_by_pieces_infrastructure_p): Ditto.
>> So I think you're missing an update to the RTL/MD documentation.  This
>> is also likely to cause problems for any out-of-tree ports, so it's
>> probably worth a mention in the gcc-10 changes, which will need to be
>> created (in CVS no less, ugh).
>>
>> I think the stuff posted to-date is fine, but it shouldn't go in without
>> the corresponding docs and gcc-10 changes updates.
> 
> Here is the corresponding documentation change for gcc-10/changes.html.
> 
> OK for trunk?
OK
jeff


Re: [PATCH] Fix few build warnings with LLVM toolchain

2019-06-28 Thread Martin Sebor

On 6/28/19 2:10 PM, Segher Boessenkool wrote:

On Fri, Jun 28, 2019 at 07:46:54PM +0100, Richard Sandiford wrote:

Segher Boessenkool  writes:

On Fri, Jun 28, 2019 at 08:55:00AM -0600, Martin Sebor wrote:

Jeff reminded me in a code review the other day that GCC does
have a guideline for defining POD structs with the keyword
"struct" and classes with ctors/dtors using "class":

   https://gcc.gnu.org/codingconventions.html#Struct_Use

I quickly prototyped a warning to see how closely GCC follows
this convention.  The result shows that out of just under 800
structs and classes defined in GCC sources some 200 use
the keyword 'struct' despite having a ctor or dtor, or about
25%.  So as is the case with most other conventions, without
a tool to help remind us they exist they are unlikely to be
followed with enough consistency to be worth putting in place
to begin with.


The goal is not to have the rules adhered to.  The goal is to have
a more readable / maintainable / etc. codebase.


IMO it's a shame we don't follow that attitude for changelogs,
where apparently the number of spaces between the name and the email
address is of vital importance :-)  Too often a new contributor's
first taste of GCC is a slew of comments about things like that.


Yes.  It is important that people make good enough changelogs though.
But when a new contributor gets no comments other than about their
changelog, well, that is less than ideal, right.


But surely it's a valid point that we're not following our own
conventions on the C++ usage.  I miss how consistent the codebase
was in the C days...


I think you misunderstand me.  We *should* follow our coding conventions.
My point was that even if we break the rules in a quarter of cases, we
still win more than we lose, so this doesn't make the rule useless at all.


That may be so in some simple cases (perhaps formatting) but it
doesn't follow in all of them, and certainly not in this one.

The rationale for using "struct" for PODs and "class" for classes
with ctors or a dtor is "to signal more information."  That is only
accomplished if this signaling is consistent.  Otherwise, to be
sure, the user still has to confirm whether a struct really is
a POD by carefully inspecting its definition.

Speaking from recent experience, when I need to know whether I'm
dealing with a POD or not I need a reliable answer.  It does no
good to me if I use a struct thinking it's "most likely" a POD
in a template like hash_map or vec that only work correctly with
PODs and corrupts memory otherwise.  Or if I use a struct thinking
copying it is as fast as memcpy when it actually has a ctor that
dynamically allocates memory, only to find out much later after
benchmarking that it slowed down some important algorithm.  So
I would argue that without strict enforcement, this particular
convention is worse than useless: because of its unreliability
it's misleading and dangerous.

Martin

PS We know today that relying on convention for something as
important as this is a bad idea, and we have reliable ways of
enforcing the basic properties of types in containers and
algorithms: type traits and static assertions (they are
available in a limited form even in C++ 98 and C11).  With
those mechanisms in place these risk can be eliminated.  But
then, even with enforcement, the convention ceases to serve
its intended purpose.


Fail building modula2.

2019-06-28 Thread Ed Smith-Rowland via gcc-patches

Gaius,

I tried to apply your patch and build and got the following error:
--

/home/ed/obj_modula2/./prev-gcc/xg++ -B/home/ed/obj_modula2/./prev-gcc/ 
-B/home/ed/bin_modula2/x86_64-pc-linux-gnu/bin/ -nostdinc++ 
-B/home/ed/obj_modula2/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs 
-B/home/ed/obj_modula2/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs 
-I/home/ed/obj_modula2/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu 
-I/home/ed/obj_modula2/prev-x86_64-pc-linux-gnu/libstdc++-v3/include 
-I/home/ed/gcc_git/libstdc++-v3/libsupc++ 
-L/home/ed/obj_modula2/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs 
-L/home/ed/obj_modula2/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs 
-fno-PIE -g -O2 -fno-checking -gtoggle -DIN_GCC -fno-exceptions 
-fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing 
-Wwrite-strings -Wcast-qual -Wno-error=format-diag?? -fno-common?? 
-DHAVE_CONFIG_H -I. -Im2 -I../../gcc_git/gcc -I../../gcc_git/gcc/m2 
-I../../gcc_git/gcc/../include -I../../gcc_git/gcc/../libcpp/include 
-I/home/ed/obj_modula2/./gmp -I/home/ed/gcc_git/gmp 
-I/home/ed/obj_modula2/./mpfr/src -I/home/ed/gcc_git/mpfr/src 
-I/home/ed/gcc_git/mpc/src?? -I../../gcc_git/gcc/../libdecnumber 
-I../../gcc_git/gcc/../libdecnumber/bid -I../libdecnumber 
-I../../gcc_git/gcc/../libbacktrace -I/home/ed/obj_modula2/./isl/include 
-I/home/ed/gcc_git/isl/include?? \
 -DSTANDARD_STARTFILE_PREFIX=\"../../../\" 
-DSTANDARD_EXEC_PREFIX=\"/home/ed/bin_modula2/lib/gcc/\" 
-DSTANDARD_LIBEXEC_PREFIX=\"/home/ed/bin_modula2/libexec/gcc/\" 
-DDEFAULT_TARGET_VERSION=\"10.0.0\" 
-DDEFAULT_REAL_TARGET_MACHINE=\"x86_64-pc-linux-gnu\" 
-DDEFAULT_TARGET_MACHINE=\"x86_64-pc-linux-gnu\" 
-DSTANDARD_BINDIR_PREFIX=\"/home/ed/bin_modula2/bin/\" 
-DTOOLDIR_BASE_PREFIX=\"../../../../\" -DACCEL_DIR_SUFFIX=\"\" 
-DENABLE_SHARED_LIBGCC -DCONFIGURE_SPECS="\"\"" 
-DTOOL_INCLUDE_DIR=\"/home/ed/bin_modula2/lib/gcc/x86_64-pc-linux-gnu/10.0.0/../../../../x86_64-pc-linux-gnu/include\" 
-DNATIVE_SYSTEM_HEADER_DIR=\"/usr/include\" \

-DLIBSUBDIR=\"/home/ed/bin_modula2/lib/gcc/x86_64-pc-linux-gnu/10.0.0\" \
?? -DPREFIX=\"/home/ed/bin_modula2\" \
?? -c ../../gcc_git/gcc/m2/gm2spec.c -o 
m2/gm2spec.o)
../../gcc_git/gcc/m2/gm2spec.c: In function ???void add_B_prefix(unsigned 
int*, cl_decoded_option**)???:
../../gcc_git/gcc/m2/gm2spec.c:275:11: error: ???fe_B_prefix??? was not 
declared in this scope; did you mean ???add_B_prefix

?? 275 | fe_B_prefix (xstrdup (path));
?? | ^~~
?? | add_B_prefix
../../gcc_git/gcc/m2/gm2spec.c: In function ???const char* no_link(int, 
const char**)???:
../../gcc_git/gcc/m2/gm2spec.c:1406:3: error: ???allow_linker??? was not 
declared in this scope

??1406 | allow_linker = FALSE;
?? | ^~~~
../../gcc_git/gcc/m2/gm2spec.c: In function ???void 
lang_register_spec_functions()???:
../../gcc_git/gcc/m2/gm2spec.c:1416:3: error: ???fe_add_spec_function??? was 
not declared in this scope; did you mean ???spec_function

??1416 | fe_add_spec_function ("objects", get_objects);
?? | ^~~~
?? | spec_function
../../gcc_git/gcc/m2/Make-lang.in:59: recipe for target 'm2/gm2spec.o' 
failed

make[3]: *** [m2/gm2spec.o] Error 1
make[3]: Leaving directory '/home/ed/obj_modula2/gcc'
Makefile:4734: recipe for target 'all-stage2-gcc' failed
make[2]: *** [all-stage2-gcc] Error 2
make[2]: Leaving directory '/home/ed/obj_modula2'
Makefile:26901: recipe for target 'stage2-bubble' failed
make[1]: *** [stage2-bubble] Error 2
make[1]: Leaving directory '/home/ed/obj_modula2'
Makefile:1002: recipe for target 'all' failed
make: *** [all] Error 2

--

Thanks,

Ed Smith-Rowland




Re: [PATCH] Fix few build warnings with LLVM toolchain

2019-06-28 Thread Segher Boessenkool
On Fri, Jun 28, 2019 at 07:46:54PM +0100, Richard Sandiford wrote:
> Segher Boessenkool  writes:
> > On Fri, Jun 28, 2019 at 08:55:00AM -0600, Martin Sebor wrote:
> >> Jeff reminded me in a code review the other day that GCC does
> >> have a guideline for defining POD structs with the keyword
> >> "struct" and classes with ctors/dtors using "class":
> >> 
> >>   https://gcc.gnu.org/codingconventions.html#Struct_Use
> >> 
> >> I quickly prototyped a warning to see how closely GCC follows
> >> this convention.  The result shows that out of just under 800
> >> structs and classes defined in GCC sources some 200 use
> >> the keyword 'struct' despite having a ctor or dtor, or about
> >> 25%.  So as is the case with most other conventions, without
> >> a tool to help remind us they exist they are unlikely to be
> >> followed with enough consistency to be worth putting in place
> >> to begin with.
> >
> > The goal is not to have the rules adhered to.  The goal is to have
> > a more readable / maintainable / etc. codebase.
> 
> IMO it's a shame we don't follow that attitude for changelogs,
> where apparently the number of spaces between the name and the email
> address is of vital importance :-)  Too often a new contributor's
> first taste of GCC is a slew of comments about things like that.

Yes.  It is important that people make good enough changelogs though.
But when a new contributor gets no comments other than about their
changelog, well, that is less than ideal, right.

> But surely it's a valid point that we're not following our own
> conventions on the C++ usage.  I miss how consistent the codebase
> was in the C days...

I think you misunderstand me.  We *should* follow our coding conventions.
My point was that even if we break the rules in a quarter of cases, we
still win more than we lose, so this doesn't make the rule useless at all.

That isn't saying we shouldn't follow those rules, pretty much the
opposite.


Segher


[Darwin, PPC, committed] Install the same headers as other sub-targets.

2019-06-28 Thread Iain Sandoe
This is primarily in order to improve testsuite coverage, (it allows us to run
the compile tests for the x86 intrinisics support, for example).  We might
elect to prune the list at some point.

tested on powerpc-darwin9
applied to mainline
Iain

2019-06-28  Iain Sandoe  

* config.gcc (powerpc-*-darwin*, powerpc64-*-darwin*): Remove
override on extra_headers.

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 8e3285b..c281c41 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -2627,13 +2627,11 @@ powerpc-*-darwin*)
;;
esac
tmake_file="${tmake_file} t-slibgcc"
-   extra_headers=altivec.h
;;
 powerpc64-*-darwin*)
extra_options="${extra_options} ${cpu_type}/darwin.opt"
tmake_file="${tmake_file} ${cpu_type}/t-darwin64 t-slibgcc"
tm_file="${tm_file} ${cpu_type}/darwin8.h ${cpu_type}/darwin64.h"
-   extra_headers=altivec.h
;;
 powerpc*-*-freebsd*)
tm_file="${tm_file} dbxelf.h elfos.h gnu-user.h ${fbsd_tm_file} 
rs6000/sysv4.h"



Re: [PATCH] PowerPC Prefixed Memory, Patch #3, Update predicates

2019-06-28 Thread Michael Meissner
On Fri, Jun 28, 2019 at 08:20:35AM -0500, Segher Boessenkool wrote:
> Hi Mike,
> 
> On Thu, Jun 27, 2019 at 08:12:35PM -0400, Michael Meissner wrote:
> > +  return (TARGET_CMODEL == CMODEL_MEDIUM && SYMBOL_REF_P (op)
> > + && SYMBOL_REF_LOCAL_P (op));
> 
> Please break the line before that first && as well?

Ok.

> > +(define_predicate "pcrel_external_address"
> > +  (match_code "symbol_ref,const")
> > +{
> > +  if (!TARGET_PCREL || TARGET_CMODEL != CMODEL_MEDIUM)
> > +return false;
> 
>   if (!(TARGET_PCREL && TARGET_CMODEL == CMODEL_MEDIUM))
> 
> Should there be a helper function for this?  PCREL is only supported
> for medium model -- abstracting that makes both the current code easier
> to read, and if we ever want to support other models, that will be
> simpler to do as well.

Yeah, Bill's suggestion is probably what I should use.

> > +  /* Discard any CONST's.  */
> > +  if (GET_CODE (op) == CONST)
> > +op = XEXP (op, 0);
> 
> That comment says nothing the code already tells.  It's a common thing
> to do anyway; just don't comment on it?

Ok.

> > +;; Return 1 if op is a memory operand to an external variable when we
> > +;; support pc-relative addressing and the PCREL_OPT relocation to
> > +;; optimize references to it.
> > +(define_predicate "pcrel_external_mem_operand"
> > +  (match_code "mem")
> > +{
> > +  return pcrel_external_address (XEXP (op, 0), Pmode);
> > +})
> 
> Is that comment correct?  pcrel_external_address does nothing with
> PCREL_OPT?  Or _its_ comment doesn't say, at least.

Yes, I will re-word it.  We will need this predicate even if PCREL_OPT is not
being used.

> Okay for trunk with those trivialities resolved.  Thanks!
> 
> 
> Segher

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


[Darwin, committed] Fix a couple of Wformat-diag build warnings.

2019-06-28 Thread Iain Sandoe
As per $SUBJECT.
tested on a number of Darwin versions
applied to mainline
Iain


2019-06-28  Iain Sandoe  

* config/darwin-c.c (pop_field_alignment): Quote #pragma options.
* config/darwin-driver.c (darwin_default_min_version): Remove newline
from warning.
(darwin_driver_init): Likewise.

diff --git a/gcc/config/darwin-c.c b/gcc/config/darwin-c.c
index aa5d2f2..55b60da 100644
--- a/gcc/config/darwin-c.c
+++ b/gcc/config/darwin-c.c
@@ -79,7 +79,7 @@ pop_field_alignment (void)
   free (entry);
 }
   else
-error ("too many #pragma options align=reset");
+error ("too many %<#pragma options%> align=reset");
 }
 
 /* Handlers for Darwin-specific pragmas.  */
diff --git a/gcc/config/darwin-driver.c b/gcc/config/darwin-driver.c
index 3d85f29..49716fa 100644
--- a/gcc/config/darwin-driver.c
+++ b/gcc/config/darwin-driver.c
@@ -202,7 +202,7 @@ darwin_default_min_version (void)
   const char *checked = validate_macosx_version_min (new_flag);
   if (checked == NULL)
{
- warning (0, "couldn%'t understand version %s\n", new_flag);
+ warning (0, "couldn%'t understand version %s", new_flag);
  return NULL;
}
   new_flag = xstrndup (checked, strlen (checked));
@@ -298,7 +298,7 @@ darwin_driver_init (unsigned int *decoded_options_count,
  vers_string =
validate_macosx_version_min ((*decoded_options)[i].arg);
  if (vers_string == NULL)
-   warning (0, "%qs is not valid for %\n",
+   warning (0, "%qs is not valid for %",
 (*decoded_options)[i].arg);
  else if (vers_string == (*decoded_options)[i].arg)
vers_string = xstrndup ((*decoded_options)[i].arg, 32);



[PATCH] PowerPC Prefixed Memory, Patch #4, Add pc-relative reference support

2019-06-28 Thread Michael Meissner
At the moment, we have two functions that create and look at TOC references:

create_TOC_reference(global function)
use_toc_relative_ref(static function in rs6000.c)

Since I am adding pc-relative support that will be used instead of TOC support,
this patch renames these two functions to be:

create_data_reference
use_toc_or_pc_relative_ref

and it adds some of the foundation for adding pc-relative support.  Note, it
will need future patches to completely flesh out the pc-relative support.  As
per a previous private discussion, I kept the old create_TOC_reference and
added a gcc assert if it is called when pc-relative addressing is used.  There
is currently one place that still calls create_TOC_reference (in a section
using TLS addresses).

I have done a bootstrap on a power8 little endian Linux system with no
regressions.  Can I check this into the trunk?

2019-06-28  Michael Meissner  

* config/rs6000/rs6000-logue.c (create_TOC_reference): Move debug
statements to create_data_reference.  Add gcc_assert to make sure
we are using TOCs and not pc-relative addressing.
(create_data_reference): New function, create either a TOC style
reference or a pc-relative reference to local symbols.
* config/rs6000/rs6000-protos.h (create_data_reference): Add
declaration.
* config/rs6000/rs6000.c (rs6000_legitimize_address): Call
create_data_reference instead of create_TOC_reference.
(use_toc_or_pc_relative_ref): Rename from use_toc_relative_ref and
add support for pc-relative addressing.
(rs6000_emit_move): Call use_toc_or_pc_relative_ref and
create_data_reference instead of use_toc_relative_ref and
create_TOC_reference.
* config/rs6000/rs6000.md (cmp_internal2): Call
create_data_reference instead of create_TOC_reference.

Index: gcc/config/rs6000/rs6000-logue.c
===
--- gcc/config/rs6000/rs6000-logue.c(revision 272714)
+++ gcc/config/rs6000/rs6000-logue.c(working copy)
@@ -1406,23 +1406,13 @@ uses_TOC (void)
 }
 #endif
 
+/* Create a TOC style reference for a symbol.  */
 rtx
 create_TOC_reference (rtx symbol, rtx largetoc_reg)
 {
   rtx tocrel, tocreg, hi;
 
-  if (TARGET_DEBUG_ADDR)
-{
-  if (SYMBOL_REF_P (symbol))
-   fprintf (stderr, "\ncreate_TOC_reference, (symbol_ref %s)\n",
-XSTR (symbol, 0));
-  else
-   {
- fprintf (stderr, "\ncreate_TOC_reference, code %s:\n",
-  GET_RTX_NAME (GET_CODE (symbol)));
- debug_rtx (symbol);
-   }
-}
+  gcc_assert (TARGET_TOC && !TARGET_PCREL);
 
   if (!can_create_pseudo_p ())
 df_set_regs_ever_live (TOC_REGISTER, true);
@@ -1441,6 +1431,39 @@ create_TOC_reference (rtx symbol, rtx la
   return gen_rtx_LO_SUM (Pmode, hi, tocrel);
 }
 
+/* Create either a TOC reference to a locally defined item or a pc-relative
+   reference, depending on the ABI.  */
+rtx
+create_data_reference (rtx symbol, rtx largetoc_reg)
+{
+  if (TARGET_DEBUG_ADDR)
+{
+  const char *toc_or_pcrel = (TARGET_PCREL) ? "pc-relative" : "TOC";
+
+  if (SYMBOL_REF_P (symbol))
+   fprintf (stderr, "\ncreate_data_reference, abi %s, (symbol_ref %s)\n",
+toc_or_pcrel, XSTR (symbol, 0));
+  else
+   {
+ fprintf (stderr, "\ncreate_data_reference, abi %s, code %s:\n",
+  toc_or_pcrel, GET_RTX_NAME (GET_CODE (symbol)));
+ debug_rtx (symbol);
+   }
+}
+
+  /* We don't have to do anything special for pc-relative references.  The
+ SYMBOL_REF bits sayss whether the label is local where we can use
+ pc-relative addressing, or if we have to load the address from the GOT
+ section to use it on external addresses.  */
+  if (TARGET_PCREL)
+{
+  gcc_assert (TARGET_CMODEL == CMODEL_MEDIUM);
+  return symbol;
+}
+
+  return create_TOC_reference (symbol, largetoc_reg);
+}
+
 /* Issue assembly directives that create a reference to the given DWARF
FRAME_TABLE_LABEL from the current function section.  */
 void
Index: gcc/config/rs6000/rs6000-protos.h
===
--- gcc/config/rs6000/rs6000-protos.h   (revision 272714)
+++ gcc/config/rs6000/rs6000-protos.h   (working copy)
@@ -131,6 +131,7 @@ extern void rs6000_emit_swdiv (rtx, rtx,
 extern void rs6000_emit_swsqrt (rtx, rtx, bool);
 extern void output_toc (FILE *, rtx, int, machine_mode);
 extern void rs6000_fatal_bad_address (rtx);
+extern rtx create_data_reference (rtx, rtx);
 extern rtx create_TOC_reference (rtx, rtx);
 extern void rs6000_split_multireg_move (rtx, rtx);
 extern void rs6000_emit_le_vsx_permute (rtx, rtx, machine_mode);
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 272766)
+++ 

Re: [PATCH] Fix few build warnings with LLVM toolchain

2019-06-28 Thread Richard Sandiford
Segher Boessenkool  writes:
> On Fri, Jun 28, 2019 at 08:55:00AM -0600, Martin Sebor wrote:
>> Jeff reminded me in a code review the other day that GCC does
>> have a guideline for defining POD structs with the keyword
>> "struct" and classes with ctors/dtors using "class":
>> 
>>   https://gcc.gnu.org/codingconventions.html#Struct_Use
>> 
>> I quickly prototyped a warning to see how closely GCC follows
>> this convention.  The result shows that out of just under 800
>> structs and classes defined in GCC sources some 200 use
>> the keyword 'struct' despite having a ctor or dtor, or about
>> 25%.  So as is the case with most other conventions, without
>> a tool to help remind us they exist they are unlikely to be
>> followed with enough consistency to be worth putting in place
>> to begin with.
>
> The goal is not to have the rules adhered to.  The goal is to have
> a more readable / maintainable / etc. codebase.

IMO it's a shame we don't follow that attitude for changelogs,
where apparently the number of spaces between the name and the email
address is of vital importance :-)  Too often a new contributor's
first taste of GCC is a slew of comments about things like that.

But surely it's a valid point that we're not following our own
conventions on the C++ usage.  I miss how consistent the codebase
was in the C days...

That can be fixed by changing the conventions if we no longer think
they're a good idea.  But if someone's willing to change the codebase
to follow (one of) the coventions instead, and willing to add a warning
to keep things that way, then that sounds like a really good thing.
Especially when it's a common convention that others might want a
warning for too.

And I don't think the current state of the struct/class tags is really
a result of making the codebase more maintainable.  I get the feeling
it just kind-of happened.  ISTM we're trying too hard to find a reason
not to clean this up.

(Just replying because I know Martin has been on the receiving end of a
lot of review comments about patches that didn't follow the conventions.
Not necessarily any more than anyone else, and from what I remember
the comments were accurate.  But after all that, I don't think it's fair
to say that the conventions don't really matter.)

Thanks,
Richard


Re: [PATCH] Fix few build warnings with LLVM toolchain

2019-06-28 Thread Martin Sebor

On 6/28/19 10:59 AM, Jeff Law wrote:

On 6/28/19 8:55 AM, Martin Sebor wrote:

On 5/28/19 10:31 AM, Martin Sebor wrote:

On 5/28/19 4:24 AM, Martin Liška wrote:

On 5/28/19 11:31 AM, David CARLIER wrote:

Hi,

Here a tiny patch to fix few build warnings.

Kind regards.



Hi.

Well, I see a lot of these struct/class discrepancies when building
GCC with LLVM.
Question is whether it worth changing?


I think it's nice for these to be spelled consistently and no benefit
to mixing and matching them.  If it cleans up common warnings I see
no reason not to make the change.

FWIW, it's also a common convention to use struct for PODs and class
for types with user-defined ctors, and even if GCC doesn't subscribe
to it, make a change in support of it is an improvement independent
of the Visual C++ warning.  (As might be adding such a warning to
GCC to help enforce the convention on projects that do follow it.)


Jeff reminded me in a code review the other day that GCC does
have a guideline for defining POD structs with the keyword
"struct" and classes with ctors/dtors using "class":

   https://gcc.gnu.org/codingconventions.html#Struct_Use

I quickly prototyped a warning to see how closely GCC follows
this convention.  The result shows that out of just under 800
structs and classes defined in GCC sources some 200 use
the keyword 'struct' despite having a ctor or dtor, or about
25%.  So as is the case with most other conventions, without
a tool to help remind us they exist they are unlikely to be
followed with enough consistency to be worth putting in place
to begin with.

This kind of response just makes you look combative.  We have a
convention and we should follow it.  Please do so for your code.


I was responding to myself and adding support to what I suggested
earlier: to add such a warning to GCC to help enforce the convention
on projects that follow it, including GCC.



If we can cleanly implement this kind of diagnostic, then I would
certainly suggest we do so, fix our codebase and turn it on by default
for GCC builds, regardless of its status in -Wall.


I'd be happy to finish and submit the warning when I'm done with
the work whose review prompted me to implement it.

Martin


Re: [PATCH][middle-end/88784] Middle end is missing some optimizations about unsigned

2019-06-28 Thread Andrew Pinski
On Thu, Jun 27, 2019 at 9:55 PM Li Jia He  wrote:
>
>
>
> On 2019/6/27 11:48 PM, Jeff Law wrote:
> > On 6/27/19 12:11 AM, Li Jia He wrote:
> >> Hi,
> >>
> >> According to the optimizable case described by Qi Feng on
> >> issue 88784, we can combine the cases into the following:
> >>
> >> 1. x >  y  &&  x != XXX_MIN  -->   x > y
> >> 2. x >  y  &&  x == XXX_MIN  -->   false
> >> 3. x <= y  &&  x == XXX_MIN  -->   x == XXX_MIN
> >>
> >> 4. x <  y  &&  x != XXX_MAX  -->   x < y
> >> 5. x <  y  &&  x == XXX_MAX  -->   false
> >> 6. x >= y  &&  x == XXX_MAX  -->   x == XXX_MAX
> >>
> >> 7. x >  y  ||  x != XXX_MIN  -->   x != XXX_MIN
> >> 8. x <= y  ||  x != XXX_MIN  -->   true
> >> 9. x <= y  ||  x == XXX_MIN  -->   x <= y
> >>
> >> 10. x <  y  ||  x != XXX_MAX  -->   x != UXXX_MAX
> >> 11. x >= y  ||  x != XXX_MAX  -->   true
> >> 12. x >= y  ||  x == XXX_MAX  -->   x >= y
> >>
> >> Note: XXX_MIN represents the minimum value of type x.
> >>XXX_MAX represents the maximum value of type x.
> >>
> >> Here we don't need to care about whether the operation is
> >> signed or unsigned.  For example, in the below equation:
> >>
> >> 'x >  y  &&  x != XXX_MIN  -->   x > y'
> >>
> >> If the x type is signed int and XXX_MIN is INT_MIN, we can
> >> optimize it to 'x > y'.  However, if the type of x is unsigned
> >> int and XXX_MIN is 0, we can still optimize it to 'x > y'.
> >>
> >> The regression testing for the patch was done on GCC mainline on
> >>
> >>  powerpc64le-unknown-linux-gnu (Power 9 LE)
> >>
> >> with no regressions.  Is it OK for trunk ?
> >>
> >> Thanks,
> >> Lijia He
> >>
> >> gcc/ChangeLog
> >>
> >> 2019-06-27  Li Jia He  
> >>  Qi Feng  
> >>
> >>  PR middle-end/88784
> >>  * gimple-fold.c (and_comparisons_contain_equal_operands): New 
> >> function.
> >>  (and_comparisons_1): Use and_comparisons_contain_equal_operands.
> >>  (or_comparisons_contain_equal_operands): New function.
> >>  (or_comparisons_1): Use or_comparisons_contain_equal_operands.
> > Would this be better done via match.pd?  ISTM this transformation would
> > be well suited for that framework.
>
> Hi, Jeff
>
> I did this because of the following test case:
> `
> _Bool comp(unsigned x, unsigned y)
> {
>return x > y && x != 0;
> }
> `
> The gimple file dumped on the power platform is:
> `
> comp (unsigned int x, unsigned int y)
> {
>_Bool D.2837;
>int iftmp.0;
>
>if (x > y) goto ; else goto ;
>:
>if (x != 0) goto ; else goto ;
>:
>iftmp.0 = 1;
>goto ;
>:
>iftmp.0 = 0;
>:
>D.2837 = (_Bool) iftmp.0;
>return D.2837;
> }
> `
> However, the gimple file dumped on x86 is
> `
> comp (unsigned int x, unsigned int y)
> {
>_Bool D.2837;
>
>_1 = x > y;
>_2 = x != 0;
>_3 = _1 & _2;
>_4 = (int) _3;
>D.2837 = (_Bool) _4;
>return D.2837;
> }
> `
>
> The reason for the inconsistency between these two behaviors is param
> logical-op-non-short-circuit.  If we add the pattern to the match.pd
> file, we can only optimize the situation in which the statement is in
> the same basic block (logical-op-non-short-circuit=1, x86).  But for
> a cross-basic block (logical-op-non-short-circuit=0, power), match.pd
> can't handle this situation.
>
> Another reason is that I found out maybe_fold_and_comparisons and
> maybe_fold_or_comparisons are not only called by ifcombine pass but
> also by reassoc pass. Using this method can basically unify param
> logical-op-non-short-circuit=0 or 1.


As mentioned before ifcombine pass should be using gimple-match
instead of fold_build.  Try converting ifcombine over to gimple-match
infrastructure and add these to match.pd.
NOTE tree-ssa-phiopt should also be moved over to gimple-match but
that is a different issue.


Thanks,
Andrew Pinski

>
> Thanks,
> Lijia He
>
> >
> > jeff
> >
>


Re: [PATCH] Fix few build warnings with LLVM toolchain

2019-06-28 Thread Jeff Law
On 6/28/19 8:55 AM, Martin Sebor wrote:
> On 5/28/19 10:31 AM, Martin Sebor wrote:
>> On 5/28/19 4:24 AM, Martin Liška wrote:
>>> On 5/28/19 11:31 AM, David CARLIER wrote:
 Hi,

 Here a tiny patch to fix few build warnings.

 Kind regards.

>>>
>>> Hi.
>>>
>>> Well, I see a lot of these struct/class discrepancies when building
>>> GCC with LLVM.
>>> Question is whether it worth changing?
>>
>> I think it's nice for these to be spelled consistently and no benefit
>> to mixing and matching them.  If it cleans up common warnings I see
>> no reason not to make the change.
>>
>> FWIW, it's also a common convention to use struct for PODs and class
>> for types with user-defined ctors, and even if GCC doesn't subscribe
>> to it, make a change in support of it is an improvement independent
>> of the Visual C++ warning.  (As might be adding such a warning to
>> GCC to help enforce the convention on projects that do follow it.)
> 
> Jeff reminded me in a code review the other day that GCC does
> have a guideline for defining POD structs with the keyword
> "struct" and classes with ctors/dtors using "class":
> 
>   https://gcc.gnu.org/codingconventions.html#Struct_Use
> 
> I quickly prototyped a warning to see how closely GCC follows
> this convention.  The result shows that out of just under 800
> structs and classes defined in GCC sources some 200 use
> the keyword 'struct' despite having a ctor or dtor, or about
> 25%.  So as is the case with most other conventions, without
> a tool to help remind us they exist they are unlikely to be
> followed with enough consistency to be worth putting in place
> to begin with.
This kind of response just makes you look combative.  We have a
convention and we should follow it.  Please do so for your code.

If we can cleanly implement this kind of diagnostic, then I would
certainly suggest we do so, fix our codebase and turn it on by default
for GCC builds, regardless of its status in -Wall.

jeff


Re: [committed, amdgcn] Disable trampolines on GCN5

2019-06-28 Thread Andrew Stubbs

On 22/05/2019 23:21, Andrew Stubbs wrote:
Trampolines work just fine on GCN3 devices, but GCN5 devices have 
execute permission hardware, and the driver sets permission off for the 
private segment allocation in which the stacks are located.


It may be possible to implement indirect calls to nested functions 
another way, but for now this will prevent unexplained crashes or hangs 
on unsupported devices.


This is now backported to gcc-9-branch.

Andrew


Re: [RFA][tree-optimization/90883] Improve DSE to handle redundant calls

2019-06-28 Thread Jeff Law
On 6/28/19 1:55 AM, Rainer Orth wrote:
> Hi Christophe,
> 
>> I've also noticed that
>> FAIL: g++.dg/tree-ssa/pr90883.C   scan-tree-dump dse1 "Deleted
>> redundant store: .*.a = {}"
>> on aarch64
> 
> it also FAILs on sparc, ia64, m68k, and mips64el.
alpha is also affected.

Jeff


Re: [patch, c++ openmp] Improve diagnostics for unmappable types

2019-06-28 Thread Jason Merrill

On 6/28/19 6:46 AM, Andrew Stubbs wrote:
This patch improves the diagnostics given for unmappable C++ types in 
OpenMP programs.


Here is the output *without* the patch, for the new testcase:


unmappable-1.C: In function 'int main()':
unmappable-1.C:16:24: error: 'v' does not have a mappable type in 'map' 
clause

    16 | #pragma omp target map(v)


This is correct, but not very helpful for anything but the most trivial 
C++ types. Anything involving inheritance, templates, typedefs, etc. 
could be extremely difficult to track down.


With the patch applied we now get this (I removed the "dg-message" 
comments for clarity):



unmappable-1.C: In function 'int main()':
unmappable-1.C:16:24: error: 'v' does not have a mappable type in 'map' 
clause

    16 | #pragma omp target map(v)
   |    ^
cc1plus: note: incomplete types are not mappable
unmappable-1.C:4:7: note: types with virtual members are not mappable
     4 | class C
   |   ^
unmappable-1.C:7:14: note: static fields are not mappable
     7 |   static int static_member;
   |  ^


The compiler now reports all the problematic fields in the whole type, 
recursively. If anybody knows how to report the location of incomplete 
array declarations then that would be nice to add.


OK to commit?



+ inform ((decl ? DECL_SOURCE_LOCATION (decl) : UNKNOWN_LOCATION),
+ "incomplete types are not mappable");


It's better to use input_location as a fallback; essentially no 
diagnostics should use UNKNOWN_LOCATION.  And let's print the type with %qT.



+   if (notes)
+ result = false;
+   else
+ return false;


Returning early when !notes doesn't seem worth the extra lines of code.

Jason


Re: [RFA][tree-optimization/90883] Improve DSE to handle redundant calls

2019-06-28 Thread Jeff Law
On 6/28/19 1:55 AM, Rainer Orth wrote:
> Hi Christophe,
> 
>> I've also noticed that
>> FAIL: g++.dg/tree-ssa/pr90883.C   scan-tree-dump dse1 "Deleted
>> redundant store: .*.a = {}"
>> on aarch64
> 
> it also FAILs on sparc, ia64, m68k, and mips64el.
It's an interaction with CLEAR_RATIO that's causing the differences in
the generic code.  Using -Os seems to bias towards using an empty
constructor over a loop.

I suspect that's all we're going to need to fix this on these targets.

jeff


Re: [Mingw-w64-public] Fwd: [patch] Reimplement GNU threads library on native Windows

2019-06-28 Thread Jacek Caban

Hi Eric,

On 6/28/19 3:42 PM, NightStrike wrote:

FYI, Eric posted this today to the GCC patches list.  This may be of
great interest to many who would prefer native threads instead of the
winpthreads posix style interface.

Great work, Eric!  I look forward to trying this out!

-- Forwarded message -
From: Eric Botcazou 
Date: Fri, Jun 28, 2019 at 6:51 AM
Subject: [patch] Reimplement GNU threads library on native Windows
To: 
Cc: 


Hi,

this reimplements the GNU threads library on native Windows (except for the
Objective-C specific subset) using direct Win32 API calls, in lieu of the
implementation based on semaphores.  This base implementations requires
Windows XP/Server 2003, which is the default minimal setting of MinGW-W64.
This also adds the support required for the C++11 threads, using again direct
Win32 API calls; this additional layer requires Windows Vista/Server 2008 and
is enabled only if _GTHREADS_USE_COND is defined to 1.

This also changes libstdc++ to setting _GTHREADS_USE_COND to 1 when the switch
--enable-libstdcxx-threads is passed, which means that C++11 threads are still
disabled by default on native Windows and that you need to explicitly pass the
switch to enable them.  The 30_threads chapter of the testsuite is clean.

Tested on i686-pc-mingw32 and x86_64-pc-mingw32, OK for the mainline?



It's indeed great to see this. Thank you!



+/* The implementation strategy for the c++0x thread support is as follows.
+
+   A GNU thread is represented by a Win32 HANDLE that is obtained when the
+   Win32 thread is created, except of course for the initial thread.  This
+   Win32 HANDLE is stored in a descriptor keyed from TLS memory for every
+   thread, so the self routine can return it instead of having to duplicate
+   the pseudo-handle returned by GetCurrentThread each time it is invoked.
+   For the initial thread, this Win32 HANDLE is created during the first
+   call to the self routine using the aforementioned technique.
+
+   Note that the equal routine compares the identifier of threads instead
+   of their Win32 HANDLE, which will give the correct positive answer even
+   in the case where distinct Win32 HANDLEs have been created for the same
+   thread by multiple instances of libgcc included in the link.  */


Note that this will cause handle leaks if used across multiple libgcc 
instances, through.


+#include "gthr-win32.h"
+
+/* The thread descriptor keyed from TLS memory.  */
+struct __gthr_win32_thr_desc
+{
+  void *(*func) (void*);
+  void *args;
+  HANDLE h;
+};
+
+/* The TLS key used by one instance of the library.  */
+static __gthread_key_t __gthr_win32_tls = TLS_OUT_OF_INDEXES;
+
+/* The initialization device for the TLS key.  */
+static __gthread_once_t __gthr_win32_tls_once = __GTHREAD_ONCE_INIT;
+
+/* Initialize the TLS key.  */
+
+static void
+__gthr_win32_tls_init (void)
+{
+  if (__gthread_key_create (&__gthr_win32_tls, free))
+abort ();
+}



You don't really need to store the whole __gthr_win32_thr_desc in TLS. If you 
stored just the handle, this wouldn't need a destructor.


Thanks,
Jacek



Re: [RFA][tree-optimization/90883] Improve DSE to handle redundant calls

2019-06-28 Thread Jeff Law
On 6/28/19 1:55 AM, Rainer Orth wrote:
> Hi Christophe,
> 
>> I've also noticed that
>> FAIL: g++.dg/tree-ssa/pr90883.C   scan-tree-dump dse1 "Deleted
>> redundant store: .*.a = {}"
>> on aarch64
> 
> it also FAILs on sparc, ia64, m68k, and mips64el.
sparc seems to be emitting a loop when we expect an empty constructor
initializer.  I'm digging into the best way to address.

jeff


Re: [PATCH] Fix few build warnings with LLVM toolchain

2019-06-28 Thread Segher Boessenkool
On Fri, Jun 28, 2019 at 08:55:00AM -0600, Martin Sebor wrote:
> Jeff reminded me in a code review the other day that GCC does
> have a guideline for defining POD structs with the keyword
> "struct" and classes with ctors/dtors using "class":
> 
>   https://gcc.gnu.org/codingconventions.html#Struct_Use
> 
> I quickly prototyped a warning to see how closely GCC follows
> this convention.  The result shows that out of just under 800
> structs and classes defined in GCC sources some 200 use
> the keyword 'struct' despite having a ctor or dtor, or about
> 25%.  So as is the case with most other conventions, without
> a tool to help remind us they exist they are unlikely to be
> followed with enough consistency to be worth putting in place
> to begin with.

The goal is not to have the rules adhered to.  The goal is to have
a more readable / maintainable / etc. codebase.

And even *if* you evaluate success of a rule just by how well it is
followed, you'll need to look at separate areas of GCC separately,
not just look at one gross number, one randomly defined statistic.


Segher


Re: [PATCH] PowerPC Prefixed Memory, Patch #3, Update predicates

2019-06-28 Thread Bill Schmidt
On 6/28/19 8:20 AM, Segher Boessenkool wrote:
> Hi Mike,
>
> On Thu, Jun 27, 2019 at 08:12:35PM -0400, Michael Meissner wrote:
>> +  return (TARGET_CMODEL == CMODEL_MEDIUM && SYMBOL_REF_P (op)
>> +  && SYMBOL_REF_LOCAL_P (op));
> Please break the line before that first && as well?
>
>> +(define_predicate "pcrel_external_address"
>> +  (match_code "symbol_ref,const")
>> +{
>> +  if (!TARGET_PCREL || TARGET_CMODEL != CMODEL_MEDIUM)
>> +return false;
>   if (!(TARGET_PCREL && TARGET_CMODEL == CMODEL_MEDIUM))
>
> Should there be a helper function for this?  PCREL is only supported
> for medium model -- abstracting that makes both the current code easier
> to read, and if we ever want to support other models, that will be
> simpler to do as well.

I'd suggest

    if (!rs6000_pcrel_p ())

which will also make sure this is ELFv2.

Thanks,
Bill
>
>> +  /* Discard any CONST's.  */
>> +  if (GET_CODE (op) == CONST)
>> +op = XEXP (op, 0);
> That comment says nothing the code already tells.  It's a common thing
> to do anyway; just don't comment on it?
>
>> +;; Return 1 if op is a memory operand to an external variable when we
>> +;; support pc-relative addressing and the PCREL_OPT relocation to
>> +;; optimize references to it.
>> +(define_predicate "pcrel_external_mem_operand"
>> +  (match_code "mem")
>> +{
>> +  return pcrel_external_address (XEXP (op, 0), Pmode);
>> +})
> Is that comment correct?  pcrel_external_address does nothing with
> PCREL_OPT?  Or _its_ comment doesn't say, at least.
>
> Okay for trunk with those trivialities resolved.  Thanks!
>
>
> Segher
>



Re: [PATCH] Fix few build warnings with LLVM toolchain

2019-06-28 Thread Martin Sebor

On 5/28/19 10:31 AM, Martin Sebor wrote:

On 5/28/19 4:24 AM, Martin Liška wrote:

On 5/28/19 11:31 AM, David CARLIER wrote:

Hi,

Here a tiny patch to fix few build warnings.

Kind regards.



Hi.

Well, I see a lot of these struct/class discrepancies when building 
GCC with LLVM.

Question is whether it worth changing?


I think it's nice for these to be spelled consistently and no benefit
to mixing and matching them.  If it cleans up common warnings I see
no reason not to make the change.

FWIW, it's also a common convention to use struct for PODs and class
for types with user-defined ctors, and even if GCC doesn't subscribe
to it, make a change in support of it is an improvement independent
of the Visual C++ warning.  (As might be adding such a warning to
GCC to help enforce the convention on projects that do follow it.)


Jeff reminded me in a code review the other day that GCC does
have a guideline for defining POD structs with the keyword
"struct" and classes with ctors/dtors using "class":

  https://gcc.gnu.org/codingconventions.html#Struct_Use

I quickly prototyped a warning to see how closely GCC follows
this convention.  The result shows that out of just under 800
structs and classes defined in GCC sources some 200 use
the keyword 'struct' despite having a ctor or dtor, or about
25%.  So as is the case with most other conventions, without
a tool to help remind us they exist they are unlikely to be
followed with enough consistency to be worth putting in place
to begin with.

Martin


Re: [PATCH 1/5] Use alternative_mask for add_insn_allocno_copies

2019-06-28 Thread Vladimir Makarov



On 2019-06-28 7:46 a.m., Richard Sandiford wrote:

Ping.  (And thanks for the reviews of the other patches in the series.)

Sorry, I missed this patch.  Yes, the patch is OK, Richard.

Richard Sandiford  writes:

add_insn_allocno_copies and its subroutines used HARD_REG_SET to
represent a bitmask of alternatives.  There's not really any connection
between the number of registers and the maximum number of alternatives,
so this patch uses alternative_mask instead (which wasn't around when
this code was added).

This is just a minor clean-up making way for later patches.


2019-06-21  Richard Sandiford  

gcc/
* ira-int.h (ira_setup_alts, ira_get_dup_out_num): Use
alternative_mask instead of HARD_REG_SET to represent a
bitmask of alternatives.
* ira.c (ira_setup_alts, ira_get_dup_out_num): Likewise.
* ira-conflicts.c (add_insn_allocno_copies): Likewise.



[PATCH 1/2] [ARC] Fix and refurbish the interrupts.

2019-06-28 Thread Claudiu Zissulescu
When entering an interrupt, not only the call save registers needs to
be place on stack but also the call clobbers one. More over, the
ARC700 return from interrupt instruction needs to be rtie, the same
like ARCv2 CPUs. While the ARC6xx family uses j.f [ilinkX]
instruction. Additionally, we need to save the state of the ZOL
machinery, namely the lp_count, lp_end and lp_start registers. For
architectures which are using extension registers (i.e., HS48) we need
to save/restore them as well.

Ok to apply?
Claudiu

gcc/
-xx-xx  Claudiu Zissulescu  

* config/arc/arc-protos.h (arc_output_function_epilogue): Delete
declaration.
(arc_return_address_register): Likewise.
(arc_compute_function_type): Likewise.
(arc_compute_frame_size): Likewise.
(secondary_reload_info): Likewise.
(arc_get_unalign): Likewise.
(arc_can_use_return_insn): Declare.
* config/arc/arc.c (AUX_LP_START): Define
(AUX_LP_END): Likewise.
(arc_frame_info): Update gmask member to 64-bit datum.
(GMASK_LEN): Update.
(arc_compute_function_type): Make it static, move it forward.
(arc_must_save_register): Update, consider the extra regs.
(arc_compute_millicode_save_restore_regs): Update to use the 64
bit gmask.
(arc_compute_frame_size): Likewise.
(arc_enter_leave_p): Likewise.
(arc_save_callee_saves): Likewise.
(arc_restore_callee_saves): Likewise.
(arc_save_callee_enter): Likewise.
(arc_restore_callee_leave): Likewise.
(arc_save_callee_milli): Likewise.
(arc_restore_callee_milli): Likewise.
(arc_expand_prologue): Add new interrupt handling.
(arc_return_address_register): Make it static, move it forward.
(arc_expand_epilogue): Add new interrupt handling.
(arc_get_unalign): Delete.
(arc_epilogue_uses): Make sure we do not remove the extra
saved/restored registers when interrupt.
(arc_can_use_return_insn): New function.
* config/arc/arc.md (VUNSPEC_ARC_ARC600_RTIE): Define.
(R33_REG): Likewise.
(R34_REG): Likewise.
(R35_REG): Likewise.
(R36_REG): Likewise.
(R37_REG): Likewise.
(R38_REG): Likewise.
(R39_REG): Likewise.
(R45_REG): Likewise.
(R46_REG): Likewise.
(R47_REG): Likewise.
(R48_REG): Likewise.
(R49_REG): Likewise.
(R50_REG): Likewise.
(R51_REG): Likewise.
(R52_REG): Likewise.
(R53_REG): Likewise.
(R54_REG): Likewise.
(R55_REG): Likewise.
(R56_REG): Likewise.
(R58_REG): Likewise.
(type): Add rtie attribute.
(in_call_delay_slot): Use RETURN_ADDR_REGNUM.
(movsi_insn): Accept moves to lp_count.
(rtie): Update pattern.
(simple_return): Simplify it, don't use this pattern as a return
from an interrupt.
(arc600_rtie): New pattern.
(p_return_i): Clean up.
(return): Likewise.
* config/arc/builtins.def (rtie): Only available for non ARC6xx
family CPUs.
* config/arc/predicates.md (move_src_operand): Consider lp_count
as a register.

gcc/testsuite
-xx-xx  Claudiu Zissulescu  

* gcc.target/arc/arc.exp (check_effective_target_accregs): New
predicate.
* gcc.target/arc/builtin_special.c: Update test/
* gcc.target/arc/interrupt-1.c: Likewise.
* gcc.target/arc/interrupt-10.c: New test.
* gcc.target/arc/interrupt-11.c: Likewise.
---
 gcc/config/arc/arc-protos.h   |   7 +-
 gcc/config/arc/arc.c  | 700 +++---
 gcc/config/arc/arc.md | 139 ++--
 gcc/config/arc/builtins.def   |   2 +-
 gcc/config/arc/predicates.md  |   2 +
 gcc/testsuite/gcc.target/arc/arc.exp  |  18 +
 .../gcc.target/arc/builtin_special.c  |   2 +
 gcc/testsuite/gcc.target/arc/interrupt-1.c|   4 +-
 gcc/testsuite/gcc.target/arc/interrupt-10.c   |  36 +
 gcc/testsuite/gcc.target/arc/interrupt-11.c   |  16 +
 10 files changed, 574 insertions(+), 352 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arc/interrupt-10.c
 create mode 100644 gcc/testsuite/gcc.target/arc/interrupt-11.c

diff --git a/gcc/config/arc/arc-protos.h b/gcc/config/arc/arc-protos.h
index f501bc30ee7..0c9f422827d 100644
--- a/gcc/config/arc/arc-protos.h
+++ b/gcc/config/arc/arc-protos.h
@@ -25,7 +25,6 @@ extern machine_mode arc_select_cc_mode (enum rtx_code, rtx, 
rtx);
 extern struct rtx_def *gen_compare_reg (rtx, machine_mode);
 
 /* Declarations for various fns used in the .md file.  */
-extern void arc_output_function_epilogue (FILE *, HOST_WIDE_INT, int);
 extern const char *output_shift (rtx *);
 extern bool compact_sda_memory_operand (rtx, machine_mode, bool);
 extern bool arc_double_limm_p (rtx);
@@ -42,8 +41,6 @@ extern void 

[PATCH 2/2] [ARC] Fix emitting TLS symbols.

2019-06-28 Thread Claudiu Zissulescu
This is another issue found during smoke testing glibc.

When storing a TLS symbol to memory, always use an intermediate
register to load it.

Ok to apply?
Claudiu

gcc/
-xx-xx  Claudiu Zissulescu  

* config/arc/arc.c (prepare_move_operands): Always use an
intermediate register when storing a TLS symbols.

gcc/
-xx-xx  Claudiu Zissulescu  

* gcc/testsuite/gcc.target/arc/tls-2.c: New test.
* gcc/testsuite/gcc.target/arc/tls-3.c: Likewise.
---
 gcc/config/arc/arc.c |  2 +-
 gcc/testsuite/gcc.target/arc/tls-2.c | 14 ++
 gcc/testsuite/gcc.target/arc/tls-3.c | 19 +++
 3 files changed, 34 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/arc/tls-2.c
 create mode 100644 gcc/testsuite/gcc.target/arc/tls-3.c

diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index 910c8d2ad30..66a69628980 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -9272,7 +9272,7 @@ prepare_move_operands (rtx *operands, machine_mode mode)
   if (GET_CODE (operands[1]) == SYMBOL_REF)
 {
   enum tls_model model = SYMBOL_REF_TLS_MODEL (operands[1]);
-  if (MEM_P (operands[0]) && flag_pic)
+  if (MEM_P (operands[0]))
operands[1] = force_reg (mode, operands[1]);
   else if (model)
operands[1] = arc_legitimize_tls_address (operands[1], model);
diff --git a/gcc/testsuite/gcc.target/arc/tls-2.c 
b/gcc/testsuite/gcc.target/arc/tls-2.c
new file mode 100644
index 000..3cec309a0b4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/tls-2.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target tls } */
+/* { dg-skip-if "" { arc*-*-elf* } } */
+/* { dg-options "-O2" } */
+
+typedef int type_a;
+__thread int b;
+int c;
+
+extern int bar (char *, int, int *, int, int *, char, type_a, int *, int *);
+int foo (int *f, char buffer, type_a buflen, int *g)
+{
+  bar("", c, (int *)foo, 1, (int *)f, buffer, buflen, (int *)g, );
+}
diff --git a/gcc/testsuite/gcc.target/arc/tls-3.c 
b/gcc/testsuite/gcc.target/arc/tls-3.c
new file mode 100644
index 000..e78b5a22742
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/tls-3.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target tls } */
+/* { dg-skip-if "" { arc*-*-elf* } } */
+/* { dg-options "-Os -fPIC" } */
+
+
+typedef struct
+{
+  int(a);
+  char b[];
+} type_c;
+
+
+extern int bar (char *, int, char *);
+static _Thread_local type_c d;
+int foo(void)
+{
+  bar(d.b, 0,  d.b);
+}
-- 
2.21.0



Re: [PATCH][Arm] Implement vector average patterns in aarch32

2019-06-28 Thread Iain Apreotesei
Hi,

It seems I forgot to include the email's body as well :-)

This patch adds Arm patterns for the new AVG_FLOOR/CEIL operations.
AVG_FLOOR is VHADD and AVG_CEIL is VHRADD.

I have added three DejaGnu tests in the patch to ensure that the 
appropriate assembler is getting generated.

Is this okay for trunk? If so, can someone commit on my behalf,
as I have no commit rights.

Regards and thanks,
Iain Apreotesei

Le 28/06/2019 à 13:56, Iain Apreotesei a écrit :
> gcc/ChangeLog:
> 
> 2019-06-28  Iain Apreotesei  
> 
>      * config/arm/iterators.md (VRHADD, VHADD): Add, update int_iterators.
>      (u) new int_attr.
>      * config/arm/neon.md (avg3_floor, avg3_ceil)
>      (neon_vhadd, neon_vrhadd): Add new patterns.
> 
> gcc/testsuite/ChangeLog:
> 
> 2019-06-28  Iain Apreotesei  
> 
>      * gcc.target/arm/vect_vhadd_1.c: New test.
>      * gcc.target/arm/vect_vhadd_1.h: New test.
>      * gcc.target/arm/vect_vrhadd_1.c: New test.
> 
> Change-Id: Ief4009984ca9974993530b582bd9ba431e42c3ed
> ---
>   gcc/config/arm/iterators.md  |  9 +--
>   gcc/config/arm/neon.md   | 32 
> +---
>   gcc/testsuite/gcc.target/arm/vect_vhadd_1.c  | 22 +
>   gcc/testsuite/gcc.target/arm/vect_vhadd_1.h  | 37 
> 
>   gcc/testsuite/gcc.target/arm/vect_vrhadd_1.c | 22 +
>   5 files changed, 117 insertions(+), 5 deletions(-)
>   create mode 100644 gcc/testsuite/gcc.target/arm/vect_vhadd_1.c
>   create mode 100644 gcc/testsuite/gcc.target/arm/vect_vhadd_1.h
>   create mode 100644 gcc/testsuite/gcc.target/arm/vect_vrhadd_1.c
> 
> diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
> index c33e572..43ecc60 100644
> --- a/gcc/config/arm/iterators.md
> +++ b/gcc/config/arm/iterators.md
> @@ -308,8 +308,8 @@
> 
>   (define_int_iterator VADDW [UNSPEC_VADDW_S UNSPEC_VADDW_U])
> 
> -(define_int_iterator VHADD [UNSPEC_VRHADD_S UNSPEC_VRHADD_U
> -    UNSPEC_VHADD_S UNSPEC_VHADD_U])
> +(define_int_iterator VHADD[UNSPEC_VHADD_S UNSPEC_VHADD_U])
> +(define_int_iterator VRHADD[UNSPEC_VRHADD_S UNSPEC_VRHADD_U])
> 
>   (define_int_iterator VQADD [UNSPEC_VQADD_S UNSPEC_VQADD_U])
> 
> @@ -818,6 +818,11 @@
> 
>   ;; Mapping between vector UNSPEC operations and the signed ('s'),
>   ;; unsigned ('u'), poly ('p') or float ('f') nature of their data type.
> +
> +(define_int_attr u[
> +    (UNSPEC_VHADD_S "") (UNSPEC_VHADD_U "u")
> +    (UNSPEC_VRHADD_S "") (UNSPEC_VRHADD_U "u")])
> +
>   (define_int_attr sup [
>     (UNSPEC_VADDL_S "s") (UNSPEC_VADDL_U "u")
>     (UNSPEC_VADDW_S "s") (UNSPEC_VADDW_U "u")
> diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
> index f9d7ba3..1127bdb 100644
> --- a/gcc/config/arm/neon.md
> +++ b/gcc/config/arm/neon.md
> @@ -2179,15 +2179,41 @@
>     [(set_attr "type" "neon_add_widen")]
>   )
> 
> +(define_expand "avg3_floor"
> +  [(set (match_operand:VDQIW 0 "s_register_operand" "=w")
> +    (unspec:VDQIW[(match_operand:VDQIW 1 "s_register_operand" "w")
> +  (match_operand:VDQIW 2 "s_register_operand" "w")]
> + VHADD))]
> +  "TARGET_NEON"
> +)
> +
> +(define_expand "avg3_ceil"
> +  [(set (match_operand:VDQIW 0 "s_register_operand" "=w")
> +    (unspec:VDQIW[(match_operand:VDQIW 1 "s_register_operand" "w")
> +  (match_operand:VDQIW 2 "s_register_operand" "w")]
> + VRHADD))]
> +  "TARGET_NEON"
> +)
> +
>   ; vhadd and vrhadd.
> 
> -(define_insn "neon_vhadd"
> +(define_insn "neon_vhadd"
>     [(set (match_operand:VDQIW 0 "s_register_operand" "=w")
> -    (unspec:VDQIW [(match_operand:VDQIW 1 "s_register_operand" "w")
> +    (unspec:VDQIW[(match_operand:VDQIW 1 "s_register_operand" "w")
>      (match_operand:VDQIW 2 "s_register_operand" "w")]
>     VHADD))]
>     "TARGET_NEON"
> - "vhadd.%#\t%0, %1, %2"
> +  "vhadd.%#\t%0, %1, %2"
> +  [(set_attr "type" "neon_add_halve_q")]
> +)
> +
> +(define_insn "neon_vrhadd"
> +  [(set (match_operand:VDQIW 0 "s_register_operand" "=w")
> +    (unspec:VDQIW[(match_operand:VDQIW 1 "s_register_operand" "w")
> +   (match_operand:VDQIW 2 "s_register_operand" "w")]
> +  VRHADD))]
> +  "TARGET_NEON"
> +  "vrhadd.%#\t%0, %1, %2"
>     [(set_attr "type" "neon_add_halve_q")]
>   )
> 
> diff --git a/gcc/testsuite/gcc.target/arm/vect_vhadd_1.c 
> b/gcc/testsuite/gcc.target/arm/vect_vhadd_1.c
> new file mode 100644
> index 000..946171c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/vect_vhadd_1.c
> @@ -0,0 +1,22 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target arm_neon_hw } */
> +/* { dg-options "-O2 --save-temps -ftree-vectorize" } */
> +/* { dg-add-options arm_neon } */
> +
> +#include "vect_vhadd_1.h"
> +
> +#define BIAS 0
> +
> +FOR_EACH_SIGNED_TYPE (DEF_FUNC)
> +
> +int
> +main (void)
> +{
> +  FOR_EACH_SIGNED_TYPE (TEST_FUNC);
> +  return 0;
> +}
> +
> +/* { dg-final { scan-assembler {\tvhadd\.s[0-9]+} } 

Re: [PATCH] PowerPC Prefixed Memory, Patch #3, Update predicates

2019-06-28 Thread Segher Boessenkool
Hi Mike,

On Thu, Jun 27, 2019 at 08:12:35PM -0400, Michael Meissner wrote:
> +  return (TARGET_CMODEL == CMODEL_MEDIUM && SYMBOL_REF_P (op)
> +   && SYMBOL_REF_LOCAL_P (op));

Please break the line before that first && as well?

> +(define_predicate "pcrel_external_address"
> +  (match_code "symbol_ref,const")
> +{
> +  if (!TARGET_PCREL || TARGET_CMODEL != CMODEL_MEDIUM)
> +return false;

  if (!(TARGET_PCREL && TARGET_CMODEL == CMODEL_MEDIUM))

Should there be a helper function for this?  PCREL is only supported
for medium model -- abstracting that makes both the current code easier
to read, and if we ever want to support other models, that will be
simpler to do as well.

> +  /* Discard any CONST's.  */
> +  if (GET_CODE (op) == CONST)
> +op = XEXP (op, 0);

That comment says nothing the code already tells.  It's a common thing
to do anyway; just don't comment on it?

> +;; Return 1 if op is a memory operand to an external variable when we
> +;; support pc-relative addressing and the PCREL_OPT relocation to
> +;; optimize references to it.
> +(define_predicate "pcrel_external_mem_operand"
> +  (match_code "mem")
> +{
> +  return pcrel_external_address (XEXP (op, 0), Pmode);
> +})

Is that comment correct?  pcrel_external_address does nothing with
PCREL_OPT?  Or _its_ comment doesn't say, at least.

Okay for trunk with those trivialities resolved.  Thanks!


Segher


[PATCH] [ARC][COMMITTED] Fix slsr-13 regressions.

2019-06-28 Thread Claudiu Zissulescu
A recent RTX cost commit has changed the costs for ARC700 leading to
errors in slsr-13.c test.  This commit fixes this issue by reverting
the cost computation for short instructions.

2019-06-28  Claudiu Zissulescu  

* config/arc/arc.c (arc_rtx_costs): All short instructions are
having a lower cost regardless of the speed option.
---
 gcc/ChangeLog| 5 +
 gcc/config/arc/arc.c | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 35c5163da27..6f028eee3cc 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2019-06-28  Claudiu Zissulescu  
+
+   * config/arc/arc.c (arc_rtx_costs): All short instructions are
+   having a lower cost regardless of the speed option.
+
 2019-06-28  Jan Beulich  
 
* config/i386/sse.md (sse2_cvtpd2pi, sse2_cvttpd2pi): Use
diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index ff82c9f7136..5decf916884 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -5590,7 +5590,7 @@ arc_rtx_costs (rtx x, machine_mode mode, int outer_code,
break;
  }
  }
-   if (nolimm && !speed)
+   if (nolimm)
  {
*total = 0;
return true;
-- 
2.21.0



[PATCH][Arm] Implement vector average patterns in aarch32

2019-06-28 Thread Iain Apreotesei
gcc/ChangeLog:

2019-06-28  Iain Apreotesei  

     * config/arm/iterators.md (VRHADD, VHADD): Add, update int_iterators.
     (u) new int_attr.
     * config/arm/neon.md (avg3_floor, avg3_ceil)
     (neon_vhadd, neon_vrhadd): Add new patterns.

gcc/testsuite/ChangeLog:

2019-06-28  Iain Apreotesei  

     * gcc.target/arm/vect_vhadd_1.c: New test.
     * gcc.target/arm/vect_vhadd_1.h: New test.
     * gcc.target/arm/vect_vrhadd_1.c: New test.

Change-Id: Ief4009984ca9974993530b582bd9ba431e42c3ed
---
  gcc/config/arm/iterators.md  |  9 +--
  gcc/config/arm/neon.md   | 32 +---
  gcc/testsuite/gcc.target/arm/vect_vhadd_1.c  | 22 +
  gcc/testsuite/gcc.target/arm/vect_vhadd_1.h  | 37 

  gcc/testsuite/gcc.target/arm/vect_vrhadd_1.c | 22 +
  5 files changed, 117 insertions(+), 5 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/arm/vect_vhadd_1.c
  create mode 100644 gcc/testsuite/gcc.target/arm/vect_vhadd_1.h
  create mode 100644 gcc/testsuite/gcc.target/arm/vect_vrhadd_1.c

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index c33e572..43ecc60 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -308,8 +308,8 @@

  (define_int_iterator VADDW [UNSPEC_VADDW_S UNSPEC_VADDW_U])

-(define_int_iterator VHADD [UNSPEC_VRHADD_S UNSPEC_VRHADD_U
-    UNSPEC_VHADD_S UNSPEC_VHADD_U])
+(define_int_iterator VHADD[UNSPEC_VHADD_S UNSPEC_VHADD_U])
+(define_int_iterator VRHADD[UNSPEC_VRHADD_S UNSPEC_VRHADD_U])

  (define_int_iterator VQADD [UNSPEC_VQADD_S UNSPEC_VQADD_U])

@@ -818,6 +818,11 @@

  ;; Mapping between vector UNSPEC operations and the signed ('s'),
  ;; unsigned ('u'), poly ('p') or float ('f') nature of their data type.
+
+(define_int_attr u[
+    (UNSPEC_VHADD_S "") (UNSPEC_VHADD_U "u")
+    (UNSPEC_VRHADD_S "") (UNSPEC_VRHADD_U "u")])
+
  (define_int_attr sup [
    (UNSPEC_VADDL_S "s") (UNSPEC_VADDL_U "u")
    (UNSPEC_VADDW_S "s") (UNSPEC_VADDW_U "u")
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index f9d7ba3..1127bdb 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -2179,15 +2179,41 @@
    [(set_attr "type" "neon_add_widen")]
  )

+(define_expand "avg3_floor"
+  [(set (match_operand:VDQIW 0 "s_register_operand" "=w")
+    (unspec:VDQIW[(match_operand:VDQIW 1 "s_register_operand" "w")
+  (match_operand:VDQIW 2 "s_register_operand" "w")]
+ VHADD))]
+  "TARGET_NEON"
+)
+
+(define_expand "avg3_ceil"
+  [(set (match_operand:VDQIW 0 "s_register_operand" "=w")
+    (unspec:VDQIW[(match_operand:VDQIW 1 "s_register_operand" "w")
+  (match_operand:VDQIW 2 "s_register_operand" "w")]
+ VRHADD))]
+  "TARGET_NEON"
+)
+
  ; vhadd and vrhadd.

-(define_insn "neon_vhadd"
+(define_insn "neon_vhadd"
    [(set (match_operand:VDQIW 0 "s_register_operand" "=w")
-    (unspec:VDQIW [(match_operand:VDQIW 1 "s_register_operand" "w")
+    (unspec:VDQIW[(match_operand:VDQIW 1 "s_register_operand" "w")
     (match_operand:VDQIW 2 "s_register_operand" "w")]
    VHADD))]
    "TARGET_NEON"
- "vhadd.%#\t%0, %1, %2"
+  "vhadd.%#\t%0, %1, %2"
+  [(set_attr "type" "neon_add_halve_q")]
+)
+
+(define_insn "neon_vrhadd"
+  [(set (match_operand:VDQIW 0 "s_register_operand" "=w")
+    (unspec:VDQIW[(match_operand:VDQIW 1 "s_register_operand" "w")
+   (match_operand:VDQIW 2 "s_register_operand" "w")]
+  VRHADD))]
+  "TARGET_NEON"
+  "vrhadd.%#\t%0, %1, %2"
    [(set_attr "type" "neon_add_halve_q")]
  )

diff --git a/gcc/testsuite/gcc.target/arm/vect_vhadd_1.c 
b/gcc/testsuite/gcc.target/arm/vect_vhadd_1.c
new file mode 100644
index 000..946171c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/vect_vhadd_1.c
@@ -0,0 +1,22 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_neon_hw } */
+/* { dg-options "-O2 --save-temps -ftree-vectorize" } */
+/* { dg-add-options arm_neon } */
+
+#include "vect_vhadd_1.h"
+
+#define BIAS 0
+
+FOR_EACH_SIGNED_TYPE (DEF_FUNC)
+
+int
+main (void)
+{
+  FOR_EACH_SIGNED_TYPE (TEST_FUNC);
+  return 0;
+}
+
+/* { dg-final { scan-assembler {\tvhadd\.s[0-9]+} } } */
+/* { dg-final { scan-assembler {\tvhadd\.s[0-9]+} } } */
+/* { dg-final { scan-assembler {\tvhadd\.s[0-9]+} } } */
+/* { dg-final { scan-assembler-not {\tvrhadd\.s[0-9]+} } } */
diff --git a/gcc/testsuite/gcc.target/arm/vect_vhadd_1.h 
b/gcc/testsuite/gcc.target/arm/vect_vhadd_1.h
new file mode 100644
index 000..e093b42
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/vect_vhadd_1.h
@@ -0,0 +1,37 @@
+#include 
+
+#define N 100
+
+#define DEF_FUNC(TYPE, B1, B2, C1, C2)\
+void __attribute__ ((noipa)) \
+f_##TYPE (TYPE *restrict a, TYPE *restrict b, TYPE *restrict c)\
+{ \
+    for (int i = 0; i < N; ++i) \
+    a[i] = (b[i] + c[i] + BIAS) >> 1;\
+}
+
+#define TEST_FUNC(TYPE, B1, B2, C1, C2)\
+{ \
+    TYPE a[N], 

Reorg nonoverlapping_component_refs_p

2019-06-28 Thread Jan Hubicka
Hi,
this patch proceeds with further integrating
nonoverlapping_component_refs_for_decl_p and nonoverlapping_component_refs_p
however in bit more careful way then I intended previously (so we do not need
to xfail testcases and miss disambiguations we handled prevoiusly).  The
problems with an attempt to zap nonoverlapping_component_refs_for_decl_p was
twofold

 1) nonoverlapping_component_refs_for_decl_p is faster and it would be pity
to give up on that
 2) nonoverlapping_component_refs_for_decl_p is safe with -fno-strict-aliasing
while nonoverlapping_component_refs_p is not.

nonoverlapping_component_refs_for_decl_p matches initial segment of access
path and preserves info that if original bases was

  (*) either completely disjoint or equivalent (which is the case of decls)

then the part of access path walked has the same property.

The second simply looks for matching references in the middle of
path which relies on fact that if two record types are same by TBAA they
either overlap perfectly or not at all.

This patch refactors nonoverlapping_component_refs_for_decl_p to
nonoverlapping_component_refs_since_match_p and calls is in the places of
oracle where we establish (*) for refs.  This is done in few places -

 1) for decls that may be the same
 2) for pointers that are equal
 3) in aliasing_component_refs_p when we matched types.

In all these cases we can go to nonoverlapping_component_refs_since_match_p
check and in most cases it will be able to fully resolve aliasing.

If aliasing_component_refs_p did not find the match and TBAA says that one
access path can continue atop of other then there should be no matching types
so there is no need to call more expensive nonoverlapping_component_refs_p

There are corner case is that aliasing_component_refs_p gave up because of -1
in same_type_for_tbaa_p or where access paths contains something that
nonoverlapping_component_refs_since_match_p does not understand (union,
non-merged LTO types and such).  In this case we resort to
nonoverlapping_component_refs_p.

I have more testcases but they need bit more tweaking to
nonoverlapping_component_refs_since_match_p.  I tried to keep this patch small.
There are possible additional optimization, for example
nonoverlapping_component_refs does not need to walk whole path in many cases.

Tramp3d disambiguations change from:

  refs_may_alias_p: 3486566 disambiguations, 3820994 queries
  ref_maybe_used_by_call_p: 6790 disambiguations, 3512955 queries
  call_may_clobber_ref_p: 883 disambiguations, 883 queries
  nonoverlapping_component_refs_p: 23 disambiguations, 64926 queries
  nonoverlapping_component_refs_of_decl_p: 19 disambiguations, 2314 queries
  aliasing_component_refs_p: 893 disambiguations, 20671 queries
  TBAA oracle: 1766713 disambiguations 3393231 queries
   536475 are in alias set 0
   694528 queries asked about the same object
   0 queries asked about the same alias set
   0 access volatile
   264570 are dependent in the DAG
   130945 are aritificially in conflict with void *

PTA query stats:
  pt_solution_includes: 639122 disambiguations, 922905 queries
  pt_solutions_intersect: 119319 disambiguations, 506108 queries

to:
  refs_may_alias_p: 3486577 disambiguations, 3821005 queries
  ref_maybe_used_by_call_p: 6790 disambiguations, 3512966 queries
  call_may_clobber_ref_p: 883 disambiguations, 883 queries
  nonoverlapping_component_refs_p: 0 disambiguations, 8953 queries
  nonoverlapping_component_refs_since_match_p: 31 disambiguations, 34795 queries
  aliasing_component_refs_p: 916 disambiguations, 29748 queries
  TBAA oracle: 1766713 disambiguations 3394373 queries
   536477 are in alias set 0
   694529 queries asked about the same object
   0 queries asked about the same alias set
   0 access volatile
   265709 are dependent in the DAG
   130945 are aritificially in conflict with void *

PTA query stats:
  pt_solution_includes: 639122 disambiguations, 922905 queries
  pt_solutions_intersect: 119319 disambiguations, 506118 queries

So 55973 fewer nonoverlapping_component_refs_p calls but
32481 more nonoverlapping_component_refs_since_match_p calls
and 9077 more aliasing_component_refs_p calls.

The decrease of disambiguations of
nonoverlapping_component_refs_p+nonoverlapping_component_refs_since_match_p is
caused by fact that aliasing_component_refs_p disambiguates based on the
nonoverlapping_ranges more frequently now.
Overall there are 11 new disambiguations.

lto-bootstrapped/regtested x86_64-linux. OK?

* tree-ssa-alias.c (nonoverlapping_component_refs_for_decl_p): Rename
to ..
(nonoverlapping_component_refs_since_match_p): ... this one;
handle also non-decl bases; return -1 if search gave up.
(alias_stats): Rename nonoverlapping_component_refs_of_decl_p_may_alias,

[PATCH] make gdbhooks.py idempotent with respect to reloading

2019-06-28 Thread Vladislav Ivanishin
Hi!

It is nice to be able to reload the pretty printers and convenience
functions from gdbhooks.py without exiting GDB: reloading cc1 takes
several seconds (plus, the debugging session is lost).

Previously:

   (gdb) python import imp; imp.reload(gdbhooks);
   RuntimeError: pretty-printer already registered: gcc

Fixing this turned out easier than I expected.
(gdb) py help (gdb.printing)
revealed, that we can pass replace parameter to register_pretty_printer
(which is False by default).

With the patch:

(gdb) python import imp; imp.reload(gdbhooks);
Successfully loaded GDB hooks for GCC

gcc/

* gdbhooks.py: Pass replace=True to
gdb.printing.register_pretty_printer.
---
 gcc/gdbhooks.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/gdbhooks.py b/gcc/gdbhooks.py
index 15a5ceaa56f..26a09749aa3 100644
--- a/gcc/gdbhooks.py
+++ b/gcc/gdbhooks.py
@@ -602,7 +602,8 @@ def build_pretty_printer():
 
 gdb.printing.register_pretty_printer(
 gdb.current_objfile(),
-build_pretty_printer())
+build_pretty_printer(),
+replace=True)
 
 def find_gcc_source_dir():
 # Use location of global "g" to locate the source tree
-- 
2.22.0


OK?

BTW, perhaps I should also add a convenience function for 'import imp;
imp.reload(gdbhooks)' or something to that effect?

-- 
Vlad


Re: [PATCH 1/5] Use alternative_mask for add_insn_allocno_copies

2019-06-28 Thread Richard Sandiford
Ping.  (And thanks for the reviews of the other patches in the series.)

Richard Sandiford  writes:
> add_insn_allocno_copies and its subroutines used HARD_REG_SET to
> represent a bitmask of alternatives.  There's not really any connection
> between the number of registers and the maximum number of alternatives,
> so this patch uses alternative_mask instead (which wasn't around when
> this code was added).
>
> This is just a minor clean-up making way for later patches.
>
>
> 2019-06-21  Richard Sandiford  
>
> gcc/
>   * ira-int.h (ira_setup_alts, ira_get_dup_out_num): Use
>   alternative_mask instead of HARD_REG_SET to represent a
>   bitmask of alternatives.
>   * ira.c (ira_setup_alts, ira_get_dup_out_num): Likewise.
>   * ira-conflicts.c (add_insn_allocno_copies): Likewise.
>
> Index: gcc/ira-int.h
> ===
> --- gcc/ira-int.h 2019-06-21 14:34:05.0 +0100
> +++ gcc/ira-int.h 2019-06-21 14:34:05.883715050 +0100
> @@ -963,8 +963,8 @@ extern void ira_print_disposition (FILE
>  extern void ira_debug_disposition (void);
>  extern void ira_debug_allocno_classes (void);
>  extern void ira_init_register_move_cost (machine_mode);
> -extern void ira_setup_alts (rtx_insn *insn, HARD_REG_SET );
> -extern int ira_get_dup_out_num (int op_num, HARD_REG_SET );
> +extern alternative_mask ira_setup_alts (rtx_insn *);
> +extern int ira_get_dup_out_num (int, alternative_mask);
>  
>  /* ira-build.c */
>  
> Index: gcc/ira.c
> ===
> --- gcc/ira.c 2019-06-21 14:34:05.0 +0100
> +++ gcc/ira.c 2019-06-21 14:34:05.887715020 +0100
> @@ -1784,9 +1784,12 @@ setup_prohibited_mode_move_regs (void)
>  
>  
>  
> -/* Setup possible alternatives in ALTS for INSN.  */
> -void
> -ira_setup_alts (rtx_insn *insn, HARD_REG_SET )
> +/* Extract INSN and return the set of alternatives that we should consider.
> +   This excludes any alternatives whose constraints are obviously impossible
> +   to meet (e.g. because the constraint requires a constant and the operand
> +   is nonconstant).  */
> +alternative_mask
> +ira_setup_alts (rtx_insn *insn)
>  {
>/* MAP nalt * nop -> start of constraints for given operand and
>   alternative.  */
> @@ -1798,7 +1801,7 @@ ira_setup_alts (rtx_insn *insn, HARD_REG
>  
>extract_insn (insn);
>alternative_mask preferred = get_preferred_alternatives (insn);
> -  CLEAR_HARD_REG_SET (alts);
> +  alternative_mask alts = 0;
>insn_constraints.release ();
>insn_constraints.safe_grow_cleared (recog_data.n_operands
> * recog_data.n_alternatives + 1);
> @@ -1833,8 +1836,7 @@ ira_setup_alts (rtx_insn *insn, HARD_REG
>   }
>for (nalt = 0; nalt < recog_data.n_alternatives; nalt++)
>   {
> -   if (!TEST_BIT (preferred, nalt)
> -   || TEST_HARD_REG_BIT (alts, nalt))
> +   if (!TEST_BIT (preferred, nalt) || TEST_BIT (alts, nalt))
>   continue;
>  
> for (nop = 0; nop < recog_data.n_operands; nop++)
> @@ -1906,7 +1908,7 @@ ira_setup_alts (rtx_insn *insn, HARD_REG
> ;
>   }
> if (nop >= recog_data.n_operands)
> - SET_HARD_REG_BIT (alts, nalt);
> + alts |= ALTERNATIVE_BIT (nalt);
>   }
>if (commutative < 0)
>   break;
> @@ -1916,6 +1918,7 @@ ira_setup_alts (rtx_insn *insn, HARD_REG
>if (curr_swapped)
>   break;
>  }
> +  return alts;
>  }
>  
>  /* Return the number of the output non-early clobber operand which
> @@ -1923,7 +1926,7 @@ ira_setup_alts (rtx_insn *insn, HARD_REG
> negative value if there is no such operand).  The function takes
> only really possible alternatives into consideration.  */
>  int
> -ira_get_dup_out_num (int op_num, HARD_REG_SET )
> +ira_get_dup_out_num (int op_num, alternative_mask alts)
>  {
>int curr_alt, c, original, dup;
>bool ignore_p, use_commut_op_p;
> @@ -1940,7 +1943,7 @@ ira_get_dup_out_num (int op_num, HARD_RE
>  {
>rtx op = recog_data.operand[op_num];
>
> -  for (curr_alt = 0, ignore_p = !TEST_HARD_REG_BIT (alts, curr_alt),
> +  for (curr_alt = 0, ignore_p = !TEST_BIT (alts, curr_alt),
>  original = -1;;)
>   {
> c = *str;
> @@ -1951,7 +1954,7 @@ ira_get_dup_out_num (int op_num, HARD_RE
> else if (c == ',')
>   {
> curr_alt++;
> -   ignore_p = !TEST_HARD_REG_BIT (alts, curr_alt);
> +   ignore_p = !TEST_BIT (alts, curr_alt);
>   }
> else if (! ignore_p)
>   switch (c)
> Index: gcc/ira-conflicts.c
> ===
> --- gcc/ira-conflicts.c   2019-06-21 14:34:05.0 +0100
> +++ gcc/ira-conflicts.c   2019-06-21 14:34:05.883715050 +0100
> @@ -358,7 +358,7 @@ add_insn_allocno_copies (rtx_insn *insn)
>rtx set, operand, dup;
>bool 

Re: [PATCH], PR Fix floatsi{sf,df}2 and floatunssi{sf,df}2 for a future powerpc machine

2019-06-28 Thread Segher Boessenkool
Hi Mike,

On Thu, Jun 27, 2019 at 12:49:00PM -0400, Michael Meissner wrote:
> As I detail in PR 91009, I had some testsuite failures with my patches for a
> future machine.  In the future patches, I added new RTL attributes to support
> the new prefixed load/store instructions (which will include pc-relative
> support).  This new attribute needs to look at the 'type' RTL attribute, and
> that in turn depends on doing a constrain operands to determine whether the
> insn is a load, store, or add instruction.
> 
> This constrain operands call occurs before the first split pass.  The insns
> that convert SImode to SFmode or DFmode which are split in the first pass,
> currently use the "wa" constraint for the floating point result.  
> Unfortunately
> if the user uses the -mno-vsx option, the "wa" constraint becomes NO_REGS.

Yes, constrain_operands is called very early.  It's called during combine
for pretty much all insns, but it's already called by ifcvt ("ce1", the
eighth or so RTL pass) for some.  Instruction costing needs to figure out
which instruction alternatives are enabled, and which not.

> This patch is a rather limited patch that just adds an alternative using the
> "d" constraint, so there is a valid alternative before splitting.

The "before splitting" isn't really relevant: you have to have some
alternative enabled for all insns you generate.  Because lfiwax (the
machine instruction after which this pattern is named) only works on
floating point registers, not all VSX registers, and this pattern is
generated from places that do not check if VSX is enabled, adding a
"d" alternative sounds good yes.

> This patch does not do the cleanup mentioned in PR 90822.  That will still be
> done, but I wanted to fix the actual problem before doing the next iteration 
> of
> the code cleanup.
> 
> I have done a bootstrap and make check, and there were no regressions.  Can I
> check this into the trunk?

[ Please mention on what kind of system, what configuration.  This matters
  when things go wrong.  Something like "powerpc64le-linux, power8" is
  enough info usually, if you used default configurations. ]

> 2019-06-27   Michael Meissner  
> 
>   PR target/91009
>   * config/rs6000/rs6000.md (floatsi2_lfiwax): Add non-VSX
>   alternative.
>   (floatsi2_lfiwax_mem): Add non-VSX alternative.
>   (floatunssi2_lfiwzx): Add non-VSX alternative.
>   (floatunssi2_lfiwzx_mem): Add non-VSX alternative.

The patch is fine for trunk.  Thanks Mike.


Segher


[PATCH, i386]: Use SSE instructions for MMX modes on 32bit x86

2019-06-28 Thread Uros Bizjak
Attached patch enables SSE instructions for MMX modes also for 32bit
x86 targets. We can't use the same approach as with 64bit targets
(disable all MMX instructions to enable autovectorization) since ABI
mandates MMX modes in MMX registers, and sometimes there is no
equivalent SSE insn on 32bit target. OTOH, we shouldn't avoid SSE
instructions, so allow them when 1:1 SSE instruction is available.
Please note that the patch doesn't split MMX insn to a multi-insn SSE
sequence, since we can't avoid MMX insn anyway.

2019-06-28  Uroš Bizjak  

* config/i386/i386.md (mmx_isa): Rename x64, x64_noavx and x64_avx
to sse, sse_noavx and avx.  Update all uses.

2019-06-28  Uroš Bizjak  

* config/i386/mmx.md (sse_movntq): Add "isa" attribute.
(*mmx_3): Ditto.
(*mmx_mulv4hi3"): Ditto.
(*mmx_smulv4hi3_highpart): Ditto.
(*mmx_umulv4hi3_highpart): Ditto.
(*mmx_pmaddwd): Ditto.
(*sse2_umulv1siv1di3): Ditto.
(*mmx_v4hi3): Ditto.
(*mmx_v8qi3): Ditto.
(mmx_ashr3): Ditto.
("mmx_3): Ditto.
(*mmx_eq3): Ditto.
(mmx_gt3): Ditto.
(mmx_andnot3): Ditto.
(*mmx_3): Ditto.
(*mmx_pinsrw): Ditto.
(*mmx_pextrw): Ditto.
(mmx_pshufw_1): Ditto.
(*mmx_uavgv8qi3): Ditto.
(*mmx_uavgv4hi3): Ditto.
("mmx_psadbw): Ditto.
* config/i386/sse.md (sse_cvtps2pi): Ditto.
(sse_cvttps2pi): Ditto.
(ssse3_pmaddubsw): Ditto.
(*ssse3_pmulhrswv4hi3): Ditto.
(ssse3_psign3): Ditto.

Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

I'll commit the patch to mainline SVN in a couple of days.

Uros.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 19beeb2a59a8..c362a72411a2 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -801,7 +801,7 @@
   (const_string "base"))
 
 ;; Define instruction set of MMX instructions
-(define_attr "mmx_isa" "base,native,x64,x64_noavx,x64_avx"
+(define_attr "mmx_isa" "base,native,sse,sse_noavx,avx"
   (const_string "base"))
 
 (define_attr "enabled" ""
@@ -845,12 +845,12 @@
 
 (eq_attr "mmx_isa" "native")
   (symbol_ref "!TARGET_MMX_WITH_SSE")
-(eq_attr "mmx_isa" "x64")
+(eq_attr "mmx_isa" "sse")
   (symbol_ref "TARGET_MMX_WITH_SSE")
-(eq_attr "mmx_isa" "x64_avx")
-  (symbol_ref "TARGET_MMX_WITH_SSE && TARGET_AVX")
-(eq_attr "mmx_isa" "x64_noavx")
+(eq_attr "mmx_isa" "sse_noavx")
   (symbol_ref "TARGET_MMX_WITH_SSE && !TARGET_AVX")
+(eq_attr "mmx_isa" "avx")
+  (symbol_ref "TARGET_MMX_WITH_SSE && TARGET_AVX")
]
(const_int 1)))
 
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index dc8dabfafc85..70413349390c 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -247,7 +247,8 @@
   "@
movntq\t{%1, %0|%0, %1}
movnti\t{%1, %0|%0, %1}"
-  [(set_attr "mmx_isa" "native,x64")
+  [(set_attr "isa" "*,x64")
+   (set_attr "mmx_isa" "native,*")
(set_attr "type" "mmxmov,ssemov")
(set_attr "mode" "DI")])
 
@@ -594,7 +595,7 @@
(vec_duplicate:V4SF (match_dup 1)))]
   "operands[0] = lowpart_subreg (V4SFmode, operands[0],
 GET_MODE (operands[0]));"
-  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+  [(set_attr "mmx_isa" "native,sse_noavx,avx")
(set_attr "type" "mmxcvt,ssemov,ssemov")
(set_attr "mode" "DI,TI,TI")])
 
@@ -730,7 +731,8 @@
p\t{%2, %0|%0, %2}
p\t{%2, %0|%0, %2}
vp\t{%2, %1, %0|%0, %1, %2}"
-  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+  [(set_attr "isa" "*,sse2_noavx,avx")
+   (set_attr "mmx_isa" "native,*,*")
(set_attr "type" "mmxadd,sseadd,sseadd")
(set_attr "mode" "DI,TI,TI")])
 
@@ -753,7 +755,8 @@
p\t{%2, %0|%0, %2}
p\t{%2, %0|%0, %2}
vp\t{%2, %1, %0|%0, %1, %2}"
-  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+  [(set_attr "isa" "*,sse2_noavx,avx")
+   (set_attr "mmx_isa" "native,*,*")
(set_attr "type" "mmxadd,sseadd,sseadd")
(set_attr "mode" "DI,TI,TI")])
 
@@ -781,7 +784,8 @@
pmullw\t{%2, %0|%0, %2}
pmullw\t{%2, %0|%0, %2}
vpmullw\t{%2, %1, %0|%0, %1, %2}"
-  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+  [(set_attr "isa" "*,sse2_noavx,avx")
+   (set_attr "mmx_isa" "native,*,*")
(set_attr "type" "mmxmul,ssemul,ssemul")
(set_attr "mode" "DI,TI,TI")])
 
@@ -814,7 +818,8 @@
pmulhw\t{%2, %0|%0, %2}
pmulhw\t{%2, %0|%0, %2}
vpmulhw\t{%2, %1, %0|%0, %1, %2}"
-  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+  [(set_attr "isa" "*,sse2_noavx,avx")
+   (set_attr "mmx_isa" "native,*,*")
(set_attr "type" "mmxmul,ssemul,ssemul")
(set_attr "mode" "DI,TI,TI")])
 
@@ -849,7 +854,8 @@
pmulhuw\t{%2, %0|%0, %2}
pmulhuw\t{%2, %0|%0, %2}
vpmulhuw\t{%2, %1, %0|%0, %1, %2}"
-  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+  [(set_attr "isa" "*,sse2_noavx,avx")
+   (set_attr "mmx_isa" "native,*,*")
(set_attr "type" "mmxmul,ssemul,ssemul")
(set_attr "mode" 

[patch] Reimplement GNU threads library on native Windows

2019-06-28 Thread Eric Botcazou
Hi,

this reimplements the GNU threads library on native Windows (except for the 
Objective-C specific subset) using direct Win32 API calls, in lieu of the 
implementation based on semaphores.  This base implementations requires 
Windows XP/Server 2003, which is the default minimal setting of MinGW-W64.
This also adds the support required for the C++11 threads, using again direct 
Win32 API calls; this additional layer requires Windows Vista/Server 2008 and 
is enabled only if _GTHREADS_USE_COND is defined to 1.

This also changes libstdc++ to setting _GTHREADS_USE_COND to 1 when the switch 
--enable-libstdcxx-threads is passed, which means that C++11 threads are still 
disabled by default on native Windows and that you need to explicitly pass the 
switch to enable them.  The 30_threads chapter of the testsuite is clean.

Tested on i686-pc-mingw32 and x86_64-pc-mingw32, OK for the mainline?


2019-06-28  Eric Botcazou  

libgcc/
* config.host (i[34567]86-*-mingw*): Add thread fragment after EH one
as well as new i386/t-slibgcc-mingw fragment.
(x86_64-*-mingw*): Likewise.
* config/i386/gthr-win32.h: If _GTHREADS_USE_COND is 1, define both
__GTHREAD_HAS_COND & __GTHREADS_CXX0X to 1 and _WIN32_WINNT to 0x0600.
Error out if _GTHREAD_USE_MUTEX_TIMEDLOCK is 1.
Include stdlib.h instead of errno.h and do not include _mingw.h.
(CONST_CAST2): Add specific definition for C++.
(ATTRIBUTE_UNUSED): New macro.
(__UNUSED_PARAM): Delete.
Define WIN32_LEAN_AND_MEAN before including windows.h.
(__gthread_objc_data_tls): Use TLS_OUT_OF_INDEXES instead of (DWORD)-1.
(__gthread_objc_init_thread_system): Likewise.
(__gthread_objc_thread_get_data): Minor tweak.
(__gthread_objc_condition_allocate): Use ATTRIBUTE_UNUSED.
(__gthread_objc_condition_deallocate): Likewise.
(__gthread_objc_condition_wait): Likewise.
(__gthread_objc_condition_broadcast): Likewise.
(__gthread_objc_condition_signal): Likewise.
Include sys/time.h.
(__gthr_win32_DWORD): New typedef.
(__gthr_win32_HANDLE): Likewise.
(__gthr_win32_CRITICAL_SECTION): Likewise.
(__gthr_win32_CONDITION_VARIABLE): Likewise.
(__gthread_t): Adjust.
(__gthread_key_t): Likewise.
(__gthread_mutex_t): Likewise.
(__gthread_recursive_mutex_t): Likewise.
(__gthread_cond_t): New typedef.
(__gthread_time_t): Likewise.
(__GTHREAD_MUTEX_INIT_DEFAULT): Delete.
(__GTHREAD_RECURSIVE_MUTEX_INIT_DEFAULT): Likewise.
(__GTHREAD_COND_INIT_FUNCTION): Define.
(__GTHREAD_TIME_INIT): Likewise.
(__gthr_i486_lock_cmp_xchg): Delete.
(__gthr_win32_create): Declare.
(__gthr_win32_join): Likewise.
(__gthr_win32_self): Likewise.
(__gthr_win32_detach): Likewise.
(__gthr_win32_equal): Likewise.
(__gthr_win32_yield): Likewise.
(__gthr_win32_mutex_destroy): Likewise.
(__gthr_win32_cond_init_function): Likewise if _GTHREADS_USE_COND is 1.
(__gthr_win32_cond_broadcast): Likewise.
(__gthr_win32_cond_signal): Likewise.
(__gthr_win32_cond_wait): Likewise.
(__gthr_win32_cond_timedwait): Likewise.
(__gthr_win32_recursive_mutex_init_function): Delete.
(__gthr_win32_recursive_mutex_lock): Likewise.
(__gthr_win32_recursive_mutex_unlock): Likewise.
(__gthr_win32_recursive_mutex_destroy): Likewise.
(__gthread_create): New inline function.
(__gthread_join): Likewise.
(__gthread_self): Likewise.
(__gthread_detach): Likewise.
(__gthread_equal): Likewise.
(__gthread_yield): Likewise.
(__gthread_cond_init_function): Likewise if _GTHREADS_USE_COND is 1.
(__gthread_cond_broadcast): Likewise.
(__gthread_cond_signal): Likewise.
(__gthread_cond_wait): Likewise.
(__gthread_cond_timedwait): Likewise.
(__GTHREAD_WIN32_INLINE): New macro.
(__GTHREAD_WIN32_COND_INLINE): Likewise.
(__GTHREAD_WIN32_ACTIVE_P): Likewise.
Define WIN32_LEAN_AND_MEAN before including windows.h.
(__gthread_once): Minor tweaks.
(__gthread_key_create): Use ATTRIBUTE_UNUSED and TLS_OUT_OF_INDEXES.
(__gthread_key_delete): Minor tweak.
(__gthread_getspecific): Likewise.
(__gthread_setspecific): Likewise.
(__gthread_mutex_init_function): Reimplement.
(__gthread_mutex_destroy): Likewise.
(__gthread_mutex_lock): Likewise.
(__gthread_mutex_trylock): Likewise.
(__gthread_mutex_unlock): Likewise.
(__gthr_win32_abs_to_rel_time): Declare.
(__gthread_recursive_mutex_init_function): Reimplement.
(__gthread_recursive_mutex_destroy): Likewise.
(__gthread_recursive_mutex_lock): Likewise.
(__gthread_recursive_mutex_trylock): Likewise.

[patch, c++ openmp] Improve diagnostics for unmappable types

2019-06-28 Thread Andrew Stubbs
This patch improves the diagnostics given for unmappable C++ types in 
OpenMP programs.


Here is the output *without* the patch, for the new testcase:


unmappable-1.C: In function 'int main()':
unmappable-1.C:16:24: error: 'v' does not have a mappable type in 'map' 
clause

   16 | #pragma omp target map(v)


This is correct, but not very helpful for anything but the most trivial 
C++ types. Anything involving inheritance, templates, typedefs, etc. 
could be extremely difficult to track down.


With the patch applied we now get this (I removed the "dg-message" 
comments for clarity):



unmappable-1.C: In function 'int main()':
unmappable-1.C:16:24: error: 'v' does not have a mappable type in 'map' 
clause

   16 | #pragma omp target map(v)
  |^
cc1plus: note: incomplete types are not mappable
unmappable-1.C:4:7: note: types with virtual members are not mappable
4 | class C
  |   ^
unmappable-1.C:7:14: note: static fields are not mappable
7 |   static int static_member;
  |  ^


The compiler now reports all the problematic fields in the whole type, 
recursively. If anybody knows how to report the location of incomplete 
array declarations then that would be nice to add.


OK to commit?

Andrew
Improve OpenMP map diagnostics.

2019-06-27  Andrew Stubbs  

	gcc/cp/
	* cp-tree.h (cp_omp_emit_unmappable_type_notes): New prototype.
	* decl.c (cp_finish_decl): Call cp_omp_emit_unmappable_type_notes.
	* decl2.c (cp_omp_mappable_type): Move contents to ...
	(cp_omp_mappable_type_1):  ... here and add note output.
	(cp_omp_emit_unmappable_type_notes): New function.
	* semantics.c (finish_omp_clauses): Call
	cp_omp_emit_unmappable_type_notes in four places.

	gcc/testsuite/
	* g++.dg/gomp/unmappable-1.C: New file.

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index bf47f67721e..a7b2151e6dd 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6533,6 +6533,7 @@ extern int parm_index   (tree);
 extern tree vtv_start_verification_constructor_init_function (void);
 extern tree vtv_finish_verification_constructor_init_function (tree);
 extern bool cp_omp_mappable_type		(tree);
+extern bool cp_omp_emit_unmappable_type_notes	(tree);
 extern void cp_check_const_attributes (tree);
 
 /* in error.c */
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 5d49535b0d9..74343bc1ec4 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -7433,8 +7433,11 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p,
 			DECL_ATTRIBUTES (decl));
   complete_type (TREE_TYPE (decl));
   if (!cp_omp_mappable_type (TREE_TYPE (decl)))
-	error ("%q+D in declare target directive does not have mappable type",
-	   decl);
+	{
+	  error ("%q+D in declare target directive does not have mappable"
+		 " type", decl);
+	  cp_omp_emit_unmappable_type_notes (TREE_TYPE (decl));
+	}
   else if (!lookup_attribute ("omp declare target",
   DECL_ATTRIBUTES (decl))
 	   && !lookup_attribute ("omp declare target link",
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 206f04c6320..17deeda75f8 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -1406,32 +1406,83 @@ cp_check_const_attributes (tree attributes)
 }
 }
 
-/* Return true if TYPE is an OpenMP mappable type.  */
-bool
-cp_omp_mappable_type (tree type)
+/* Return true if TYPE is an OpenMP mappable type.
+   If NOTES is non-zero, emit a note message for each problem.  */
+static bool
+cp_omp_mappable_type_1 (tree type, bool notes)
 {
+  bool result = true;
+
   /* Mappable type has to be complete.  */
   if (type == error_mark_node || !COMPLETE_TYPE_P (type))
-return false;
+{
+  if (notes)
+	{
+	  tree decl = TYPE_MAIN_DECL (type);
+	  inform ((decl ? DECL_SOURCE_LOCATION (decl) : UNKNOWN_LOCATION),
+		  "incomplete types are not mappable");
+	  result = false;
+	}
+  else
+return false;
+}
   /* Arrays have mappable type if the elements have mappable type.  */
   while (TREE_CODE (type) == ARRAY_TYPE)
 type = TREE_TYPE (type);
   /* A mappable type cannot contain virtual members.  */
   if (CLASS_TYPE_P (type) && CLASSTYPE_VTABLES (type))
-return false;
+{
+  if (notes)
+	{
+	  inform (DECL_SOURCE_LOCATION (TYPE_MAIN_DECL (type)),
+		  "types with virtual members are not mappable");
+	  result = false;
+	}
+  else
+	return false;
+}
   /* All data members must be non-static.  */
   if (CLASS_TYPE_P (type))
 {
   tree field;
   for (field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
 	if (VAR_P (field))
-	  return false;
+	  {
+	if (notes)
+	  {
+		inform (DECL_SOURCE_LOCATION (field),
+			"static fields are not mappable");
+		result = false;
+	  }
+	else
+	  return false;
+	  }
 	/* All fields must have mappable types.  */
 	else if (TREE_CODE (field) == FIELD_DECL
-		 && !cp_omp_mappable_type (TREE_TYPE (field)))
-	  return false;
+		 && 

Re: allow EH to escape from GIMPLE_EH_ELSE ELSE block

2019-06-28 Thread Alexandre Oliva
On Jun 27, 2019, Richard Biener  wrote:

> On Thu, Jun 27, 2019 at 10:18 AM Alexandre Oliva  wrote:
>> 
>> The only preexisting use of GIMPLE_EH_ELSE, for transactional memory
>> commits, did not allow exceptions to escape from the ELSE path.  The
>> trick it uses to allow the ELSE path to see the propagating exception
>> does not work very well if the exception cleanup raises further
>> exceptions: the ELSE block is configured to handle exceptions in
>> itself.  This confuses the heck out of CFG and EH cleanups.
>> 
>> Basing the lowering context for the ELSE block on outer_state, rather
>> than this_state, gets us the expected enclosing handler.
>> 
>> Regstrapped on x86_64-linux-gnu.  Ok to install?

> Testcase?

None possible with the current codebase, I'm afraid.  Nothing creates
EH_ELSE yet, and nothing creates GIMPLE_EH_ELSE with
exception-generating cleanups.

The following patch, an extract of a larger patch that is still under
internal review (and pending approval of the EH_ELSE-related changes
;-), is minimized into pointlessness so as to just exercise the
GIMPLE_EH_ELSE lowering issue.

IIRC the problem is NOT immediately triggered in a bootstrap, it
requires more elaborate EH scenarios to trigger it: without the
GIMPLE_EH_ELSE lowering patch, a few libadalang units failed to compile
within delete_unreachable_blocks, using a compiler built with the patch
below and the patch that introduced EH_ELSE that I posted yesterday.

At first, I suspected GIMPLE_EH_ELSE had bit-rotted, as it doesn't seem
to get much use, but the problem turned out to be caused by the
nonsensical CFG resulting from the GIMPLE_EH_ELSE lowering, that breaks
EH CFG optimizations and end up corrupting the CFG.  IIRC it was
unsplit_eh that mishandled self edges that arise after a bunch of other
valid transformations.  I cannot observe the crash with trunk any more,
but the EH tree is visibly wrong in tree dumps with eh and blocks,
compiling such a simple testcase as this (with a patched compiler as
described above):

-- compile with -Ofast -g -c
with GNAT.Semaphores; use GNAT.Semaphores;
package T is
   subtype Mutual_Exclusion is Binary_Semaphore
 (Initially_Available => True,
  Ceiling => Default_Ceiling);
   Lock : aliased Mutual_Exclusion;
end T;


Self edges end up arising because of the way GIMPLE_EH_ELSE was lowered:
the exceptional cleanup was lowered as if within the TRY_FINALLY_EXPR
TRY block, whose exceptions get handled by the exceptional cleanup, but
both cleanup paths should be lowered as if after the TRY_FINALLY_EXPR,
so that an enclosing EH region is used should they throw exceptions.

The current lowering made sense for cleanups that couldn't throw: no EH
edge would come out of the lowered exceptional cleanup block.  However,
EH_ELSE-using code below breaks that assumption.  Fortunately, the
assumption was not essential for GIMPLE_EH_ELSE: the lowering code just
needed this small adjustment to make room for relaxing the requirement
that the cleanups couldn't throw and restoring CFG EH edges that match
what we get without the patch below.


--- gcc/ada/gcc-interface/trans.c
+++ gcc/ada/gcc-interface/trans.c
@@ -6249,7 +6249,7 @@ Exception_Handler_to_gnu_gcc (Node_Id gnat_node)
   if (stmt_list_cannot_alter_control_flow_p (Statements (gnat_node)))
 add_stmt_with_node (stmt, gnat_node);
   else
-add_cleanup (stmt, gnat_node);
+add_cleanup (build2 (EH_ELSE, void_type_node, stmt, stmt), gnat_node);
 
   gnat_poplevel ();
 
@@ -9066,7 +9081,29 @@ add_cleanup (tree gnu_cleanup, Node_Id g
 {
   if (Present (gnat_node))
 set_expr_location_from_node (gnu_cleanup, gnat_node, true);
-  append_to_statement_list (gnu_cleanup, _stmt_group->cleanups);
+  /* An EH_ELSE must be by itself, and that's all we need when we use
+ it.  The assert below makes sure that is so.  Should we ever need
+ more than that, we could combine EH_ELSEs, and copy non-EH_ELSE
+ stmts into both cleanup paths of an EH_ELSE, being careful to
+ make sure the exceptional path doesn't throw.  */
+  if (TREE_CODE (gnu_cleanup) == EH_ELSE)
+{
+  gcc_assert (!current_stmt_group->cleanups);
+  if (Present (gnat_node))
+   {
+ set_expr_location_from_node (TREE_OPERAND (gnu_cleanup, 0),
+  gnat_node, true);
+ set_expr_location_from_node (TREE_OPERAND (gnu_cleanup, 1),
+  gnat_node, true);
+   }
+  current_stmt_group->cleanups = gnu_cleanup;
+}
+  else
+{
+  gcc_assert (!current_stmt_group->cleanups
+ || TREE_CODE (current_stmt_group->cleanups) != EH_ELSE);
+  append_to_statement_list (gnu_cleanup, _stmt_group->cleanups);
+}
 }
 
 /* Set the BLOCK node corresponding to the current code group to GNU_BLOCK.  */


-- 
Alexandre Oliva, freedom fighter  he/him   https://FSFLA.org/blogs/lxo
Be the change, be Free! FSF Latin America board member
GNU 

Re: [PATCH v2 2/2] x86: improve GFNI insns

2019-06-28 Thread Uros Bizjak
On Fri, Jun 28, 2019 at 10:56 AM Jan Beulich  wrote:
>
> There's no need for three alternatives: "v" without TARGET_AVX512F is
> the same as "x".
>
> gcc/
> 2019-06-28  Jan Beulich  
>
> * config/i386/sse.md (vgf2p8affineinvqb_,
> vgf2p8affineqb_): Eliminate redundant
> alternative.

Please also mention vgf2p8mulb_ in the ChangeLog.

OK with the above change.

> ---
> v2: New, split off from previous bigger patch.
> ---
> In fact I doubt two alternatives are necessary, seeing how other insns
> get away with just one. But I'm not bothered enough to try to actually
> get this right.

You need two alternatives, non-avx one has matched operand.

Uros.
>
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -22072,56 +22072,53 @@
> "vpopcnt\t{%1, %0|%0, %1}")
>
>   (define_insn "vgf2p8affineinvqb_"
> -  [(set (match_operand:VI1_AVX512F 0 "register_operand" "=x,x,v")
> +  [(set (match_operand:VI1_AVX512F 0 "register_operand" "=x,v")
> (unspec:VI1_AVX512F
> - [(match_operand:VI1_AVX512F 1 "register_operand" "0,x,v")
> -  (match_operand:VI1_AVX512F 2 "vector_operand" "xBm,xm,vm")
> -  (match_operand:QI 3 "const_0_to_255_operand" "n,n,n")]
> + [(match_operand:VI1_AVX512F 1 "register_operand" "0,v")
> +  (match_operand:VI1_AVX512F 2 "vector_operand" "xBm,vm")
> +  (match_operand:QI 3 "const_0_to_255_operand" "n,n")]
>   UNSPEC_GF2P8AFFINEINV))]
> "TARGET_GFNI"
> "@
>  gf2p8affineinvqb\t{%3, %2, %0| %0, %2, %3}
> -   vgf2p8affineinvqb\t{%3, %2, %1, %0|
> %0, %1, %2, %3}
>  vgf2p8affineinvqb\t{%3, %2, %1, %0|
> %0, %1, %2, %3}"
> -  [(set_attr "isa" "noavx,avx,avx512f")
> -   (set_attr "prefix_data16" "1,*,*")
> +  [(set_attr "isa" "noavx,avx")
> +   (set_attr "prefix_data16" "1,*")
>  (set_attr "prefix_extra" "1")
> -   (set_attr "prefix" "orig,maybe_evex,evex")
> +   (set_attr "prefix" "orig,maybe_evex")
>  (set_attr "mode" "")])
>
>   (define_insn "vgf2p8affineqb_"
> -  [(set (match_operand:VI1_AVX512F 0 "register_operand" "=x,x,v")
> +  [(set (match_operand:VI1_AVX512F 0 "register_operand" "=x,v")
> (unspec:VI1_AVX512F
> - [(match_operand:VI1_AVX512F 1 "register_operand" "0,x,v")
> -  (match_operand:VI1_AVX512F 2 "vector_operand" "xBm,xm,vm")
> -  (match_operand:QI 3 "const_0_to_255_operand" "n,n,n")]
> + [(match_operand:VI1_AVX512F 1 "register_operand" "0,v")
> +  (match_operand:VI1_AVX512F 2 "vector_operand" "xBm,vm")
> +  (match_operand:QI 3 "const_0_to_255_operand" "n,n")]
>   UNSPEC_GF2P8AFFINE))]
> "TARGET_GFNI"
> "@
>  gf2p8affineqb\t{%3, %2, %0| %0, %2, %3}
> -   vgf2p8affineqb\t{%3, %2, %1, %0| %0,
> %1, %2, %3}
>  vgf2p8affineqb\t{%3, %2, %1, %0| %0,
> %1, %2, %3}"
> -  [(set_attr "isa" "noavx,avx,avx512f")
> -   (set_attr "prefix_data16" "1,*,*")
> +  [(set_attr "isa" "noavx,avx")
> +   (set_attr "prefix_data16" "1,*")
>  (set_attr "prefix_extra" "1")
> -   (set_attr "prefix" "orig,maybe_evex,evex")
> +   (set_attr "prefix" "orig,maybe_evex")
>  (set_attr "mode" "")])
>
>   (define_insn "vgf2p8mulb_"
> -  [(set (match_operand:VI1_AVX512F 0 "register_operand" "=x,x,v")
> +  [(set (match_operand:VI1_AVX512F 0 "register_operand" "=x,v")
> (unspec:VI1_AVX512F
> - [(match_operand:VI1_AVX512F 1 "register_operand" "%0,x,v")
> -  (match_operand:VI1_AVX512F 2 "nonimmediate_operand" "xBm,xm,vm")]
> + [(match_operand:VI1_AVX512F 1 "register_operand" "%0,v")
> +  (match_operand:VI1_AVX512F 2 "vector_operand" "xBm,vm")]
>   UNSPEC_GF2P8MUL))]
> "TARGET_GFNI"
> "@
>  gf2p8mulb\t{%2, %0| %0, %2}
> -   vgf2p8mulb\t{%2, %1, %0| %0, %1, %2}
>  vgf2p8mulb\t{%2, %1, %0| %0, %1, %2}"
> -  [(set_attr "isa" "noavx,avx,avx512f")
> -   (set_attr "prefix_data16" "1,*,*")
> +  [(set_attr "isa" "noavx,avx")
> +   (set_attr "prefix_data16" "1,*")
>  (set_attr "prefix_extra" "1")
> -   (set_attr "prefix" "orig,maybe_evex,evex")
> +   (set_attr "prefix" "orig,maybe_evex")
>  (set_attr "mode" "")])
>
>   (define_insn "vpshrd_"


Re: RFC: GCC Aarch64 SIMD vectorization question involving libmvec

2019-06-28 Thread Szabolcs Nagy
On 27/06/2019 20:54, Steve Ellcey wrote:
> I am testing the latest GCC with not-yet-submitted GLIBC changes that
> implement libmvec on Aarch64.
> 
> While trying to run SPEC 2017 (specifically 521.wrf_r) I ran into a
> case where GCC was generating a call to _ZGVnN2vv_powf, that is a
> vectorized powf call for 2 (not 4) elements.  This was a problem
> because I only implemented a 4 element 32 bit vectorized powf function
> for libmvec and not a 2 element version.
> 
> I think this is due to aarch64_simd_clone_compute_vecsize_and_simdlen
> which allows for (element count * element size) to be either 64
> or 128.
> 
> I would like some thoughts on what we should do about this, should
> we require glibc/libmvec to provide 2 element 32 bit floating point
> vector functions (as well as the 4 element ones) or should we change
> aarch64_simd_clone_compute_vecsize_and_simdlen to only allow 4
> element (128 total bit size) vectors and not 2 element (64 total bit
> size) ones?

in the "declare simd" syntax "simdlen" can specify
a vector length and then only that will be used.

so either you provide both 4 and 2 length powf or
it has to be declared as

#pragma omp declare simd simdlen(4) notinbranch
float powf(float, float);

now, we have the issue that this still allows
vectorizing powf as an sve call in future compilers,
to deal with that omp 5.0 has "declare variant" which
can specify exactly one simd variant and its name:

#pragma omp declare variant(_ZGVnN4vv_powf) \
 match(construct={simd(simdlen(4), notinbranch)}, device={isa("simd")})
float powf(float, float);

gcc does not support this currently and it will
require a new attribute syntax too, the next vector
abi update will specify this omp pragma and i will
write something about how the attribute should work,
unfortunately i don't see an easy way out of this.

if there is a way to detect that the compiler only
supports advsimd vectror calls and not sve calls
then we could do

#if supported_omp_version >= 5
#pragma omp declare variant(_ZGVnN4vv_powf) \
 match(construct={simd(simdlen(4), notinbranch)}, device={isa("simd")})
#elif only_advsimd_vector_call_support
#pragma omp declare simd simdlen(4) notinbranch
#endif
float powf(float, float);

i don't know yet if we can do this reliably in
glibc math.h (and we have a fortran declaration
problem too).

> 
> This is obviously a question for the pre-SVE vector instructions,
> I am not sure how this would be handled in SVE.
> 
> Steve Ellcey
> sell...@marvell.com
> 
> P.S.  Here a test case in Fortran that generated the 2 element
>   vector call.  It unrolled the loop into one vector call
>   of 2 elements and one scalar call.
> 
>   SUBROUTINE FOO(B,W,P)
>   REAL, DIMENSION (3) :: W, P
>   DO 10 I = 1, 3
>   P(I) = W(I) ** B
> 10CONTINUE
>   END SUBROUTINE FOO
> 



[PATCH v2 2/2] x86: improve GFNI insns

2019-06-28 Thread Jan Beulich
There's no need for three alternatives: "v" without TARGET_AVX512F is
the same as "x".

gcc/
2019-06-28  Jan Beulich  

* config/i386/sse.md (vgf2p8affineinvqb_,
vgf2p8affineqb_): Eliminate redundant
alternative.
---
v2: New, split off from previous bigger patch.
---
In fact I doubt two alternatives are necessary, seeing how other insns
get away with just one. But I'm not bothered enough to try to actually
get this right.

--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -22072,56 +22072,53 @@
"vpopcnt\t{%1, %0|%0, %1}")

  (define_insn "vgf2p8affineinvqb_"
-  [(set (match_operand:VI1_AVX512F 0 "register_operand" "=x,x,v")
+  [(set (match_operand:VI1_AVX512F 0 "register_operand" "=x,v")
(unspec:VI1_AVX512F
- [(match_operand:VI1_AVX512F 1 "register_operand" "0,x,v")
-  (match_operand:VI1_AVX512F 2 "vector_operand" "xBm,xm,vm")
-  (match_operand:QI 3 "const_0_to_255_operand" "n,n,n")]
+ [(match_operand:VI1_AVX512F 1 "register_operand" "0,v")
+  (match_operand:VI1_AVX512F 2 "vector_operand" "xBm,vm")
+  (match_operand:QI 3 "const_0_to_255_operand" "n,n")]
  UNSPEC_GF2P8AFFINEINV))]
"TARGET_GFNI"
"@
 gf2p8affineinvqb\t{%3, %2, %0| %0, %2, %3}
-   vgf2p8affineinvqb\t{%3, %2, %1, %0| 
%0, %1, %2, %3}
 vgf2p8affineinvqb\t{%3, %2, %1, %0| 
%0, %1, %2, %3}"
-  [(set_attr "isa" "noavx,avx,avx512f")
-   (set_attr "prefix_data16" "1,*,*")
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "prefix_data16" "1,*")
 (set_attr "prefix_extra" "1")
-   (set_attr "prefix" "orig,maybe_evex,evex")
+   (set_attr "prefix" "orig,maybe_evex")
 (set_attr "mode" "")])

  (define_insn "vgf2p8affineqb_"
-  [(set (match_operand:VI1_AVX512F 0 "register_operand" "=x,x,v")
+  [(set (match_operand:VI1_AVX512F 0 "register_operand" "=x,v")
(unspec:VI1_AVX512F
- [(match_operand:VI1_AVX512F 1 "register_operand" "0,x,v")
-  (match_operand:VI1_AVX512F 2 "vector_operand" "xBm,xm,vm")
-  (match_operand:QI 3 "const_0_to_255_operand" "n,n,n")]
+ [(match_operand:VI1_AVX512F 1 "register_operand" "0,v")
+  (match_operand:VI1_AVX512F 2 "vector_operand" "xBm,vm")
+  (match_operand:QI 3 "const_0_to_255_operand" "n,n")]
  UNSPEC_GF2P8AFFINE))]
"TARGET_GFNI"
"@
 gf2p8affineqb\t{%3, %2, %0| %0, %2, %3}
-   vgf2p8affineqb\t{%3, %2, %1, %0| %0, 
%1, %2, %3}
 vgf2p8affineqb\t{%3, %2, %1, %0| %0, 
%1, %2, %3}"
-  [(set_attr "isa" "noavx,avx,avx512f")
-   (set_attr "prefix_data16" "1,*,*")
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "prefix_data16" "1,*")
 (set_attr "prefix_extra" "1")
-   (set_attr "prefix" "orig,maybe_evex,evex")
+   (set_attr "prefix" "orig,maybe_evex")
 (set_attr "mode" "")])

  (define_insn "vgf2p8mulb_"
-  [(set (match_operand:VI1_AVX512F 0 "register_operand" "=x,x,v")
+  [(set (match_operand:VI1_AVX512F 0 "register_operand" "=x,v")
(unspec:VI1_AVX512F
- [(match_operand:VI1_AVX512F 1 "register_operand" "%0,x,v")
-  (match_operand:VI1_AVX512F 2 "nonimmediate_operand" "xBm,xm,vm")]
+ [(match_operand:VI1_AVX512F 1 "register_operand" "%0,v")
+  (match_operand:VI1_AVX512F 2 "vector_operand" "xBm,vm")]
  UNSPEC_GF2P8MUL))]
"TARGET_GFNI"
"@
 gf2p8mulb\t{%2, %0| %0, %2}
-   vgf2p8mulb\t{%2, %1, %0| %0, %1, %2}
 vgf2p8mulb\t{%2, %1, %0| %0, %1, %2}"
-  [(set_attr "isa" "noavx,avx,avx512f")
-   (set_attr "prefix_data16" "1,*,*")
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "prefix_data16" "1,*")
 (set_attr "prefix_extra" "1")
-   (set_attr "prefix" "orig,maybe_evex,evex")
+   (set_attr "prefix" "orig,maybe_evex")
 (set_attr "mode" "")])

  (define_insn "vpshrd_"


Re: RFC: GCC Aarch64 SIMD vectorization question involving libmvec

2019-06-28 Thread Richard Sandiford
Steve Ellcey  writes:
> I am testing the latest GCC with not-yet-submitted GLIBC changes that
> implement libmvec on Aarch64.
>
> While trying to run SPEC 2017 (specifically 521.wrf_r) I ran into a
> case where GCC was generating a call to _ZGVnN2vv_powf, that is a
> vectorized powf call for 2 (not 4) elements.  This was a problem
> because I only implemented a 4 element 32 bit vectorized powf function
> for libmvec and not a 2 element version.
>
> I think this is due to aarch64_simd_clone_compute_vecsize_and_simdlen
> which allows for (element count * element size) to be either 64
> or 128.
>
> I would like some thoughts on what we should do about this, should
> we require glibc/libmvec to provide 2 element 32 bit floating point
> vector functions (as well as the 4 element ones) or should we change
> aarch64_simd_clone_compute_vecsize_and_simdlen to only allow 4
> element (128 total bit size) vectors and not 2 element (64 total bit
> size) ones?
>
> This is obviously a question for the pre-SVE vector instructions,
> I am not sure how this would be handled in SVE.

The vector ABI says that "#pragma omp declare simd" without a simdlen
declares both 64-bit and 128-bit functions, so I think the GCC code is
doing the right thing.  If glibc only implements 128-bit functions
for powf then it should use simdlen(4).

It would be nice to support simdlen(2) as well though.  Low-trip-count
loops like the one below would be one use case.  Another would be SLP.
And hopefully at some point in the future we'll be able to turn
vect-epilogues-nomask on by default, in which case we would also have
64-bit vectorisation in the tail of a loop vectorised at 128 bits.

Thanks,
Richard

>
> Steve Ellcey
> sell...@marvell.com
>
> P.S.  Here a test case in Fortran that generated the 2 element
>   vector call.  It unrolled the loop into one vector call
>   of 2 elements and one scalar call.
>
>   SUBROUTINE FOO(B,W,P)
>   REAL, DIMENSION (3) :: W, P
>   DO 10 I = 1, 3
>   P(I) = W(I) ** B
> 10CONTINUE
>   END SUBROUTINE FOO


[PATCH v2 1/2] x86: fix vgf2p8affine*qb insns

2019-06-28 Thread Jan Beulich
The affine transformations are not commutative (the two source operands
have entirely different meaning).

Also the nonimmediate_operand predicate can better be vector_operand.

gcc/
2019-06-28  Jan Beulich  

* config/i386/sse.md (vgf2p8affineinvqb_,
vgf2p8affineqb_): Drop % constraint modifier.
Use vector_operand.

gcc/testsuite/
2019-06-28  Jan Beulich  

* gcc.target/i386/gfni-5.c: New.

---
v2: Retain Bm. Split off removal of the one alternative.

--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -22074,8 +22074,8 @@
  (define_insn "vgf2p8affineinvqb_"
[(set (match_operand:VI1_AVX512F 0 "register_operand" "=x,x,v")
(unspec:VI1_AVX512F
- [(match_operand:VI1_AVX512F 1 "register_operand" "%0,x,v")
-  (match_operand:VI1_AVX512F 2 "nonimmediate_operand" "xBm,xm,vm")
+ [(match_operand:VI1_AVX512F 1 "register_operand" "0,x,v")
+  (match_operand:VI1_AVX512F 2 "vector_operand" "xBm,xm,vm")
   (match_operand:QI 3 "const_0_to_255_operand" "n,n,n")]
  UNSPEC_GF2P8AFFINEINV))]
"TARGET_GFNI"
@@ -22092,8 +22092,8 @@
  (define_insn "vgf2p8affineqb_"
[(set (match_operand:VI1_AVX512F 0 "register_operand" "=x,x,v")
(unspec:VI1_AVX512F
- [(match_operand:VI1_AVX512F 1 "register_operand" "%0,x,v")
-  (match_operand:VI1_AVX512F 2 "nonimmediate_operand" "xBm,xm,vm")
+ [(match_operand:VI1_AVX512F 1 "register_operand" "0,x,v")
+  (match_operand:VI1_AVX512F 2 "vector_operand" "xBm,xm,vm")
   (match_operand:QI 3 "const_0_to_255_operand" "n,n,n")]
  UNSPEC_GF2P8AFFINE))]
"TARGET_GFNI"
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/gfni-5.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse2 -mgfni" } */
+
+typedef char __attribute__((vector_size(16))) v16qi_t;
+
+v16qi_t test16a (v16qi_t x, v16qi_t a)
+{
+  asm volatile ("" : "+m" (a));
+  return __builtin_ia32_vgf2p8affineqb_v16qi (x, a, 0);
+}
+
+v16qi_t test16b (v16qi_t x, v16qi_t a)
+{
+  asm volatile ("" : "+m" (x));
+  return __builtin_ia32_vgf2p8affineqb_v16qi (x, a, 0);
+}
+
+/* { dg-final { scan-assembler-times "gf2p8affineqb\[ \t].*\\(" 1 } } */
+/* { dg-final { scan-assembler-times "gf2p8affineqb\[ \t].*%xmm.*%xmm" 
1 } } */


Re: [PATCH][Arm] Remove constraint strings from define_expand constructs in the back end

2019-06-28 Thread Kyrill Tkachov

Hi Dennis,

On 6/27/19 4:58 PM, Dennis Zhang wrote:

Hi Kyrill,

Thanks for the review!

On 6/24/19 5:27 PM, Kyrill Tkachov wrote:

Hi Dennis,

On 6/24/19 4:13 PM, Dennis Zhang wrote:

Hi,

A number of Arm define_expand patterns have specified constraints for
their operands. But the constraint strings are ignored at expand time
and are therefore redundant/useless. We now avoid specifying constraints
in new define_expands, but we should clean up the existing define_expand
definitions.

For example, the constraint "=r" is removed in the following case:
(define_expand "reload_inhi"
      [(parallel [(match_operand:HI 0 "s_register_operand" "=r")
The "" marks with an empty constraint in define_expand are removed as
well.

The patch is tested with the build configuration of
--target=arm-linux-gnueabi and it passes gcc/testsuite.


Thank you for the patch.

Unfortunately I've hit an ICE building an arm-none-eabi target with your
patch.

This appears to be due to:

@@ -6767,9 +6767,9 @@
   ;; temporary if the address isn't offsettable -- push_reload doesn't seem
   ;; to take any notice of the "o" constraints on reload_memory_operand
operand.
   (define_expand "reload_outhi"
-  [(parallel [(match_operand:HI 0 "arm_reload_memory_operand" "=o")
-      (match_operand:HI 1 "s_register_operand"    "r")
-      (match_operand:DI 2 "s_register_operand" "=")])]
+  [(parallel [(match_operand:HI 0 "arm_reload_memory_operand")
+      (match_operand:HI 1 "s_register_operand")
+      (match_operand:DI 2 "s_register_operand")])]
     "TARGET_EITHER"
     "if (TARGET_ARM)
    arm_reload_out_hi (operands);
@@ -6780,9 +6780,9 @@
   )

   (define_expand "reload_inhi"
-  [(parallel [(match_operand:HI 0 "s_register_operand" "=r")
-      (match_operand:HI 1 "arm_reload_memory_operand" "o")
-      (match_operand:DI 2 "s_register_operand" "=")])]
+  [(parallel [(match_operand:HI 0 "s_register_operand")
+      (match_operand:HI 1 "arm_reload_memory_operand")
+      (match_operand:DI 2 "s_register_operand")])]
     "TARGET_EITHER"
     "
     if (TARGET_ARM)


the reload_in and reload_out patterns are somewhat special:

https://gcc.gnu.org/onlinedocs/gccint/Standard-Names.html#Standard-Names

where the constraints seems to matter.

We should migrate these patterns to the recommended secondary_reload
hook, but that would be a separate patches.

For now, please try removing changes to these patterns and making sure
that the build succeeds.

Thanks,

Kyrill



The special exception, reload_inm and reload_outm patterns are fixed.
Their constraints are left untouched as original in order to work
correctly in default_secondary_reload, gcc/targhook.c.

The updated patch is tested for targets: arm-none-linux-gnueabi,
arm-none-linux-gnueabihf, arm-linux-gnueabi, and arm-linux-gnueabihf.
And it survives in testsuite regression.


Thanks, I've committed this on your behalf with r272779.

Kyrill



gcc/ChangeLog:

2019-06-21  Dennis Zhang  

* config/arm/arm.md: Remove redundant constraints from
define_expand but leave reload_inm and reload_outm patterns
untouched since they need special constraints to work.
* config/arm/arm-fixed.md: Remove redundant constraints from
define_expand.
* config/arm/iwmmxt.md: Likewise.
* config/arm/neon.md: Likewise.
* config/arm/sync.md: Likewise.
* config/arm/thumb1.md: Likewise.
* config/arm/vec-common.md: Likewise.


Re: [RFA][tree-optimization/90883] Improve DSE to handle redundant calls

2019-06-28 Thread Rainer Orth
Hi Christophe,

> I've also noticed that
> FAIL: g++.dg/tree-ssa/pr90883.C   scan-tree-dump dse1 "Deleted
> redundant store: .*.a = {}"
> on aarch64

it also FAILs on sparc, ia64, m68k, and mips64el.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] Enable GCC support for AVX512_VP2INTERSECT.

2019-06-28 Thread Rainer Orth
Hi Hongtao,

>> > Ok for trunk?
>>
>> ENOPATCH
> Sorry, Here is the patch.

> Changelog
>
> gcc/
> +2019-06-27  Hongtao Liu  
> +
> + * doc/sourcebuild.texi (Effective-Target Keywords, Other
> + hardware attributes): Document avx512vp2intersect.
> +
>
> gcc/testsuite/
> +2019-06-27  Hongtao Liu  
> +
> + * lib/target-supports.exp
> + (check_effective_target_avx512vp2intersect): New proc.
> + * gcc.target/i386/avx512vp2intersect-2intersect-1b.c: Add
> + dg-require-effective-target avx512vp2intersect.
> + * gcc.target/i386/avx512vp2intersect-2intersectvl-1b.c: Ditto.
> +

Ok.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University