date:20170928

Re: 0005-Part-5.-Add-x86-CET-documentation

2017-09-28 Thread Sandra Loosemore


On 09/27/2017 09:17 AM, Tsimbalist, Igor V wrote:

Updated version #3.

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index e52a1ea..accba40 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -5655,6 +5655,14 @@ compiled with the @option{-fcf-protection=branch} 
option.  The
 compiler assumes that the function's address is a valid target for a
 control-flow transfer.

+@emph{x86 implementation:} when @option{-fcf-protection} option is
+specified the compiler inserts an @code{endbr} instruction at function's
+prologue if the function's type does not have the @code{nocf_check}
+attribute and addresses to which indirect control-flow transfer can
+happen.  The instruction triggers the HW check if a control-flow
+transfer to the address where @code{endbr} instruction was inserted
+is valid.
+


I think the consensus among Joseph, Jeff, and me is that this doesn't 
belong in the GCC manual at all, but in the ABI documentation.  So 
please delete the implementation note.



@@ -5662,7 +5670,9 @@ not be instrumented when compiled with the
 that the function's address from the pointer is a valid target for
 a control-flow transfer.  A direct function call through a function
 name is assumed to be a safe call thus direct calls are not
-instrumented by the compiler.
+instrumented by the compiler.  For @emph{x86 implementation} the
+compiler inserts a @code{notrack} prefix before an indirect call
+instruction.


Ditto with this implementation note.
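
For readers who have not used the attribute, here is a minimal sketch of how
nocf_check and direct versus indirect calls look from user code.  The names
unchecked_fn, add_one, call_direct and call_indirect are made up for
illustration, and the NOCF_CHECK wrapper macro is an assumption so that the
file still builds where the attribute is unknown.  Without -fcf-protection
the compiler may warn that the attribute is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Use the attribute only where the compiler knows it; elsewhere the
   macro expands to nothing and the code stays portable.  */
#if defined(__has_attribute)
# if __has_attribute(nocf_check)
#  define NOCF_CHECK __attribute__((nocf_check))
# endif
#endif
#ifndef NOCF_CHECK
# define NOCF_CHECK
#endif

/* Hypothetical callback type: indirect calls made through it are not
   instrumented when -fcf-protection=branch is in effect.  */
typedef int (*unchecked_fn) (int) NOCF_CHECK;

/* The function's type carries the same attribute, so taking its
   address as an unchecked_fn does not mix checked and unchecked types.  */
NOCF_CHECK static int
add_one (int x)
{
  return x + 1;
}

/* A direct call through the function name: assumed safe, so it is
   never instrumented.  */
int
call_direct (int x)
{
  return add_one (x);
}

/* An indirect call through a nocf_check pointer type.  */
int
call_indirect (unchecked_fn fn, int x)
{
  assert (fn != NULL);
  return fn (x);
}
```

Both entry points return the same value; only the generated call sequence
differs, and that difference is the part being moved to the ABI documents.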


diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c4faa23..189130b 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1203,6 +1203,7 @@ See RS/6000 and PowerPC Options.
 -msse4a  -m3dnow  -m3dnowa  -mpopcnt  -mabm  -mbmi  -mtbm  -mfma4  -mxop @gol
 -mlzcnt  -mbmi2  -mfxsr  -mxsave  -mxsaveopt  -mrtm  -mlwp  -mmpx  @gol
 -mmwaitx  -mclzero  -mpku  -mthreads @gol
+-mcet -mibt -mshstk @gol
 -mms-bitfields  -mno-align-stringops  -minline-all-stringops @gol
 -minline-stringops-dynamically  -mstringop-strategy=@var{alg} @gol
 -mmemcpy-strategy=@var{strategy}  -mmemset-strategy=@var{strategy} @gol
@@ -11374,6 +11375,14 @@ You can also use the @code{nocf_check} attribute to 
identify
 which functions and calls should be skipped from instrumentation
 (@pxref{Function Attributes}).

+Currently the x86 GNU/Linux target provides an implementation based
+on Intel Control-flow Enforcement Technology (CET).  Instrumentation
+for x86 is controlled by target-specific options @option{-mcet},
+@option{-mibt} and @option{-mshstk} (@pxref{x86 Options}).


This part is OK.


+The compiler also provides a number of built-in functions for
+fine-grained control in a CET-based application.
+See @xref{x86 Built-in Functions}, for more information.


I think these builtins emit instructions in the CET extension explicitly 
and don't affect GCC's code generation for the -fcf-protection 
option.  So please move this to the discussion of -mcet in the x86 
options section instead



@@ -25779,6 +25792,11 @@ supported architecture, using the appropriate flags.  
In particular,
 the file containing the CPU detection code should be compiled without
 these options.

+The @option{-mcet} option turns on the @option{-mibt} and @option{-mshstk}
+options.  The @option{-mibt} option enables indirect branch tracking support
+and the @option{-mshstk} option enables shadow stack support from
+Intel Control-flow Enforcement Technology (CET).
+


...here.

The patch is OK with those changes.

-Sandra

Re: 0002-Part-2.-Document-finstrument-control-flow-and-notrack attribute

2017-09-28 Thread Sandra Loosemore


On 09/27/2017 06:27 AM, Tsimbalist, Igor V wrote:

Updated version #4.

[snip]
@@ -11348,6 +11349,31 @@ is used to link a program, the GCC driver 
automatically links
 against @file{libmpxwrappers}.  See also @option{-static-libmpxwrappers}.
 Enabled by default.

+@item -fcf-protection==@r{[}full@r{|}branch@r{|}return@r{|}none@r{]}
+@opindex fcf-protection
+Enable code instrumentation of control-flow transfers to increase
+program security by checking that target addresses of control-flow
+transfer instructions (such as indirect function call, function return,
+indirect jump) are valid.  This prevents diverting the flow of control
+to an unexpected target.  This is intended to protect against such
+threats as Return-oriented Programming (ROP), and similarly
+call/jmp-oriented programming (COP/JOP).
+
+For all targets, which do not support the @option{-fcf-protection}
+option, the option usage results in an error message.


Please take this sentence out.  It's ungrammatical and verbose and 
unnecessary.


Note that several of the other options described in this section are not 
enabled on all targets either.  E.g., I've just been looking at fixing 
the nios2 backend to make -fstack-protector work, and there is nothing 
in the manual to say that GCC issues an error if there's no target 
support, even though that's what it does.


The patch is OK to commit with that change.

-Sandra

[PATCH, rs6000] Follow-on fix for PR target/80210: ICE in extract_insn

2017-09-28 Thread Peter Bergner

This patch fixes two new issues exposed (but not caused) by the original 
test case added for PR80210 as well as a modified version of that test case.

The first problem is that the test case pr80210.c ICEs on 32-bit compiles
that do not pass either an explicit or implicit -mcpu=...  option.
I did not see this during my testing, because my powerpc-linux builds were
done on a 64-bit system and I built my compiler using the --with-cpu=default32
configure option which hid the ICE.

This problem is due to a mismatch between TARGET_DEFAULT, which contains
MASK_PPC_GPOPT, and the ISA flags for the default "powerpc64"/"powerpc" cpu,
which do not contain MASK_PPC_GPOPT, combined with how
rs6000_option_override_internal() decides which one to use.  The failure
scenario is that, early on, we call init_all_optabs(), which sets up a table
describing which patterns that generate some HW insns are "valid".  Before
we call init_all_optabs(), rs6000_option_override_internal() gets called
with the global_init_p arg set to "true" and we basically set
rs6000_isa_flags to TARGET_DEFAULT.  We also execute the following code
because we didn't pass in a -mcpu= option, so rs6000_cpu_index gets set to
"powerpc64"/"powerpc"'s index into the cpu table.

  if (!have_cpu)
    {
      /* PowerPC 64-bit LE requires at least ISA 2.07.  */
      const char *default_cpu = (!TARGET_POWERPC64
                                 ? "powerpc"
                                 : (BYTES_BIG_ENDIAN
                                    ? "powerpc64"
                                    : "powerpc64le"));

      rs6000_cpu_index = cpu_index = rs6000_cpu_name_lookup (default_cpu);
    }

With this, init_all_optabs() thinks we can generate (as it should) a HW sqrt,
so it enables generating its pattern.

Later, after we've scanned the entire file, we go to expand our function
into RTL and we reset our compiler options and we end up calling
rs6000_option_override_internal() again, but this time with the global_init_p
arg now set to false and we encounter this code:

  struct cl_target_option *main_target_opt
    = ((global_init_p || target_option_default_node == NULL)
       ? NULL : TREE_TARGET_OPTION (target_option_default_node));

This ends up setting main_target_opt to a non-NULL value, then:

  ...
  else if (main_target_opt != NULL && main_target_opt->x_rs6000_cpu_index >= 0)
    {
      rs6000_cpu_index = cpu_index = main_target_opt->x_rs6000_cpu_index;
      have_cpu = true;
    }

This causes us to use the saved rs6000_cpu_index value and act as if the
user passed it in, so we restore the rs6000_isa_flags from the saved
default cpu rather than the TARGET_DEFAULT flags.  Since the default
cpus don't include the MASK_PPC_GPOPT flag, we eventually ICE.

This patch fixes the pr80210.c ICE by correctly setting the rs6000_cpu_index
value which in turn makes us use the correct rs6000_isa_flags value.
I also fixed the setting of the rs6000_tune_index value that was also
being set incorrectly sometimes, but of course, it doesn't lead to an
ICE, just wrong scheduling.
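
The sequence described above can be condensed into a toy model.  This is only
a sketch of the reasoning, not code from rs6000.c: every name starting with
model_ or MODEL_ is invented here, the cpu table has a single entry, and the
record_default_cpu flag simply switches between the old behaviour (remember
the default cpu's index) and the patched one (leave the index unset when no
-mcpu= was given).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; none of these are the real rs6000 definitions.  */
#define MODEL_MASK_GPOPT 0x1u   /* plays the role of MASK_PPC_GPOPT */
#define MODEL_TARGET_DEFAULT MODEL_MASK_GPOPT
#define MODEL_NO_CPU (-1)

/* ISA flags of the default cpu table entry, which lacks GPOPT.  */
static const unsigned model_cpu_flags[] = { 0x0u };

struct model_opts
{
  int cpu_index;       /* saved across calls, like rs6000_cpu_index */
  unsigned isa_flags;  /* like rs6000_isa_flags */
};

/* One call of the option override.  */
static void
model_override (struct model_opts *o, bool global_init_p,
                bool record_default_cpu)
{
  if (!global_init_p && o->cpu_index != MODEL_NO_CPU)
    {
      /* Treated as if the user had passed -mcpu=: take the cpu's flags.  */
      assert (o->cpu_index == 0);
      o->isa_flags = model_cpu_flags[o->cpu_index];
      return;
    }

  o->isa_flags = MODEL_TARGET_DEFAULT;
  if (record_default_cpu)
    o->cpu_index = 0;
}

/* Return true if the flags seen when the optabs were set up still hold
   when the function is later expanded to RTL.  */
bool
model_flags_stable (bool record_default_cpu)
{
  struct model_opts o = { MODEL_NO_CPU, 0 };

  model_override (&o, true, record_default_cpu);   /* global init */
  unsigned at_optab_init = o.isa_flags;

  model_override (&o, false, record_default_cpu);  /* before expansion */
  return o.isa_flags == at_optab_init;
}
```

In this model the old behaviour loses the GPOPT bit between the two calls,
which corresponds to the optabs promising a sqrt pattern that is no longer
available at expansion time.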

The second problem was exposed by compiling the pr80210.c test case, but
with the #pragma moved to the beginning of the file.  In this case, we
should disable generation of the HW sqrt pattern in the optabs.
The ICE showed that we were still generating the HW sqrt pattern when
we shouldn't have.  This problem is basically the dual of the other
problem, in that we are not correctly saving and restoring the optab
values.  The problem here is that rs6000_pragma_target_parse () did not
call rs6000_activate_target_options (), which ends up resetting the
optabs values associated with the rs6000_isa_flags value.

This passed bootstrap and regtesting on powerpc64le-linux as well as
on powerpc64-linux and running the test suite in both 32-bit and 64-bit
modes.  Ok for trunk?

Peter


gcc/
PR target/80210
* config/rs6000/rs6000.c (rs6000_option_override_internal): Rewrite
function to not use the have_cpu variable.  Do not set cpu_index,
rs6000_cpu_index or rs6000_tune_index if we end up using TARGET_DEFAULT
or the default cpu.
(rs6000_valid_attribute_p): Remove duplicate initializations of
old_optimize and func_optimize.
(rs6000_pragma_target_parse): Call rs6000_activate_target_options ().
(rs6000_activate_target_options): Make global.
* config/rs6000/rs6000-protos.h (rs6000_activate_target_options): Add
prototype.

gcc/testsuite/
PR target/80210
* gcc.target/powerpc/pr80210-2.c: New test.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 253232)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -3992,14 +3992,10 @@ static bool
 rs6000_option_override_internal (bool global_init_p)
 {
   bool ret = true;
-  bool have_cpu = false;
-
-  /* The default cpu requested at configure

Re: [PATCH] Asm memory constraints

2017-09-28 Thread Alan Modra

On Mon, Aug 21, 2017 at 10:29:30AM +0930, Alan Modra wrote:
> Fixed in this revised patch.  The only controversial aspect now should
> be whether those array casts ought to be officially blessed.  I've
> checked that "=m" (*(T (*)[]) ptr), "=m" (*(T (*)[n]) ptr), and
> "=m" (*(T (*)[10]) ptr), all generate reasonable MEM_ATTRS handled
> apparently properly by alias.c and other code.

Ping https://gcc.gnu.org/ml/gcc-patches/2017-08/msg01174.html
Needs a global reviewer to bless array casts in asm constraints.
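
As an illustration of the kind of array cast under discussion, here is a small
sketch in GNU C.  It uses the cast as an input "m" operand rather than the
"=m" outputs quoted above, and the function name clear_buffer is made up.
Whether such casts are officially supported is exactly the open question of
this thread, so treat it as an example of the syntax, not as a guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Zero N bytes at BUF and keep the stores alive even if BUF is never
   read again by C code.  The empty asm claims to read the whole
   N-byte array, which is narrower than a "memory" clobber.  */
void
clear_buffer (void *buf, size_t n)
{
  assert (buf != NULL || n == 0);
  if (n == 0)
    return;   /* A zero-length array type would not be valid.  */

  memset (buf, 0, n);
  __asm__ __volatile__ ("" : : "m" (*(const unsigned char (*)[n]) buf));
}
```

The cast gives the operand an array type covering all n bytes, so the
compiler knows the extent of memory the asm depends on instead of only the
first element.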

-- 
Alan Modra
Australia Development Lab, IBM

libbacktrace patch committed: Support compressed debug sections

2017-09-28 Thread Ian Lance Taylor

This patch to libbacktrace adds support for compressed debug sections.
Rather than require all users of libbacktrace to link against -lz, I
wrote new code in libbacktrace to inflate a zlib stream.  Because the
code does not have to be as flexible as zlib, and because it is only
used to uncompress from one memory buffer to another and therefore
does not need to provide a streaming interface, and because I wasted a
day speeding it up, it's a few percent faster than zlib (at least as
measured by the simple benchmark in the new ztest.c file).
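
For context, a zlib stream ends with an Adler-32 checksum of the uncompressed
data, which is what a function such as elf_zlib_verify_checksum in the
ChangeLog below has to compare against.  The following is a plain,
unoptimized sketch of that checksum.  The name sketch_adler32 is invented,
and this is not the tuned code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Straightforward Adler-32: two running sums modulo 65521, combined
   as (b << 16) | a.  */
uint32_t
sketch_adler32 (const unsigned char *data, size_t len)
{
  assert (data != NULL || len == 0);

  uint32_t a = 1, b = 0;
  for (size_t i = 0; i < len; i++)
    {
      a = (a + data[i]) % 65521u;
      b = (b + a) % 65521u;
    }
  return (b << 16) | a;
}
```

A faster version would defer the modulo over blocks of input bytes; the
per-byte form is kept here because it is easier to check by hand.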

This fixes PR 67165.

Bootstrapped and ran libbacktrace and Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian

2017-09-28  Ian Lance Taylor  

PR other/67165
* elf.c (__builtin_prefetch): Define if not __GNUC__.
(unlikely): Define.
(SHF_UNCOMPRESSED, ELFCOMPRESS_ZLIB): Define.
(b_elf_chdr): Define type.
(enum debug_section): Add ZDEBUG_xxx values.
(debug_section_names): Add names for new sections.
(struct debug_section_info): Add compressed field.
(elf_zlib_failed, elf_zlib_fetch): New static functions.
(HUFFMAN_TABLE_SIZE, HUFFMAN_VALUE_MASK): Define.
(HUFFMAN_BITS_SHIFT, HUFFMAN_BITS_MASK): Define.
(HUFFMAN_SECONDARY_SHIFT): Define.
(ZDEBUG_TABLE_SIZE): Define.
(ZDEBUG_TABLE_CODELEN_OFFSET, ZDEBUG_TABLE_WORK_OFFSET): Define.
(final_next_secondary): New static variable if
BACKTRACE_GENERATE_FIXED_HUFFMAN_TABLE.
(elf_zlib_inflate_table): New static function.
(BACKTRACE_GENERATE_FIXED_HUFFMAN_TABLE): If defined, define main
function to produce fixed Huffman table.
(elf_zlib_default_table): New static variable.
(elf_zlib_inflate): New static function.
(elf_zlib_verify_checksum): Likewise.
(elf_zlib_inflate_and_verify): Likewise.
(elf_uncompress_zdebug): Likewise.
(elf_uncompress_chdr): Likewise.
(backtrace_uncompress_zdebug): New extern function.
(elf_add): Look for .zdebug sections and SHF_COMPRESSED debug
sections, and uncompress them.
* internal.h (backtrace_uncompress_zdebug): Declare.
* ztest.c: New file.
* configure.ac: Check for -lz and check whether the linker
supports --compress-debug-sections.
* Makefile.am (ztest_SOURCES): New variable.
(ztest_CFLAGS, ztest_LDADD): New variables.
(check_PROGRAMS): Add ztest.
(ctestg_SOURCES): New variable.
(ctestg_CFLAGS, ctestg_LDFLAGS, ctestg_LDADD): New variables.
(ctesta_SOURCES): New variable.
(ctesta_CFLAGS, ctesta_LDFLAGS, ctesta_LDADD): New variables.
(check_PROGRAMS): Add ctestg and ctesta.
* configure, config.h.in, Makefile.in: Rebuild.
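
The .zdebug sections mentioned in the elf_add entry use the older GNU layout:
the ASCII bytes "ZLIB", then the uncompressed size as an 8-byte big-endian
integer, then the zlib stream.  A minimal sketch of reading that header
follows.  The function name sketch_zdebug_size is hypothetical and the code
is not taken from elf.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Parse the 12-byte header of a GNU-style .zdebug section.  Returns 1
   and stores the uncompressed size on success, 0 if the section is
   too short or lacks the "ZLIB" magic.  */
int
sketch_zdebug_size (const unsigned char *sec, size_t len, uint64_t *size)
{
  assert (sec != NULL && size != NULL);
  if (len < 12 || memcmp (sec, "ZLIB", 4) != 0)
    return 0;

  uint64_t n = 0;
  for (int i = 0; i < 8; i++)
    n = (n << 8) | sec[4 + i];
  *size = n;
  return 1;
}
```

Sections marked SHF_COMPRESSED carry an ELF compression header instead, so
they need a separate code path.
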
Index: Makefile.am
===
--- Makefile.am (revision 253270)
+++ Makefile.am (working copy)
@@ -101,6 +101,16 @@ stest_LDADD = libbacktrace.la
 
 check_PROGRAMS += stest
 
+ztest_SOURCES = ztest.c testlib.c
+ztest_CFLAGS = -DSRCDIR=\"$(srcdir)\"
+ztest_LDADD = libbacktrace.la
+
+if HAVE_ZLIB
+ztest_LDADD += -lz
+endif
+
+check_PROGRAMS += ztest
+
 edtest_SOURCES = edtest.c edtest2_build.c testlib.c
 edtest_LDADD = libbacktrace.la
 
@@ -132,6 +142,22 @@ dtest: btest
 
 endif HAVE_OBJCOPY_DEBUGLINK
 
+if HAVE_COMPRESSED_DEBUG
+
+ctestg_SOURCES = btest.c testlib.c
+ctestg_CFLAGS = $(AM_CFLAGS) -g
+ctestg_LDFLAGS = -Wl,--compress-debug-sections=zlib-gnu
+ctestg_LDADD = libbacktrace.la
+
+ctesta_SOURCES = btest.c testlib.c
+ctesta_CFLAGS = $(AM_CFLAGS) -g
+ctesta_LDFLAGS = -Wl,--compress-debug-sections=zlib-gabi
+ctesta_LDADD = libbacktrace.la
+
+check_PROGRAMS += ctestg ctesta
+
+endif
+
 endif NATIVE
 
 # We can't use automake's automatic dependency tracking, because it
Index: configure.ac
===
--- configure.ac (revision 253270)
+++ configure.ac (working copy)
@@ -405,6 +405,23 @@ AC_SUBST(PTHREAD_CFLAGS)
 
 AM_CONDITIONAL(HAVE_PTHREAD, test "$libgo_cv_lib_pthread" = yes)
 
+AC_CHECK_LIB([z], [compress], [])
+if test $ac_cv_lib_z_compress = "yes"; then
+  AC_DEFINE(HAVE_ZLIB, 1, [Define if -lz is available.])
+fi
+AM_CONDITIONAL(HAVE_ZLIB, test "$ac_cv_lib_z_compress" = yes)
+
+dnl Test whether the linker supports the --compress_debug_sections option.
+AC_CACHE_CHECK([whether --compress-debug-sections is supported],
+[libgo_cv_ld_compress],
+[LDFLAGS_hold=$LDFLAGS
+LDFLAGS="$LDFLAGS -Wl,--compress-debug-sections=zlib-gnu"
+AC_LINK_IFELSE([AC_LANG_PROGRAM(,)],
+[libgo_cv_ld_compress=yes],
+[libgo_cv_ld_compress=no])
+LDFLAGS=$LDFLAGS_hold])
+AM_CONDITIONAL(HAVE_COMPRESSED_DEBUG, test "$libgo_cv_ld_compress" = yes)
+
 AC_ARG_VAR(OBJCOPY, [location of objcopy])
 AC_CHECK_PROG(OBJCOPY, objcopy, objcopy,)
 AC_CACHE_CHECK([whether objcopy supports debuglink],
Index: elf.c
===
--- elf.c   (revision 253270)
+++ elf.c   (working copy)
@@ -56,6 +56,13 @@ POSSIBILITY OF SUCH DAMAGE.  */
  #define S_ISLNK(m) (((m) & S_IFMT) == S_IFLNK)
 #endif
 
+#ifndef __GNUC__
+#define __builtin_prefetch(p, r, l)
+#define unlikely(x) (x)
+#else
+#define unlikely(x)

Re: [PATCH] PR libstdc++/81469 deprecate std::uncaught_exception for C++17

2017-09-28 Thread Nathan Sidwell


On 09/28/2017 11:52 AM, Jakub Jelinek wrote:

Hi!

On Wed, Sep 20, 2017 at 05:35:26PM +0100, Jonathan Wakely wrote:

C++17 deprecates uncaught_exception in favour of uncaught_exceptions,
so this adds the attribute.

PR libstdc++/81469
* libsupc++/exception (uncaught_exception): Deprecate for C++17.
* testsuite/18_support/exception_ptr/62258.cc: Add -Wno-deprecated.
* testsuite/18_support/uncaught_exception/14026.cc: Likewise.

Tested powerpc64le-linux, committed to trunk.


This broke a couple of tests with make check-c++-all, the following patch
should fix that.

Regtested on x86_64-linux and i686-linux, including make check-c++-all, ok
for trunk?


ok, thanks

nathan
--
Nathan Sidwell

Re: [C++ PATCH] Let make check-c++-all test also c++2a

2017-09-28 Thread Nathan Sidwell


On 09/28/2017 11:50 AM, Jakub Jelinek wrote:

Hi!




Here is the third patch, which enables testing c++2a in make check-c++-all.
Regtested and regtested with make check-c++-all on x86_64-linux and
i686-linux, ok for trunk?


yup, thanks.


--
Nathan Sidwell

Re: [C++ PATCH] Fix attribute parsing for bitfields

2017-09-28 Thread Nathan Sidwell


On 09/28/2017 11:47 AM, Jakub Jelinek wrote:


I don't see the first half of 1) related to second half thereof, the
DECL_BIT_FIELD_REPRESENTATIVE is unrelated to the parsing bug, and a needed
precondition of 2).


That's very conscientious of you.



As I said on IRC, I hope [[/__attribute__/alignas early is rare enough that
the tentative parsing shouldn't be a big deal, if it is, we could add some
cheaper function that allows us to skip over attributes (return a peek
offset after the attributes given a starting peek offset).

I agree.


--- gcc/cp/parser.c.jj  2017-09-22 20:51:48.181537880 +0200
+++ gcc/cp/parser.c 2017-09-27 17:50:15.082792676 +0200
@@ -23443,35 +23443,64 @@ cp_parser_member_declaration (cp_parser*



  /* Peek at the next token.  */
  token = cp_lexer_peek_token (parser->lexer);
  
+	  if (cp_next_tokens_can_be_attribute_p (parser)

+ || (token->type == CPP_NAME
+ && cp_nth_tokens_can_be_attribute_p (parser, 2)
+ && (named_bitfld = true)))
+   {


Please insert a comment describing why we're doing the lookahead and our 
supposition that it is rare, so the simplicity of a tentative parse is 
worth the expense.  Also, as I mentioned before, the other alternative 
is to let the non-bitfield declarator parsing get an unnamed decl and 
then look for the ':'. But that's a more complicated change we decided 
not to do.


The mid-condition assignment is ugly, but probably the least ugly option.

ok for trunk.

nathan
--
Nathan Sidwell

Re: 0005-Part-5.-Add-x86-CET-documentation

2017-09-28 Thread Jeff Law

On 09/27/2017 09:17 AM, Tsimbalist, Igor V wrote:

>>
> 
> 0005-Add-x86-CET-documentation.patch
> 
> 
> From dda22b06a3a5bde9b0dc57585d878db520769510 Mon Sep 17 00:00:00 2001
> From: Igor Tsimbalist 
> Date: Tue, 4 Jul 2017 13:55:03 +0300
> Subject: [PATCH 5/6] Add x86 CET documentation.
> 
> gcc/
>   * doc/extend.texi: Add x86 specific to 'nocf_check' attribute.
>   List CET intrinsics.
>   * doc/invoke.texi: Add -mcet, -mibt, -mshstk options.  Add x86
>   specific to -fcf-protection option.
Once Sandra has OK'd the doc patches, they should be considered OK for
the trunk.

jeff

Re: 0005-Part-5.-Add-x86-CET-documentation

2017-09-28 Thread Jeff Law

On 09/27/2017 11:01 AM, Joseph Myers wrote:
> On Wed, 27 Sep 2017, Florian Weimer wrote:
> 
>> This is part of the ABI GCC implements, so it has to be documented somewhere,
>> and not just as part of the GCC source code.
>>
>> CET is not properly described in the ABI supplement and I don't think this
>> will change, so detailed documentation in the GCC manual is very much
>> desirable.
> 
> Isn't this a matter to take up further in the thread HJ started on the ABI 
> mailing lists, or a new such thread (possibly e.g. sending pull requests 
> that build further on his wording, or propose alternative wording, to 
> clarify the things left unclear there, with a goal of getting it clearly 
> defined in the master sources for x86_64 and x86)?  Clearly the best 
> result would be proper documentation in the ABI and the GCC manual 
> cross-referencing the relevant ABI documents.

The documentation should be AFAICT independent of the compiler in use --
ie, gcc, llvm and icc all should agree on where/when these new
instructions should be inserted.  Which argues that the documentation
belongs in the ABI docs, not the GCC docs.  *users* aren't really going
to care about these kinds of details.

So I think the summary is that I agree with Joseph on this.  Let's push
it into the ABI docs.  HJ can and should play a central role in making
that happen.

jeff

Re: [C++ PATCH] Stash bitfield width into DECL_BIT_FIELD_REPRESENTATIVE instead of DECL_INITIAL

2017-09-28 Thread Nathan Sidwell


On 09/28/2017 11:49 AM, Jakub Jelinek wrote:


The following patch is the D_B_F_R part of the above.
Bootstrapped/regtested on top of the patch I've just posted, ok for trunk?


Looks good, a couple of nits.  I think the bits out of the cp FE are 
sufficiently obvious given the cp changes.




c/
* c-decl.c (grokfield): Use SET_DECL_C_BIT_FIELD here if
with is non-NULL.

/with/width


--- gcc/c-family/c-attribs.c.jj 2017-09-18 20:48:53.731871226 +0200
+++ gcc/c-family/c-attribs.c 2017-09-19 09:51:21.928612658 +0200
@@ -426,7 +426,7 @@ handle_packed_attribute (tree *node, tre



--- gcc/cp/class.c.jj   2017-09-18 20:48:53.509873991 +0200
+++ gcc/cp/class.c  2017-09-19 10:31:35.435961690 +0200
@@ -3231,12 +3231,12 @@ check_bitfield_decl (tree field)
tree w;
  
/* Extract the declared width of the bitfield, which has been

- temporarily stashed in DECL_INITIAL.  */
-  w = DECL_INITIAL (field);
+ temporarily stashed in DECL_BIT_FIELD_REPRESENTATIVE.  */


While you're here, could you point the comment at grokbitfield?



+++ gcc/cp/decl2.c  2017-09-19 10:31:45.066839694 +0200
@@ -1042,7 +1047,8 @@ grokbitfield (const cp_declarator *decla
   TREE_TYPE (width));
else
{
- DECL_INITIAL (value) = width;
+ /* Temporarily stash the width in DECL_BIT_FIELD_REPRESENTATIVE.  */


And likewise here point at check_bitfield_decl? (answering the 
temporarily 'til when? question)



+++ gcc/objc/objc-act.c 2017-09-19 13:01:06.917412755 +0200
@@ -4602,8 +4602,14 @@ check_ivars (tree inter, tree imp)
t1 = TREE_TYPE (intdecls); t2 = TREE_TYPE (impdecls);
  
if (!comptypes (t1, t2)

+#ifdef OBJCPLUS
+ || !tree_int_cst_equal (DECL_BIT_FIELD_REPRESENTATIVE (intdecls),
+ DECL_BIT_FIELD_REPRESENTATIVE (impdecls))
+#else
  || !tree_int_cst_equal (DECL_INITIAL (intdecls),
- DECL_INITIAL (impdecls)))
+ DECL_INITIAL (impdecls))
+#endif


This is a little unfortunate.  Feel like adding a cleanup task to the 
easy-projects wiki?  You'll see I added a few I've been spotting.


ok for trunk.

nathan

--
Nathan Sidwell

Re: 0002-Part-2.-Document-finstrument-control-flow-and-notrack attribute

2017-09-28 Thread Jeff Law

On 09/27/2017 06:27 AM, Tsimbalist, Igor V wrote:

> 
> 
> 0002-Add-documentation-for-fcf-protection-option-and-nocf.patch
> 
> 
> From bc896670fef5eb7324c0e0134747696f3ed66553 Mon Sep 17 00:00:00 2001
> From: Igor Tsimbalist 
> Date: Sun, 17 Sep 2017 14:57:29 +0300
> Subject: [PATCH 2/5] Add documentation for fcf-protection option and
>  nocf_check attribute
> 
> gcc/doc/
>   * extend.texi: Add 'nocf_check' documentation.
>   * gimple.texi: Add second parameter to gimple_build_call_from_tree.
>   * invoke.texi: Add -fcf-protection documentation.
>   * rtl.texi: Add REG_CALL_NOTRACK documenation.
s/documenation/documentation

Otherwise this is fine once Sandra gives her OK.

jeff

Re: 0003-Part-3.-Add-tests-for-finstrument-control-flow-and-notrack attribute

2017-09-28 Thread Jeff Law

On 09/19/2017 07:58 AM, Tsimbalist, Igor V wrote:
> Here is an updated patch (version #2). Mainly attribute and option  names 
> were changed.
> The test for ICF will be introduced in x86 specific tests (patch 0006-Part-6) 
> as the implementation
> checks if the CF instrumentation is on to adjust a hash based on 
> 'nocf_check' attribute presence.
> In generic part CF instrumentation is off as no implementation exists.
> 
> The patch for x86 specific tests (patch 0006-Part-6) is being reviewed by 
> Uros.
> 
> gcc/testsuite/
>   * c-c++-common/fcf-protection-1.c: New test.
>   * c-c++-common/fcf-protection-2.c: Likewise.
>   * c-c++-common/fcf-protection-3.c: Likewise.
>   * c-c++-common/fcf-protection-4.c: Likewise.
>   * c-c++-common/fcf-protection-5.c: Likewise.
>   * c-c++-common/attr-nocf-check-1.c: Likewise.
>   * c-c++-common/attr-nocf-check-2.c: Likewise.
>   * c-c++-common/attr-nocf-check-3.c: Likewise.
> 
> Is it ok for trunk?
>
This is OK once the CET changes for the compiler are approved.

Jeff

Re: 0001-Part-1.-Add-generic-part-for-Intel-CET-enabling

2017-09-28 Thread Jeff Law

On 09/19/2017 07:39 AM, Tsimbalist, Igor V wrote:
> Here is an updated patch (version #2). The main differences are:
> 
> - Change attribute and option names;
> - Add additional parameter to gimple_build_call_from_tree by adding a type 
> parameter and
>   use it for 'nocf_check' attribute propagation;
> - Reimplement fixes in expand_call_stmt to propagate 'nocf_check' attribute;
> - Consider 'nocf_check' attribute in Identical Code Folding (ICF) 
> optimization;
> - Add warning for type inconsistency regarding 'nocf_check' attribute;
> - Many small fixes;
> 
> gcc/c-family/
>   * c-attribs.c (handle_nocf_check_attribute): New function.
>   (c_common_attribute_table): Add 'nocf_check' handling.
>   * c-common.c (check_missing_format_attribute): New function.
>   * c-common.h: Likewise.
> 
> gcc/c/
>   * c-typeck.c (convert_for_assignment): Add check for nocf_check
>   attribute.
>   * gimple-parser.c: Add second argument NULL to
>   gimple_build_call_from_tree.
> 
> gcc/cp/
>   * typeck.c (convert_for_assignment): Add check for nocf_check
>   attribute.
> 
> gcc/
>   * cfgexpand.c (expand_call_stmt): Set REG_CALL_NOCF_CHECK for
>   call insn.
>   * combine.c (distribute_notes): Add REG_CALL_NOCF_CHECK handling.
>   * common.opt: Add fcf-protection flag.
>   * emit-rtl.c (try_split): Add REG_CALL_NOCF_CHECK handling.
>   * flag-types.h: Add enum cf_protection_level.
>   * gimple.c (gimple_build_call_from_tree): Add second parameter.
>   Add 'nocf_check' attribute propagation to gimple call.
>   * gimple.h (gf_mask): Add GF_CALL_NOCF_CHECK.
>   (gimple_call_nocf_check_p): New function.
>   (gimple_call_set_nocf_check): Likewise.
>   * gimplify.c: Add second argument to gimple_build_call_from_tree.
>   * ipa-icf.c: Add nocf_check attribute in statement hash.
>   * recog.c (peep2_attempt): Add REG_CALL_NOCF_CHECK handling.
>   * reg-notes.def: Add REG_NOTE (CALL_NOCF_CHECK).
>   * toplev.c (process_options): Add flag_cf_protection handling.
> 
> Is it ok for trunk?
> 
> Thanks,
> Igor
> 
> 


> 
> diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
> index 0337537..77d1909 100644
> --- a/gcc/c-family/c-attribs.c
> +++ b/gcc/c-family/c-attribs.c
> @@ -65,6 +65,7 @@ static tree handle_asan_odr_indicator_attribute (tree *, 
> tree, tree, int,
>  static tree handle_stack_protect_attribute (tree *, tree, tree, int, bool *);
>  static tree handle_noinline_attribute (tree *, tree, tree, int, bool *);
>  static tree handle_noclone_attribute (tree *, tree, tree, int, bool *);
> +static tree handle_nocf_check_attribute (tree *, tree, tree, int, bool *);
>  static tree handle_noicf_attribute (tree *, tree, tree, int, bool *);
>  static tree handle_noipa_attribute (tree *, tree, tree, int, bool *);
>  static tree handle_leaf_attribute (tree *, tree, tree, int, bool *);
> @@ -367,6 +368,8 @@ const struct attribute_spec c_common_attribute_table[] =
>{ "patchable_function_entry",  1, 2, true, false, false,
> handle_patchable_function_entry_attribute,
> false },
> +  { "nocf_check",  0, 0, false, true, true,
> +   handle_nocf_check_attribute, false },
>{ NULL, 0, 0, false, false, false, NULL, false }
>  };
>  
> @@ -783,6 +786,26 @@ handle_noclone_attribute (tree *node, tree name,
>return NULL_TREE;
>  }
>  
> +/* Handle a "nocf_check" attribute; arguments as in
> +   struct attribute_spec.handler.  */
> +
> +static tree
> +handle_nocf_check_attribute (tree *node, tree name,
> +   tree ARG_UNUSED (args),
> +   int ARG_UNUSED (flags), bool *no_add_attrs)
> +{
> +  if (TREE_CODE (*node) != FUNCTION_TYPE
> +  && TREE_CODE (*node) != METHOD_TYPE
> +  && TREE_CODE (*node) != FIELD_DECL
> +  && TREE_CODE (*node) != TYPE_DECL)
So curious, is this needed for FIELD_DECL and TYPE_DECL?  ISTM the
attribute is applied to function/method types.

If we do need to handle FIELD_DECL and TYPE_DECL here, can you add a
quick comment why?

> diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
> index b3ec3a0..78a730e 100644
> --- a/gcc/c-family/c-common.c
> +++ b/gcc/c-family/c-common.c
> @@ -7253,6 +7253,26 @@ check_missing_format_attribute (tree ltype, tree rtype)
>  return false;
>  }
>  
> +/* Check for missing nocf_check attributes on function pointers.  LTYPE is
> +   the new type or left-hand side type.  RTYPE is the old type or
> +   right-hand side type.  Returns TRUE if LTYPE is missing the desired
> +   attribute.  */
> +
> +bool
> +check_missing_nocf_check_attribute (tree ltype, tree rtype)
> +{
> +  tree const ttr = TREE_TYPE (rtype), ttl = TREE_TYPE (ltype);
> +  tree ra, la;
> +
> +  for (ra = TYPE_ATTRIBUTES (ttr); ra; ra = TREE_CHAIN (ra))
> +if (is_attribute_p ("nocf_check", TREE_PURPOSE (ra)))
> +  break;
> +

[PATCH], Add PowerPC ISA 3.0 IEEE 128-bit floating point round to odd built-in functions

2017-09-28 Thread Michael Meissner

This patch adds built-in functions on PowerPC ISA 3.0 (power9) that allow the
user to access the round to odd IEEE 128-bit floating point instructions.
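
For readers unfamiliar with the rounding mode: round to odd truncates the
result and, if anything was discarded, forces the lowest kept bit to 1.  This
keeps enough information that a later rounding to a narrower format does not
suffer from double rounding.  The sketch below shows the idea on plain
integers.  It is only an analogy for the IEEE 128-bit instructions, and the
function name shift_right_round_to_odd is made up.

```c
#include <assert.h>
#include <stdint.h>

/* Drop the low SHIFT bits of X, rounding to odd: truncate, then set
   the lowest kept bit if any discarded bit was nonzero.  */
uint64_t
shift_right_round_to_odd (uint64_t x, unsigned shift)
{
  assert (shift < 64);
  if (shift == 0)
    return x;   /* Nothing is discarded, so the result is exact.  */

  uint64_t discarded = x & ((UINT64_C (1) << shift) - 1);
  return (x >> shift) | (discarded != 0);
}
```

An exact result is returned unchanged, so even values can still appear; only
inexact results are pushed to an odd value.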

I have checked it on a little endian power8 system doing a bootstrap and make
check.  There were no regressions in the testsuite.  I verified that the new
test (float128-odd.c) did run successfully.  Can I check this patch into the
trunk?

[gcc]
2017-09-28  Michael Meissner  

* config/rs6000/rs6000-builtin.def (BU_FLOAT128_2_HW): Define new
helper macro for IEEE float128 hardware built-in functions.
(SQRTF128_ODD): Add built-in functions with the round-to-odd
semantics.
(TRUNCF128_ODD): Likewise.
(ADDF128_ODD): Likewise.
(SUBF128_ODD): Likewise.
(MULF128_ODD): Likewise.
(DIVF128_ODD): Likewise.
(FMAF128_ODD): Likewise.
* config/rs6000/rs6000.md (truncsf2_hw): Change the truncate
with round to odd expansion to use float_truncate:DF inside of the
UNSPEC to better document what the insn does.
(add3_odd): Add insns for IEEE 128-bit floating point round
to odd hardware instructions.
(sub3_odd): Likewise.
(mul3_odd): Likewise.
(div3_odd): Likewise.
(sqrt2_odd): Likewise.
(fma4_odd): Likewise.
(fms4_odd): Likewise.
(nfma4_odd): Likewise.
(nfms4_odd): Likewise.
(truncdf2_odd): Change insn format to make it more readable,
and add a generator function.
* doc/extend.texi (PowerPC built-in functions): Update documentation
for existing IEEE float128-bit built-in functions.  Add built-in
functions that generate the IEEE 128-bit floating point round to
odd instructions.

[gcc/testsuite]
2017-09-28  Michael Meissner  

* gcc.target/powerpc/float128-odd.c: New test.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000-builtin.def
===
--- gcc/config/rs6000/rs6000-builtin.def (revision 253267)
+++ gcc/config/rs6000/rs6000-builtin.def (working copy)
@@ -686,6 +686,14 @@
 | RS6000_BTC_UNARY),   \
CODE_FOR_ ## ICODE) /* ICODE */
 
+#define BU_FLOAT128_2_HW(ENUM, NAME, ATTR, ICODE)   \
+  RS6000_BUILTIN_2 (MISC_BUILTIN_ ## ENUM,  /* ENUM */  \
+   "__builtin_" NAME,  /* NAME */  \
+   RS6000_BTM_FLOAT128_HW, /* MASK */  \
+   (RS6000_BTC_ ## ATTR/* ATTR */  \
+| RS6000_BTC_BINARY),  \
+   CODE_FOR_ ## ICODE) /* ICODE */
+
 #define BU_FLOAT128_3_HW(ENUM, NAME, ATTR, ICODE)   \
   RS6000_BUILTIN_3 (MISC_BUILTIN_ ## ENUM,  /* ENUM */  \
"__builtin_" NAME,  /* NAME */  \
@@ -2365,11 +2373,19 @@ BU_P9_OVERLOAD_2 (CMPEQB,   "byte_in_set")
 BU_FLOAT128_1 (FABSQ,  "fabsq",   CONST, abskf2)
 BU_FLOAT128_2 (COPYSIGNQ,  "copysignq",   CONST, copysignkf3)
 
-/* 1 and 3 argument IEEE 128-bit floating point functions that require ISA 3.0
-   hardware.  These functions use the new 'f128' suffix.  Eventually these
-   should be folded into the common built-in function handling. */
-BU_FLOAT128_1_HW (SQRTF128,"sqrtf128", CONST, sqrtkf2)
-BU_FLOAT128_3_HW (FMAF128, "fmaf128",  CONST, fmakf4_hw)
+/* 1, 2, and 3 argument IEEE 128-bit floating point functions that require ISA
+   3.0 hardware.  These functions use the new 'f128' suffix.  Eventually the
+   standard functions should be folded into the common built-in function
+   handling. */
+BU_FLOAT128_1_HW (SQRTF128, "sqrtf128",   CONST, sqrtkf2)
+BU_FLOAT128_1_HW (SQRTF128_ODD, "sqrtf128_round_to_odd",  CONST, sqrtkf2_odd)
+BU_FLOAT128_1_HW (TRUNCF128_ODD, "truncf128_round_to_odd", CONST, trunckfdf2_odd)
+BU_FLOAT128_2_HW (ADDF128_ODD,  "addf128_round_to_odd",   CONST, addkf3_odd)
+BU_FLOAT128_2_HW (SUBF128_ODD,  "subf128_round_to_odd",   CONST, subkf3_odd)
+BU_FLOAT128_2_HW (MULF128_ODD,  "mulf128_round_to_odd",   CONST, mulkf3_odd)
+BU_FLOAT128_2_HW (DIVF128_ODD,  "divf128_round_to_odd",   CONST, divkf3_odd)
+BU_FLOAT128_3_HW (FMAF128,  "fmaf128",CONST, fmakf4_hw)
+BU_FLOAT128_3_HW (FMAF128_ODD,  "fmaf128_round_to_odd",   CONST, fmakf4_odd)
 
 /* 1 argument crypto functions.  */
 BU_CRYPTO_1 (VSBOX,"vsbox",  CONST, crypto_vsbox)
Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision

Re: Make tests less istreambuf_iterator implementation dependent

2017-09-28 Thread Jonathan Wakely


On 28/09/17 21:59 +0200, François Dumont wrote:

On 28/09/2017 14:12, Jonathan Wakely wrote:

On 27/09/17 22:16 +0200, François Dumont wrote:

Hi

    I just committed attached patch as trivial.

    Those tests were highly dependent on the istreambuf_iterator
implementation. It is the result of the call to money_get<>::get that
points immediately beyond the last character recognized, to quote the
Standard's words.


But according to the standard's specification for istreambuf_iterator
it makes no difference, because both iterators point to the same
streambuf and share the state.



Simply adding a:

VERIFY( *ibeg2 == '0' );

before calling the money_get<>::get method would have broken the test.

The current istreambuf_iterator implementation captures the current 
streambuf state each time it is tested for eof or evaluated. This is 
why I considered those tests fragile.


Yes, and I think that's non-conforming.

As I said in the other thread, I'd really like to see references to
the standard used to justify any changes to our istreambuf_iterator.
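
The shared-state point above can be illustrated with a minimal sketch. The helper name peek_after_advancing_copy is hypothetical, and the behaviour shown assumes operator* reads the stream buffer through sgetc(), as the standard specifies. An implementation that caches a character inside the iterator may differ once the copy has already been dereferenced, which is the fragility discussed in this thread.

```cpp
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: copy an istreambuf_iterator, advance the original,
// then dereference the copy. Both iterators refer to the same streambuf,
// so the copy observes the position reached through the original.
inline char peek_after_advancing_copy(const std::string& text) {
  std::istringstream in(text);
  std::istreambuf_iterator<char> a(in);
  std::istreambuf_iterator<char> b = a;  // b has not been dereferenced yet
  ++a;                                   // consumes one character from the shared buffer
  return *b;                             // reads the buffer's current character
}
```

With the input "abc", this should return the second character rather than the first, because advancing one iterator moves the shared stream buffer.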

Re: [PATCH][GRAPHITE] Make --param loop-block-tile-size=0 disable tiling

2017-09-28 Thread Sebastian Pop

On Wed, Sep 27, 2017 at 7:20 AM, Richard Biener  wrote:
>
> Currently ISL aborts on this special value and for debugging (and
> tuning?) it's nice to avoid all the clutter introduced by tiling.
>
> Committed as obvious.
>
> Richard.
>
> 2017-09-27  Richard Biener  
>
> * graphite-optimize-isl.c (get_schedule_for_node_st): Allow
> --param loop-block-tile-size=0 to disable tiling.

Looks good.

Re: [PATCH][GRAPHITE] Allow --param graphite-max-arrays-per-scop=0

2017-09-28 Thread Sebastian Pop

On Wed, Sep 27, 2017 at 6:51 AM, Richard Biener  wrote:
>
> The following is to allow making --param graphite-max-arrays-per-scop
> unbounded.  That's a little tricky because the bound is used when
> computing "alias-sets" for scalar constraints.  There's an easy way
> out though as we know the maximum alias-set assigned in the SCOP,
> we only have to remember it.  The advantage (if it matters at all)
> is that we avoid a constraint coefficient gap between that last
> used alias-set and the former PARAM_GRAPHITE_MAX_ARRAYS_PER_SCOP.
>
> Bootstrap and regtest running on x86_64-unknown-linux-gnu, SPEC CPU 2006
> tested.  Will apply after testing finished.
>
> Richard.
>
> 2017-09-27  Richard Biener  
>
> * graphite.h (scop::max_alias_set): New member.
> * graphite-scop-detection.c: Remove references to non-existing
> --param in comments.
> (build_alias_sets): Record the maximum alias set used for drs.
> (build_scops): Support zero as unlimited for
> --param graphite-max-arrays-per-scop.
> * graphite-sese-to-poly.c (add_scalar_version_numbers): Remove
> and inline into ...
> (build_poly_sr_1): ... here.  Compute alias set based on the
> maximum alias set used for drs rather than
> PARAM_GRAPHITE_MAX_ARRAYS_PER_SCOP
>

Maybe we should keep this limit and, instead of failing to handle
huge scops, stop the scop detection from expanding the
scop past this limit?

Re: [PATCH][GRAPHITE] Remove another small quadraticness

2017-09-28 Thread Sebastian Pop

On Wed, Sep 27, 2017 at 6:48 AM, Richard Biener  wrote:
>
> Turns out loop_nest recorded in scop-info isn't really necessary as
> we can simply process parameters in loop bounds during the gather_bbs
> walk where we encounter each loop (identified by its header) once.
>
> This avoids the linear search in record_loop_in_sese.
>
> Bootstrap / regtest running on x86_64-unknown-linux-gnu, will apply.
>
> Richard.
>
> 2017-09-27  Richard Biener  
>
> * graphite-scop-detection.c (find_scop_parameters): Move
> loop bound handling ...
> (gather_bbs::before_dom_children): ... here, avoiding the need
> to build scop_info->loop_nest.
> (record_loop_in_sese): Remove.
> * sese.h (sese_info_t::loop_nest): Remove.
> * sese.c (new_sese_info): Do not allocate loop_nest.
> (free_sese_info): Do not free loop_nest.

Looks good.  Thanks!

Re: [PATCH][GRAPHITE] Speedup SCOP detection some more, add region handling to domwalk

2017-09-28 Thread Sebastian Pop

On Wed, Sep 27, 2017 at 6:07 AM, Richard Biener  wrote:
>  /* Maximal number of array references in a scop.  */
>
DEFPARAM (PARAM_GRAPHITE_MAX_ARRAYS_PER_SCOP,
  "graphite-max-arrays-per-scop",
  "maximum number of arrays per scop.",
  100, 0, 0)

Let's also remove this param as we now have max-isl-operations.

Thanks,
Sebastian

Re: [PATCH][GRAPHITE] Speedup SCOP detection some more, add region handling to domwalk

2017-09-28 Thread Sebastian Pop

On Wed, Sep 27, 2017 at 6:07 AM, Richard Biener  wrote:
>
> This removes another quadraticness from SCOP detection, gather_bbs
> domwalk.  This is done by enhancing domwalk to handle SEME regions
> via a special return value from before_dom_children.
>
> With this I'm now confident to remove the
> PARAM_GRAPHITE_MAX_BBS_PER_FUNCTION parameter and its associated limit.
> Being there I've adjusted PARAM_GRAPHITE_MAX_NB_SCOP_PARAMS to its
> documented default value which enables 90 more loops to be processed
> in SPEC CPU 2006.  I've also made a value of zero magic in disabling
> the limit (a trick commonly used in GCC).
>
> Statistics I have gathered a few patches before for SPEC CPU 2006:
>
> 1255 multi-loop SESEs in SCOP processing
> max. params 34, 3 scops >= 20, 15 scops >= 10, 33 scops >= 8
> max. drs per scop 869, 10 scops >= 100
> max. pbbs per scop 36, 12 scops >= 10
> 919 SCOPs fail in build_alias_sets
>
> which shows the default for PARAM_GRAPHITE_MAX_ARRAYS_PER_SCOP
> is reasonable (if tuned to SPEC CPU 2006).
>
> I've also included the hunk that allows -fgraphite-identity
> to work on top of -floop-nest-optimize and for -floop-nest-optimize
> -ftree-parallelize-all also make sure to code-gen loops that
> end up not transformed.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, SPEC CPU 2006
> tested, applied to trunk.
>
> Richard.
>
> 2017-09-27  Richard Biener  
>
> * doc/invoke.texi (graphite-max-bbs-per-function): Remove.
> (graphite-max-nb-scop-params): Document special value zero.
> * domwalk.h (dom_walker::STOP): New symbolical constant.
> (dom_walker::dom_walker): Add optional parameter for bb to
> RPO mapping.
> (dom_walker::~dom_walker): Declare.
> (dom_walker::before_dom_children): Document STOP return value.
> (dom_walker::m_user_bb_to_rpo): New member.
> (dom_walker::m_bb_to_rpo): Likewise.
> * domwalk.c (dom_walker::dom_walker): Compute bb to RPO
> mapping here if not provided by the user.
> (dom_walker::~dom_walker): Free bb to RPO mapping if not
> provided by the user.
> (dom_walker::STOP): Define.
> (dom_walker::walk): Do not compute bb to RPO mapping here.
> Support STOP return value from before_dom_children to stop
> walking.
> * graphite-optimize-isl.c (optimize_isl): If the schedule
> is the same still generate code if -fgraphite-identity
> or -floop-parallelize-all are given.
> * graphite-scop-detection.c: Include cfganal.h.
> (gather_bbs::gather_bbs): Get and pass through bb to RPO
> mapping.
> (gather_bbs::before_dom_children): Return STOP for BBs
> not in the region.
> (build_scops): Compute bb to RPO mapping and pass it to
> the domwalk.  Treat --param graphite-max-nb-scop-params=0
> as not limiting the number of params.
> * graphite.c (graphite_initialize): Remove limit on the
> number of basic-blocks in a function.
> * params.def (PARAM_GRAPHITE_MAX_BBS_PER_FUNCTION): Remove.
> (PARAM_GRAPHITE_MAX_NB_SCOP_PARAMS): Adjust to documented
> default value of 10.

The patch looks good.  Thanks!
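
The "zero disables the limit" convention described in the quoted message can be sketched as below. The function within_limit is a hypothetical illustration, not code from graphite-scop-detection.c.

```cpp
#include <cstddef>

// Hypothetical helper, not GCC code: a limit of 0 is read as "no limit",
// any other value is an inclusive upper bound on count.
inline bool within_limit(std::size_t count, std::size_t limit) {
  return limit == 0 || count <= limit;
}
```

A caller would pass the --param value as limit, so that --param graphite-max-nb-scop-params=0 accepts any number of parameters under this assumed convention.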

>
> Index: gcc/doc/invoke.texi
> ===
> --- gcc/doc/invoke.texi (revision 253224)
> +++ gcc/doc/invoke.texi (working copy)
> @@ -10512,13 +10512,9 @@ sequence pairs.  This option only applie
>  @item graphite-max-nb-scop-params
>  To avoid exponential effects in the Graphite loop transforms, the
>  number of parameters in a Static Control Part (SCoP) is bounded.  The
> -default value is 10 parameters.

Now that we have "compute-out" functionality in all supported
versions of isl, let's remove this parameter.

We needed this in the past when isl was not able to stop an
exponential computation, and that happened when operating
on large dimension spaces.

Re: Make tests less istreambuf_iterator implementation dependent

2017-09-28 Thread François Dumont


On 28/09/2017 14:12, Jonathan Wakely wrote:

On 27/09/17 22:16 +0200, François Dumont wrote:

Hi

    I just committed attached patch as trivial.

    Those tests were highly dependent on the istreambuf_iterator
implementation. It is the result of the call to money_get<>::get that
points immediately beyond the last character recognized, to quote the
Standard's words.


But according to the standard's specification for istreambuf_iterator
it makes no difference, because both iterators point to the same
streambuf and share the state.



Simply adding a:

VERIFY( *ibeg2 == '0' );

before calling the money_get<>::get method would have broken the test.

The current istreambuf_iterator implementation captures the current 
streambuf state each time it is tested for eof or evaluated. This is why 
I considered those tests fragile.


François

Re: [PATCH][GRAPHITE] Simplify SCOP detection

2017-09-28 Thread Sebastian Pop

On Wed, Sep 27, 2017 at 2:21 AM, Richard Biener  wrote:
> On Tue, 26 Sep 2017, Sebastian Pop wrote:
>
>> On Tue, Sep 26, 2017 at 7:03 AM, Richard Biener  wrote:
>>
>> >
>> > The following is the result of me trying to understand SCOP detection
>> > and the validity checks spread around the machinery.  It removes several
>> > quadraticnesses by folding validity checks into
>> > scop_detection::harmful_loop_in_region where we already walk over all
>> > BBs in the region and process individual found loops.
>> >
>> > It also rewrites build_scop_depth/build_scop_breadth into something
>> > I can understand.
>> >
>> > Bootstrap and regtest is running on x86_64-unknown-linux-gnu (graphite.exp
>> > for all langs is happy, so is SPEC CPU 2006 testing where the statistics
>> > agree before/after the patch).
>> >
>> > I'll apply this after the bootstrap finished.
>> >
>>
>> Have you tried to bootstrap with BOOT_CFLAGS="-O2 -fgraphite-identity"?
>
> I do "-O2 -g -floop-nest-optimize"

Very good.

> but I guess -fgraphite-identity
> should catch more issues?

It would systematically exercise the scop detection and code generation.
With -floop-nest-optimize alone, when the isl scheduler does not find a
better schedule, we do not bother running the code generation part.

> Hmm, maybe -floop-nest-optimize and
> -fgraphite-identity should be combinable
>
> Index: gcc/graphite-optimize-isl.c
> ===
> --- gcc/graphite-optimize-isl.c (revision 253203)
> +++ gcc/graphite-optimize-isl.c (working copy)
> @@ -189,7 +189,7 @@ optimize_isl (scop_p scop)
> print_schedule_ast (dump_file, scop->original_schedule, scop);
>isl_schedule_free (scop->transformed_schedule);
>scop->transformed_schedule = isl_schedule_copy
> (scop->original_schedule);
> -  return false;
> +  return flag_graphite_identity || flag_loop_parallelize_all;

Yes.

>  }
>
>return true;
>
> I'll test/commit the above.

ok.

>
>>
>> > Richard.
>> >
>> > 2017-09-26  Richard Biener  
>> >
>> > * graphite-scop-detection.c (scop_detection::build_scop_depth):
>> > Rewrite,
>> > fold in ...
>> > (scop_detection::build_scop_breadth): ... this.  Removed.
>> > (scop_detection::loop_is_valid_in_scop): Fold into single caller.
>> > (scop_detection::harmful_stmt_in_bb): Likewise.
>> > (scop_detection::graphite_can_represent_stmt): Likewise.
>> > (scop_detection::loop_body_is_valid_scop): Likewise.  Remove
>> > recursion.
>> > (scop_detection::can_represent_loop): Remove recursion, fold in
>> > ...
>> > (scop_detection::can_represent_loop_1): ... this.  Removed.
>> > (scop_detection::harmful_loop_in_region): Simplify after inlining
>> > the above and remove more quadraticness.
>> > (build_scops): Adjust.
>> > * tree-data-ref.c (loop_nest_has_data_refs): Remove pointless
>> > quadraticness.
>> >
>> >
>> This goes in the right direction: it cuts down compilation time.
>> As it is not a trivial change, I need some time to understand how
>> the scop detection works with this change.
>
> The only functional change should be that the SESE composition now
> works top-down instead of working its way bottom-up.  It's not clear
> whether we do more or less work that way

So we went from top-down to bottom-up,
and now with this change we go back to top-down.
I think both algorithms are equivalent in terms of number
of times we validate statements.

We explained the current implementation of the scop detection in:
http://impact.gforge.inria.fr/impact2016/papers/impact2016-kumar.pdf
http://impact.gforge.inria.fr/impact2016/papers/impact2016-kumar-slides.pdf

Here is what happens on an example:

loop_1 {
  loop_2 {
stmt_1
  }
  stmt_2
  loop_3 {
stmt_3
  }
}

- with a top down scop detection, we would start the analysis with loop_1,
and start validating that every stmt in its body (stmt_1, stmt_2,
and finally stmt_3) can be represented in the polyhedral representation.
If at any moment the analysis returns "cannot represent", it would go one
level down and try to validate the immediate sub loop loop_2.
Let's assume that stmt_1 can be represented, and so it would try to
extend the scop by validating stmt_2 and then its sibling loop_3, and say
we fail on validating stmt_3.  All done, max scop is stmt_1 in loop_2
followed by stmt_2.

- with a bottom-up scop detection we would start from the inner loop_2: it passes
validation of stmt_1, then we extend the scop by validating stmt_2,
and then we fail at validation of stmt_3.  All done, max scop is stmt_1 in
loop_2 followed by stmt_2.  In the bottom-up process we don't
have to validate the outer loop_1.

Supposing that nothing fails in the process, a top-down detection
would be faster as it does not need to validate one by one the inner loops:
it just goes in one pass over the stmts of loop_1 body.
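
The top-down strategy described above can be sketched on a toy loop tree. This is a simplified illustration under stated assumptions, not the algorithm in graphite-scop-detection.c: the Loop type, the own_stmts_ok flag, and both functions are hypothetical, and the sketch does not model extending a scop across sibling statements (stmt_2 in the example).

```cpp
#include <string>
#include <vector>

// Hypothetical toy model of a loop nest.
struct Loop {
  std::string name;
  bool own_stmts_ok;           // can this loop's own statements be represented?
  std::vector<Loop> children;  // nested loops (vector of incomplete type: allowed since C++17)
};

// True when every statement in the loop and in all of its sub-loops
// can be represented.
inline bool representable(const Loop& l) {
  if (!l.own_stmts_ok) return false;
  for (const Loop& c : l.children)
    if (!representable(c)) return false;
  return true;
}

// Top-down detection: accept the outermost loop that validates in one pass;
// on failure, go one level down and try each immediate sub-loop.
inline void detect_top_down(const Loop& l, std::vector<std::string>& out) {
  if (representable(l)) {
    out.push_back(l.name);
    return;
  }
  for (const Loop& c : l.children) detect_top_down(c, out);
}
```

For the example in the message (loop_1 containing a valid loop_2 and an invalid loop_3), this sketch rejects loop_1 and reports loop_2 only. When nothing fails, the outer loop is accepted after a single pass over its body, which is the case where top-down is cheaper.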

> I think we can

Re: [C PATCH] Fix flags on compound literal VAR_DECLs (PR c/82340)

2017-09-28 Thread Joseph Myers

On Thu, 28 Sep 2017, Jakub Jelinek wrote:

> Hi!
> 
> As the testcase shows, while build_compound_literal had some code to set up
> TREE_READONLY flag on compound literal VAR_DECLs, it didn't handle volatile
> nor restrict quals.
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com

C++ PATCH for c++/56973 (DR 696), lambda capture of const variables

2017-09-28 Thread Jason Merrill

The G++ lambda implementation previously implemented an early
tentative resolution of DR 696, whereby mentions of an outer constant
variable would immediately decay to the constant value of that
variable.  But the final resolution specified that we should capture
or not depending on how the variable is used: if we use it as an
lvalue, it's captured; if we use it as an rvalue, it isn't.
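
A minimal sketch of the distinction, using hypothetical function names: in the first lambda the constant is only read for its value, in the second its address is taken, which is an lvalue use. Both forms are expected to compile and give the same values whether or not a capture is actually created; only the closure layout differs.

```cpp
// Rvalue use: n is only read for its constant value, so under the final
// DR 696 resolution it does not need to be captured.
inline int rvalue_use() {
  const int n = 42;
  auto f = [=] { return n + 1; };
  return f();
}

// Lvalue use: taking the address odr-uses n, so it is captured (by copy here)
// and p points at the closure's copy.
inline int lvalue_use() {
  const int n = 42;
  auto g = [=] { const int* p = &n; return *p; };
  return g();
}
```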

The first patch is some minor fixes discovered during this work.
The second patch reworks how we find capture proxies to use
local_specializations instead of name lookup.
The third patch delays capture of constant variables until
mark_rvalue_use/mark_lvalue_use.

The third patch also adds calls to mark_*_use in a couple of places
that needed it; I expect more will be necessary as well.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit ea50e0c044c8405aa7bc277628e38f838f1e66a7
Author: Jason Merrill 
Date:   Wed Sep 27 17:06:31 2017 -0400

Small lambda fixes.

* call.c (build_special_member_call): Use the return value of
mark_lvalue_use.
* decl.c (compute_array_index_type): Likewise.
* parser.c (cp_parser_oacc_wait_list): Likewise.
* lambda.c (is_normal_capture_proxy): Handle *this capture.
(add_capture): Clarify internal_error message.

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index e83cf99dc89..99a7b77efb2 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -8845,7 +8845,7 @@ build_special_member_call (tree instance, tree name, vec **args,
  && (flags & LOOKUP_DELEGATING_CONS))
check_self_delegation (arg);
  /* Avoid change of behavior on Wunused-var-2.C.  */
- mark_lvalue_use (instance);
+ instance = mark_lvalue_use (instance);
  return build2 (INIT_EXPR, class_type, instance, arg);
}
 }
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 50fa1ba402e..ce45c1140d6 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -9329,7 +9329,7 @@ compute_array_index_type (tree name, tree size, tsubst_flags_t complain)
 {
   tree type = TREE_TYPE (size);
 
-  mark_rvalue_use (size);
+  size = mark_rvalue_use (size);
 
   if (cxx_dialect < cxx11 && TREE_CODE (size) == NOP_EXPR
  && TREE_SIDE_EFFECTS (size))
diff --git a/gcc/cp/lambda.c b/gcc/cp/lambda.c
index e4412569a61..695666abbe3 100644
--- a/gcc/cp/lambda.c
+++ b/gcc/cp/lambda.c
@@ -283,6 +283,8 @@ is_normal_capture_proxy (tree decl)
   if (val == error_mark_node)
 return true;
 
+  if (TREE_CODE (val) == ADDR_EXPR)
+val = TREE_OPERAND (val, 0);
   gcc_assert (TREE_CODE (val) == COMPONENT_REF);
   val = TREE_OPERAND (val, 1);
   return DECL_NORMAL_CAPTURE_P (val);
@@ -592,7 +594,8 @@ add_capture (tree lambda, tree id, tree orig_init, bool by_reference_p,
   && current_class_type == LAMBDA_EXPR_CLOSURE (lambda))
 {
   if (COMPLETE_TYPE_P (current_class_type))
-   internal_error ("trying to capture %qD after closure is complete", id);
+   internal_error ("trying to capture %qD in instantiation of "
+   "generic lambda", id);
   finish_member_declaration (member);
 }
 
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index f9b6f278afb..bb2a8774aa0 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -31712,7 +31712,7 @@ cp_parser_oacc_wait_list (cp_parser *parser, location_t clause_loc, tree list)
{
  tree c = build_omp_clause (clause_loc, OMP_CLAUSE_WAIT);
 
- mark_rvalue_use (targ);
+ targ = mark_rvalue_use (targ);
  OMP_CLAUSE_DECL (c) = targ;
  OMP_CLAUSE_CHAIN (c) = list;
  list = c;
commit ea60e04977765a8041e1cba59a8d3028982c55e0
Author: Jason Merrill 
Date:   Wed Sep 27 17:08:56 2017 -0400

Use local_specializations to find capture proxies.

* cp-tree.h (DECL_CAPTURED_VARIABLE): New.
* lambda.c (build_capture_proxy): Set it.
(add_capture): Pass initializer to build_capture_proxy.
(start_lambda_function): Likewise.
(insert_capture_proxy): Use register_local_specialization.
(is_lambda_ignored_entity): Always ignore proxies.
* name-lookup.c (qualify_lookup): Don't check
is_lambda_ignored_entity if LOOKUP_HIDDEN is set.
* semantics.c (process_outer_var_ref): Use
retrieve_local_specialization.
* parser.c (cp_parser_lambda_body): Push local_specializations.
* pt.c (tsubst_expr): Pass LOOKUP_HIDDEN when looking for a proxy.
(tsubst_lambda_expr): Push local_specializations sooner.
(tsubst_copy_and_build): Don't register_local_specialization.

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 7c1c54c78b5..a6349019543 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -2471,10 +2471,12 @@ struct GTY(()) lang_decl_min {
   union

Re: [PATCH][GRAPHITE] More TLC

2017-09-28 Thread Sebastian Pop

On Wed, Sep 27, 2017 at 9:33 AM, Richard Biener  wrote:
> Looks like even when hacking the Fortran FE to produce nested
> ARRAY_REFs we run into the same issue for
>
> (gdb) p debug_data_reference (dr)
> #(Data Ref:
> #  bb: 17
> #  stmt:
> VIEW_CONVERT_EXPR(*y_117(D))[_24]{lb:
> 1 sz: _20 * 8}[_26]{lb: 1 sz: _21 * 8}[_28]{lb: 1 sz: _22 * 8}[_29]{lb: 1
> sz: 8} = 0.0;
> #  ref:
> VIEW_CONVERT_EXPR(*y_117(D))[_24]{lb:
> 1 sz: _20 * 8}[_26]{lb: 1 sz: _21 * 8}[_28]{lb: 1 sz: _22 * 8}[_29]{lb: 1
> sz: 8};
> #  base_object:
> VIEW_CONVERT_EXPR(*y_117(D));
> #  Access function 0: {1, +, 1}_4
> #  Access function 1: (integer(kind=8)) {(unsigned long) stride.88_92, +,
> (unsigned long) stride.88_92}_3;
> #  Access function 2: (integer(kind=8)) {(unsigned long) stride.90_96, +,
> (unsigned long) stride.90_96}_2;
> #  Access function 3: (integer(kind=8)) {(unsigned long) stride.92_100, +,
> (unsigned long) stride.92_100}_1;
>
> so it looks like simple strided (where stride is a parameter) access
> is not handled either.

Yes, this is the first option I was mentioning: it could work,
could you please make sure that you don't have a bug in the "hack patch"
where the outer dimension should not contain the parameter
(inner array dimension) times the access function.

Example in C:
int A[100][N];
A[i][j] is linearized as *(A + i * N * 4 + j * 4)
and you may have a bug if you delinearized it in the Fortran FE as A[i * N][j]
Could you please check that it would delinearize back to A[i][j]?
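
The round trip asked for above can be sketched numerically. The names Index2D, linearize and delinearize are hypothetical, offsets are counted in elements rather than bytes, and the sketch assumes 0 <= j < n. It only checks the arithmetic; it is not the symbolic delinearization a compiler pass would perform.

```cpp
#include <cstddef>

struct Index2D {
  std::size_t i;
  std::size_t j;
};

// A[i][j] in an array with inner dimension n lives at element offset i * n + j.
inline std::size_t linearize(std::size_t i, std::size_t j, std::size_t n) {
  return i * n + j;
}

// Recover (i, j) from the element offset, assuming j < n.
inline Index2D delinearize(std::size_t offset, std::size_t n) {
  Index2D idx = {offset / n, offset % n};
  return idx;
}
```

Writing the access as A[i * N][j] instead would fold the inner dimension into the outer subscript, and the recovered outer index would then be i * N rather than i.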

>
> GCCs dependence analysis can at least compute distances of two
> DRs when the difference of the access CHRECs is constant.  Within
> the polyhedral model those cases cannot be handled?

The difficulty for the polyhedral model is in the representation
of a multiplication of parameter times loop index variable.
The delinearization removes these difficulties by creating linear expressions.

Think about multiplication as something introducing exponentiality
and you realize that any such expression would not fit in the
linear model of polyhedra.
A parameter is nothing other than an outer loop index whose loop level
we cannot access, as that loop may be outside the current function
in which we receive the parameter.

Sebastian

Re: [PATCH] Enhance PHI processing in VN

2017-09-28 Thread Richard Sandiford

David Edelsohn  writes:
> On Wed, Sep 27, 2017 at 6:58 PM, Richard Sandiford
>  wrote:
>> David Edelsohn  writes:
>>> On Fri, Sep 15, 2017 at 2:53 AM, Richard Biener  wrote:
>>>> On Thu, 14 Sep 2017, David Edelsohn wrote:
>>>>
>>>>> * tree-ssa-sccvn.c (visit_phi): Merge undefined values similar
>>>>> to VN_TOP.
>>>>>
>>>>> This seems to have regressed
>>>>>
>>>>> FAIL: gcc.dg/tree-prof/time-profiler-2.c scan-ipa-dump-times profile
>>>>> "Read tp_first_run: 0" 2
>>>>> FAIL: gcc.dg/tree-prof/time-profiler-2.c scan-ipa-dump-times profile
>>>>> "Read tp_first_run: 2" 1
>>>>> FAIL: gcc.dg/tree-prof/time-profiler-2.c scan-ipa-dump-times profile
>>>>> "Read tp_first_run: 3" 1
>>>>
>>>> Hmm, I don't see these FAILs.  Looking at the testcase there are
>>>> no undefined uses so I wonder how the patch could have any effect.
>>>>
>>>> Can you re-check and open a bugreport?
>>>
>>> It disappeared again.  A different failure appeared and disappeared a
>>> few weeks ago.  Something in the testsuite infrastructure appears to
>>> not be stable, at least on AIX.  Sorry for the incorrect report.
>>
>> Perhaps this is unrelated, but when doing the "has this patch
>> changed assembly on these targets?" testing, I noticed that AIX
>> had differences like:
>>
>> --- old/powerpc-ibm-aix7.0/test/-O3/g++.dg/init/constant1.s
>> +++ new/powerpc-ibm-aix7.0/test/-O3/g++.dg/init/constant1.s
>> @@ -4,21 +4,21 @@
>> .csect ..text.startup[PR],2
>> .align 2
>> .align 4
>> - .globl _GLOBAL__I_65535_0_.._.._.._testsuite_g__.dg_._init_constant1.C__0xbb0d20e181e3a401
>> - .globl ._GLOBAL__I_65535_0_.._.._.._testsuite_g__.dg_._init_constant1.C__0xbb0d20e181e3a401
>> - .csect _GLOBAL__I_65535_0_.._.._.._testsuite_g__.dg_._init_constant1.C__0xbb0d20e181e3a401[DS]
>> -_GLOBAL__I_65535_0_.._.._.._testsuite_g__.dg_._init_constant1.C__0xbb0d20e181e3a401:
>> - .long ._GLOBAL__I_65535_0_.._.._.._testsuite_g__.dg_._init_constant1.C__0xbb0d20e181e3a401, TOC[tc0], 0
>> + .globl _GLOBAL__I_65535_0_.._.._.._testsuite_g__.dg_._init_constant1.C__0x5610f7ec143966c9
>> + .globl ._GLOBAL__I_65535_0_.._.._.._testsuite_g__.dg_._init_constant1.C__0x5610f7ec143966c9
>> + .csect _GLOBAL__I_65535_0_.._.._.._testsuite_g__.dg_._init_constant1.C__0x5610f7ec143966c9[DS]
>> +_GLOBAL__I_65535_0_.._.._.._testsuite_g__.dg_._init_constant1.C__0x5610f7ec143966c9:
>> + .long ._GLOBAL__I_65535_0_.._.._.._testsuite_g__.dg_._init_constant1.C__0x5610f7ec143966c9, TOC[tc0], 0
>>
>> even though -frandom-seed is forced to the same value for both runs.
>>
>> Is this behaviour deliberate?  I thought runs from a few weeks ago
>> had stable names, but maybe I just misremember.
>
> AIX does not have init/fini sections, so it is built at link time as a
> C file by collect2-ld.  I suspect that collect2-ld doesn't use
> -frandom-seed to build its temporary file.

This is just comparing -S output, so the difference seems to be in
the compiler itself.

Thanks,
Richard

Re: [PATCH][GRAPHITE] More TLC

2017-09-28 Thread Sebastian Pop

On Wed, Sep 27, 2017 at 8:04 AM, Richard Biener  wrote:
>
> Another thing I notice is that we don't handle the multi-dimensional
> accesses the fortran frontend produces:
>
> (gdb) p debug_data_reference (dr)
> #(Data Ref:
> #  bb: 18
> #  stmt: _43 = *a_141(D)[_42];
> #  ref: *a_141(D)[_42];
> #  base_object: *a_141(D);
> #  Access function 0: {{(_38 + stride.88_115) + 1, +, 1}_4, +,
> stride.88_115}_5
>
> ultimately we fail here because we try to build a constraint for
>
> {{(_38 + stride.88_115) + 1, +, 1}_4, +, stride.88_115}_5
>
> which ends up computing isl_pw_aff_mul (A, stride.88_115) with
> A being the non-constant constraint generated for
> {(_38 + stride.88_115) + 1, +, 1}_4 and stride.88_115 being
> a parameter.  ISL doesn't like that multiplication as the result
> isn't affine (well - it is, we just have parameters in there).
>
> I suppose ISL doesn't handle this form of accesses given the
> two "dimensions" in this scalarized form may overlap?  So we'd
> really need to turn those into references with different access
> functions (even if that's not 100% a valid semantic transformation
> as scalarization isn't reversible without extra information)?

You are right.
This multivariate memory access would be better handled in
delinearized form:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66981
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61000
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=14741

There are two ways to handle this issue:
- fix the FORTRAN front-end to emit multi-dimensional ARRAY_REFs,
- implement an array delinearization pass, as I implemented in LLVM
http://llvm.org/doxygen/Delinearization_8cpp_source.html
"On Recovering Multi-Dimensional Arrays in Polly"
http://impact.gforge.inria.fr/impact2015/papers/impact2015-grosser.pdf
"Optimistic Delinearization of Parametrically Sized Arrays"
https://dl.acm.org/citation.cfm?id=2751248
LLVM does not have an equivalent of the multi-dimensional ARRAY_REF description;
it only reasons about linearized memory accesses, as in GCC's RTL
(gep = Get Element Pointer), so we had no other option than to delinearize.

Sebastian

Re: [PATCH] PR libstdc++/81469 deprecate std::uncaught_exception for C++17

2017-09-28 Thread Jakub Jelinek

Hi!

On Wed, Sep 20, 2017 at 05:35:26PM +0100, Jonathan Wakely wrote:
> C++17 deprecates uncaught_exception in favour of uncaught_exceptions,
> so this adds the attribute.
> 
>   PR libstdc++/81469
>   * libsupc++/exception (uncaught_exception): Deprecate for C++17.
>   * testsuite/18_support/exception_ptr/62258.cc: Add -Wno-deprecated.
>   * testsuite/18_support/uncaught_exception/14026.cc: Likewise.
> 
> Tested powerpc64le-linux, committed to trunk.

This broke a couple of tests with make check-c++-all, the following patch
should fix that.

Regtested on x86_64-linux and i686-linux, including make check-c++-all, ok
for trunk?

2017-09-28  Jakub Jelinek  

* g++.dg/eh/uncaught1.C: Add -Wno-deprecated for c++17.
* g++.dg/eh/uncaught2.C: Likewise.
* g++.dg/eh/uncaught3.C: Likewise.
* g++.dg/eh/uncaught4.C: Likewise.
* g++.old-deja/g++.mike/eh48.C: Likewise.

--- gcc/testsuite/g++.dg/eh/uncaught1.C.jj  2014-01-28 14:03:44.0 +0100
+++ gcc/testsuite/g++.dg/eh/uncaught1.C 2017-09-28 14:33:08.758343406 +0200
@@ -1,6 +1,7 @@
 // PR libstdc++/10606
 // { dg-do run }
 // { dg-options "-fuse-cxa-get-exception-ptr" { target powerpc*-*-darwin* } }
+// { dg-additional-options "-Wno-deprecated" { target c++17 } }
 
 #include 
 #include 
--- gcc/testsuite/g++.dg/eh/uncaught2.C.jj  2008-09-05 12:55:05.0 +0200
+++ gcc/testsuite/g++.dg/eh/uncaught2.C 2017-09-28 14:33:16.761250186 +0200
@@ -1,6 +1,7 @@
 // { dg-do compile }
 // { dg-final { scan-assembler-not "__cxa_get_exception" } }
 // { dg-options "-fno-use-cxa-get-exception-ptr" }
+// { dg-additional-options "-Wno-deprecated" { target c++17 } }
 
 #include 
 #include 
--- gcc/testsuite/g++.dg/eh/uncaught3.C.jj  2008-09-05 12:55:05.0 +0200
+++ gcc/testsuite/g++.dg/eh/uncaught3.C 2017-09-28 14:33:23.180175417 +0200
@@ -1,6 +1,7 @@
 // { dg-do compile { target powerpc*-*-darwin* } }
 // { dg-final { scan-assembler-not "__cxa_get_exception" } }
 // { dg-options "-mmacosx-version-min=10.4" }
+// { dg-additional-options "-Wno-deprecated" { target c++17 } }
 
 #include 
 #include 
--- gcc/testsuite/g++.dg/eh/uncaught4.C.jj  2014-01-28 14:03:44.0 +0100
+++ gcc/testsuite/g++.dg/eh/uncaught4.C 2017-09-28 14:33:29.811098178 +0200
@@ -1,5 +1,6 @@
 // PR c++/41174
 // { dg-do run }
+// { dg-additional-options "-Wno-deprecated" { target c++17 } }
 
 #include 
 
--- gcc/testsuite/g++.old-deja/g++.mike/eh48.C.jj   2008-09-05 12:54:56.0 +0200
+++ gcc/testsuite/g++.old-deja/g++.mike/eh48.C  2017-09-28 14:34:09.792632463 +0200
@@ -1,5 +1,6 @@
 // { dg-do run { xfail sparc64-*-elf arm-*-pe } }
 // { dg-options "-fexceptions" }
+// { dg-additional-options "-Wno-deprecated" { target c++17 } }
 
 #include 
 #include 


Jakub
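
For context on the deprecation these tests now silence, here is a hedged sketch of how code can move to the C++17 replacement. The names in_flight_exceptions, UnwindProbe and destroyed_during_unwinding are hypothetical. The sketch assumes __cplusplus reports the language level accurately and falls back to the older call before C++17.

```cpp
#include <exception>

// Number of exceptions currently in flight. Before C++17 only a yes/no
// answer is available, so the fallback reports 0 or 1.
inline int in_flight_exceptions() {
#if __cplusplus >= 201703L
  return std::uncaught_exceptions();
#else
  return std::uncaught_exception() ? 1 : 0;
#endif
}

// Records at construction how many exceptions are in flight and, in its
// destructor, reports whether it is being destroyed by stack unwinding.
struct UnwindProbe {
  int count_at_entry;
  bool* unwinding;
  explicit UnwindProbe(bool* flag)
      : count_at_entry(in_flight_exceptions()), unwinding(flag) {}
  ~UnwindProbe() { *unwinding = in_flight_exceptions() > count_at_entry; }
};

// Runs a scope containing a probe, optionally leaving it via an exception.
inline bool destroyed_during_unwinding(bool do_throw) {
  bool flag = false;
  try {
    UnwindProbe probe(&flag);
    if (do_throw) throw 1;
  } catch (int) {
  }
  return flag;
}
```

Comparing counts rather than testing a boolean is what lets the C++17 form stay correct when the probe is itself created while another exception is already being handled.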

[C++ PATCH] Let make check-c++-all test also c++2a

2017-09-28 Thread Jakub Jelinek

Hi!

On Wed, Sep 27, 2017 at 07:55:20AM -0700, Nathan Sidwell wrote:
> I'm sorry for my tardiness.  I think the patch would be better broken
> apart:
>   1) fix the parsing bug you found and move to (ab)using
> DECL_BIT_FIELD_REPRESENTATIVE
> 
>   2) the new c++2a feature
> 
> Is that feasible?

Here is the third patch, which enables testing c++2a in make check-c++-all.
Regtested, including with make check-c++-all, on x86_64-linux and
i686-linux, ok for trunk?

2017-09-28  Jakub Jelinek  

* Make-lang.in (check-c++-all): Test also c++2a.

--- gcc/cp/Make-lang.in.jj  2017-09-18 20:48:53.592872957 +0200
+++ gcc/cp/Make-lang.in 2017-09-19 09:51:21.918612788 +0200
@@ -176,7 +176,7 @@ check-c++17:
 
 # Run the testsuite in all standard conformance levels.
 check-c++-all:
-   $(MAKE) RUNTESTFLAGS="$(RUNTESTFLAGS) --stds=98,11,14,17,concepts" check-g++
+   $(MAKE) RUNTESTFLAGS="$(RUNTESTFLAGS) --stds=98,11,14,17,2a,concepts" check-g++
 
 # Run the testsuite with garbage collection at every opportunity.
 check-g++-strict-gc:


Jakub

[C++ PATCH] Stash bitfield width into DECL_BIT_FIELD_REPRESENTATIVE instead of DECL_INITIAL

2017-09-28 Thread Jakub Jelinek

Hi!

On Wed, Sep 27, 2017 at 07:55:20AM -0700, Nathan Sidwell wrote:
>   1) fix the parsing bug you found and move to (ab)using
> DECL_BIT_FIELD_REPRESENTATIVE

The following patch is the D_B_F_R part of the above.
Bootstrapped/regtested on top of the patch I've just posted, ok for trunk?

2017-09-28  Jakub Jelinek  

c-family/
* c-attribs.c (handle_packed_attribute): Test DECL_C_BIT_FIELD
rather than DECL_INITIAL.
(common_handle_aligned_attribute): Likewise.
c/
* c-decl.c (grokfield): Use SET_DECL_C_BIT_FIELD here if
width is non-NULL.
(finish_struct): Test DECL_C_BIT_FIELD instead of DECL_INITIAL,
don't SET_DECL_C_BIT_FIELD here.
cp/
* class.c (check_bitfield_decl): Retrieve and clear width from
DECL_BIT_FIELD_REPRESENTATIVE rather than DECL_INITIAL.
(check_field_decls): Test DECL_BIT_FIELD_REPRESENTATIVE rather than
DECL_INITIAL.
(remove_zero_width_bit_fields): Adjust comment.
* decl2.c (grokbitfield): Stash width into
DECL_BIT_FIELD_REPRESENTATIVE rather than DECL_INITIAL.
* pt.c (tsubst_decl): For DECL_C_BIT_FIELD, tsubst_expr
DECL_BIT_FIELD_REPRESENTATIVE rather than DECL_INITIAL for width.
objc/
* objc-act.c (check_ivars, gen_declaration): For OBJCPLUS look at
DECL_BIT_FIELD_REPRESENTATIVE rather than DECL_INITIAL.

--- gcc/c-family/c-attribs.c.jj 2017-09-18 20:48:53.731871226 +0200
+++ gcc/c-family/c-attribs.c    2017-09-19 09:51:21.928612658 +0200
@@ -426,7 +426,7 @@ handle_packed_attribute (tree *node, tre
 {
   if (TYPE_ALIGN (TREE_TYPE (*node)) <= BITS_PER_UNIT
  /* Still pack bitfields.  */
- && ! DECL_INITIAL (*node))
+ && ! DECL_C_BIT_FIELD (*node))
warning (OPT_Wattributes,
 "%qE attribute ignored for field of type %qT",
 name, TREE_TYPE (*node));
@@ -1773,7 +1773,7 @@ common_handle_aligned_attribute (tree *n
 {
   if (warn_if_not_aligned_p)
{
- if (TREE_CODE (decl) == FIELD_DECL && !DECL_INITIAL (decl))
+ if (TREE_CODE (decl) == FIELD_DECL && !DECL_C_BIT_FIELD (decl))
{
  SET_DECL_WARN_IF_NOT_ALIGN (decl, (1U << i) * BITS_PER_UNIT);
  warn_if_not_aligned_p = false;
--- gcc/c/c-decl.c.jj   2017-09-12 21:57:59.0 +0200
+++ gcc/c/c-decl.c  2017-09-19 10:43:30.898898784 +0200
@@ -7602,6 +7602,8 @@ grokfield (location_t loc,
 
   finish_decl (value, loc, NULL_TREE, NULL_TREE, NULL_TREE);
   DECL_INITIAL (value) = width;
+  if (width)
+SET_DECL_C_BIT_FIELD (value);
 
   if (warn_cxx_compat && DECL_NAME (value) != NULL_TREE)
 {
@@ -7946,12 +7948,11 @@ finish_struct (location_t loc, tree t, t
   if (C_DECL_VARIABLE_SIZE (x))
C_TYPE_VARIABLE_SIZE (t) = 1;
 
-  if (DECL_INITIAL (x))
+  if (DECL_C_BIT_FIELD (x))
{
  unsigned HOST_WIDE_INT width = tree_to_uhwi (DECL_INITIAL (x));
  DECL_SIZE (x) = bitsize_int (width);
  DECL_BIT_FIELD (x) = 1;
- SET_DECL_C_BIT_FIELD (x);
}
 
   if (TYPE_PACKED (t)
--- gcc/cp/class.c.jj   2017-09-18 20:48:53.509873991 +0200
+++ gcc/cp/class.c  2017-09-19 10:31:35.435961690 +0200
@@ -3231,12 +3231,12 @@ check_bitfield_decl (tree field)
   tree w;
 
   /* Extract the declared width of the bitfield, which has been
- temporarily stashed in DECL_INITIAL.  */
-  w = DECL_INITIAL (field);
+ temporarily stashed in DECL_BIT_FIELD_REPRESENTATIVE.  */
+  w = DECL_BIT_FIELD_REPRESENTATIVE (field);
   gcc_assert (w != NULL_TREE);
   /* Remove the bit-field width indicator so that the rest of the
- compiler does not treat that value as an initializer.  */
-  DECL_INITIAL (field) = NULL_TREE;
+ compiler does not treat that value as a qualifier.  */
+  DECL_BIT_FIELD_REPRESENTATIVE (field) = NULL_TREE;
 
   /* Detect invalid bit-field type.  */
   if (!INTEGRAL_OR_ENUMERATION_TYPE_P (type))
@@ -3571,7 +3571,8 @@ check_field_decls (tree t, tree *access_
DECL_PACKED (x) = 1;
}
 
-  if (DECL_C_BIT_FIELD (x) && integer_zerop (DECL_INITIAL (x)))
+  if (DECL_C_BIT_FIELD (x)
+ && integer_zerop (DECL_BIT_FIELD_REPRESENTATIVE (x)))
/* We don't treat zero-width bitfields as making a class
   non-empty.  */
;
@@ -5268,9 +5269,9 @@ remove_zero_width_bit_fields (tree t)
 {
   if (TREE_CODE (*fieldsp) == FIELD_DECL
  && DECL_C_BIT_FIELD (*fieldsp)
-  /* We should not be confused by the fact that grokbitfield
+ /* We should not be confused by the fact that grokbitfield
 temporarily sets the width of the bit field into
-DECL_INITIAL (*fieldsp).
+DECL_BIT_FIELD_REPRESENTATIVE (*fieldsp).
 check_bitfield_decl eventually sets DECL_SIZE (*fieldsp)
 to that width.  */
  && (DECL_SIZE (*fieldsp) == NULL_TREE
--- gcc/cp/decl2.c.jj

[C++ PATCH] Fix attribute parsing for bitfields

2017-09-28 Thread Jakub Jelinek

On Wed, Sep 27, 2017 at 07:55:20AM -0700, Nathan Sidwell wrote:
> Jakub,
> > The following patch implements P0386R1 - NSDMIs for bit-fields.
> > While working on that, I've discovered our parser mishandles attributes
> > on bitfields, already C++11 says:
> > identifier[opt] attribute-specifier-seq[opt] : constant-expression
> > in the grammar, but we actually parsed
> > identifier[opt] : constant-expression attribute-specifier-seq[opt]
> 
> I'm sorry for my tardiness.  I think the patch would be better broken
> apart:
>   1) fix the parsing bug you found and move to (ab)using
> DECL_BIT_FIELD_REPRESENTATIVE
> 
>   2) the new c++2a feature

I don't see the first half of 1) related to second half thereof, the
DECL_BIT_FIELD_REPRESENTATIVE is unrelated to the parsing bug, and a needed
precondition of 2).

Therefore, I'm going to split the patch into 4 patches.  The first one,
attached below, fixes the parsing bug (it makes sure we allow attributes
where the standard allows them) and includes a pedwarn.  Note we already
error for the old spot when using C++11 [[ ]] attributes there, like
int a : 8 [[gnu::packed]];
because when parsing the width expression we see [ after 8 and so it could
be say 8[var] or similar, but we error out because [[ must not appear in that
spot.  But alignas there didn't result in an error, and neither did
__attribute__.
The other patches will follow: the second patch will be
DECL_BIT_FIELD_REPRESENTATIVE, the third one the Make-lang.in tweak, and the
fourth one, to be rewritten tomorrow on top of the rest, will be the new
c++2a feature.

The following has been bootstrapped/regtested on x86_64-linux and i686-linux
(together with the Make-lang.in patch, but without the second and fourth
patch), ok for trunk?

As I said on IRC, I hope [[/__attribute__/alignas early is rare enough that
the tentative parsing shouldn't be a big deal.  If it is, we could add some
cheaper function that allows us to skip over attributes (returning a peek
offset after the attributes given a starting peek offset).
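For illustration only, here is a toy sketch of what such a "skip over
attributes" helper could look like.  It works on a plain array of made-up
token kinds rather than on GCC's lexer, all toy_* names are hypothetical, and
it only understands [[ ... ]] specifiers without nested brackets, so it is a
model of the idea and not a proposal for parser.c:

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy token kinds; GCC's real lexer uses cpp_ttype values instead.  */
enum toy_token
{
  TOY_NAME,
  TOY_OPEN_SQUARE,
  TOY_CLOSE_SQUARE,
  TOY_COLON,
  TOY_NUMBER,
  TOY_SEMICOLON,
  TOY_EOF
};

/* Return the offset of the first token after any run of [[ ... ]]
   attribute specifiers starting at START.  If no well-formed specifier
   starts there, return START unchanged.  Nested brackets inside a
   specifier are not handled in this sketch.  */
size_t
toy_skip_attributes (const enum toy_token *toks, size_t n, size_t start)
{
  size_t pos = start;

  while (pos + 1 < n
         && toks[pos] == TOY_OPEN_SQUARE
         && toks[pos + 1] == TOY_OPEN_SQUARE)
    {
      size_t i = pos + 2;

      /* Scan forward to the closing ]] of this specifier.  */
      while (i + 1 < n
             && !(toks[i] == TOY_CLOSE_SQUARE
                  && toks[i + 1] == TOY_CLOSE_SQUARE))
        {
          if (toks[i] == TOY_EOF)
            return start;
          i++;
        }
      if (i + 1 >= n)
        return start;   /* Unterminated specifier: give up.  */
      pos = i + 2;
    }
  return pos;
}

/* Decide whether the declarator at START has the shape
   identifier[opt] attribute-specifier-seq[opt] ':' ...  */
bool
toy_looks_like_bitfield (const enum toy_token *toks, size_t n, size_t start)
{
  size_t pos = start;

  if (pos < n && toks[pos] == TOY_NAME)
    pos++;
  pos = toy_skip_attributes (toks, n, pos);
  return pos < n && toks[pos] == TOY_COLON;
}
```

The point of the sketch is that nothing is consumed: the caller gets back an
offset and can decide what to parse without entering tentative mode.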

2017-09-28  Jakub Jelinek  

cp/
* parser.c (cp_parser_member_declaration): Parse attributes before
colon of a bitfield in addition to after colon.
testsuite/
* g++.dg/ext/bitfield7.C: New test.
* g++.dg/ext/bitfield8.C: New test.
* g++.dg/ext/bitfield9.C: New test.

--- gcc/cp/parser.c.jj  2017-09-22 20:51:48.181537880 +0200
+++ gcc/cp/parser.c 2017-09-27 17:50:15.082792676 +0200
@@ -23443,35 +23443,64 @@ cp_parser_member_declaration (cp_parser*
{
  tree attributes = NULL_TREE;
  tree first_attribute;
+ bool is_bitfld = false;
+ bool named_bitfld = false;

  /* Peek at the next token.  */
  token = cp_lexer_peek_token (parser->lexer);

+ if (cp_next_tokens_can_be_attribute_p (parser)
+ || (token->type == CPP_NAME
+ && cp_nth_tokens_can_be_attribute_p (parser, 2)
+ && (named_bitfld = true)))
+   {
+ cp_parser_parse_tentatively (parser);
+ if (named_bitfld)
+   cp_lexer_consume_token (parser->lexer);
+ cp_parser_attributes_opt (parser);
+ token = cp_lexer_peek_token (parser->lexer);
+ is_bitfld = cp_lexer_next_token_is (parser->lexer, CPP_COLON);
+ cp_parser_abort_tentative_parse (parser);
+   }
+
  /* Check for a bitfield declaration.  */
- if (token->type == CPP_COLON
+ if (is_bitfld
+ || token->type == CPP_COLON
  || (token->type == CPP_NAME
- && cp_lexer_peek_nth_token (parser->lexer, 2)->type
- == CPP_COLON))
+ && cp_lexer_nth_token_is (parser->lexer, 2, CPP_COLON)
+ && (named_bitfld = true)))
{
  tree identifier;
  tree width;
+ tree late_attributes = NULL_TREE;

- /* Get the name of the bitfield.  Note that we cannot just
-check TOKEN here because it may have been invalidated by
-the call to cp_lexer_peek_nth_token above.  */
- if (cp_lexer_peek_token (parser->lexer)->type != CPP_COLON)
+ if (named_bitfld)
identifier = cp_parser_identifier (parser);
  else
identifier = NULL_TREE;

+ /* Look for attributes that apply to the bitfield.  */
+ attributes = cp_parser_attributes_opt (parser);
+
  /* Consume the `:' token.  */
  cp_lexer_consume_token (parser->lexer);
+
  /* Get the width of the bitfield.  */
- width
-   = cp_parser_constant_expression (parser);
+ width = cp_parser_constant_expression (parser);
+
+ /* Look for attributes that apply to the bitfield after
+the `:' token and width.  This is where GCC used to
+parse attributes in the past, pedwarn if there is
+

Re: [PATCH][GRAPHITE] More TLC

2017-09-28 Thread Sebastian Pop

On Wed, Sep 27, 2017 at 7:18 AM, Richard Biener  wrote:

> On Tue, 26 Sep 2017, Sebastian Pop wrote:
>
> > On Mon, Sep 25, 2017 at 8:12 AM, Richard Biener wrote:
> >
> > > On Fri, 22 Sep 2017, Sebastian Pop wrote:
> > >
> > > > On Fri, Sep 22, 2017 at 8:03 AM, Richard Biener wrote:
> > > >
> > > > >
> > > > > This simplifies canonicalize_loop_closed_ssa and does other minimal
> > > > > TLC.  It also adds a testcase I reduced from a stupid mistake I made
> > > > > when reworking canonicalize_loop_closed_ssa.
> > > > >
> > > > > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
> > > > >
> > > > > SPEC CPU 2006 is happy with it, current statistics on x86_64 with
> > > > > -Ofast -march=haswell -floop-nest-optimize are
> > > > >
> > > > >  61 loop nests "optimized"
> > > > >  45 loop nest transforms cancelled because of code generation issues
> > > > >  21 loop nest optimizations timed out the 35 ISL "operations" we allow
> > > > >
> > > > > I say "optimized" because the usual transform I've seen is static tiling
> > > > > as enforced by GRAPHITE according to --param loop-block-tile-size.
> > > > > There's no way to automagically figure what kind of transform ISL did
> > > > >
> > > >
> > > > Here is how to automate (without magic) the detection
> > > > of the transform that isl did.
> > > >
> > > > The problem solved by isl is the minimization of strides
> > > > in memory, and to do this, we need to tell the isl scheduler
> > > > the validity dependence graph, in graphite-optimize-isl.c
> > > > see the validity (RAW, WAR, WAW) and the proximity
> > > > (RAR + validity) maps.  The proximity does include the
> > > > read after read, as the isl scheduler needs to minimize
> > > > strides between consecutive reads.
>
> Ah, so I now see why we do not perform interchange on trivial cases like
>
> double A[1024][1024], B[1024][1024];
>
> void foo(void)
> {
>   for (int i = 0; i < 1024; ++i)
>     for (int j = 0; j < 1024; ++j)
>       A[j][i] = B[j][i];
> }
>
> which is probably because
>
>   /* FIXME: proximity should not be validity.  */
>   isl_union_map *proximity = isl_union_map_copy (validity);
>
> falls apart when there is _no_ dependence?
>

You are right.  The proximity needs to account for spatial
locality as well if you want to interchange the loop.
To describe the spatial locality, I would recommend adding
to the proximity relation the array accesses from two
successive iterations of the innermost loop:
A[j][i] -> A[j][i+1] and B[j][i] -> B[j][i+1]
With these two extra relations in the proximity map,
isl should be able to interchange the above loop.
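To make the transform under discussion concrete, here is a small standalone
sketch of the quoted copy loop before and after interchange.  It is written
by hand and is not GRAPHITE output; the function names are made up and the
array size is reduced to 64 so it is cheap to run:

```c
#define N 64

/* Original nest from the quoted example: the inner loop varies the
   first subscript, so successive accesses are N doubles apart.  */
void
copy_column_order (double dst[N][N], double src[N][N])
{
  for (int i = 0; i < N; ++i)
    for (int j = 0; j < N; ++j)
      dst[j][i] = src[j][i];
}

/* Interchanged nest: the inner loop now varies the last subscript, so
   successive accesses touch adjacent doubles.  There is no dependence
   between iterations, so both orders compute the same result.  */
void
copy_row_order (double dst[N][N], double src[N][N])
{
  for (int j = 0; j < N; ++j)
    for (int i = 0; i < N; ++i)
      dst[j][i] = src[j][i];
}
```

Because validity places no constraint on this loop, only a proximity
relation that describes spatial locality would give the scheduler a reason
to prefer the second form.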


>
> I can trick GRAPHITE into performing the interchange for
>
> double A[1024][1024], B[1024][1024];
>
> void foo(void)
> {
>   for (int i = 1; i < 1023; ++i)
>     for (int j = 0; j < 1024; ++j)
>       A[j][i] = B[j][i-1] + A[j][i+1];
> }
>
> because now there is a dependence.  Any idea on how to rewrite
> scop_get_dependences to avoid "simplifying"?  I suppose the
> validity constraints _do_ also specify kind-of a proximity
>

Correct: the validity map specifies a subset (it is missing
RAR dependences) of data reuse.


> we just may not prune / optimize them in the same way as
> dependences?
>

Validity constraints are there to "keep the wind blowing
in the same direction" after the transform (otherwise the
result of the transformed computation may be wrong.)

The proximity map should contain a description of
- reuse of memory (temporal locality)
- how close together the access elements are (spatial locality.)
isl will optimize for both if the proximity map has a description
of both.

For the moment the proximity map is initialized only with the
current validity constraints, as you quoted the FIXME comment,
which would only describe a subset of the temporal locality.
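As a rough illustration of what "how close together the access elements are"
means for the 1024x1024 example, the following sketch measures the byte
distance between the elements touched by two successive inner iterations.
The function names are invented for this note, and the numbers are only a
property of C's row-major layout, not something isl computes this way:

```c
#include <stddef.h>

static double A[1024][1024];

/* Distance in bytes between A[j][i] and A[j+1][i], i.e. the stride seen
   when j is the innermost loop variable.  */
ptrdiff_t
stride_when_j_is_inner (void)
{
  return (char *) &A[1][0] - (char *) &A[0][0];
}

/* Distance in bytes between A[j][i] and A[j][i+1], i.e. the stride seen
   when i is the innermost loop variable.  */
ptrdiff_t
stride_when_i_is_inner (void)
{
  return (char *) &A[0][1] - (char *) &A[0][0];
}
```

The first stride is 1024 times larger than the second, which is the kind of
difference a spatial-locality term in the proximity map is meant to expose.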

Sebastian

[committed] Fix i386/pr82260-*.c testcases (PR target/82342)

2017-09-28 Thread Jakub Jelinek

Hi!

When BMI2 is on, such as with a gcc that defaults to -march=haswell or when running
make check-gcc RUNTESTFLAGS='--target_board=unix\{-m64,-m64/-march=haswell\}'
etc., these testcases emit sarx, which is not what the testcases meant
to test; they want to test what kind of insn is emitted to load the shift
count into the cl register for a normal shift.

This patch fixes it by adding -mno-bmi2.  Regtested on x86_64-linux and
i686-linux, committed as obvious to trunk.
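For readers unfamiliar with the issue, here is a minimal function of the
kind involved.  It is not one of the pr82260 testcases, just a sketch with a
made-up name.  On x86_64 the shift count arrives in %esi, so without BMI2 the
compiler typically has to copy it into %cl before a sar, whereas with BMI2 it
can use sarx with the count in any register and no such copy appears:

```c
/* Variable right shift; the count is the second argument, which the
   x86_64 SysV ABI passes in %esi rather than %ecx.  */
int
shift_right (int value, int count)
{
  return value >> count;
}
```

Compiling this with and without -mno-bmi2 on a BMI2-enabled configuration
should show the difference the testcases are sensitive to.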

2017-09-28  Jakub Jelinek  

PR target/82342
* gcc.target/i386/pr82260-1.c: Add -mno-bmi2 to dg-options.
* gcc.target/i386/pr82260-2.c: Likewise.

--- gcc/testsuite/gcc.target/i386/pr82260-1.c.jj    2017-09-21 09:26:42.0 +0200
+++ gcc/testsuite/gcc.target/i386/pr82260-1.c   2017-09-27 16:56:13.0 +0200
@@ -1,6 +1,6 @@
 /* PR target/82260 */
 /* { dg-do compile { target lp64 } } */
-/* { dg-options "-Os -mtune=generic -masm=att" } */
+/* { dg-options "-Os -mtune=generic -masm=att -mno-bmi2" } */
 /* movl %esi, %ecx is shorter than movb %sil, %cl.  While
movl %edx, %ecx is the same size as movb %dl, %cl and
movl %r8d, %ecx is the same size as movb %r8b, %cl, movl
--- gcc/testsuite/gcc.target/i386/pr82260-2.c.jj    2017-09-21 09:26:42.0 +0200
+++ gcc/testsuite/gcc.target/i386/pr82260-2.c   2017-09-27 16:56:27.0 +0200
@@ -1,6 +1,6 @@
 /* PR target/82260 */
 /* { dg-do compile { target lp64 } } */
-/* { dg-options "-Os -mtune=generic -masm=att -mtune-ctrl=^partial_reg_dependency" } */
+/* { dg-options "-Os -mtune=generic -masm=att -mtune-ctrl=^partial_reg_dependency -mno-bmi2" } */
 /* { dg-final { scan-assembler-not {\mmovb\t%sil, %cl} } } */
 /* { dg-final { scan-assembler {\mmovl\t%esi, %ecx} } } */
 /* { dg-final { scan-assembler {\mmovb\t%dl, %cl} } } */

Jakub

[C PATCH] Fix flags on compound literal VAR_DECLs (PR c/82340)

2017-09-28 Thread Jakub Jelinek

Hi!

As the testcase shows, while build_compound_literal had some code to set up
TREE_READONLY flag on compound literal VAR_DECLs, it didn't handle volatile
nor restrict quals.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2017-09-28  Jakub Jelinek  

PR c/82340
* c-decl.c (build_compound_literal): Use c_apply_type_quals_to_decl
instead of trying to set just TREE_READONLY manually.

* gcc.dg/tree-ssa/pr82340.c: New test.

--- gcc/c/c-decl.c.jj   2017-09-19 16:38:14.0 +0200
+++ gcc/c/c-decl.c  2017-09-27 15:18:53.066566172 +0200
@@ -5247,9 +5247,7 @@ build_compound_literal (location_t loc,
   DECL_ARTIFICIAL (decl) = 1;
   DECL_IGNORED_P (decl) = 1;
   TREE_TYPE (decl) = type;
-  TREE_READONLY (decl) = (TYPE_READONLY (type)
- || (TREE_CODE (type) == ARRAY_TYPE
- && TYPE_READONLY (TREE_TYPE (type))));
+  c_apply_type_quals_to_decl (TYPE_QUALS (strip_array_types (type)), decl);
   store_init_value (loc, decl, init, NULL_TREE);
 
   if (TREE_CODE (type) == ARRAY_TYPE && !COMPLETE_TYPE_P (type))
--- gcc/testsuite/gcc.dg/tree-ssa/pr82340.c.jj  2017-09-27 16:01:36.696296732 +0200
+++ gcc/testsuite/gcc.dg/tree-ssa/pr82340.c 2017-09-27 16:02:09.262886860 +0200
@@ -0,0 +1,14 @@
+/* PR c/82340 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ssa" } */
+/* { dg-final { scan-tree-dump "D.\[0-9]*\\\[0\\\] ={v} 77;" "ssa" } } */
+
+int
+foo (void)
+{
+  int i;
+  volatile char *p = (volatile char[1]) { 77 };
+  for (i = 1; i < 10; i++)
+*p = 4;
+  return *p;
+}

Jakub

Re: [PATCH][GRAPHITE] More TLC

2017-09-28 Thread Sebastian Pop

Hi skimo,

On Tue, Sep 26, 2017 at 10:15 AM, Sven Verdoolaege <
sven.verdoola...@gmail.com> wrote:

> On Tue, Sep 26, 2017 at 09:19:50AM -0500, Sebastian Pop wrote:
> > Sven, is there already a function that computes the sum of all
> > strides in a proximity map?  Maybe you have code that does
> > something similar in pet or ppcg?
>
> What exactly do you want to sum?
>
> If this involves any counting, then it cannot currently
>

I think that it does involve counting: we need to know
the distance between all pairs of array accesses, that is the
number of points in the dependence polyhedron.


> be done in pet or ppcg since isl does not support counting yet
> and the public version of barvinok is GPL licensed.
>
> Also, it's better to ask such questions on the isl mailing list
> isl-developm...@googlegroups.com
>
>
We are trying to find a metric that shows that isl's scheduler
did a useful transform.  Something like a diff tool that shows
before and after scheduling the strides of array accesses.

Could the isl scheduler output a description of what it did?
We would like to use that output to build testcases that match
the behavior of the compiler on different patterns.

Thanks,
Sebastian

[PATCH, i386]: Do not check index when encoding %esp as %rsp to avoid 0x67 prefix

2017-09-28 Thread Uros Bizjak

As mentioned in the PR, SP_REG cannot be encoded as an index register.

2017-09-28  Uros Bizjak  

* config/i386/i386.c (ix86_print_operand_address_as): Do not check
index when encoding %esp as %rsp to avoid 0x67 prefix.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 253254)
+++ config/i386/i386.c  (working copy)
@@ -19953,12 +19953,11 @@ ix86_print_operand_address_as (FILE *file, rtx add
  code = 'k';
}
 
-  /* Since the upper 32 bits of RSP are always zero for x32, we can
-encode %esp as %rsp to avoid 0x67 prefix if there is no index or
-base register.  */
+  /* Since the upper 32 bits of RSP are always zero for x32,
+we can encode %esp as %rsp to avoid 0x67 prefix if
+there is no index register.  */
   if (TARGET_X32 && Pmode == SImode
- && ((!index && base && REG_P (base) && REGNO (base) == SP_REG)
- || (!base && index && REGNO (index) == SP_REG)))
+ && !index && base && REG_P (base) && REGNO (base) == SP_REG)
code = 'q';
 
   if (ASSEMBLER_DIALECT == ASM_ATT)

Re: [PATCH, i386] Avoid 512-bit vector return constant for Intel AVX512 configuration

2017-09-28 Thread Uros Bizjak

On Thu, Sep 28, 2017 at 4:05 PM, Shalnov, Sergey
 wrote:
> Sorry. The patch is changed as you proposed.

OK for mainline and committed.

Thanks,
Uros.

> -Original Message-
> From: Uros Bizjak [mailto:ubiz...@gmail.com]
> Sent: Thursday, September 28, 2017 3:17 PM
> To: Shalnov, Sergey 
> Cc: gcc-patches@gcc.gnu.org; kirill.yuk...@gmail.com; Senkevich, Andrew 
> ; Ivchenko, Alexander 
> ; Peryt, Sebastian 
> Subject: Re: [PATCH, i386] Avoid 512-bit vector return constant for Intel 
> AVX512 configuration
>
> On Thu, Sep 28, 2017 at 3:08 PM, Shalnov, Sergey  
> wrote:
>> Hi,
>> GCC uses full 512-bit register to return the constant from the function.
>> The patch avoids 512-bit register usage if the "-mprefer-avx256" option is used.
>>
>> 2017-09-28  Sergey Shalnov  
>>
>> gcc/
>> * config/i386/i386.md (*movsf_internal, *movdf_internal):
>> Return 256-bit AVX modes for TARGET_PREFER_AVX256.
>>
>> gcc/testsuite/
>> * gcc.target/i386/avx512f-constant-float-return.c: New test.
>>
>
> -(match_test "TARGET_AVX512F")
> +(match_test "TARGET_AVX512F && !TARGET_PREFER_AVX256")
>
> Please use
>
> (and (match_test "TARGET_AVX512F")
> (not (match_test "TARGET_PREFER_AVX256")))
>
> Uros.

Re: [PATCH] Fix fortran/81509

2017-09-28 Thread Steve Kargl

On Thu, Sep 28, 2017 at 10:46:06AM +0100, Paul Richard Thomas wrote:
> 
> I'll take your word for it on the F2008 constraints. Given that the
> patch is very good - OK for trunk.
> 

The text for IAND from F2008 is 

Arguments.

I   shall be of type integer or a boz-literal-constant.
J   shall be of type integer or a boz-literal-constant. If both I and J
are of type integer, they shall have the same kind type parameter.
I and J shall not both be boz-literal-constants.

Result Characteristics.

Same as I if I is of type integer; otherwise, same as J.

Result Value.

If either I or J is a boz-literal-constant, it is first converted as if
by the intrinsic function INT to type integer with the kind type parameter
of the other.

Prior to my patch one could do IAND(1_4, 4_8).  gfortran would convert
the 1_4 to 1_8 and return effectively IAND(1_8, 4_8).  This violates
the 2nd sentence in the description of J.  One might argue that the
extension makes sense, except the documentation does not state what
occurs.  The real problem comes with IAND(42_2,z'DEAD') where
z'DEAD' is some mask the user wants to apply to an INTEGER(2) entity.
On x86_64, and due to the implementation of a BOZ to satisfy the
requirements of the DATA statement from F95, z'DEAD' is implicitly an
INTEGER(16).  So, IAND(42_2,z'DEAD') converts 42_2 to 42_16 and returns
an INTEGER(16).  This is a violation of the F2008 'Result Value' statement.

In hindsight, some 13 years ago I should have introduced a BT_BOZ basic
type definition.  I need to check, but at one time gfortran for a boz
literal constant set x->is_boz=1, x->ts.type=BT_INTEGER, and x->ts.kind=16,
where x is a gfc_expr *.  The is_boz is unneeded with BT_BOZ and the
kind can be set to the appropriate kind when needed.  A bonus with
BT_BOZ would be an automatic fix for all other intrinsics which currently
accept a boz, e.g., ABS(z'DEAD').
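As a language-neutral illustration of the F2008 rule quoted above, here is a
small C model of IAND(42_2, z'DEAD').  It is only an analogy: the function
name is made up, int16_t stands in for INTEGER(2), and the truncation to 16
bits models the "converted as if by INT to the kind of the other argument"
step.  It says nothing about how gfortran implements the intrinsic:

```c
#include <stdint.h>

/* Model of IAND (I, boz) where I is INTEGER(2): the BOZ bit pattern is
   first reduced to the kind of I, and the result has the kind of I.  */
int16_t
iand_kind2_with_boz (int16_t i, uint64_t boz_bits)
{
  /* Keep only the low 16 bits of the literal, then combine with I.
     The AND is done on unsigned values; only the final cast goes back
     to a signed type.  */
  uint16_t mask = (uint16_t) boz_bits;
  return (int16_t) ((uint16_t) i & mask);
}
```

With this model the result stays a 16-bit value, in contrast to the old
behaviour described above where the result became an INTEGER(16).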

-- 
Steve
20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4
20161221 https://www.youtube.com/watch?v=IbCHE-hONow

[patch, libfortran] Fix thread sanitizer issue with libgfortran

2017-09-28 Thread Thomas Koenig


Hello world,

the attached patch fixes the problem reported in PR 66756:  When
opening a file, the main lock for all units was acquired, the unit lock
was acquired, and then the main lock was released and re-acquired.  To
the thread sanitizer, this is a lock-order inversion.

One option would have been to simply close the bug, because this
only occurs in opening a file, when the gfc_unit has not yet had
a chance to escape to another thread. However, it appears that
this causes trouble debugging parallel applications, hence this
patch.

What this patch does is to change the assumptions for insert_unit:
Previously, this used to lock the newly created unit, and the caller
had to unlock. Now, gfc_get_unit can do the locking after releasing
the global lock.

This gets rid of the thread sanitizer issue; the thread sanitizer
output is clean.  However, I would appreciate feedback about whether
this approach (and my code) is correct.
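For anyone who wants to reason about the ordering without reading unit.c,
here is a stripped-down pthread model of the new scheme.  All toy_* names are
invented, a linked list stands in for the treap, and units are never deleted
in this sketch, so it deliberately ignores the retry logic that the real
code needs for units being closed by another thread:

```c
#include <pthread.h>
#include <stdlib.h>

struct toy_unit
{
  int number;
  pthread_mutex_t lock;
  struct toy_unit *next;
};

static pthread_mutex_t toy_unit_lock = PTHREAD_MUTEX_INITIALIZER;
static struct toy_unit *toy_units;

/* Find unit N, creating it if needed, and return it with its own lock
   held.  Returns NULL if allocation fails.  */
struct toy_unit *
toy_get_unit (int n)
{
  struct toy_unit *u;

  pthread_mutex_lock (&toy_unit_lock);
  for (u = toy_units; u != NULL; u = u->next)
    if (u->number == n)
      break;

  if (u == NULL)
    {
      u = calloc (1, sizeof *u);
      if (u == NULL)
        {
          pthread_mutex_unlock (&toy_unit_lock);
          return NULL;
        }
      u->number = n;
      pthread_mutex_init (&u->lock, NULL);
      u->next = toy_units;
      toy_units = u;
    }

  /* Release the global lock before taking the per-unit lock, so this
     function never holds both at once.  */
  pthread_mutex_unlock (&toy_unit_lock);
  pthread_mutex_lock (&u->lock);
  return u;
}
```

The caller is expected to unlock the returned unit when done, as with the
real get_unit interface.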

Regression-tested.

Comments? Suggestions for improvements/other approaches? Close the PR
as WONTFIX instead? OK for trunk?

Regards

Thomas

2017-09-28  Thomas Koenig  

PR fortran/66756
* io/fbuf.c (fbuf_destroy): Lock unit before freeing the buffer.
* io/unit.c (insert_unit): Do not create lock and lock, move to
(gfc_get_unit): here; lock after insert_unit has succeeded.
(init_units): Do not unlock unit locks for stdin, stdout and
stderr.
Index: io/fbuf.c
===
--- io/fbuf.c	(revision 253162)
+++ io/fbuf.c	(working copy)
@@ -50,9 +50,11 @@ fbuf_destroy (gfc_unit *u)
 {
   if (u->fbuf == NULL)
 return;
+  __gthread_mutex_lock (&u->lock);
   free (u->fbuf->buf);
   free (u->fbuf);
   u->fbuf = NULL;
+  __gthread_mutex_unlock (&u->lock);
 }
 
 
Index: io/unit.c
===
--- io/unit.c	(revision 253162)
+++ io/unit.c	(working copy)
@@ -221,23 +221,14 @@ insert (gfc_unit *new, gfc_unit *t)
   return t;
 }
 
+/* insert_unit()-- Create a new node, insert it into the treap.  It is assumed
+   that the caller holds unit_lock.  */
 
-/* insert_unit()-- Create a new node, insert it into the treap.  */
-
 static gfc_unit *
 insert_unit (int n)
 {
   gfc_unit *u = xcalloc (1, sizeof (gfc_unit));
   u->unit_number = n;
-#ifdef __GTHREAD_MUTEX_INIT
-  {
-__gthread_mutex_t tmp = __GTHREAD_MUTEX_INIT;
-u->lock = tmp;
-  }
-#else
-  __GTHREAD_MUTEX_INIT_FUNCTION (&u->lock);
-#endif
-  __gthread_mutex_lock (&u->lock);
   u->priority = pseudo_random ();
   unit_root = insert (u, unit_root);
   return u;
@@ -361,9 +352,20 @@ retry:
 
   if (created)
 {
-  /* Newly created units have their lock held already
-	 from insert_unit.  Just unlock UNIT_LOCK and return.  */
+#ifdef __GTHREAD_MUTEX_INIT
+  {
+	__gthread_mutex_t tmp = __GTHREAD_MUTEX_INIT;
+	p->lock = tmp;
+  }
+#else
+  __GTHREAD_MUTEX_INIT_FUNCTION (&p->lock);
+#endif
   __gthread_mutex_unlock (&unit_lock);
+
+  /* Nobody outside this address has seen this unit yet.  We could safely
+	 keep it unlocked until now.  */
+  
+  __gthread_mutex_lock (&p->lock);
   return p;
 }
 
@@ -618,8 +620,6 @@ init_units (void)
   u->filename = strdup (stdin_name);
 
   fbuf_init (u, 0);
-
-  __gthread_mutex_unlock (&u->lock);
 }
 
   if (options.stdout_unit >= 0)
@@ -649,8 +649,6 @@ init_units (void)
   u->filename = strdup (stdout_name);
 
   fbuf_init (u, 0);
-
-  __gthread_mutex_unlock (&u->lock);
 }
 
   if (options.stderr_unit >= 0)
@@ -680,8 +678,6 @@ init_units (void)
 
   fbuf_init (u, 256);  /* 256 bytes should be enough, probably not doing
   any kind of exotic formatting to stderr.  */
-
-  __gthread_mutex_unlock (&u->lock);
 }
 
   /* Calculate the maximum file offset in a portable manner.

[committed] jit: document function pointers

2017-09-28 Thread David Malcolm

This patch adds a new function-pointers.rst topic page to the libgccjit
docs.

Committed to trunk as r253257.

gcc/jit/ChangeLog:
* docs/topics/expressions.rst (Function calls): Add link to
gcc_jit_context_new_function_ptr_type.
(Function pointers): Convert to cross-references to
function-pointers.rst, moving material there.
* docs/topics/function-pointers.rst: New page.
* docs/topics/index.rst: Add function-pointers.rst.
* docs/topics/types.rst (Function pointer types): New section.
* docs/_build/texinfo/libgccjit.texi: Regenerate.
---
 gcc/jit/docs/topics/expressions.rst   | 21 
 gcc/jit/docs/topics/function-pointers.rst | 80 +++
 gcc/jit/docs/topics/index.rst |  1 +
 gcc/jit/docs/topics/types.rst |  6 +++
 4 files changed, 96 insertions(+), 12 deletions(-)
 create mode 100644 gcc/jit/docs/topics/function-pointers.rst

diff --git a/gcc/jit/docs/topics/expressions.rst b/gcc/jit/docs/topics/expressions.rst
index f5c2d0f..76aa4eb 100644
--- a/gcc/jit/docs/topics/expressions.rst
+++ b/gcc/jit/docs/topics/expressions.rst
@@ -416,7 +416,8 @@ Function calls
 int numargs, \
 gcc_jit_rvalue **args)
 
-   Given an rvalue of function pointer type, and the given table of
+   Given an rvalue of function pointer type (e.g. from
+   :c:func:`gcc_jit_context_new_function_ptr_type`), and the given table of
argument rvalues, construct a call to the function pointer, with the
result as an rvalue.
 
@@ -452,19 +453,15 @@ Function calls
 Function pointers
 *****************
 
-.. function:: gcc_jit_rvalue *\
- gcc_jit_function_get_address (gcc_jit_function *fn,\
-gcc_jit_location *loc)
-
-   Get the address of a function as an rvalue, of function pointer
-   type.
+Function pointers can be obtained:
 
-   This entrypoint was added in :ref:`LIBGCCJIT_ABI_9`; you can test
-   for its presence using
-
-   .. code-block:: c
+  * from a :c:type:`gcc_jit_function` using
+:c:func:`gcc_jit_function_get_address`, or
 
-  #ifdef LIBGCCJIT_HAVE_gcc_jit_function_get_address
+  * from an existing function using
+:c:func:`gcc_jit_context_new_rvalue_from_ptr`,
+using a function pointer type obtained using
+:c:func:`gcc_jit_context_new_function_ptr_type`.
 
 Type-coercion
 *************
diff --git a/gcc/jit/docs/topics/function-pointers.rst b/gcc/jit/docs/topics/function-pointers.rst
new file mode 100644
index 000..b5b9d1b
--- /dev/null
+++ b/gcc/jit/docs/topics/function-pointers.rst
@@ -0,0 +1,80 @@
+.. Copyright (C) 2017 Free Software Foundation, Inc.
+   Originally contributed by David Malcolm 
+
+   This is free software: you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see
+   .
+
+.. default-domain:: c
+
+Function pointers
+=================
+
+You can generate calls that use a function pointer via
+:c:func:`gcc_jit_context_new_call_through_ptr`.
+
+To do so requires a :c:type:`gcc_jit_rvalue` of the correct function pointer type.
+
+Function pointers for a :c:type:`gcc_jit_function` can be obtained
+via :c:func:`gcc_jit_function_get_address`.
+
+.. function:: gcc_jit_rvalue *\
+ gcc_jit_function_get_address (gcc_jit_function *fn,\
+gcc_jit_location *loc)
+
+   Get the address of a function as an rvalue, of function pointer
+   type.
+
+   This entrypoint was added in :ref:`LIBGCCJIT_ABI_9`; you can test
+   for its presence using
+
+   .. code-block:: c
+
+  #ifdef LIBGCCJIT_HAVE_gcc_jit_function_get_address
+
+Alternatively, given an existing function, you can obtain a pointer
+to it in :c:type:`gcc_jit_rvalue` form using
+:c:func:`gcc_jit_context_new_rvalue_from_ptr`, using a function pointer
+type obtained using :c:func:`gcc_jit_context_new_function_ptr_type`.
+
+Here's an example of creating a function pointer type corresponding to C's
+:c:type:`void (*) (int, int, int)`:
+
+.. code-block:: c
+
+  gcc_jit_type *void_type =
+gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_VOID);
+  gcc_jit_type *int_type =
+gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_INT);
+
+  /* Build the function ptr type.  */
+  gcc_jit_type *param_types[3];
+  param_types[0] = int_type;
+  param_types[1] =

Re: [PATCH, GCC/ARM, ping] Remove ARMv8-M code for D17-D31

2017-09-28 Thread Thomas Preudhomme


Committed (sorry for delay).

Best regards,

Thomas

On 06/09/17 09:12, Kyrill Tkachov wrote:

Hi Thomas,

On 05/09/17 10:04, Thomas Preudhomme wrote:

Ping?



This is ok if a bootstrap and test run on arm-none-linux-gnueabihf shows no 
problems.

Thanks,
Kyrill


Best regards,

Thomas

On 25/08/17 12:18, Thomas Preudhomme wrote:

Hi,

I've now also added a couple more changes:

* size to_clear_bitmap according to maxregno to be consistent with its use
* use directly TARGET_HARD_FLOAT instead of clear_vfpregs


Original message below (ChangeLog unchanged):

Function cmse_nonsecure_entry_clear_before_return has code to deal with
high VFP registers (D16-D31) while ARMv8-M Baseline and Mainline both do
not support more than 16 double VFP registers (D0-D15). This makes this
security-sensitive code harder to read for not much benefit since the
libcalls for cmse_nonsecure_call functions do not deal with those high
VFP registers anyway.

This commit gets rid of this code for simplicity and fixes 2 issues in
the same function:

- stop the first loop when reaching maxregno to avoid dealing with VFP
   registers if targeting Thumb-1 or using -mfloat-abi=soft
- include maxregno in that loop
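
The two loop fixes, together with the bitmap sizing mentioned elsewhere in
this thread, boil down to treating maxregno as an inclusive bound.  A toy
sketch of that convention follows; the toy_* names are invented and a plain
bool array stands in for GCC's sbitmap, so this is not the arm.c code:

```c
#include <stdbool.h>
#include <stddef.h>

/* Number of entries a to-clear map needs in order to describe
   registers 0..MAXREGNO inclusive.  */
size_t
toy_bitmap_size (unsigned maxregno)
{
  return (size_t) maxregno + 1;
}

/* Store the registers flagged in TO_CLEAR into OUT in ascending order
   and return how many were stored.  The loop includes MAXREGNO itself
   and never looks at any register above it.  */
size_t
toy_collect_regs_to_clear (const bool *to_clear, unsigned maxregno,
                           unsigned *out)
{
  size_t count = 0;

  for (unsigned regno = 0; regno <= maxregno; regno++)
    if (to_clear[regno])
      out[count++] = regno;
  return count;
}
```

Sizing the map with maxregno instead of maxregno + 1, or looping with <
instead of <=, would each silently drop the last register.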

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-06-13  Thomas Preud'homme 

 * config/arm/arm.c (arm_option_override): Forbid ARMv8-M Security
 Extensions with more than 16 double VFP registers.
 (cmse_nonsecure_entry_clear_before_return): Remove second entry of
 to_clear_mask and all code related to it.  Replace the remaining
 entry by a sbitmap and adapt code accordingly.

Testing: Testsuite shows no regression when run for ARMv8-M Baseline and
ARMv8-M Mainline.

Is this ok for trunk?

Best regards,

Thomas

On 23/08/17 11:56, Thomas Preudhomme wrote:

Ping?

Best regards,

Thomas

On 17/07/17 17:25, Thomas Preudhomme wrote:
My bad, found an off-by-one error in the sizing of bitmaps. Please find 
fixed patch in attachment.


ChangeLog entry is unchanged:

*** gcc/ChangeLog ***

2017-06-13  Thomas Preud'homme 

 * config/arm/arm.c (arm_option_override): Forbid ARMv8-M Security
 Extensions with more than 16 double VFP registers.
 (cmse_nonsecure_entry_clear_before_return): Remove second entry of
 to_clear_mask and all code related to it.  Replace the remaining
 entry by a sbitmap and adapt code accordingly.

Best regards,

Thomas

On 17/07/17 09:52, Thomas Preudhomme wrote:

Ping?

Best regards,

Thomas

On 12/07/17 09:59, Thomas Preudhomme wrote:

Hi Richard,

On 07/07/17 15:19, Richard Earnshaw (lists) wrote:


Hmm, I think that's because really this is a partial conversion.  It
looks like doing this properly would involve moving that existing code
to use sbitmaps as well.  I think doing that would be better for
long-term maintenance perspectives, but I'm not going to insist that you
do it now.


There's also the assert later but I've found a way to improve it 
slightly. While switching to auto_sbitmap I also changed the code 
slightly to allocate the bitmaps directly at the right size. Since the 
change is probably bigger than what you had in mind I'd appreciate it if 
you could give me an OK again. See updated patch in attachment. ChangeLog 
entry is unchanged:


2017-06-13  Thomas Preud'homme 

 * config/arm/arm.c (arm_option_override): Forbid ARMv8-M Security
 Extensions with more than 16 double VFP registers.
 (cmse_nonsecure_entry_clear_before_return): Remove second entry of
 to_clear_mask and all code related to it.  Replace the remaining
 entry by a sbitmap and adapt code accordingly.



As a result I'll let you take the call as to whether you keep this
version or go back to your earlier patch.  If you do decide to keep this
version, then see the comment below.


Given the changes, I'm happier with how the patch looks now, and getting 
it in can be a nice incentive to change other ARMv8-M Security 
Extension related code later on.


Best regards,

Thomas

[committed] jit: handle equality of function pointer types

2017-09-28 Thread David Malcolm

libgccjit was being overzealous when type-checking function pointers,
requiring exact pointer equality of recording::function_type
instances, defeating attempts by client code to work with function
pointers as data.

This patch removes the overzealous checking, and should allow
function pointers to be used as expected.
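
As a rough illustration of the idea (not the libgccjit code itself), the
sketch below compares function types structurally while every other type
keeps identity comparison.  The names ty and fn_type and their members are
hypothetical stand-ins chosen for this example; only the general shape
follows the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct fn_type;

// Hypothetical stand-in for a recorded type.  By default two types are
// "the same" only if they are the same object.
struct ty
{
  virtual ~ty () {}
  virtual const fn_type *as_fn_type () const { return nullptr; }
  virtual bool is_same_type_as (const ty *other) const
  {
    return this == other;
  }
};

// Function types override the hook and compare signatures instead.
struct fn_type : ty
{
  fn_type (const ty *ret, std::vector<const ty *> params, bool variadic)
    : return_type (ret), param_types (params), is_variadic (variadic) {}

  const fn_type *as_fn_type () const override { return this; }

  bool is_same_type_as (const ty *other) const override
  {
    const fn_type *o = other->as_fn_type ();
    if (!o)
      return false;
    // Everything must match: return type, parameter list, variadic flag.
    if (!return_type->is_same_type_as (o->return_type))
      return false;
    if (param_types.size () != o->param_types.size ())
      return false;
    for (std::size_t i = 0; i < param_types.size (); i++)
      if (!param_types[i]->is_same_type_as (o->param_types[i]))
        return false;
    return is_variadic == o->is_variadic;
  }

  const ty *return_type;
  std::vector<const ty *> param_types;
  bool is_variadic;
};
```

Two fn_type objects built separately from the same return and parameter
types then compare equal, even though their addresses differ.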

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu;
takes jit.sum from 9889 to 9909 PASS results.

Committed to trunk as r253255.

gcc/jit/ChangeLog:
* jit-recording.c
(gcc::jit::recording::function_type::is_same_type_as): New function.
* jit-recording.h: In namespace gcc::jit::recording::
(type::accepts_writes_from): Use is_same_type_as rather than pointer
equality.
(type::is_same_type_as): New virtual function.
(function_type::is_same_type_as): New override.

gcc/testsuite/ChangeLog:
* jit.dg/test-error-mismatching-types-in-assignment-fn-ptr.c: New
test case.
* jit.dg/test-returning-function-ptr.c (create_code): Update to
create a function pointer type independently of the call to
gcc_jit_function_get_address, and assign the pointer to a local
before returning it, to exercise the function pointer type
comparison code.
---
 gcc/jit/jit-recording.c| 47 +++
 gcc/jit/jit-recording.h|  9 ++-
 ...-error-mismatching-types-in-assignment-fn-ptr.c | 92 ++
 gcc/testsuite/jit.dg/test-returning-function-ptr.c | 33 ++--
 4 files changed, 173 insertions(+), 8 deletions(-)
 create mode 100644 
gcc/testsuite/jit.dg/test-error-mismatching-types-in-assignment-fn-ptr.c

diff --git a/gcc/jit/jit-recording.c b/gcc/jit/jit-recording.c
index 8481280..6d7dc80 100644
--- a/gcc/jit/jit-recording.c
+++ b/gcc/jit/jit-recording.c
@@ -2643,6 +2643,53 @@ recording::function_type::dereference ()
   return NULL;
 }
 
+/* Implementation of virtual hook recording::type::is_same_type_as for
+   recording::function_type.
+
+   We override this to avoid requiring identity of function pointer types,
+   so that if client code has obtained the same signature in
+   different ways (e.g. via gcc_jit_context_new_function_ptr_type
+   vs gcc_jit_function_get_address), the different function_type
+   instances are treated as compatible.
+
+   We can't use type::accepts_writes_from for this as we need a stronger
+   notion of "sameness": if we have a fn_ptr type that has args that are
+   themselves fn_ptr types, then those args still need to match exactly.
+
+   Alternatively, we could consolidate attempts to create identical
+   function_type instances so that pointer equality works, but that runs
+   into issues about the lifetimes of the cache (w.r.t. nested contexts).  */
+
+bool
+recording::function_type::is_same_type_as (type *other)
+{
+  gcc_assert (other);
+
+  function_type *other_fn_type = other->dyn_cast_function_type ();
+  if (!other_fn_type)
+return false;
+
+  /* Everything must match.  */
+
+  if (!m_return_type->is_same_type_as (other_fn_type->m_return_type))
+return false;
+
+  if (m_param_types.length () != other_fn_type->m_param_types.length ())
+return false;
+
+  unsigned i;
+  type *param_type;
+  FOR_EACH_VEC_ELT (m_param_types, i, param_type)
+if (!param_type->is_same_type_as (other_fn_type->m_param_types[i]))
+  return false;
+
+  if (m_is_variadic != other_fn_type->m_is_variadic)
+return false;
+
+  /* Passed all tests.  */
+  return true;
+}
+
 /* Implementation of pure virtual hook recording::memento::replay_into
for recording::function_type.  */
 
diff --git a/gcc/jit/jit-recording.h b/gcc/jit/jit-recording.h
index 8918124..9123645 100644
--- a/gcc/jit/jit-recording.h
+++ b/gcc/jit/jit-recording.h
@@ -491,7 +491,12 @@ public:
   virtual bool accepts_writes_from (type *rtype)
   {
 gcc_assert (rtype);
-return this->unqualified () == rtype->unqualified ();
+return this->unqualified ()->is_same_type_as (rtype->unqualified ());
+  }
+
+  virtual bool is_same_type_as (type *other)
+  {
+return this == other;
   }
 
   /* Strip off "const" etc */
@@ -751,6 +756,8 @@ public:
   function_type *dyn_cast_function_type () FINAL OVERRIDE { return this; }
   function_type *as_a_function_type () FINAL OVERRIDE { return this; }
 
+  bool is_same_type_as (type *other) FINAL OVERRIDE;
+
   bool is_int () const FINAL OVERRIDE { return false; }
   bool is_float () const FINAL OVERRIDE { return false; }
   bool is_bool () const FINAL OVERRIDE { return false; }
diff --git 
a/gcc/testsuite/jit.dg/test-error-mismatching-types-in-assignment-fn-ptr.c 
b/gcc/testsuite/jit.dg/test-error-mismatching-types-in-assignment-fn-ptr.c
new file mode 100644
index 000..4faa3b4
--- /dev/null
+++ b/gcc/testsuite/jit.dg/test-error-mismatching-types-in-assignment-fn-ptr.c
@@ -0,0 +1,92 @@
+#include 
+#include 
+
+#include "libgccjit.h"
+
+#include "harness.h"
+
+void
+create_code

Re: correct attribute ifunc C++ type safety (PR 82301)

2017-09-28 Thread Nathan Sidwell


On 09/24/2017 06:03 PM, Martin Sebor wrote:

r253041 enhanced type checking for alias and ifunc attributes to
detect declarations of incompatible aliases, or ifunc resolvers
that return pointers to functions of an incompatible type.  More
extensive testing exposed a bug in the implementation of the ifunc
attribute handling in C++ where the checker expected the ifunc
resolver to return a pointer to a member function when the
implementation actually expects it to return a pointer to a
non-member function.

In a discussion of the test suite failures, Jakub also suggested
to break the enhanced warning out of -Wattributes and issue it
under a different option.

The attached patch corrects the C++ problem and moves the warning
under -Wincompatible-pointer-types.  Since this is a C-only option,
the patch also enables it for C++.  Since the option is enabled by
default, the patch further requires -Wextra to issue the warning
for ifunc resolvers returning void*.  However, the patched checker
diagnoses other incompatibilities without it.

Martin


I find the maybe_diag_incompatible_alias function confusing.


+/* Check declaration of the type of ALIAS for compatibility with its TARGET
+   (which may be an ifunc resolver) and issue a diagnostic when they are
+   not compatible according to language rules (plus a C++ extension for
+   non-static member functions).  */
+
+static void
+maybe_diag_incompatible_alias (tree alias, tree target)
+{
+  tree altype = TREE_TYPE (alias);
+  tree targtype = TREE_TYPE (target);
+
+  bool ifunc = lookup_attribute ("ifunc", DECL_ATTRIBUTES (alias));
+  if (ifunc)


I think it might be clearer if this was broken out into a diag_ifunc 
function?  But see below ...



+{
+  /* Handle attribute ifunc first.  */
+
+  tree funcptr = altype;
+
+  /* Set FUNCPTR to the type of the alias target.  If the type
+is a non-static member function of class C, construct a type
+of an ordinary function taking C* as the first argument,
+followed by the member function argument list, and use it
+instead to check for compatibilities.  FUNCPTR is used only
+in diagnostics.  */


This comment is self-contradictory.
  1 Set FUNCPTR
  2 Do some method-type shenanigans
  3 Use it to check for incompatibilities
  4 FUNCPTR is only used in diags

Which of #3 and #4 is true?



+
+  if (TREE_CODE (altype) == METHOD_TYPE)
+   {


IMHO put the description of the METHOD_TYPE chicanery inside the block 
doing it?  FWIW, although the change being made works on many (most?) 
ABIs, it's not formally correct and I think fails on some where 'this' 
is passed specially. You might want to note that?



+ tree rettype = TREE_TYPE (altype);
+ tree args = TYPE_ARG_TYPES (altype);
+ altype = build_function_type (rettype, args);
+ funcptr = altype;
+   }
+



+ if ((!FUNC_OR_METHOD_TYPE_P (targtype)
+  || (prototype_p (altype)
+  && prototype_p (targtype)
+  && !types_compatible_p (altype, targtype))))
+   {
+ funcptr = build_pointer_type (funcptr);
+
+ if (warning_at (DECL_SOURCE_LOCATION (target),
+ OPT_Wincompatible_pointer_types,
+ "%<ifunc%> resolver for %qD should return %qT",
+ alias, funcptr))
+   inform (DECL_SOURCE_LOCATION (alias),
+   "resolver indirect function declared here");
+   }


this block is almost the same as the non-ifunc block.  Surely they can 
be the same code? (by generalizing one of the cases until it turns into 
the other?)




+ /* Deal with static member function pointers.  */


I do not understand this comment or condition. We seem to have dealt 
with pointers already and the conditions seem confused.



+ if (TREE_CODE (targtype) != RECORD_TYPE
+ || TYPE_FIELDS (targtype)
+ || TREE_CODE (TREE_TYPE (TYPE_FIELDS (targtype))) != POINTER_TYPE
+ || (TREE_CODE (TREE_TYPE (TREE_TYPE (TYPE_FIELDS (targtype))))
+ != METHOD_TYPE))


if
  not a record,
  or has TYPE_FIELDS non-NULL
  or the first field doesn't have pointer type (we can't get here)
  or something else about the first field

oh, I think it's trying to spot the pointer to NON-static member 
function internal record type.  But brokenly. I think pmf record_types 
have DECL_ARTIFICIAL and BUILTIN_LOCATION, that might be useful.



+   {
+ funcptr = build_pointer_type (funcptr);
+
+ error ("%<ifunc%> resolver for %qD must return %qT",
+alias, funcptr);
+   
+ inform (DECL_SOURCE_LOCATION (alias),
+ "resolver indirect function declared here");
+   }
+   }
+



+  if ((!FUNC_OR_METHOD_TYPE_P (targtype)
+   || (prototype_p (altype)
+  && prototype_p

RE: [PATCH, i386] Avoid 512-bit vector return constant for Intel AVX512 configuration

2017-09-28 Thread Shalnov, Sergey

Sorry. The patch is changed as you proposed.



-Original Message-
From: Uros Bizjak [mailto:ubiz...@gmail.com] 
Sent: Thursday, September 28, 2017 3:17 PM
To: Shalnov, Sergey 
Cc: gcc-patches@gcc.gnu.org; kirill.yuk...@gmail.com; Senkevich, Andrew 
; Ivchenko, Alexander 
; Peryt, Sebastian 
Subject: Re: [PATCH, i386] Avoid 512-bit vector return constant for Intel 
AVX512 configuration

On Thu, Sep 28, 2017 at 3:08 PM, Shalnov, Sergey  
wrote:
> Hi,
> GCC uses a full 512-bit register to return a constant from a function.
> The patch avoids 512-bit register usage if the "-mprefer-avx256" option is used.
>
> 2017-09-28  Sergey Shalnov  
>
> gcc/
> * config/i386/i386.md (*movsf_internal, *movdf_internal):
> Return 256-bit AVX modes for TARGET_PREFER_AVX256.
>
> gcc/testsuite/
> * gcc.target/i386/avx512f-constant-float-return.c: New test.
>

-(match_test "TARGET_AVX512F")
+(match_test "TARGET_AVX512F && !TARGET_PREFER_AVX256")

Please use

(and (match_test "TARGET_AVX512F")
(not (match_test "TARGET_PREFER_AVX256")))

Uros.


0001-Avoid-useing-zmm-if-TARGET_PREFER_AVX256.patch
Description: 0001-Avoid-useing-zmm-if-TARGET_PREFER_AVX256.patch

Re: Enable ifunc attribute by default for SPARC GNU/Linux

2017-09-28 Thread Eric Botcazou

> Similar to other architectures with IFUNC binutils/glibc support, this
> patch enables the ifunc attribute for SPARC GNU/Linux.  This is needed
> for building glibc with the current checks on IFUNC resolver types
> (and use of the attribute in glibc rather than manually created IFUNCs
> is beneficial anyway because it results in better debug info).
> 
> Tested compilation of glibc (in conjunction with a glibc patch to
> support using the attribute on SPARC) with build-many-glibcs.py.  I
> have not run the GCC tests for SPARC.  OK to commit?

I presume so, although I don't really understand all the consequences.

-- 
Eric Botcazou

Re: [ARM,testsuite] Some tests require arm_neon_hw

2017-09-28 Thread Richard Earnshaw

On 28/09/17 14:09, Christophe Lyon wrote:
> Hi,
> 
> I've noticed that a few dg-run arm tests require Neon to execute, but
> do not ensure that. They only check that the compiler options are
> supported. This small patch fixes that by adding arm_neon_hw
> effective-target.
> 
> This makes the tests unsupported on the related configurations (e.g.
> cortex-m3, arm10tdmi).
> 
> OK?
> 
> Thanks,
> 
> Christophe
> 

OK.

R.

Re: [PATCH, i386] Avoid 512-bit vector return constant for Intel AVX512 configuration

2017-09-28 Thread Uros Bizjak

On Thu, Sep 28, 2017 at 3:08 PM, Shalnov, Sergey
 wrote:
> Hi,
> GCC uses a full 512-bit register to return a constant from a function.
> The patch avoids 512-bit register usage if the "-mprefer-avx256" option is used.
>
> 2017-09-28  Sergey Shalnov  
>
> gcc/
> * config/i386/i386.md (*movsf_internal, *movdf_internal):
> Return 256-bit AVX modes for TARGET_PREFER_AVX256.
>
> gcc/testsuite/
> * gcc.target/i386/avx512f-constant-float-return.c: New test.
>

-(match_test "TARGET_AVX512F")
+(match_test "TARGET_AVX512F && !TARGET_PREFER_AVX256")

Please use

(and (match_test "TARGET_AVX512F")
(not (match_test "TARGET_PREFER_AVX256")))

Uros.

[ARM,testsuite] Some tests require arm_neon_hw

2017-09-28 Thread Christophe Lyon

Hi,

I've noticed that a few dg-run arm tests require Neon to execute, but
do not ensure that. They only check that the compiler options are
supported. This small patch fixes that by adding arm_neon_hw
effective-target.

This makes the tests unsupported on the related configurations (e.g.
cortex-m3, arm10tdmi).

OK?

Thanks,

Christophe
2017-09-28  Christophe Lyon  

* gcc.target/arm/aapcs/align4.c: Require arm_neon_hw effective target.
* gcc.target/arm/aapcs/align_rec4.c: Likewise.
* gcc.target/arm/aapcs/neon-vect1.c: Likewise.
* gcc.target/arm/aapcs/neon-vect2.c: Likewise.
* gcc.target/arm/aapcs/neon-vect3.c: Likewise.
* gcc.target/arm/aapcs/neon-vect4.c: Likewise.
* gcc.target/arm/aapcs/neon-vect5.c: Likewise.
* gcc.target/arm/aapcs/neon-vect6.c: Likewise.
* gcc.target/arm/aapcs/neon-vect7.c: Likewise.
* gcc.target/arm/aapcs/neon-vect8.c: Likewise.
diff --git a/gcc/testsuite/gcc.target/arm/aapcs/align4.c 
b/gcc/testsuite/gcc.target/arm/aapcs/align4.c
index 5535c55..df52335 100644
--- a/gcc/testsuite/gcc.target/arm/aapcs/align4.c
+++ b/gcc/testsuite/gcc.target/arm/aapcs/align4.c
@@ -2,7 +2,8 @@
 
 /* { dg-do run { target arm_eabi } } */
 /* { dg-require-effective-target arm32 } */
-/* { dg-require-effective-target arm_neon_ok  } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-require-effective-target arm_neon_hw } */
 /* { dg-options "-O" } */
 /* { dg-add-options arm_neon } */
 
diff --git a/gcc/testsuite/gcc.target/arm/aapcs/align_rec4.c 
b/gcc/testsuite/gcc.target/arm/aapcs/align_rec4.c
index 907b90a..6732fa6 100644
--- a/gcc/testsuite/gcc.target/arm/aapcs/align_rec4.c
+++ b/gcc/testsuite/gcc.target/arm/aapcs/align_rec4.c
@@ -3,6 +3,7 @@
 /* { dg-do run { target arm_eabi } } */
 /* { dg-require-effective-target arm32 } */
 /* { dg-require-effective-target arm_neon_ok } */
+/* { dg-require-effective-target arm_neon_hw } */
 /* { dg-options "-O -fno-inline" } */
 /* { dg-add-options arm_neon } */
 
diff --git a/gcc/testsuite/gcc.target/arm/aapcs/neon-vect1.c 
b/gcc/testsuite/gcc.target/arm/aapcs/neon-vect1.c
index 64f9466..1a85761 100644
--- a/gcc/testsuite/gcc.target/arm/aapcs/neon-vect1.c
+++ b/gcc/testsuite/gcc.target/arm/aapcs/neon-vect1.c
@@ -1,8 +1,9 @@
 /* Test AAPCS layout (VFP variant for Neon types) */
 
 /* { dg-do run { target arm_eabi } } */
-/* { dg-require-effective-target arm_hard_vfp_ok  } */
-/* { dg-require-effective-target arm_neon_ok  } */
+/* { dg-require-effective-target arm_hard_vfp_ok } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-require-effective-target arm_neon_hw } */
 /* { dg-require-effective-target arm32 } */
 /* { dg-add-options arm_neon } */
 
diff --git a/gcc/testsuite/gcc.target/arm/aapcs/neon-vect2.c 
b/gcc/testsuite/gcc.target/arm/aapcs/neon-vect2.c
index f5d4609..66d73ce 100644
--- a/gcc/testsuite/gcc.target/arm/aapcs/neon-vect2.c
+++ b/gcc/testsuite/gcc.target/arm/aapcs/neon-vect2.c
@@ -3,6 +3,7 @@
 /* { dg-do run { target arm_eabi } } */
 /* { dg-require-effective-target arm_hard_vfp_ok } */
 /* { dg-require-effective-target arm_neon_ok } */
+/* { dg-require-effective-target arm_neon_hw } */
 /* { dg-require-effective-target arm32 } */
 /* { dg-add-options arm_neon } */
 
diff --git a/gcc/testsuite/gcc.target/arm/aapcs/neon-vect3.c 
b/gcc/testsuite/gcc.target/arm/aapcs/neon-vect3.c
index 31fb1da..38c04ab 100644
--- a/gcc/testsuite/gcc.target/arm/aapcs/neon-vect3.c
+++ b/gcc/testsuite/gcc.target/arm/aapcs/neon-vect3.c
@@ -3,6 +3,7 @@
 /* { dg-do run { target arm_eabi } } */
 /* { dg-require-effective-target arm_hard_vfp_ok } */
 /* { dg-require-effective-target arm_neon_ok } */
+/* { dg-require-effective-target arm_neon_hw } */
 /* { dg-require-effective-target arm32 } */
 /* { dg-add-options arm_neon } */
 
diff --git a/gcc/testsuite/gcc.target/arm/aapcs/neon-vect4.c 
b/gcc/testsuite/gcc.target/arm/aapcs/neon-vect4.c
index bfefccc..1e6a0a5 100644
--- a/gcc/testsuite/gcc.target/arm/aapcs/neon-vect4.c
+++ b/gcc/testsuite/gcc.target/arm/aapcs/neon-vect4.c
@@ -1,8 +1,9 @@
 /* Test AAPCS layout (VFP variant for Neon types) */
 
 /* { dg-do run { target arm_eabi } } */
-/* { dg-require-effective-target arm_hard_vfp_ok  } */
+/* { dg-require-effective-target arm_hard_vfp_ok } */
 /* { dg-require-effective-target arm_neon_ok } */
+/* { dg-require-effective-target arm_neon_hw } */
 /* { dg-require-effective-target arm32 } */
 /* { dg-add-options arm_neon } */
 
diff --git a/gcc/testsuite/gcc.target/arm/aapcs/neon-vect5.c 
b/gcc/testsuite/gcc.target/arm/aapcs/neon-vect5.c
index ff7a857..fd78be2 100644
--- a/gcc/testsuite/gcc.target/arm/aapcs/neon-vect5.c
+++ b/gcc/testsuite/gcc.target/arm/aapcs/neon-vect5.c
@@ -3,6 +3,7 @@
 /* { dg-do run { target arm_eabi } } */
 /* { dg-require-effective-target arm_hard_vfp_ok } */
 /* { dg-require-effective-target arm_neon_ok } */
+/* { dg-require-effective-target arm_neon_hw } */
 /* {

[PATCH, i386] Avoid 512-bit vector return constant for Intel AVX512 configuration

2017-09-28 Thread Shalnov, Sergey

Hi,
GCC uses a full 512-bit register to return a constant from a function.
The patch avoids 512-bit register usage if the "-mprefer-avx256" option is used.

2017-09-28  Sergey Shalnov  

gcc/
* config/i386/i386.md (*movsf_internal, *movdf_internal):
Return 256-bit AVX modes for TARGET_PREFER_AVX256.

gcc/testsuite/
* gcc.target/i386/avx512f-constant-float-return.c: New test.



0001-Avoid-useing-zmm-if-TARGET_PREFER_AVX256.patch
Description: 0001-Avoid-useing-zmm-if-TARGET_PREFER_AVX256.patch

Re: Avoid assembler warnings from AArch64 constructor/destructor priorities

2017-09-28 Thread Richard Earnshaw (lists)

On 28/09/17 13:31, Joseph Myers wrote:
> Many GCC tests fail for AArch64 with current binutils because of
> assembler warnings of the form "Warning: ignoring incorrect section
> type for .init_array.00100".  The same issue was fixed for ARM in
> r247015 by using SECTION_NOTYPE when creating those sections; this
> patch applies the same fix to AArch64.
> 
> Tested with no regressions with cross to aarch64-linux-gnu.  OK to
> commit?
> 
> 2017-09-28  Joseph Myers  
> 
>   * config/aarch64/aarch64.c (aarch64_elf_asm_constructor)
>   (aarch64_elf_asm_destructor): Pass SECTION_NOTYPE to get_section
>   when creating .init_array and .fini_array sections with priority
>   specified.
> 

OK.

R.

> Index: gcc/config/aarch64/aarch64.c
> ===
> --- gcc/config/aarch64/aarch64.c  (revision 253248)
> +++ gcc/config/aarch64/aarch64.c  (working copy)
> @@ -6095,7 +6095,7 @@ aarch64_elf_asm_constructor (rtx symbol, int prior
>   -Wformat-truncation false positive, use a larger size.  */
>char buf[23];
>snprintf (buf, sizeof (buf), ".init_array.%.5u", priority);
> -  s = get_section (buf, SECTION_WRITE, NULL);
> +  s = get_section (buf, SECTION_WRITE | SECTION_NOTYPE, NULL);
>switch_to_section (s);
>assemble_align (POINTER_SIZE);
>assemble_aligned_integer (POINTER_BYTES, symbol);
> @@ -6115,7 +6115,7 @@ aarch64_elf_asm_destructor (rtx symbol, int priori
>   -Wformat-truncation false positive, use a larger size.  */
>char buf[23];
>snprintf (buf, sizeof (buf), ".fini_array.%.5u", priority);
> -  s = get_section (buf, SECTION_WRITE, NULL);
> +  s = get_section (buf, SECTION_WRITE | SECTION_NOTYPE, NULL);
>switch_to_section (s);
>assemble_align (POINTER_SIZE);
>assemble_aligned_integer (POINTER_BYTES, symbol);
>

[RFA gfortran] PR 25071: dummy argument larger than actual argument

2017-09-28 Thread Dominique d'Humières

Hi all,

In the PR there was some consensus to turn the warnings into errors. 

This is what the patch does: the warnings are kept with -std=legacy, errors are 
emitted otherwise.
I am sure a better solution may exist, but I did not find one.

While regtesting I have found several regressions on top of 
gfortran.dg/warn_argument_mismatch_1.f90

At this point, I had two options: use -std=legacy everywhere or replace the 
dg-warning with dg-error. I have chosen the latter except for 
warn_argument_mismatch_1.f90.

Regtested on x86_64-apple-darwin16.

Comments are welcome and I’ll provide the change logs once the dust has settled.

TIA

Dominique



patch-25071b
Description: Binary data

Re: [PATCH] libstdc++: istreambuf_iterator keep attached streambuf

2017-09-28 Thread Jonathan Wakely


On 28/09/17 15:06 +0300, Petr Ovtchenkov wrote:

On Thu, 28 Sep 2017 11:34:25 +0100
Jonathan Wakely  wrote:


On 23/09/17 09:54 +0300, Petr Ovtchenkov wrote:
>istreambuf_iterator should not forget about the attached
>streambuf when it reaches EOF.
>
>Checks in debug mode no longer influence character
>extraction in the istreambuf_iterator increment operators.
>In this respect, behaviour in debug and non-debug mode
>is now similar.
>
>Test for detached streambuf in istreambuf_iterator:
>When istreambuf_iterator reaches EOF of the istream, it should not
>forget about the attached streambuf.
>From the fact "EOF in stream reached" it does not follow that
>the stream has reached end of life and that further input
>operations are impossible.
>---
> libstdc++-v3/include/bits/streambuf_iterator.h | 41 +++
> .../24_iterators/istreambuf_iterator/3.cc  | 61 ++
> 2 files changed, 80 insertions(+), 22 deletions(-)
> create mode 100644 
libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/3.cc
>
>diff --git a/libstdc++-v3/include/bits/streambuf_iterator.h
>b/libstdc++-v3/include/bits/streambuf_iterator.h index f0451b1..45c3d89 100644
>--- a/libstdc++-v3/include/bits/streambuf_iterator.h
>+++ b/libstdc++-v3/include/bits/streambuf_iterator.h
>@@ -136,12 +136,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>   istreambuf_iterator&
>   operator++()
>   {
>-   __glibcxx_requires_cond(!_M_at_eof(),
>+   __glibcxx_requires_cond(_M_sbuf,
>_M_message(__gnu_debug::__msg_inc_istreambuf)
>._M_iterator(*this));
>if (_M_sbuf)
>  {
>+#ifdef _GLIBCXX_DEBUG_PEDANTIC
>+   int_type _tmp =
>+#endif
>_M_sbuf->sbumpc();
>+#ifdef _GLIBCXX_DEBUG_PEDANTIC
>+   
__glibcxx_requires_cond(!traits_type::eq_int_type(_tmp,traits_type::eof()),
>+   _M_message(__gnu_debug::__msg_inc_istreambuf)
>+   ._M_iterator(*this));
>+#endif
>_M_c = traits_type::eof();
>  }
>return *this;
>@@ -151,14 +159,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>   istreambuf_iterator
>   operator++(int)
>   {
>-   __glibcxx_requires_cond(!_M_at_eof(),
>+_M_get();
>+   __glibcxx_requires_cond(_M_sbuf
>+   && 
!traits_type::eq_int_type(_M_c,traits_type::eof()),
>_M_message(__gnu_debug::__msg_inc_istreambuf)
>._M_iterator(*this));
>
>istreambuf_iterator __old = *this;
>if (_M_sbuf)
>  {
>-   __old._M_c = _M_sbuf->sbumpc();
>+   _M_sbuf->sbumpc();
>_M_c = traits_type::eof();
>  }
>return __old;
>@@ -177,18 +187,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>   _M_get() const
>   {
>const int_type __eof = traits_type::eof();
>-   int_type __ret = __eof;
>-   if (_M_sbuf)
>- {
>-   if (!traits_type::eq_int_type(_M_c, __eof))
>- __ret = _M_c;
>-   else if (!traits_type::eq_int_type((__ret = _M_sbuf->sgetc()),
>-  __eof))
>- _M_c = __ret;
>-   else
>- _M_sbuf = 0;
>- }
>-   return __ret;
>+   if (_M_sbuf && traits_type::eq_int_type(_M_c, __eof))
>+  _M_c = _M_sbuf->sgetc();
>+   return _M_c;
>   }
>
>   bool
>@@ -339,7 +340,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>   typedef typename __is_iterator_type::streambuf_type  streambuf_type;
>   typedef typename traits_type::int_type   int_type;
>
>-  if (__first._M_sbuf && !__last._M_sbuf)
>+  if (__first._M_sbuf && (__last == istreambuf_iterator<_CharT>()))
>{
>  streambuf_type* __sb = __first._M_sbuf;
>  int_type __c = __sb->sgetc();
>@@ -374,7 +375,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>   typedef typename __is_iterator_type::streambuf_type  streambuf_type;
>   typedef typename traits_type::int_type   int_type;
>
>-  if (__first._M_sbuf && !__last._M_sbuf)
>+  if (__first._M_sbuf && (__last == istreambuf_iterator<_CharT>()))
>{
>  const int_type __ival = traits_type::to_int_type(__val);
>  streambuf_type* __sb = __first._M_sbuf;
>@@ -395,11 +396,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  else
>__c = __sb->snextc();
>}
>-
>- if (!traits_type::eq_int_type(__c, traits_type::eof()))
>-   __first._M_c = __c;
>- else
>-   __first._M_sbuf = 0;
>+ __first._M_c = __c;
>}
>   return __first;
> }
>diff --git a/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/3.cc
>b/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/3.cc new file mode 
100644
>index 000..803ede4
>--- /dev/null
>+++ b/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/3.cc
>@@ -0,0 +1,61 @@
>+// { dg-options "-std=gnu++17" }
>+
>+// Copyright (C) 2017 Free Software Foundation, Inc.
>+//
>+// This file is part of the GNU ISO C++ Library.  This library is free
>+// software; you can redistribute it and/or modify it under the
>+//

Avoid assembler warnings from AArch64 constructor/destructor priorities

2017-09-28 Thread Joseph Myers

Many GCC tests fail for AArch64 with current binutils because of
assembler warnings of the form "Warning: ignoring incorrect section
type for .init_array.00100".  The same issue was fixed for ARM in
r247015 by using SECTION_NOTYPE when creating those sections; this
patch applies the same fix to AArch64.
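
For readers unfamiliar with where such sections come from: they are
produced for constructors that carry an explicit priority.  The sketch
below is an illustration only, assuming a GCC-compatible compiler on an ELF
target; the function and variable names are made up for the example.  Going
by the ".init_array.%.5u" format string in the patch context, these two
functions would be expected to land in .init_array.00201 and
.init_array.00202.

```cpp
#include <cassert>

// Illustration only: records the order in which the two constructors ran.
static int g_order[2];
static int g_count;

// Defined first in the source, but given the larger priority number, so it
// is expected to run second.
__attribute__ ((constructor (202))) static void
run_later ()
{
  g_order[g_count++] = 202;
}

// Smaller priority number: expected to run first.
__attribute__ ((constructor (201))) static void
run_earlier ()
{
  g_order[g_count++] = 201;
}

// Hypothetical accessors so the recorded order can be inspected later.
int
ctor_count ()
{
  return g_count;
}

int
ctor_at (int i)
{
  assert (i >= 0 && i < g_count);
  return g_order[i];
}
```

Priorities up to 100 are reserved for the implementation, which is why the
example uses values above that range.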

Tested with no regressions with cross to aarch64-linux-gnu.  OK to
commit?

2017-09-28  Joseph Myers  

* config/aarch64/aarch64.c (aarch64_elf_asm_constructor)
(aarch64_elf_asm_destructor): Pass SECTION_NOTYPE to get_section
when creating .init_array and .fini_array sections with priority
specified.

Index: gcc/config/aarch64/aarch64.c
===
--- gcc/config/aarch64/aarch64.c(revision 253248)
+++ gcc/config/aarch64/aarch64.c(working copy)
@@ -6095,7 +6095,7 @@ aarch64_elf_asm_constructor (rtx symbol, int prior
  -Wformat-truncation false positive, use a larger size.  */
   char buf[23];
   snprintf (buf, sizeof (buf), ".init_array.%.5u", priority);
-  s = get_section (buf, SECTION_WRITE, NULL);
+  s = get_section (buf, SECTION_WRITE | SECTION_NOTYPE, NULL);
   switch_to_section (s);
   assemble_align (POINTER_SIZE);
   assemble_aligned_integer (POINTER_BYTES, symbol);
@@ -6115,7 +6115,7 @@ aarch64_elf_asm_destructor (rtx symbol, int priori
  -Wformat-truncation false positive, use a larger size.  */
   char buf[23];
   snprintf (buf, sizeof (buf), ".fini_array.%.5u", priority);
-  s = get_section (buf, SECTION_WRITE, NULL);
+  s = get_section (buf, SECTION_WRITE | SECTION_NOTYPE, NULL);
   switch_to_section (s);
   assemble_align (POINTER_SIZE);
   assemble_aligned_integer (POINTER_BYTES, symbol);

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: Make tests less istreambuf_iterator implementation dependent

2017-09-28 Thread Jonathan Wakely


On 27/09/17 22:16 +0200, François Dumont wrote:

Hi

    I just committed attached patch as trivial.

    Those tests were highly istreambuf_iterator implementation 
dependent; it is the result of the call to money_get<>::get that points 
immediately beyond the last character recognized, to quote the 
Standard's wording.


But according to the standard's specification for istreambuf_iterator
it makes no difference, because both iterators point to the same
streambuf and share the state.
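
A minimal sketch of that shared state, using only standard library types
(the helper name read_via_second_iterator is made up for this example):
two iterators are attached to the same streambuf, one is incremented, and
the other then observes the next character.

```cpp
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: attaches two iterators to one streambuf, advances
// the first, and returns what the second one sees.
char
read_via_second_iterator (const std::string &text)
{
  assert (text.size () >= 2);
  std::istringstream in (text);
  std::istreambuf_iterator<char> first (in);
  std::istreambuf_iterator<char> second (in);
  ++first;          // consumes one character from the shared streambuf
  return *second;   // reads the streambuf's current character
}
```

For the input "abc" the second iterator yields 'b', not 'a', because the
increment of the first iterator moved the underlying streambuf.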

Re: [libgomp, testsuite] Remove superfluous -fopenmp from libgomp testcases

2017-09-28 Thread Tom de Vries


On 09/28/2017 10:37 AM, Thomas Schwinge wrote:

Hi Tom!

On Thu, 28 Sep 2017 09:41:15 +0200, Tom de Vries  wrote:

On 11/07/2013 09:11 AM, Jakub Jelinek wrote:

On Wed, Nov 06, 2013 at 08:42:16PM +0100, tho...@codesourcery.com wrote:

From: Thomas Schwinge 

libgomp/
* testsuite/lib/libgomp.exp (libgomp_init): Don't add -fopenmp to
ALWAYS_CFLAGS.
* testsuite/libgomp.c++/c++.exp (ALWAYS_CFLAGS): Add -fopenmp.
* testsuite/libgomp.c/c.exp (ALWAYS_CFLAGS): Likewise.
* testsuite/libgomp.fortran/fortran.exp (ALWAYS_CFLAGS): Likewise.
* testsuite/libgomp.graphite/graphite.exp (ALWAYS_CFLAGS):
Likewise.


Note that my patch just moved *where* the flag gets set, so...


Following up on this, how about we drop the now superfluous -fopenmp in
current test-cases?


... it has already been superfluous before.  ;-)

Anyway: ACK conceptually.


Tested on x86_64. Verified by analyzing libgomp.log that -fopenmp is
still passed to test-cases as required.

OK for trunk?



--- a/libgomp/testsuite/libgomp.c++/for-12.C
+++ b/libgomp/testsuite/libgomp.c++/for-12.C
@@ -1,5 +1,3 @@
-/* { dg-options "-fopenmp" } */


As far as I remember, this means that instead of "-fopenmp" the
"DEFAULT_CFLAGS" will then be used: "-O2", so this effectively changes
testing from "-O2" to "-O2".


I think you mean from "-O0" to "-O2", but yes.

I was explicit about this in an earlier commit ( 
https://gcc.gnu.org/ml/gcc-patches/2015-06/msg00487.html ), but forgot 
again about it here.


Anyway, I think we only care about preserving explicit "-O0" settings, 
so this should be ok.



With that fixed: Reviewed-by: Thomas Schwinge 


You just got the first review credits in the commit log since 2004 ;)

Thanks,
- Tom

Re: [PATCH] libstdc++: istreambuf_iterator keep attached streambuf

2017-09-28 Thread Petr Ovtchenkov

On Thu, 28 Sep 2017 11:34:25 +0100
Jonathan Wakely  wrote:

> On 23/09/17 09:54 +0300, Petr Ovtchenkov wrote:
> >istreambuf_iterator should not forget about the attached
> >streambuf when it reaches EOF.
> >
> >Checks in debug mode no longer influence character
> >extraction in the istreambuf_iterator increment operators.
> >In this respect, behaviour in debug and non-debug mode
> >is now similar.
> >
> >Test for detached streambuf in istreambuf_iterator:
> >When istreambuf_iterator reaches EOF of the istream, it should not
> >forget about the attached streambuf.
> >From the fact "EOF in stream reached" it does not follow that
> >the stream has reached end of life and that further input
> >operations are impossible.
> >---
> > libstdc++-v3/include/bits/streambuf_iterator.h | 41 +++
> > .../24_iterators/istreambuf_iterator/3.cc  | 61 
> > ++
> > 2 files changed, 80 insertions(+), 22 deletions(-)
> > create mode 100644 
> > libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/3.cc
> >
> >diff --git a/libstdc++-v3/include/bits/streambuf_iterator.h
> >b/libstdc++-v3/include/bits/streambuf_iterator.h index f0451b1..45c3d89 
> >100644
> >--- a/libstdc++-v3/include/bits/streambuf_iterator.h
> >+++ b/libstdc++-v3/include/bits/streambuf_iterator.h
> >@@ -136,12 +136,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >   istreambuf_iterator&
> >   operator++()
> >   {
> >-__glibcxx_requires_cond(!_M_at_eof(),
> >+__glibcxx_requires_cond(_M_sbuf,
> > _M_message(__gnu_debug::__msg_inc_istreambuf)
> > ._M_iterator(*this));
> > if (_M_sbuf)
> >   {
> >+#ifdef _GLIBCXX_DEBUG_PEDANTIC
> >+int_type _tmp =
> >+#endif
> > _M_sbuf->sbumpc();
> >+#ifdef _GLIBCXX_DEBUG_PEDANTIC
> >+
> >__glibcxx_requires_cond(!traits_type::eq_int_type(_tmp,traits_type::eof()),
> >+
> >_M_message(__gnu_debug::__msg_inc_istreambuf)
> >+._M_iterator(*this));
> >+#endif
> > _M_c = traits_type::eof();
> >   }
> > return *this;
> >@@ -151,14 +159,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >   istreambuf_iterator
> >   operator++(int)
> >   {
> >-__glibcxx_requires_cond(!_M_at_eof(),
> >+_M_get();
> >+__glibcxx_requires_cond(_M_sbuf
> >+&& 
> >!traits_type::eq_int_type(_M_c,traits_type::eof()),
> > _M_message(__gnu_debug::__msg_inc_istreambuf)
> > ._M_iterator(*this));
> >
> > istreambuf_iterator __old = *this;
> > if (_M_sbuf)
> >   {
> >-__old._M_c = _M_sbuf->sbumpc();
> >+_M_sbuf->sbumpc();
> > _M_c = traits_type::eof();
> >   }
> > return __old;
> >@@ -177,18 +187,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >   _M_get() const
> >   {
> > const int_type __eof = traits_type::eof();
> >-int_type __ret = __eof;
> >-if (_M_sbuf)
> >-  {
> >-if (!traits_type::eq_int_type(_M_c, __eof))
> >-  __ret = _M_c;
> >-else if (!traits_type::eq_int_type((__ret = _M_sbuf->sgetc()),
> >-   __eof))
> >-  _M_c = __ret;
> >-else
> >-  _M_sbuf = 0;
> >-  }
> >-return __ret;
> >+if (_M_sbuf && traits_type::eq_int_type(_M_c, __eof))
> >+  _M_c = _M_sbuf->sgetc();
> >+return _M_c;
> >   }
> >
> >   bool
> >@@ -339,7 +340,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >   typedef typename __is_iterator_type::streambuf_type  streambuf_type;
> >   typedef typename traits_type::int_type   int_type;
> >
> >-  if (__first._M_sbuf && !__last._M_sbuf)
> >+  if (__first._M_sbuf && (__last == istreambuf_iterator<_CharT>()))
> > {
> >   streambuf_type* __sb = __first._M_sbuf;
> >   int_type __c = __sb->sgetc();
> >@@ -374,7 +375,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >   typedef typename __is_iterator_type::streambuf_type  streambuf_type;
> >   typedef typename traits_type::int_type   int_type;
> >
> >-  if (__first._M_sbuf && !__last._M_sbuf)
> >+  if (__first._M_sbuf && (__last == istreambuf_iterator<_CharT>()))
> > {
> >   const int_type __ival = traits_type::to_int_type(__val);
> >   streambuf_type* __sb = __first._M_sbuf;
> >@@ -395,11 +396,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >   else
> > __c = __sb->snextc();
> > }
> >-
> >-  if (!traits_type::eq_int_type(__c, traits_type::eof()))
> >-__first._M_c = __c;
> >-  else
> >-__first._M_sbuf = 0;
> >+  __first._M_c = __c;
> > }
> >   return __first;
> > }
> >diff --git a/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/3.cc
> >b/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/3.cc new file mode 
> >100644
> >index 000..803ede4
> >--- /dev/null
> >+++
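
The point made in the commit message above (reaching EOF on a streambuf does not mean the streambuf is finished) can be sketched as follows.  This is a minimal illustration, not part of the patch: the helper names drain and read_after_eof are hypothetical, and a std::stringstream stands in for any stream whose buffer can receive more data later.

```cpp
#include <cassert>
#include <iterator>
#include <sstream>
#include <streambuf>
#include <string>

// Read everything that is currently available from the buffer.
// The streambuf interface is used directly, so the result does not depend
// on how a given istreambuf_iterator implementation reacts to EOF.
std::string drain(std::streambuf* sb)
{
  assert(sb != nullptr);
  std::string out;
  while (sb->sgetc() != std::char_traits<char>::eof())
    out.push_back(static_cast<char>(sb->sbumpc()));
  return out;
}

// Return what becomes readable after the buffer has already reported EOF.
std::string read_after_eof()
{
  std::stringstream ss;
  ss << "ab";
  const std::string first = drain(ss.rdbuf());  // reads "ab", then sees EOF
  assert(first == "ab");

  ss.rdbuf()->sputn("cd", 2);                   // the buffer receives more data

  // A fresh iterator on the same streambuf sees the new characters.
  std::istreambuf_iterator<char> it(ss.rdbuf()), end;
  return std::string(it, end);
}
```

Calling read_after_eof() should yield "cd": the second read succeeds even though the first one ran into EOF on the same streambuf.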

Re: [PATCH 4/5] New target check: vect_nopeel - v2

2017-09-28 Thread Andreas Krebbel

On 09/27/2017 07:30 PM, Sandra Loosemore wrote:
> On 09/27/2017 03:05 AM, Rainer Orth wrote:
>> Hi Andreas,
>>
>>> On 09/27/2017 10:10 AM, Rainer Orth wrote:
>>>> Hi Andreas,
>>>>
>>>>> On 09/26/2017 02:26 PM, Rainer Orth wrote:
>>>>>> Hi Andreas,
>>>>>>
>>>>>>> diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
>>>>>>> index 307c726..3acfd85 100644
>>>>>>> --- a/gcc/doc/sourcebuild.texi
>>>>>>> +++ b/gcc/doc/sourcebuild.texi
>>>>>>> @@ -1398,6 +1398,9 @@ Target supports a vector misalign access.
>>>>>>>   @item vect_no_align
>>>>>>>   Target does not support a vector alignment mechanism.
>>>>>>>
>>>>>>> +@item vect_no_peel
>>>>>>> +Target does not require any loop peeling for alignment purposes.
>>>>>>> +
>>>>>>>   @item vect_no_int_min_max
>>>>>>>   Target does not support a vector min and max instruction on @code{int}.
>>>>>>
>>>>>> please keep the items sorted alphabetically.
>>>>>
>>>>> The items do not appear to be sorted alphabetically.
>>>>
>>>> they should be.  Your patch makes the ordering even more random.
>>>>
>>>> Patch to fix this preapproved ;-)
>>> The items rather appear to be arranged by subject. Does it really make
>>> sense to pull items like this apart just to have them in alphabetical
>>> order?
>>>
>>> @item vect_intfloat_cvt
>>> Target supports conversion from @code{signed int} to @code{float}.
>>>
>>> @item vect_uintfloat_cvt
>>> Target supports conversion from @code{unsigned int} to @code{float}.
>>>
>>> @item vect_floatint_cvt
>>> Target supports conversion from @code{float} to @code{signed int}.
>>>
>>> @item vect_floatuint_cvt
>>> Target supports conversion from @code{float} to @code{unsigned int}.
>>>
>>>
>>> I've added the no_peel item intentionally to the hw_misalign/no_align block.
>>
>> granted, there are some attempts at that, but I find it hard to make my
>> way through that longish list.  The way it is, you have to skip through
>> the whole list beginning to end.  Texinfo seems to have no subsubsection
>> which would allow to make the sub-grouping explicit...
>>
>> Let's hear what Sandra thinks.
> 
> Um.  There is no common convention in the GCC documentation and other 
> parts of the manual do deliberately diverge from alphabetization in 
> places.  There's a perpetual tension between putting the most 
> commonly-needed information first vs grouping things by related concepts 
> vs alphabetize vs the tendency of people to insert new items at random 
> places in an existing list regardless of how it's previously been 
> organized.  :-(
> 
> Alphabetical lists are useful when you already know the name of the 
> thing you are searching for, but almost everybody reads the 
> documentation in a web browser or PDF viewer with a search feature 
> nowadays so you can find the term no matter how the list is sorted.  So 
> I'd say we shouldn't alphabetize as a matter of policy if there is some 
> other organization that makes sense.
> 
> In this case, the section is already broken into multiple sublists by 
> topic, most of the sublists are fairly short, and where there's some 
> discernible sort order within the sublists, it seems to be grouping 
> related things together rather than alphabetical.  So I wouldn't insist 
> on alphabetizing this particular sublist either.
> 
> -Sandra

Ok thanks for the clarification. I'll try to fit the documentation updates
into the existing structure.  Updated patchset here:
https://gcc.gnu.org/ml/gcc-patches/2017-09/msg01862.html

Bye,

-Andreas-

Re: [PATCH 4/5] New target check: vect_nopeel - v2

2017-09-28 Thread Andreas Krebbel

On 09/26/2017 06:49 PM, Richard Sandiford wrote:
> Andreas Krebbel  writes:
...
> Sorry for the bikeshedding, but how about having a positive test
> like vect_can_peel instead?  ! vect_no... can be hard to read in
> complex conditions.  (There's already that problem with existing
> vect_no...s.)

Done. Updated patch here: 
https://gcc.gnu.org/ml/gcc-patches/2017-09/msg01867.html

Bye,

-Andreas-

[PATCH 5/5] New target check: vect_can_peel

2017-09-28 Thread Andreas Krebbel

gcc/ChangeLog:

2017-09-28  Andreas Krebbel  

* doc/sourcebuild.texi: Document vect_can_peel.

gcc/testsuite/ChangeLog:

2017-09-28  Andreas Krebbel  

* g++.dg/vect/slp-pr56812.cc: xfail for targets which don't want
vector loop peeling.
* lib/target-supports.exp (check_effective_target_vect_can_peel):
New proc.
---
 gcc/doc/sourcebuild.texi |  3 +++
 gcc/testsuite/g++.dg/vect/slp-pr56812.cc |  4 +++-
 gcc/testsuite/lib/target-supports.exp| 22 ++
 3 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index e09bca1..01d8595 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1404,6 +1404,9 @@ Target supports a vector misalign access.
 @item vect_no_align
 Target does not support a vector alignment mechanism.
 
+@item vect_can_peel
+Target might require to peel loops for alignment purposes.
+
 @item vect_no_int_min_max
 Target does not support a vector min and max instruction on @code{int}.
 
diff --git a/gcc/testsuite/g++.dg/vect/slp-pr56812.cc 
b/gcc/testsuite/g++.dg/vect/slp-pr56812.cc
index 80bdcdd..7d1cd71 100644
--- a/gcc/testsuite/g++.dg/vect/slp-pr56812.cc
+++ b/gcc/testsuite/g++.dg/vect/slp-pr56812.cc
@@ -17,4 +17,6 @@ void mydata::Set (float x)
 data[i] = x;
 }
 
-/* { dg-final { scan-tree-dump-times "basic block vectorized" 1 "slp1" } } */
+/* For targets without vector loop peeling the loop becomes cheap
+   enough to be vectorized.  */
+/* { dg-final { scan-tree-dump-times "basic block vectorized" 1 "slp1" { xfail 
{ ! vect_can_peel } } } } */
diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 4f752ec2..49a7aef 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -3277,6 +3277,28 @@ proc check_effective_target_vect_floatuint_cvt { } {
 return $et_vect_floatuint_cvt_saved($et_index)
 }
 
+# Return 1 if peeling for alignment might be profitable on the target
+#
+
+proc check_effective_target_vect_can_peel { } {
+global et_vect_can_peel_saved
+global et_index
+
+if [info exists et_vect_can_peel_saved($et_index)] {
+   verbose "check_effective_target_vect_can_peel: using cached result" 2
+} else {
+   set et_vect_can_peel_saved($et_index) 1
+if { ([istarget s390*-*-*]
+ && [check_effective_target_s390_vx]) } {
+   set et_vect_can_peel_saved($et_index) 0
+}
+}
+
+verbose "check_effective_target_vect_can_peel:\
+returning $et_vect_can_peel_saved($et_index)" 2
+return $et_vect_can_peel_saved($et_index)
+}
+
 # Return 1 if the target supports #pragma omp declare simd, 0 otherwise.
 #
 # This won't change for different subtargets so cache the result.
-- 
2.9.1
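
For readers unfamiliar with the term, "peeling for alignment" can be sketched by hand as below.  This is only an illustration of the transformation the vectorizer may apply, under assumptions that are not in the patch: a 16-byte vector width, a 4-byte float, and the hypothetical name fill_with_peeling.  On a target where vect_can_peel is 0 the prologue would simply not be needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Fill data[0..n) with x.  Scalar iterations are "peeled" off the front
// until the current address is 16-byte aligned.  Returns the number of
// peeled (prologue) iterations.
std::size_t fill_with_peeling(float* data, std::size_t n, float x)
{
  assert(data != nullptr || n == 0);
  const std::size_t kAlign = 16;                      // assumed vector width in bytes
  const std::size_t kLanes = kAlign / sizeof(float);  // elements per vector

  std::size_t i = 0;

  // Prologue: the peeled iterations.
  while (i < n && reinterpret_cast<std::uintptr_t>(data + i) % kAlign != 0)
    data[i++] = x;
  const std::size_t peeled = i;

  // Main loop: every chunk starts at an aligned address, which is what a
  // vectorizer wants on targets where misaligned vector accesses are costly.
  for (; i + kLanes <= n; i += kLanes)
    for (std::size_t lane = 0; lane < kLanes; ++lane)
      data[i + lane] = x;

  // Epilogue: leftover elements that do not fill a whole vector.
  for (; i < n; ++i)
    data[i] = x;

  return peeled;
}
```

The extra prologue and epilogue are part of the cost the vectorizer has to weigh; without them a short loop can become cheap enough to vectorize, which is the situation the xfail in slp-pr56812.cc accounts for.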

[PATCH 4/5] New target check for double<->int conversions

2017-09-28 Thread Andreas Krebbel

gcc/ChangeLog:

2017-09-28  Andreas Krebbel  

* doc/sourcebuild.texi: Document vect_intdouble_cvt and
vect_doubleint_cvt.

gcc/testsuite/ChangeLog:

2017-09-28  Andreas Krebbel  

* gcc.dg/vect/pr66251.c: Replace vect_floatint_cvt with
vect_doubleint_cvt and vect_intfloat_cvt with vect_intdouble_cvt.
* gcc.dg/vect/vect-floatint-conversion-2.c: Replace
vect_floatint_cvt with vect_doubleint_cvt.
* gcc.dg/vect/vect-intfloat-conversion-3.c: Replace
vect_intfloat_cvt with vect_intdouble_cvt.
* gfortran.dg/vect/pr60510.f: Require vect_intdouble_cvt.
* gfortran.dg/vect/vect-8.f90: Make number of vectorized loops
depend on vect_intdouble_cvt.
* lib/target-supports.exp
(check_effective_target_vect_doubleint_cvt)
(check_effective_target_vect_intdouble_cvt): New procs.
---
 gcc/doc/sourcebuild.texi   |  6 +++
 gcc/testsuite/gcc.dg/vect/pr66251.c|  4 +-
 .../gcc.dg/vect/vect-floatint-conversion-2.c   |  2 +-
 .../gcc.dg/vect/vect-intfloat-conversion-3.c   |  2 +-
 gcc/testsuite/gfortran.dg/vect/pr60510.f   |  1 +
 gcc/testsuite/gfortran.dg/vect/vect-8.f90  |  3 +-
 gcc/testsuite/lib/target-supports.exp  | 62 ++
 7 files changed, 75 insertions(+), 5 deletions(-)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 4f25268..e09bca1 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1508,6 +1508,12 @@ Target supports conversion from @code{float} to 
@code{signed int}.
 @item vect_floatuint_cvt
 Target supports conversion from @code{float} to @code{unsigned int}.
 
+@item vect_intdouble_cvt
+Target supports conversion from @code{signed int} to @code{double}.
+
+@item vect_doubleint_cvt
+Target supports conversion from @code{double} to @code{signed int}.
+
 @item vect_max_reduc
 Target supports max reduction for vectors.
 @end table
diff --git a/gcc/testsuite/gcc.dg/vect/pr66251.c 
b/gcc/testsuite/gcc.dg/vect/pr66251.c
index 7f0c4bc..26afbc9 100644
--- a/gcc/testsuite/gcc.dg/vect/pr66251.c
+++ b/gcc/testsuite/gcc.dg/vect/pr66251.c
@@ -1,7 +1,7 @@
 /* { dg-require-effective-target vect_int } */
 /* { dg-require-effective-target vect_double } */
-/* { dg-require-effective-target vect_floatint_cvt } */
-/* { dg-require-effective-target vect_intfloat_cvt } */
+/* { dg-require-effective-target vect_doubleint_cvt } */
+/* { dg-require-effective-target vect_intdouble_cvt } */
 /* { dg-require-effective-target vect_pack_trunc } */
 /* { dg-require-effective-target vect_unpack } */
 /* { dg-require-effective-target vect_hw_misalign } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-floatint-conversion-2.c 
b/gcc/testsuite/gcc.dg/vect/vect-floatint-conversion-2.c
index 27d248b..64fab38 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-floatint-conversion-2.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-floatint-conversion-2.c
@@ -36,4 +36,4 @@ main (void)
   return main1 ();
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target 
vect_floatint_cvt } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target 
vect_doubleint_cvt } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-intfloat-conversion-3.c 
b/gcc/testsuite/gcc.dg/vect/vect-intfloat-conversion-3.c
index 6eb4fec..78fc3da 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-intfloat-conversion-3.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-intfloat-conversion-3.c
@@ -35,4 +35,4 @@ int main (void)
   return main1 ();
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target 
vect_intfloat_cvt } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target 
vect_intdouble_cvt } } } */
diff --git a/gcc/testsuite/gfortran.dg/vect/pr60510.f 
b/gcc/testsuite/gfortran.dg/vect/pr60510.f
index 5e2c085..202c1be 100644
--- a/gcc/testsuite/gfortran.dg/vect/pr60510.f
+++ b/gcc/testsuite/gfortran.dg/vect/pr60510.f
@@ -1,5 +1,6 @@
 ! { dg-do run }
 ! { dg-require-effective-target vect_double }
+! { dg-require-effective-target vect_intdouble_cvt }
 ! { dg-additional-options "-fno-inline -ffast-math" }
   subroutine foo(a,x,y,n)
   implicit none
diff --git a/gcc/testsuite/gfortran.dg/vect/vect-8.f90 
b/gcc/testsuite/gfortran.dg/vect/vect-8.f90
index ec95598..8e18be5 100644
--- a/gcc/testsuite/gfortran.dg/vect/vect-8.f90
+++ b/gcc/testsuite/gfortran.dg/vect/vect-8.f90
@@ -704,4 +704,5 @@ CALL track('KERNEL  ')
 RETURN
 END SUBROUTINE kernel
 
-! { dg-final { scan-tree-dump-times "vectorized 21 loops" 1 "vect" } }
+! { dg-final { scan-tree-dump-times "vectorized 21 loops" 1 "vect" { target { 
vect_intdouble_cvt } } } }
+! { dg-final { scan-tree-dump-times "vectorized 17 loops" 1 "vect" { target { 
! vect_intdouble_cvt } } } }
diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp

[PATCH 3/5] New target check: vect_long_mult

2017-09-28 Thread Andreas Krebbel

We don't have a 64 bit vector integer multiply on z.  Add a specific
check for that.

gcc/ChangeLog:

2017-09-28  Andreas Krebbel  

* doc/sourcebuild.texi: Document vect_long_mult.

gcc/testsuite/ChangeLog:

2017-09-28  Andreas Krebbel  

* gcc.dg/vect/pr60656.c: Check vect_long_mult.
* lib/target-supports.exp (check_effective_target_vect_long_mult):
New proc.
---
 gcc/doc/sourcebuild.texi  |  3 +++
 gcc/testsuite/gcc.dg/vect/pr60656.c   |  3 ++-
 gcc/testsuite/lib/target-supports.exp | 24 
 3 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 56e1b4e..4f25268 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1422,6 +1422,9 @@ Target supports @code{vector short} multiplication.
 @item vect_int_mult
 Target supports @code{vector int} multiplication.
 
+@item vect_long_mult
+Target supports 64 bit @code{vector long} multiplication.
+
 @item vect_extract_even_odd
 Target supports vector even/odd element extraction.
 
diff --git a/gcc/testsuite/gcc.dg/vect/pr60656.c 
b/gcc/testsuite/gcc.dg/vect/pr60656.c
index d9e30bb..70ec0f6 100644
--- a/gcc/testsuite/gcc.dg/vect/pr60656.c
+++ b/gcc/testsuite/gcc.dg/vect/pr60656.c
@@ -43,4 +43,5 @@ int main()
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target 
vect_widen_mult_si_to_di_pattern } } } */
+/* P * P * P requires a widening multiplication first as well as a 
longxlong->long after that.  */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { 
vect_widen_mult_si_to_di_pattern && vect_long_mult } } } } */
diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 5949da4..539aaaf 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -6299,6 +6299,30 @@ proc check_effective_target_vect_int_mult { } {
 return $et_vect_int_mult_saved($et_index)
 }
 
+# Return 1 if the target supports 64 bit hardware vector
+# multiplication of long operands with a long result, 0 otherwise.
+#
+# This can change for different subtargets so do not cache the result.
+
+proc check_effective_target_vect_long_mult { } {
+if { [istarget i?86-*-*] || [istarget x86_64-*-*]
+|| (([istarget powerpc*-*-*]
+  && ![istarget powerpc-*-linux*paired*])
+  && [check_effective_target_ilp32])
+|| [is-effective-target arm_neon]
+|| ([istarget sparc*-*-*] && [check_effective_target_ilp32])
+|| [istarget aarch64*-*-*]
+|| ([istarget mips*-*-*]
+ && [et-is-effective-target mips_msa]) } {
+   set answer 1
+} else {
+   set answer 0
+}
+
+verbose "check_effective_target_vect_long_mult: returning $answer" 2
+return $answer
+}
+
 # Return 1 if the target supports vector even/odd elements extraction, 0 
otherwise.
 
 proc check_effective_target_vect_extract_even_odd { } {
-- 
2.9.1
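
The comment added to pr60656.c ("P * P * P requires a widening multiplication first as well as a longxlong->long after that") can be illustrated with a small loop.  This is a sketch of the two kinds of multiply involved, not the contents of the testcase; cube_all is a hypothetical name and the fixed-width types are chosen here only to make the operand sizes explicit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Compute out[i] = in[i]^3 using 64-bit arithmetic.
// Inputs are assumed small enough that the cube fits in 64 bits.
void cube_all(const std::int32_t* in, std::int64_t* out, std::size_t n)
{
  assert((in != nullptr && out != nullptr) || n == 0);
  for (std::size_t i = 0; i < n; ++i)
    {
      // First multiply: 32 x 32 -> 64 bits, i.e. a widening multiplication.
      const std::int64_t square = static_cast<std::int64_t>(in[i]) * in[i];

      // Second multiply: 64 x 64 -> 64 bits.  Vectorizing this step needs a
      // 64-bit vector multiply, which is what vect_long_mult checks for.
      out[i] = square * in[i];
    }
}
```

A target that has the widening pattern but no 64-bit vector multiply can handle only the first step, so the loop as a whole would not be expected to vectorize there.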

[PATCH 2/5] Testcases using dg-options require at least -mzarch.

2017-09-28 Thread Andreas Krebbel

Testcases which override the vect default options using dg-options
need at least -mzarch on S/390 32 bit.

gcc/testsuite/ChangeLog:

2017-09-28  Andreas Krebbel  

* gfortran.dg/vect/fast-math-mgrid-resid.f: Use -mzarch on S/390.
* gfortran.dg/vect/pr77848.f: Likewise.
---
 gcc/testsuite/gfortran.dg/vect/fast-math-mgrid-resid.f | 1 +
 gcc/testsuite/gfortran.dg/vect/pr77848.f   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/gcc/testsuite/gfortran.dg/vect/fast-math-mgrid-resid.f 
b/gcc/testsuite/gfortran.dg/vect/fast-math-mgrid-resid.f
index 54f1e9e..7e2816b 100644
--- a/gcc/testsuite/gfortran.dg/vect/fast-math-mgrid-resid.f
+++ b/gcc/testsuite/gfortran.dg/vect/fast-math-mgrid-resid.f
@@ -2,6 +2,7 @@
 ! { dg-require-effective-target vect_double }
 ! { dg-options "-O3 --param vect-max-peeling-for-alignment=0 
-fpredictive-commoning -fdump-tree-pcom-details" }
 ! { dg-additional-options "-mprefer-avx128" { target { i?86-*-* x86_64-*-* } } 
}
+! { dg-additional-options "-mzarch" { target { s390*-*-* } } }
 
 *** RESID COMPUTES THE RESIDUAL:  R = V - AU
 *
diff --git a/gcc/testsuite/gfortran.dg/vect/pr77848.f 
b/gcc/testsuite/gfortran.dg/vect/pr77848.f
index d54676e..4752205 100644
--- a/gcc/testsuite/gfortran.dg/vect/pr77848.f
+++ b/gcc/testsuite/gfortran.dg/vect/pr77848.f
@@ -1,6 +1,7 @@
 ! PR 77848: Verify versioning is on when vectorization fails
 ! { dg-do compile }
 ! { dg-options "-O3 -ffast-math -fdump-tree-ifcvt -fdump-tree-vect-details" }
+! { dg-additional-options "-mzarch" { target { s390*-*-* } } }
 
   subroutine sub(x,a,n,m)
   implicit none
-- 
2.9.1

[PATCH 1/5] Enable vect_float with S/390 VXE and adjust testcases

2017-09-28 Thread Andreas Krebbel

The target-supports routines provide vect_double and vect_float, but
these do not appear to be used consistently in the vect testcases.
With z13 we only have support for vector double, but with z14 also for
vector float.  This patch adds vect_float to the testcases using the
float data type and makes the vect_float target check return 1 only
on z14.

gcc/testsuite/ChangeLog:

2017-09-28  Andreas Krebbel  

* lib/target-supports.exp (check_effective_target_vect_float):
Return 1 on S/390 with VXE.
* gcc.dg/vect/pr31699.c: Require vect_float.
* gcc.dg/vect/pr61194.c: Likewise.
* gcc.dg/vect/pr65947-10.c: Likewise.
* gcc.dg/vect/pr66142.c: Likewise.
* gcc.dg/vect/slp-10.c: Likewise.
* gcc.dg/vect/slp-11c.c: Likewise.
* gcc.dg/vect/slp-12b.c: Likewise.
* gcc.dg/vect/slp-18.c: Likewise.
* gcc.dg/vect/slp-33.c: Likewise.
* gcc.dg/vect/slp-cond-2-big-array.c: Likewise.
* gcc.dg/vect/slp-cond-2.c: Likewise.
* gcc.dg/vect/vect-cond-10.c: Likewise.
* gcc.dg/vect/vect-cond-8.c: Likewise.
* gcc.dg/vect/vect-cond-9.c: Likewise.
* gcc.dg/vect/vect-float-extend-1.c: Likewise.
* gcc.dg/vect/vect-float-truncate-1.c: Likewise.
---
 gcc/testsuite/gcc.dg/vect/pr31699.c   | 2 +-
 gcc/testsuite/gcc.dg/vect/pr61194.c   | 1 +
 gcc/testsuite/gcc.dg/vect/pr65947-10.c| 1 +
 gcc/testsuite/gcc.dg/vect/pr66142.c   | 2 +-
 gcc/testsuite/gcc.dg/vect/slp-10.c| 1 +
 gcc/testsuite/gcc.dg/vect/slp-11c.c   | 1 +
 gcc/testsuite/gcc.dg/vect/slp-12b.c   | 1 +
 gcc/testsuite/gcc.dg/vect/slp-18.c| 1 +
 gcc/testsuite/gcc.dg/vect/slp-33.c| 1 +
 gcc/testsuite/gcc.dg/vect/slp-cond-2-big-array.c  | 2 ++
 gcc/testsuite/gcc.dg/vect/slp-cond-2.c| 2 ++
 gcc/testsuite/gcc.dg/vect/vect-cond-10.c  | 1 +
 gcc/testsuite/gcc.dg/vect/vect-cond-8.c   | 1 +
 gcc/testsuite/gcc.dg/vect/vect-cond-9.c   | 1 +
 gcc/testsuite/gcc.dg/vect/vect-float-extend-1.c   | 1 +
 gcc/testsuite/gcc.dg/vect/vect-float-truncate-1.c | 1 +
 gcc/testsuite/lib/target-supports.exp | 4 +++-
 17 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/pr31699.c 
b/gcc/testsuite/gcc.dg/vect/pr31699.c
index 59b8daa..7ec4dfe 100644
--- a/gcc/testsuite/gcc.dg/vect/pr31699.c
+++ b/gcc/testsuite/gcc.dg/vect/pr31699.c
@@ -1,4 +1,4 @@
-/* { dg-require-effective-target vect_double } */
+/* { dg-require-effective-target vect_float } */
 
 #include 
 #include 
diff --git a/gcc/testsuite/gcc.dg/vect/pr61194.c 
b/gcc/testsuite/gcc.dg/vect/pr61194.c
index f7c71b9..8421367 100644
--- a/gcc/testsuite/gcc.dg/vect/pr61194.c
+++ b/gcc/testsuite/gcc.dg/vect/pr61194.c
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_cond_mixed } */
+/* { dg-require-effective-target vect_float } */
 
 #include "tree-vect.h"
 
diff --git a/gcc/testsuite/gcc.dg/vect/pr65947-10.c 
b/gcc/testsuite/gcc.dg/vect/pr65947-10.c
index a8a674f..321cb8c 100644
--- a/gcc/testsuite/gcc.dg/vect/pr65947-10.c
+++ b/gcc/testsuite/gcc.dg/vect/pr65947-10.c
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_condition } */
+/* { dg-require-effective-target vect_float } */
 
 #include "tree-vect.h"
 
diff --git a/gcc/testsuite/gcc.dg/vect/pr66142.c 
b/gcc/testsuite/gcc.dg/vect/pr66142.c
index 94854ea..8c79f29 100644
--- a/gcc/testsuite/gcc.dg/vect/pr66142.c
+++ b/gcc/testsuite/gcc.dg/vect/pr66142.c
@@ -41,4 +41,4 @@ foo (float *a, float *b, float *c)
   *a = z;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 1 "vect" 
{ target vect_condition } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 1 "vect" 
{ target { vect_condition && vect_float } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/slp-10.c 
b/gcc/testsuite/gcc.dg/vect/slp-10.c
index 3395d22..61c5d3c 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-10.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-10.c
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_float } */
 
 #include 
 #include "tree-vect.h"
diff --git a/gcc/testsuite/gcc.dg/vect/slp-11c.c 
b/gcc/testsuite/gcc.dg/vect/slp-11c.c
index 8edd663..bdcf434 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-11c.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-11c.c
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_float } */
 
 #include 
 #include "tree-vect.h"
diff --git a/gcc/testsuite/gcc.dg/vect/slp-12b.c 
b/gcc/testsuite/gcc.dg/vect/slp-12b.c
index d6fe4e4..48e7865 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-12b.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-12b.c
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_uintfloat_cvt } */
+/* { dg-require-effective-target vect_float } */
 
 #include 
 #include "tree-vect.h"
diff --git

[PATCH 0/5] vect testsuite adjustments for S/390 - v2

2017-09-28 Thread Andreas Krebbel

Changes to last version:

- vect_mult_long renamed to vect_long_mult (since there is already vect_int_mult ...)
- vect_no_peel changed to vect_can_peel as suggested by Richard
- another two target checks added: vect_intdouble_cvt and vect_doubleint_cvt
- documentation for the new target checks added


Ok for mainline?

Andreas Krebbel (5):
  Enable vect_float with S/390 VXE and adjust testcases
  Testcases using dg-options require at least -mzarch.
  New target check: vect_long_mult
  New target check for double<->int conversions
  New target check: vect_can_peel

 gcc/doc/sourcebuild.texi   |  12 +++
 gcc/testsuite/g++.dg/vect/slp-pr56812.cc   |   4 +-
 gcc/testsuite/gcc.dg/vect/pr31699.c|   2 +-
 gcc/testsuite/gcc.dg/vect/pr60656.c|   3 +-
 gcc/testsuite/gcc.dg/vect/pr61194.c|   1 +
 gcc/testsuite/gcc.dg/vect/pr65947-10.c |   1 +
 gcc/testsuite/gcc.dg/vect/pr66142.c|   2 +-
 gcc/testsuite/gcc.dg/vect/pr66251.c|   4 +-
 gcc/testsuite/gcc.dg/vect/slp-10.c |   1 +
 gcc/testsuite/gcc.dg/vect/slp-11c.c|   1 +
 gcc/testsuite/gcc.dg/vect/slp-12b.c|   1 +
 gcc/testsuite/gcc.dg/vect/slp-18.c |   1 +
 gcc/testsuite/gcc.dg/vect/slp-33.c |   1 +
 gcc/testsuite/gcc.dg/vect/slp-cond-2-big-array.c   |   2 +
 gcc/testsuite/gcc.dg/vect/slp-cond-2.c |   2 +
 gcc/testsuite/gcc.dg/vect/vect-cond-10.c   |   1 +
 gcc/testsuite/gcc.dg/vect/vect-cond-8.c|   1 +
 gcc/testsuite/gcc.dg/vect/vect-cond-9.c|   1 +
 gcc/testsuite/gcc.dg/vect/vect-float-extend-1.c|   1 +
 gcc/testsuite/gcc.dg/vect/vect-float-truncate-1.c  |   1 +
 .../gcc.dg/vect/vect-floatint-conversion-2.c   |   2 +-
 .../gcc.dg/vect/vect-intfloat-conversion-3.c   |   2 +-
 .../gfortran.dg/vect/fast-math-mgrid-resid.f   |   1 +
 gcc/testsuite/gfortran.dg/vect/pr60510.f   |   1 +
 gcc/testsuite/gfortran.dg/vect/pr77848.f   |   1 +
 gcc/testsuite/gfortran.dg/vect/vect-8.f90  |   3 +-
 gcc/testsuite/lib/target-supports.exp  | 112 -
 27 files changed, 155 insertions(+), 10 deletions(-)

-- 
2.9.1

Re: [PATCH][mingw] Enable colorized diagnostics

2017-09-28 Thread Liu Hao


On 2017/9/28 7:37, JonY wrote:

Does it make sense to use a global lock in mingw_ansi_fputs?



I was thinking about a named Mutex object. Named Mutexes (as well as 
Events and Semaphores) can be shared across processes, but there are 
other considerations:


1. The name of the Mutex should be based on the current console window, which
is shared by all child processes created by `make`, and must be unique.
How can that be achieved? Is it possible to create a string based on the window
handle or some other unique identifier? Will the handle or unique id be
reused after the window is destroyed? Is it unique after all?


2. This Mutex would only protect diagnostics from interleaving. 
Diagnostics can interleave with other messages written via stdio 
functions, including those written to `stdout` which is often output to 
the console as well. I don't think there are any solutions for this.


--
Best regards,
LH_Mouse

Re: [Patch][aarch64] Use IFUNCs to enable LSE instructions in libatomic on aarch64

2017-09-28 Thread Szabolcs Nagy

On 31/08/17 18:24, Steve Ellcey wrote:
> On Tue, 2017-08-29 at 12:25 +0100, Szabolcs Nagy wrote:
>> > 
>> > in glibc the hwcap is not used, because it has access to
>> > cached dispatch info, but in libatomic using the hwcap
>> > argument is the right way.
> Here is an updated version of the patch to allow aarch64 to use ifuncs
> in libatomic.
> 
> The main difference from the last patch is that the library does not
> access the hwcap value directly but accesses it through the ifunc
> resolver argument.  That means that we no longer need the
> init_cpu_revision static constructor to set a flag that the resolver
> checks, instead the resolver just does a comparison of its incoming
> argument with HWCAP_ATOMICS.
> 

i think this approach is fine.
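
The resolver scheme described above can be sketched in portable C++ as follows.  This is not the libatomic code: the function names are hypothetical, both implementations are stand-ins built on std::atomic, and kHwcapAtomics is an assumed constant mirroring HWCAP_ATOMICS (bit 8) from the Linux AArch64 headers.  A real resolver would be attached with the ifunc attribute and would be handed the hwcap value by the dynamic loader.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Assumption: same bit as HWCAP_ATOMICS in the Linux AArch64 headers.
constexpr std::uint64_t kHwcapAtomics = std::uint64_t{1} << 8;

using FetchAddFn = int (*)(std::atomic<int>*, int);

// Stand-in for the baseline implementation (load/store-exclusive loop).
int fetch_add_generic(std::atomic<int>* p, int v)
{
  assert(p != nullptr);
  return p->fetch_add(v);
}

// Stand-in for the implementation built with LSE instructions enabled.
int fetch_add_lse(std::atomic<int>* p, int v)
{
  assert(p != nullptr);
  return p->fetch_add(v);
}

// The resolver only inspects its incoming argument; it needs no static
// constructor and no cached global flag.
FetchAddFn resolve_fetch_add(std::uint64_t hwcap)
{
  return (hwcap & kHwcapAtomics) != 0 ? fetch_add_lse : fetch_add_generic;
}
```

Because the choice depends only on the argument, the resolver is safe to run very early during relocation processing, before any constructors have executed.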

> This did mean I had to change the prototype for the resolver functions
> in libatomic_i.h to have an argument, which is the way glibc calls
> them.  One complication of this is that the type of the argument can
> differ between platforms and ABIs so I added code to configure.tgt to
> set the type.  I used uint64_t for aarch64 and 'long unsigned int'
> for everything else.  That is not correct for all platforms but at 
> this point no other platforms access the argument so it should not
> matter.  If and when platforms do need to access it they can change
> the type if necessary.
> 

i think this should be improved, see below.

> Steve Ellcey
> sell...@cavium.com
> 
> 
> 2017-08-31  Steve Ellcey  
> 
>   * Makefile.am (ARCH_AARCH64_LINUX_LSE): Add IFUNC_OPTIONS and
>   libatomic_la_LIBADD.
>   * config/linux/aarch64/host-config.h: New file.
>   * configure.ac (HWCAP_TYPE): Define.
>   (AC_CHECK_HEADERS): Check for sys/auxv.h.
>   (AC_CHECK_FUNCS): Check for getauxval.
>   (ARCH_AARCH64_LINUX_LSE): New conditional for IFUNC builds.
>   * configure.tgt (aarch64): Set AARCH and try_ifunc.
>   (aarch64*-*-linux*) Update config_path.
>   (aarch64*-*-linux*) Set HWCAP_TYPE.
>   * libatomic_i.h (GEN_SELECTOR): Add "HWCAP_TYPE hwcap" argument.
>   * Makefile.in: Regenerate.
>   * auto-config.h.in: Regenerate.
>   * configure: Regenerate.
> 
> 
> libatomic.patch
> 
> 
> diff --git a/libatomic/Makefile.am b/libatomic/Makefile.am
> index d731406..a35df1e 100644
> --- a/libatomic/Makefile.am
> +++ b/libatomic/Makefile.am
> @@ -122,6 +122,10 @@ libatomic_la_LIBADD = $(foreach s,$(SIZES),$(addsuffix 
> _$(s)_.lo,$(SIZEOBJS)))
>  
>  ## On a target-specific basis, include alternates to be selected by IFUNC.
>  if HAVE_IFUNC
> +if ARCH_AARCH64_LINUX_LSE
> +IFUNC_OPTIONS = -mcpu=thunderx2t99

i'd expect -march=armv8.1-a instead of a particular cpu here.
(and i think this assumes a not very old binutils gas)

> +libatomic_la_LIBADD += $(foreach s,$(SIZES),$(addsuffix 
> _$(s)_1_.lo,$(SIZEOBJS)))
> +endif
>  if ARCH_ARM_LINUX
>  IFUNC_OPTIONS = -march=armv7-a -DHAVE_KERNEL64
>  libatomic_la_LIBADD += $(foreach s,$(SIZES),$(addsuffix 
> _$(s)_1_.lo,$(SIZEOBJS)))
> diff --git a/libatomic/configure.ac b/libatomic/configure.ac
> index 023f172..4e06ffe 100644
> --- a/libatomic/configure.ac
> +++ b/libatomic/configure.ac
> @@ -163,6 +163,10 @@ if test -n "$UNSUPPORTED"; then
>AC_MSG_ERROR([Configuration ${target} is unsupported.])
>  fi
>  
> +# Write out the ifunc resolver arg type.
> +AC_DEFINE_UNQUOTED(HWCAP_TYPE, $HWCAP_TYPE,
> + [Define type of ifunc resolver function argument.])
> +
>  # Disable fallbacks to __sync routines from libgcc.  Otherwise we'll
>  # make silly decisions about what the cpu can do.
>  CFLAGS="$save_CFLAGS -fno-sync-libcalls $XCFLAGS"
> @@ -171,7 +175,8 @@ CFLAGS="$save_CFLAGS -fno-sync-libcalls $XCFLAGS"
>  AC_STDC_HEADERS
>  ACX_HEADER_STRING
>  GCC_HEADER_STDINT(gstdint.h)
> -AC_CHECK_HEADERS([fenv.h])
> +AC_CHECK_HEADERS([fenv.h sys/auxv.h])
> +AC_CHECK_FUNCS(getauxval)
>  

getauxval is no longer needed.

>  # Check for common type sizes
>  LIBAT_FORALL_MODES([LIBAT_HAVE_INT_MODE])
> @@ -247,6 +252,8 @@ AC_SUBST(LIBS)
>  AC_SUBST(SIZES)
>  
>  AM_CONDITIONAL(HAVE_IFUNC, test x$libat_cv_have_ifunc = xyes)
> +AM_CONDITIONAL(ARCH_AARCH64_LINUX_LSE,
> +[expr "$config_path" : ".* linux/aarch64 .*" > /dev/null])

linux/aarch64 seems to be set for all aarch64*-*-linux* targets below.
so why call it _LSE?

>  AM_CONDITIONAL(ARCH_ARM_LINUX,
>  [expr "$config_path" : ".* linux/arm .*" > /dev/null])
>  AM_CONDITIONAL(ARCH_I386,
> diff --git a/libatomic/configure.tgt b/libatomic/configure.tgt
> index b8af3ab..0bb5c66 100644
> --- a/libatomic/configure.tgt
> +++ b/libatomic/configure.tgt
> @@ -40,6 +40,14 @@ case "${target_cpu}" in
>riscv*)ARCH=riscv ;;
>sh*)   ARCH=sh ;;
>  
> +  aarch64*)
> + ARCH=aarch64
> + case "${target}" in
> + aarch64*-*-linux*)
> + try_ifunc=yes
> + ;;
> + esac
> + ;;
>arm*)
>   ARCH=arm
>   case "${target}"

Re: correct attribute ifunc C++ type safety (PR 82301)

2017-09-28 Thread Pedro Alves

On 09/25/2017 02:03 AM, Martin Sebor wrote:

> +a @option{-Wincompatible-pointer-types} warning for mismatches.  To suppress
> +a warning for the necessary cast from a pointer to the implementation member
> +function to the type of the corresponding non-member function use the
> +@option{-Wno-pmf-conversions} option.  For example:

FWIW, it seems odd to me to tell users they need to suppress warnings, when
the compiler surely could provide better/safer means to avoid needing
to use the reinterpret_cast hammer.   See below.

> +
> +@smallexample
> +class S
> +@{
> +private:
> +  int debug_impl (int);
> +  int optimized_impl (int);
> +
> +  typedef int Func (S*, int);
> +
> +  static Func* resolver ();
> +public:
> +
> +  int interface (int);
> +@};
> +
> +int S::debug_impl (int) @{ /* @r{@dots{}} */ @}
> +int S::optimized_impl (int) @{ /* @r{@dots{}} */ @}
> +
> +S::Func* S::resolver ()
> +@{
> +  int (S::*pimpl) (int)
> +    = getenv ("DEBUG") ? &S::debug_impl : &S::optimized_impl;
> +
> +  // Cast triggers -Wno-pmf-conversions.
> +  return reinterpret_cast<Func*>(pimpl);
> +@}
> +

If I were writing code like this, I'd write a reinterpret_cast-like
function specifically for pointer-to-member-function to free-function
casting, and only suppress the warning there instead of disabling
the warning for the whole translation unit.  Something like:

#include <type_traits>

template<typename T> struct pmf_as_func;

template<typename Ret, typename S, typename... Args>
struct pmf_as_func<Ret (S::*) (Args...)>
{
  typedef Ret (func_type) (S *, Args...);
  typedef S struct_type;
};

template<typename Pmf>
typename pmf_as_func<Pmf>::func_type *
pmf_as_func_cast (Pmf pmf)
{
  static_assert (!std::is_polymorphic<typename pmf_as_func<Pmf>::struct_type>::value,
		 "");
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wpmf-conversions"
  return reinterpret_cast<typename pmf_as_func<Pmf>::func_type *> (pmf);
#pragma GCC diagnostic pop
}

and then write:
 return pmf_as_func_cast (pimpl);

instead of:
  return reinterpret_cast<Func*>(pimpl);

The point being of course to make it harder to misuse the casts.

But that may be a bit too much for the manual.  

It also wouldn't work as is with C++03 (because of variadic templates).
Which leads me to think that if GCC guarantees this cast works, then
it'd be nice to have GCC provide it (like a __pmf_as_func_cast function)
as builtin.  Then it'd work on C++03 as well, and the compiler of course
can precisely validate whether the cast is valid.  (It's quite possible
that there's a better check than is_polymorphic as I've written above.)

Just a passing thought.

Thanks,
Pedro Alves

Re: [PATCH] streambuf_iterator: avoid debug-dependent behaviour

2017-09-28 Thread Jonathan Wakely


On 28/09/17 11:50 +0100, Jonathan Wakely wrote:

On 21/09/17 07:46 +0200, François Dumont wrote:

Gentle reminder, ok to commit ?


No. Could you and Petr please come to an agreement about what is
actually wrong with the current implementation, and agree on a
solution?

Currently you're both just proposing patches that do different things,
without indicating why one patch is better than the other.

I understand that we want to remove the debugmode-dependent behaviour,
but I'd like to see any other changes justified by references to the
standard.


Here's a test we currently fail, but it looks like we should pass it:
http://llvm.org/svn/llvm-project/libcxx/trunk/test/std/iterators/stream.iterators/istreambuf.iterator/istreambuf.iterator.cons/proxy.pass.cpp

Do the changes either of you is proposing change this result?

Re: [PATCH] streambuf_iterator: avoid debug-dependent behaviour

2017-09-28 Thread Jonathan Wakely


On 21/09/17 07:46 +0200, François Dumont wrote:

Gentle reminder, ok to commit ?


No. Could you and Petr please come to an agreement about what is
actually wrong with the current implementation, and agree on a
solution?

Currently you're both just proposing patches that do different things,
without indicating why one patch is better than the other.

I understand that we want to remove the debugmode-dependent behaviour,
but I'd like to see any other changes justified by references to the
standard.


diff --git a/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc 
b/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc
index b81f4d4..e3d99f9 100644
--- a/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc
+++ b/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc
@@ -25,9 +25,7 @@

void test02(void)
{
-
  typedef std::istreambuf_iterator<char> cistreambuf_iter;
-  typedef cistreambuf_iter::streambuf_type cstreambuf_type;
  const char slit01[] = "playa hermosa, liberia, guanacaste";
  std::string str01(slit01);
  std::istringstream istrs00(str01);
@@ -35,10 +33,17 @@ void test02(void)

  // ctor sanity checks
  cistreambuf_iter istrb_it01(istrs00);
-  cistreambuf_iter istrb_it02;
-  std::string tmp(istrb_it01, istrb_it02); 
+  cistreambuf_iter istrb_eos;

+  VERIFY( istrb_it01 != istrb_eos );
+
+  std::string tmp(istrb_it01, istrb_eos);
  VERIFY( tmp == str01 );

+  VERIFY( istrb_it01 != istrb_eos );


Why should this condition be true? The std::string constructor
increments the iterator until it reaches the end-of-stream value.

This is true with our current implementation, but that seems like a
bug, not something we want to verify in the testsuite.


+  cistreambuf_iter old = istrb_it01++;
+  VERIFY( old == istrb_eos );


This behaviour makes no sense.


+  VERIFY( istrb_it01 == istrb_eos );
+
  cistreambuf_iter istrb_it03(0);
  cistreambuf_iter istrb_it04;
  VERIFY( istrb_it03 == istrb_it04 );
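
For reference, the part of this behaviour that is not in dispute can be sketched as below. This is only an illustration and not part of either proposed patch; slurp, at_end_of_stream and demo are hypothetical names. It deliberately asserts nothing about the state of the iterator object that was copied into the std::string constructor, since that is the point under discussion; it only checks a freshly constructed iterator against the end-of-stream value.

```cpp
#include <cassert>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: read every remaining character from the stream.
std::string slurp(std::istream& in)
{
  std::istreambuf_iterator<char> first(in), eos;
  return std::string(first, eos);
}

// Hypothetical helper: a freshly constructed iterator compares equal to the
// end-of-stream iterator when the next read would hit end of stream.
bool at_end_of_stream(std::istream& in)
{
  return std::istreambuf_iterator<char>(in) == std::istreambuf_iterator<char>();
}

// Small usage demo built on the same literal as the test in the patch.
void demo()
{
  std::istringstream in("playa hermosa, liberia, guanacaste");
  assert(!at_end_of_stream(in));   // characters are still available
  assert(slurp(in) == "playa hermosa, liberia, guanacaste");
  assert(at_end_of_stream(in));    // the constructor consumed the whole buffer
}
```

The copy passed to std::string shares the streambuf with the original, so the buffer is fully consumed even though the original iterator object is never incremented directly.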

Re: [PATCH] libstdc++: istreambuf_iterator keep attached streambuf

2017-09-28 Thread Jonathan Wakely


On 23/09/17 09:54 +0300, Petr Ovtchenkov wrote:

istreambuf_iterator should not forget about the attached
streambuf when it reaches EOF.

Checks in debug mode no longer influence character
extraction in the istreambuf_iterator increment operators.
In this respect, behaviour in debug and non-debug mode
is now the same.

Test for detached streambuf in istreambuf_iterator:
When istreambuf_iterator reaches EOF of the istream, it should not
forget about the attached streambuf.
The fact that EOF was reached in the stream does not imply that
the stream has reached the end of its life and that further input
operations are impossible.
---
libstdc++-v3/include/bits/streambuf_iterator.h | 41 +++
.../24_iterators/istreambuf_iterator/3.cc  | 61 ++
2 files changed, 80 insertions(+), 22 deletions(-)
create mode 100644 libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/3.cc

diff --git a/libstdc++-v3/include/bits/streambuf_iterator.h 
b/libstdc++-v3/include/bits/streambuf_iterator.h
index f0451b1..45c3d89 100644
--- a/libstdc++-v3/include/bits/streambuf_iterator.h
+++ b/libstdc++-v3/include/bits/streambuf_iterator.h
@@ -136,12 +136,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  istreambuf_iterator&
  operator++()
  {
-   __glibcxx_requires_cond(!_M_at_eof(),
+   __glibcxx_requires_cond(_M_sbuf,
_M_message(__gnu_debug::__msg_inc_istreambuf)
._M_iterator(*this));
if (_M_sbuf)
  {
+#ifdef _GLIBCXX_DEBUG_PEDANTIC
+   int_type _tmp =
+#endif
_M_sbuf->sbumpc();
+#ifdef _GLIBCXX_DEBUG_PEDANTIC
+   
__glibcxx_requires_cond(!traits_type::eq_int_type(_tmp,traits_type::eof()),
+   
_M_message(__gnu_debug::__msg_inc_istreambuf)
+   ._M_iterator(*this));
+#endif
_M_c = traits_type::eof();
  }
return *this;
@@ -151,14 +159,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  istreambuf_iterator
  operator++(int)
  {
-   __glibcxx_requires_cond(!_M_at_eof(),
+_M_get();
+   __glibcxx_requires_cond(_M_sbuf
+   && 
!traits_type::eq_int_type(_M_c,traits_type::eof()),
_M_message(__gnu_debug::__msg_inc_istreambuf)
._M_iterator(*this));

istreambuf_iterator __old = *this;
if (_M_sbuf)
  {
-   __old._M_c = _M_sbuf->sbumpc();
+   _M_sbuf->sbumpc();
_M_c = traits_type::eof();
  }
return __old;
@@ -177,18 +187,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  _M_get() const
  {
const int_type __eof = traits_type::eof();
-   int_type __ret = __eof;
-   if (_M_sbuf)
- {
-   if (!traits_type::eq_int_type(_M_c, __eof))
- __ret = _M_c;
-   else if (!traits_type::eq_int_type((__ret = _M_sbuf->sgetc()),
-  __eof))
- _M_c = __ret;
-   else
- _M_sbuf = 0;
- }
-   return __ret;
+   if (_M_sbuf && traits_type::eq_int_type(_M_c, __eof))
+  _M_c = _M_sbuf->sgetc();
+   return _M_c;
  }

  bool
@@ -339,7 +340,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  typedef typename __is_iterator_type::streambuf_type  streambuf_type;
  typedef typename traits_type::int_type   int_type;

-  if (__first._M_sbuf && !__last._M_sbuf)
+  if (__first._M_sbuf && (__last == istreambuf_iterator<_CharT>()))
{
  streambuf_type* __sb = __first._M_sbuf;
  int_type __c = __sb->sgetc();
@@ -374,7 +375,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  typedef typename __is_iterator_type::streambuf_type  streambuf_type;
  typedef typename traits_type::int_type   int_type;

-  if (__first._M_sbuf && !__last._M_sbuf)
+  if (__first._M_sbuf && (__last == istreambuf_iterator<_CharT>()))
{
  const int_type __ival = traits_type::to_int_type(__val);
  streambuf_type* __sb = __first._M_sbuf;
@@ -395,11 +396,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  else
__c = __sb->snextc();
}
-
- if (!traits_type::eq_int_type(__c, traits_type::eof()))
-   __first._M_c = __c;
- else
-   __first._M_sbuf = 0;
+ __first._M_c = __c;
}
  return __first;
}
diff --git a/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/3.cc 
b/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/3.cc
new file mode 100644
index 000..803ede4
--- /dev/null
+++ b/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/3.cc
@@ -0,0 +1,61 @@
+// { dg-options "-std=gnu++17" }
+
+// Copyright (C) 2017 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published

Re: [PATCH] Fix fortran/81509

2017-09-28 Thread Paul Richard Thomas

Hi Steve,

I'll take your word for it on the F2008 constraints. Given that the
patch is very good - OK for trunk.

Thanks

Paul

On 27 September 2017 at 20:36, Steve Kargl
 wrote:
> The attached patch fixes PR fortran/81509.
>
> In short, F2008 now allows boz-literal-constants in IAND, IOR, IEOR,
> DSHIFTL, DSHIFTR, and MERGE_BITS.  gfortran currently allows a BOZ
> argument, but she was not enforcing restrictions in F2008.  The
> attached patch causes gfortran to conform to F2008.
>
> As a side effect, the patch removes a questionable GNU Fortran
> extension that allowed arguments to IAND, IOR, and IEOR to have
> different kind type parameters.  The behavior of this extension
> was not documented.
>
> 2017-09-27  Steven G. Kargl  
>
> PR fortran/81509
> * check.c: Rename function gfc_check_iand to gfc_check_iand_ieor_ior.
> * check.c (boz_args_check): New function.  Check I and J not both BOZ.
> (gfc_check_dshift,gfc_check_iand_ieor_ior, gfc_check_ishft,
>  gfc_check_and, gfc_check_merge_bits): Use it.
> * check.c (gfc_check_iand_ieor_ior): Force conversion of BOZ to kind
> type of other argument.  Remove silly GNU extension.
> (gfc_check_ieor, gfc_check_ior): Delete now unused functions.
> * intrinsic.c (add_functions): Use gfc_check_iand_ieor_ior. Wrap long
> line.
> * intrinsic.h: Rename gfc_check_iand to gfc_check_iand_ieor_ior.
> Delete prototype for bool gfc_check_ieor and gfc_check_ior
> * intrinsic.texi: Update documentation for boz-literal-constant.
>
> 2017-09-27  Steven G. Kargl  
>
> PR fortran/81509
> * gfortran.dg/graphite/id-26.f03: Fix non-conforming use of IAND.
> * gfortran.dg/pr81509_1.f90: New test.
> * gfortran.dg/pr81509_2.f90: New test.
>
> --
> Steve
> 20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4
> 20161221 https://www.youtube.com/watch?v=IbCHE-hONow



-- 
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein

Re: [PATCH][AArch64] Add BIC-imm and ORR-imm SIMD pattern

2017-09-28 Thread Richard Earnshaw (lists)

On 27/09/17 18:57, Sudi Das wrote:
> 
> 
> Hi James
> 
> I have made the requested changes to the patch.
> 
> 
> 2017-09-27 Sudakshina Das  
> 
>   * config/aarch64/aarch64-protos.h (enum simd_immediate_check): New 
> check type
>   for aarch64_simd_valid_immediate.
>   (aarch64_output_simd_mov_immediate): Update prototype.
>   (aarch64_simd_valid_immediate): Update prototype.
> 
>   * config/aarch64/aarch64-simd.md (orr3): modified pattern to add
>   support for ORR-immediate.
>   (and3): modified pattern to add support for BIC-immediate.
> 
>   * config/aarch64/aarch64.c (aarch64_simd_valid_immediate): Function now 
> checks
>   for valid immediate for BIC and ORR based on new enum argument.
>   (aarch64_output_simd_mov_immediate): Function now used to output 
> BIC/ORR imm
>   as well based on new enum argument.
>  
>   * config/aarch64/constraints.md (Do): New vector immediate constraint.
>   (Db): Likewise.
> 
> 2017-09-27 Sudakshina Das  
> 
>   * gcc.target/aarch64/bic_imm_1.c: New test.
>   * gcc.target/aarch64/orr_imm_1.c: Likewise.
> 
> 
> Thanks
> Sudi
> 
>   
> From: James Greenhalgh 
> Sent: Tuesday, September 26, 2017 8:04:38 PM
> To: Sudi Das
> Cc: Richard Earnshaw; gcc-patches@gcc.gnu.org; nd; Marcus Shawcroft
> Subject: Re: [PATCH][AArch64] Add BIC-imm and ORR-imm SIMD pattern
> 
> On Mon, Sep 25, 2017 at 11:13:57AM +0100, Sudi Das wrote:
>>
>> Hi James
>>
>> I put aarch64_output_simd_general_immediate looking at the similarities of
>> the immediates for mov/mvni and orr/bic. The CHECK macro in
>> aarch64_simd_valid_immediate both checks
>> and converts the immediates in a manner that are needed for the instructions.
>>
>> Having said that, I agree that maybe I could have refactored
>> aarch64_output_simd_mov_immediate to do the work rather than creating a new
>> functions to do similar things. I have done so in this patch.
> 
> Thanks, this looks much neater.
> 
>> I have also changed the names of the enum simd_immediate_check to be better
>> indicative of what they are doing. 
> 
> Thanks, I'd tweak them to look more like the bitmasks you use them as, but
> that is a small change for my personal preference.
> 
>> Lastly I have added more cases in the tests (according to all the possible
>> CHECKs) and made them dg-do assemble (although I had to add --save-temps so
>> that the scan-assembler would work). Do you think I should not put that
>> option and rather create separate tests?
> 
> This is good - thanks.
> 
> I think clean up the enum definitions and this patch will be good.
> 
>> @@ -308,6 +308,16 @@ enum aarch64_parse_opt_result
>> AARCH64_PARSE_INVALID_ARG  /* Invalid arch, tune, cpu arg.  */
>>   };
>>   
>> +/* Enum to distinguish which type of check is to be done in
>> +   aarch64_simd_valid_immediate.  This is used as a bitmask where
>> +   AARCH64_CHECK_MOV has both bits set.  Thus AARCH64_CHECK_MOV will
>> +   perform all checks.  Adding new types would require changes accordingly. 
>>  */
>> +enum simd_immediate_check {
>> +  AARCH64_CHECK_ORR  = 1,/* Perform immediate checks for ORR.  */
>> +  AARCH64_CHECK_BIC  = 2,/* Perform immediate checks for BIC.  */
>> +  AARCH64_CHECK_MOV  = 3 /* Perform all checks (used for MOVI/MVNI).  */
> 
> These are used in bit-mask style, so how about:
> 
>   AARCH64_CHECK_ORR = 1 << 0,
>   AARCH64_CHECK_BIC = 1 << 1,
>   AARCH64_CHECK_MOV = AARCH64_CHECK_ORR | AARCH64_CHECK_BIC
> 
> Which is more self-documenting.
> 
>> @@ -13001,7 +13013,8 @@ aarch64_float_const_representable_p (rtx x)
>>   char*
>>   aarch64_output_simd_mov_immediate (rtx const_vector,
>>machine_mode mode,
>> -unsigned width)
>> +unsigned width,
>> +enum simd_immediate_check which)
> 
> This function is sorely missing a comment explaining the parameters - it
> would be very helpful if you could add one as part of this patch.
> 
> Thanks,
> James
> 
> 
> 
+;; For AND (vector, register) and BIC (vector, immediate)
 (define_insn "and3"
-  [(set (match_operand:VDQ_I 0 "register_operand" "=w")
-(and:VDQ_I (match_operand:VDQ_I 1 "register_operand" "w")
-(match_operand:VDQ_I 2 "register_operand" "w")))]
+  [(set (match_operand:VDQ_I 0 "register_operand" "=w,w")
+   (and:VDQ_I (match_operand:VDQ_I 1 "register_operand" "w,0")
+(match_operand:VDQ_I 2 "nonmemory_operand" "w,Db")))]

You should define a new predicate operation for operand 2 that accepts
just registers or the valid constants.  Otherwise you may get
spilling during register allocation.

Similarly for the other pattern.

R.
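
The bit-mask style enum suggested in the quoted review can be sketched as below. This is a stand-alone illustration under assumed names (sketch_immediate_check, includes_check and demo_checks are hypothetical and deliberately differ from the identifiers in the patch); it only shows why defining the MOV value as the OR of the other two makes the "perform all checks" meaning self-documenting.

```cpp
#include <cassert>

// Illustrative copy of the bit-mask layout; names are hypothetical.
enum sketch_immediate_check
{
  SKETCH_CHECK_ORR = 1 << 0,  // checks relevant to ORR immediates
  SKETCH_CHECK_BIC = 1 << 1,  // checks relevant to BIC immediates
  SKETCH_CHECK_MOV = SKETCH_CHECK_ORR | SKETCH_CHECK_BIC  // both of the above
};

// True if the requested set of checks includes every bit in `wanted`.
inline bool includes_check(sketch_immediate_check requested,
                           sketch_immediate_check wanted)
{
  return (requested & wanted) == wanted;
}

// Small usage demo.
void demo_checks()
{
  assert(includes_check(SKETCH_CHECK_MOV, SKETCH_CHECK_ORR));
  assert(includes_check(SKETCH_CHECK_MOV, SKETCH_CHECK_BIC));
  assert(!includes_check(SKETCH_CHECK_ORR, SKETCH_CHECK_BIC));
}
```

Written this way, adding a further check bit only requires extending the OR expression, rather than recomputing a literal such as 3 by hand.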

Re: [libgomp, testsuite] Remove superfluous -fopenmp from libgomp testcases

2017-09-28 Thread Jakub Jelinek

On Thu, Sep 28, 2017 at 09:41:15AM +0200, Tom de Vries wrote:
> [ was: Re: [gomp4 2/9] libgomp: Prepare for testcases without -fopenmp. ]
> 
> On 11/07/2013 09:11 AM, Jakub Jelinek wrote:
> > On Wed, Nov 06, 2013 at 08:42:16PM +0100, tho...@codesourcery.com wrote:
> > > From: Thomas Schwinge 
> > > 
> > >   libgomp/
> > >   * testsuite/lib/libgomp.exp (libgomp_init): Don't add -fopenmp to
> > >   ALWAYS_CFLAGS.
> > >   * testsuite/libgomp.c++/c++.exp (ALWAYS_CFLAGS): Add -fopenmp.
> > >   * testsuite/libgomp.c/c.exp (ALWAYS_CFLAGS): Likewise.
> > >   * testsuite/libgomp.fortran/fortran.exp (ALWAYS_CFLAGS): Likewise.
> > >   * testsuite/libgomp.graphite/graphite.exp (ALWAYS_CFLAGS):
> > >   Likewise.
> > 
> > Ok for trunk/gomp-4_0-branch.
> 
> Following up on this, how about we drop the now superfluous -fopenmp in
> current test-cases?
> 
> Tested on x86_64. Verified by analyzing libgomp.log that -fopenmp is still
> passed to test-cases as required.
> 
> OK for trunk?
> 
> Thanks,
> - Tom

> Remove superfluous -fopenmp from libgomp testcases
> 
> 2017-09-16  Tom de Vries  
> 
>   * testsuite/libgomp.c++/for-12.C: Remove superfluous -fopenmp option
>   setting.
>   * testsuite/libgomp.c++/pr69393.C: Same.
>   * testsuite/libgomp.c++/taskloop-1.C: Same.
>   * testsuite/libgomp.c++/taskloop-3.C: Same.
>   * testsuite/libgomp.c++/taskloop-4.C: Same.
>   * testsuite/libgomp.c/for-4.c: Same.
>   * testsuite/libgomp.c/pr66199-3.c: Same.
>   * testsuite/libgomp.c/pr66199-4.c: Same.
>   * testsuite/libgomp.c/pr66199-6.c: Same.
>   * testsuite/libgomp.c/taskloop-1.c: Same.
>   * testsuite/libgomp.c/taskloop-3.c: Same.
>   * testsuite/libgomp.c/taskloop-4.c: Same.
>   * testsuite/libgomp.fortran/aligned1.f03: Same.
>   * testsuite/libgomp.fortran/condinc1.f: Same.
>   * testsuite/libgomp.fortran/condinc3.f90: Same.
>   * testsuite/libgomp.fortran/crayptr1.f90: Same.
>   * testsuite/libgomp.fortran/crayptr2.f90: Same.
>   * testsuite/libgomp.fortran/crayptr3.f90: Same.
>   * testsuite/libgomp.fortran/omp_cond1.f: Same.
>   * testsuite/libgomp.fortran/omp_cond3.F90: Same.
>   * testsuite/libgomp.fortran/pr66199-1.f90: Same.
>   * testsuite/libgomp.fortran/pr66199-2.f90: Same.
>   * testsuite/libgomp.fortran/recursion1.f90: Same.
>   * testsuite/libgomp.fortran/target2.f90: Same.
>   * testsuite/libgomp.fortran/target5.f90: Same.
>   * testsuite/libgomp.fortran/task3.f90: Same.

Ok.

Jakub

Re: [libgomp, testsuite] Remove superfluous -fopenmp from libgomp testcases

2017-09-28 Thread Thomas Schwinge

Hi Tom!

On Thu, 28 Sep 2017 09:41:15 +0200, Tom de Vries  wrote:
> On 11/07/2013 09:11 AM, Jakub Jelinek wrote:
> > On Wed, Nov 06, 2013 at 08:42:16PM +0100, tho...@codesourcery.com wrote:
> >> From: Thomas Schwinge 
> >>
> >>libgomp/
> >>* testsuite/lib/libgomp.exp (libgomp_init): Don't add -fopenmp to
> >>ALWAYS_CFLAGS.
> >>* testsuite/libgomp.c++/c++.exp (ALWAYS_CFLAGS): Add -fopenmp.
> >>* testsuite/libgomp.c/c.exp (ALWAYS_CFLAGS): Likewise.
> >>* testsuite/libgomp.fortran/fortran.exp (ALWAYS_CFLAGS): Likewise.
> >>* testsuite/libgomp.graphite/graphite.exp (ALWAYS_CFLAGS):
> >>Likewise.

Note that my patch just moved *where* the flag gets set, so...

> Following up on this, how about we drop the now superfluous -fopenmp in 
> current test-cases?

... it has already been superfluous before.  ;-)

Anyway: ACK conceptually.

> Tested on x86_64. Verified by analyzing libgomp.log that -fopenmp is 
> still passed to test-cases as required.
> 
> OK for trunk?

> --- a/libgomp/testsuite/libgomp.c++/for-12.C
> +++ b/libgomp/testsuite/libgomp.c++/for-12.C
> @@ -1,5 +1,3 @@
> -/* { dg-options "-fopenmp" } */

As far as I remember, this means that instead of "-fopenmp" the
"DEFAULT_CFLAGS" will then be used: "-O2", so this effectively changes
testing from "-O0" to "-O2".  Same for a few other cases where you remove
"dg-options" altogether.  Special consideration required for fortran,
which should never specify these in "dg-options" because it cycles
("torture testing") through different optimization flags.

With that fixed: Reviewed-by: Thomas Schwinge 
(See
.)
But I can't formally approve, of course.

Grüße
 Thomas

[libgomp, testsuite] Remove superfluous -fopenmp from libgomp testcases

2017-09-28 Thread Tom de Vries


[ was: Re: [gomp4 2/9] libgomp: Prepare for testcases without -fopenmp. ]

On 11/07/2013 09:11 AM, Jakub Jelinek wrote:

On Wed, Nov 06, 2013 at 08:42:16PM +0100, tho...@codesourcery.com wrote:

From: Thomas Schwinge 

libgomp/
* testsuite/lib/libgomp.exp (libgomp_init): Don't add -fopenmp to
ALWAYS_CFLAGS.
* testsuite/libgomp.c++/c++.exp (ALWAYS_CFLAGS): Add -fopenmp.
* testsuite/libgomp.c/c.exp (ALWAYS_CFLAGS): Likewise.
* testsuite/libgomp.fortran/fortran.exp (ALWAYS_CFLAGS): Likewise.
* testsuite/libgomp.graphite/graphite.exp (ALWAYS_CFLAGS):
Likewise.


Ok for trunk/gomp-4_0-branch.


Following up on this, how about we drop the now superfluous -fopenmp in 
current test-cases?


Tested on x86_64. Verified by analyzing libgomp.log that -fopenmp is 
still passed to test-cases as required.


OK for trunk?

Thanks,
- Tom
Remove superfluous -fopenmp from libgomp testcases

2017-09-16  Tom de Vries  

	* testsuite/libgomp.c++/for-12.C: Remove superfluous -fopenmp option
	setting.
	* testsuite/libgomp.c++/pr69393.C: Same.
	* testsuite/libgomp.c++/taskloop-1.C: Same.
	* testsuite/libgomp.c++/taskloop-3.C: Same.
	* testsuite/libgomp.c++/taskloop-4.C: Same.
	* testsuite/libgomp.c/for-4.c: Same.
	* testsuite/libgomp.c/pr66199-3.c: Same.
	* testsuite/libgomp.c/pr66199-4.c: Same.
	* testsuite/libgomp.c/pr66199-6.c: Same.
	* testsuite/libgomp.c/taskloop-1.c: Same.
	* testsuite/libgomp.c/taskloop-3.c: Same.
	* testsuite/libgomp.c/taskloop-4.c: Same.
	* testsuite/libgomp.fortran/aligned1.f03: Same.
	* testsuite/libgomp.fortran/condinc1.f: Same.
	* testsuite/libgomp.fortran/condinc3.f90: Same.
	* testsuite/libgomp.fortran/crayptr1.f90: Same.
	* testsuite/libgomp.fortran/crayptr2.f90: Same.
	* testsuite/libgomp.fortran/crayptr3.f90: Same.
	* testsuite/libgomp.fortran/omp_cond1.f: Same.
	* testsuite/libgomp.fortran/omp_cond3.F90: Same.
	* testsuite/libgomp.fortran/pr66199-1.f90: Same.
	* testsuite/libgomp.fortran/pr66199-2.f90: Same.
	* testsuite/libgomp.fortran/recursion1.f90: Same.
	* testsuite/libgomp.fortran/target2.f90: Same.
	* testsuite/libgomp.fortran/target5.f90: Same.
	* testsuite/libgomp.fortran/task3.f90: Same.

---
 libgomp/testsuite/libgomp.c++/for-12.C   | 2 --
 libgomp/testsuite/libgomp.c++/pr69393.C  | 2 +-
 libgomp/testsuite/libgomp.c++/taskloop-1.C   | 2 +-
 libgomp/testsuite/libgomp.c++/taskloop-3.C   | 2 +-
 libgomp/testsuite/libgomp.c++/taskloop-4.C   | 2 +-
 libgomp/testsuite/libgomp.c/for-4.c  | 2 +-
 libgomp/testsuite/libgomp.c/pr66199-3.c  | 2 +-
 libgomp/testsuite/libgomp.c/pr66199-4.c  | 2 +-
 libgomp/testsuite/libgomp.c/pr66199-6.c  | 2 +-
 libgomp/testsuite/libgomp.c/taskloop-1.c | 2 +-
 libgomp/testsuite/libgomp.c/taskloop-3.c | 2 +-
 libgomp/testsuite/libgomp.c/taskloop-4.c | 2 +-
 libgomp/testsuite/libgomp.fortran/aligned1.f03   | 2 +-
 libgomp/testsuite/libgomp.fortran/condinc1.f | 1 -
 libgomp/testsuite/libgomp.fortran/condinc3.f90   | 1 -
 libgomp/testsuite/libgomp.fortran/crayptr1.f90   | 2 +-
 libgomp/testsuite/libgomp.fortran/crayptr2.f90   | 2 +-
 libgomp/testsuite/libgomp.fortran/crayptr3.f90   | 2 +-
 libgomp/testsuite/libgomp.fortran/omp_cond1.f| 1 -
 libgomp/testsuite/libgomp.fortran/omp_cond3.F90  | 1 -
 libgomp/testsuite/libgomp.fortran/pr66199-1.f90  | 2 +-
 libgomp/testsuite/libgomp.fortran/pr66199-2.f90  | 2 +-
 libgomp/testsuite/libgomp.fortran/recursion1.f90 | 2 +-
 libgomp/testsuite/libgomp.fortran/target2.f90| 2 +-
 libgomp/testsuite/libgomp.fortran/target5.f90| 1 -
 libgomp/testsuite/libgomp.fortran/task3.f90  | 1 -
 26 files changed, 19 insertions(+), 27 deletions(-)

diff --git a/libgomp/testsuite/libgomp.c++/for-12.C b/libgomp/testsuite/libgomp.c++/for-12.C
index ea32192..295b12f 100644
--- a/libgomp/testsuite/libgomp.c++/for-12.C
+++ b/libgomp/testsuite/libgomp.c++/for-12.C
@@ -1,5 +1,3 @@
-/* { dg-options "-fopenmp" } */
-
 extern "C" void abort (void);
 
 #define M(x, y, z) O(x, y, z)
diff --git a/libgomp/testsuite/libgomp.c++/pr69393.C b/libgomp/testsuite/libgomp.c++/pr69393.C
index e3f0de1..02605e0 100644
--- a/libgomp/testsuite/libgomp.c++/pr69393.C
+++ b/libgomp/testsuite/libgomp.c++/pr69393.C
@@ -1,6 +1,6 @@
 // { dg-do run }
 // { dg-require-effective-target lto }
-// { dg-options "-flto -g -fopenmp" }
+// { dg-options "-flto -g" }
 
 int e = 5;
 
diff --git a/libgomp/testsuite/libgomp.c++/taskloop-1.C b/libgomp/testsuite/libgomp.c++/taskloop-1.C
index 66f8e0b..7fc6e46 100644
--- a/libgomp/testsuite/libgomp.c++/taskloop-1.C
+++ b/libgomp/testsuite/libgomp.c++/taskloop-1.C
@@ -1,4 +1,4 @@
 // { dg-do run }
-// { dg-options "-O2 -fopenmp" }
+// { dg-options "-O2" }
 
 #include "../libgomp.c/taskloop-1.c"
diff --git a/libgomp/testsuite/libgomp.c++/taskloop-3.C b/libgomp/testsuite/libgomp.c++/taskloop-3.C
index bfd793c..c08a045 100644
---

Re: [PATCH] For -Os change movabsq $(imm32 << shift), %rX[xip] to movl $imm2, %eX[xip]; shl $shift, %rX[xip] (PR target/82339)

2017-09-28 Thread Uros Bizjak

On Wed, Sep 27, 2017 at 3:36 PM, Jakub Jelinek  wrote:
> Hi!
>
> Doing a movl + shlq by constant seems to be 1 byte shorter
> than movabsq, so this patch attempts to use the former form
> unless flags is live.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> Performance-wise, not really sure what is a win (on i7-5960X on the
> testcase in the PR movl + shlq seems to be significantly faster, but e.g.
> on i7-2600 it is the same) so not doing anything for speed yet.
>
> 2017-09-27  Jakub Jelinek  
>
> PR target/82339
> * config/i386/i386.md (*movdi_internal peephole2): New -Os peephole
> for movabsq $(i32 << shift), r64.

LGTM, but also have no idea about performance impact ...

Uros.

> --- gcc/config/i386/i386.md.jj  2017-09-21 09:26:42.0 +0200
> +++ gcc/config/i386/i386.md 2017-09-27 10:24:01.520673889 +0200
> @@ -2379,6 +2379,28 @@ (define_split
>   gen_lowpart (SImode, operands[1]));
>  })
>
> +;; movabsq $0x0012345678000000, %rax is longer
> +;; than movl $0x12345678, %eax; shlq $24, %rax.
> +(define_peephole2
> +  [(set (match_operand:DI 0 "register_operand")
> +   (match_operand:DI 1 "const_int_operand"))]
> +  "TARGET_64BIT
> +   && optimize_insn_for_size_p ()
> +   && LEGACY_INT_REG_P (operands[0])
> +   && !x86_64_immediate_operand (operands[1], DImode)
> +   && !x86_64_zext_immediate_operand (operands[1], DImode)
> +   && !((UINTVAL (operands[1]) >> ctz_hwi (UINTVAL (operands[1])))
> +& ~(HOST_WIDE_INT) 0xffffffff)
> +   && peep2_regno_dead_p (0, FLAGS_REG)"
> +  [(set (match_dup 0) (match_dup 1))
> +   (parallel [(set (match_dup 0) (ashift:DI (match_dup 0) (match_dup 2)))
> + (clobber (reg:CC FLAGS_REG))])]
> +{
> +  int shift = ctz_hwi (UINTVAL (operands[1]));
> +  operands[1] = gen_int_mode (UINTVAL (operands[1]) >> shift, DImode);
> +  operands[2] = gen_int_mode (shift, QImode);
> +})
> +
>  (define_insn "*movsi_internal"
>[(set (match_operand:SI 0 "nonimmediate_operand"
>  "=r,m ,*y,*y,?*y,?m,?r ,?*Ym,*v,*v,*v,m ,?r ,?*Yi,*k,*k ,*rm")
>
> Jakub
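
The arithmetic side of the peephole condition can be sketched in plain C++ as below. This is a rough model only, under stated assumptions: MovShlSplit, count_trailing_zeros and split_mov_shl are hypothetical names, the two range tests stand in for the x86_64_immediate_operand and x86_64_zext_immediate_operand predicates, and the register-class, -Os and flags-liveness conditions of the real pattern are not modelled at all.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: how a 64-bit constant splits into imm32 << shift.
struct MovShlSplit
{
  bool ok;            // true if the movl + shlq form applies
  std::uint32_t imm;  // immediate for the 32-bit move
  unsigned shift;     // left-shift amount
};

// Portable stand-in for a count-trailing-zeros primitive; v must be nonzero.
static unsigned count_trailing_zeros(std::uint64_t v)
{
  assert(v != 0);
  unsigned n = 0;
  while ((v & 1) == 0)
    {
      v >>= 1;
      ++n;
    }
  return n;
}

MovShlSplit split_mov_shl(std::uint64_t v)
{
  MovShlSplit none = { false, 0, 0 };
  if (v == 0)
    return none;
  // Already encodable as a zero-extended 32-bit immediate.
  if (v <= 0xffffffffULL)
    return none;
  // Already encodable as a sign-extended (negative) 32-bit immediate.
  if (v >= 0xffffffff80000000ULL)
    return none;
  // Strip all trailing zero bits; what remains must fit in 32 bits.
  unsigned shift = count_trailing_zeros(v);
  std::uint64_t rest = v >> shift;
  if (rest > 0xffffffffULL)
    return none;
  MovShlSplit result = { true, static_cast<std::uint32_t>(rest), shift };
  return result;
}
```

Because all trailing zero bits are stripped, a constant such as 0x0012345678000000 splits in this model as 0x2468acf shifted left by 27, which is the same value as 0x12345678 shifted left by 24.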

[openacc, testsuite, committed] Fix libgomp.oacc-c-c++-common/loop-g-{1,2}.c for non-nvidia devices

2017-09-28 Thread Tom de Vries


Hi,

this patch makes the test-cases libgomp.oacc-c-c++-common/loop-g-{1,2}.c 
 work for non-nvidia devices.


For nvidia devices, a vector_length of 32 is required for the test to pass.

For devices with a non-32 forced vector_length, this test-case will fail 
the test for excess errors due to:

...
warning: using vector_length (x), ignoring 32
...

Fixed by removing the explicit vector_length setting. For nvidia 
devices, 32 is required, but that's also the forced default, so there's 
no need to be explicit about it.


Committed as obvious.

Thanks,
- Tom
Fix libgomp.oacc-c-c++-common/loop-g-{1,2}.c for non-nvidia devices

2017-09-28  Tom de Vries  

	* testsuite/libgomp.oacc-c-c++-common/loop-g-1.c (main): Remove
	vector_length(32) clause from acc parallel directive.
	* testsuite/libgomp.oacc-c-c++-common/loop-g-2.c (main): Same.

---
 libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-1.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-2.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-1.c
index 7bff6cd..ae1d588 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-1.c
@@ -15,7 +15,7 @@ int main ()
   for (ix = 0; ix < N;ix++)
 ary[ix] = -1;
   
-#pragma acc parallel num_gangs(32) vector_length(32) copy(ary) copy(ondev)
+#pragma acc parallel num_gangs(32) copy(ary) copy(ondev)
   {
 #pragma acc loop gang
 for (unsigned ix = 0; ix < N; ix++)
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-2.c
index 92b82a0..c06d861 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-2.c
@@ -15,7 +15,7 @@ int main ()
   for (ix = 0; ix < N;ix++)
 ary[ix] = -1;
   
-#pragma acc parallel num_gangs(32) vector_length(32) copy(ary) copy(ondev)
+#pragma acc parallel num_gangs(32) copy(ary) copy(ondev)
   {
 #pragma acc loop gang (static:1)
 for (unsigned ix = 0; ix < N; ix++)
