Re: Advice sought for debugging a lto1 ICE (was: Implement C _FloatN, _FloatNx types [version 6])

2016-09-07 Thread Richard Biener
On Wed, Sep 7, 2016 at 1:52 PM, Thomas Schwinge  wrote:
> Hi!
>
> I trimmed the CC list -- I'm looking for advice about debugging a lto1
> ICE.
>
> On Fri, 19 Aug 2016 11:05:59 +, Joseph Myers  
> wrote:
>> On Fri, 19 Aug 2016, Richard Biener wrote:
>> > Can you quickly verify if LTO works with the new types?  I don't see 
>> > anything
>> > that would prevent it but having new global trees and backends 
>> > initializing them
>> > might come up with surprises (see tree-streamer.c:preload_common_nodes)
>>
>> Well, the execution tests are in gcc.dg/torture, which is run with various
>> options including -flto (and I've checked the testsuite logs to confirm
>> these tests are indeed run with such options).  Is there something else
>> you think should be tested?
>
> As I noted in :
>
> As of the PR32187 commit r239625 "Implement C _FloatN, _FloatNx types", 
> nvptx
> offloading is broken, ICEs in LTO stream-in.  Probably some kind of 
> data-type
> mismatch that is not visible with Intel MIC offloading (using the same 
> data
> types) but explodes with nvptx.  I'm having a look.
>
> I know how to use "-save-temps -v" to re-run the ICEing lto1 in GDB; a
> backtrace of the ICE looks as follows:
>
> #0  fancy_abort (file=file@entry=0x10d61d0 "[...]/source-gcc/gcc/vec.h", 
> line=line@entry=727, function=function@entry=0x10d6e3a 
> <_ZZN3vecIP9tree_node7va_heap8vl_embedEixEjE12__FUNCTION__> "operator[]") at 
> [...]/source-gcc/gcc/diagnostic.c:1414
> #1  0x0058c9ef in vec::operator[] 
> (this=0x16919c0, ix=ix@entry=185) at [...]/source-gcc/gcc/vec.h:727
> #2  0x0058ca33 in vec::operator[] 
> (this=this@entry=0x1691998, ix=ix@entry=185) at 
> [...]/source-gcc/gcc/vec.h:1211

so it wants tree 185 which is (given the low number) likely one streamed by
preload_common_nodes.  This is carefully crafted to _not_ diverge by
frontend (!), but it wasn't designed to cope with global trees being present
or not depending on the target (well, because the target is always the
same, mind you!).

Now -- in theory it should deal with NULLs just fine (resulting in
error_mark_node), but it can diverge when there are additional
compound types (like vector, complex, array or record types) whose
element types are not in the set of global trees.
The complex _FloatN types would be such a case given they appear before their
components.  That mixes up the ordering at least.

So I suggest adding a print_tree where it does the streamer_tree_cache_append
and comparing the cc1 and lto1 output.

The ICE above means the lto1 has fewer preloaded nodes I guess.
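
The failure mode described above can be modelled in a few lines. This is a toy sketch, not GCC code: `NodeCache`, `preload_nodes`, `index_of` and `in_range` are hypothetical stand-ins for the streamer tree cache and preload_common_nodes, and the node names are invented. It only shows that when the writer preloads more nodes than the reader, an index that is valid on the writer side is out of range on the reader side.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for the streamer tree cache: nodes are referenced by index.
using NodeCache = std::vector<std::string>;

// Hypothetical preload step. A compiler that knows the extra type
// preloads one more node than a compiler that does not.
NodeCache preload_nodes(bool has_extra_float_type) {
    NodeCache cache = {"integer_type", "float_type", "double_type"};
    if (has_extra_float_type)
        cache.push_back("float128_type");
    return cache;
}

// Index the writer would stream out as a reference to the named node.
std::size_t index_of(const NodeCache& cache, const std::string& name) {
    for (std::size_t i = 0; i < cache.size(); ++i)
        if (cache[i] == name)
            return i;
    assert(false && "node was not preloaded on the writer side");
    return cache.size();
}

// The reader can only resolve indices below its own cache size; the
// real vec::operator[] aborts instead of returning false.
bool in_range(const NodeCache& cache, std::size_t ix) {
    return ix < cache.size();
}
```

In this model, a writer built with `preload_nodes(true)` streams index 3 for the extra type, while a reader built with `preload_nodes(false)` has only indices 0 to 2. Dumping both lists side by side, as suggested above, is what would reveal the first position where they differ.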

Richard.

> #3  0x00c73e54 in streamer_tree_cache_get_tree (cache=0x1691990, 
> ix=ix@entry=185) at [...]/source-gcc/gcc/tree-streamer.h:98
> #4  0x00c73eb9 in streamer_get_pickled_tree 
> (ib=ib@entry=0x7fffceb0, data_in=data_in@entry=0x1691930) at 
> [...]/source-gcc/gcc/tree-streamer-in.c:1112
> #5  0x008f841b in lto_input_tree_1 (ib=ib@entry=0x7fffceb0, 
> data_in=data_in@entry=0x1691930, tag=tag@entry=LTO_tree_pickle_reference, 
> hash=hash@entry=0) at [...]/source-gcc/gcc/lto-streamer-in.c:1404
> #6  0x008f8844 in lto_input_tree (ib=0x7fffceb0, 
> data_in=0x1691930) at [...]/source-gcc/gcc/lto-streamer-in.c:1444
> #7  0x00c720d2 in lto_input_ts_list_tree_pointers 
> (ib=ib@entry=0x7fffceb0, data_in=data_in@entry=0x1691930, 
> expr=expr@entry=0x76993780) at [...]/source-gcc/gcc/tree-streamer-in.c:861
> #8  0x00c7444e in streamer_read_tree_body 
> (ib=ib@entry=0x7fffceb0, data_in=data_in@entry=0x1691930, 
> expr=expr@entry=0x76993780) at 
> [...]/source-gcc/gcc/tree-streamer-in.c:1077
> #9  0x008f6428 in lto_read_tree_1 (ib=ib@entry=0x7fffceb0, 
> data_in=data_in@entry=0x1691930, expr=expr@entry=0x76993780) at 
> [...]/source-gcc/gcc/lto-streamer-in.c:1285
> #10 0x008f651b in lto_read_tree (ib=ib@entry=0x7fffceb0, 
> data_in=data_in@entry=0x1691930, tag=tag@entry=4, hash=hash@entry=4086308758) 
> at [...]/source-gcc/gcc/lto-streamer-in.c:1315
> #11 0x008f85db in lto_input_tree_1 (ib=ib@entry=0x7fffceb0, 
> data_in=data_in@entry=0x1691930, tag=tag@entry=4, hash=hash@entry=4086308758) 
> at [...]/source-gcc/gcc/lto-streamer-in.c:1427
> #12 0x008f8673 in lto_input_scc (ib=ib@entry=0x7fffceb0, 
> data_in=data_in@entry=0x1691930, len=len@entry=0x7fffceac, 
> entry_len=entry_len@entry=0x7fffcea8) at 
> [...]/source-gcc/gcc/lto-streamer-in.c:1339
> #13 0x005890f7 in lto_read_decls 
> (decl_data=decl_data@entry=0x77fc, data=data@entry=0x169d570, 
> resolutions=...) at [...]/source-gcc/gcc/lto/lto.c:1693
> #14 0x005898c8 in lto_file_finalize 
> (file_data=file_data@entry=0x77fc, 

Advice sought for debugging a lto1 ICE (was: Implement C _FloatN, _FloatNx types [version 6])

2016-09-07 Thread Thomas Schwinge
Hi!

I trimmed the CC list -- I'm looking for advice about debugging a lto1
ICE.

On Fri, 19 Aug 2016 11:05:59 +, Joseph Myers  
wrote:
> On Fri, 19 Aug 2016, Richard Biener wrote:
> > Can you quickly verify if LTO works with the new types?  I don't see 
> > anything
> > that would prevent it but having new global trees and backends initializing 
> > them
> > might come up with surprises (see tree-streamer.c:preload_common_nodes)
> 
> Well, the execution tests are in gcc.dg/torture, which is run with various 
> options including -flto (and I've checked the testsuite logs to confirm 
> these tests are indeed run with such options).  Is there something else 
> you think should be tested?

As I noted in :

As of the PR32187 commit r239625 "Implement C _FloatN, _FloatNx types", 
nvptx
offloading is broken, ICEs in LTO stream-in.  Probably some kind of 
data-type
mismatch that is not visible with Intel MIC offloading (using the same data
types) but explodes with nvptx.  I'm having a look.

I know how to use "-save-temps -v" to re-run the ICEing lto1 in GDB; a
backtrace of the ICE looks as follows:

#0  fancy_abort (file=file@entry=0x10d61d0 "[...]/source-gcc/gcc/vec.h", 
line=line@entry=727, function=function@entry=0x10d6e3a 
<_ZZN3vecIP9tree_node7va_heap8vl_embedEixEjE12__FUNCTION__> "operator[]") at 
[...]/source-gcc/gcc/diagnostic.c:1414
#1  0x0058c9ef in vec::operator[] 
(this=0x16919c0, ix=ix@entry=185) at [...]/source-gcc/gcc/vec.h:727
#2  0x0058ca33 in vec::operator[] 
(this=this@entry=0x1691998, ix=ix@entry=185) at [...]/source-gcc/gcc/vec.h:1211
#3  0x00c73e54 in streamer_tree_cache_get_tree (cache=0x1691990, 
ix=ix@entry=185) at [...]/source-gcc/gcc/tree-streamer.h:98
#4  0x00c73eb9 in streamer_get_pickled_tree 
(ib=ib@entry=0x7fffceb0, data_in=data_in@entry=0x1691930) at 
[...]/source-gcc/gcc/tree-streamer-in.c:1112
#5  0x008f841b in lto_input_tree_1 (ib=ib@entry=0x7fffceb0, 
data_in=data_in@entry=0x1691930, tag=tag@entry=LTO_tree_pickle_reference, 
hash=hash@entry=0) at [...]/source-gcc/gcc/lto-streamer-in.c:1404
#6  0x008f8844 in lto_input_tree (ib=0x7fffceb0, 
data_in=0x1691930) at [...]/source-gcc/gcc/lto-streamer-in.c:1444
#7  0x00c720d2 in lto_input_ts_list_tree_pointers 
(ib=ib@entry=0x7fffceb0, data_in=data_in@entry=0x1691930, 
expr=expr@entry=0x76993780) at [...]/source-gcc/gcc/tree-streamer-in.c:861
#8  0x00c7444e in streamer_read_tree_body 
(ib=ib@entry=0x7fffceb0, data_in=data_in@entry=0x1691930, 
expr=expr@entry=0x76993780) at [...]/source-gcc/gcc/tree-streamer-in.c:1077
#9  0x008f6428 in lto_read_tree_1 (ib=ib@entry=0x7fffceb0, 
data_in=data_in@entry=0x1691930, expr=expr@entry=0x76993780) at 
[...]/source-gcc/gcc/lto-streamer-in.c:1285
#10 0x008f651b in lto_read_tree (ib=ib@entry=0x7fffceb0, 
data_in=data_in@entry=0x1691930, tag=tag@entry=4, hash=hash@entry=4086308758) 
at [...]/source-gcc/gcc/lto-streamer-in.c:1315
#11 0x008f85db in lto_input_tree_1 (ib=ib@entry=0x7fffceb0, 
data_in=data_in@entry=0x1691930, tag=tag@entry=4, hash=hash@entry=4086308758) 
at [...]/source-gcc/gcc/lto-streamer-in.c:1427
#12 0x008f8673 in lto_input_scc (ib=ib@entry=0x7fffceb0, 
data_in=data_in@entry=0x1691930, len=len@entry=0x7fffceac, 
entry_len=entry_len@entry=0x7fffcea8) at 
[...]/source-gcc/gcc/lto-streamer-in.c:1339
#13 0x005890f7 in lto_read_decls 
(decl_data=decl_data@entry=0x77fc, data=data@entry=0x169d570, 
resolutions=...) at [...]/source-gcc/gcc/lto/lto.c:1693
#14 0x005898c8 in lto_file_finalize 
(file_data=file_data@entry=0x77fc, file=file@entry=0x15eedb0) at 
[...]/source-gcc/gcc/lto/lto.c:2037
#15 0x00589928 in lto_create_files_from_ids 
(file=file@entry=0x15eedb0, file_data=file_data@entry=0x77fc, 
count=count@entry=0x7fffd054) at [...]/source-gcc/gcc/lto/lto.c:2047
#16 0x00589a7a in lto_file_read (file=0x15eedb0, 
resolution_file=resolution_file@entry=0x0, count=count@entry=0x7fffd054) at 
[...]/source-gcc/gcc/lto/lto.c:2088
#17 0x00589e84 in read_cgraph_and_symbols (nfiles=1, 
fnames=0x160e990) at [...]/source-gcc/gcc/lto/lto.c:2798
#18 0x0058a572 in lto_main () at [...]/source-gcc/gcc/lto/lto.c:3299
#19 0x00a48eff in compile_file () at 
[...]/source-gcc/gcc/toplev.c:466
#20 0x00550943 in do_compile () at 
[...]/source-gcc/gcc/toplev.c:2010
#21 toplev::main (this=this@entry=0x7fffd180, argc=argc@entry=20, 
argv=0x15daf20, argv@entry=0x7fffd288) at [...]/source-gcc/gcc/toplev.c:2144
#22 0x00552717 in main (argc=20, argv=0x7fffd288) at 
[...]/source-gcc/gcc/main.c:39

(Comparing to yesterday's r240004, the 

Re: why do we need xtensa-config.h?

2016-09-07 Thread augustine.sterl...@gmail.com
On Tue, Sep 6, 2016 at 11:55 PM, Thomas Schwinge
 wrote:
> Hi!
>
> Neither do I really know anything about Xtensa, nor do I have a lot of
> experience in these parts of GCC back ends, but:

There is a lot of background to know here. Unfortunately, I have no
familiarity with making debian packages, so I'm unfamiliar with that
side of it.

First--and perhaps most important--the current method of configuring
GCC for xtensa targets has worked well for nearly two decades. As far
as I know, it is rare to encounter problems. Because of that, the bar
to change it will probably be fairly high.

> On Tue, 6 Sep 2016 20:42:53 +0200, Oleksij Rempel  
> wrote:
>> i'm one of ath9k-htc-firmware developers. Currently i'm looking for the
>> way to provide this firmware as opensource/free package for debian. Main
>> problem seems to be the need to patch gcc xtensa-config.h to make it
>> suitable for our CPU.
>>
>> I have following questions:
>>
>> do we really need this patch?
>> https://github.com/qca/open-ath9k-htc-firmware/blob/master/local/patches/gcc.patch
>
> That I can't tell.  ;-)

You need something like that patch, for sure.

>> Is it possible or welcome to extend gcc to be configurable without
>> patching it all the time?
>
> Yes, I would think.  The macros modified in the above patch to GCC's
> include/xtensa-config.h file look like these ought to be modifiable with
> -m* options defined by the Xtensa back end, and you'd then assign
> specific defaults to a specific CPU variant, and build GCC (or build a
> multilib) for that configuration.

Today, there are literally hundreds of variants of the xtensa cpu
actually realized and in use. Having a list of all those variants and
their defaults inside gcc would be awkward and unwieldy.

But--and here's the rub--literally tomorrow, someone could design a
hundred more that are different from all of the ones already out
there. There is literally an unlimited number of potential variants,
each with potentially brand new, never conceived instructions. (Adding
clever custom instructions is xtensa's raison d'etre.)

With the current configurability mechanism, supporting all of those
variants inside gcc (and, in fact, the rest of the gnu-toolchain) is
simply a matter of using the correct xtensa-config.h for that
particular variant. If we were to go with the "-m with defaults"
mechanism, we would need some way of adding the defaults for the new
variant to gcc.

But that is patching gcc also, and once you go there, you may as well
use the original method.

>
> This file include/xtensa-config.h is #included in
> gcc/config/xtensa/xtensa.h and libgcc/config/xtensa/crti.S,

Note that "-m" options can't change the instructions in crti.S and
lib?funcs.S, but macros can and do.
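
As an illustration of that last point, here is a minimal sketch of how a configuration macro selects code at preprocessing time, which works equally well in a preprocessed assembly file. The macro `DEMO_HAVE_MUL32` and the functions are hypothetical, written in the style of a per-variant configuration header; they are not taken from the real xtensa-config.h.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical configuration macro; a per-variant config header would
// normally define it. Default to "no hardware multiplier".
#ifndef DEMO_HAVE_MUL32
#define DEMO_HAVE_MUL32 0
#endif

// The preprocessor picks one implementation before the compiler proper
// runs, so the choice is fixed when the library is built.
std::uint32_t mul32(std::uint32_t a, std::uint32_t b) {
#if DEMO_HAVE_MUL32
    return a * b;  // variant with a multiply instruction
#else
    // Shift-and-add fallback for a variant without one.
    std::uint32_t result = 0;
    while (b != 0) {
        if (b & 1u)
            result += a;
        a <<= 1;
        b >>= 1;
    }
    return result;
#endif
}

// Small self-check that holds for either configuration.
void mul32_self_check() {
    assert(mul32(6u, 7u) == 42u);
}
```

Both branches compute the same result modulo 2^32; only the instructions that end up in the built library differ, which is why the header has to be correct before the toolchain is built.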



On the debian packaging side. Forgive me for my ignorance on the
topic; I don't know what the tool-flow is, or what the requirements
are. As far as I am aware, this is the first time someone has tried to
make a debian package for xtensa.

Anyway, I wouldn't expect patching gcc (or configuring it) for an
individual package is the right thing. It should probably happen as
part of some kind of "setup toolchain" step.

Typically, patching gcc for a xtensa config happens just once
immediately after designing the processor, or--if you aren't the
designer yourself--when one starts development for that variant.

Surely if someone is building this package, they would have already
built some other software for that particular xtensa target. (Perhaps
as part of a larger debian build?)

Also, this package should probably only be built when targeting this
particular xtensa variant (not xtensa generally). I don't know how one
restricts this in the debian packaging mechanism.

Hope this helps, and I'm happy to answer more questions.


Re: Proposal: readable and writable attributes on variables

2016-09-07 Thread Jeff Law

On 09/01/2016 09:04 AM, Martin Sebor wrote:


> Understood.  I think a write-only attribute or type qualifier would
> make sense.  Until/unless it's implemented I would recommend to work
> around its absence by hiding access to the registers behind a read-
> only and write-only functional API.

As you noted earlier Martin, if we bake it into the typesystem, then you 
get the desired warnings when you mix-n-match types.  For that reason I 
see a type qualifier as superior to an attribute.


IIRC the national labs that were looking at the alignment attribute 
essentially came to the same conclusion -- bake it into the core of the 
typesystem and rely on the typesystem to ensure you don't lose the data.
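
The functional-API workaround mentioned in the quoted text could look roughly like the following. This is only a sketch under stated assumptions: `WriteOnlyReg` and `ReadOnlyReg` are hypothetical names, and the register is modelled as a plain `volatile` 32-bit location rather than any real device.

```cpp
#include <cassert>
#include <cstdint>

// Exposes only a store operation for a register location.
class WriteOnlyReg {
public:
    explicit WriteOnlyReg(volatile std::uint32_t* addr) : addr_(addr) {
        assert(addr != nullptr);
    }
    void write(std::uint32_t value) const { *addr_ = value; }
    // No read() member: reading through this type does not compile.
private:
    volatile std::uint32_t* addr_;
};

// Exposes only a load operation for a register location.
class ReadOnlyReg {
public:
    explicit ReadOnlyReg(const volatile std::uint32_t* addr) : addr_(addr) {
        assert(addr != nullptr);
    }
    std::uint32_t read() const { return *addr_; }
    // No write() member: storing through this type does not compile.
private:
    const volatile std::uint32_t* addr_;
};
```

The wrappers give a compile-time error for the wrong kind of access, but only for code that goes through them; a qualifier in the type system would also catch mixing of the underlying pointer types, which is the advantage argued for above.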


Jeff


Re: why do we need xtensa-config.h?

2016-09-07 Thread augustine.sterl...@gmail.com
On Wed, Sep 7, 2016 at 9:21 AM, augustine.sterl...@gmail.com
 wrote:
> Hope this helps, and I'm happy to answer more questions.

Also, one technique commonly used by people who ship software for
Xtensa is to write it such that it could compile for any variant at
all. This requires care, but is quite doable.


Re: why do we need xtensa-config.h?

2016-09-07 Thread Thomas Schwinge
Hi!

Neither do I really know anything about Xtensa, nor do I have a lot of
experience in these parts of GCC back ends, but:

On Tue, 6 Sep 2016 20:42:53 +0200, Oleksij Rempel  
wrote:
> i'm one of ath9k-htc-firmware developers. Currently i'm looking for the
> way to provide this firmware as opensource/free package for debian. Main
> problem seems to be the need to patch gcc xtensa-config.h to make it
> suitable for our CPU.
> 
> I have following questions:
> 
> do we really need this patch?
> https://github.com/qca/open-ath9k-htc-firmware/blob/master/local/patches/gcc.patch

That I can't tell.  ;-)

> Is it possible or welcome to extend gcc to be configurable without
> patching it all the time?

Yes, I would think.  The macros modified in the above patch to GCC's
include/xtensa-config.h file look like these ought to be modifiable with
-m* options defined by the Xtensa back end, and you'd then assign
specific defaults to a specific CPU variant, and build GCC (or build a
multilib) for that configuration.

This file include/xtensa-config.h is #included in
gcc/config/xtensa/xtensa.h and libgcc/config/xtensa/crti.S,
libgcc/config/xtensa/crtn.S, libgcc/config/xtensa/lib1funcs.S,
libgcc/config/xtensa/lib2funcs.S, but I have not checked how the macro
definitions are actually used.

In gcc/doc/install.texi I read:

@anchor{xtensa-x-elf}
@heading xtensa*-*-elf
This target is intended for embedded Xtensa systems using the
@samp{newlib} C library.  It uses ELF but does not support shared
objects.  Designer-defined instructions specified via the
Tensilica Instruction Extension (TIE) language are only supported
through inline assembly.

The Xtensa configuration information must be specified prior to
building GCC@.  The @file{include/xtensa-config.h} header
file contains the configuration information.  If you created your
own Xtensa configuration with the Xtensa Processor Generator, the
downloaded files include a customized copy of this header file,
which you can use to replace the default header file.

@html

@end html
@anchor{xtensa-x-linux}
@heading xtensa*-*-linux*
[...]

Hmm.  CCing Sterling Augustine who is listed as the Xtensa CPU Port
Maintainer.


Grüße
 Thomas




Re: why do we need xtensa-config.h?

2016-09-07 Thread Max Filippov
Hello,

On Tue, Sep 6, 2016 at 11:42 AM, Oleksij Rempel  wrote:
> i'm one of ath9k-htc-firmware developers. Currently i'm looking for the
> way to provide this firmware as opensource/free package for debian. Main
> problem seems to be the need to patch gcc xtensa-config.h to make it
> suitable for our CPU.
>
> I have following questions:
>
> do we really need this patch?
> https://github.com/qca/open-ath9k-htc-firmware/blob/master/local/patches/gcc.patch

Yes, these changes are needed, but perhaps not in the form of a patch.
The changed file is a part of the configuration overlay that needs to be
applied to binutils, gcc and gdb in order to configure them to generate
code for the specific xtensa core.
We have xtensa support in the crosstool-NG and the Buildroot, both
of which can generate xtensa toolchain using configuration overlay.
Please refer to
 http://wiki.linux-xtensa.org/index.php/Toolchain_and_Embedded_Distributions
for more information about xtensa configuration overlay and its usage
by crosstool-NG and Buildroot.

> Is it possible or welcome to extend gcc to be configurable without
> patching it all the time?

It is definitely welcome and is likely possible.
Please do not forget that both gcc and binutils need to have a coherent idea
of the processor they're generating code for.

-- 
Thanks.
-- Max


[Bug c++/77509] ICE on invalid C++ code: in finish_class_member_access_expr, at cp/typeck.c:2783

2016-09-07 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77509

Martin Liška  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |ice-on-invalid-code
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-09-07
                 CC|                            |marxin at gcc dot gnu.org
   Target Milestone|---                         |5.5
     Ever confirmed|0                           |1

--- Comment #1 from Martin Liška  ---
Confirmed, all releases I have (4.5.0) ICE.

Re: [PATCH, c++, PR77427 ] Set TYPE_STRUCTURAL_EQUALITY for sysv_abi va_list

2016-09-07 Thread Richard Biener
On Mon, Sep 5, 2016 at 6:11 PM, Tom de Vries  wrote:
> On 05/09/16 09:49, Richard Biener wrote:
>> On Sun, Sep 4, 2016 at 11:30 PM, Tom de Vries wrote:
>>> On 04/09/16 16:08, Richard Biener wrote:
>>>> On September 4, 2016 12:33:02 PM GMT+02:00, Tom de Vries wrote:
>>>>> On 04/09/16 08:12, Richard Biener wrote:
>>>>>> On September 3, 2016 5:23:35 PM GMT+02:00, Tom de Vries wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> this patch fixes a c++ ICE, a p1 6/7 regression.
>>>>>>>
>>>>>>> Consider test.C:
>>>>>>> ...
>>>>>>> void bar (__builtin_va_list &);
>>>>>>>
>>>>>>> struct c
>>>>>>> {
>>>>>>>   operator const __builtin_va_list &();
>>>>>>>   operator __builtin_va_list &();
>>>>>>> };
>>>>>>>
>>>>>>> void
>>>>>>> foo (void) {
>>>>>>>   struct c c1;
>>>>>>>
>>>>>>>   bar (c1);
>>>>>>> }
>>>>>>> ...
>>>>>>>
>>>>>>> The compiler ICEs as follows:
>>>>>>> ...
>>>>>>> test.C: In function ‘void foo()’:
>>>>>>> test.C:13:10: internal compiler error: canonical types differ for
>>>>>>> identical types __va_list_tag [1] and __va_list_tag [1]
>>>>>>>    bar (c1);
>>>>>>>           ^
>>>>>>> comptypes(tree_node*, tree_node*, int)
>>>>>>>     src/gcc/cp/typeck.c:1430
>>>>>>> reference_related_p(tree_node*, tree_node*)
>>>>>>>     src/gcc/cp/call.c:1415
>>>>>>> reference_binding
>>>>>>>     src/gcc/cp/call.c:1559
>>>>>>> implicit_conversion
>>>>>>>     src/gcc/cp/call.c:1805
>>>>>>> build_user_type_conversion_1
>>>>>>>     src/gcc/cp/call.c:3776
>>>>>>> reference_binding
>>>>>>>     src/gcc/cp/call.c:1664
>>>>>>> implicit_conversion
>>>>>>>     src/gcc/cp/call.c:1805
>>>>>>> add_function_candidate
>>>>>>>     src/gcc/cp/call.c:2141
>>>>>>> add_candidates
>>>>>>>     src/gcc/cp/call.c:5394
>>>>>>> perform_overload_resolution
>>>>>>>     src/gcc/cp/call.c:4066
>>>>>>> build_new_function_call(tree_node*, vec<tree_node*, va_gc,
>>>>>>> vl_embed>**, bool, int)
>>>>>>>     src/gcc/cp/call.c:4143
>>>>>>> finish_call_expr(tree_node*, vec<tree_node*, va_gc,
>>>>>>> vl_embed>**, bool, bool, int)
>>>>>>>     src/gcc/cp/semantics.c:2440
>>>>>>> ...
>>>>>>>
>>>>>>> The regression is caused by the commit for PR70955, that adds a
>>>>>>> "sysv_abi va_list" attribute to the struct in the va_list type
>>>>>>> (which is an array of one of struct).
>>>>>>>
>>>>>>> The ICE in comptypes happens as follows: we're comparing two
>>>>>>> versions of va_list type (with identical array element type), each
>>>>>>> with the canonical type set to themselves. Since the types are
>>>>>>> considered identical, they're supposed to have identical canonical
>>>>>>> types, which is
>>>>>>
>>>>>> Did you figure out why they are not assigned the same canonical
>>>>>> type?
>>>>>
>>>>> When constructing the first type in ix86_build_builtin_va_list_64,
>>>>> it's cached.
>>>>>
>>>>> When constructing the second type in build_array_type_1 (with call
>>>>> stack: grokdeclarator -> cp_build_qualified_type_real ->
>>>>> build_cplus_array_type -> build_cplus_array_type ->
>>>>> build_array_type -> build_array_type_1), we call type_hash_canon.
>>>>>
>>>>> But the cached type has name __builtin_sysv_va_list, and the new
>>>>> type has no name, so we hit the clause 'TYPE_NAME (a->type) !=
>>>>> TYPE_NAME (b->type)' in type_cache_hasher::equal.
>>>>>
>>>>> Consequently, TYPE_CANONICAL for the new type remain set to self.
>>>>
>>>> But how did it then work before the patch causing this?
>>>
>>> Before the patch that introduced the attribute, rather than assigning
>>> the result of ix86_build_builtin_va_list_64 directly
>>> sysv_va_list_type_node, an intermediate build_variant_type_copy was
>>> used.
>>>
>>> This had as consequence 

[Bug bootstrap/77359] [7 Regression] AIX bootstrap failure due to alignment of stack pointer + STACK_DYNAMIC_OFFSET

2016-09-07 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77359

--- Comment #14 from Dominik Vogt  ---
Okay, it looks like outgoing_args_size is rounded up to a multiple of
preferred_stack_boundary, so there's no problem on s390 or other targets with a
stack allocation size smaller than STACK_BOUNDARY.  So, what needs to be done
to get the patch back in?

1. Test and apply a patch with the change suggested in comment 10.  (I can do
a quick test of the patch on AIX, but someone else should test AIX/Darwin/Linux
on Power with the patch.)
2. Reapply the backed out patch.
3. Create a bug report for the not matching conditions in STACK_DYNAMIC_OFFSET
and STACK_BOUNDARY in rs6000.h and let the maintainers of that code clean it
up.

Opinions?
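
For reference, the rounding mentioned at the start of this comment is the usual "round up to a multiple of a power-of-two alignment" operation. A generic sketch follows; `round_up` is a hypothetical helper name, not the function GCC itself uses, and the unit (bits or bytes) is whatever the caller passes in.

```cpp
#include <cassert>
#include <cstdint>

// Round size up to the next multiple of align.
// align must be a non-zero power of two.
std::uint64_t round_up(std::uint64_t size, std::uint64_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}
```

A size that is already a multiple of the alignment is returned unchanged, so the rounded value is never smaller than the alignment boundary it is being compared against unless it is zero.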

Re: [PATCH GCC 4/9]Check niters for peeling for data access gaps in analyzer

2016-09-07 Thread Jeff Law

On 09/06/2016 12:51 PM, Bin Cheng wrote:

> Hi,
> This patch checks if loop has enough niters for peeling for data access
> gaps in vect_analyze_loop_2, while now this check is in
> vect_transform_loop stage.  The problem is vectorizer may vectorize loops
> without enough iterations and generate false guard on the vectorized
> loop.  Though the loop is successfully vectorized, it will never be
> executed, and most likely, it will be removed during cfg-cleanup.
> Examples can be found in revised tests of this patch.
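
A simplified model of the check described in the quoted patch description, for illustration only. It assumes that peeling for gaps reserves the last scalar iteration for the epilogue, so that iteration is not available to the vector loop; the function names are hypothetical and the real test in vect_analyze_loop_2 works on GCC's internal loop data rather than plain integers.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model: how many times would the vector loop body run?
// With peeling for gaps, the last scalar iteration is kept for the
// epilogue, so it is subtracted before dividing by the vectorization
// factor vf.
std::uint64_t vector_iterations(std::uint64_t niters, std::uint64_t vf,
                                bool peeling_for_gaps) {
    assert(vf > 0);
    if (peeling_for_gaps && niters > 0)
        niters -= 1;
    return niters / vf;
}

// Analysis-time decision in this model: reject the loop when the
// vector body could never execute.
bool enough_iterations(std::uint64_t niters, std::uint64_t vf,
                       bool peeling_for_gaps) {
    return vector_iterations(niters, vf, peeling_for_gaps) > 0;
}
```

With four iterations and a vectorization factor of four, the loop passes without peeling for gaps but fails with it, which is the kind of loop the patch wants to reject during analysis instead of vectorizing it behind a guard that is always false.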

> Thanks,
> bin
>
> 2016-09-01  Bin Cheng  
>
> * tree-vect-loop.c (vect_analyze_loop_2): Check and skip loop if it
> has no enough iterations for LOOP_VINFO_PEELING_FOR_GAPS.
>
> gcc/testsuite/ChangeLog
> 2016-09-01  Bin Cheng  
>
> * gcc.dg/vect/vect-98.c: Refine test case.
> * gcc.dg/vect/vect-strided-a-u8-i8-gap2.c: Ditto.
> * gcc.dg/vect/vect-strided-u8-i8-gap2.c: Ditto.
> * gcc.dg/vect/vect-strided-u8-i8-gap4.c: Ditto.

OK.
jeff


[PATCH][expmed.c] PR middle-end/77426 Delete duplicate condition in synth_mult

2016-09-07 Thread Kyrill Tkachov

Hi all,

The duplicate mode check in synth_mult can just be deleted IMO. It was
introduced as part of r139821, which was a much larger change introducing
size/speed differentiation to the RTL midend.
So I think it's just a typo/copy-pasto.

Tested on aarch64-none-elf.
Ok?

Thanks,
Kyrill

2016-09-07  Kyrylo Tkachov  

PR middle-end/77426
* expmed.c (synth_mult): Delete duplicate mode check.
diff --git a/gcc/expmed.c b/gcc/expmed.c
index 1cedf023c8e8916d887bd3a9d9a723e3cc2354f7..a5da8836f21debcda3b834cb869348ea6cb33414 100644
--- a/gcc/expmed.c
+++ b/gcc/expmed.c
@@ -2572,7 +2572,6 @@ synth_mult (struct algorithm *alg_out, unsigned HOST_WIDE_INT t,
   entry_ptr = alg_hash_entry_ptr (hash_index);
   if (entry_ptr->t == t
   && entry_ptr->mode == mode
-  && entry_ptr->mode == mode
   && entry_ptr->speed == speed
   && entry_ptr->alg != alg_unknown)
 {


Re: [PATCH GCC 1/9]Delete useless code in tree-vect-loop-manip.c

2016-09-07 Thread Jeff Law

On 09/06/2016 12:49 PM, Bin Cheng wrote:

> Hi,
> This is a patch set generating new control flow graph for vectorized loop
> and its peeling loops.  At the moment, CFG for vectorized loop is
> complicated and sub-optimal.  Major issues are like:
> A) For both prologue and vectorized loop, it generates guard/branch before
> loops checking if the following (prologue/vectorized) loop should be
> skipped.  It also generates guard/branch after loops checking if the next
> (vectorized/epilogue) loop should be skipped.
> B) Depending on how conditional set is supported by targets, it may
> generate one additional if-statement (branch) setting the niters for
> prologue loop.
> C) In the worst cases, up to 4 branch instructions need to be executed
> before vectorized loop is entered.
> D) For loops without enough niters, it checks some (niters_prologue)
> iterations with prologue loop; then checks if the remaining number of
> iterations (niters - niters_prologue) is enough for vectorization; if not,
> it skips vectorized loop and continues with epilogue loop.  This is bad
> since vectorized loop won't be executed at all after all the hassle.
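
Issue D in the list above can be illustrated with a small model of how scalar iterations are distributed over the three loops. This is a sketch with hypothetical names (`IterationSplit`, `split_iterations`), not the vectorizer's own bookkeeping; it ignores alignment details and peeling for gaps.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// How many scalar iterations each of the three loops ends up handling.
struct IterationSplit {
    std::uint64_t prologue;
    std::uint64_t vectorized;  // scalar iterations covered by the vector loop
    std::uint64_t epilogue;
};

// The prologue runs first, the vector loop takes as many full groups of
// vf iterations as remain, and the epilogue handles the leftover.
IterationSplit split_iterations(std::uint64_t niters,
                                std::uint64_t niters_prologue,
                                std::uint64_t vf) {
    assert(vf > 0);
    IterationSplit s{};
    s.prologue = std::min(niters, niters_prologue);
    std::uint64_t rest = niters - s.prologue;
    s.vectorized = (rest / vf) * vf;
    s.epilogue = rest - s.vectorized;
    return s;
}
```

For five iterations, a three-iteration prologue and a vectorization factor of four, the split is 3/0/2: the guards before the vector loop are evaluated, yet the vector loop handles nothing, which is the wasted work described in D.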

> This patch set improves it by merging different checks thus only 2 branch
> instructions (could be further reduced in combination with loop
> versioning) are executed before vectorized loop; it does better in compile
> time analysis in order to avoid prologue/epilogue peeling if possible; it
> improves code generation in various ways (live overflow handling,
> generating short live ranges).  In terms of implementation, it tries to
> factor SSA updating code out of CFG changing code, I think this may help
> future work replacing slpeel_* with generic GIMPLE loop copier.
>
> So far there are 9 patches in the set, patch [1-5] are small prerequisites
> for major change which is done by patch 6.  Patch [7-9] are small patches
> either address test case or improve code generation.  Final bootstrap and
> test of patch set ongoing on x86_64 and AArch64.  Assume no new failure or
> will be fixed, any comments on this?
>
> This is the first patch deleting useless code in tree-vect-loop-manip.c,
> as well as fixing obvious code style issue.
>
> Thanks,
> bin
>
> 2016-09-01  Bin Cheng  
>
> * tree-vect-loop-manip.c (slpeel_can_duplicate_loop_p): Fix code
> style issue.
> (vect_do_peeling_for_loop_bound, vect_do_peeling_for_alignment):
> Remove useless code.

Seems obvious to me -- I can't think of any reason why we'd emit a NULL 
sequence to the loop preheader edge.


jeff



Re: [PATCH GCC 5/9]Put copied loop after its preheader and after the original loop's latch in basic block link list

2016-09-07 Thread Jeff Law

On 09/06/2016 12:52 PM, Bin Cheng wrote:

Hi,
This simple patch changes slpeel_tree_duplicate_loop_to_edge_cfg by putting the
copied loop after its new preheader and after the original loop's latch in the
basic block linked list.  It doesn't change the CFG at all, but makes the CFG
dump a little bit easier to read.  I assume this is good for basic block
reordering too?

Thanks,
bin

2016-09-01  Bin Cheng  

* tree-vect-loop-manip.c (slpeel_tree_duplicate_loop_to_edge_cfg): Put
duplicated loop after its preheader and after the original loop.

In theory bb reordering ought to clean this up.  But better to generate 
good code from the start when it's easy to do so.


OK.

jeff


[Bug libfortran/77393] [7 Regression] Revision r237735 changed the behavior of F0.0

2016-09-07 Thread ro at CeBiTec dot Uni-Bielefeld.DE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77393

--- Comment #14 from ro at CeBiTec dot Uni-Bielefeld.DE  ---
> --- Comment #13 from Jerry DeLisle  ---
> Author: jvdelisle
> Date: Tue Sep  6 23:22:26 2016
> New Revision: 240018
>
> URL: https://gcc.gnu.org/viewcvs?rev=240018&root=gcc&view=rev
> Log:
> 2016-09-06  Jerry DeLisle  
>
> PR libgfortran/77393
> * io/write_float.def (build_float_string): Recognize when the
> result will not fit in the user provided, star fill, and exit
> early.
>
> * gfortran.dg/fmt_f0_2.f90: Update test.
> * gfortran.dg/fmt_f0_3.f90: New test.

I've only now managed to test the patch: it passes on both
i386-pc-solaris2.12 and sparc-sun-solaris2.12.

Thanks.
Rainer

[Bug c++/77513] -Wzero-as-null-pointer-constant vs 0, nullptr, NULL and __null

2016-09-07 Thread sch...@linux-m68k.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77513

--- Comment #1 from Andreas Schwab  ---
The "3" flag on the line marker marks the following lines as originating from a
system header where warnings are suppressed.  Use -Wsystem-headers to enable
them.

Re: Make max_align_t respect _Float128 [version 2]

2016-09-07 Thread Mark Wielaard
On Wed, Sep 07, 2016 at 11:15:34AM +0200, Florian Weimer wrote:
> On 09/06/2016 11:31 PM, Paul Eggert wrote:
> >On 09/06/2016 01:40 PM, Joseph Myers wrote:
> >>Sounds like a defect in C11 to me - none of the examples of flexible
> >>array
> >>members anticipate needing to add to the size to allow for tail padding
> >>with unknown alignment requirements.
> >
> >Yes, I would prefer calling it a defect, as most code I've seen dealing
> >with flexible array members does not align the tail size. However, GCC +
> >valgrind does take advantage of this "defect" and I would not be
> >surprised if other picky implementations do too.
> 
> It might be an inherent limitation of the valgrind approach. 
> Speculative loads which cannot result in data races (in the C11 sense) 
> due to the way the architecture behaves should be fine.  The alignment 
> ensures that the load is on the same page, which is what typically 
> prevent this optimization.

It might or might not be an issue for valgrind. If valgrind believes the
memory isn't valid memory then it will complain about an invalid access.
But if the memory is accessible but uninitialised then it will just track
the undefinedness and complain later if such a value is used.

> Some implementation techniques for C string functions result in the same 
> behavior.  valgrind intercepts them or suppresses errors there, but 
> that's not possible for code that GCC emits inline, obviously.

valgrind also has --partial-loads-ok (which in newer versions defaults
to =yes):

   Controls how Memcheck handles 32-, 64-, 128- and 256-bit naturally
   aligned loads from addresses for which some bytes are addressable
   and others are not. When yes, such loads do not produce an address
   error. Instead, loaded bytes originating from illegal addresses are
   marked as uninitialised, and those corresponding to legal addresses
   are handled in the normal way.

> valgrind would still treat the bytes beyond the allocation boundary as 
> undefined.  But I agree that false positives in this area are annoying.

Does anybody have an example program of the above issue compiled with
gcc that produces false positives with valgrind?

Thanks,

Mark


Re: [PATCH GCC 2/9]Add interface reseting original copy tables in cfg.c

2016-09-07 Thread Jeff Law

On 09/06/2016 12:50 PM, Bin Cheng wrote:

Hi,
This simple patch adds an interface for resetting the original copy tables in
cfg.c.  This will be used in rewriting the vect_do_peeling_* functions in the
vectorizer so that we don't need to release/allocate the tables between prolog
and epilog peeling.

Thanks,
bin

2016-09-01  Bin Cheng  

* cfg.c (reset_original_copy_tables): New func.
* cfg.h (reset_original_copy_tables): New decl.

Needs a function comment for reset_original_copy_tables.  Should be fine 
with that change.


Jeff


Re: [PATCH GCC 8/9]Adjust test case for CFG changes in vectorized loop

2016-09-07 Thread Jeff Law

On 09/06/2016 12:53 PM, Bin Cheng wrote:

Hi,
After CFG changes in vectorizer, the epilog loop now can be completely peeled, 
resulting in changes in the number of instructions that these tests check.  
This patch adjusts related checking strings.

Thanks,
bin


gcc/testsuite/ChangeLog
2016-09-01  Bin Cheng  

* gcc.target/i386/l_fma_float_1.c: Revise test.
* gcc.target/i386/l_fma_float_2.c: Ditto.
* gcc.target/i386/l_fma_float_3.c: Ditto.
* gcc.target/i386/l_fma_float_4.c: Ditto.
* gcc.target/i386/l_fma_float_5.c: Ditto.
* gcc.target/i386/l_fma_float_6.c: Ditto.
* gcc.target/i386/l_fma_double_1.c: Ditto.
* gcc.target/i386/l_fma_double_2.c: Ditto.
* gcc.target/i386/l_fma_double_3.c: Ditto.
* gcc.target/i386/l_fma_double_4.c: Ditto.
* gcc.target/i386/l_fma_double_5.c: Ditto.
* gcc.target/i386/l_fma_double_6.c: Ditto.

OK when prerequisites are approved.
jeff



Re: [PATCH][v3] GIMPLE store merging pass

2016-09-07 Thread Jeff Law

On 09/07/2016 02:19 AM, Richard Biener wrote:

On Tue, 6 Sep 2016, Jakub Jelinek wrote:


On Tue, Sep 06, 2016 at 04:59:23PM +0100, Kyrill Tkachov wrote:

On 06/09/16 16:32, Jakub Jelinek wrote:

On Tue, Sep 06, 2016 at 04:14:47PM +0100, Kyrill Tkachov wrote:

The v3 of this patch addresses feedback I received on the version posted at [1].
The merged store buffer is now represented as a char array that we splat values 
onto with
native_encode_expr and native_interpret_expr. This allows us to merge anything 
that native_encode_expr
accepts, including floating point values and short vectors. So this version 
extends the functionality
of the previous one in that it handles floating point values as well.

The first phase of the algorithm that detects the contiguous stores is also 
slightly refactored according
to feedback to read more fluently.

Richi, I experimented with merging up to MOVE_MAX bytes rather than word size 
but I got worse results on aarch64.
MOVE_MAX there is 16 (because it has load/store register pair instructions) but 
the 128-bit immediates that we ended up
synthesising were too complex. Perhaps the TImode immediate store RTL 
expansions could be improved, but for now
I've left the maximum merge size to be BITS_PER_WORD.

At least from playing with this kind of thing in the RTL PR22141 patch,
I remember that storing a 64-bit constant on x86_64 compared to storing 2
32-bit constants usually isn't a win (not just for speed optimized blocks but
also for -Os).  For a 64-bit store, if the constant isn't signed 32-bit or
unsigned 32-bit, you need a movabsq into some temporary register, which has
something like 3 times worse latency than a normal store if I remember well,
and then you store it.


We could restrict the maximum width of the stores generated to 32 bits on 
x86_64.
I think this would need another parameter or target macro for the target to set.
Alternatively, is it a possibility for x86 to be a bit smarter in its DImode 
mov-immediate
expansion? For example break up the 64-bit movabsq immediate into two SImode 
immediates?


If you want a 64-bit store, you'd need to merge the two, and that would be
even more expensive.  It is a matter of say:
movl $0x12345678, (%rsp)
movl $0x09abcdef, 4(%rsp)
vs.
movabsq $0x09abcdef12345678, %rax
movq %rax, (%rsp)
vs.
movl $0x09abcdef, %eax
salq $32, %rax
orq $0x12345678, %rax
movq %rax, (%rsp)


vs.

movq $LC0, (%rsp)

?


etc.  Guess it needs to be benchmarked on contemporary CPUs, I'm pretty sure
the last sequence is the worst one.


I think the important part to notice is that it should be straightforward
for a target / the expander to split a large store from an immediate
into any of the above but very hard to do the opposite.  Thus from a
GIMPLE side "canonicalizing" to large stores (that are eventually
supported and well-aligned) seems best to me.

Agreed.





I'm aware of that. The patch already has logic to avoid emitting unaligned 
accesses
for SLOW_UNALIGNED_ACCESS targets. Beyond that the patch introduces the 
parameter
PARAM_STORE_MERGING_ALLOW_UNALIGNED that can be used by the user or target to
forbid generation of unaligned stores by the pass altogether. Beyond that I'm 
not sure
how to behave more intelligently here. Any ideas?


Dunno, the heuristics were the main problem with my patch.  Generally, I'd
say there is a difference between cold and hot blocks, in cold ones perhaps
unaligned stores are more appropriate (if supported at all and not way too
slow), while in hot ones less desirable.


Note that I repeatedly argue that if we can canonicalize something to "larger"
then even if unaligned, the expander should be able to produce optimal
code again (it might not do, of course).
And agreed.  Furthermore, it's in line with our guiding principles WRT 
separation of the tree/SSA optimizers from target dependencies.


So let's push those decisions into the expanders/backend/target and 
canonicalize to the larger stores.


jeff
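
For readers following the discussion, the equivalence that store merging relies on can be sketched in C. This is only an illustration, not code from the patch: the names store_split and store_merged are hypothetical, and the run-time byte-order probe is an assumption made so the sketch works on either endianness. Two adjacent 32-bit stores write the same bytes as one 64-bit store of a suitably combined constant; how that 64-bit store is then emitted is the expander's decision.

```c
#include <stdint.h>
#include <string.h>

/* Two adjacent 32-bit stores, as in the first sequence quoted above.  */
void
store_split (unsigned char *buf, uint32_t first, uint32_t second)
{
  memcpy (buf, &first, sizeof first);
  memcpy (buf + 4, &second, sizeof second);
}

/* The same eight bytes written by a single 64-bit store.  The two halves
   are combined according to the host byte order.  */
void
store_merged (unsigned char *buf, uint32_t first, uint32_t second)
{
  const uint16_t probe = 1;
  unsigned char probe_byte;
  uint64_t merged;

  /* The first byte of PROBE is 1 only on a little-endian host.  */
  memcpy (&probe_byte, &probe, 1);
  merged = probe_byte
           ? ((uint64_t) second << 32) | first    /* Little-endian.  */
           : ((uint64_t) first << 32) | second;   /* Big-endian.  */
  memcpy (buf, &merged, sizeof merged);
}
```

On a little-endian host the merged constant for the quoted example is 0x09abcdef12345678, matching the movabsq sequence.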




Re: Ping**2! Re: [PATCH, Fortran] Extension: AUTOMATIC/STATIC symbol attributes with -fdec-static

2016-09-07 Thread Andre Vehreschild
Hi Fritz,

please note: I do not have official review privileges. So my vote here
is rather advice to you and the official reviewers. Often such an
unofficial review helps to speed things up, because the official ones
are pointed to the nics and nacs and don't have to bother with the
minor things.

So here it comes:

- Do I understand this correctly: AUTOMATIC and STATIC have to come last,
  i.e., right before the :: when declaring, e.g., a variable?


- Running:

  $ contrib/check_GNU_style.sh dec_static.patch

  Reports some style issues in the C code, that should be fixed before
  commit. (Style in Fortran testcases does not matter that much.)


- I have deleted huge parts of the diff and just kept the parts I had a
  question/remark for:

> diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
> index b34ae86..a0cf78b 100644
> --- a/gcc/fortran/gfortran.texi
> +++ b/gcc/fortran/gfortran.texi
> @@ -2120,7 +2121,6 @@ consider @code{BACKSPACE} or @code{REWIND} to properly 
> position
>  the file before the EOF marker.  As an extension, the run-time error may
>  be disabled using -std=legacy.
>  
> -

Please change formatting in a separate patch or not at all (here!).
This policy is to distinguish cosmetic changes from relevant ones.

>  @node STRUCTURE and RECORD
>  @subsection @code{STRUCTURE} and @code{RECORD}
>  @cindex @code{STRUCTURE}
> @@ -2420,6 +2420,53 @@ here:
>@tab @code{--} @tab @code{FLOATI} @tab @code{FLOATJ} @tab @code{FLOATK}
>  @end multitable
>  
> +@node AUTOMATIC and STATIC attributes
> +@subsection @code{AUTOMATIC} and @code{STATIC} attributes
> +@cindex variable attributes
> +@cindex @code{AUTOMATIC}
> +@cindex @code{STATIC}
> +
> +With @option{-fdec-static} GNU Fortran supports the explicit specification of
> +two addition variable attributes: @code{STATIC} and @code{AUTOMATIC}. These

two additional variable ...
^^ 

But is it only for variables? Can't it be used for equivalences or
other constructs, too?

> diff --git a/gcc/fortran/invoke.texi b/gcc/fortran/invoke.texi
> index 15c131a..a5da59e 100644
> --- a/gcc/fortran/invoke.texi
> +++ b/gcc/fortran/invoke.texi
> @@ -255,6 +255,11 @@ instead where possible.
>  Enable B/I/J/K kind variants of existing integer functions (e.g. BIAND, 
> IIAND,
>  JIAND, etc...). For a complete list of intrinsics see the full documentation.
>  
> +@item -fdec-static
> +@opindex @code{fdec-static}
> +Enable STATIC and AUTOMATIC as attributes specifying storage location.
> +STATIC is equivalent to SAVE, and locals are typically AUTOMATIC by default.

Well, this description to me sounds like: "Those attributes are
useless, because they can be substituted." This is clearly not what you
intend. I propose to include into the description that with "this
switch the dec-extension" is available "to explicitly specify the
storage of entities". Then the last sentence is still a good hint for
all fortraneers that don't know the extension.

> +
>  @item -fdollar-ok
>  @opindex @code{fdollar-ok}
>  @cindex @code{$}
> diff --git a/gcc/fortran/lang.opt b/gcc/fortran/lang.opt
> index 8ec5400..260512d 100644
> --- a/gcc/fortran/lang.opt
> +++ b/gcc/fortran/lang.opt
> @@ -432,6 +432,10 @@ fdec-structure
>  Fortran
>  Enable support for DEC STRUCTURE/RECORD.
>  
> +fdec-static
> +Fortran Var(flag_dec_static)
> +Enable STATIC and AUTOMATIC attributes.

How about: Enable the dec-extension of STATIC and AUTOMATIC attributes.
Just a proposal.

> +
>  fdefault-double-8
>  Fortran Var(flag_default_double)
>  Set the default double precision kind to an 8 byte wide type.


> diff --git a/gcc/testsuite/gfortran.dg/dec_static_1.f90 
> b/gcc/testsuite/gfortran.dg/dec_static_1.f90
> new file mode 100644
> index 0000000..4dcfc7c
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/dec_static_1.f90

-  Please add some testcases where the new error messages are tested.

So much from my side. Btw, I haven't applied the patch and tested
whether it runs or collides with other proposed patches. That is
usually done by Dominique and I did not want to waste time doing it a
second time.

Regards,
Andre
-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


Re: [PATCH][v3] GIMPLE store merging pass

2016-09-07 Thread Bernd Schmidt



On 09/07/2016 10:19 AM, Richard Biener wrote:

On Tue, 6 Sep 2016, Jakub Jelinek wrote:



If you want a 64-bit store, you'd need to merge the two, and that would be
even more expensive.  It is a matter of say:
movl $0x12345678, (%rsp)
movl $0x09abcdef, 4(%rsp)
vs.
movabsq $0x09abcdef12345678, %rax
movq %rax, (%rsp)
vs.
movl $0x09abcdef, %eax
salq $32, %rax
orq $0x12345678, %rax
movq %rax, (%rsp)


vs.

movq $LC0, (%rsp)

?


Not the same. That moves the address of $LC0.


Bernd


[Bug fortran/57117] [OOP] ICE for sourced allocation of a polymorphic entity using TRANSPOSE

2016-09-07 Thread vehre at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57117

--- Comment #13 from vehre at gcc dot gnu.org ---
Created attachment 39581
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39581&action=edit
Shorter version to fix the issue.

Hi all,

Dominique pointed out that the patches proposed by Paul conflict with my accaf
patch. I took a look and found a less intrusive way to fix this issue.

Note!!! This patch is based on my patch for pr72832 available from:

https://gcc.gnu.org/ml/fortran/2016-09/msg7.html

Bootstraps and regtests ok on x86_64-linux/F23. I haven't tested this patch in
combination with my accaf patch yet (time constraints).

- Andre

Advice sought for debugging a lto1 ICE (was: Implement C _FloatN, _FloatNx types [version 6])

2016-09-07 Thread Thomas Schwinge
Hi!

I trimmed the CC list -- I'm looking for advice about debugging a lto1
ICE.

On Fri, 19 Aug 2016 11:05:59 +, Joseph Myers  
wrote:
> On Fri, 19 Aug 2016, Richard Biener wrote:
> > Can you quickly verify if LTO works with the new types?  I don't see 
> > anything
> > that would prevent it but having new global trees and backends initializing 
> > them
> > might come up with surprises (see tree-streamer.c:preload_common_nodes)
> 
> Well, the execution tests are in gcc.dg/torture, which is run with various 
> options including -flto (and I've checked the testsuite logs to confirm 
> these tests are indeed run with such options).  Is there something else 
> you think should be tested?

As I noted in :

As of the PR32187 commit r239625 "Implement C _FloatN, _FloatNx types", 
nvptx
offloading is broken, ICEs in LTO stream-in.  Probably some kind of 
data-type
mismatch that is not visible with Intel MIC offloading (using the same data
types) but explodes with nvptx.  I'm having a look.

I know how to use "-save-temps -v" to re-run the ICEing lto1 in GDB; a
backtrace of the ICE looks as follows:

#0  fancy_abort (file=file@entry=0x10d61d0 "[...]/source-gcc/gcc/vec.h", 
line=line@entry=727, function=function@entry=0x10d6e3a 
<_ZZN3vecIP9tree_node7va_heap8vl_embedEixEjE12__FUNCTION__> "operator[]") at 
[...]/source-gcc/gcc/diagnostic.c:1414
#1  0x0058c9ef in vec::operator[] 
(this=0x16919c0, ix=ix@entry=185) at [...]/source-gcc/gcc/vec.h:727
#2  0x0058ca33 in vec::operator[] 
(this=this@entry=0x1691998, ix=ix@entry=185) at [...]/source-gcc/gcc/vec.h:1211
#3  0x00c73e54 in streamer_tree_cache_get_tree (cache=0x1691990, 
ix=ix@entry=185) at [...]/source-gcc/gcc/tree-streamer.h:98
#4  0x00c73eb9 in streamer_get_pickled_tree 
(ib=ib@entry=0x7fffceb0, data_in=data_in@entry=0x1691930) at 
[...]/source-gcc/gcc/tree-streamer-in.c:1112
#5  0x008f841b in lto_input_tree_1 (ib=ib@entry=0x7fffceb0, 
data_in=data_in@entry=0x1691930, tag=tag@entry=LTO_tree_pickle_reference, 
hash=hash@entry=0) at [...]/source-gcc/gcc/lto-streamer-in.c:1404
#6  0x008f8844 in lto_input_tree (ib=0x7fffceb0, 
data_in=0x1691930) at [...]/source-gcc/gcc/lto-streamer-in.c:1444
#7  0x00c720d2 in lto_input_ts_list_tree_pointers 
(ib=ib@entry=0x7fffceb0, data_in=data_in@entry=0x1691930, 
expr=expr@entry=0x76993780) at [...]/source-gcc/gcc/tree-streamer-in.c:861
#8  0x00c7444e in streamer_read_tree_body 
(ib=ib@entry=0x7fffceb0, data_in=data_in@entry=0x1691930, 
expr=expr@entry=0x76993780) at [...]/source-gcc/gcc/tree-streamer-in.c:1077
#9  0x008f6428 in lto_read_tree_1 (ib=ib@entry=0x7fffceb0, 
data_in=data_in@entry=0x1691930, expr=expr@entry=0x76993780) at 
[...]/source-gcc/gcc/lto-streamer-in.c:1285
#10 0x008f651b in lto_read_tree (ib=ib@entry=0x7fffceb0, 
data_in=data_in@entry=0x1691930, tag=tag@entry=4, hash=hash@entry=4086308758) 
at [...]/source-gcc/gcc/lto-streamer-in.c:1315
#11 0x008f85db in lto_input_tree_1 (ib=ib@entry=0x7fffceb0, 
data_in=data_in@entry=0x1691930, tag=tag@entry=4, hash=hash@entry=4086308758) 
at [...]/source-gcc/gcc/lto-streamer-in.c:1427
#12 0x008f8673 in lto_input_scc (ib=ib@entry=0x7fffceb0, 
data_in=data_in@entry=0x1691930, len=len@entry=0x7fffceac, 
entry_len=entry_len@entry=0x7fffcea8) at 
[...]/source-gcc/gcc/lto-streamer-in.c:1339
#13 0x005890f7 in lto_read_decls 
(decl_data=decl_data@entry=0x77fc, data=data@entry=0x169d570, 
resolutions=...) at [...]/source-gcc/gcc/lto/lto.c:1693
#14 0x005898c8 in lto_file_finalize 
(file_data=file_data@entry=0x77fc, file=file@entry=0x15eedb0) at 
[...]/source-gcc/gcc/lto/lto.c:2037
#15 0x00589928 in lto_create_files_from_ids 
(file=file@entry=0x15eedb0, file_data=file_data@entry=0x77fc, 
count=count@entry=0x7fffd054) at [...]/source-gcc/gcc/lto/lto.c:2047
#16 0x00589a7a in lto_file_read (file=0x15eedb0, 
resolution_file=resolution_file@entry=0x0, count=count@entry=0x7fffd054) at 
[...]/source-gcc/gcc/lto/lto.c:2088
#17 0x00589e84 in read_cgraph_and_symbols (nfiles=1, 
fnames=0x160e990) at [...]/source-gcc/gcc/lto/lto.c:2798
#18 0x0058a572 in lto_main () at [...]/source-gcc/gcc/lto/lto.c:3299
#19 0x00a48eff in compile_file () at 
[...]/source-gcc/gcc/toplev.c:466
#20 0x00550943 in do_compile () at 
[...]/source-gcc/gcc/toplev.c:2010
#21 toplev::main (this=this@entry=0x7fffd180, argc=argc@entry=20, 
argv=0x15daf20, argv@entry=0x7fffd288) at [...]/source-gcc/gcc/toplev.c:2144
#22 0x00552717 in main (argc=20, argv=0x7fffd288) at 
[...]/source-gcc/gcc/main.c:39

(Comparing to yesterday's r240004, the 

Re: [PATCH GCC 3/9]Support rewriting non-lcssa phis for vars live outside of vect-loop

2016-09-07 Thread Jeff Law

On 09/06/2016 12:51 PM, Bin Cheng wrote:

Hi,
The current implementation requires that variables live outside of the
vect-loop satisfy LCSSA form; this patch relaxes the restriction.  It keeps
the old behavior for an LCSSA PHI node by replacing the use of the live var
with the result of that PHI; for other uses of the live var, it simply
replaces all uses outside the loop with the newly computed var.

Thanks,
bin

2016-09-01  Bin Cheng  

* tree-vect-loop.c (vectorizable_live_operation): Support handling
for live variable outside loop but not in lcssa form.


OK.
jeff


Re: [PATCH GCC 7/9]Skip loops iterating only 1 time in predictive commoning

2016-09-07 Thread Jeff Law

On 09/06/2016 12:53 PM, Bin Cheng wrote:

Hi,
For loops which are bounded to iterate only 1 time (thus the loop's latch
doesn't roll), there is nothing for predictive commoning to do; this patch
detects/skips these cases.  A test is also added in 
gcc/testsuite/gfortran.dg/vect/fast-math-mgrid-resid.f for this.

Thanks,
bin

2016-09-01  Bin Cheng  

* tree-predcom.c (tree_predictive_commoning_loop): Skip loop that only
iterates 1 time.

gcc/testsuite/ChangeLog
2016-09-01  Bin Cheng  

* gfortran.dg/vect/fast-math-mgrid-resid.f: New test string.


OK.
jeff
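
As background on why a loop that iterates only once has nothing to common, here is a small hand-written C sketch. It is not from the patch, and the function names are hypothetical. Predictive commoning reuses a value loaded in one iteration as an operand of the next iteration; the second function does that by hand.

```c
#include <stddef.h>

/* Straightforward form: each iteration loads both a[i] and a[i + 1].  */
void
add_neighbours (const int *a, int *out, size_t n)
{
  for (size_t i = 0; i + 1 < n; i++)
    out[i] = a[i] + a[i + 1];
}

/* Hand-commoned form: the value loaded as a[i + 1] is carried in PREV
   and reused as a[i] in the following iteration, saving one load per
   iteration.  */
void
add_neighbours_commoned (const int *a, int *out, size_t n)
{
  if (n < 2)
    return;

  int prev = a[0];
  for (size_t i = 0; i + 1 < n; i++)
    {
      int next = a[i + 1];
      out[i] = prev + next;
      prev = next;
    }
}
```

When n is 2 the loop body runs once, the carried value is never reused, and the transformation only adds overhead.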


[Bug c++/77513] -Wzero-as-null-pointer-constant vs 0, nullptr, NULL and __null

2016-09-07 Thread petschy at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77513

--- Comment #2 from petschy at gmail dot com ---
I don't want to enable them. The problem is not with too little but too many
warnings.

A snippet from one of the problematic files:

{ NULL, NULL, false, false }

is preprocessed to 

 { 
# 62 "AdsPlugin.cpp" 3 4
  __null
# 62 "AdsPlugin.cpp"
  , 
# 62 "AdsPlugin.cpp" 3 4
__null
# 62 "AdsPlugin.cpp"
, false, false }
};

Here I see the same flags, yet for these two NULLs gcc warns.

Re: [PATCH][AArch64] Improve legitimize_address

2016-09-07 Thread Richard Earnshaw (lists)
On 06/09/16 14:14, Wilco Dijkstra wrote:
> Improve aarch64_legitimize_address - avoid splitting the offset if it is
> supported.  When we do split, take the mode size into account.  BLKmode
> falls into the unaligned case but should be treated like LDP/STP.
> This improves codesize slightly due to fewer base address calculations:
> 
> int f(int *p) { return p[5000] + p[7000]; }
> 
> Now generates:
> 
> f:
>   add x0, x0, 16384
>   ldr w1, [x0, 3616]
>   ldr w0, [x0, 11616]
>   add w0, w1, w0
>   ret
> 
> instead of:
> 
> f:
>   add x1, x0, 16384
>   add x0, x0, 24576
>   ldr w1, [x1, 3616]
>   ldr w0, [x0, 3424]
>   add w0, w1, w0
>   ret
> 
> OK for trunk?
> 
> ChangeLog:
> 2016-09-06  Wilco Dijkstra  
> 
> gcc/
>   * config/aarch64/aarch64.c (aarch64_legitimize_address):
>   Avoid use of base_offset if offset already in range.

OK.

R.

> --
> 
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 
> 27bbdbad8cddc576f9ed4fd0670116bd6d318412..119ff0aecb0c9f88899fa141b2c7f9158281f9c3
>  100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -5058,9 +5058,19 @@ aarch64_legitimize_address (rtx x, rtx /* orig_x  */, 
> machine_mode mode)
>/* For offsets aren't a multiple of the access size, the limit is
>-256...255.  */
>else if (offset & (GET_MODE_SIZE (mode) - 1))
> - base_offset = (offset + 0x100) & ~0x1ff;
> + {
> +   base_offset = (offset + 0x100) & ~0x1ff;
> +
> +   /* BLKmode typically uses LDP of X-registers.  */
> +   if (mode == BLKmode)
> + base_offset = (offset + 512) & ~0x3ff;
> + }
> +  /* Small negative offsets are supported.  */
> +  else if (IN_RANGE (offset, -256, 0))
> + base_offset = 0;
> +  /* Use 12-bit offset by access size.  */
>else
> - base_offset = offset & ~0xfff;
> + base_offset = offset & (~0xfff * GET_MODE_SIZE (mode));
>  
>if (base_offset != 0)
>   {
> 
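
The arithmetic behind the example in the quoted message can be checked with a small C sketch. It models only the final "else" branch of the diff above (aligned accesses); the function names are made up for illustration, and access_size stands in for GET_MODE_SIZE (mode) and is assumed to be a power of two.

```c
#include <stdint.h>

/* Old split: always clear the low 12 bits of the offset.  */
int64_t
base_offset_old (int64_t offset)
{
  return offset & ~(int64_t) 0xfff;
}

/* New split: scale the 12-bit range by the access size, so a larger
   remainder can stay in the load/store immediate.  */
int64_t
base_offset_new (int64_t offset, int64_t access_size)
{
  return offset & (~(int64_t) 0xfff * access_size);
}
```

For p[5000] and p[7000] with 4-byte ints the byte offsets are 20000 and 28000. The old split gives bases 16384 and 24576 (remainders 3616 and 3424), while the new split gives base 16384 for both (remainders 3616 and 11616), which is why a single add suffices in the new code.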



[Bug middle-end/77377] [7 Regression] c-c++-common/pr59037.c ICEs with -fpic -msse on i686

2016-09-07 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77377

Martin Liška  changed:

   What|Removed |Added

 CC||zsojka at seznam dot cz

--- Comment #6 from Martin Liška  ---
*** Bug 77401 has been marked as a duplicate of this bug. ***


[Bug rtl-optimization/77401] [7 Regression] ICE: in simplify_binary_operation_1, at simplify-rtx.c:3731 with out-of-bounds vector access

2016-09-07 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77401

Martin Liška  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 CC||marxin at gcc dot gnu.org
 Resolution|--- |DUPLICATE

--- Comment #2 from Martin Liška  ---
It's a dup, fixed in r239854.

*** This bug has been marked as a duplicate of bug 77377 ***

[Bug tree-optimization/77498] [7 regression] Performance drop after r239414 on spec2000/172mgrid

2016-09-07 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77498

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2016-09-07
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
   Target Milestone|--- |7.0
 Ever confirmed|0   |1

--- Comment #3 from Richard Biener  ---
Note this revision isn't really related to code hoisting.  It merely allows PRE
to perform simple predictive commoning and more PRE in general.  The commoning
can interfere with sinking (see the adjusted testcase).

For the testcase we apply commoning which increases register pressure.

The pcom pass does a better job (well, it was designed for this).

I suppose this PRE improvement raises the general question (again) whether
we want it to introduce loop-carried dependences at all.  In this case
it trades 18 loads for 18 loop-carried dependences - optimally reg coalesced
and thus "free", maybe reg-reg copies, or at worst spills (as seen here).

I'll need to think about this (again).

[Bug driver/77497] [5/6/7 Regression] Setting DWARF level and debug level together has flag-ordering-dependent results

2016-09-07 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77497

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |5.5
Summary|Setting DWARF level and |[5/6/7 Regression] Setting
   |debug level together has|DWARF level and debug level
   |flag-ordering-dependent |together has
   |results |flag-ordering-dependent
   ||results

[Bug target/77483] [6/7 regression] gcc.target/i386/mask-unpack.c etc. FAIL

2016-09-07 Thread ro at CeBiTec dot Uni-Bielefeld.DE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77483

--- Comment #4 from ro at CeBiTec dot Uni-Bielefeld.DE  ---
> --- Comment #3 from Richard Biener  ---
> You could add dg-skip-if or XFAILs to the tests now failing.  IMHO a testsuite
> issue.

I thought about just adding -mno-stackrealign to the affected testcases
instead.  This would work everywhere and avoid duplicating the effects
of the i386/cygming.h and i386/sol2.h definitions of
STACK_REALIGN_DEFAULT:

cygming.h:#define STACK_REALIGN_DEFAULT TARGET_SSE
sol2.h:#define STACK_REALIGN_DEFAULT (TARGET_64BIT ? 0 : 1)

Rainer

[Bug middle-end/66661] incorrect memory access in optimization with flexible array member

2016-09-07 Thread fw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66661

Florian Weimer  changed:

   What|Removed |Added

 Status|RESOLVED|UNCONFIRMED
 Resolution|INVALID |---

--- Comment #12 from Florian Weimer  ---
Here's a new test case.  It prints for me (without Address Sanitizer):

  count 2, align 4, minimum size 10, struct size 12, actual size 12
  count 5, align 4, minimum size 13, struct size 16, actual size 20
  count 0, align 8, minimum size 8, struct size 8, actual size 8
  count 3, align 4, minimum size 11, struct size 12, actual size 24
* count 0, align 16, minimum size 8, struct size 16, actual size 8
  count 8, align 4, minimum size 16, struct size 16, actual size 16
  count 4, align 4, minimum size 12, struct size 12, actual size 12
  count 2, align 4, minimum size 10, struct size 12, actual size 12
* count 2, align 8, minimum size 10, struct size 16, actual size 12
  count 6, align 4, minimum size 14, struct size 16, actual size 16
  count 7, align 4, minimum size 15, struct size 16, actual size 16
  count 3, align 4, minimum size 11, struct size 12, actual size 12
  count 3, align 4, minimum size 11, struct size 12, actual size 12
  count 6, align 4, minimum size 14, struct size 16, actual size 44
* count 5, align 32, minimum size 13, struct size 32, actual size 16
  count 8, align 4, minimum size 16, struct size 16, actual size 16
  count 9, align 4, minimum size 17, struct size 20, actual size 20
  count 7, align 4, minimum size 15, struct size 16, actual size 16
  count 1, align 4, minimum size 9, struct size 12, actual size 12
* count 1, align 8, minimum size 9, struct size 16, actual size 12

I believe this shows that GCC has some bug in this area.  Whether it's the
over-reads (but over-reads can be fine here because they cannot trap, and GCC
knows that they won't introduce observable data races), the object allocation
in the .data section (again, could be harmless), or the Address Sanitizer
report, I'm not sure.

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

struct flexible
{
  int count;
  int align;
  char bytes[];
};

#define ARGS_0
#define ARGS_1 1
#define ARGS_2 1, 2
#define ARGS_3 1, 2, 3
#define ARGS_4 1, 2, 3, 4
#define ARGS_5 1, 2, 3, 4, 5
#define ARGS_6 1, 2, 3, 4, 5, 6
#define ARGS_7 1, 2, 3, 4, 5, 6, 7
#define ARGS_8 1, 2, 3, 4, 5, 6, 7, 8
#define ARGS_9 1, 2, 3, 4, 5, 6, 7, 8, 9

#define DECL(name, count, align) \
  _Alignas (align) struct flexible name = {count, align,{ ARGS_##count }}

DECL (v4, 4, 4);
DECL (v1, 1, 8);
DECL (v7, 1, 4);
DECL (v17, 7, 4);
DECL (v15, 9, 4);
DECL (v16, 8, 4);
DECL (v11, 5, 32);
DECL (v12, 6, 4);
DECL (v5, 3, 4);
DECL (v9, 3, 4);
DECL (v13, 7, 4);
DECL (v18, 6, 4);
DECL (v2, 2, 8);
DECL (v8, 2, 4);
DECL (v10, 4, 4);
DECL (v14, 8, 4);
DECL (v0, 0, 16);
DECL (v3, 3, 4);
DECL (v5a, 0, 8);
DECL (v19, 5, 4);
DECL (v6, 2, 4);

enum { count = 21 };

static int
cmp(const void *a, const void *b)
{
  struct flexible *const *a1 = a;
  struct flexible *const *b1 = b;
  if (*a1 < *b1)
return -1;
  if (*a1 > *b1)
return 1;
  return 0;
}

int
main()
{
  struct flexible *p[count]
    = {&v0, &v1, &v2, &v3, &v4, &v5, &v5a, &v6, &v7, &v8, &v9, &v10,
       &v11, &v12, &v13, &v14, &v15, &v16, &v17, &v18, &v19};
  qsort (p, count, sizeof (p[0]), cmp);
  for (int i = 0; i < count - 1; ++i)
{
  size_t min_size = offsetof (struct flexible, bytes) + p[i]->count;
  size_t align_mask = p[i]->align - 1;

  /* Struct size is the size that a struct with the requested
 length of the flexible array member would have.  */
  size_t struct_size = (min_size + align_mask) & ~align_mask;

  /* The actual size is the offset between this and the next
 object in the data section.  (This can be an over-estimate if
 other objects not listed above are placed between the listed
 objects.)  */
  size_t actual_size = (p[i + 1] - p[i]) * sizeof (*p[i]);

  /* The lines marked with * have an object whose struct size
 exceeds the object size.  If GCC assumes that objects always
 have their struct size allocated, this leads to an
 out-of-bounds access.  */
  printf ("%c count %d, align %d, minimum size %zu, struct size %zu, "
  "actual size %zu\n",
  struct_size > actual_size ? '*' : ' ',
  p[i]->count, p[i]->align, min_size, struct_size, actual_size);
}
}
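
The two size computations used in the test case can be isolated in a short C sketch. The struct and function names below are hypothetical stand-ins for the ones in the test case, and the concrete numbers in the note that follows assume a 4-byte int.

```c
#include <stddef.h>

struct flexible_sketch
{
  int count;
  int align;
  char bytes[];
};

/* Bytes actually needed for COUNT elements of the flexible array.  */
size_t
min_size_for (size_t count)
{
  return offsetof (struct flexible_sketch, bytes) + count;
}

/* Size after rounding up to ALIGN, which must be a power of two.  */
size_t
struct_size_for (size_t count, size_t align)
{
  size_t align_mask = align - 1;
  return (min_size_for (count) + align_mask) & ~align_mask;
}
```

With a 4-byte int, count 5 and align 32 give a minimum size of 13 and a struct size of 32, matching one of the lines marked with * in the output above.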

[Bug bootstrap/77512] New: gcc compilation stops with Arithmetic Exception

2016-09-07 Thread michael at mijobe dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77512

Bug ID: 77512
   Summary: gcc compilation stops with Arithmetic Exception
   Product: gcc
   Version: 6.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: bootstrap
  Assignee: unassigned at gcc dot gnu.org
  Reporter: michael at mijobe dot org
  Target Milestone: ---

Created attachment 39579
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39579&action=edit
temporary compiler files

I am trying to install gcc-6.2.0 on SPARC Solaris 10.
In stage 2 I get the following error:


$HOME/gcc-6.2.0-build/./prev-gcc/xg++ -B$HOME/gcc-6.2.0-build/./prev-gcc/
-B$HOME/gcc-6.2.0-bin/sparc-sun-solaris2.10/bin/ -nostdinc++
-B$HOME/gcc-6.2.0-build/prev-sparc-sun-solaris2.10/libstdc++-v3/src/.libs
-B$HOME/gcc-6.2.0-build/prev-sparc-sun-solaris2.10/libstdc++-v3/libsupc++/.libs

-I$HOME/gcc-6.2.0-build/prev-sparc-sun-solaris2.10/libstdc++-v3/include/sparc-sun-solaris2.10
 -I$HOME/gcc-6.2.0-build/prev-sparc-sun-solaris2.10/libstdc++-v3/include 
-I$HOME/gcc-6.2.0/libstdc++-v3/libsupc++
-L$HOME/gcc-6.2.0-build/prev-sparc-sun-solaris2.10/libstdc++-v3/src/.libs
-L$HOME/gcc-6.2.0-build/prev-sparc-sun-solaris2.10/libstdc++-v3/libsupc++/.libs
-fno-PIE -c   -g -O2 -DIN_GCC -fno-exceptions -fno-rtti
-fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
-Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic
-Wno-long-long -Wno-variadic-macros -Wno-overlength-strings   -DHAVE_CONFIG_H
-I. -I. -I../../gcc-6.2.0/gcc -I../../gcc-6.2.0/gcc/.
-I../../gcc-6.2.0/gcc/../include -I./../intl
-I../../gcc-6.2.0/gcc/../libcpp/include -I$HOME/gcc-6.2.0-build/./gmp
-I$HOME/gcc-6.2.0/gmp -I$HOME/gcc-6.2.0-build/./mpfr -I$HOME/gcc-6.2.0/mpfr
-I$HOME/gcc-6.2.0/mpc/src  -I../../gcc-6.2.0/gcc/../libdecnumber
-I../../gcc-6.2.0/gcc/../libdecnumber/dpd -I../libdecnumber
-I../../gcc-6.2.0/gcc/../libbacktrace -I$HOME/gcc-6.2.0-build/./isl/include
-I$HOME/gcc-6.2.0/isl/include  -o haifa-sched.o -MT haifa-sched.o -MMD -MP -MF
./.deps/haifa-sched.TPo ../../gcc-6.2.0/gcc/haifa-sched.c
In file included from ../../gcc-6.2.0/gcc/haifa-sched.c:141:0:
../../gcc-6.2.0/gcc/haifa-sched.c: In function 'int dep_cost_1(dep_t, dw_t)':
../../gcc-6.2.0/gcc/sched-int.h:252:31: internal compiler error: Arithmetische
Ausnahme
 #define DEP_COST(D) ((D)->cost)
   ^
../../gcc-6.2.0/gcc/haifa-sched.c:1438:12: note: in expansion of macro
'DEP_COST'
 return DEP_COST (link);
^~~~
The first stage was compiled using gcc 4.0.3.
Prerequisites were downloaded using contrib/download_prerequisites.
GNU binutils-2.27 is used.
Configuration was done with:
../gcc-6.2.0/configure --prefix=$HOME/gcc-6.2.0-bin
--target=sparc-sun-solaris2.10 --with-gnu-as --with-gnu-ld --disable-libgcj
--enable-languages=c,c++

Re: [PATCH GCC 9/9]Prove no-overflow in computation of LOOP_VINFO_NITERS and improve code generation

2016-09-07 Thread Bin.Cheng
On Wed, Sep 7, 2016 at 1:10 AM, kugan  wrote:
> Hi Bin,
>
>
> On 07/09/16 04:54, Bin Cheng wrote:
>>
>> Hi,
>> LOOP_VINFO_NITERS is computed as LOOP_VINFO_NITERSM1 + 1, which could
>> overflow in loop niters' type.  Vectorizer needs to generate more code
>> computing vectorized niters if overflow does happen.  However, for common
>> loops, there is actually no overflow; this patch tries to prove the
>> no-overflow information and use that to improve code generation.  At the
>> moment, no-overflow information comes either from loop niter analysis, or
>> the truth that we know loop is peeled for non-zero iterations in prologue
>> peeling.  For the latter case, it doesn't matter if the original
>> LOOP_VINFO_NITERS overflows or not, because computation LOOP_VINFO_NITERS -
>> LOOP_VINFO_PEELING_FOR_ALIGNMENT cancels the overflow by underflow.
>>
>> Thanks,
>> bin
>>
>> 2016-09-01  Bin Cheng  
>>
>> * tree-vect-loop.c (loop_niters_no_overflow): New func.
>> (vect_transform_loop): Call loop_niters_no_overflow.  Pass the
>> no-overflow information to vect_do_peeling_for_loop_bound and
>> vect_gen_vector_loop_niters.
>>
>>
>> 009-prove-no_overflow-for-vect-niters-20160902.txt
>>
>>
>> diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
>> index 0d37f55..2ef1f9b 100644
>> --- a/gcc/tree-vect-loop.c
>> +++ b/gcc/tree-vect-loop.c
>> @@ -6610,6 +6610,38 @@ vect_loop_kill_debug_uses (struct loop *loop,
>> gimple *stmt)
>>  }
>>  }
>>
>> +/* Given loop represented by LOOP_VINFO, return true if computation of
>> +   LOOP_VINFO_NITERS (= LOOP_VINFO_NITERSM1 + 1) doesn't overflow, false
>> +   otherwise.  */
>> +
>> +static bool
>> +loop_niters_no_overflow (loop_vec_info loop_vinfo)
>> +{
>> +  /* Constant case.  */
>> +  if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
>> +{
>> +  int cst_niters = LOOP_VINFO_INT_NITERS (loop_vinfo);
>
>
> Wouldn't it truncate by assigning this to int?
Probably.  On reflection, I think it's unnecessary to use the int version
of niters here; LOOP_VINFO_NITERS can be used directly.
>
>
>> +  tree cst_nitersm1 = LOOP_VINFO_NITERSM1 (loop_vinfo);
>> +
>> +  gcc_assert (TREE_CODE (cst_nitersm1) == INTEGER_CST);
>> +  if (wi::to_widest (cst_nitersm1) < cst_niters)
>
>
> Shouldn't you do the addition and comparison in the type of the loop
> index instead of widest_int to see if that overflows?
You mean the type of loop niters?  NITERS is computed as NITERSM1 + 1,
so I don't think we need to do the addition again here.
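
For readers less familiar with the wrap-around case being discussed, here is a minimal sketch in plain C, not GCC's wide-int code. The function names are made up for illustration. With an unsigned niters type, NITERSM1 + 1 wraps to 0 exactly when NITERSM1 is the maximum value, so comparing the two values (as the quoted `nitersm1 < niters` test does) is enough to detect it.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Return true if NITERSM1 + 1 does not wrap in unsigned int.
   Unsigned arithmetic is modulo 2^N, so the sum wraps to 0 exactly
   when NITERSM1 is UINT_MAX.  */
static bool
niters_no_overflow (unsigned int nitersm1)
{
  unsigned int niters = nitersm1 + 1u;
  return nitersm1 < niters;
}

/* Boundary cases: the smallest value is fine, the largest wraps.  */
static void
check_niters_boundaries (void)
{
  assert (niters_no_overflow (0));
  assert (!niters_no_overflow (UINT_MAX));
}
```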

Thanks,
bin


[Bug other/29842] [meta-bug] outstanding patches / issues from STMicroelectronics

2016-09-07 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=29842
Bug 29842 depends on bug 24194, which changed state.

Bug 24194 Summary: emit_input_reload_insns secondary reload handling is unsafe
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=24194

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |FIXED

[Bug c++/77304] ICE on C++ code with invalid template parameter: in gimplify_expr, at gimplify.c:11260

2016-09-07 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77304

Martin Liška  changed:

   What|Removed |Added

   Last reconfirmed||2016-9-7
 CC||marxin at gcc dot gnu.org
   Target Milestone|--- |5.5

--- Comment #1 from Martin Liška  ---
Confirmed.

[Bug rtl-optimization/24194] emit_input_reload_insns secondary reload handling is unsafe

2016-09-07 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=24194

Andrew Pinski  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |FIXED

[Bug target/30199] config/freebsd.h does not define HANDLE_PRAGMA_PACK_PUSH_POP

2016-09-07 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30199

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |4.3.0

--- Comment #2 from Andrew Pinski  ---
Fixed a long time ago.

[Bug tree-optimization/77450] [5/6 Regression] ICE: in verify_ssa, at tree-ssa.c:1016 on very simple code with vectors

2016-09-07 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77450

--- Comment #6 from Richard Biener  ---
Author: rguenth
Date: Wed Sep  7 08:22:01 2016
New Revision: 240025

URL: https://gcc.gnu.org/viewcvs?rev=240025&root=gcc&view=rev
Log:
2016-09-07  Richard Biener  

PR c/77450
* c-c++-common/vector-subscript-8.c: Move ..
* gcc.dg/pr77450.c: ... here.

Added:
trunk/gcc/testsuite/gcc.dg/pr77450.c
  - copied, changed from r240024,
trunk/gcc/testsuite/c-c++-common/vector-subscript-8.c
Removed:
trunk/gcc/testsuite/c-c++-common/vector-subscript-8.c
Modified:
trunk/gcc/testsuite/ChangeLog

[Bug other/31566] @missing_file gives bad error message

2016-09-07 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31566

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||easyhack
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-09-07
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
This should be easy enough for someone new to GCC development to fix.

[Bug bootstrap/77510] New: genautomata memory footprint for MIPS

2016-09-07 Thread lly.dev at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77510

Bug ID: 77510
   Summary: genautomata memory footprint for MIPS
   Product: gcc
   Version: 5.4.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: bootstrap
  Assignee: unassigned at gcc dot gnu.org
  Reporter: lly.dev at gmail dot com
  Target Milestone: ---

Created attachment 39578
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39578&action=edit
genautomata output MIPS32

Hello,

When bootstrapping/cross-compiling gcc 5.4.0 for MIPS, genautomata uses a lot
of memory. This causes swapping/thrashing and extremely long build times on
low-memory systems. According to a valgrind massif report, it currently
requires 527MB of heap.

The genautomata output (-time -progress -stats) is attached to this PR. We
probably need to reduce the reservations for `xlp_cpu' and `xlp_fpu', since
simply disabling xlp support (commenting it out in mips.md) brings heap usage
down to 43MB.

This is similar to bug 70473, which requests that a new bug be filed.
Thank you for your attention!

[Bug target/77493] [6/7 Regression] -fcrossjumping (-O2) on ppc64le causes segfaults (jump to 0x0) (first bad r230091)

2016-09-07 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77493

--- Comment #13 from Jonathan Wakely  ---
(In reply to Richard Biener from comment #11)
> CCing libstdc++ people -- not sure if std::stable_sort (on which kind of
> collection?) is safe for std::shared_ptr.

It's required to work correctly. It should be equivalent to sorting
TrackingRecHit* pointers.

Re: Make max_align_t respect _Float128 [version 2]

2016-09-07 Thread Florian Weimer

On 09/06/2016 11:31 PM, Paul Eggert wrote:
> On 09/06/2016 01:40 PM, Joseph Myers wrote:
>> Sounds like a defect in C11 to me - none of the examples of flexible
>> array members anticipate needing to add to the size to allow for tail
>> padding with unknown alignment requirements.
>
> Yes, I would prefer calling it a defect, as most code I've seen dealing
> with flexible array members does not align the tail size. However, GCC +
> valgrind does take advantage of this "defect" and I would not be
> surprised if other picky implementations do too.


It might be an inherent limitation of the valgrind approach.
Speculative loads which cannot result in data races (in the C11 sense)
due to the way the architecture behaves should be fine.  The alignment
ensures that the load is on the same page, which is what typically
prevents this optimization.
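
The page argument can be checked with a few lines of C. This is only a sketch under stated assumptions: the 4096-byte page size and the helper names are chosen for illustration. The point is that a load of N bytes at an N-aligned address, with N a power of two that divides the page size, cannot straddle a page boundary, while an unaligned load of the same width can.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed page size for this illustration; real systems report
   theirs at run time.  */
enum { PAGE_SIZE_ASSUMED = 4096 };

/* True if the SIZE-byte access starting at ADDR touches one page only.  */
static bool
same_page (uintptr_t addr, uintptr_t size)
{
  assert (size != 0);
  return addr / PAGE_SIZE_ASSUMED == (addr + size - 1) / PAGE_SIZE_ASSUMED;
}

/* True if ADDR is a multiple of ALIGN, which must be a power of two.  */
static bool
is_aligned (uintptr_t addr, uintptr_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (addr & (align - 1)) == 0;
}
```

A 16-byte load at address 4080 is aligned and ends at byte 4095, still on the first page; the same load at 4090 is unaligned and runs into the next page.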


Some implementation techniques for C string functions result in the same 
behavior.  valgrind intercepts them or suppresses errors there, but 
that's not possible for code that GCC emits inline, obviously.


valgrind would still treat the bytes beyond the allocation boundary as 
undefined.  But I agree that false positives in this area are annoying.


Florian



[Bug c++/77513] New: -Wzero-as-null-pointer-constant vs 0, nullptr, NULL and __null

2016-09-07 Thread petschy at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77513

Bug ID: 77513
   Summary: -Wzero-as-null-pointer-constant vs 0, nullptr, NULL
and __null
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: petschy at gmail dot com
  Target Milestone: ---

Created attachment 39580
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39580&action=edit
Preprocessed source, generated with g++-7.0.0 -std=c++14
-Wzero-as-null-pointer-constant 20160907-null.cpp -E > 20160907-null.ii

Yesterday I switched on the warning for a ~250kloc codebase to clean it up.
Used 7.0, it was tedious but it was done. I had to replace NULLs also, not just
0s, but at that time I wasn't suspecting anything, though it seemed a bit
strange.

Then, tried to build on another machine with 5.4.1, and to my surprise, tons of
warnings appeared. Then tried to build on my machine with 5.4.1, the same
results. It turned out that NULLs are frowned upon, quite inconsistently.

5.4.1 has problems with 7 cpp files, 6.2.1 and 7.0 with just a single one. Did
a grep for NULL, and as expected for a large and aging codebase, there were
lots of them, but they are not treated equally. All files are c++, and compiled
with the same flags.

Preprocessed 2 problematic files with all three gcc versions mentioned. Diffing
them revealed that there is no difference in the actual code, only what gets
included due to the differing gcc versions. All NULLs were replaced with
__null's by the preprocessor, which is defined in the gcc version specific
stddef.h include.

Crafted a test case:

#include 
char* a = 0;
char* b = nullptr;
char* c = __null;
char* d = NULL;
int main()
{
}

$ g++-5.4.1 -std=c++14 -Wzero-as-null-pointer-constant 20160907-null.cpp
20160907-null.cpp:2:11: warning: zero as null pointer constant
[-Wzero-as-null-pointer-constant]
 char* a = 0;
   ^
20160907-null.cpp:4:11: warning: zero as null pointer constant
[-Wzero-as-null-pointer-constant]
 char* c = __null;
   ^
6.2.1 and 7.0 print exactly the same warnings.

So NULL is ok, but __null is not? The end of the preprocessed source looks like
this:

# 2 "20160907-null.cpp"
char* a = 0;
char* b = nullptr;
char* c = __null;
char* d = 
# 5 "20160907-null.cpp" 3 4
     __null
# 5 "20160907-null.cpp"
 ;
int main()
{
}

c and d initialized the same except for whitespace and the two "'# 6" lines
around d's __null. I naively thought that these are only to communicate line
info to the compiler, but if I delete the first one:

$ g++-7.0.0 -std=c++14 -Wzero-as-null-pointer-constant 20160907-null.ii
20160907-null.cpp:2:11: warning: zero as null pointer constant
[-Wzero-as-null-pointer-constant]
 char* a = 0;
   ^
20160907-null.cpp:4:11: warning: zero as null pointer constant
[-Wzero-as-null-pointer-constant]
 char* c = __null;
   ^~
20160907-null.cpp:6:10: warning: zero as null pointer constant
[-Wzero-as-null-pointer-constant]
 int main()
  ^

The interpretation of __null at d changed for some reason. What is going on? It
seems that the interpretation can change unpredictably, and in the problematic
source files __null's are misdiagnosed even when the "# ..." lines are around
them.
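
One plausible piece of the puzzle is the trailing "3 4" on the linemarker before d's __null. In GCC's documented preprocessor output format, flag 1 marks the start of a new file, flag 2 a return to a file, flag 3 text that comes from a system header (where certain warnings are suppressed), and flag 4 text to be treated as wrapped in extern "C". Whether that fully accounts for the behaviour reported here is not established. The sketch below merely decodes such a line; `struct linemarker` and `parse_linemarker` are hypothetical names, and file names containing escaped quotes are not handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

struct linemarker
{
  unsigned line;
  char file[256];
  bool new_file;       /* flag 1 */
  bool returning;      /* flag 2 */
  bool system_header;  /* flag 3 */
  bool extern_c;       /* flag 4 */
};

/* Parse a linemarker of the form  # LINE "FILE" FLAGS...
   Return false if TEXT does not look like one.  */
static bool
parse_linemarker (const char *text, struct linemarker *out)
{
  assert (text != NULL && out != NULL);
  int consumed = 0;
  memset (out, 0, sizeof *out);
  if (sscanf (text, "# %u \"%255[^\"]\"%n",
              &out->line, out->file, &consumed) != 2
      || consumed == 0)
    return false;

  /* Everything after the closing quote is a list of one-digit flags.  */
  for (const char *p = text + consumed; *p != '\0'; ++p)
    switch (*p)
      {
      case '1': out->new_file = true; break;
      case '2': out->returning = true; break;
      case '3': out->system_header = true; break;
      case '4': out->extern_c = true; break;
      default: break;
      }
  return true;
}
```

Fed the line `# 5 "20160907-null.cpp" 3 4` from the preprocessed source, this reports the system-header and extern "C" flags; the following `# 5 "20160907-null.cpp"` line carries no flags.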

For c++11 and later code, why is NULL defined as __null, rather than nullptr?

I put a quick band-aid on my code by redefining NULL to be nullptr after the
last include in the problematic files, but since the number of problematic
files seems to change from gcc version to gcc version, this is rather fragile,
not to mention inelegant.

Platform is Debian Jessie AMD64, the gcc versions:

$ g++-5.4.1 -v
Using built-in specs.
COLLECT_GCC=g++-5.4.1
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/5.4.1/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../configure --enable-languages=c,c++ --disable-multilib
--program-suffix=-5.4.1 --disable-bootstrap CFLAGS='-O2 -march=native'
CXXFLAGS='-O2 -march=native'
Thread model: posix
gcc version 5.4.1 20160829 (GCC)

$ g++-6.2.1 -v
Using built-in specs.
COLLECT_GCC=g++-6.2.1
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.2.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../configure --enable-languages=c,c++ --disable-multilib
--program-suffix=-6.2.1 --disable-bootstrap CFLAGS='-O2 -march=native'
CXXFLAGS='-O2 -march=native'
Thread model: posix
gcc version 6.2.1 20160831 (GCC)

$ g++-7.0.0 -v
Using built-in specs.
COLLECT_GCC=g++-7.0.0
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/7.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../configure --enable-languages=c,c++ --disable-multilib
--program-suffix=-7.0.0 --disable-bootstrap CFLAGS='-O2 -march=native'
CXXFLAGS='-O2 -march=native'
Thread model: posix
gcc version 7.0.0 20160831 (experimental) (GCC)

[Bug rtl-optimization/77499] [7 Regression] Regression after code-hoisting, due to combine pass failing to evaluate known value range

2016-09-07 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77499

--- Comment #5 from Andrew Pinski  ---
(In reply to Richard Biener from comment #3)
> It seems with thumb the code is only if-converted after reload for some
> reason.

Most likely because it is going through the cond_exec route.

[Bug c++/77347] [6/7 Regression] Incorrect auto deduction failure in template class member function

2016-09-07 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77347

Martin Liška  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-09-07
 CC||marxin at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Martin Liška  ---
Confirmed.

[Bug bootstrap/10802] Bootstrap of gcc-3.3 on Solaris 9 x86 fails on undefined symbol in libgcj.so

2016-09-07 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=10802

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |WAITING

--- Comment #14 from Andrew Pinski  ---
Does this really still happen?

[Bug c++/77508] [7 Regression] ICE on valid C++ code: in finish_class_member_access_expr, at cp/typeck.c:2783

2016-09-07 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77508

Martin Liška  changed:

   What|Removed |Added

   Keywords||ice-on-valid-code
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-09-07
 CC||jason at gcc dot gnu.org,
   ||marxin at gcc dot gnu.org
   Target Milestone|--- |7.0
Summary|ICE on valid C++ code: in   |[7 Regression] ICE on valid
   |finish_class_member_access_ |C++ code: in
   |expr, at cp/typeck.c:2783   |finish_class_member_access_
   ||expr, at cp/typeck.c:2783
 Ever confirmed|0   |1

--- Comment #1 from Martin Liška  ---
Confirmed, started with r236221.

[Bug middle-end/21786] Segmentation fault under FreeBSD 5.3-RELEASE-p15

2016-09-07 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21786

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from Andrew Pinski  ---
Fixed so closing.

[Bug tree-optimization/77503] [7 regression] ICE in vect_transform_stmt compiling postgresql

2016-09-07 Thread trippels at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77503

--- Comment #11 from Markus Trippelsdorf  ---
(In reply to Andrew Pinski from comment #10)
> (In reply to Markus Trippelsdorf from comment #6)
> > markus@x4 tmp % cat fsmpage.i
> 
> You got to it before I could do that :).

Yeah, sorry. I already finished reducing when I read your message...

[Bug rtl-optimization/77499] [7 Regression] Regression after code-hoisting, due to combine pass failing to evaluate known value range

2016-09-07 Thread avieira at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77499

--- Comment #7 from avieira at gcc dot gnu.org ---
if-convert is a no-go here, for the reason Andrew pointed out. Sorry, I missed
that comment!

So I don't know... The only thing I can think of is better "value-range"-like
analysis for combine, but that might be too costly?

The fact is that for the code-hoisting to work here, the pseudo for r112 has to
be shared among both code-paths, so unless you add an extra move:

BB0:
r112:SI = r0:SI

BB 1:
...
r116:SI=r112:SI 0>>0x1
rNEW:SI=zero_extend(r116:SI#0)
...
if CC goto BB2 else BB Extra
BB 2:
r127:SI=rNEW:SI^r129:SI
r112:SI=zero_extend(r127:SI#0)
if LOOP: goto BB1 else BB exit
BB EXTRA:
r112:SI=rNEW:SI
if LOOP: goto BB1 else BB exit

And you end up with an extra move rather than a zero_extend. But maybe the move
can be optimized away in later stages? Or maybe put in the same conditional
execution block as the XOR...

[Bug tree-optimization/77503] [7 regression] ICE in vect_transform_stmt compiling postgresql

2016-09-07 Thread amker at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77503

--- Comment #12 from amker at gcc dot gnu.org ---
The ICE occurs because prologue peeling changes the code (specifically, the
initial argument of the reduction PHI node in the loop header); as a result,
the statement to be vectorized is not the statement that was previously
analyzed.
One possible fix is to be conservative and fall back to generic conditional
reduction if the loop needs to be peeled for a prologue.  Another (better) fix
is not to update/inherit the reduction value from the prologue loop for the
vectorized loop and, after the vectorized loop, to combine the results
manually.  The code would look like:

prolog_loop:
  c_1 = PHI
  c_2 = (cond) ? 1 : c_1;
  if (prolog_cond)  goto prolog_header
  else  goto prolog_exit

prolog_exit:
  c_3 = PHI;

vector_loop:
  c_4 = PHI;
  c_5 = (cond) ? 1 : c_4
  if (vector_cond) goto vector_header
  else goto vector_exit

vector_exit:
  c_6 = PHI;
  c_7 = (c_3 == 1 || c_5 == 1) ? 1 : 0;

We need to change how loop peeling updates SSA (PHI) nodes.  Since I have other
patches rewriting loop peeling, it is better to revisit this method after those
patches are in.  For now, I think we'd better go with the first method, which
disables the optimization in peeling cases.
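
The second approach can be pictured with a scalar C stand-in. This is a sketch only, not GIMPLE and not the vectorizer's code; `cond_reduce` and `cond_reduce_split` are made-up names, and the "main" loop here merely stands for the vectorized loop. The main loop starts from its own initial value instead of inheriting the prologue's result, and the two partial results are combined afterwards as in the c_7 statement above.

```c
#include <assert.h>
#include <stddef.h>

/* Conditional reduction over A[BEGIN, END): the result becomes 1 once
   any element is nonzero, otherwise INIT is passed through.  */
static int
cond_reduce (const int *a, size_t begin, size_t end, int init)
{
  int c = init;
  for (size_t i = begin; i < end; ++i)
    c = a[i] ? 1 : c;
  return c;
}

/* Split version: the prologue handles the first PEEL iterations, the
   main loop restarts from 0, and the results are combined manually.  */
static int
cond_reduce_split (const int *a, size_t n, size_t peel)
{
  assert (peel <= n);
  int c_prolog = cond_reduce (a, 0, peel, 0);
  int c_main = cond_reduce (a, peel, n, 0);
  return (c_prolog == 1 || c_main == 1) ? 1 : 0;
}
```

For any split point, the combined value matches what a single unpeeled loop starting from 0 would compute.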

[Bug rtl-optimization/77499] [7 Regression] Regression after code-hoisting, due to combine pass failing to evaluate known value range

2016-09-07 Thread avieira at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77499

--- Comment #9 from avieira at gcc dot gnu.org ---

> > So I dont know... Only thing I can think of is better "value-range"-like
> > analysis for combine, but that might be too costly?
> 
> So we are not really looking for combine to combine the shift stmt
> with the xor stmt?  Because combine doesn't consider that because of
> the multi-use.

AFAIK, combine will not combine the shift and xor because they are in different
basic blocks. The multi-use prevents it from tracking the origin of r112 back
to a point where it knows that its higher bits are all 0.

> > 
> > And you end up with an extra move rather than a zero_extend. But maybe the 
> > move
> > can be optimized away in later stages? Or maybe put in the same conditional
> > execution block as the XOR...
> 
> Well, we run into a general issue of the RTL combiner -- fwprop and
> ree are other passes that are supposed to remove extensions in some
> cases.
> 
> Really, the user could have written the code in a way CSEing the
> shift himself -- it's unfortunate that we now fail to optimize the
> non-CSEd source but that can only be a reason to enhance downstream
> passes.

True, if say the unused 'y' I left in there for some odd reason were used to
CSE (x >> 1) outside the if-then-else, then you would end up with the
zero_extend in both -fcode-hoisting and -fno-code-hoisting.

[Bug tree-optimization/77503] [7 regression] ICE in vect_transform_stmt compiling postgresql

2016-09-07 Thread amker at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77503

--- Comment #13 from amker at gcc dot gnu.org ---
(In reply to amker from comment #12)
> ICE is because prologue peeling changes code (to be specific, the initial
> argument of reduction phi node in loop header), as a result, the statement
> to be vectorized is not the statement that was previously analyzed.
> One possible fix is be conservative and fall back to generic conditional
> reduction if loop needs to be peeled for prologue.  Another (better) fix is
> we don't update/inherit reduction value from prologue loop for vectorized
> loop, and after vectorized loop, we combine the result manually.  Code would

Actually, existing code handles peeling for prologue loop well, I only need to
get the initial constant value from prologue loop.

Re: [PATCH] Set -fprofile-update=atomic when -pthread is present

2016-09-07 Thread Martin Liška
On 08/18/2016 06:06 PM, Richard Biener wrote:
> On August 18, 2016 5:54:49 PM GMT+02:00, Jakub Jelinek  
> wrote:
>> On Thu, Aug 18, 2016 at 08:51:31AM -0700, Andi Kleen wrote:
>>>> I'd prefer to make updates atomic in multi-threaded applications.
>>>> The best proxy we have for that is -pthread.
>>>>
>>>> Is it slower, most definitely, but odds are we're giving folks
>>>> garbage data otherwise, which in many ways is even worse.
>>>
>>> It will likely be catastrophically slower in some cases.
>>>
>>> Catastrophically as in too slow to be usable.
>>>
>>> An atomic instruction is a lot more expensive than a single increment.
>>> Also they sometimes are really slow depending on the state of the machine.
>>
>> Can't we just have thread-local copies of all the counters (perhaps using
>> __thread pointer as base) and just atomically merge at thread termination?
> 
> I suggested that as well but of course it'll have its own class of issues 
> (short lived threads, so we need to somehow re-use counters from terminated 
> threads, large number of threads and thus using too much memory for the 
> counters)
> 
> Richard.

Hello.

I've put this approach on my TODO list; let's see whether it is
doable in a reasonable amount of time.
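
To make the thread-local idea concrete, here is a minimal C11 sketch. It is not how gcov or libgcov work today; all names are hypothetical and the counter array is a fixed size for simplicity. Each thread increments private counters without atomics and merges them into the shared totals once, before it terminates. The concerns raised above (short-lived threads, memory use with many threads) apply to exactly this scheme.

```c
#include <assert.h>
#include <stdatomic.h>

enum { N_COUNTERS = 4 };

/* Shared totals, touched only when a thread flushes.  */
static _Atomic long global_counters[N_COUNTERS];

/* Per-thread private counters: plain increments, no atomics.  */
static _Thread_local long local_counters[N_COUNTERS];

/* Hot path: bump this thread's private counter.  */
static void
counter_bump (int idx)
{
  assert (idx >= 0 && idx < N_COUNTERS);
  ++local_counters[idx];
}

/* Merge this thread's counts into the shared totals.  A thread would
   call this once before it terminates.  */
static void
counters_flush (void)
{
  for (int i = 0; i < N_COUNTERS; ++i)
    {
      atomic_fetch_add (&global_counters[i], local_counters[i]);
      local_counters[i] = 0;
    }
}

/* Read a shared total.  */
static long
counter_total (int idx)
{
  assert (idx >= 0 && idx < N_COUNTERS);
  return atomic_load (&global_counters[idx]);
}
```

The cost moves from one atomic operation per counter update to one per counter per thread exit, at the price of one private counter array per thread.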

I've just finished some measurements to illustrate the slow-down of the
-fprofile-update=atomic approach.
All numbers are: no profile, -fprofile-generate, -fprofile-generate
-fprofile-update=atomic
c-ray benchmark (utilizing 8 threads, -O3): 1.7, 15.5, 38.1s
unrar (utilizing 8 threads, -O3): 3.6, 11.6, 38s
tramp3d (1 thread, -O3): 18.0, 46.6, 168s

So the atomic variant takes roughly 3x as long (about 300%) as plain
-fprofile-generate. I don't have much experience with default option
selection, but these numbers can probably help.
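
The per-benchmark ratios behind that estimate can be recomputed from the timings above. A trivial sketch, with a made-up helper name:

```c
#include <assert.h>

/* Run time with -fprofile-update=atomic divided by the run time with
   plain -fprofile-generate, both in seconds.  */
static double
slowdown (double generate_s, double atomic_s)
{
  assert (generate_s > 0.0);
  return atomic_s / generate_s;
}
```

With the numbers above this gives about 2.5 for c-ray (38.1 / 15.5), about 3.3 for unrar (38 / 11.6) and about 3.6 for tramp3d (168 / 46.6).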

Thoughts?
Martin

> 
>>  Jakub
> 
> 



[Bug middle-end/77383] -fcheck-pointer-bounds -mmpx ICE with VLA struct return type

2016-09-07 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77383

Martin Liška  changed:

   What|Removed |Added

   Keywords||ice-on-valid-code
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-09-07
 CC||marxin at gcc dot gnu.org
   Target Milestone|--- |5.5
 Ever confirmed|0   |1
  Known to fail||5.4.0, 6.2.0

--- Comment #1 from Martin Liška  ---
Confirmed, starting from 5.1.0, first release where MPX support was added.

Re: [PATCH] Fix PR77450

2016-09-07 Thread Richard Biener
On Wed, 7 Sep 2016, Yvan Roux wrote:

> Hi Richard,
> 
> On 6 September 2016 at 14:41, Richard Biener  wrote:
> >
> > The following fixes PR77450.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
> >
> > Richard.
> >
> > 2016-09-06  Richard Biener  
> >
> > PR c/77450
> > c-family/
> > * c-common.c (c_common_mark_addressable_vec): Handle
> > COMPOUND_LITERAL_EXPR.
> >
> > * c-c++-common/vector-subscript-7.c: Adjust.
> > * c-c++-common/vector-subscript-8.c: New testcase.
> 
> This new testcase fails in our validation (ARM and x86 targets):
> 
> gcc/testsuite/c-c++-common/vector-subscript-8.c:8:17: error: lvalue
> required as left operand of assignment
> compiler exited with status 1
> FAIL: c-c++-common/vector-subscript-8.c  -std=c++98 (test for excess errors)

Ah, seems my dev tree eats the NON_LVALUE_EXPR we build around it.

So it looks like the testcase is invalid for C++, will move it.
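
For background, the C standard makes a compound literal an lvalue, which is why assigning into one is fine in C, whereas G++ accepts compound literals only as an extension and, as far as I know, treats them as temporaries. The sketch below uses an ordinary array instead of the vector type from the testcase so that it needs no GCC extensions; the function name is made up.

```c
#include <assert.h>

/* In C, a compound literal designates an unnamed object, so it can be
   assigned to and its address can be taken.  */
static int
compound_literal_demo (void)
{
  int *p = (int[]){ 1, 2, 3 };  /* the array literal decays to a pointer */
  (int[]){ 0 }[0] = 7;          /* valid C: the element is an lvalue */
  p[1] = 42;                    /* modify the unnamed object through P */
  assert (p[0] == 1);
  return p[0] + p[1] + p[2];    /* 1 + 42 + 3 */
}
```

Compiled as C this is accepted; the equivalent assignment in the testcase is what produced "lvalue required as left operand of assignment" under -std=c++98.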

Richard.

> 
> > Index: gcc/c-family/c-common.c
> > ===
> > --- gcc/c-family/c-common.c (revision 240004)
> > +++ gcc/c-family/c-common.c (working copy)
> > @@ -10918,7 +10918,9 @@ c_common_mark_addressable_vec (tree t)
> >  {
> >while (handled_component_p (t))
> >  t = TREE_OPERAND (t, 0);
> > -  if (!VAR_P (t) && TREE_CODE (t) != PARM_DECL)
> > +  if (!VAR_P (t)
> > +  && TREE_CODE (t) != PARM_DECL
> > +  && TREE_CODE (t) != COMPOUND_LITERAL_EXPR)
> >  return;
> >TREE_ADDRESSABLE (t) = 1;
> >  }
> > Index: gcc/testsuite/c-c++-common/vector-subscript-7.c
> > ===
> > --- gcc/testsuite/c-c++-common/vector-subscript-7.c (revision 240004)
> > +++ gcc/testsuite/c-c++-common/vector-subscript-7.c (working copy)
> > @@ -1,5 +1,5 @@
> >  /* { dg-do compile } */
> > -/* { dg-options "-O -fdump-tree-ccp1" } */
> > +/* { dg-options "-O -fdump-tree-fre1" } */
> >
> >  typedef int v4si __attribute__ ((vector_size (16)));
> >
> > @@ -11,4 +11,4 @@ main (int argc, char** argv)
> >return ((v4si){1, 2, 42, 0})[j];
> >  }
> >
> > -/* { dg-final { scan-tree-dump "return 42;" "ccp1" } } */
> > +/* { dg-final { scan-tree-dump "return 42;" "fre1" } } */
> > Index: gcc/testsuite/c-c++-common/vector-subscript-8.c
> > ===
> > --- gcc/testsuite/c-c++-common/vector-subscript-8.c (revision 0)
> > +++ gcc/testsuite/c-c++-common/vector-subscript-8.c (working copy)
> > @@ -0,0 +1,9 @@
> > +/* { dg-do compile } */
> > +
> > +typedef int V __attribute__((vector_size(4)));
> > +
> > +void
> > +foo(void)
> > +{
> > +  (V){ 0 }[0] = 0;
> > +}
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


[Bug bootstrap/30136] bootstrap fail for 4.3-20061209

2016-09-07 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30136

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
   Last reconfirmed||2016-09-07
 Ever confirmed|0   |1

--- Comment #8 from Andrew Pinski  ---
Does this work with a much newer GCC?

[Bug fortran/77501] ICE in gfc_match_generic, at fortran/decl.c:9429

2016-09-07 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77501

Richard Biener  changed:

   What|Removed |Added

   Keywords||ice-on-invalid-code
   Target Milestone|6.3 |---

Re: [PATCH, PING] DWARF: process all TYPE_DECL nodes when iterating on scopes

2016-09-07 Thread Richard Biener
On Tue, Sep 6, 2016 at 10:43 AM, Pierre-Marie de Rodat
 wrote:
> Hello,
>
> On 08/29/2016 11:03 AM, Pierre-Marie de Rodat wrote:
>>
>> Here is another attempt to solve the original issue. This time, with a
>> proper testcase. ;-)
>>
>> Rebased against trunk, bootstrapped and regtested on x86_64-linux.
>
>
> Ping for the patch submitted at
> . Thank you in
> advance!

Ok, had time to look at this issue again.  I see the patch works like dwarf2out
works currently with respect to DIE creation order and re-location.

-   /* Output a DIE to represent the typedef itself.  */
-   gen_typedef_die (decl, context_die);
+   {
+ /* Output a DIE to represent the typedef itself.  */
+ gen_typedef_die (decl, context_die);
+
+ /* The above may create a typedef in the proper scope, but the
+underlying type itself could have been created earlier, at a point
+when the scope was not available yet.  If it's the case, relocate
+it.  This is analogous to what is done in process_scope_var,
+except we deal with a TYPE and not a DECL, here.  */
+ dw_die_ref type_die = lookup_type_die (TREE_TYPE (decl));
+ if (type_die != NULL && type_die->die_parent == NULL
+ && DECL_CONTEXT (decl) == TYPE_CONTEXT (TREE_TYPE (decl)))
+   add_child_die (context_die, type_die);
+   }

this might be incomplete though for the case where it's say

 typedef const T X;

thus the type of decl is a qualified type?  In this case the qualification might
be at the correct scope but the actual type not or you just relocate the
qualification but not the type DIE it refers to?

That said, with the idea of early debug in place and thus giving more
responsibility to the frontends I wonder in what order the Ada FE calls
debug_hooks.early_global_decl ()?
Maybe it's the middle-end and it should arrange for a more natural order
on function nests.  So I'd appreciate if you can investigate this side of
the issue a bit, that is, simply avoid the bad ordering.

Richard.

> --
> Pierre-Marie de Rodat


[Bug c++/77508] New: ICE on valid C++ code: in finish_class_member_access_expr, at cp/typeck.c:2783

2016-09-07 Thread su at cs dot ucdavis.edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77508

Bug ID: 77508
   Summary: ICE on valid C++ code: in
finish_class_member_access_expr, at cp/typeck.c:2783
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: su at cs dot ucdavis.edu
  Target Milestone: ---

This is a regression from 6.2. 


$ g++-trunk -v
Using built-in specs.
COLLECT_GCC=g++-trunk
COLLECT_LTO_WRAPPER=/usr/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/7.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-source-trunk/configure --enable-languages=c,c++,lto
--prefix=/usr/local/gcc-trunk --disable-bootstrap
Thread model: posix
gcc version 7.0.0 20160906 (experimental) [trunk revision 240004] (GCC)
$
$ g++-6.2 small.cpp
$ clang++ small.cpp
$
$ g++-trunk small.cpp
small.cpp: In member function ‘void B::g()’:
small.cpp:10:23: internal compiler error: in finish_class_member_access_expr,
at cp/typeck.c:2783
 this->B::template f < &B < T >::g > ();
   ^
0x7e8254 finish_class_member_access_expr(cp_expr, tree_node*, bool, int)
../../gcc-source-trunk/gcc/cp/typeck.c:2783
0x79dd04 cp_parser_postfix_dot_deref_expression
../../gcc-source-trunk/gcc/cp/parser.c:7364
0x79b54c cp_parser_postfix_expression
../../gcc-source-trunk/gcc/cp/parser.c:6967
0x799e6c cp_parser_unary_expression
../../gcc-source-trunk/gcc/cp/parser.c:8019
0x7a3d17 cp_parser_cast_expression
../../gcc-source-trunk/gcc/cp/parser.c:8696
0x7a4315 cp_parser_binary_expression
../../gcc-source-trunk/gcc/cp/parser.c:8798
0x7a4c00 cp_parser_assignment_expression
../../gcc-source-trunk/gcc/cp/parser.c:9086
0x7a74f9 cp_parser_expression
../../gcc-source-trunk/gcc/cp/parser.c:9253
0x7a7b1f cp_parser_expression_statement
../../gcc-source-trunk/gcc/cp/parser.c:10736
0x7b645c cp_parser_statement
../../gcc-source-trunk/gcc/cp/parser.c:10587
0x7b73dc cp_parser_statement_seq_opt
../../gcc-source-trunk/gcc/cp/parser.c:10859
0x7b74cf cp_parser_compound_statement
../../gcc-source-trunk/gcc/cp/parser.c:10813
0x7b767f cp_parser_function_body
../../gcc-source-trunk/gcc/cp/parser.c:20832
0x7b767f cp_parser_ctor_initializer_opt_and_function_body
../../gcc-source-trunk/gcc/cp/parser.c:20868
0x7b8121 cp_parser_function_definition_after_declarator
../../gcc-source-trunk/gcc/cp/parser.c:25565
0x7bbf80 cp_parser_late_parsing_for_member
../../gcc-source-trunk/gcc/cp/parser.c:26446
0x796d56 cp_parser_class_specifier_1
../../gcc-source-trunk/gcc/cp/parser.c:21719
0x798679 cp_parser_class_specifier
../../gcc-source-trunk/gcc/cp/parser.c:21745
0x798679 cp_parser_type_specifier
../../gcc-source-trunk/gcc/cp/parser.c:15971
0x7abdf3 cp_parser_decl_specifier_seq
../../gcc-source-trunk/gcc/cp/parser.c:12889
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
$


---


template < typename T > struct A
{ 
  template < void (T::*Fn) () > void f () {}
};

template < typename T > struct B : A < B < T > >
{ 
  void g ()
  { 
    this->B::template f < &B < T >::g > ();
  }
};

int main ()
{ 
  B < int > b;
  b.g ();
  return 0;
}

[Bug java/16200] gcj generates dependencies on the full contents of the extensions directory

2016-09-07 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=16200

Andrew Pinski  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |INVALID

--- Comment #2 from Andrew Pinski  ---
No feedback in over 4 years, so closing; gcj is about to be removed anyway :)

[Bug rtl-optimization/77499] [7 Regression] Regression after code-hoisting, due to combine pass failing to evaluate known value range

2016-09-07 Thread avieira at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77499

--- Comment #6 from avieira at gcc dot gnu.org ---
> so we are talking about the uxthne insn (I don't know arm / thumb very well).

Yes, the uxthne is the "zero_extend" that is otherwise optimized away if you
turn off code-hoisting.

This is because the way the code gets transformed leads to:
r112:SI=r112:SI 0>>0x1, this is the combination of instructions 12 and 13 in my
example earlier. r112 is also the first operand of the xor instruction and
because of the way combine does its "nonzero bit analysis" it always looks at
the last set value for each pseudo. For r112 here, that's an infinite loop and
so it will not be able to recognize that r112 originated from r0, thus losing
the information that it is at most an unsigned short. Leading to the decision
not to get rid of the zero_extend.
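
To make the lost fact concrete, here is a minimal sketch in plain C (not
combine's actual code) of how a "nonzero bits" mask would flow through the
shift and the xor if the analysis could see that r112 starts out as an
unsigned short.  All helper names are hypothetical; 45345 is the movw constant
in the generated code for this testcase.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A "nonzero bits" mask has a 1 wherever the value might have a 1.  */

/* Logical shift right by N moves every possibly-set bit down by N.  */
static uint32_t
nz_lshiftrt (uint32_t nz, unsigned n)
{
  assert (n < 32);
  return nz >> n;
}

/* A bit of A ^ B can only be set if it may be set in A or in B.  */
static uint32_t
nz_xor (uint32_t nz_a, uint32_t nz_b)
{
  return nz_a | nz_b;
}

/* A 16-bit zero_extend is a no-op when no bit above bit 15 can be set.  */
static bool
zero_extend16_redundant (uint32_t nz)
{
  return (nz & ~UINT32_C (0xffff)) == 0;
}

/* Mask of r112 after one loop iteration: (r112 >> 1) ^ 45345.  */
static uint32_t
nz_after_iteration (uint32_t nz_r112)
{
  return nz_xor (nz_lshiftrt (nz_r112, 1), UINT32_C (45345));
}
```

Starting from a mask of 0xffff the result still fits in 16 bits, so the
extend could be dropped; starting from "unknown" (all ones) it cannot, which
is the conclusion reached when only the loop-carried set is visible.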

I'll have a look at if-convert.

Re: [PATCH] Detect whether target can use -fprofile-update=atomic

2016-09-07 Thread Martin Liška
On 09/07/2016 09:45 AM, Christophe Lyon wrote:
> On 6 September 2016 at 15:45, Martin Liška  wrote:
>> On 09/06/2016 03:31 PM, Jakub Jelinek wrote:
>>> sizeof (gcov_type) talks about the host gcov type, you want instead the
>>> target gcov type.  So
>>> TYPE_SIZE (gcov_type_node) == 32 vs. 64 (or TYPE_SIZE_UNIT (gcov_type_node)
>>> == 4 vs. 8).
>>> As SImode and DImode are in fact 4*BITS_PER_UNIT and 8*BITS_PER_UNIT,
>>> TYPE_SIZE_UNIT comparisons for 4 and 8 are most natural.
>>> And I wouldn't add gcc_unreachable, just warn for weirdo arches always.
>>>
>>>   Jakub
>>
>> Thank you Jakub for helping me with that. I've used TYPE_SIZE_UNIT macro.
>>
>> Ready for trunk?
>> Martin
> 
> Hi Martin,
> 
> On targets which do not support atomic profile update, your patch generates a
> warning on gcc.dg/tree-prof/val-profiler-threads-1.c, making it fail.
> 
> Do we need a new effective-target ?
> 
> Christophe
> 

Hi.

Thanks for the observation, I'm sending a patch that does that.
Can you please test it?

Thanks,
Martin
From 9a68f2fbf2b5cb547aee7860926c846d5f15d398 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 7 Sep 2016 11:28:13 +0200
Subject: [PATCH] Add new effective target: profile_update_atomic

gcc/testsuite/ChangeLog:

2016-09-07  Martin Liska  

	* g++.dg/gcov/gcov-threads-1.C: Use profile_update_atomic
	effective target.
	* gcc.dg/tree-prof/val-profiler-threads-1.c: Likewise.
	* lib/target-supports.exp: Define the new target.
---
 gcc/testsuite/g++.dg/gcov/gcov-threads-1.C  | 1 +
 gcc/testsuite/gcc.dg/tree-prof/val-profiler-threads-1.c | 2 ++
 gcc/testsuite/lib/target-supports.exp   | 7 +++
 3 files changed, 10 insertions(+)

diff --git a/gcc/testsuite/g++.dg/gcov/gcov-threads-1.C b/gcc/testsuite/g++.dg/gcov/gcov-threads-1.C
index a4a6f0a..cc9266a 100644
--- a/gcc/testsuite/g++.dg/gcov/gcov-threads-1.C
+++ b/gcc/testsuite/g++.dg/gcov/gcov-threads-1.C
@@ -1,5 +1,6 @@
 /* { dg-options "-fprofile-arcs -ftest-coverage -pthread -fprofile-update=atomic" } */
 /* { dg-do run { target native } } */
+/* { dg-require-effective-target profile_update_atomic } */
 
 #include 
 #include 
diff --git a/gcc/testsuite/gcc.dg/tree-prof/val-profiler-threads-1.c b/gcc/testsuite/gcc.dg/tree-prof/val-profiler-threads-1.c
index e9b04a0..95d6ee3 100644
--- a/gcc/testsuite/gcc.dg/tree-prof/val-profiler-threads-1.c
+++ b/gcc/testsuite/gcc.dg/tree-prof/val-profiler-threads-1.c
@@ -1,4 +1,6 @@
 /* { dg-options "-O0 -pthread -fprofile-update=atomic" } */
+/* { dg-require-effective-target profile_update_atomic } */
+
 #include 
 
 #define NUM_THREADS	8
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 545b3dc..6724a7f 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -7699,3 +7699,10 @@ proc check_effective_target_offload_hsa { } {
 	int main () {return 0;}
 } "-foffload=hsa" ]
 }
+
+# Return 1 if the target supports -fprofile-update=atomic
+proc check_effective_target_profile_update_atomic {} {
+return [check_no_compiler_messages profile_update_atomic assembly {
+	int main (void) { return 0; }
+} "-fprofile-update=atomic -fprofile-generate"]
+}
-- 
2.9.2



[Bug c/77511] New: libbacktrace could not find executable to open

2016-09-07 Thread zheltonozhskiy at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77511

Bug ID: 77511
   Summary: libbacktrace could not find executable to open
   Product: gcc
   Version: 6.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zheltonozhskiy at gmail dot com
  Target Milestone: ---

When I'm trying to build this -
https://github.com/Randl/skypeopensource2/tree/a1ec02c05797853e1862ae26d8e46a1d52f4b77f
Target skyauth_exe fails on the Release configuration with the following:

C:\Users\eabes\Downloads\skypeopensource2-4870457f725b332ec52d6d72e688c9beddbdd87b\skypeopensource2-4870457f725b332ec52d6d72e688c9beddbdd87b\skyauth\dh_384.c:
In function 'rc4_key_generate.constprop':
C:\Users\eabes\Downloads\skypeopensource2-4870457f725b332ec52d6d72e688c9beddbdd87b\skypeopensource2-4870457f725b332ec52d6d72e688c9beddbdd87b\skyauth\dh_384.c:49:5:
internal compiler error: in get_ptr_info, at tree-ssanames.c:586
 int rc4_key_generate(int buf, int a_key, unsigned int a_len) { //TODO: MinGW
bug lto-wrapper failed
 ^
libbacktrace could not find executable to open
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.
lto-wrapper.exe: fatal error:
C:\PROGRA~1\MINGW-~1\X86_64~1.0-P\mingw64\bin\G__~1.EXE returned 1 exit status
compilation terminated.
C:/PROGRA~1/MINGW-~1/X86_64~1.0-P/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/6.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe:
lto-wrapper failed
collect2.exe: error: ld returned 1 exit status
mingw32-make.exe[2]: *** [skyauth/skyauth_exe.exe] Error 1

for both MinGW 6.1.0 and 6.2.0 x64.

Re: [PATCH] Delete GCJ

2016-09-07 Thread Richard Earnshaw (lists)
On 06/09/16 22:17, Jeff Law wrote:
> On 09/06/2016 03:08 AM, Jakub Jelinek wrote:
>> On Tue, Sep 06, 2016 at 11:06:36AM +0200, Richard Biener wrote:
>>> On Mon, Sep 5, 2016 at 6:17 PM, Andrew Haley  wrote:
 On 05/09/16 17:15, Richard Biener wrote:
> On September 5, 2016 5:13:06 PM GMT+02:00, Andrew Haley
>  wrote:
>> As discussed.  I think I should ask a Global reviewer to approve this
>> one.  For obvious reasons I haven't included the diffs to the deleted
>> gcc/java and libjava directories.  The whole tree, post GCJ-deletion,
>> is at svn+ssh://gcc.gnu.org/svn/gcc/branches/gcj/gcj-deletion-branch
>> if anyone would like to try it.
>
> Isn't there also java specific C++ frontend parts?

 There certainly are, but deleting them without breaking anything else
 is going to be rather delicate.  I'm trying to do this one step at a
 time, rather cautiously.
>>>
>>> Ok, that sounds reasonable.
>>>
>>> You have my approval for this first part then.  Please wait until
>>> after the
>>> GNU Cauldron to allow other global reviewers to object.
>>
>> No objection from me.
> No objection from me either (I'm guessing that's not a surprise).
> 

Nor from me.

R.

> jeff



[Bug rtl-optimization/77499] [7 Regression] Regression after code-hoisting, due to combine pass failing to evaluate known value range

2016-09-07 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77499

--- Comment #4 from Richard Biener  ---
Oh, and with -fno-code-hoisting I see

movw    r6, #45345
.L5:
smull   lr, r4, r5, r1
sub r4, r4, r1, asr #31
add r4, r4, r4, lsl #1
cmp r1, r4
sub r1, r1, r3
ite ne
eorne   r0, r6, r0, lsr #1
lsreq   r0, r0, #1
cmp r2, r1
blt .L5

while with code hoisting

movw    r6, #45345
.L4:
smull   r5, r4, r7, r1
lsrs    r0, r0, #1
sub r4, r4, r1, asr #31
eor r5, r0, r6
add r4, r4, r4, lsl #1
cmp r1, r4
sub r1, r1, r3
it  ne
uxthne  r0, r5
cmp r2, r1
blt .L4

so we are talking about the uxthne insn (I don't know arm / thumb very well).

Re: [RFC][SSA] Iterator to visit SSA

2016-09-07 Thread Dominik Inführ
I am not sure about the process, but it may also be nice/useful to add your new 
macro to ForEachMacros in contrib/clang-format.

Dominik

> On 07 Sep 2016, at 02:21, Kugan Vivekanandarajah 
>  wrote:
> 
> Hi Richard,
> 
> On 6 September 2016 at 19:08, Richard Biener  
> wrote:
>> On Tue, Sep 6, 2016 at 2:24 AM, Kugan Vivekanandarajah
>>  wrote:
>>> Hi Richard,
>>> 
>>> On 5 September 2016 at 17:57, Richard Biener  
>>> wrote:
 On Mon, Sep 5, 2016 at 7:26 AM, Kugan Vivekanandarajah
  wrote:
> Hi All,
> 
> While looking at gcc source, I noticed that we are iterating over SSA
> variable from 0 to num_ssa_names in some cases and 1 to num_ssa_names
> in others. It seems that variable 0 is always NULL TREE.
 
 Yeah, that's artificial because we don't assign SSA name version 0 (for 
 some
 unknown reason).
 
> But it can
> confuse people who are looking for the first time. Therefore It might
> be good to follow some consistent usage here.
> 
> It might also be good to have a FOR_EACH_SSAVAR iterator as we do in
> other case. Here is attempt to do this based on what is done in other
> places. Bootstrapped and regression tested on X86_64-linux-gnu with no
> new regressions. is this OK?
 
 First of all some bikeshedding - FOR_EACH_SSA_NAME would be better
 as SSAVAR might be confused with iterating over SSA_NAME_VAR.
 
 Then, if you add an iterator why leave the name == NULL handling to 
 consumers?
 That looks odd.
 
 Then, SSA names are just in a vector thus why not re-use the vector 
 iterator?
 
 That is,
 
 #define FOR_EACH_SSA_NAME (name) \
  for (unsigned i = 1; SSANAMES (cfun).iterate (i, &name); ++i)
 
 would be equivalent to your patch?
 
 Please also don't add new iterators that implicitly use 'cfun' but always
 use
 one that passes it as explicit arg.
>>> 
>>> I think defining FOR_EACH_SSA_NAME with vector iterator is better. But
> >>> we will not be able to skip NULL ssa_names with that.
>> 
>> Why?  Can't you simply do
>> 
>>  #define FOR_EACH_SSA_NAME (name) \
> >>for (unsigned i = 1; SSANAMES (cfun).iterate (i, &name); ++i) \
>>   if (name)
>> 
>> ?
> 
> Indeed.  I missed the if at the end :(.  Here is an updated patch.
> Bootstrapped and regression tested on x86_64-linux-gnu with no new
> regressions.
> 
> Thanks,
> Kugan
>> 
>>> I also added
> >>> index variable to the macro so that there won't be any conflicts if the
>>> index variable "i" (or whatever) is also defined in the loop.
>>> 
>>> Bootstrapped and regression tested on x86_64-linux-gnu with no new
>>> regressions. Is this OK for trunk?
>>> 
>>> Thanks,
>>> Kugan
>>> 
>>> 
>>> gcc/ChangeLog:
>>> 
>>> 2016-09-06  Kugan Vivekanandarajah  
>>> 
>>>* tree-ssanames.h (FOR_EACH_SSA_NAME): New.
>>>* cfgexpand.c (update_alias_info_with_stack_vars): Use
>>>FOR_EACH_SSA_NAME to iterate over SSA variables.
>>>(pass_expand::execute): Likewise.
>>>* omp-simd-clone.c (ipa_simd_modify_function_body): Likewise.
>>>* tree-cfg.c (dump_function_to_file): Likewise.
>>>* tree-into-ssa.c (pass_build_ssa::execute): Likewise.
>>>(update_ssa): Likewise.
>>>* tree-ssa-alias.c (dump_alias_info): Likewise.
>>>* tree-ssa-ccp.c (ccp_finalize): Likewise.
>>>* tree-ssa-coalesce.c (build_ssa_conflict_graph): Likewise.
>>>(create_outofssa_var_map): Likewise.
>>>(coalesce_ssa_name): Likewise.
>>>* tree-ssa-operands.c (dump_immediate_uses): Likewise.
>>>* tree-ssa-pre.c (compute_avail): Likewise.
>>>* tree-ssa-sccvn.c (init_scc_vn): Likewise.
>>>(scc_vn_restore_ssa_info): Likewise.
> >>>(free_scc_vn): Likewise.
>>>(run_scc_vn): Likewise.
>>>* tree-ssa-structalias.c (compute_points_to_sets): Likewise.
>>>* tree-ssa-ter.c (new_temp_expr_table): Likewise.
>>>* tree-ssa-copy.c (fini_copy_prop): Likewise.
>>>* tree-ssa.c (verify_ssa): Likewise.
>>> 
 
 Thanks,
 Richard.
 
 
> Thanks,
> Kugan
> 
> 
> gcc/ChangeLog:
> 
> 2016-09-05  Kugan Vivekanandarajah  
> 
>* tree-ssanames.h (ssa_iterator::ssa_iterator): New.
>(ssa_iterator::get): Likewise.
>(ssa_iterator::next): Likewise.
>(FOR_EACH_SSAVAR): Likewise.
>* cfgexpand.c (update_alias_info_with_stack_vars): Use
>FOR_EACH_SSAVAR to iterate over SSA variables.
>(pass_expand::execute): Likewise.
>* omp-simd-clone.c (ipa_simd_modify_function_body): Likewise.
>* tree-cfg.c (dump_function_to_file): Likewise.
>* tree-into-ssa.c (pass_build_ssa::execute): Likewise.
>(update_ssa): Likewise.
>* tree-ssa-alias.c (dump_alias_info): Likewise.
>* 

Re: [PATCH][v3] GIMPLE store merging pass

2016-09-07 Thread Richard Biener
On Tue, 6 Sep 2016, Jakub Jelinek wrote:

> On Tue, Sep 06, 2016 at 04:59:23PM +0100, Kyrill Tkachov wrote:
> > On 06/09/16 16:32, Jakub Jelinek wrote:
> > >On Tue, Sep 06, 2016 at 04:14:47PM +0100, Kyrill Tkachov wrote:
> > >>The v3 of this patch addresses feedback I received on the version posted 
> > >>at [1].
> > >>The merged store buffer is now represented as a char array that we splat 
> > >>values onto with
> > >>native_encode_expr and native_interpret_expr. This allows us to merge 
> > >>anything that native_encode_expr
> > >>accepts, including floating point values and short vectors. So this 
> > >>version extends the functionality
> > >>of the previous one in that it handles floating point values as well.
> > >>
> > >>The first phase of the algorithm that detects the contiguous stores is 
> > >>also slightly refactored according
> > >>to feedback to read more fluently.
> > >>
> > >>Richi, I experimented with merging up to MOVE_MAX bytes rather than word 
> > >>size but I got worse results on aarch64.
> > >>MOVE_MAX there is 16 (because it has load/store register pair 
> > >>instructions) but the 128-bit immediates that we ended
> > >>synthesising were too complex. Perhaps the TImode immediate store RTL 
> > >>expansions could be improved, but for now
> > >>I've left the maximum merge size to be BITS_PER_WORD.
> > >At least from playing with this kind of things in the RTL PR22141 patch,
> > >I remember storing 64-bit constants on x86_64 compared to storing 2 32-bit
> > >constants usually isn't a win (not just for speed optimized blocks but 
> > >also for
> > >-Os).  For 64-bit store if the constant isn't signed 32-bit or unsigned
> > >32-bit you need movabsq into some temporary register which has like 3 
> > >times worse
> > >latency than normal store if I remember well, and then store it.
> > 
> > We could restrict the maximum width of the stores generated to 32 bits on 
> > x86_64.
> > I think this would need another parameter or target macro for the target to 
> > set.
> > Alternatively, is it a possibility for x86 to be a bit smarter in its 
> > DImode mov-immediate
> > expansion? For example break up the 64-bit movabsq immediate into two 
> > SImode immediates?
> 
> If you want a 64-bit store, you'd need to merge the two, and that would be
> even more expensive.  It is a matter of say:
>   movl $0x12345678, (%rsp)
>   movl $0x09abcdef, 4(%rsp)
> vs.
>   movabsq $0x09abcdef12345678, %rax
>   movq %rax, (%rsp)
> vs.
>   movl $0x09abcdef, %eax
>   salq $32, %rax
>   orq $0x12345678, %rax
>   movq %rax, (%rsp)

vs.

movq $LC0, (%rsp)

?

> etc.  Guess it needs to be benchmarked on contemporary CPUs, I'm pretty sure
> the last sequence is the worst one.

I think the important part to notice is that it should be straightforward
for a target / the expander to split a large store from an immediate
into any of the above but very hard to do the opposite.  Thus from a
GIMPLE side "canonicalizing" to large stores (that are eventually
supported and well-aligned) seems best to me.
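
As a small illustration of why the split and merged forms are interchangeable
on a little-endian target such as x86_64, here is a sketch in plain C.  The
helper names are made up for this example and have nothing to do with the
pass itself; the byte order is written out explicitly so the check does not
depend on the host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store V at P in little-endian byte order, one byte at a time.  */
static void
store_le32 (unsigned char *p, uint32_t v)
{
  assert (p != NULL);
  for (int i = 0; i < 4; i++)
    p[i] = (unsigned char) (v >> (8 * i));
}

static void
store_le64 (unsigned char *p, uint64_t v)
{
  assert (p != NULL);
  for (int i = 0; i < 8; i++)
    p[i] = (unsigned char) (v >> (8 * i));
}

/* Constant for one 64-bit store replacing LO at offset 0 and HI at
   offset 4.  */
static uint64_t
merge_le (uint32_t lo, uint32_t hi)
{
  return ((uint64_t) hi << 32) | lo;
}

/* Nonzero if two 32-bit stores and the merged 64-bit store write the
   same eight bytes.  */
static int
merged_matches_split (uint32_t lo, uint32_t hi)
{
  unsigned char split[8], merged[8];
  store_le32 (split, lo);
  store_le32 (split + 4, hi);
  store_le64 (merged, merge_le (lo, hi));
  return memcmp (split, merged, sizeof split) == 0;
}
```

Going from the merged constant back to the two halves is just a shift and a
truncation, which is the "easy to split, hard to re-merge" asymmetry argued
above.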
 
> > >What alias set is used for the accesses if there are different alias sets
> > >involved in between the merged stores?
> > 
> > As per https://gcc.gnu.org/ml/gcc/2016-06/msg00162.html the type used in 
> > those cases
> > would be ptr_type_node. See the get_type_for_merged_store function in the 
> > patch.
> 
> Richi knows this best.  I just wonder if e.g. all the stores go into fields
> of the same structure it wouldn't be better if you need to punt use that
> structure as the type rather than alias set 0.

Well, yes - if the IL always accesses a common handled component you
can use the alias set of that component.  But it's some work to do
this correctly as you can have MEM[ + 4] = 1; which stores an 'int'
to a struct with two floats.   So you do have to be careful, also when
merging overlapping stores (in which case you'll certainly have an
irregular situation).

Which is why I suggested the "easy" approach above which should handle
a lot of cases well already.

> > I'm aware of that. The patch already has logic to avoid emitting unaligned 
> > accesses
> > for SLOW_UNALIGNED_ACCESS targets. Beyond that the patch introduces the 
> > parameter
> > PARAM_STORE_MERGING_ALLOW_UNALIGNED that can be used by the user or target 
> > to
> > forbid generation of unaligned stores by the pass altogether. Beyond that 
> > I'm not sure
> > how to behave more intelligently here. Any ideas?
> 
> Dunno, the heuristics was the main problem with my patch.  Generally, I'd
> say there is a difference between cold and hot blocks, in cold ones perhaps
> unaligned stores are more appropriate (if supported at all and not way too
> slow), while in hot ones less desirable.

Note that I repeatedly argue that if we can canonicalize sth to "larger"
then even if unaligned, the expander should be able to produce optimal
code again (it might not do, of course).

Hope to look at the patch in detail soon.


[Bug tree-optimization/20192] Unnecessary jump from &

2016-09-07 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=20192

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |5.0
   Severity|normal  |enhancement

--- Comment #2 from Andrew Pinski  ---
Fixed a while back, maybe even in GCC 4.9.x

Re: Make max_align_t respect _Float128 [version 2]

2016-09-07 Thread Florian Weimer

On 09/06/2016 10:40 PM, Joseph Myers wrote:

On Tue, 6 Sep 2016, Paul Eggert wrote:


One way to correct the code is to increase malloc's argument up to a multiple
of alignof(max_align_t). (One cannot portably use alignof(struct s) due to


Sounds like a defect in C11 to me - none of the examples of flexible array
members anticipate needing to add to the size to allow for tail padding
with unknown alignment requirements.


I agree, this is a defect in C99 and C11.  The language hasn't changed 
since C99, and C99 has the same issue because it's unrelated to 
alignment specifiers.  It's a confusion between struct sizes (which are 
multiples of the struct alignment) and object sizes (which are not 
necessarily so).
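
A minimal sketch of the workaround mentioned earlier in the thread (rounding
malloc's argument up to a multiple of alignof(max_align_t)), assuming C11.
struct s, round_up, padded_size and alloc_s are names invented for this
example, and overflow of the size computation is ignored for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Example type with a flexible array member.  */
struct s
{
  long double d;
  char tail[];
};

/* Round N up to a multiple of ALIGN, which must be a power of two.  */
static size_t
round_up (size_t n, size_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (n + align - 1) & ~(align - 1);
}

/* Bytes to request for a struct s carrying LEN tail bytes.  The raw sum
   offsetof + LEN is an object size and need not be a multiple of any
   alignment, so pad it up to the strictest fundamental alignment.  */
static size_t
padded_size (size_t len)
{
  return round_up (offsetof (struct s, tail) + len, _Alignof (max_align_t));
}

/* Allocate a struct s with room for LEN tail bytes; may return NULL.  */
static struct s *
alloc_s (size_t len)
{
  return malloc (padded_size (len));
}
```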


I have reopened PR1 with a more elaborate test case which shows that 
GCC packs objects more tightly than the struct alignment would permit.


Thanks,
Florian


[Bug rtl-optimization/77499] [7 Regression] Regression after code-hoisting, due to combine pass failing to evaluate known value range

2016-09-07 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77499

--- Comment #8 from rguenther at suse dot de  ---
On Wed, 7 Sep 2016, avieira at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77499
> 
> --- Comment #7 from avieira at gcc dot gnu.org ---
> if-convert is a no go here, for the reason Andrew pointed out, sorry missed
> that comment!
> 
> So I dont know... Only thing I can think of is better "value-range"-like
> analysis for combine, but that might be too costly?

So we are not really looking for combine to combine the shift stmt
with the xor stmt?  Because combine doesn't consider that because of
the multi-use.

> The fact is that for the code-hoisting to work here, the pseudo for r112 has 
> to
> be shared among both code-paths, so unless you add an extra move:
> 
> BB0:
> r112:SI = r0:SI
> 
> BB 1:
> ...
> r116:SI=r112:SI 0>>0x1
> rNEW:SI=zero_extend(r116:SI#0)
> ...
> if CC goto BB2 else BB Extra
> BB 2:
> r127:SI=rNEW:SI^r129:SI
> r112:SI=zero_extend(r127:SI#0)
> if LOOP: goto BB1 else BB exit
> BB EXTRA:
> r112:SI=rNEW:SI
> if LOOP: goto BB1 else BB exit
> 
> And you end up with an extra move rather than a zero_extend. But maybe the 
> move
> can be optimized away in later stages? Or maybe put in the same conditional
> execution block as the XOR...

Well, we run into a general issue of the RTL combiner -- fwprop and
ree are other passes that are supposed to remove extensions in some
cases.

Really, the user could have written the code in a way CSEing the
shift himself -- it's unfortunate that we now fail to optimize the
non-CSEd source but that can only be a reason to enhance downstream
passes.

RFA: Small PATCH to add pow2p_hwi to hwint.h

2016-09-07 Thread Jason Merrill
Various places in GCC use negate, bit-and and compare to test whether
an integer is a power of 2, but I think it would be clearer for this
test to be wrapped in a function.

OK for trunk?
commit e2ca9914ce46d56775854f50c21506b220fd50b6
Author: Jason Merrill 
Date:   Wed Sep 7 16:22:32 2016 -0400

* hwint.h (pow2p_hwi): New.

diff --git a/gcc/hwint.h b/gcc/hwint.h
index 6b4d537..3d85fc3 100644
--- a/gcc/hwint.h
+++ b/gcc/hwint.h
@@ -299,4 +299,12 @@ absu_hwi (HOST_WIDE_INT x)
   return x >= 0 ? (unsigned HOST_WIDE_INT)x : -(unsigned HOST_WIDE_INT)x;
 }
 
+/* True if X is a power of two.  */
+
+inline bool
+pow2p_hwi (unsigned HOST_WIDE_INT x)
+{
+  return (x & -x) == x;
+}
+
 #endif /* ! GCC_HWINT_H */
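
For readers who want to poke at the test itself, here is a standalone sketch
in plain C with uint64_t standing in for unsigned HOST_WIDE_INT (an assumption
for this example only).  x & -x isolates the lowest set bit, so the comparison
holds exactly when x has at most one bit set; note that this includes zero.
pow2p_nonzero is a hypothetical variant for callers that need to exclude zero
and is not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same expression as in the patch, on a fixed-width unsigned type.  */
static bool
pow2p (uint64_t x)
{
  return (x & -x) == x;
}

/* Hypothetical stricter variant: true only for 1, 2, 4, 8, ...  */
static bool
pow2p_nonzero (uint64_t x)
{
  return x != 0 && (x & (x - 1)) == 0;
}

/* A few spot checks of both predicates.  */
static void
pow2p_examples (void)
{
  assert (pow2p (1));
  assert (pow2p (8));
  assert (!pow2p (12));
  assert (pow2p (0));           /* Zero passes the patch's test.  */
  assert (!pow2p_nonzero (0));  /* ... but not the stricter one.  */
}
```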


Re: [x86] Disable STV pass if -mstackrealign is enabled.

2016-09-07 Thread H.J. Lu
On Wed, Aug 31, 2016 at 12:29 PM, Uros Bizjak  wrote:
>> the new STV pass generates SSE instructions in 32-bit mode very late in the
>> pipeline and doesn't bother about realigning the stack, so it wreaks havoc on
>> OSes where you need to realign the stack, e.g. Windows, but I guess Solaris 
>> is
>> equally affected.  Therefore the attached patch disables it if -mstackrealign
>> is enabled (the option is automatically enabled on Windows and Solaris when
>> SSE support is enabled), as already done for -mpreferred-stack-boundary={2,3}
>> and -mincoming-stack-boundary={2,3}.
>>
>> Tested on x86/Windows, OK for mainline and 6 branch?
>>
>>
>> 2016-08-31  Eric Botcazou  
>>
>>* config/i386/i386.c (ix86_option_override_internal): Also disable the
>>STV pass if -mstackrealign is enabled.
>
> OK for mainline and gcc-6 branch.
>

Is there a testcase to show the problem with -mincoming-stack-boundary=
on Linux?

-- 
H.J.


[Bug target/77483] [6/7 regression] gcc.target/i386/mask-unpack.c etc. FAIL

2016-09-07 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77483

H.J. Lu  changed:

   What|Removed |Added

 CC||hjl.tools at gmail dot com

--- Comment #5 from H.J. Lu  ---
(In reply to Eric Botcazou from comment #2)
> 
> Well, this patch is a workaround for a pass that wreaks serious havoc except
> on Linux.  Feel free to come up with a better solution...

Is there a bug report?

PING: Re: [PATCH, LRA] Fix PR rtl-optimization 77289, LRA matching constraint problem

2016-09-07 Thread Peter Bergner
Ping this patch:

https://gcc.gnu.org/ml/gcc-patches/2016-08/msg02099.html

Peter



Re: [PING] Re: [PATCH, i386] Fix some warnings/errors that appear when enabling -Wnarrowing when building gcc

2016-09-07 Thread Uros Bizjak
On Tue, Sep 6, 2016 at 8:06 PM, Eric Gallager  wrote:
> On 9/6/16, Uros Bizjak  wrote:
>> On Tue, Sep 6, 2016 at 5:33 PM, Eric Gallager  wrote:
>>> Ping? CC-ing an i386 maintainer since the patch mostly touches
>>> i386-specific files. Also, to clarify, I say "warnings/errors" because
>>> they start off as warnings in stage 1 but then become errors in stage
>>> 2. Note also that my patch leaves out the part where I modify the
>>> configure script to enable -Wnarrowing, because the rest of the code
>>> isn't quite ready for that yet.
>>
>> You are probably referring to [1]? It looks OK, modulo:
>>
>> +DEF_TUNE (X86_TUNE_QIMODE_MATH, "qimode_math", ~(0U))
>>
>> where parenthesis are not needed.
>>
>>
>> Please resubmit the patch with a ChangeLog entry, as instructed in [2]
>>
>> [1] https://gcc.gnu.org/ml/gcc-patches/2016-08/msg02129.html
>> [2] https://gcc.gnu.org/contribute.html#patches
>>
>> Uros.
>>
>
>
> Okay, reattached. Here's a ChangeLog entry to put in gcc/ChangeLog:
>
> 2016-09-06  Eric Gallager  
>
> * config/i386/i386.c: Add 'U' suffix to constants to avoid
> -Wnarrowing.
> * config/i386/x86-tune.def: Likewise.
> * opts.c: Likewise.
>
>
> (Please also note that I don't have commit access.)

Thanks, committed with slightly adjusted ChangeLog:

2016-09-07  Eric Gallager  

* config/i386/i386.c: Add 'U' suffix to processor feature bits
to avoid -Wnarrowing warning.
* config/i386/x86-tune.def: Likewise for DEF_TUNE selector bitmasks.
* opts.c: Likewise for SANITIZER_OPT bitmasks.

Uros.


Re: [PATCH] Detect whether target can use -fprofile-update=atomic

2016-09-07 Thread Christophe Lyon
On 7 September 2016 at 11:34, Martin Liška  wrote:
> On 09/07/2016 09:45 AM, Christophe Lyon wrote:
>> On 6 September 2016 at 15:45, Martin Liška  wrote:
>>> On 09/06/2016 03:31 PM, Jakub Jelinek wrote:
 sizeof (gcov_type) talks about the host gcov type, you want instead the
 target gcov type.  So
 TYPE_SIZE (gcov_type_node) == 32 vs. 64 (or TYPE_SIZE_UNIT (gcov_type_node)
 == 4 vs. 8).
 As SImode and DImode are in fact 4*BITS_PER_UNIT and 8*BITS_PER_UNIT,
 TYPE_SIZE_UNIT comparisons for 4 and 8 are most natural.
 And I wouldn't add gcc_unreachable, just warn for weirdo arches always.

   Jakub
>>>
>>> Thank you Jakub for helping me with that. I've used TYPE_SIZE_UNIT macro.
>>>
>>> Ready for trunk?
>>> Martin
>>
>> Hi Martin,
>>
>> On targets which do not support atomic profile update, your patch generates a
>> warning on gcc.dg/tree-prof/val-profiler-threads-1.c, making it fail.
>>
>> Do we need a new effective-target ?
>>
>> Christophe
>>
>
> Hi.
>
> Thanks for observation, I'm sending a patch that does that.
> Can you please test it?
>
It does work indeed, thanks.
(tested on arm* targets)

Christophe

> Thanks,
> Martin


[Bug target/63346] xserver_xorg-server-1.15.1 crash on RaspberryPi when compiled with gcc-4.9

2016-09-07 Thread ps.report at gmx dot net
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63346

--- Comment #5 from Peter Seiderer  ---
Seems to be fixed in 5.4.0, tested with the original buildroot/xserver/dillo
testcase (with up to date buildroot) and the provided fbpict.c testcase.

[Patch libgcc] Enable HCmode multiply and divide (mulhc3/divhc3)

2016-09-07 Thread James Greenhalgh

Hi,

This patch arranges for half-precision complex multiply and divide
routines to be built if __LIBGCC_HAS_HF_MODE__.  This will be true
if the target supports the _Float16 type.

OK?

Thanks,
James

---

libgcc/

2016-09-07  James Greenhalgh  

* Makefile.in (lib2funcs): Build _mulhc3 and _divhc3.
* libgcc2.h (LIBGCC_HAS_HF_MODE): Conditionally define.
(HFtype): Likewise.
(HCtype): Likewise.
(__divhc3): Likewise.
(__mulhc3): Likewise.
* libgcc2.c: Support _mulhc3 and _divhc3.

diff --git a/libgcc/Makefile.in b/libgcc/Makefile.in
index ba37c65..53e3ea2 100644
--- a/libgcc/Makefile.in
+++ b/libgcc/Makefile.in
@@ -414,8 +414,9 @@ lib2funcs = _muldi3 _negdi2 _lshrdi3 _ashldi3 _ashrdi3 _cmpdi2 _ucmpdi2	   \
 	_negvsi2 _negvdi2 _ctors _ffssi2 _ffsdi2 _clz _clzsi2 _clzdi2  \
 	_ctzsi2 _ctzdi2 _popcount_tab _popcountsi2 _popcountdi2	   \
 	_paritysi2 _paritydi2 _powisf2 _powidf2 _powixf2 _powitf2	   \
-	_mulsc3 _muldc3 _mulxc3 _multc3 _divsc3 _divdc3 _divxc3	   \
-	_divtc3 _bswapsi2 _bswapdi2 _clrsbsi2 _clrsbdi2
+	_mulhc3 _mulsc3 _muldc3 _mulxc3 _multc3 _divhc3 _divsc3	   \
+	_divdc3 _divxc3 _divtc3 _bswapsi2 _bswapdi2 _clrsbsi2	   \
+	_clrsbdi2
 
 # The floating-point conversion routines that involve a single-word integer.
 # XX stands for the integer mode.
diff --git a/libgcc/libgcc2.c b/libgcc/libgcc2.c
index 0a716bf..ec3b21f 100644
--- a/libgcc/libgcc2.c
+++ b/libgcc/libgcc2.c
@@ -1852,7 +1852,8 @@ NAME (TYPE x, int m)
 
 #endif
 
-#if ((defined(L_mulsc3) || defined(L_divsc3)) && LIBGCC2_HAS_SF_MODE) \
+#if((defined(L_mulhc3) || defined(L_divhc3)) && LIBGCC2_HAS_HF_MODE) \
+|| ((defined(L_mulsc3) || defined(L_divsc3)) && LIBGCC2_HAS_SF_MODE) \
 || ((defined(L_muldc3) || defined(L_divdc3)) && LIBGCC2_HAS_DF_MODE) \
 || ((defined(L_mulxc3) || defined(L_divxc3)) && LIBGCC2_HAS_XF_MODE) \
 || ((defined(L_multc3) || defined(L_divtc3)) && LIBGCC2_HAS_TF_MODE)
@@ -1861,7 +1862,13 @@ NAME (TYPE x, int m)
 #undef double
 #undef long
 
-#if defined(L_mulsc3) || defined(L_divsc3)
+#if defined(L_mulhc3) || defined(L_divhc3)
+# define MTYPE	HFtype
+# define CTYPE	HCtype
+# define MODE	hc
+# define CEXT	__LIBGCC_HF_FUNC_EXT__
+# define NOTRUNC __LIBGCC_HF_EXCESS_PRECISION__
+#elif defined(L_mulsc3) || defined(L_divsc3)
 # define MTYPE	SFtype
 # define CTYPE	SCtype
 # define MODE	sc
@@ -1922,7 +1929,7 @@ extern void *compile_type_assert[sizeof(INFINITY) == sizeof(MTYPE) ? 1 : -1];
 # define TRUNC(x)	__asm__ ("" : "=m"(x) : "m"(x))
 #endif
 
-#if defined(L_mulsc3) || defined(L_muldc3) \
+#if defined(L_mulhc3) || defined(L_mulsc3) || defined(L_muldc3) \
 || defined(L_mulxc3) || defined(L_multc3)
 
 CTYPE
@@ -1992,7 +1999,7 @@ CONCAT3(__mul,MODE,3) (MTYPE a, MTYPE b, MTYPE c, MTYPE d)
 }
 #endif /* complex multiply */
 
-#if defined(L_divsc3) || defined(L_divdc3) \
+#if defined(L_divhc3) || defined(L_divsc3) || defined(L_divdc3) \
 || defined(L_divxc3) || defined(L_divtc3)
 
 CTYPE
diff --git a/libgcc/libgcc2.h b/libgcc/libgcc2.h
index 72bb873..c46fb77 100644
--- a/libgcc/libgcc2.h
+++ b/libgcc/libgcc2.h
@@ -34,6 +34,12 @@ extern void __clear_cache (char *, char *);
 extern void __eprintf (const char *, const char *, unsigned int, const char *)
   __attribute__ ((__noreturn__));
 
+#ifdef __LIBGCC_HAS_HF_MODE__
+#define LIBGCC2_HAS_HF_MODE 1
+#else
+#define LIBGCC2_HAS_HF_MODE 0
+#endif
+
 #ifdef __LIBGCC_HAS_SF_MODE__
 #define LIBGCC2_HAS_SF_MODE 1
 #else
@@ -133,6 +139,10 @@ typedef unsigned int UTItype	__attribute__ ((mode (TI)));
 #endif
 #endif
 
+#if LIBGCC2_HAS_HF_MODE
+typedef		float HFtype	__attribute__ ((mode (HF)));
+typedef _Complex float HCtype	__attribute__ ((mode (HC)));
+#endif
 #if LIBGCC2_HAS_SF_MODE
 typedef 	float SFtype	__attribute__ ((mode (SF)));
 typedef _Complex float SCtype	__attribute__ ((mode (SC)));
@@ -424,6 +434,10 @@ extern SItype __negvsi2 (SItype);
 #endif /* COMPAT_SIMODE_TRAPPING_ARITHMETIC */
 
 #undef int
+#if LIBGCC2_HAS_HF_MODE
+extern HCtype __divhc3 (HFtype, HFtype, HFtype, HFtype);
+extern HCtype __mulhc3 (HFtype, HFtype, HFtype, HFtype);
+#endif
 #if LIBGCC2_HAS_SF_MODE
 extern DWtype __fixsfdi (SFtype);
 extern SFtype __floatdisf (DWtype);
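
For readers unfamiliar with how the MODE setting becomes a function name: the CONCAT3(__mul,MODE,3) visible in the hunk headers above relies on token pasting. The following is only a sketch under stated assumptions. The PASTE3 macros, the demo_ prefix, the struct standing in for HCtype and the use of float in place of HFtype are all illustrative and not taken from libgcc, and the NaN/infinity recovery that the real routine performs is left out.

```c
/* Two-level paste so that MODE is macro-expanded before ## is applied.
   (Illustrative names; not the macros used in libgcc2.c.)  */
#define PASTE3_(A, B, C) A##B##C
#define PASTE3(A, B, C) PASTE3_ (A, B, C)

/* Stand-ins: float replaces HFtype, a plain struct replaces HCtype.  */
typedef float MTYPE;
typedef struct { MTYPE re, im; } CTYPE;
#define MODE hc

/* Expands to demo_mulhc3.  Textbook (a+ib)(c+id) only.  */
static CTYPE
PASTE3 (demo_mul, MODE, 3) (MTYPE a, MTYPE b, MTYPE c, MTYPE d)
{
  CTYPE r;
  r.re = a * c - b * d;
  r.im = a * d + b * c;
  return r;
}
```

Calling demo_mulhc3 (1, 2, 3, 4) multiplies 1+2i by 3+4i. Changing MODE to sc would produce demo_mulsc3 from the same body, which is the property the patch depends on.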


[Bug c++/77513] -Wzero-as-null-pointer-constant vs 0, nullptr, NULL and __null

2016-09-07 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77513

--- Comment #3 from Jonathan Wakely  ---
(In reply to petschy from comment #0)
> For c++11 and later code, why is NULL defined as __null, rather than nullptr?

Because defining NULL as nullptr would violate the requirements of the
standard, which very intentionally says that NULL is an integral constant
expression, not nullptr.

[Bug fortran/60483] [5/6/7 Regression] No IMPLICIT type error with: ASSOCIATE( X => derived_type() ) [i.e. w/ structure constructor]

2016-09-07 Thread vehre at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60483

vehre at gcc dot gnu.org changed:

           What    |Removed                        |Added
----------------------------------------------------------------------------
             Status|NEW                            |ASSIGNED
                 CC|                               |vehre at gcc dot gnu.org
           Assignee|unassigned at gcc dot gnu.org  |vehre at gcc dot gnu.org

[Bug fortran/66459] bogus warning 'w.offset' may be used uninitialized in this function [-Wmaybe-uninitialized]

2016-09-07 Thread manu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66459

Manuel López-Ibáñez  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |manu at gcc dot gnu.org

--- Comment #3 from Manuel López-Ibáñez  ---
If you do

gfortran -Wuninitialized test.f90 -fdump-tree-all-all-lineno -O1

and look at test.f90.162t.uninit1, we see:

  # .MEM_24 = PHI <.MEM_15(D)(28), .MEM_68(37)>
  # w$dim$1$stride_46 = PHI 
  # w$offset_26 = PHI 

but this code is transformed by optimization. The unoptimized SSA contains:

  [test.f90:9:0] # VUSE <.MEM_18>
  _22 = [test.f90:9:0] *m_21(D);
  [test.f90:9:0] _23 = MAX_EXPR <_22, 0>;
  [test.f90:9:0] _24 = (integer(kind=8)D.9) _23;
...
  [test.f90:9:0] _39 = _24;
...
  [test.f90:9:0] _68 = ~_39;
...
  [test.f90:9:0] wD.3400.offsetD.3387 = _68;

and the gimple generated by Fortran contains something similar:

[test.f90:9:0] D.3429 = [test.f90:9:0] *mD.3381;
[test.f90:9:0] D.3430 = MAX_EXPR ;
[test.f90:9:0] D.3402 = (integer(kind=8)D.9) D.3430;

It seems that *mD.3381 is not initialized.

(It is very strange that gfortran converts user-defined variables to lowercase.
It makes reading the dumps more difficult. It also does many unnecessary
copies, making the code harder to analyze.)

[Bug c++/71710] [7 Regression] ICE on valid C++11 code with decltype and alias template: in lookup_member, at cp/search.c:1255

2016-09-07 Thread su at cs dot ucdavis.edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71710

--- Comment #3 from Zhendong Su  ---
A related but simpler test that triggers the same ICE:

template < typename > struct A
{
  A a;
  template < int > using B = decltype (a);
  B < 0 > b;
};

[Bug middle-end/77475] unnecessary or misleading context in reporting command line problems

2016-09-07 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77475

--- Comment #4 from Christophe Lyon  ---
Author: clyon
Date: Wed Sep  7 20:18:17 2016
New Revision: 240030

URL: https://gcc.gnu.org/viewcvs?rev=240030&root=gcc&view=rev
Log:
PR middle-end/77475: Fix AArch64 testcases.

2016-09-07  Jakub Jelinek  

PR middle-end/77475
* gcc.target/aarch64/arch-diagnostics-1.c: Expect error on line 0.
* gcc.target/aarch64/arch-diagnostics-2.c: Likewise.
* gcc.target/aarch64/cpu-diagnostics-1.c: Likewise.
* gcc.target/aarch64/cpu-diagnostics-2.c: Likewise.
* gcc.target/aarch64/cpu-diagnostics-3.c: Likewise.
* gcc.target/aarch64/cpu-diagnostics-4.c: Likewise.


Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/aarch64/arch-diagnostics-1.c
trunk/gcc/testsuite/gcc.target/aarch64/arch-diagnostics-2.c
trunk/gcc/testsuite/gcc.target/aarch64/cpu-diagnostics-1.c
trunk/gcc/testsuite/gcc.target/aarch64/cpu-diagnostics-2.c
trunk/gcc/testsuite/gcc.target/aarch64/cpu-diagnostics-3.c
trunk/gcc/testsuite/gcc.target/aarch64/cpu-diagnostics-4.c

Re: [PATCH][v3] GIMPLE store merging pass

2016-09-07 Thread Bernhard Reutner-Fischer
On September 6, 2016 5:14:47 PM GMT+02:00, Kyrill Tkachov wrote:
>Hi all,

s/contigous/contiguous/
s/ where where/ where/

+struct merged_store_group
+{
+  HOST_WIDE_INT start;
+  HOST_WIDE_INT width;
+  unsigned char *val;
+  unsigned int align;
+  auto_vec stores;
+  /* We record the first and last original statements in the sequence because
+ because we'll need their vuse/vdef and replacement position.  */
+  gimple *last_stmt;

s/ because because/ because/

Why aren't these two HWIs unsigned, likewise in store_immediate_info and in 
most other spots in the patch?

+ fprintf (dump_file, "Afer writing ");
s/Afer /After /

/access if prohibitively slow/s/ if / is /

I'd get rid of successful_p in imm_store_chain_info::output_merged_stores.


+unsigned int
+pass_store_merging::execute (function *fun)
+{
+  basic_block bb;
+  hash_set orig_stmts;
+
+  FOR_EACH_BB_FN (bb, fun)
+{
+  gimple_stmt_iterator gsi;
+  HOST_WIDE_INT num_statements = 0;
+  /* Record the original statements so that we can keep track of
+statements emitted in this pass and not re-process new
+statements.  */
+  for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+   {
+ gimple_set_visited (gsi_stmt (gsi), false);
+ num_statements++;
+   }
+
+  if (num_statements < 2)
+   continue;

What about debug statements? ISTM you should skip those.
(Isn't visited reset before entry of a pass?)

Maybe I missed the bikeshedding about the name but I'd have used -fmerge-stores 
instead.

Thanks,
>
>The v3 of this patch addresses feedback I received on the version
>posted at [1].
>The merged store buffer is now represented as a char array that we
>splat values onto with
>native_encode_expr and native_interpret_expr. This allows us to merge
>anything that native_encode_expr
>accepts, including floating point values and short vectors. So this
>version extends the functionality
>of the previous one in that it handles floating point values as well.
>
>The first phase of the algorithm that detects the contiguous stores is
>also slightly refactored according
>to feedback to read more fluently.
>
>Richi, I experimented with merging up to MOVE_MAX bytes rather than
>word size but I got worse results on aarch64.
>MOVE_MAX there is 16 (because it has load/store register pair
>instructions) but the 128-bit immediates that we ended
>synthesising were too complex. Perhaps the TImode immediate store RTL
>expansions could be improved, but for now
>I've left the maximum merge size to be BITS_PER_WORD.
>
>I've disabled the pass for PDP-endian targets as the merging code
>proved to be quite fiddly to get right for different
>endiannesses and I didn't feel comfortable writing logic for
>BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN targets without serious
>testing capabilities. I hope that's ok (I note the bswap pass also
>doesn't try to do anything on such targets).
>
>Tested on arm, aarch64, x86_64 and on big-endian arm and aarch64.
>
>How does this version look?
>Thanks,
>Kyrill
>
>[1] https://gcc.gnu.org/ml/gcc-patches/2016-08/msg01512.html
>
>2016-09-06  Kyrylo Tkachov  
>
> PR middle-end/22141
> * Makefile.in (OBJS): Add gimple-ssa-store-merging.o.
> * common.opt (fstore-merging): New Optimization option.
> * opts.c (default_options_table): Add entry for
> OPT_ftree_store_merging.
> * params.def (PARAM_STORE_MERGING_ALLOW_UNALIGNED): Define.
> * passes.def: Insert pass_tree_store_merging.
> * tree-pass.h (make_pass_store_merging): Declare extern
> prototype.
> * gimple-ssa-store-merging.c: New file.
> * doc/invoke.texi (Optimization Options): Document
> -fstore-merging.
>
>2016-09-06  Kyrylo Tkachov  
> Jakub Jelinek  
>
> PR middle-end/22141
> * gcc.c-torture/execute/pr22141-1.c: New test.
> * gcc.c-torture/execute/pr22141-2.c: Likewise.
> * gcc.target/aarch64/ldp_stp_1.c: Adjust for -fstore-merging.
> * gcc.target/aarch64/ldp_stp_4.c: Likewise.
> * gcc.dg/store_merging_1.c: New test.
> * gcc.dg/store_merging_2.c: Likewise.
> * gcc.dg/store_merging_3.c: Likewise.
> * gcc.dg/store_merging_4.c: Likewise.
> * gcc.dg/store_merging_5.c: Likewise.
> * gcc.dg/store_merging_6.c: Likewise.
> * gcc.target/i386/pr22141.c: Likewise.
> * gcc.target/i386/pr34012.c: Add -fno-store-merging to dg-options.




Re: [PATCH][v3] GIMPLE store merging pass

2016-09-07 Thread Jakub Jelinek
On Wed, Sep 07, 2016 at 10:19:11AM +0200, Richard Biener wrote:
> > If you want a 64-bit store, you'd need to merge the two, and that would be
> > even more expensive.  It is a matter of say:
> > movl $0x12345678, (%rsp)
> > movl $0x09abcdef, 4(%rsp)
> > vs.
> > movabsq $0x09abcdef12345678, %rax
> > movq %rax, (%rsp)
> > vs.
> > movl $0x09abcdef, %eax
> > salq $32, %rax
> > orq $0x12345678, %rax
> > movq %rax, (%rsp)
> 
> vs.
> 
> movq $LC0, (%rsp)

You don't want to store the address, so you'd use
movq .LC0, %rax
movq %rax, (%rsp)
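
To make the constant in the movabsq variant concrete, here is a small sketch of how the two 32-bit immediates combine into one 64-bit value on a little-endian target such as x86_64. The helper names are made up for illustration, and the byte order is written out explicitly so the result does not depend on the host.

```c
#include <stdint.h>

/* Write a 32-bit value into BUF in little-endian byte order.  */
static void
put_le32 (unsigned char *buf, uint32_t v)
{
  for (int i = 0; i < 4; i++)
    buf[i] = (unsigned char) (v >> (8 * i));
}

/* Read a 64-bit little-endian value back out of BUF.  */
static uint64_t
get_le64 (const unsigned char *buf)
{
  uint64_t v = 0;
  for (int i = 7; i >= 0; i--)
    v = (v << 8) | buf[i];
  return v;
}

/* Two adjacent 32-bit stores viewed as a single 64-bit immediate.  */
static uint64_t
merge_two_stores (uint32_t at_offset_0, uint32_t at_offset_4)
{
  unsigned char buf[8];
  put_le32 (buf, at_offset_0);
  put_le32 (buf + 4, at_offset_4);
  return get_le64 (buf);
}
```

With the values from the example, merge_two_stores (0x12345678, 0x09abcdef) gives 0x09abcdef12345678, the immediate used by movabsq above.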

> I think the important part to notice is that it should be straightforward
> for a target / the expander to split a large store from an immediate
> into any of the above but very hard to do the opposite.  Thus from a
> GIMPLE side "canonicalizing" to large stores (that are eventually
> supported and well-aligned) seems best to me.

I bet many programs assume that, say, a 64-bit aligned store in the source is
atomic in 64-bit apps, without using __atomic_store (..., __ATOMIC_RELAXED),
so such a change would break that.
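
For comparison, the explicit form looks roughly like the sketch below. It uses the by-value variants __atomic_store_n and __atomic_load_n of the GCC builtins rather than the pointer-based __atomic_store mentioned above. The variable and function names are illustrative only, and on 32-bit targets a 64-bit atomic may need libatomic.

```c
#include <stdint.h>

static uint64_t shared_word;

/* Store V so that no reader can observe a half-written value,
   without imposing any ordering on surrounding accesses.  */
static void
publish (uint64_t v)
{
  __atomic_store_n (&shared_word, v, __ATOMIC_RELAXED);
}

/* Matching relaxed load.  */
static uint64_t
observe (void)
{
  return __atomic_load_n (&shared_word, __ATOMIC_RELAXED);
}
```

Code written this way does not depend on how the compiler chooses to expand a plain 64-bit assignment.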

Jakub


Re: Make max_align_t respect _Float128 [version 2]

2016-09-07 Thread Bernd Edlinger
On 09/07/16 22:04, Joseph Myers wrote:
> On Wed, 7 Sep 2016, Bernd Edlinger wrote:
>
>> interesting.  I just tried the test case from PR 77330 with _Decimal128.
>> result: _Decimal128 did *not* trap with gcc4.8.4, but it does trap with
>> gcc-7.0.0.
>
> I checked with GCC 4.3; __alignof__ (_Decimal128) was 16 back then.
> Whether particular code happens to make use of that alignment requirement
> is inherently unpredictable.
>

Oh, now I see...

Alignof(_Decimal128) was 16, but gcc4.8 did not enable -msse, and
that must have changed.  When I use gcc -m32 -msse the test case
starts to fail with gcc-4.8.4.

With gcc-7.0.0 -m32 -mno-sse fixes the test case but the alignment is
still 16, as you already said.

Apparently the different -msse default setting made the situation worse.
I think that will not run on a pentium4 any more.


Bernd.


Re: [Fortran, Patch] First patch for coarray FAILED IMAGES (TS 18508)

2016-09-07 Thread Alessandro Fanfarillo
Dear all,
the attached patch supports failed images also when -fcoarray=single is used.

Built and regtested on x86_64-pc-linux-gnu.

Cheers,
Alessandro

2016-08-09 5:22 GMT-06:00 Paul Richard Thomas :
> Hi Sandro,
>
> As far as I can see, this is OK barring a couple of minor wrinkles and
> a question:
>
> For coarray_failed_images_err.f90 and coarray_image_status_err.f90 you
> have used the option -fdump-tree-original without making use of the
> tree dump.
>
> Mikael asked you to provide an executable test with -fcoarray=single.
> Is this not possible for some reason?
>
> Otherwise, this is OK for trunk.
>
> Thanks for the patch.
>
> Paul
>
> On 4 August 2016 at 05:07, Alessandro Fanfarillo
>  wrote:
>> * PING *
>>
>> 2016-07-21 13:05 GMT-06:00 Alessandro Fanfarillo :
>>> Dear Mikael and all,
>>>
>>> in attachment the new patch, built and regtested on x86_64-pc-linux-gnu.
>>>
>>> Cheers,
>>> Alessandro
>>>
>>> 2016-07-20 13:17 GMT-06:00 Mikael Morin :
 On 20/07/2016 at 11:39, Andre Vehreschild wrote:
>
> Hi Mikael,
>
>
>>> +  if(st == ST_FAIL_IMAGE)
>>> +new_st.op = EXEC_FAIL_IMAGE;
>>> +  else
>>> +gcc_unreachable();
>>
>> You can use
>> gcc_assert (st == ST_FAIL_IMAGE);
>> foo...;
>> instead of
>> if (st == ST_FAIL_IMAGE)
>> foo...;
>> else
>> gcc_unreachable ();
>
>
> Be careful, this is not 100% identical in the general case. For older
> gcc version (gcc < 4008) gcc_assert() is mapped to nothing, esp. not to
> an abort(), so the behavior can change. But in this case everything is
> fine, because the patch is most likely not backported.
>
 Didn't know about this. The difference seems to be very subtle.
 I don't mind much anyway. The original version can stay if preferred, this
 was just a suggestion.

 By the way, if the function is inlined in its single caller, the assert or
 unreachable statement can be removed, which avoids choosing between them.
 That's another suggestion.


>>> +
>>> +  return MATCH_YES;
>>> +
>>> + syntax:
>>> +  gfc_syntax_error (st);
>>> +
>>> +  return MATCH_ERROR;
>>> +}
>>> +
>>> +match
>>> +gfc_match_fail_image (void)
>>> +{
>>> +  /* if (!gfc_notify_std (GFC_STD_F2008_TS, "FAIL IMAGE statement
>>> at %C")) */
>>> +  /*   return MATCH_ERROR; */
>>> +
>>
>> Can this be uncommented?
>>
>>> +  return fail_image_statement (ST_FAIL_IMAGE);
>>> +}
>>>
>>>  /* Match LOCK/UNLOCK statement. Syntax:
>>>   LOCK ( lock-variable [ , lock-stat-list ] )
>>> diff --git a/gcc/fortran/trans-intrinsic.c
>>> b/gcc/fortran/trans-intrinsic.c index 1aaf4e2..b2f5596 100644
>>> --- a/gcc/fortran/trans-intrinsic.c
>>> +++ b/gcc/fortran/trans-intrinsic.c
>>> @@ -1647,6 +1647,24 @@ trans_this_image (gfc_se * se, gfc_expr
>>> *expr) m, lbound));
>>>  }
>>>
>>> +static void
>>> +gfc_conv_intrinsic_image_status (gfc_se *se, gfc_expr *expr)
>>> +{
>>> +  unsigned int num_args;
>>> +  tree *args,tmp;
>>> +
>>> +  num_args = gfc_intrinsic_argument_list_length (expr);
>>> +  args = XALLOCAVEC (tree, num_args);
>>> +
>>> +  gfc_conv_intrinsic_function_args (se, expr, args, num_args);
>>> +
>>> +  if (flag_coarray == GFC_FCOARRAY_LIB)
>>> +{
>>
>> Can everything be put under the if?
>> Does it work with -fcoarray=single?
>
>
> IMO coarray=single should not generate code here, therefore putting
> everything under the if should be fine.
>
 My point was more avoiding generating code for the arguments if they are 
 not
 used in the end.
 Regarding the -fcoarray=single case, the function returns a result, which
 can be used in an expression, so I don't think it will work without at 
 least
 hardcoding a fixed value as result in that case.
 But even that wouldn't be enough, as the function wouldn't work 
 consistently
 with the fail image statement.

> Sorry for the comments ...
>
 Comments are welcome here, as far as I know. ;-)

 Mikael
>
>
>
> --
> The difference between genius and stupidity is; genius has its limits.
>
> Albert Einstein
commit 13213642603b4941a2e4ea085b0bfd5cb37f
Author: Alessandro Fanfarillo 
Date:   Wed Sep 7 13:00:17 2016 -0600

Second Review of failed image patch

diff --git a/gcc/fortran/check.c b/gcc/fortran/check.c
index ff5e80b..110bec0 100644
--- a/gcc/fortran/check.c
+++ b/gcc/fortran/check.c
@@ -1217,6 +1217,82 @@ gfc_check_event_query (gfc_expr *event, gfc_expr *count, 
gfc_expr *stat)
   return true;
 }
 
+bool
+gfc_check_image_status (gfc_expr *image, gfc_expr 

[Bug fortran/48298] [F03] User-Defined Derived-Type IO (DTIO)

2016-09-07 Thread pault at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48298

--- Comment #19 from Paul Thomas  ---
Author: pault
Date: Wed Sep  7 21:21:16 2016
New Revision: 240032

URL: https://gcc.gnu.org/viewcvs?rev=240032&root=gcc&view=rev
Log:
2016-09-07  Dominique Dhumieres  

PR fortran/48298
* gfortran.dg/assumed_rank_12.f90: Correct tree scan.
* gfortran.dg/assumed_type_2.f90: Correct tree scans.
* gfortran.dg/coarray_lib_comm_1.f90: Likewise.
* gfortran.dg/coarray_lib_this_image_2.f90: Likewise.
* gfortran.dg/coarray_lock_7.f90: Likewise.
* gfortran.dg/coarray_stat_function.f90: Likewise.
* gfortran.dg/no_arg_check_2.f90: Likewise.
* gfortran.dg/pr32921.f: Likewise.

Modified:
branches/fortran-dev/gcc/testsuite/ChangeLog.fortran-dev
branches/fortran-dev/gcc/testsuite/gfortran.dg/assumed_rank_12.f90
branches/fortran-dev/gcc/testsuite/gfortran.dg/assumed_type_2.f90
branches/fortran-dev/gcc/testsuite/gfortran.dg/coarray_lib_comm_1.f90
branches/fortran-dev/gcc/testsuite/gfortran.dg/coarray_lib_this_image_2.f90
branches/fortran-dev/gcc/testsuite/gfortran.dg/coarray_lock_7.f90
branches/fortran-dev/gcc/testsuite/gfortran.dg/coarray_stat_function.f90
branches/fortran-dev/gcc/testsuite/gfortran.dg/no_arg_check_2.f90
branches/fortran-dev/gcc/testsuite/gfortran.dg/pr32921.f

[Bug c/77521] New: %qc format directive should quote non-printable characters

2016-09-07 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77521

Bug ID: 77521
   Summary: %qc format directive should quote non-printable
characters
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: msebor at gcc dot gnu.org
  Target Milestone: ---

The GCC-specific %qc format directive prints its character argument in quotes. 
One might expect the directive to quote non-printable characters similarly to
the %s directive but that's not what happens.  %qc prints the character as is. 
As a result, callers of the warning_at and error APIs that use the %qc
directive must be careful not to call it with non-printable characters (for
instance, by using the ISGRAPH() macro as done in c-family/c-format.c) and use
a different directive for those.  Those that don't might end up corrupting the
compiler stderr output as in the test case below.  Those that are careful end
up using a different directive (e.g., %x or %o), resulting in
inconsistent diagnostics.
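
As a rough illustration of the behaviour being asked for, the sketch below prints graphic characters as they are and everything else as a hex escape. It is a hypothetical stand-alone helper, not a GCC pretty-printer API. It uses the standard isgraph() instead of GCC's ISGRAPH() macro and plain ASCII quotes instead of the locale-dependent ones.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: format C into BUF, quoted, using a hex escape
   when the character is not graphic.  Returns snprintf's result.  */
static int
format_quoted_char (char *buf, size_t size, int c)
{
  unsigned char uc = (unsigned char) c;

  if (isgraph (uc))
    return snprintf (buf, size, "'%c'", uc);
  return snprintf (buf, size, "'\\x%x'", uc);
}
```

With this, a newline in a constraint string would be reported as '\xa' instead of splitting the diagnostic across two lines.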

This bug is to change the %qc directive in the GCC pretty printer to format
non-printable characters using some other directive than the C %c (for example,
"\x%x" as done in c-family/c-format.c).

$ cat t.c && /build/gcc-trunk/gcc/xgcc -B /build/gcc-trunk/gcc -S -Wformat t.c
void g (int foo, int bar)
{
  asm ("combine %2, %0" : "=r" (foo) : "0" (foo), "\n" (bar));
}
t.c: In function ‘g’:
t.c:3:3: error: invalid punctuation ‘
’ in constraint
   asm ("combine %2, %0" : "=r" (foo) : "0" (foo), "\n" (bar));
   ^~~
t.c:3:3: error: invalid punctuation ‘
’ in constraint


The following is a list of GCC formatted output functions with the %qc
directive:

$ find gcc -name "*.c" ! -path "*/testsuite/*" | xargs grep "%qc"
gcc/fortran/io.c:  const char *unexpected_element  = _("Unexpected element %qc
in format "
gcc/fortran/matchexp.c: gfc_error ("Bad character %qc in OPERATOR name at %C",
name[i]);
gcc/fortran/symbol.c: gfc_error ("Letter %qc already set in IMPLICIT
statement at %C",
gcc/fortran/symbol.c: gfc_error ("Letter %qc already has an IMPLICIT
type at %C",
gcc/c-family/c-lex.c: error_at (*loc, "stray %qc in program", (int) c);
gcc/c-family/c-format.c:" %qc in format",
gcc/c-family/c-format.c:  "use of %qs length
modifier with %qc type"
grep: gcc/cp/.#decl.c: No such file or directory
gcc/stmt.c: warning (0, "output constraint %qc for operand %d "
gcc/stmt.c: error ("input operand constraint contains %qc",
constraint[j]);
gcc/stmt.c: error ("invalid punctuation %qc in constraint",
constraint[j]);
gcc/config/mmix/mmix.c:  internal_error ("MMIX Internal: Missing %qc case
in mmix_print_operand", code);
gcc/config/avr/driver-avr.c:error ("strange device name %qs after %qs:
bad character %qc",
gcc/gcc.c:  error ("spec failure: unrecognized spec option %qc", c);
gcc/gcc.c:  fatal_error (input_location, "braced spec %qs is invalid at %qc",
orig, *p);

Re: Ping**2! Re: [PATCH, Fortran] Extension: AUTOMATIC/STATIC symbol attributes with -fdec-static

2016-09-07 Thread Fritz Reese
On Wed, Sep 7, 2016 at 9:31 AM, Andre Vehreschild  wrote:
> Hi Fritz,
>
> please note: I do not have official review privileges. So my vote here
> is rather advice to you and the official reviewers. Often such an
> unofficial review helps to speed things up, because the official ones
> are pointed to the nics and nacs and don't have to bother with the
> minor things.

Andre,

Thank you very much for your comments. I have only been contributing
to GNU/GCC for a year or two and appreciate the advice. I definitely
strive to keep my patches in line with the relevant standards and
style guides. Attached is a replacement for the original patch which
addresses your concerns.


> - Do I understand this correctly: AUTOMATIC and STATIC have to come last,
>   i.e., right before the :: where declaring, e.g., a variable?
Not quite, you are probably misled by this code in decl.c:

> match
> gfc_match_static (void)
> {
...
> gfc_match (" ::");

(And equivalent for gfc_match_automatic.) These are similar to
gfc_match_save, and like gfc_match_save are only called to match
variable attribute specification _statements_, such as:
> SAVE :: x
> AUTOMATIC :: y

STATIC and AUTOMATIC are matched in any order in an attribute
specification _list_, as with SAVE, through the giant switch() earlier
in decl.c/match_attr_spec(). This applies to the following:
> INTEGER, SAVE, DIMENSION(3) :: x
> INTEGER, AUTOMATIC, DIMENSION(3) :: y


> - Running:
>
>   $ contrib/check_GNU_style.sh dec_static.patch
>
>   Reports some style issues in the C code, that should be fixed before
>   commit. (Style in Fortran testcases does not matter that much.)
I was not aware of this script - thanks!


> Please change formatting in a separate patch or not at all (here!).
> This policy is to distinguish cosmetic changes from relevant ones.
Fixed. These changes are usually accidental - I try not to reformat
code that I'm not otherwise touching.


>> diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
...
>> +With @option{-fdec-static} GNU Fortran supports the explicit specification 
>> of
>> +two addition variable attributes: @code{STATIC} and @code{AUTOMATIC}. These
...
> But is it only for variables? Can't it be used for equivalences or
> other constructs, too?
Yes, good point, perhaps 'entities' is a better term here:

+With @option{-fdec-static} GNU Fortran supports the DEC extended attributes
+@code{STATIC} and @code{AUTOMATIC} to provide explicit specification of entity
+storage. [...]

>> diff --git a/gcc/fortran/invoke.texi b/gcc/fortran/invoke.texi
...
>> +@item -fdec-static
>> +@opindex @code{fdec-static}
>> +Enable STATIC and AUTOMATIC as attributes specifying storage location.
>> +STATIC is equivalent to SAVE, and locals are typically AUTOMATIC by default.
>
> Well, this description to me sounds like: "Those attributes are
> useless, because they can be substituted." This is clearly not what you
> intend. I propose to include into the description that with "this
> switch the dec-extension" is available "to explicitly specify the
> storage of entities". Then the last sentence is still a good hint for
> all fortraneers that don't know the extension.

I guess I subconsciously made them sound "useless" because I hoped
users would think twice about using the extensions and use standard
conforming constructs instead. :-)  But, maybe you are right and this
would be clearer:

+@item -fdec-static
+@opindex @code{fdec-static}
+Enable DEC-style STATIC and AUTOMATIC attributes to explicitly specify
+the storage of variables and other objects.


>> diff --git a/gcc/fortran/lang.opt b/gcc/fortran/lang.opt
...
>> +fdec-static
>> +Fortran Var(flag_dec_static)
>> +Enable STATIC and AUTOMATIC attributes.
>
> How about: Enable the dec-extension of STATIC and AUTOMATIC attributes.
> Just a proposal.
How about this, to match invoke.texi:

+fdec-static
+Fortran Var(flag_dec_static)
+Enable DEC-style STATIC and AUTOMATIC attributes.


> -  Please add some testcases where the new error messages are tested.
Yes, good idea! cf. attached for dec_static_3.f90 and
dec_static_4.f90. These tests gave me a chance to realize I should
emit some better error messages so I made a minor change there:

diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index db431dd..be8e9f7 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -7838,6 +7838,8 @@ gfc_match_automatic (void)
   switch (m)
   {
   case MATCH_NO:
+break;
+
   case MATCH_ERROR:
 return MATCH_ERROR;

@@ -7856,7 +7858,7 @@ gfc_match_automatic (void)

   if (!seen_symbol)
 {
-  gfc_error ("Expected var-list in AUTOMATIC statement at %C");
+  gfc_error ("Expected entity-list in AUTOMATIC statement at %C");
   return MATCH_ERROR;
 }

... And similar for gfc_match_static. (Nb. "entity-list" was chosen to
correspond with the "save-entity-list" descriptor used by F90 to
specify the SAVE statement.)
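
To spell out why the added "break" matters: without it, MATCH_NO shares the MATCH_ERROR body and the function returns before any later check can issue a diagnostic. The sketch below models only that control flow. The enum and function names are stand-ins invented for illustration and are not gfortran's definitions.

```c
/* Stand-in for gfortran's match type; the values are illustrative.  */
enum demo_match { DEMO_MATCH_NO, DEMO_MATCH_ERROR, DEMO_MATCH_YES };

/* Shape of the switch before the change: returns 1 when the function
   would return MATCH_ERROR straight from the switch.  */
static int
returns_early_before (enum demo_match m)
{
  switch (m)
    {
    case DEMO_MATCH_NO:		/* falls through */
    case DEMO_MATCH_ERROR:
      return 1;
    default:
      break;
    }
  return 0;
}

/* Shape after the change: MATCH_NO leaves the switch, so later code
   (such as the !seen_symbol check) gets a chance to run.  */
static int
returns_early_after (enum demo_match m)
{
  switch (m)
    {
    case DEMO_MATCH_NO:
      break;
    case DEMO_MATCH_ERROR:
      return 1;
    default:
      break;
    }
  return 0;
}
```

Only the MATCH_NO case differs between the two versions; MATCH_ERROR still returns immediately in both.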

Andre, thanks again for your comments. I 

Re: [PATCH] -fsanitize=thread fixes (PR sanitizer/68260)

2016-09-07 Thread Jakub Jelinek
On Wed, Sep 07, 2016 at 09:07:45AM +0200, Richard Biener wrote:
> > @@ -493,6 +504,8 @@ instrument_builtin_call (gimple_stmt_ite
> > if (!tree_fits_uhwi_p (last_arg)
> > || memmodel_base (tree_to_uhwi (last_arg)) >= MEMMODEL_LAST)
> >   return;
> > +   if (lookup_stmt_eh_lp (stmt))
> > + remove_stmt_from_eh_lp (stmt);
> 
> These changes look bogus to me -- how can the tsan instrumentation
> function _not_ throw when the original function we replace it can?

The __tsan*atomic* functions are right now declared to be nothrow, so the
patch just matches how they are declared.
While the sync-builtins.def use
#define ATTR_NOTHROWCALL_LEAF_LIST (flag_non_call_exceptions ? \
ATTR_LEAF_LIST : ATTR_NOTHROW_LEAF_LIST)
I guess I could use the same for the tsan atomics, but wonder if it will work
properly when the libtsan is built with exceptions disabled and without
-fnon-call-exceptions.  Perhaps it would, at least if it is built with
-fasynchronous-unwind-tables (which is the case for x86_64 and aarch64 and
tsan isn't supported elsewhere).
> 
> It seems you are just avoiding the ICEs for now "wrong-code".  (and
> how does this relate to -fnon-call-exceptions as both are calls?)
> 
> The instrument_expr case seems to leave the original stmt in-place
> (and thus the EH).

Those are different.  For loads and stores there is just an added call that
logs the load or store, but for atomics the atomics are done inside of
the library and the library tracks everything.

Jakub


[Bug libgcc/77519] New: [5/6/7 Regression] complex multiply / divide excess precision handling inverted

2016-09-07 Thread jsm28 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77519

Bug ID: 77519
   Summary: [5/6/7 Regression] complex multiply / divide excess
precision handling inverted
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libgcc
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jsm28 at gcc dot gnu.org
  Target Milestone: ---

libgcc complex multiply and divide are meant to eliminate excess precision from
certain internal values by forcing them to memory in exactly those cases where
the type has excess precision.  But in
https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01894.html I accidentally
inverted the logic so that values get forced to memory in exactly the cases
where it's not needed.  (This is a pessimization in the no-excess-precision
case, and in principle could lead to bad results depending on code generation
in the excess-precision case.)

Re: [Patch RFC] Modify excess precision logic to permit FLT_EVAL_METHOD=16

2016-09-07 Thread Joseph Myers
On Wed, 7 Sep 2016, Joseph Myers wrote:

> How about instead having more than one target macro / hook.  One would 
> indicate that excess precision is used by insn patterns (and be set only 
> for i386 and m68k).  Another would indicate the API-level excess precision 

Or, maybe there would be a single hook taking a tristate argument.

Target hooks need to provide the following information:

(a) What excess precision is implicit in the insn patterns (that is, when 
a value described in middle-end IR as having a particular mode/type may in 
fact implicitly have extra range and precision because insn patterns 
produce results with such extra range and precision, not just with the 
range and precision of the output mode).

(b) What excess precision should be added explicitly to the IR by the 
front end in "fast" mode.

(c) What excess precision should be added explicitly to the IR by the 
front end in "standard" mode.

All of these may be represented by FLT_EVAL_METHOD values.  In what 
follows, "none" means no excess precision (whether the value is 0 or 16); 
"unpredictable" means inherently unpredictable even in the absence of 
register spills (like -mfpmath=sse+387), so FLT_EVAL_METHOD == -1.  In 
principle there could be cases of predictable excess precision that have 
to map to -1 because there is no other value they could map to, but in 
practice I don't expect that to be an issue (given the TS 18661-3 values 
of FLT_EVAL_METHOD being available).

(a) should always be "none" except for the existing x86 and m68k cases.

(b) is "none" in all existing cases, but we have the issue of ARM cases 
without direct binary16 arithmetic where it would be desirable for it to 
apply excess precision to _Float16 values.

(c) is not "none" at present in exactly those cases where 
TARGET_FLT_EVAL_METHOD is nonzero (but in the case of s390 this is really 
a target bug that should be fixed).  It might also be not "none" in future 
for ARM cases like in (b).

(a) can be "unpredictable".  (b) and (c) never can.
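
A single hook with a tristate argument could look something like the sketch below. Every name in it is hypothetical and none is an existing GCC hook or macro. The x87-like return values are an assumption made purely to have something concrete: long double evaluation both implicitly and in "standard" mode, nothing added in "fast" mode.

```c
/* Hypothetical names throughout; this is not an existing GCC interface.  */
enum excess_precision_query
{
  EXCESS_IMPLICIT,	/* (a) implicit in the insn patterns */
  EXCESS_FAST,		/* (b) added by the front end in "fast" mode */
  EXCESS_STANDARD	/* (c) added by the front end in "standard" mode */
};

/* Answers are expressed as FLT_EVAL_METHOD values.  */
#define EVAL_UNPREDICTABLE (-1)
#define EVAL_NONE 0
#define EVAL_LONG_DOUBLE 2

/* Assumed behaviour of an x87-like target, for illustration only.  */
static int
demo_x87_excess_precision (enum excess_precision_query q)
{
  switch (q)
    {
    case EXCESS_IMPLICIT:
    case EXCESS_STANDARD:
      return EVAL_LONG_DOUBLE;
    case EXCESS_FAST:
      return EVAL_NONE;
    }
  return EVAL_UNPREDICTABLE;	/* not reached for valid queries */
}

/* (a) may be unpredictable; (b) and (c) never may.  */
static int
demo_hook_is_valid (int (*hook) (enum excess_precision_query))
{
  return (hook (EXCESS_FAST) != EVAL_UNPREDICTABLE
	  && hook (EXCESS_STANDARD) != EVAL_UNPREDICTABLE);
}
```

A target using -mfpmath=sse+387 would, in this model, return EVAL_UNPREDICTABLE for EXCESS_IMPLICIT only and would still pass the validity check.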

Rather than init_excess_precision setting flag_excess_precision, possibly 
turning "standard" into "fast", I think it should set some variable that 
describes the result of whichever of (b) and (c) is applicable - and in 
the cases where "standard" turns into "fast", it would simply happen that 
both (b) and (c) produce the same result.

The effective excess precision seen by the user is the result of applying 
first one of (b) and (c), then (a).  If (a) is not "none", this is not 
entirely predictable.  It's a broken compiler configuration if applying 
(c) yields a type on which (a) is not a no-op, except in the case where 
(a) is "unpredictable" and (c) is "none".  I'm not aware of any likely 
cases where a type would actually get promoted twice by applying those two 
operations.

Right now TARGET_FLT_EVAL_METHOD_NON_DEFAULT is used to give errors for 
-fexcess-precision=standard for languages not supporting it.  With a 
conversion to hooks, that needs to be rethought.  The point is to give an 
error if predictability was requested but cannot be achieved, so I suppose 
ideally the error should be about (a) being not "none", together with 
-fexcess-precision=standard being used.  But if the relevant back-end 
options aren't available at this point to use the hook for (a), the error 
could just be given for all targets (for those languages when that option 
is given).

Effective predictability, for __GCC_IEC_559 in flag_iso mode, means that 
(a) does nothing to any type resulting from whichever of (b) and (c) is in 
effect.

The way __LIBGCC_*_EXCESS_PRECISION__ is used is about eliminating excess 
precision from results assigned to variables - meaning it should be about 
(a) only.

That leaves the question of setting FLT_EVAL_METHOD.  It should relate to 
the effective excess precision seen by the user, the combination of 
whichever of (b) and (c) is in effect with (a).  The only problem is the 
case where that combination is most precisely described by "16", which as 
discussed is not a C11 value and may affect existing code not expecting 
such a value.  The value -1 is compatible with C11 and TS 18661-3 but 
suboptimal, while the value 0 is compatible with C11 only, not with TS 
18661-3 even when no feature test macros are defined.

We already have the option -fno-fp-int-builtin-inexact to ensure certain 
built-in functions follow TS 18661-1 semantics.  It might be reasonable to 
have a new option to enable FLT_EVAL_METHOD using new values.  However, 
I'd be inclined to think that such an option should be on by default for 
-std=gnu*, only off for strict conformance modes.  (There would be both 
__FLT_EVAL_METHOD__ and __FLT_EVAL_METHOD_C99__, say, predefined macros, 
so that <float.h> could also always use the new value if 
__STDC_WANT_IEC_60559_TYPES_EXT__ is defined.)
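
To make the FLT_EVAL_METHOD discussion above concrete, here is a minimal sketch in C.  The function names (describe_flt_eval_method, c11_fallback_value, print_flt_eval_method) are hypothetical and exist only for this illustration; they are not part of GCC or of any proposed patch.  The meanings of -1, 0, 1 and 2 are those given by C11, and 16 is the TS 18661-3 style value mentioned above.  The fallback helper only models the "compatible but suboptimal" choice of reporting -1 where a strict C11 mode cannot expose a new value.

```c
#include <float.h>
#include <stdio.h>

/* Hypothetical helper: describe a FLT_EVAL_METHOD value in words.
   -1, 0, 1 and 2 have the meanings C11 gives them; 16 is the
   TS 18661-3 style value discussed above and is not a C11 value.  */
const char *describe_flt_eval_method(int method)
{
    switch (method) {
    case -1: return "indeterminable";
    case 0:  return "evaluate to the range and precision of each type";
    case 1:  return "evaluate float and double as double";
    case 2:  return "evaluate everything as long double";
    case 16: return "evaluate as _Float16 (TS 18661-3 value, not in C11)";
    default: return "other value (implementation-defined or TS 18661-3)";
    }
}

/* Hypothetical helper: the value a strict C11 mode could report.
   Values up to 2 are passed through unchanged (negative values other
   than -1 are implementation-defined in C11); anything larger, such
   as 16, is reported as -1, which is compatible but less precise.  */
int c11_fallback_value(int method)
{
    if (method <= 2)
        return method;
    return -1;
}

/* Print what the current compiler and options report, if anything.  */
void print_flt_eval_method(void)
{
#ifdef FLT_EVAL_METHOD
    printf("FLT_EVAL_METHOD = %d (%s)\n", (int) FLT_EVAL_METHOD,
           describe_flt_eval_method((int) FLT_EVAL_METHOD));
#else
    puts("FLT_EVAL_METHOD is not defined in this mode");
#endif
}
```

Calling print_flt_eval_method() from main shows what a given set of options reports; c11_fallback_value(16) yields -1, which is the suboptimal-but-conforming case described above.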

-- 
Joseph S. Myers
jos...@codesourcery.com


[Bug c/77520] wrong value for extended ASCII characters in -Wformat message

2016-09-07 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77520

Martin Sebor  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |diagnostic
           Severity|normal                      |minor

[Bug c/77520] New: wrong value for extended ASCII characters in -Wformat message

2016-09-07 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77520

            Bug ID: 77520
           Summary: wrong value for extended ASCII characters in -Wformat message
           Product: gcc
           Version: 7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: msebor at gcc dot gnu.org
  Target Milestone: ---

The argument_parser::find_format_char_info function in c-family/c-format.c
passes the value of an unexpected format character whose type is char as an
argument to a %x directive.  When char is a signed type and the value is in the
extended ASCII range ('\x80' and greater) it is sign-extended and yields a very
large value on output as in the test case below.  The character should be cast
to unsigned char to avoid the sign extension.  The following test case shows
the problem:

$ cat t.c && /home/msebor/build/gcc-49905/gcc/xgcc -B/home/msebor/build/gcc-49905/gcc -S -Wformat t.c
void f (void)
{
  __builtin_printf ("%\x80");
}
t.c: In function ‘f’:
t.c:3:23: warning: unknown conversion type character 0xff80 in format [-Wformat=]
   __builtin_printf ("%\x80");
                       ^~~~

Clang produces better output:

t.c:3:23: warning: invalid conversion specifier '\x80'
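
The sign extension described above can be reproduced outside the compiler.  The following is a minimal sketch, not the code in c-format.c; the function names are hypothetical and chosen for this illustration only.  It uses signed char explicitly so that the behaviour does not depend on whether plain char is signed on the target.

```c
#include <stdio.h>

/* Hypothetical helper: the value a %x directive sees when a signed
   character is passed without a cast.  A negative value wraps to a
   large unsigned number (0xffffff80 for -128 when unsigned int is
   32 bits wide).  */
unsigned int value_without_cast(signed char c)
{
    return (unsigned int) c;
}

/* Hypothetical helper: the suggested fix.  Going through unsigned
   char first keeps the value in the range 0..UCHAR_MAX, so no sign
   extension takes place.  */
unsigned int value_with_cast(signed char c)
{
    return (unsigned char) c;
}

/* Format a value the way the diagnostic does, with a 0x prefix.  */
int format_hex(char *buf, size_t size, unsigned int value)
{
    return snprintf(buf, size, "0x%x", value);
}
```

With c equal to (signed char) -128, the bit pattern of '\x80' on the usual two's complement targets, value_with_cast returns 0x80 and format_hex produces "0x80", whereas value_without_cast returns the sign-extended value.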
