Re: [PATCH 6/8] Handle SCRATCH in decompose_address

2015-01-27 Thread Maxim Kuvyrkov
On Oct 23, 2014, at 4:18 AM, Jeff Law  wrote:

> On 10/22/14 17:01, Maxim Kuvyrkov wrote:
>> On Oct 23, 2014, at 9:02 AM, Jeff Law  wrote:
>> 
>>> On 10/20/14 21:35, Maxim Kuvyrkov wrote:
>>>> Hi,
>>>>
>>>> This patch is a simple fix to allow decompose_address to handle
>>>> SCRATCH'es during 2nd scheduler pass. This patch is a
>>>> prerequisite for a scheduler improvement that relies on
>>>> decompose_address to parse insns.
>>>>
>>>> Bootstrapped and regtested on x86_64-linux-gnu and regtested on
>>>> arm-linux-gnueabihf and aarch64-linux-gnu.
>>> I'd like to see some further discussion here.
>>> 
>>> get_base_term is supposed to look at its argument as a base
>>> address. I'm curious under what circumstances you want to have a
>>> SCRATCH as a base address?
>>> 
>>> I didn't see anything in patch #8 which obviously depended on
>>> this, but maybe it's in there, but more subtle than expected.
>>> 
>>> If you can justify why it's useful to handle scratch in here, then
>>> the patch will be fine.
>> 
>> Without this patch decompose_address() ICEs during second scheduler
>> pass on prologue instructions that usually have "(clobber (mem:BLK
>> (scratch)))".  The only reason for this patch is to prevent that fault
>> and enable use of decompose_address during 2nd scheduler pass.
>> 
>> Does this answer your question, or are you looking for a more
>> in-depth reason?
> Yea, that's everything I needed to know.  Patch approved.

Hi,

Turns out that the above patch applies without conflicts to two functions in 
rtlanal.c: get_base_term(), for which the patch is intended, and 
get_index_term(), for which the patch is not.

Due to git rebases and patch updates, I have accidentally pushed the patch 
twice and unintentionally changed get_index_term().  From what I can tell the 
change is benign, but, still, it is unnecessary.  The attached patch reverts 
the accidental commit.  It was bootstrapped on arm-linux-gnueabihf.

OK for stage 1?  I'll regtest it before committing, just in case.

Thanks,

--
Maxim Kuvyrkov
www.linaro.org




0001-Revert-accidental-commit-get_base_index-was-the-inte.patch
Description: Binary data


Re: [PATCH][RFA][PR target/15184] Partial fix for direct byte access on x86

2015-01-27 Thread Jeff Law
I'm withdrawing the combine_simplify_rtx hunk of this patch.  While 
cleaning up my improvements for the remaining testcases I 
stumbled upon a simpler change which covers all the tests.


What's kind of funny is I'd been staring at the relevant code a goodly 
part of the weekend without seeing how easily it could be extended and 
that the result doesn't have to match a pattern, as combine can split the 
horrid mess in such a way that we get two matched insns which, when combined 
with other nearby insns, ultimately collapse into precisely what we want.




We're still going to need the changes to the heuristic to enable 4 insn 
combinations as we need to be giving nice big blobs of code to 
combine_simplify_rtx and its children.


It's actually kind of cool to see something like this flow into 
make_field_assignment:


(set (mem/c:HI (symbol_ref:SI ("y") [flags 0x40] <var_decl 0x7670bcf0 y>) [2 y+0 S2 A16])
    (subreg:HI (ior:SI (zero_extend:SI (mem/c:QI (symbol_ref:SI ("y") [flags 0x40] <var_decl 0x7670bcf0 y>) [2 y+0 S1 A16]))
            (reg:SI 100 [ D.1569 ])) 0))

And make_field_assignment turns it into:

(set (mem/c:QI (const:SI (plus:SI (symbol_ref:SI ("y") [flags 0x40] <var_decl 0x7670bcf0 y>)
                (const_int 1 [0x1]))) [2 y+1 S1 A8])
    (subreg:QI (lshiftrt:SI (reg:SI 100 [ D.1569 ])
            (const_int 8 [0x8])) 0))

Combine then chooses the lshift expression as a split point and emits an 
insn for the lshift and substitutes a nice simple reg into that hunk of 
RTL above for the lshift


(set (mem/c:QI (const:SI (plus:SI (symbol_ref:SI ("y") [flags 0x40] <var_decl 0x7670bcf0 y>)
                (const_int 1 [0x1]))) [2 y+1 S1 A8])
    (subreg:QI (reg:SI 103) 0))

Oh, that looks perfect.  Now it's just a simple matter of 
cleanup :-)  Which is actually kind of fun to watch:


We'll have this after the combine & split step:

(insn 9 8 10 2 (parallel [
            (set (reg:SI 99 [ D.1569 ])
                (ashift:SI (reg:SI 96 [ c ])
                    (const_int 8 [0x8])))
            (clobber (reg:CC 17 flags))
        ]) j.c:33 510 {*ashlsi3_1}
     (expr_list:REG_UNUSED (reg:CC 17 flags)
        (expr_list:REG_DEAD (reg:SI 96 [ c ])
            (nil))))

(insn 10 9 13 2 (parallel [
            (set (reg:SI 100 [ D.1569 ])
                (and:SI (reg:SI 99 [ D.1569 ])
                    (const_int 65280 [0xff00])))
            (clobber (reg:CC 17 flags))
        ]) j.c:33 380 {*andsi_1}
     (expr_list:REG_UNUSED (reg:CC 17 flags)
        (expr_list:REG_DEAD (reg:SI 99 [ D.1569 ])
            (nil))))

(insn 13 10 14 2 (parallel [
            (set (reg:SI 103)
                (lshiftrt:SI (reg:SI 100 [ D.1569 ])
                    (const_int 8 [0x8])))
            (clobber (reg:CC 17 flags))
        ]) j.c:33 543 {*lshrsi3_1}
     (expr_list:REG_UNUSED (reg:CC 17 flags)
        (expr_list:REG_DEAD (reg:SI 100 [ D.1569 ])
            (nil))))

(insn 14 13 0 2 (set (mem/c:QI (const:SI (plus:SI (symbol_ref:SI ("y") [flags 0x40] <var_decl 0x7670bcf0 y>)
                    (const_int 1 [0x1]))) [2 y+1 S1 A8])
        (subreg:QI (reg:SI 103) 0)) j.c:33 93 {*movqi_internal}
     (expr_list:REG_DEAD (reg:SI 103)
        (nil)))

Eventually all those participate in combinations again resulting in just:

(insn 14 13 0 2 (set (mem/c:QI (const:SI (plus:SI (symbol_ref:SI ("y") [flags 0x40] <var_decl 0x7670bcf0 y>)
                    (const_int 1 [0x1]))) [2 y+1 S1 A8])
        (subreg:QI (reg:SI 96 [ c ]) 0)) j.c:33 93 {*movqi_internal}
     (expr_list:REG_DEAD (reg:SI 96 [ c ])
        (nil)))

ie, movb %al, y+1
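
To see why the wide read-modify-write and the single byte store are interchangeable, here is a small C sketch of the two forms. It is only an illustration under assumptions: the function names are made up for this note, the two bytes of y are modelled explicitly in little-endian order (as on x86), and nothing here is taken from combine.c.

```c
#include <stdint.h>

/* Wide form, modelled on the RTL that flows into make_field_assignment:
   keep the low byte of y and OR in (c << 8) & 0xff00.  */
uint16_t
store_wide (uint16_t y, uint32_t c)
{
  uint32_t low = y & 0xffu;                /* zero_extend:SI (mem:QI y) */
  uint32_t high = (c << 8) & 0xff00u;      /* reg 100 */
  return (uint16_t) (low | high);          /* subreg:HI (ior:SI ...) */
}

/* Narrow form: a single byte store to y+1.  The two bytes are modelled
   explicitly in little-endian order, as on x86.  */
uint16_t
store_byte (uint16_t y, uint32_t c)
{
  uint8_t bytes[2] = { (uint8_t) (y & 0xffu), (uint8_t) (y >> 8) };
  bytes[1] = (uint8_t) c;                  /* movb %al, y+1 */
  return (uint16_t) (bytes[0] | (bytes[1] << 8));
}

/* Return 1 if both forms agree for every y and every low byte of c.
   The upper bits of c are set to an arbitrary pattern to show that
   they do not matter.  */
int
forms_agree (void)
{
  for (uint32_t y = 0; y <= 0xffffu; y++)
    for (uint32_t c = 0; c <= 0xffu; c++)
      if (store_wide ((uint16_t) y, c | 0xabcd00u)
          != store_byte ((uint16_t) y, c | 0xabcd00u))
        return 0;
  return 1;
}
```

The shift, mask and shift-back cancel out, which is what lets the later combinations collapse everything into the one movqi insn.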


Sometimes I malign the combiner, but there are also days when I just 
have to say "wow", not bad given its last major revamp was in 1990/1991 
(which brought in the splitting code noted above).


Anyway, onward bootstrapping and testing...


Jeff

On 01/26/15 20:07, Jeff Law wrote:

Segher: I know you're not officially noted as a maintainer or reviewer
for combine.c, but that's something I'd like to change if you're
interested in a larger role.  In the mean time, any feedback you have
would be appreciated.


So the issue mentioned in the BZ is that fairly obvious code sequences
that ought to use simple byte moves are expanding into hideous sequences
(load, store, couple bitwise logicals, maybe a shift or extension thrown
in for good measure).

As mentioned in the BZ, one of the issues is that combine is limited in
terms of how many insns it will look at.  As it turns out that was
addressed not terribly long ago and we can do 4 insn combinations. With
just a little work in combine.c we can get the desired code for the
first two testcases as well as two of my own.

The first issue is 4 insn combinations are (reasonably) guarded in such
a way as to avoid them if they are unlikely to succeed.  We basically
look at the operands of the 4 insns and try to guess if there's a
reasonable chance a combination would succeed.  If not, no 4 insn
combinations are tried.

So the first part of this patch improves that heuristic.  What we see
with these byte accesses is a pattern like

RE: [Patch][wwwdocs]Deprecate the ARM TPCS related options in gcc 5.0

2015-01-27 Thread Terry Guo


> -----Original Message-----
> From: Gerald Pfeifer [mailto:ger...@pfeifer.com]
> Sent: Monday, January 26, 2015 7:34 PM
> To: Terry Guo
> Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw; Ramana Radhakrishnan
> Subject: Re: [Patch][wwwdocs]Deprecate the ARM TPCS related options in
> gcc 5.0
> 
> On Monday 2015-01-26 16:47, Terry Guo wrote:
> > This patch intends to update gcc 5.0 changes.html to deprecate TPCS
> > related options because TPCS is obsoleted per the ABI document at
> > http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042e/IHI0042E_aapcs.pdf.
> > Is it OK?
> 
> From a language perspective I suggest to say "The options < here>>
> related to the old ABI..." or "The options related to the old ABI --
> <> -- ...", where I somewhat prefer the former.
> 
> Please wait for Richard or Ramana for final review and approval.
> 
> Gerald

Thanks Gerald. Patch is updated. Is this one OK?

BR,
Terry

Index: htdocs/gcc-5/changes.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.73
diff -u -p -r1.73 changes.html
--- htdocs/gcc-5/changes.html   26 Jan 2015 09:40:03 -0000  1.73
+++ htdocs/gcc-5/changes.html   27 Jan 2015 09:35:32 -0000
@@ -513,8 +513,9 @@ void operator delete[] (void *, std::siz
The deprecated option -mwords-little-endian
has been removed.
   
-   The options relating to the old ABI -mapcs and
-  -mapcs-frame have been deprecated.
+   The options -mapcs, -mapcs-frame,
+  -mtpcs-frame and -mtpcs-leaf-frame
+  which are only applicable to the old ABI have been deprecated.
   
   The transitional options -mlra and
-mno-lra
have been removed. The ARM backend now uses the local register
allocator





Re: Fix 59828 - Broken assembly on ppc* with two -mcpu= options

2015-01-27 Thread David Edelsohn
On Tue, Jan 27, 2015 at 7:27 PM, Alan Modra  wrote:
> On Wed, Jan 21, 2015 at 02:01:44PM -0500, David Edelsohn wrote:
>> I want to avoid duplicating the -mcpu parsing logic or the Rube
>> Goldberg mechanism to re-generate the -mXXX assembler directive.
>
> Oh well, I had fun writing the patch.  I thought it reasonably
> elegant, meeting the goals you state above.  You think differently,
> and I won't push my approach further.  The bug isn't important enough
> to argue over.

Alan,

I am sorry that you do not want to finish the patch.  I don't
understand why you find the command line argument so appealing when
the .machine pseudo-op was designed for this purpose.

Thanks, David


Re: Fix 59828 - Broken assembly on ppc* with two -mcpu= options

2015-01-27 Thread Alan Modra
On Wed, Jan 21, 2015 at 02:01:44PM -0500, David Edelsohn wrote:
> I want to avoid duplicating the -mcpu parsing logic or the Rube
> Goldberg mechanism to re-generate the -mXXX assembler directive.

Oh well, I had fun writing the patch.  I thought it reasonably
elegant, meeting the goals you state above.  You think differently,
and I won't push my approach further.  The bug isn't important enough
to argue over.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] Add comdat_group effective target (PR bootstrap/64612)

2015-01-27 Thread Mike Stump
On Jan 27, 2015, at 7:10 AM, Jakub Jelinek  wrote:
> 
> This patch introduces a new effective target check and adds it to the 
> pr64612.C
> - if comdat groups aren't used, there is no guarantee that the D2 dtor will
> be emitted always alongside of D1 dtor.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Re: [RFC PATCH] Avoid most of the BUILT_IN_*_CHKP enum values

2015-01-27 Thread Jakub Jelinek
On Tue, Jan 27, 2015 at 06:04:53PM +0300, Ilya Enkovich wrote:
> 2015-01-27 17:27 GMT+03:00 Jakub Jelinek :
> > I've grepped for BUILT_IN_.*_CHKP in the sources and we actually need
> > far fewer enum values than the 1204 that are being defined.
> >
> > This patch requires builtins.def to say explicitly (by using
> > DEF_*BUILTIN_CHKP macro instead of corresponding DEF_*BUILTIN) which
> > ones need that, for all the others only space in the enum is reserved and
> > nothing else.
> >
> > I'd hope this could work around the buggy AIX stabs handling, but even
> > on say x86_64-linux it has a benefit of decreasing cc1plus .debug_info
> > by about 2.7MB (of course, with dwz that benefit goes to almost nothing,
> > just the ~ 7000 bytes or so, plus .debug_str cost (that is merged even
> > > without dwz between TUs)).  The cost without dwz is obviously mainly
> > from repeating that in most of the translation units.  But why declare
> > BUILT_IN_*_CHKP enums that are never used by anything...
> 
> Enum values not mentioned in the code are not fully useless.  When we
> have builtin functions defined as 'always_inline' functions, they are
> instrumented and enum names may be used in dumps and debugging.
> That's not a big value though.  Thanks a lot for taking care of it!

Note, patch successfully bootstrapped/regtested on x86_64-linux and
i686-linux, and David said that on AIX it passed stage1 cc1 linking.

Ok for trunk?

As for the enums, I doubt the pain is worth the trouble.
What perhaps could be done (apparently preexisting issue, because you
include builtins.def just once in built_in_names) would be to tweak
fprintf (file, " built-in %s:%s",
 built_in_class_names[(int) DECL_BUILT_IN_CLASS (node)],
 built_in_names[(int) DECL_FUNCTION_CODE (node)]);
so that if DECL_FUNCTION_CODE is in between
BEGIN_CHKP_BUILTINS and END_CHKP_BUILTINS you don't print (null) on glibc
there or crash (on various other hosts), but actually print
built_in_names[(int) DECL_FUNCTION_CODE (node) - (int) BEGIN_CHKP_BUILTINS - 1]
concatenated with "_CHKP".
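
A rough, self-contained sketch of that idea follows. Everything in it is a stand-in: the three-entry builtin list, the TOY_* names and toy_builtin_name are hypothetical and only mimic the layout BUILT_IN_*_CHKP = BUILT_IN_* + BEGIN_CHKP_BUILTINS + 1 described in the patch; it is not the code from builtins.def, tree-core.h or the tree printer.

```c
#include <stdio.h>

/* Hypothetical three-entry builtin list.  PLAIN entries get no _CHKP
   enum value; CHKP entries do.  */
#define TOY_BUILTINS(PLAIN, CHKP) \
  CHKP (MEMCPY)                   \
  PLAIN (ABS)                     \
  CHKP (STRLEN)

#define TOY_ENUM(N) TOY_##N,
#define TOY_ENUM_CHKP(N) TOY_##N##_CHKP = TOY_##N + TOY_BEGIN_CHKP + 1,
#define TOY_NOTHING(N)
#define TOY_NAME(N) "BUILT_IN_" #N,

enum toy_builtin
{
  /* Ordinary codes: one per list entry.  */
  TOY_BUILTINS (TOY_ENUM, TOY_ENUM)
  TOY_BEGIN_CHKP,
  /* _CHKP codes: only named for entries marked CHKP; space for the
     others is reserved but no enumerator is declared.  */
  TOY_BUILTINS (TOY_NOTHING, TOY_ENUM_CHKP)
  TOY_END_CHKP = TOY_BEGIN_CHKP * 2 + 1
};

/* Names exist only for the ordinary codes, as with built_in_names.  */
static const char *const toy_names[] = {
  TOY_BUILTINS (TOY_NAME, TOY_NAME)
};

/* Write a printable name for CODE into BUF.  Codes in the _CHKP range
   reuse the ordinary name and append "_CHKP" instead of indexing past
   the end of toy_names.  */
void
toy_builtin_name (int code, char *buf, size_t len)
{
  if (code > TOY_BEGIN_CHKP && code < TOY_END_CHKP)
    snprintf (buf, len, "%s_CHKP", toy_names[code - TOY_BEGIN_CHKP - 1]);
  else if (code >= 0 && code < TOY_BEGIN_CHKP)
    snprintf (buf, len, "%s", toy_names[code]);
  else
    snprintf (buf, len, "<unknown>");
}
```

With this layout a code in the reserved range never indexes the name table directly, so the printer cannot read past its end.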

> > 2015-01-27  Jakub Jelinek  
> >
> > * builtins.def (DEF_BUILTIN_CHKP): Define if not defined.
> > (DEF_LIB_BUILTIN_CHKP, DEF_EXT_LIB_BUILTIN_CHKP): Redefine.
> > (DEF_CHKP_BUILTIN): Define using DEF_BUILTIN_CHKP instead
> > of DEF_BUILTIN.
> > (BUILT_IN_MEMCPY, BUILT_IN_MEMMOVE, BUILT_IN_MEMSET, 
> > BUILT_IN_STRCAT,
> > BUILT_IN_STRCHR, BUILT_IN_STRCPY, BUILT_IN_STRLEN): Use
> > DEF_LIB_BUILTIN_CHKP macro instead of DEF_LIB_BUILTIN.
> > (BUILT_IN_MEMCPY_CHK, BUILT_IN_MEMMOVE_CHK, BUILT_IN_MEMPCPY_CHK,
> > BUILT_IN_MEMPCPY, BUILT_IN_MEMSET_CHK, BUILT_IN_STPCPY_CHK,
> > BUILT_IN_STPCPY, BUILT_IN_STRCAT_CHK, BUILT_IN_STRCPY_CHK): Use
> > DEF_EXT_LIB_BUILTIN_CHKP macro instead of DEF_EXT_LIB_BUILTIN.
> > * tree-core.h (enum built_in_function): In between
> > BEGIN_CHKP_BUILTINS and END_CHKP_BUILTINS only define enum values
> > for builtins that use DEF_BUILTIN_CHKP macro.
> >
> > --- gcc/builtins.def.jj 2015-01-15 23:39:10.000000000 +0100
> > +++ gcc/builtins.def    2015-01-27 15:04:44.860924664 +0100
> > @@ -63,6 +63,16 @@ along with GCC; see the file COPYING3.
> >
> > The builtins is registered only if COND is true.  */
> >
> > +/* A macro for builtins where the
> > +   BUILT_IN_*_CHKP = BUILT_IN_* + BEGIN_CHKP_BUILTINS + 1
> > +   enums should be defined too.  */
> > +#ifndef DEF_BUILTIN_CHKP
> > +#define DEF_BUILTIN_CHKP(ENUM, NAME, CLASS, TYPE, LIBTYPE, BOTH_P, \
> > +FALLBACK_P, NONANSI_P, ATTRS, IMPLICIT, COND)  \
> > +  DEF_BUILTIN(ENUM, NAME, CLASS, TYPE, LIBTYPE, BOTH_P, FALLBACK_P,\
> > + NONANSI_P, ATTRS, IMPLICIT, COND)
> > +#endif
> > +
> >  /* A GCC builtin (like __builtin_saveregs) is provided by the
> > compiler, but does not correspond to a function in the standard
> > library.  */
> > @@ -87,6 +97,10 @@ along with GCC; see the file COPYING3.
> >  #define DEF_LIB_BUILTIN(ENUM, NAME, TYPE, ATTRS)   \
> >DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,   \
> >true, true, false, ATTRS, true, true)
> > +#undef DEF_LIB_BUILTIN_CHKP
> > +#define DEF_LIB_BUILTIN_CHKP(ENUM, NAME, TYPE, ATTRS)  \
> > +  DEF_BUILTIN_CHKP (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE,\
> > +   TYPE, true, true, false, ATTRS, true, true)
> >
> >  /* Like DEF_LIB_BUILTIN, except that the function is not one that is
> > specified by ANSI/ISO C.  So, when we're being fully conformant we
> > @@ -96,6 +110,10 @@ along with GCC; see the file COPYING3.
> >  #define DEF_EXT_LIB_BUILTIN(ENUM, NAME, TYPE, ATTRS)   \
> >DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,   \
> >true, true, true, ATTRS, false, true)
> > +#undef DEF_EXT_LIB_BUILTIN_CHKP
> > +#define DEF_EXT_LIB_BUIL

Re: [PATCH][RFA][PR target/15184] Partial fix for direct byte access on x86

2015-01-27 Thread Jeff Law

On 01/27/15 14:21, Segher Boessenkool wrote:

On Tue, Jan 27, 2015 at 01:53:34PM -0700, Jeff Law wrote:

I do have a specific PR in mind, but I cannot currently find it.  It was
about x86, dec mem and then using the flags...  Must have sent 100 emails
in that thread...  And cannot find it now!

Are you referring to 61225?


That is the one, thanks.

It's not going to help 61225.  The key insns in 61225 are:

(insn 6 3 7 2 (set (reg:SI 91 [ *x_3(D) ])
        (mem:SI (reg/v/f:SI 90 [ x ]) [1 *x_3(D)+0 S4 A32])) k.c:11 90 {*movsi_internal}
     (nil))
(insn 7 6 8 2 (parallel [
            (set (reg:SI 88 [ D.1494 ])
                (plus:SI (reg:SI 91 [ *x_3(D) ])
                    (const_int -1 [0xffffffffffffffff])))
            (clobber (reg:CC 17 flags))
        ]) k.c:11 220 {*addsi_1}
     (expr_list:REG_DEAD (reg:SI 91 [ *x_3(D) ])
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (expr_list:REG_EQUAL (plus:SI (mem:SI (reg/v/f:SI 90 [ x ]) [1 *x_3(D)+0 S4 A32])
                    (const_int -1 [0xffffffffffffffff]))
                (nil)))))
(insn 8 7 9 2 (set (mem:SI (reg/v/f:SI 90 [ x ]) [1 *x_3(D)+0 S4 A32])
        (reg:SI 88 [ D.1494 ])) k.c:11 90 {*movsi_internal}
     (nil))
(insn 9 8 10 2 (set (reg:CCZ 17 flags)
        (compare:CCZ (reg:SI 88 [ D.1494 ])
            (const_int 0 [0]))) k.c:11 3 {*cmpsi_ccno_1}
     (expr_list:REG_DEAD (reg:SI 88 [ D.1494 ])
        (nil)))


Note how REG:SI 88 has two uses.   Never do we pass a set of insns into 
try_combine that are useful to optimize this particular case (we never 
include insn #9 in any of the attempted combinations).
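
For orientation, source of roughly the following shape produces an insn sequence like the one above. This is an assumed reduction written for this note, not the testcase attached to PR 61225, and dec_and_test is a made-up name.

```c
/* Decrement *x in memory and report whether it reached zero.  The
   decremented value is used twice: once for the store back to memory
   and once for the comparison, mirroring the two uses of reg 88.  */
int
dec_and_test (int *x)
{
  int v = *x - 1;   /* insns 6 and 7: load *x, add -1 into reg 88 */
  *x = v;           /* insn 8: store reg 88 back to *x */
  return v == 0;    /* insn 9: compare reg 88 with 0 */
}
```

The ideal x86 output is a single dec on memory followed by a use of the flags, which needs the load, add, store and compare to be considered together.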


Even a bridge pattern isn't going to help here.


jeff








Re: RFA: patch to fix a bad code generation for PR64110 -- new constraints addition

2015-01-27 Thread Vladimir Makarov
On 01/27/2015 12:11 PM, Richard Sandiford wrote:
> Vladimir Makarov  writes:
>> On 01/27/2015 09:08 AM, Richard Sandiford wrote:
>>> Yeah, but in practice that's only ever going to be a partial transition.
>>> Many port maintainers won't look at this, so we'll have to support both
>>> versions indefinitely, even if the new behaviour turns out to be the
>>> best for all cases.
>>>
>>> I just think we're going to regret having two sets of constraints with
>>> such subtly different meanings.
>>>
>>> Looking back at the original PR, Jakub said:
>>>
>>>   The ! has been added by me for PR63594, so it isn't there from the era
>>>   when i?86 backend was using reload.  If there is a better way to
>>>   express that RA should prefer to use memory or xmm register and only
>>>   use r constraint if it already is in a r register and doesn't need to
>>>   be reloaded, I can use that.  Whether it is ?, ??? or something else.
>>>   ! description in gcc docs just fitted most what I wanted...
>>>
>>> In some ways this seems to match the intention of "*".  Originally I think
>>> it was just an RA-only thing and was ignored by reload, but LRA does take it
>>> into account too (which sounds like progress to me).
>>   I guess we don't need '*' in many cases.  It is overused.  Imho, IRA
>> should decide what class is better based on costs of alternatives and
>> the explicit exclusion of register class by using '*' is a bad practice.
>>
>>   Saying that, I believe we should make the register class preferencing
>> algorithm more alternative oriented.  The algorithm should first choose
>> an alternative (or maybe a subset of alternatives) and then register
>> classes.  I think it is more logical.  It would permit us to get rid of
>> all such constraints including '*' and use only one like '?' which
>> increases the alternative cost.
>>
>>   In the long term it is even better to get rid of '?' too and have some hook
>> (or attribute) to get insn alternative costs which can depend on
>> sub-target or other run-time characteristics.  Otherwise we need to
>> duplicate insn descriptions and put different insn guards.  I am going
>> to work on this.  But it is hard to say whether it will work well (I may have
>> some performance issues with this).  This hook can somehow (min or average
>> of the values for all alternatives) be used in the combiner and other
>> algorithms that need an insn cost. That is how I see the solution of the
>> problem in the long term.
> Definitely agree that it'd be better to remove these constraints
> in favour of a new attribute.  preferred_for_size and preferred_for_speed
> give something similar, though they're much more stringent than what
> we need here.
>
>>> If I revert the patch locally and change the *vec_dup pattern to
>>> use "*", it passes both the test for PR64110 and the tests for PR63594.
>>> Would that be OK as an alternative?
>>>
>>   I don't think it will work in general case.  It probably works because
>> a different class is chosen in IRA.  If IRA for some reasons choose the
>> same class, we might see the same problem in LRA.
> But isn't that the point of '*'?  It should stop IRA from using the 'r'
> alternative as an indication that 'r' is a good choice for this instruction.
> If IRA chooses 'r' anyway, it must be because other instructions that
> use the same allocno strongly prefer 'r'.
>
> And in those same circumstances -- i.e. if IRA does choose 'r' despite
> the constraints in this instruction -- then I think we do want to use the
> 'r' alternative.  And AIUI that's also what the new constraint is designed
> to do.  If IRA chooses 'r' anyway, the new constraint causes LRA to prefer
> the 'r' alternative _even if_ another operand (the destination) has to
> be reloaded, which is the fundamental difference between the new constraint
> and '!'.
>
> So I'm still not sure why '*' wouldn't do what we want.
Frequent use of '*' (and sometimes '!' for reload) means that we probably
need to split this alternative into 2 or 3 insns.  Instead of using '*'
we would need to set up costs for all these insns.  I believe just
ignoring the class with '*' is wrong.  There are some cases where we
definitely need '*' to avoid this reg class, e.g. mmx when we use other
classes for fp values.  But I guess this solution is not reliable, and
without the constraints we could set the alternative cost very high to
have a reliable right solution.

>>  I also don't like when register classes are excluded by '*' for IRA
>> (see my thoughts above).
> Understood, and I agree it would be good to move to attributes.
> But in a way, I think that's an even better reason to try to avoid
> adding these new constraints.  It sounds like we're hoping to get rid
> of them as soon as we've added them :-)
>
>
Sometimes to get rid of something, you should add more :)

But to be serious, what I wrote can not be implemented for GCC-5.0 (and
the generated code performance is still unknown for the proposed
approach).  I believe the current solution is more re

Re: [PATCH][RFA][PR target/15184] Partial fix for direct byte access on x86

2015-01-27 Thread Jeff Law

On 01/27/15 13:36, Segher Boessenkool wrote:

On Tue, Jan 27, 2015 at 12:27:38PM -0700, Jeff Law wrote:

On 01/26/15 22:11, Segher Boessenkool wrote:

On Mon, Jan 26, 2015 at 08:07:29PM -0700, Jeff Law wrote:

The second change we need is an additional simplification.

If we have
(subreg:M1 (zero_extend:M2 (x)))

Where M1 > M2 and both are scalar integer modes.  It's advantageous to
strip the SUBREG and instead have a wider extension.


Should you also check M1 is not multiple registers?

We're generally working with pseudos, so we could estimate, but not know
for sure if we're dealing with multiple hard regs.  But more
importantly, I'm not sure what that check would buy us.


I mean e.g. DI on a 32-bit target.  My worry is that zero_extend:DI then
is more expensive -- if say, it is implemented as a split, combine itself
cannot get rid of the redundancy.

We might lose for a case like (subreg:DI ({zero,sign}_extend:SI (x))) on 
a 32 bit target if something were able to recognize that the upper bits 
were don't cares.


The most likely place for that to happen would be at assembly output 
time -- but that would require the target to exploit the don't care 
semantics of those bits.  I don't recall any port doing that.


We could exploit this in generic splitting code, but I don't think we 
do.  lower-subreg slams in a zero when it finds a paradoxical subreg and 
we've asked for the high word.  I don't immediately see that it does 
anything special when the operand of the subreg is anything other than 
another reg or mem.


combine exploits the "don't care" nature of those bits to eliminate 
masking and such.  It's not going to be able to eliminate the subreg 
entirely unless it folds into some later insn and we're ultimately able 
to narrow the operation back down to SImode.



Also note that while ports may not have special cases around the subreg 
variant, several have special cases for ZERO_EXTEND.  Basically they 
slam in a zero to the upper word, either via a splitter or during 
assembly code output.  Those special cases will be recognized more often 
now.
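
As a rough model of why the rewrite from a paradoxical subreg to a wider zero_extend is safe, consider the sketch below. It uses QI, HI and SI for x, M2 and M1; the function names are invented for this note, and the "junk" argument is just one way to model bits that are don't-cares.

```c
#include <stdint.h>

/* (zero_extend:HI (x:QI)) */
uint16_t
zext_qi_hi (uint8_t x)
{
  return x;
}

/* One possible value of (subreg:SI (zero_extend:HI x) 0): the low 16
   bits are defined, the upper 16 bits are don't-cares, modelled here
   by an arbitrary JUNK pattern.  */
uint32_t
paradoxical_si (uint8_t x, uint16_t junk)
{
  return ((uint32_t) junk << 16) | zext_qi_hi (x);
}

/* The proposed replacement: (zero_extend:SI (x:QI)).  */
uint32_t
zext_qi_si (uint8_t x)
{
  return x;
}

/* Return 1 if the replacement agrees with the paradoxical subreg on
   all defined bits, and equals it outright when the don't-care bits
   happen to be zero.  */
int
replacement_is_valid (void)
{
  for (unsigned int x = 0; x <= 0xffu; x++)
    {
      uint8_t b = (uint8_t) x;
      if ((paradoxical_si (b, 0xbeefu) & 0xffffu) != (zext_qi_si (b) & 0xffffu))
        return 0;
      if (paradoxical_si (b, 0) != zext_qi_si (b))
        return 0;
    }
  return 1;
}
```

The wider extension simply picks zero for the don't-care bits, which is why it is always a legal choice; whether it is a cheaper one on a target where M1 spans two registers is the cost question raised above.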



Jeff


C++ PATCH for c++/58597 (lambda in default arg)

2015-01-27 Thread Jason Merrill
Here, sometimes we can end up in maybe_add_lambda_conv_op with 
current_function_decl set but not cfun.  If we push_function_context in 
that case, the later pop doesn't clear cfun, but leaves it with a value 
that leads to a crash later on.  So let's avoid calling 
push_function_context in that case.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit ce65568ba19c4613c25f48064a0d5e66454265ac
Author: Jason Merrill 
Date:   Tue Jan 27 14:26:18 2015 -0500

	PR c++/58597
	* lambda.c (maybe_add_lambda_conv_op): Check cfun rather than
	current_function_decl.

diff --git a/gcc/cp/lambda.c b/gcc/cp/lambda.c
index 6c9e224..b160c8c 100644
--- a/gcc/cp/lambda.c
+++ b/gcc/cp/lambda.c
@@ -854,7 +854,7 @@ prepare_op_call (tree fn, int nargs)
 void
 maybe_add_lambda_conv_op (tree type)
 {
-  bool nested = (current_function_decl != NULL_TREE);
+  bool nested = (cfun != NULL);
   bool nested_def = decl_function_context (TYPE_MAIN_DECL (type));
   tree callop = lambda_function (type);
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-defarg6.C b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-defarg6.C
new file mode 100644
index 0000000..fe8767a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-defarg6.C
@@ -0,0 +1,9 @@
+// PR c++/58597
+// { dg-do compile { target c++11 } }
+
+template<typename> struct A
+{
+  template<typename T> A(T, int = []{ return 0; }()) {}
+};
+
+A<int> a = 0;


Re: [PATCH][RFA][PR target/15184] Partial fix for direct byte access on x86

2015-01-27 Thread Segher Boessenkool
On Tue, Jan 27, 2015 at 01:53:34PM -0700, Jeff Law wrote:
> >I do have a specific PR in mind, but I cannot currently find it.  It was
> >about x86, dec mem and then using the flags...  Must have sent 100 emails
> >in that thread...  And cannot find it now!
> Are you referring to 61225?

That is the one, thanks.


Segher


Bug 62044 - [4.8/4.9 Regression] ICE in USE statement with RENAME for extended derived type

2015-01-27 Thread Paul Richard Thomas
Dear All,

The highly embarrassing bug in mold = allocations to class entities
has been fixed in revisions 220140 and 220191 for trunk and 4.9
respectively. The PR has been set as RESOLVED.

Cheers

Paul


Re: [PATCH][RFA][PR target/15184] Partial fix for direct byte access on x86

2015-01-27 Thread Jeff Law

On 01/27/15 13:36, Segher Boessenkool wrote:

I mean e.g. DI on a 32-bit target.  My worry is that zero_extend:DI then
is more expensive -- if say, it is implemented as a split, combine itself
cannot get rid of the redundancy.

OK.  Let me play with that a bit.



Okay, if there are actual real cases like that :-)  All this code does
is cull cases that are not useful to try to combine, since without that
combining four insns is very expensive.

There are :-)  It surprised me as well.




Does this do anything good for the "dec mem" thing on x86?  That would
be a nice bonus :-)

It might, but I haven't tested for that specifically.  If you've got
sample code or a PR in mind, pass it along and I'll take a look.  I'd
think dec mem would generally be handled by 3->1 insn combination code
unless there's something else going on.


I do have a specific PR in mind, but I cannot currently find it.  It was
about x86, dec mem and then using the flags...  Must have sent 100 emails
in that thread...  And cannot find it now!

Are you referring to 61225?

Jeff


Re: [debug-early] C++ clones and limbo DIEs

2015-01-27 Thread Jason Merrill

On 01/23/2015 01:45 PM, Aldy Hernandez wrote:

I would expect [the flush] to be before free_lang_data and LTO streaming.


The reason this wouldn't make a difference is because, as it stands,
dwarf for the clones are not generated until final.c:

   if (!DECL_IGNORED_P (current_function_decl))
     debug_hooks->function_decl (current_function_decl);

which happens after free_lang_data.


I agree that the current code doesn't have this effect, but we're 
talking about changing things, right? :)



Unfortunately, this sets DECL_ABSTRACT_P for the "static_p" above, and
refuses to unset it after the call to dwarf2out_decl.


Well, that sounds like a bug.  Why isn't it being unset?  Is it because 
DECL_ABSTRACT_P was already set for the function, so we don't call 
set_decl_abstract_flags (decl, 0)?  Perhaps a solution to that would be 
to avoid calling set_decl_abstract_flags (decl, 1) if the function is 
already marked as abstract.  Or to teach set_decl_abstract_flags not to 
mess with static local variables.
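
The first suggestion can be sketched as a guard around the set/clear pair. This is a toy model only: struct toy_decl and both functions are hypothetical, and the real set_decl_abstract_flags and dwarf2out code are not reproduced or modelled in detail here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy declaration: a function with at most one static local.  */
struct toy_decl
{
  bool abstract_p;
  struct toy_decl *static_local;
};

/* Stand-in for a flag setter that also touches the function's locals.  */
static void
toy_set_abstract_flags (struct toy_decl *fn, bool setting)
{
  fn->abstract_p = setting;
  if (fn->static_local != NULL)
    fn->static_local->abstract_p = setting;
}

/* Only set the flags if the function was not already abstract, and
   only clear what this call set.  An already-abstract function and
   its locals are left exactly as they were.  */
void
toy_emit_abstract_instance (struct toy_decl *fn)
{
  bool was_abstract = fn->abstract_p;

  if (!was_abstract)
    toy_set_abstract_flags (fn, true);

  /* ... DIE emission for the abstract instance would go here ...  */

  if (!was_abstract)
    toy_set_abstract_flags (fn, false);
}
```

The point of the guard is symmetry: the flags on locals are only ever changed by a call that will also restore them.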


Jason



[Patch, fortran] PR63205 - [OOP] Wrongly rejects type = class (for identical declared type)

2015-01-27 Thread Paul Richard Thomas
Dear All,

This patch enables the passing of an allocatable class object, scalar
or array, to a derived type of the declared type, either in an
assignment or as an actual argument. Much of the effort went into
sorting out the finalization call so that the 'left over' allocatable
components added by the dynamic type do not leak memory. At the
moment, the existence of the finalization function is tested for. A
check to see if the dynamic type is the same as the declared type
could be added.

Note that adding the 'must_finalize' field to gfc_expr will be useful
in enabling the missing mandatory finalization calls.

There are still interrogation marks about the patch; especially in
build_class_array_ref, where I do not understand why the added code
does not work in general, except for hidden function results.
Nonetheless, the code does not leak memory, apart perhaps from the
compound derived type constructors, with allocatable components that
already show leaks elsewhere. It is also well ringfenced and so should
not cause any regressions... touch wood!

Bootstraps and regtests on x86_64/FC21 - OK for trunk?

Paul

2015-01-27  Paul Thomas  

PR fortran/63205
* gfortran.h: Add 'must_finalize' field to gfc_expr and
prototypes for gfc_is_alloc_class_scalar_function and for
gfc_is_alloc_class_array_function.
* expr.c (gfc_is_alloc_class_scalar_function,
gfc_is_alloc_class_array_function): New functions.
* trans-array.c (gfc_add_loop_ss_code): Do not move the
expression for allocatable class scalar functions outside the
loop.
(conv_array_index_offset): Cope with deltas being NULL_TREE.
(build_class_array_ref): Do not return with allocatable class
array functions. Add code to pick out the returned class array.
Dereference if necessary and return if not a class object.
(gfc_conv_scalarized_array_ref): Cope with offsets being NULL.
(gfc_walk_function_expr): Return an array ss for the result of
an allocatable class array function.
* trans-expr.c (gfc_conv_subref_array_arg): Remove the assert
that the argument should be a variable. If an allocatable class
array function, set the offset to zero and skip the write-out
loop in this case.
(gfc_conv_procedure_call): Add allocatable class array function
to the assert. Call gfc_conv_subref_array_arg for allocatable
class array function arguments with derived type formal arg.
Add the code for handling allocatable class functions, including
finalization calls to prevent memory leaks.
(arrayfunc_assign_needs_temporary): Return if an allocatable
class array function.
(gfc_trans_assignment_1): Set must_finalize to rhs expression
for allocatable class functions. Set scalar_to_array as needed
for scalar class allocatable functions assigned to an array.
Nullify the allocatable components corresponding to the lhs
derived type so that the finalization does not free them.

2015-01-27  Paul Thomas  

PR fortran/63205
* gfortran.dg/class_to_type_4.f90: New test
Index: gcc/fortran/gfortran.h
===================================================================
*** gcc/fortran/gfortran.h  (revision 208092)
--- gcc/fortran/gfortran.h  (working copy)
*************** typedef struct gfc_expr
*** 1753,1758 ****
--- 1753,1761 ----
/* Mark an expression as being a MOLD argument of ALLOCATE.  */
unsigned int mold : 1;

+   /* Will require finalization after use.  */
+   unsigned int must_finalize : 1;
+
/* If an expression comes from a Hollerith constant or compile-time
   evaluation of a transfer statement, it may have a prescribed target-
   memory representation, and these cannot always be backformed from
*** bool gfc_expr_check_typed (gfc_expr*, gf
*** 2804,2809 
--- 2807,2814 

  gfc_component * gfc_get_proc_ptr_comp (gfc_expr *);
  bool gfc_is_proc_ptr_comp (gfc_expr *);
+ bool gfc_is_alloc_class_scalar_function (gfc_expr *);
+ bool gfc_is_alloc_class_array_function (gfc_expr *);

  bool gfc_ref_this_image (gfc_ref *ref);
  bool gfc_is_coindexed (gfc_expr *);
Index: gcc/fortran/expr.c
===
*** gcc/fortran/expr.c  (revision 208092)
--- gcc/fortran/expr.c  (working copy)
*** gfc_is_proc_ptr_comp (gfc_expr *expr)
*** 4274,4279 
--- 4274,4313 
  }


+ /* Determine if an expression is a function with an allocatable class scalar
+result.  */
+ bool
+ gfc_is_alloc_class_scalar_function (gfc_expr *expr)
+ {
+   if (expr->expr_type == EXPR_FUNCTION
+   && expr->value.function.esym
+   && expr->value.function.esym->result
+   && expr->value.function.esym->result->ts.type == BT_CLASS
+   && !CLASS_DATA (expr->value.function.esym->result)->attr.dimension
+   && CLASS_DATA (expr->value.function.esym->result)->attr.allocatable)
+ return true;
+
+   return false;
+ }
+
+
+ /* Determine if an expression is a function wit

Re: [PATCH][RFA][PR target/15184] Partial fix for direct byte access on x86

2015-01-27 Thread Segher Boessenkool
On Tue, Jan 27, 2015 at 12:27:38PM -0700, Jeff Law wrote:
> On 01/26/15 22:11, Segher Boessenkool wrote:
> >On Mon, Jan 26, 2015 at 08:07:29PM -0700, Jeff Law wrote:
> >>The second change we need is an additional simplification.
> >>
> >>If we have
> >>(subreg:M1 (zero_extend:M2 (x))
> >>
> >>Where M1 > M2 and both are scalar integer modes.  It's advantageous to
> >>strip the SUBREG and instead have a wider extension.
> >
> >Should you also check M1 is not multiple registers?
> We're generally working with pseudos, so we could estimate, but not know 
> for sure if we're dealing with multiple hard regs.  But more 
> importantly, I'm not sure what that check would buy us.

I mean e.g. DI on a 32-bit target.  My worry is that zero_extend:DI then
is more expensive -- if say, it is implemented as a split, combine itself
cannot get rid of the redundancy.

> Earlier versions checked reg_equal_p on the MEM.  But that's often a 
> mistake because the modes of the two memory references may be different. 
>  I don't recall which of the various tests, but I was definitely seeing 
> SImode in the load and HImode in the store.
> 
> Similarly you don't want to check reg_equal_p on the addresses as they 
> aren't necessarily the same either (they're obviously related).
> 
> That's how I ultimately settled on rtx_referenced_p form you see above. 
>  I'm still not sure that's 100% what I want, but I don't have any tests 
> yet which require something more complex.

Okay, if there are actual real cases like that :-)  All this code does
is cull cases that are not useful to try to combine, since without that
combining four insns is very expensive.

> >Does this do anything good for the "dec mem" thing on x86?  That would
> >be a nice bonus :-)
> It might, but I haven't tested for that specifically.  If you've got 
> sample code or a PR in mind, pass it along and I'll take a look.  I'd 
> think dec mem would generally be handled by 3->1 insn combination code 
> unless there's something else going on.

I do have a specific PR in mind, but I cannot currently find it.  It was
about x86, dec mem and then using the flags...  Must have sent 100 emails
in that thread...  And cannot find it now!


Segher
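
The simplification quoted in this message, rewriting
(subreg:M1 (zero_extend:M2 (x))) as a wider extension, can be modelled in
plain C.  This is only a sketch under assumptions: the toy_* names are
hypothetical, M2 is taken to be HImode and M1 SImode, and the paradoxical
subreg is modelled as "low bits defined, upper bits unspecified", which
ignores target details such as WORD_REGISTER_OPERATIONS.

```c
#include <assert.h>
#include <stdint.h>

/* Model of (zero_extend:HI (x:QI)).  */
static uint16_t
toy_zero_extend_qi_hi (uint8_t x)
{
  return x;
}

/* Model of (zero_extend:SI (x:QI)), the widened extension.  */
static uint32_t
toy_zero_extend_qi_si (uint8_t x)
{
  return x;
}

/* Model of a paradoxical (subreg:SI (y:HI) 0): the low 16 bits come
   from Y, the upper 16 bits are unspecified and passed in as UPPER.  */
static uint32_t
toy_paradoxical_subreg_si (uint16_t y, uint16_t upper)
{
  return ((uint32_t) upper << 16) | y;
}

/* Check that the widened extension agrees with the subreg form on
   every bit the subreg form actually defines.  */
void
toy_subreg_demo (void)
{
  for (unsigned x = 0; x < 256; x++)
    {
      uint32_t via_subreg
        = toy_paradoxical_subreg_si (toy_zero_extend_qi_hi ((uint8_t) x),
                                     0xabcd);
      uint32_t widened = toy_zero_extend_qi_si ((uint8_t) x);
      assert ((via_subreg & 0xffff) == (widened & 0xffff));
    }
}
```

Since the upper bits of the subreg form are "don't care", picking zeros for
them, as the wider zero_extend does, is one valid choice in this model.
Whether it is cheaper on a given target is the separate question raised
above.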


Re: [RFC PATCH] Emit DW_LANG_Fortran{03,08}

2015-01-27 Thread Tobias Burnus

Jakub Jelinek wrote:

DW_LANG_Fortran03 and DW_LANG_Fortran08 DW_AT_language values were recently
accepted into DWARF5.  This patch changes GCC to handle those similarly to
how e.g. the -std=c++11, -std=c++14 or -std=c11 are handled.


For completeness: gfortran currently produces "GNU Fortran" and 
DW_LANG_Fortran95; GCC itself also handles ...Fortran77 and 
...Fortran90, but those are not produced with gfortran.


With the patch, -gdwarf-2/3/4 (4 is the default) or 
"-gdwarf-5 -std=f95" produces the same as above. For -std=f2003 -gdwarf-5, it 
yields "GNU Fortran2003" and DW_LANG_Fortran03. And for -gdwarf-5 and 
the rest of -std= (f2008, f2008ts, gnu, legacy), it produces "GNU 
Fortran2008" and DW_LANG_Fortran08.


(In principle, they could have prepared for the future and added Fortran 
2015 as well.)



Regarding the change: it is fine with me. (However, I wonder how much 
will break once the "|| !dwarf_strict" is enabled, knowing that 
compilers are often updated more frequently than debuggers, valgrind and 
similar programs. On the other hand, except for debuggers, most tools should 
not care much about the DW_LANG.)


Tobias

PS: Talking about DWARF5, do you know when it will be available as a 
public draft? I am especially looking forward to 
http://dwarfstd.org/ShowIssue.php?issue=121221.1 (Allow DW_AT_type with 
DW_TAG_string_type), which would be a low-hanging fruit in terms of 
implementation, contrary to the array additions of 130313.5.



As it will take some time for consumers to catch up, I'm enabling that
only if -gdwarf-5 is used for now.

2015-01-27  Jakub Jelinek  

* dwarf2.h (enum dwarf_source_language): Add DW_LANG_Fortran03
and DW_LANG_Fortran08.
* dwarf2out.c (is_fortran): Also return true for DW_LANG_Fortran03
or DW_LANG_Fortran08.
(lower_bound_default): Return 1 for DW_LANG_Fortran03 or
DW_LANG_Fortran08.
(gen_compile_unit_die): Handle "GNU Fortran2003" and
"GNU Fortran2008" language strings.
* dbxout.c (get_lang_number): Use lang_GNU_Fortran.
* langhooks.h (lang_GNU_Fortran): New prototype.
* langhooks.c (lang_GNU_Fortran): New function.
fortran/
* options.c: Include langhooks.h.
(gfc_post_options): Change lang_hooks.name based on
selected -std= mode.
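
For illustration, the mapping from language string and DWARF version to
DW_AT_language described in this message can be sketched as a small
standalone C function.  The toy_* names are hypothetical; the 0x0022 and
0x0023 codes are the ones the patch adds to dwarf2.h, and 0x000e is the
existing DW_LANG_Fortran95 value.  The strict DWARF2 fallback to
DW_LANG_Fortran90 is deliberately left out of this sketch.

```c
#include <assert.h>
#include <string.h>

/* DW_AT_language codes used by the sketch.  */
enum toy_dwarf_lang
{
  TOY_DW_LANG_Fortran95 = 0x000e,
  TOY_DW_LANG_Fortran03 = 0x0022,
  TOY_DW_LANG_Fortran08 = 0x0023
};

/* Hypothetical helper: choose the language code for a "GNU Fortran*"
   language string.  The newer codes are only used for DWARF 5 and
   later; everything else stays at Fortran 95.  */
static int
toy_fortran_dw_lang (const char *language_string, int dwarf_version)
{
  int language = TOY_DW_LANG_Fortran95;

  if (dwarf_version >= 5)
    {
      if (strcmp (language_string, "GNU Fortran2003") == 0)
        language = TOY_DW_LANG_Fortran03;
      else if (strcmp (language_string, "GNU Fortran2008") == 0)
        language = TOY_DW_LANG_Fortran08;
    }

  return language;
}

/* A few spot checks of the mapping.  */
void
toy_fortran_dw_lang_demo (void)
{
  assert (toy_fortran_dw_lang ("GNU Fortran", 5) == TOY_DW_LANG_Fortran95);
  assert (toy_fortran_dw_lang ("GNU Fortran2003", 5) == TOY_DW_LANG_Fortran03);
  assert (toy_fortran_dw_lang ("GNU Fortran2008", 4) == TOY_DW_LANG_Fortran95);
}
```

Keeping the DWARF version check outside the string comparisons means that
older consumers never see the new codes unless -gdwarf-5 is requested.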


Re: [PATCH][RFA][PR target/15184] Partial fix for direct byte access on x86

2015-01-27 Thread Jeff Law

On 01/26/15 22:11, Segher Boessenkool wrote:

On Mon, Jan 26, 2015 at 08:07:29PM -0700, Jeff Law wrote:

The second change we need is an additional simplification.

If we have
(subreg:M1 (zero_extend:M2 (x))

Where M1 > M2 and both are scalar integer modes.  It's advantageous to
strip the SUBREG and instead have a wider extension.


Should you also check M1 is not multiple registers?
We're generally working with pseudos, so we could estimate, but not know 
for sure if we're dealing with multiple hard regs.  But more 
importantly, I'm not sure what that check would buy us.






Bootstrapped and regression tested on x86_64-unknown-linux-gnu.
Thoughts?


It looks fine to me.  Well, some comments...


@@ -2643,6 +2644,24 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, 
rtx_insn *i0,
   || GET_CODE (src) == LSHIFTRT)
nshift++;
}
+
+  /* If I0 loads a memory and I3 sets the same memory, then I2 and I3
+are likely manipulating its value.  Ideally we'll be able to combine
+all four insns into a bitfield insertion of some kind.
+
+Note the source in I0 might be inside a sign/zero extension and the
+memory modes in I0 and I3 might be different.  So extract the address
+from the destination of I3 and search for it in the source of I0.
+
+In the event that there's a match but the source/dest do not actually
+refer to the same memory, the worst that happens is we try some
+combinations that we wouldn't have otherwise.  */
+  if ((set0 = single_set (i0))
+ && (set3 = single_set (i3))
+ && GET_CODE (SET_DEST (set3)) == MEM
+ && rtx_referenced_p (XEXP (SET_DEST (set3), 0), SET_SRC (set0)))
+   ngood += 2;


I think you should test MEM_P (SET_SRC (set0)), too.  Or even just test
rtx_equal_p (SET_DEST (set3), SET_SRC (set0)) ?
Yea, we need a tighter test on set0 to ensure it's a MEM.  That code got 
twiddled before the last testrun.  I'll take care of that.


Earlier versions checked reg_equal_p on the MEM.  But that's often a 
mistake because the modes of the two memory references may be different. 
 I don't recall which of the various tests, but I was definitely seeing 
SImode in the load and HImode in the store.


Similarly you don't want to check reg_equal_p on the addresses as they 
aren't necessarily the same either (they're obviously related).


That's how I ultimately settled on rtx_referenced_p form you see above. 
 I'm still not sure that's 100% what I want, but I don't have any tests 
yet which require something more complex.








+
if (ngood < 2 && nshift < 2)
return 0;
  }
@@ -5663,6 +5682,25 @@ combine_simplify_rtx (rtx x, machine_mode op0_mode, int 
in_dest,
  return CONST0_RTX (mode);
}

+  /* If we have (subreg:M1 (zero_extend:M2 (x))) or
+(subreg:M1 (sign_extend: M2 (x))) where M1 is wider
+then M2, then go ahead and just widen the original extension.
+
+While the subreg is useful in saying "I don't care about those
+upper bits.  Squashing out the subreg results in simpler RTL that
+is more easily matched.  */


Closing quote missing.

Fixed locally.




+  if ((GET_CODE (SUBREG_REG (x)) == ZERO_EXTEND
+  || GET_CODE (SUBREG_REG (x)) == SIGN_EXTEND)
+ && SCALAR_INT_MODE_P (GET_MODE (x))
+ && SCALAR_INT_MODE_P (GET_MODE (SUBREG_REG (x)))
+ && GET_MODE (x) > GET_MODE (SUBREG_REG (x)))


GET_MODE_SIZE instead?
It works as-is, but using GET_MODE_SIZE shows the intent more clearly. 
I'll fix that momentarily.





Does this do anything good for the "dec mem" thing on x86?  That would
be a nice bonus :-)
It might, but I haven't tested for that specifically.  If you've got 
sample code or a PR in mind, pass it along and I'll take a look.  I'd 
think dec mem would generally be handled by 3->1 insn combination code 
unless there's something else going on.


jeff
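
On the GET_MODE versus GET_MODE_SIZE comment above: a minimal sketch of
why both forms can give the same answer while only one states the intent.
The toy_* enum and size table are hypothetical stand-ins, not GCC's real
machine modes.  The assumption is that the integer modes happen to be
declared in order of increasing size.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a few scalar integer machine modes,
   declared in order of increasing size.  */
enum toy_mode
{
  TOY_QImode,
  TOY_HImode,
  TOY_SImode,
  TOY_DImode,
  TOY_NUM_MODES
};

/* Size in bytes of each toy mode.  */
static const unsigned toy_mode_size[TOY_NUM_MODES] = { 1, 2, 4, 8 };

static unsigned
toy_get_mode_size (enum toy_mode m)
{
  return toy_mode_size[m];
}

/* Compare by size, which is what "wider" means.  */
static bool
toy_outer_is_wider (enum toy_mode outer, enum toy_mode inner)
{
  return toy_get_mode_size (outer) > toy_get_mode_size (inner);
}

/* Comparing the enum values directly agrees here, but only because of
   the declaration order.  */
void
toy_mode_demo (void)
{
  assert (toy_outer_is_wider (TOY_SImode, TOY_HImode));
  assert ((TOY_SImode > TOY_HImode)
          == toy_outer_is_wider (TOY_SImode, TOY_HImode));
  assert (!toy_outer_is_wider (TOY_HImode, TOY_HImode));
}
```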


Re: [Patch, Fortran] PR64771 - Fix coarray ICE

2015-01-27 Thread Tobias Burnus

Tobias Burnus wrote:

This one compiles just as well, of course.

 From my side, that patch (using MAX) is fine. Thanks for
bearing the bootstrap failure and for the patch.


I have now committed it (i.e. Rainer's patch) as Rev. 220182.
I have also committed the fixed-up/combined patch to the 4.9 branch as 
Rev. 220184.


(BTW: The original patch was approved by Paul on IRC.)

Tobias


C++ PATCH for c++/63889 (ICE with member variable template)

2015-01-27 Thread Jason Merrill
We were trying to instantiate is_ok with only the innermost set of 
template arguments; we need to make sure that the outer args are 
provided as well.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit e2df55ffbe254dfc15801a204af16d012aeb4cb5
Author: Jason Merrill 
Date:   Mon Jan 26 10:55:42 2015 -0500

	PR c++/63889
	* pt.c (finish_template_variable): Move from semantics.c.
	Handle multiple template arg levels.  Handle coercion here.
	(lookup_template_variable): Not here.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index bc26530..d377daa 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -8091,13 +8091,28 @@ tree
 lookup_template_variable (tree templ, tree arglist)
 {
   tree type = unknown_type_node;
-  tsubst_flags_t complain = tf_warning_or_error;
-  tree parms = INNERMOST_TEMPLATE_PARMS (DECL_TEMPLATE_PARMS (templ));
-  arglist = coerce_template_parms (parms, arglist, templ, complain,
-   /*req_all*/true, /*use_default*/true);
   return build2 (TEMPLATE_ID_EXPR, type, templ, arglist);
 }
 
+/* Instantiate a variable declaration from a TEMPLATE_ID_EXPR for use. */
+
+tree
+finish_template_variable (tree var)
+{
+  tree templ = TREE_OPERAND (var, 0);
+
+  tree arglist = TREE_OPERAND (var, 1);
+  tree tmpl_args = DECL_TI_ARGS (DECL_TEMPLATE_RESULT (templ));
+  arglist = add_outermost_template_args (tmpl_args, arglist);
+
+  tree parms = DECL_TEMPLATE_PARMS (templ);
+  tsubst_flags_t complain = tf_warning_or_error;
+  arglist = coerce_innermost_template_parms (parms, arglist, templ, complain,
+	 /*req_all*/true,
+	 /*use_default*/true);
+
+  return instantiate_template (templ, arglist, complain);
+}
 
 struct pair_fn_data
 {
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 915048d..75aa501 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -2454,15 +2454,6 @@ finish_call_expr (tree fn, vec **args, bool disallow_virtual,
   return result;
 }
 
-/* Instantiate a variable declaration from a TEMPLATE_ID_EXPR for use. */
-
-tree
-finish_template_variable (tree var)
-{
-  return instantiate_template (TREE_OPERAND (var, 0), TREE_OPERAND (var, 1),
-   tf_error);
-}
-
 /* Finish a call to a postfix increment or decrement or EXPR.  (Which
is indicated by CODE, which should be POSTINCREMENT_EXPR or
POSTDECREMENT_EXPR.)  */
diff --git a/gcc/testsuite/g++.dg/cpp1y/var-templ22.C b/gcc/testsuite/g++.dg/cpp1y/var-templ22.C
new file mode 100644
index 000..9ddc925
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/var-templ22.C
@@ -0,0 +1,14 @@
+// PR c++/63889
+// { dg-do compile { target c++14 } }
+
+template
+struct A
+{
+  template
+  static constexpr bool is_ok = true;
+
+  template>
+  A(T) { }
+};
+
+A p(42);


Re: [RFC PATCH] Emit DW_LANG_Fortran{03,08}

2015-01-27 Thread Jakub Jelinek
On Tue, Jan 27, 2015 at 01:52:12PM -0500, David Malcolm wrote:
> > @@ -398,6 +399,11 @@ gfc_post_options (const char **pfilename
> >  
> >gfc_cpp_post_options ();
> >  
> > +  if (gfc_option.allow_std & GFC_STD_F2008)
> > +lang_hooks.name = "GNU Fortran2008";
> > +  else if (gfc_option.allow_std & GFC_STD_F2003)
> > +lang_hooks.name = "GNU Fortran2003";
> > +
> 
> Did you test this on rs6000?
> 
> In particular, rs6000_output_function_epilogue has a:
>   else if (! strcmp (language_string, "GNU F77")
>  || ! strcmp (language_string, "GNU Fortran"))
>   i = 1;

You're right, missed that.  Consider that changed to lang_GNU_Fortran ().

Jakub
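
To make the rs6000 concern concrete, here is a small standalone sketch of
the difference between the exact strcmp test and a prefix test in the
style of lang_GNU_Fortran.  The toy_* function names are hypothetical;
only the language strings come from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Exact match, as in the existing back-end check.  */
static bool
toy_exact_match (const char *name)
{
  return strcmp (name, "GNU Fortran") == 0;
}

/* Prefix match on the first 11 characters, the length of
   "GNU Fortran".  */
static bool
toy_prefix_match (const char *name)
{
  return strncmp (name, "GNU Fortran", 11) == 0;
}

/* The exact test stops recognising the frontend once the language
   string carries a standard suffix; the prefix test does not.  */
void
toy_match_demo (void)
{
  assert (toy_exact_match ("GNU Fortran"));
  assert (!toy_exact_match ("GNU Fortran2008"));
  assert (toy_prefix_match ("GNU Fortran2008"));
  assert (!toy_prefix_match ("GNU F77"));
}
```

Any code that still compares the language string with strcmp against
"GNU Fortran" would silently stop matching for the renamed strings, which
is why such checks need to move to the prefix form.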


Re: [RFC PATCH] Emit DW_LANG_Fortran{03,08}

2015-01-27 Thread David Malcolm
On Tue, 2015-01-27 at 19:19 +0100, Jakub Jelinek wrote:
> Hi!
> 
> DW_LANG_Fortran03 and DW_LANG_Fortran08 DW_AT_language values were recently
> accepted into DWARF5.  This patch changes GCC to handle those similarly to
> how e.g. the -std=c++11, -std=c++14 or -std=c11 are handled.
> 
> As it will take some time for consumers to catch up, I'm enabling that
> only if -gdwarf-5 is used for now.
> 
> 2015-01-27  Jakub Jelinek  
> 
>   * dwarf2.h (enum dwarf_source_language): Add DW_LANG_Fortran03
>   and DW_LANG_Fortran08.
>   * dwarf2out.c (is_fortran): Also return true for DW_LANG_Fortran03
>   or DW_LANG_Fortran08.
>   (lower_bound_default): Return 1 for DW_LANG_Fortran03 or
>   DW_LANG_Fortran08.
>   (gen_compile_unit_die): Handle "GNU Fortran2003" and
>   "GNU Fortran2008" language strings.
>   * dbxout.c (get_lang_number): Use lang_GNU_Fortran.
>   * langhooks.h (lang_GNU_Fortran): New prototype.
>   * langhooks.c (lang_GNU_Fortran): New function.
> fortran/
>   * options.c: Include langhooks.h.
>   (gfc_post_options): Change lang_hooks.name based on
>   selected -std= mode.

(...snip...)

> --- gcc/fortran/options.c.jj  2015-01-12 21:29:11.0 +0100
> +++ gcc/fortran/options.c 2015-01-27 19:07:33.729285229 +0100
> @@ -43,6 +43,7 @@ along with GCC; see the file COPYING3.
>  #include "cpp.h"
>  #include "diagnostic.h"  /* For global_dc.  */
>  #include "tm.h"
> +#include "langhooks.h"
>  
>  gfc_option_t gfc_option;
>  
> @@ -398,6 +399,11 @@ gfc_post_options (const char **pfilename
>  
>gfc_cpp_post_options ();
>  
> +  if (gfc_option.allow_std & GFC_STD_F2008)
> +lang_hooks.name = "GNU Fortran2008";
> +  else if (gfc_option.allow_std & GFC_STD_F2003)
> +lang_hooks.name = "GNU Fortran2003";
> +

Did you test this on rs6000?

In particular, rs6000_output_function_epilogue has a:
  else if (! strcmp (language_string, "GNU F77")
   || ! strcmp (language_string, "GNU Fortran"))
i = 1;

Does that conditional need updating to track the langhooks.name change
(maybe to use your new lang_GNU_Fortran function?)

Dave




Re: [Patch, Fortran, OOP] PR 64230: [4.9/5 Regression] Invalid memory reference in a compiler-generated finalizer for allocatable component

2015-01-27 Thread Janus Weil
2015-01-27 19:23 GMT+01:00 Jakub Jelinek :
> On Tue, Jan 27, 2015 at 07:20:10PM +0100, Janus Weil wrote:
>> 2015-01-27 10:30 GMT+01:00 Jakub Jelinek :
>> > Yeah, if you want to add ubsan tests, you need to add gfortran.dg/ubsan/
>> > directory and hack up ubsan.exp in there
>>
>> Thanks for the remark, I was suspecting something like that. However,
>> for this case it's not really worth the hassle. In fact the test case
>> does not really need the sanitizer and should also work without it. So
>> I'll just remove the -fsanitize option:
>>
>> Index: gcc/testsuite/gfortran.dg/class_allocate_18.f90
>> ===
>> --- gcc/testsuite/gfortran.dg/class_allocate_18.f90(Revision 220180)
>> +++ gcc/testsuite/gfortran.dg/class_allocate_18.f90(Arbeitskopie)
>> @@ -1,5 +1,4 @@
>>  ! { dg-do run }
>> -! { dg-options "-fsanitize=undefined" }
>>  !
>>  ! PR 64230: [4.9/5 Regression] Invalid memory reference in a
>> compiler-generated finalizer for allocatable component
>>  !
>
> LGTM.

Good, committed as r220181. Since I had already backported the
original patch to 4.9 yesterday, I'll do the same there ...

Cheers,
Janus


Re: Merge current set of OpenACC changes from gomp-4_0-branch

2015-01-27 Thread Jack Howarth
Thomas,
 Any plans to fix
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64635 soon? On x86_64
darwin, the OpenACC merge resulted in a huge number of failures in the
libgomp test suite…


=== libgomp Summary ===
# of expected passes 10628
# of unexpected failures 724
# of unsupported tests 562

which are resolved with a fix similar to
https://gcc.gnu.org/bugzilla/attachment.cgi?id=34480.
   Jack


On Mon, Jan 26, 2015 at 8:44 AM, Thomas Schwinge
 wrote:
> Hi!
>
> Sorry for the late answer -- I've been on sick leave, and just now
> returning to work.  Julian, would you please have a look at the following
> issues?
>
>> > > In r219682, I have committed to trunk our current set of OpenACC changes,
>> > > which we had prepared on gomp-4_0-branch.  Thanks to everyone who has
>> > > been contributing!
>
> On Fri, 23 Jan 2015 20:20:53 +0300, Ilya Verbin  wrote:
>> On 17 Jan 02:16, Ilya Verbin wrote:
>> > Unfortunately, it broke offloading from shared libraries (I mean common 
>> > libs
>> > with NEEDED entries, not dlopened).
>
> Sorry for that!
>
>> > Such things are not covered by the
>> > testsuite, that's why you missed this issue.  Here is a simple testcase:
>
> 
>
> Probably a good motivation for adding such a test case.  ;-)
>
>> > So, you don't assume that a device can have multiple images from multiple 
>> > libs?
>>
>> Ping?
>
> This probably is "just" a bug that we introduced with our changes?
> (Julian?)
>
>
>> Also, could you please explain, why did you divide a device initialization 
>> into
>> two functions -- gomp_init_device and gomp_init_tables?
>
> As I understand it (again, Julian, please correct me if I got that
> wrong), the reason is that for OpenACC support, we need these as two
> separate (independent) actions.  Is this causing problems for OpenMP
> offloading?
>
>
>> Currently I'm trying to rebase on trunk my old patch, which fixes offloading
>> from dlopened libraries: 
>> http://gcc.gnu.org/ml/gcc-patches/2014-11/msg01604.html
>> It works for OpenMP and MIC, but I don't know how not to break OpenACC and 
>> PTX.
>
>
> Grüße,
>  Thomas


Re: [Patch, Fortran, OOP] PR 64230: [4.9/5 Regression] Invalid memory reference in a compiler-generated finalizer for allocatable component

2015-01-27 Thread Jakub Jelinek
On Tue, Jan 27, 2015 at 07:20:10PM +0100, Janus Weil wrote:
> 2015-01-27 10:30 GMT+01:00 Jakub Jelinek :
> > Yeah, if you want to add ubsan tests, you need to add gfortran.dg/ubsan/
> > directory and hack up ubsan.exp in there
> 
> Thanks for the remark, I was suspecting something like that. However,
> for this case it's not really worth the hassle. In fact the test case
> does not really need the sanitizer and should also work without it. So
> I'll just remove the -fsanitize option:
> 
> Index: gcc/testsuite/gfortran.dg/class_allocate_18.f90
> ===
> --- gcc/testsuite/gfortran.dg/class_allocate_18.f90(Revision 220180)
> +++ gcc/testsuite/gfortran.dg/class_allocate_18.f90(Arbeitskopie)
> @@ -1,5 +1,4 @@
>  ! { dg-do run }
> -! { dg-options "-fsanitize=undefined" }
>  !
>  ! PR 64230: [4.9/5 Regression] Invalid memory reference in a
> compiler-generated finalizer for allocatable component
>  !

LGTM.

Jakub


Re: [Patch, Fortran, OOP] PR 64230: [4.9/5 Regression] Invalid memory reference in a compiler-generated finalizer for allocatable component

2015-01-27 Thread Janus Weil
> On Tue, Jan 27, 2015 at 10:24:47AM +0100, Andreas Schwab wrote:
>>
>> > 2015-01-19  Janus Weil  
>> >
>> > PR fortran/64230
>> > * gfortran.dg/class_allocate_18.f90: Extended.
>>
>> FAIL: gfortran.dg/class_allocate_18.f90   -O0  (test for excess errors)
>> Excess errors:
>> /usr/ia64-suse-linux/bin/ld: cannot find -lubsan

Sorry for the breakage, guys!


2015-01-27 10:30 GMT+01:00 Jakub Jelinek :
> Yeah, if you want to add ubsan tests, you need to add gfortran.dg/ubsan/
> directory and hack up ubsan.exp in there

Thanks for the remark, I was suspecting something like that. However,
for this case it's not really worth the hassle. In fact the test case
does not really need the sanitizer and should also work without it. So
I'll just remove the -fsanitize option:

Index: gcc/testsuite/gfortran.dg/class_allocate_18.f90
===
--- gcc/testsuite/gfortran.dg/class_allocate_18.f90(Revision 220180)
+++ gcc/testsuite/gfortran.dg/class_allocate_18.f90(Arbeitskopie)
@@ -1,5 +1,4 @@
 ! { dg-do run }
-! { dg-options "-fsanitize=undefined" }
 !
 ! PR 64230: [4.9/5 Regression] Invalid memory reference in a
compiler-generated finalizer for allocatable component
 !


Cheers,
Janus


[RFC PATCH] Emit DW_LANG_Fortran{03,08}

2015-01-27 Thread Jakub Jelinek
Hi!

DW_LANG_Fortran03 and DW_LANG_Fortran08 DW_AT_language values were recently
accepted into DWARF5.  This patch changes GCC to handle those similarly to
how e.g. the -std=c++11, -std=c++14 or -std=c11 are handled.

As it will take some time for consumers to catch up, I'm enabling that
only if -gdwarf-5 is used for now.

2015-01-27  Jakub Jelinek  

* dwarf2.h (enum dwarf_source_language): Add DW_LANG_Fortran03
and DW_LANG_Fortran08.
* dwarf2out.c (is_fortran): Also return true for DW_LANG_Fortran03
or DW_LANG_Fortran08.
(lower_bound_default): Return 1 for DW_LANG_Fortran03 or
DW_LANG_Fortran08.
(gen_compile_unit_die): Handle "GNU Fortran2003" and
"GNU Fortran2008" language strings.
* dbxout.c (get_lang_number): Use lang_GNU_Fortran.
* langhooks.h (lang_GNU_Fortran): New prototype.
* langhooks.c (lang_GNU_Fortran): New function.
fortran/
* options.c: Include langhooks.h.
(gfc_post_options): Change lang_hooks.name based on
selected -std= mode.

--- include/dwarf2.h.jj 2014-11-26 20:35:01.0 +0100
+++ include/dwarf2.h2015-01-27 17:55:18.086122137 +0100
@@ -312,6 +312,8 @@ enum dwarf_source_language
 DW_LANG_C_plus_plus_11 = 0x001a, /* dwarf5.20141029.pdf DRAFT */
 DW_LANG_C11 = 0x001d,
 DW_LANG_C_plus_plus_14 = 0x0021,
+DW_LANG_Fortran03 = 0x0022,
+DW_LANG_Fortran08 = 0x0023,
 
 DW_LANG_lo_user = 0x8000,  /* Implementation-defined range start.  */
 DW_LANG_hi_user = 0x,  /* Implementation-defined range start.  */
--- gcc/dwarf2out.c.jj  2015-01-27 17:54:13.0 +0100
+++ gcc/dwarf2out.c 2015-01-27 19:03:30.632411565 +0100
@@ -4736,7 +4736,9 @@ is_fortran (void)
 
   return (lang == DW_LANG_Fortran77
  || lang == DW_LANG_Fortran90
- || lang == DW_LANG_Fortran95);
+ || lang == DW_LANG_Fortran95
+ || lang == DW_LANG_Fortran03
+ || lang == DW_LANG_Fortran08);
 }
 
 /* Return TRUE if the language is Ada.  */
@@ -16720,6 +16722,8 @@ lower_bound_default (void)
 case DW_LANG_Fortran77:
 case DW_LANG_Fortran90:
 case DW_LANG_Fortran95:
+case DW_LANG_Fortran03:
+case DW_LANG_Fortran08:
   return 1;
 case DW_LANG_UPC:
 case DW_LANG_D:
@@ -19781,8 +19785,17 @@ gen_compile_unit_die (const char *filena
 {
   if (strcmp (language_string, "GNU Ada") == 0)
language = DW_LANG_Ada95;
-  else if (strcmp (language_string, "GNU Fortran") == 0)
-   language = DW_LANG_Fortran95;
+  else if (strncmp (language_string, "GNU Fortran", 11) == 0)
+   {
+ language = DW_LANG_Fortran95;
+ if (dwarf_version >= 5 /* || !dwarf_strict */)
+   {
+ if (strcmp (language_string, "GNU Fortran2003") == 0)
+   language = DW_LANG_Fortran03;
+ else if (strcmp (language_string, "GNU Fortran2008") == 0)
+   language = DW_LANG_Fortran08;
+   }
+   }
   else if (strcmp (language_string, "GNU Java") == 0)
language = DW_LANG_Java;
   else if (strcmp (language_string, "GNU Objective-C") == 0)
@@ -19796,7 +19809,7 @@ gen_compile_unit_die (const char *filena
}
 }
   /* Use a degraded Fortran setting in strict DWARF2 so is_fortran works.  */
-  else if (strcmp (language_string, "GNU Fortran") == 0)
+  else if (strncmp (language_string, "GNU Fortran", 11) == 0)
 language = DW_LANG_Fortran90;
 
   add_AT_unsigned (die, DW_AT_language, language);
@@ -19806,6 +19819,8 @@ gen_compile_unit_die (const char *filena
 case DW_LANG_Fortran77:
 case DW_LANG_Fortran90:
 case DW_LANG_Fortran95:
+case DW_LANG_Fortran03:
+case DW_LANG_Fortran08:
   /* Fortran has case insensitive identifiers and the front-end
 lowercases everything.  */
   add_AT_unsigned (die, DW_AT_identifier_case, DW_ID_down_case);
--- gcc/dbxout.c.jj 2015-01-15 20:25:30.0 +0100
+++ gcc/dbxout.c2015-01-27 18:58:58.286033152 +0100
@@ -967,7 +967,7 @@ get_lang_number (void)
 return N_SO_CC;
   else if (strcmp (language_string, "GNU F77") == 0)
 return N_SO_FORTRAN;
-  else if (strcmp (language_string, "GNU Fortran") == 0)
+  else if (lang_GNU_Fortran ())
 return N_SO_FORTRAN90; /* CHECKME */
   else if (strcmp (language_string, "GNU Pascal") == 0)
 return N_SO_PASCAL;
--- gcc/langhooks.c.jj  2015-01-09 21:59:54.0 +0100
+++ gcc/langhooks.c 2015-01-27 18:58:37.375387995 +0100
@@ -731,3 +731,11 @@ lang_GNU_CXX (void)
 {
   return strncmp (lang_hooks.name, "GNU C++", 7) == 0;
 }
+
+/* Returns true if the current lang_hooks represents the GNU Fortran frontend. 
 */
+
+bool
+lang_GNU_Fortran (void)
+{
+  return strncmp (lang_hooks.name, "GNU Fortran", 11) == 0;
+}
--- gcc/langhooks.h.jj  2015-01-05 13:07:13.0 +0100
+++ gcc/langhooks.h 2015-01-27 18:57:51.139172602 +0100
@@ -509,5 +509,6 @@ extern tree add_builtin_type (const char
 
 extern bool

Re: [PATCH][2/2] Improve array-bound warnings and VRP

2015-01-27 Thread Martin Uecker


Richard Biener wrote:

> On Mon, 26 Jan 2015, Jakub Jelinek wrote:

> > Then it probably should be ok.  I'm really afraid of emitting more warnings
> > with such high false positive rate now.
>
> As the patch also mitigates some of the code bloat we get with
> the complete peeling (regression against 4.7) I have installed it.
> It's also the easiest vehicle to verify range-info is not broken
> by passes between vrp1 and vrp2.

You could make warnings appear only for warn_array_bounds > 1
if there are concerns about false positives.

For what it's worth, I tested the old version of both patches on 
one of my projects (mostly numerical algorithms) and it did not 
produce additional warnings.

I really appreciate all improvements in this area.

Martin




Re: [ping] Re: proper name of i386/x86-64/etc targets

2015-01-27 Thread Uros Bizjak
On Tue, Jan 27, 2015 at 2:56 AM, Sandra Loosemore
 wrote:
> On 01/20/2015 12:02 PM, H.J. Lu wrote:
>>
>> On Tue, Jan 20, 2015 at 10:51 AM, Eric Botcazou 
>> wrote:

 Ping?  Any thoughts?
>>>
>>>
>>> x86 for the family and x86-32/x86-64 for the 2 architectures?
>>>
>>
>> Works for me.
>
>
> [redirecting from gcc@ to gcc-patches@]
>
> OK, here is a patch that attempts to implement that convention.  I'd
> appreciate review from a target maintainer to check that I've correctly
> disambiguated places where "i386" was referring to both 32- and 64-bit
> variants vs 32-bit only.  I've left alone some instances of "i386" where it
> seemed appropriate to name a specific processor -- e.g. there are a bunch of
> examples in the inline asm section that are described as "i386 code".
>
> If this is OK to commit, I will follow it up with another patch to
> re-alphabetize the renamed sections ("i386 whatever" to "x86 whatever").
> Trying to do both the renaming and the shuffling in a single patch would
> have made it impossible to review the actual changes to content. When I was
> working on this I also realized that some of the x86-specific material in
> extend.texi really needs copy-editing; again, best to do that in a separate
> patch.

-@node i386 and x86-64 Options
-@subsection Intel 386 and AMD x86-64 Options
+@node x86 Options
+@subsection x86 Options
 @cindex i386 Options
-@cindex x86-64 Options
+@cindex x86 Options
+@cindex IA-32 Options
 @cindex Intel 386 Options
 @cindex AMD x86-64 Options

Let's go all the way and remove all but  "@cindex x86 Options".

-These @samp{-m} options are defined for the i386 and x86-64 family of
-computers:
+These @samp{-m} options are defined for the x86 family of computers,
+including both x86-32 (IA-32 and Intel 386) and AMD x86-64:

Also here. "... the x86 family of computers.". Without the "including ..." part.

-@node i386 and x86-64 Windows Options
-@subsection i386 and x86-64 Windows Options
-@cindex i386 and x86-64 Windows Options
+@node x86 Windows Options
+@subsection x86 Windows Options
+@cindex x86 Windows Options
+@cindex i386 Windows Options
+@cindex Intel 386 Windows Options
+@cindex AMD x86-64 Windows Options
+@cindex Windows Options for x86

IMO, all but "@cindex x86 Windows Options" should be removed.

Others LGTM.

Thanks,
Uros.


Re: RFA: patch to fix a bad code generation for PR64110 -- new constraints addition

2015-01-27 Thread Richard Sandiford
Vladimir Makarov  writes:
> On 01/27/2015 09:08 AM, Richard Sandiford wrote:
>> Yeah, but in practice that's only ever going to be a partial transition.
>> Many port maintainers won't look at this, so we'll have to support both
>> versions indefinitely, even if the new behaviour turns out to be the
>> best for all cases.
>>
>> I just think we're going to regret having two sets of constraints with
>> such subtly different meanings.
>>
>> Looking back at the original PR, Jakub said:
>>
>>   The ! has been added by me for PR63594, so it isn't there from the era
>>   when i?86 backend was using reload.  If there is a better way to
>>   express that RA should prefer to use memory or xmm register and only
>>   use r constraint if it already is in a r register and doesn't need to
>>   be reloaded, I can use that.  Whether it is ?, ??? or something else.
>>   ! description in gcc docs just fitted most what I wanted...
>>
>> In some ways this seems to match the intention of "*".  Originally I think
>> it was just an RA-only thing and was ignored by reload, but LRA does take it
>> into account too (which sounds like progress to me).
>   I guess we don't need '*' in many cases.  It is overused.  Imho, IRA
> should decide what class is better based on costs of alternatives and
> the explicit exclusion of register class by using '*' is a bad practice.
>
>   That said, I believe we should make the register class preferencing
> algorithm more alternative-oriented.  The algorithm should first choose
> an alternative (or maybe a subset of alternatives) and then register
> classes.  I think that is more logical.  It would permit us to get rid of
> all such constraints, including '*', and use only one like '?', which
> increases the alternative cost.
>
>   In the long run it would be even better to get rid of '?' too and have
> some hook (or attribute) to get insn alternative costs, which can depend on
> the sub-target or other run-time characteristics.  Otherwise we need to
> duplicate insn descriptions and put different insn guards.  I am going
> to work on this.  But it is hard to say whether it will work well (I may
> have some performance issues with this).  This hook could somehow (min or
> average of the values for all alternatives) be used in the combiner and
> other algorithms that need an insn cost.  That is how I see the solution
> to the problem in the long term.

Definitely agree that it'd be better to remove these constraints
in favour of a new attribute.  preferred_for_size and preferred_for_speed
give something similar, though they're much more stringent than what
we need here.

>> If I revert the patch locally and change the *vec_dup pattern to
>> use "*", it passes both the test for PR64110 and the tests for PR63594.
>> Would that be OK as an alternative?
>>
>   I don't think it will work in the general case.  It probably works because
> a different class is chosen in IRA.  If IRA for some reason chooses the
> same class, we might see the same problem in LRA.

But isn't that the point of '*'?  It should stop IRA from using the 'r'
alternative as an indication that 'r' is a good choice for this instruction.
If IRA chooses 'r' anyway, it must be because other instructions that
use the same allocno strongly prefer 'r'.

And in those same circumstances -- i.e. if IRA does choose 'r' despite
the constraints in this instruction -- then I think we do want to use the
'r' alternative.  And AIUI that's also what the new constraint is designed
to do.  If IRA chooses 'r' anyway, the new constraint causes LRA to prefer
the 'r' alternative _even if_ another operand (the destination) has to
be reloaded, which is the fundamental difference between the new constraint
and '!'.

So I'm still not sure why '*' wouldn't do what we want.

>  I also don't like when register classes are excluded by '*' for IRA
> (see my thoughts above).

Understood, and I agree it would be good to move to attributes.
But in a way, I think that's an even better reason to try to avoid
adding these new constraints.  It sounds like we're hoping to get rid
of them as soon as we've added them :-)

Thanks,
Richard



[PATCH] PR jit/64780: configure: --enable-host-shared and the jit

2015-01-27 Thread David Malcolm
Currently the jit requires you to specify --enable-host-shared, or the
build eventually fails with  linker errors (this is something of a FAQ
for people trying out the jit).

We seem to have two choices here:

(A) default to --enable-host-shared when jit is an enabled language
(B) have the toplevel configure reject jit as a language if
--enable-host-shared is not supplied.

FWIW apparently Darwin defaults to position-independent code, so it's not
explicitly needed there.

I think (B) is the better option for us, since there is a performance
cost: there are people who perform benchmarking of GCC (and publish
their results on prominent websites).  If they turn on the jit and use
the same configuration to do their benchmarking of the rest of GCC,
they'll see GCC 5 be apparently slower than earlier releases.  This
is sufficiently subtle that I don't think it's reasonable to simply
document it and expect 3rd-party reviewers to see such a note in the
documentation before benchmarking.

The attached patch implements (B), with a note in the error message
recommending that people configure and build GCC twice to avoid the
performance hit, so that it can be self-documenting.

Tested by hand with various combinations of values for
--enable-host-shared and --enable-languages.

OK for stage 4?

PR jit/64780
* configure.ac: Require the user to explicitly specify
--enable-host-shared if the jit is enabled.
* configure: Regenerate.
---
 configure    | 24 ++++++++++++++++++++++++
 configure.ac | 24 ++++++++++++++++++++++++
 2 files changed, 48 insertions(+)

diff --git a/configure b/configure
index 5860241..dd794db 100755
--- a/configure
+++ b/configure
@@ -14750,6 +14750,30 @@ fi
 
 
 
+# PR jit/64780: Require the user to explicitly specify
+# --enable-host-shared if the jit is enabled, hinting
+# that they might want to do a separate configure/build of
+# the jit, to avoid users from slowing down the rest of the
+# compiler by enabling the jit.
+if test ${host_shared} = "no" ; then
+  case "${enable_languages}" in
+*jit*)
+  as_fn_error "
+Enabling language \"jit\" requires --enable-host-shared.
+
+--enable-host-shared typically slows the rest of the compiler down by
+a few %, so you must explicitly enable it.
+
+If you want to build both the jit and the regular compiler, it is often
+best to do this via two separate configure/builds, in separate
+directories, to avoid imposing the performance cost of
+--enable-host-shared on the regular compiler." "$LINENO" 5
+  ;;
+*)
+  ;;
+  esac
+fi
+
 # Specify what files to not compare during bootstrap.
 
 compare_exclusions="gcc/cc*-checksum\$(objext) | gcc/ada/*tools/*"
diff --git a/configure.ac b/configure.ac
index 267c8e6..4ea5e00 100644
--- a/configure.ac
+++ b/configure.ac
@@ -3467,6 +3467,30 @@ AC_ARG_ENABLE(host-shared,
 [host_shared=$enableval], [host_shared=no])
 AC_SUBST(host_shared)
 
+# PR jit/64780: Require the user to explicitly specify
+# --enable-host-shared if the jit is enabled, hinting
+# that they might want to do a separate configure/build of
+# the jit, to avoid users from slowing down the rest of the
+# compiler by enabling the jit.
+if test ${host_shared} = "no" ; then
+  case "${enable_languages}" in
+*jit*)
+  AC_MSG_ERROR([
+Enabling language "jit" requires --enable-host-shared.
+
+--enable-host-shared typically slows the rest of the compiler down by
+a few %, so you must explicitly enable it.
+
+If you want to build both the jit and the regular compiler, it is often
+best to do this via two separate configure/builds, in separate
+directories, to avoid imposing the performance cost of
+--enable-host-shared on the regular compiler.])
+  ;;
+*)
+  ;;
+  esac
+fi
+
 # Specify what files to not compare during bootstrap.
 
 compare_exclusions="gcc/cc*-checksum\$(objext) | gcc/ada/*tools/*"
-- 
1.8.5.3



Fix ICE in ipa-devirt

2015-01-27 Thread Jan Hubicka
Hi,
the two testcases show a somewhat crazy layout of a C++ object that goes
in the order base1, base2, virtual_base_of_base1.
This confuses the walk in get_binfo_at_offset: while looking for
virtual_base_of_base1 it looks into base2 instead of base1.

It seems that in the case of virtual inheritance we simply want to do a
fully recursive search for the given binfo; it is not that expensive because
there are not many bases.

Bootstrapped/regtested x86_64-linux.  Will commit it today after rebuilding
firefox.
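
The fully recursive search described above can be sketched outside of GCC
roughly as follows.  This is only a toy model: struct toy_binfo, its fields
and toy_lookup_binfo_at_offset are hypothetical stand-ins for the real BINFO
accessors, and type identity is reduced to comparing integers instead of
calling types_same_for_odr.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a BINFO: a type id, a byte offset within the
   complete object, and the list of direct bases.  */
struct toy_binfo
{
  int type_id;
  long offset;
  size_t n_bases;
  const struct toy_binfo *const *bases;
};

/* Depth-first search through all bases of BINFO for a sub-binfo of type
   TYPE_ID located at byte offset POS.  Returns NULL if there is none.  */
const struct toy_binfo *
toy_lookup_binfo_at_offset (const struct toy_binfo *binfo, int type_id,
                            long pos)
{
  assert (binfo != NULL);
  for (size_t i = 0; i < binfo->n_bases; i++)
    {
      const struct toy_binfo *base = binfo->bases[i];

      /* A direct base that matches both offset and type wins.  */
      if (base->offset == pos && base->type_id == type_id)
        return base;

      /* Otherwise look inside this base before trying the next one.  */
      const struct toy_binfo *found
        = toy_lookup_binfo_at_offset (base, type_id, pos);
      if (found != NULL)
        return found;
    }
  return NULL;
}
```

Because every base is visited, the search no longer depends on guessing
which base contains the offset, which is where the old walk went wrong for
the base1, base2, virtual_base_of_base1 layout.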

PR IPA/60871
PR IPA/64139
* tree.c (lookup_binfo_at_offset): New function.
(get_binfo_at_offset): Use it.

* g++.dg/torture/pr64139.C: New testcase.
* g++.dg/torture/pr60871.C: Likewise.
Index: tree.c
===
--- tree.c  (revision 220142)
+++ tree.c  (working copy)
@@ -11990,6 +11990,23 @@ type_in_anonymous_namespace_p (const_tre
   return (TYPE_STUB_DECL (t) && !TREE_PUBLIC (TYPE_STUB_DECL (t)));
 }
 
+/* Lookup sub-BINFO of BINFO of TYPE at offset POS.  */
+
+tree
+lookup_binfo_at_offset (tree binfo, tree type, HOST_WIDE_INT pos)
+{
+  unsigned int i;
+  tree base_binfo, b;
+
+  for (i = 0; BINFO_BASE_ITERATE (binfo, i, base_binfo); i++)
+if (pos == tree_to_shwi (BINFO_OFFSET (base_binfo))
+   && types_same_for_odr (TREE_TYPE (base_binfo), type))
+  return base_binfo;
+else if ((b = lookup_binfo_at_offset (base_binfo, type, pos)) != NULL)
+  return b;
+  return NULL;
+}
+
 /* Try to find a base info of BINFO that would have its field decl at offset
OFFSET within the BINFO type and which is of EXPECTED_TYPE.  If it can be
found, return, otherwise return NULL_TREE.  */
@@ -12027,42 +12044,22 @@ get_binfo_at_offset (tree binfo, HOST_WI
 represented in the binfo for the derived class.  */
   else if (offset != 0)
{
- tree base_binfo, binfo2 = binfo;
-
- /* Find BINFO corresponding to FLD.  This is bit harder
-by a fact that in virtual inheritance we may need to walk down
-the non-virtual inheritance chain.  */
- while (true)
-   {
- tree containing_binfo = NULL, found_binfo = NULL;
- for (i = 0; BINFO_BASE_ITERATE (binfo2, i, base_binfo); i++)
-   if (types_same_for_odr (TREE_TYPE (base_binfo), TREE_TYPE (fld)))
- {
-   found_binfo = base_binfo;
-   break;
- }
-   else
- if ((tree_to_shwi (BINFO_OFFSET (base_binfo)) 
-  - tree_to_shwi (BINFO_OFFSET (binfo)))
- * BITS_PER_UNIT < pos
- /* Rule out types with no virtual methods or we can get confused
-here by zero sized bases.  */
- && TYPE_BINFO (BINFO_TYPE (base_binfo))
- && BINFO_VTABLE (TYPE_BINFO (BINFO_TYPE (base_binfo)))
- && (!containing_binfo
- || (tree_to_shwi (BINFO_OFFSET (containing_binfo))
- < tree_to_shwi (BINFO_OFFSET (base_binfo)))))
-   containing_binfo = base_binfo;
- if (found_binfo)
-   {
- binfo = found_binfo;
- break;
-   }
- if (!containing_binfo)
-   return NULL_TREE;
- binfo2 = containing_binfo;
-   }
-   }
+ tree found_binfo = NULL, base_binfo;
+ int offset = (tree_to_shwi (BINFO_OFFSET (binfo)) + pos
+   / BITS_PER_UNIT);
+
+ for (i = 0; BINFO_BASE_ITERATE (binfo, i, base_binfo); i++)
+   if (tree_to_shwi (BINFO_OFFSET (base_binfo)) == offset
+   && types_same_for_odr (TREE_TYPE (base_binfo), TREE_TYPE (fld)))
+ {
+   found_binfo = base_binfo;
+   break;
+ }
+ if (found_binfo)
+   binfo = found_binfo;
+ else
+   binfo = lookup_binfo_at_offset (binfo, TREE_TYPE (fld), offset);
+}
 
   type = TREE_TYPE (fld);
   offset -= pos;
Index: testsuite/g++.dg/torture/pr64139.C
===
--- testsuite/g++.dg/torture/pr64139.C  (revision 0)
+++ testsuite/g++.dg/torture/pr64139.C  (revision 0)
@@ -0,0 +1,34 @@
+// { dg-do compile }
+class IObject {
+public:
+  virtual ~IObject();
+};
+class A {
+  virtual int m_fn1();
+};
+class B {
+public:
+  virtual int m_fn2(B) const;
+};
+class D : IObject, public virtual B {};
+class G : public D, A {
+public:
+  G(A);
+};
+class F : B {
+  friend class C;
+};
+class C {
+  void m_fn3(const IObject &, int &);
+  void m_fn4(const B &, int &);
+};
+A a;
+void C::m_fn3(const IObject &, int &p2) {
+  G r(a);
+  m_fn4(r, p2);
+}
+void C::m_fn4(const B &p1, int &) {
+  F b;
+  p1.m_fn2(b);
+}
+
Index: testsuite/g++.dg/torture/pr60871.C
==

Fix gimple-fold ICE

2015-01-27 Thread Jan Hubicka
Hi,
this patch fixes an ICE on type-inconsistent programs where the vtable pointer
is worked out to be an arbitrary pointer to something else.

Bootstrapped/regtested x86_64-linux, committed.

Honza

Index: ChangeLog
===
--- ChangeLog   (revision 220176)
+++ ChangeLog   (working copy)
@@ -1,3 +1,9 @@
+2015-01-27  Jan Hubicka  
+
+   PR ipa/64282
+   * gimple-fold.c (gimple_get_virt_method_for_vtable): Remove assert
+   on vtable being vtable.
+
 2015-01-27  Dominik Vogt  
 
 * doc/extend.texi: s/390: Update documentation of hotpatch attribute.
Index: testsuite/ChangeLog
===
--- testsuite/ChangeLog (revision 220176)
+++ testsuite/ChangeLog (working copy)
@@ -1,3 +1,8 @@
+2015-01-27  Jan Hubicka  
+
+   PR ipa/64282
+   * g++.dg/torture/pr64282.C: New testcase.
+
 2015-01-27  Kyrylo Tkachov  
 
* gcc.target/aarch64/store-pair-1.c: Update scan-assembler to check
Index: testsuite/g++.dg/torture/pr64282.C
===
--- testsuite/g++.dg/torture/pr64282.C  (revision 0)
+++ testsuite/g++.dg/torture/pr64282.C  (revision 0)
@@ -0,0 +1,101 @@
+// { dg-do compile }
+template  struct A
+{
+  _T1 first;
+};
+struct B
+{
+  int operator!=(B);
+};
+template  struct C
+{
+  C (B);
+  _Tp operator*();
+  int operator!=(C);
+};
+template  class D
+{
+public:
+  typedef C<_Tp> const_iterator;
+  const_iterator m_fn1 () const;
+  B m_fn2 ();
+  void m_fn3 ();
+};
+class F
+{
+  struct G
+  {
+static G &
+m_fn5 ()
+{
+  void fn1 ();
+  return *reinterpret_cast (fn1);
+}
+int *
+m_fn6 ()
+{
+  return reinterpret_cast (this);
+}
+  };
+  struct _Alloc_hider
+  {
+_Alloc_hider (int *p1, int) : _M_p (p1) {}
+int *_M_p;
+  } _M_dataplus;
+  G &
+  m_fn4 ()
+  {
+return G::m_fn5 ();
+  }
+public:
+  F () : _M_dataplus (m_fn4 ().m_fn6 (), 0) {}
+};
+class H
+{
+  void m_fn7 (const F &, bool &);
+  bool m_fn8 (const D &, const F &, F &);
+};
+typedef A CandPair;
+class I
+{
+public:
+  virtual void m_fn9 (const F &, bool, D &);
+};
+class J : I
+{
+public:
+  void m_fn9 (const F &, bool, D &);
+};
+D c;
+void
+J::m_fn9 (const F &, bool, D &)
+{
+  D a;
+  for (B b; b != a.m_fn2 ();)
+;
+}
+inline void
+fn2 (F p1, int, int, J *p4, D)
+{
+  D d;
+  d.m_fn3 ();
+  p4->m_fn9 (p1, 0, d);
+  for (D::const_iterator e = c.m_fn1 (); e != c.m_fn2 ();)
+(*e)->m_fn9 (p1, 0, d);
+}
+void
+H::m_fn7 (const F &, bool &)
+{
+  A f;
+  D g;
+  F h;
+  m_fn8 (g, f.first, h);
+}
+bool
+H::m_fn8 (const D &p1, const F &, F &)
+{
+  F i;
+  p1.m_fn1 ();
+  D j;
+  fn2 (i, 0, 0, 0, j);
+}
Index: gimple-fold.c
===
--- gimple-fold.c   (revision 220142)
+++ gimple-fold.c   (working copy)
@@ -5649,7 +5649,6 @@ gimple_get_virt_method_for_vtable (HOST_
   if (TREE_CODE (v) != VAR_DECL
   || !DECL_VIRTUAL_P (v))
 {
-  gcc_assert (in_lto_p);
   /* Pass down that we lost track of the target.  */
   if (can_refer)
*can_refer = false;


Re: RFA: patch to fix a bad code generation for PR64110 -- new constraints addition

2015-01-27 Thread Vladimir Makarov
On 01/27/2015 09:08 AM, Richard Sandiford wrote:
>
> Yeah, but in practice that's only ever going to be a partial transition.
> Many port maintainers won't look at this, so we'll have to support both
> versions indefinitely, even if the new behaviour turns out to be the
> best for all cases.
>
> I just think we're going to regret having two sets of constraints with
> such subtly different meanings.
>
> Looking back at the original PR, Jakub said:
>
>   The ! has been added by me for PR63594, so it isn't there from the era
>   when i?86 backend was using reload.  If there is a better way to
>   express that RA should prefer to use memory or xmm register and only
>   use r constraint if it already is in a r register and doesn't need to
>   be reloaded, I can use that.  Whether it is ?, ??? or something else.
>   ! description in gcc docs just fitted most what I wanted...
>
> In some ways this seems to match the intention of "*".  Originally I think
> it was just an RA-only thing and was ignored by reload, but LRA does take it
> into account too (which sounds like progress to me).
  I guess we don't need '*' in many cases.  It is overused.  Imho, IRA
should decide what class is better based on costs of alternatives and
the explicit exclusion of register class by using '*' is a bad practice.

  Saying that, I believe we should make the register class preferencing
algorithm more alternative oriented.  The algorithm should first choose
an alternative (or maybe a subset of alternatives) and then register
classes.  I think it is more logical.  It would permit us to get rid of
all such constraints including '*' and use only one like '?' which
increases the alternative cost.

  In the long run it is even better to get rid of '?' too and have some hook
(or attribute) to get insn alternative costs which can depend on
sub-target or other run-time characteristics.  Otherwise we need to
duplicate insn descriptions and put different insn guards.  I am going
to work on this.  But it is hard to say whether it will work well (maybe I
will have some performance issues with this).  This hook somehow (min or
average of the values for all alternatives) can be used in combiner and other
algorithms that need an insn cost. That is how I see the solution of the
problem in the long term.
> If I revert the patch locally and change the *vec_dup pattern to
> use "*", it passes both the test for PR64110 and the tests for PR63594.
> Would that be OK as an alternative?
>
  I don't think it will work in the general case.  It probably works because
a different class is chosen in IRA.  If IRA for some reason chooses the
same class, we might see the same problem in LRA.  I also don't like it
when register classes are excluded by '*' for IRA (see my thoughts above).




[PATCH][AArch64][test][committed] Fix FAIL: gcc.target/aarch64/store-pair-1.c scan-assembler stp\tw[0-9]+, w[0-9]+

2015-01-27 Thread Kyrill Tkachov

Hi all,

I notice this test fails on aarch64-none-elf because the scan-assembler 
scans for w registers when one of them can be the wzr reg since we 
store a 0 into *a.

This patch updates the pattern that is scanned for.

Committed as obvious with r220176.

Thanks,
Kyrill

2015-01-27  Kyrylo Tkachov  

* gcc.target/aarch64/store-pair-1.c: Update scan-assembler to check
for wzr reg.

diff --git a/gcc/testsuite/gcc.target/aarch64/store-pair-1.c b/gcc/testsuite/gcc.target/aarch64/store-pair-1.c
index a726d64..a90fc61 100644
--- a/gcc/testsuite/gcc.target/aarch64/store-pair-1.c
+++ b/gcc/testsuite/gcc.target/aarch64/store-pair-1.c
@@ -9,4 +9,4 @@ int f(int *a, int b)
 }
 
 /* We should be able to produce store pair for the store of 28/29 store. */
-/* { dg-final { scan-assembler "stp\tw\[0-9\]+, w\[0-9\]+" } } */
+/* { dg-final { scan-assembler "stp\tw(\[0-9\]+)\|(zr), w\[0-9\]+" } } */
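
A note on how the updated pattern reads, assuming the usual Tcl backslash
substitution turns it into the regex stp<TAB>w([0-9]+)|(zr), w[0-9]+.
Alternation binds more loosely than concatenation, so this appears to mean
"stp w<digits>" or "zr, w<digits>" rather than "stp followed by w<digits> or
wzr".  The sketch below checks that reading with POSIX extended regexes from
<regex.h>, not with the Tcl engine DejaGnu actually uses; ere_matches,
committed_pattern and grouped_pattern are names made up for the
illustration, and grouped_pattern is only a hypothetical tighter variant.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if the POSIX extended regex PATTERN matches anywhere
   in TEXT.  */
bool
ere_matches (const char *pattern, const char *text)
{
  regex_t re;
  int rc = regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB);
  assert (rc == 0);  /* The patterns used here are expected to compile.  */
  bool matched = regexec (&re, text, 0, NULL, 0) == 0;
  regfree (&re);
  return matched;
}

/* The pattern from the patch as the regex engine would see it
   (assumption: after Tcl has processed the backslashes).  */
static const char committed_pattern[] = "stp\tw([0-9]+)|(zr), w[0-9]+";

/* Hypothetical variant that confines the alternation to the first
   register operand.  */
static const char grouped_pattern[] = "stp\tw([0-9]+|zr), w[0-9]+";
```

Both patterns accept a line such as "stp wzr, w1, [x0]", so the test passes
either way; the grouped form additionally insists that the "zr" really
follows "stp<TAB>w".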

Re: [Patch, Fortran] PR64771 - Fix coarray ICE

2015-01-27 Thread Tobias Burnus
Rainer Orth wrote:
> > Why don't you use MAX macro instead of std::max as everywhere else
> > in the gcc sources?
>
> No idea, ask Tobias :-)

No real reason - presumably, because I had MAX not in mind and thought
of the general move towards standard features.

> Anyway, the original patch would most likely
> have worked: system.h already includes <algorithm>.

It did on my system :-)

> This one compiles just as well, of course.

From my side, that patch (using MAX) is fine. Thanks for
bearing the bootstrap failure and for the patch.

Tobias


> 2015-01-27  Rainer Orth  
> 
>   * interface.c: Remove <algorithm>.
>   (check_dummy_characteristics): Use MAX instead of std::max.
> 
> # HG changeset patch
> # Parent a742f8ce2a00e481ddf92dbecaf8d1ee01448911
> Avoid std::max
> 
> diff --git a/gcc/fortran/interface.c b/gcc/fortran/interface.c
> --- a/gcc/fortran/interface.c
> +++ b/gcc/fortran/interface.c
> @@ -63,8 +63,6 @@ along with GCC; see the file COPYING3.  
> formal argument list points to symbols within the same namespace as
> the program unit name.  */
>  
> -#include <algorithm>  /* For std::max.  */
> -
>  #include "config.h"
>  #include "system.h"
>  #include "coretypes.h"
> @@ -1215,7 +1213,7 @@ check_dummy_characteristics (gfc_symbol 
>   }
>  
>if (s1->as->type == AS_EXPLICIT)
> - for (i = 0; i < s1->as->rank + std::max(0, s1->as->corank-1); i++)
> + for (i = 0; i < s1->as->rank + MAX (0, s1->as->corank-1); i++)
> {
>   shape1 = gfc_subtract (gfc_copy_expr (s1->as->upper[i]),
> gfc_copy_expr (s1->as->lower[i]));


Re: [patch, libobjc] export __objc_get_forward_imp, get_imp again

2015-01-27 Thread Matthias Klose
On 01/22/2015 05:09 PM, Matthias Klose wrote:
> On 01/22/2015 12:56 AM, Andrew Pinski wrote:
>> On Wed, Jan 21, 2015 at 8:51 AM, Jakub Jelinek  wrote:
>>> On Wed, Jan 21, 2015 at 08:41:46AM -0800, pins...@gmail.com wrote:
> On Jan 21, 2015, at 1:02 AM, Matthias Klose  wrote:
>
> __objc_get_forward_imp and get_imp were exported in libobjc since GCC 
> 4.1, for
> some reason these are not exported anymore in GCC 5 (both declared 
> inline).  So
> either export these as before, or don't export them and bump the soname.  
> The
> latter seems to be unwanted, and at least gnustep-base is using the 
> get_imp
> function.  So better keep the references in GCC 5?
>
> Is this an intended change in GCC 5 to not to export inline methods 
> anymore?

 Just remove the inline instead.
>>>
>>> The comments like:
>>>
>>> /* The new name of get_imp().  */
>>> IMP
>>> class_getMethodImplementation (Class class_, SEL selector)
>>> {
>>>   if (class_ == Nil  ||  selector == NULL)
>>> return NULL;
>>>
>>>   /* get_imp is inlined, so we're good.  */
>>>   return get_imp (class_, selector);
>>> }
>>>
>>> don't make me very confident in such a change.
>>> The extern prototypes really work with both -std=gnu89 and -std=gnu11 and
>>> thus will at least keep status quo.
>>
>> Let's do that then.
> 
> get_imp was renamed to class_getMethodImplementation, which is exported from
> objc/runtime.h.  GNUstep-base uses get_imp to define its own
> class_getMethodImplementation, so get_imp isn't really needed. So either make
> the two functions inline, and don't export them, or declare the prototypes.  
> For
> the latter I would suggest objc-private/runtime.h 
> (class_getMethodImplementation
> is declared in objc/runtime.h).

Now committed the following patch after approval on IRC.

  Matthias


libobjc/

2015-01-27  Matthias Klose  

	* sendmsg.c: Add prototypes for __objc_get_forward_imp and get_imp.

Index: libobjc/sendmsg.c
===
--- libobjc/sendmsg.c	(revision 220167)
+++ libobjc/sendmsg.c	(working copy)
@@ -104,6 +104,10 @@
 struct objc_method * search_for_method_in_list (struct objc_method_list * list, SEL op);
 id nil_method (id, SEL);
 
+/* Make sure this inline function is exported regardless of GNU89 or C99
+   inlining semantics as it is part of the libobjc ABI.  */
+extern IMP __objc_get_forward_imp (id, SEL);
+
 /* Given a selector, return the proper forwarding implementation.  */
 inline
 IMP
@@ -320,6 +324,10 @@
   return res;
 }
 
+/* Make sure this inline function is exported regardless of GNU89 or C99
+   inlining semantics as it is part of the libobjc ABI.  */
+extern IMP get_imp (Class, SEL);
+
 inline
 IMP
 get_imp (Class class, SEL sel)


[PATCH][AArch64] Testcase fix for __ATOMIC_CONSUME

2015-01-27 Thread Alex Velenko

Hi,

This patch fixes the aarch64/atomic-op-consume.c test to expect the safe "LDAXR"
instruction to be generated when __ATOMIC_CONSUME semantics are requested.

This patch was tested by running the modified test on an aarch64-none-elf
compiler.

Is this patch ok?

Alex

2015-01-27  Alex Velenko  

gcc/testsuite/

  * gcc.target/aarch64/atomic-op-consume.c (scan-assember-times): Adjust
  scan-assembler-times pattern.

diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-consume.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-consume.c
index 38d6c2c..7ece5b1 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-consume.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-consume.c
@@ -3,5 +3,8 @@
 
 #include "atomic-op-consume.x"
 
-/* { dg-final { scan-assembler-times "ldxr\tw\[0-9\]+, \\\[x\[0-9\]+\\\]" 6 } } */
+/* To workaround Bugzilla 59448 issue, a request for __ATOMIC_CONSUME is always
+   promoted to __ATOMIC_ACQUIRE, implemented as MEMMODEL_ACQUIRE.  This causes
+   "LDAXR" to be generated instead of "LDXR".  */
+/* { dg-final { scan-assembler-times "ldaxr\tw\[0-9\]+, \\\[x\[0-9\]+\\\]" 6 } } */
 /* { dg-final { scan-assembler-times "stxr\tw\[0-9\]+, w\[0-9\]+, \\\[x\[0-9\]+\\\]" 6 } } */
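
For readers who have not seen the test source: atomic-op-consume.x is not
shown in this thread, so the following is only a guess at the kind of
operation it exercises.  It uses the GCC __atomic builtins with
__ATOMIC_CONSUME; toy_counter and the toy_* functions are hypothetical.
According to the comment added by the patch, GCC currently treats such a
request as __ATOMIC_ACQUIRE, which is why the load-exclusive in the
generated loop is expected to be the acquire form.

```c
#include <assert.h>

/* Hypothetical shared counter used only for this illustration.  */
static int toy_counter;

/* Atomically add DELTA with consume ordering and return the old value.  */
int
toy_fetch_add_consume (int delta)
{
  return __atomic_fetch_add (&toy_counter, delta, __ATOMIC_CONSUME);
}

/* Read the counter without any ordering constraint.  */
int
toy_load_relaxed (void)
{
  return __atomic_load_n (&toy_counter, __ATOMIC_RELAXED);
}

/* Reset the counter, run two updates and check the values observed
   along the way.  */
void
toy_consume_demo (void)
{
  __atomic_store_n (&toy_counter, 0, __ATOMIC_RELAXED);
  assert (toy_fetch_add_consume (5) == 0);
  assert (toy_fetch_add_consume (2) == 5);
  assert (toy_load_relaxed () == 7);
}
```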

Re: [PATCH][AArch32] Testcase fix for __ATOMIC_CONSUME

2015-01-27 Thread Ramana Radhakrishnan
On Tue, Jan 27, 2015 at 4:06 PM, Alex Velenko  wrote:
>
> Hi,
>
> This patch fixes arm/atomic-op-consume.c test to expect safe "LDAEX"
> instruction to be generated when __ATOMIC_CONSUME semantics is requested.
>
> This patch was tested by running the modified test on arm-none-eabi and
> arm-none-linux-gnueabi compilers.
>
> Is this patch ok?

Ok. Please remember James's comments in the future about cover notes.

Ramana

>
> Alex
>
> 2015-01-27  Alex Velenko  
>
> gcc/testsuite/
>
>   * gcc.target/arm/atomic-op-consume.c (scan-assember-times): Adjust
>   scan-assembler-times pattern.
>
> diff --git a/gcc/testsuite/gcc.target/arm/atomic-op-consume.c b/gcc/testsuite/gcc.target/arm/atomic-op-consume.c
> index 0354717..cc6c028 100644
> --- a/gcc/testsuite/gcc.target/arm/atomic-op-consume.c
> +++ b/gcc/testsuite/gcc.target/arm/atomic-op-consume.c
> @@ -5,6 +5,9 @@
>
>  #include "../aarch64/atomic-op-consume.x"
>
> -/* { dg-final { scan-assembler-times "ldrex\tr\[0-9\]+, \\\[r\[0-9\]+\\\]" 6 } } */
> +/* To workaround Bugzilla 59448 issue, a request for __ATOMIC_CONSUME is always
> +   promoted to __ATOMIC_ACQUIRE, implemented as MEMMODEL_ACQUIRE.  This causes
> +   "LDAEX" to be generated instead of "LDREX".  */
> +/* { dg-final { scan-assembler-times "ldaex\tr\[0-9\]+, \\\[r\[0-9\]+\\\]" 6 } } */
>  /* { dg-final { scan-assembler-times "strex\t...?, r\[0-9\]+, \\\[r\[0-9\]+\\\]" 6 } } */
>  /* { dg-final { scan-assembler-not "dmb" } } */


Re: [Patch, Fortran] PR64771 - Fix coarray ICE

2015-01-27 Thread Rainer Orth
Jakub Jelinek  writes:

>> The problem is (as so often) that <algorithm> was included *before*
>> config.h.  Moving it after the other includes allows interface.c to
>> compile without warnings.
>
> Why don't you use MAX macro instead of std::max as everywhere else
> in the gcc sources?

No idea, ask Tobias :-)  Anyway, the original patch would most likely
have worked: system.h already includes <algorithm>.

This one compiles just as well, of course.

Rainer


2015-01-27  Rainer Orth  

* interface.c: Remove <algorithm>.
(check_dummy_characteristics): Use MAX instead of std::max.

# HG changeset patch
# Parent a742f8ce2a00e481ddf92dbecaf8d1ee01448911
Avoid std::max

diff --git a/gcc/fortran/interface.c b/gcc/fortran/interface.c
--- a/gcc/fortran/interface.c
+++ b/gcc/fortran/interface.c
@@ -63,8 +63,6 @@ along with GCC; see the file COPYING3.  
formal argument list points to symbols within the same namespace as
the program unit name.  */
 
-#include <algorithm>  /* For std::max.  */
-
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -1215,7 +1213,7 @@ check_dummy_characteristics (gfc_symbol 
 	}
 
   if (s1->as->type == AS_EXPLICIT)
-	for (i = 0; i < s1->as->rank + std::max(0, s1->as->corank-1); i++)
+	for (i = 0; i < s1->as->rank + MAX (0, s1->as->corank-1); i++)
 	  {
 	shape1 = gfc_subtract (gfc_copy_expr (s1->as->upper[i]),
   gfc_copy_expr (s1->as->lower[i]));
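
For reference, the loop bound in the hunk above can be tried out on its own.
The sketch below defines a MAX macro of the classic shape locally, so it does
not depend on GCC's system.h; dims_to_compare is a hypothetical helper, not a
function in interface.c.

```c
#include <assert.h>

/* Same shape as the classic MAX macro; defined locally so that the sketch
   does not depend on GCC's own headers.  Note that each argument may be
   evaluated twice.  */
#ifndef MAX
#define MAX(X, Y) ((X) > (Y) ? (X) : (Y))
#endif

/* Hypothetical helper: number of dimensions the loop walks for an array
   spec with the given RANK and CORANK.  */
int
dims_to_compare (int rank, int corank)
{
  assert (rank >= 0 && corank >= 0);
  return rank + MAX (0, corank - 1);
}
```

With a corank of 0 the MAX keeps the bound from dropping below the rank,
which is the case the original std::max call was guarding against as well.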

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[PATCH][AArch32] Testcase fix for __ATOMIC_CONSUME

2015-01-27 Thread Alex Velenko

Hi,

This patch fixes the arm/atomic-op-consume.c test to expect the safe "LDAEX"
instruction to be generated when __ATOMIC_CONSUME semantics are requested.

This patch was tested by running the modified test on the arm-none-eabi and
arm-none-linux-gnueabi compilers.

Is this patch ok?

Alex

2015-01-27  Alex Velenko  

gcc/testsuite/

  * gcc.target/arm/atomic-op-consume.c (scan-assember-times): Adjust
  scan-assembler-times pattern.

diff --git a/gcc/testsuite/gcc.target/arm/atomic-op-consume.c b/gcc/testsuite/gcc.target/arm/atomic-op-consume.c
index 0354717..cc6c028 100644
--- a/gcc/testsuite/gcc.target/arm/atomic-op-consume.c
+++ b/gcc/testsuite/gcc.target/arm/atomic-op-consume.c
@@ -5,6 +5,9 @@
 
 #include "../aarch64/atomic-op-consume.x"
 
-/* { dg-final { scan-assembler-times "ldrex\tr\[0-9\]+, \\\[r\[0-9\]+\\\]" 6 } } */
+/* To workaround Bugzilla 59448 issue, a request for __ATOMIC_CONSUME is always
+   promoted to __ATOMIC_ACQUIRE, implemented as MEMMODEL_ACQUIRE.  This causes
+   "LDAEX" to be generated instead of "LDREX".  */
+/* { dg-final { scan-assembler-times "ldaex\tr\[0-9\]+, \\\[r\[0-9\]+\\\]" 6 } } */
 /* { dg-final { scan-assembler-times "strex\t...?, r\[0-9\]+, \\\[r\[0-9\]+\\\]" 6 } } */
 /* { dg-final { scan-assembler-not "dmb" } } */

[Committed] S/390: Increase memory access costs

2015-01-27 Thread Andreas Krebbel
Hi,

I've committed the attached patch which fixes a 4.8 vs 4.9/5.0
performance regression introduced with the aggressive use of FPRs as
spill slots.

Committed to mainline and 4.9 branch.

Bye,

-Andreas-

2015-01-27  Andreas Krebbel  

* config/s390/s390.c (s390_memory_move_cost): Increase costs for
memory accesses.

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 1409fa8..9c67157 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -2434,7 +2434,7 @@ s390_memory_move_cost (machine_mode mode ATTRIBUTE_UNUSED,
   reg_class_t rclass ATTRIBUTE_UNUSED,
   bool in ATTRIBUTE_UNUSED)
 {
-  return 1;
+  return 2;
 }
 
 /* Compute a (partial) cost for rtx X.  Return true if the complete



Re: [PATCH] Fix PR64798

2015-01-27 Thread Jonathan Wakely

On 27/01/15 14:43 +0100, Richard Biener wrote:


The new exceptional EH allocator failed to align exception objects
properly (it ended up aligning to __alignof__((std::size_t))).  The
following fixes that by aligning to what __attribute__((aligned))
would align to (this is what _Unwind_Exception is aligned to, a
member of __cxa_refcounted_exception).

Bootstrapped and tested on x86_64-unknown-linux-gnu - Rainer is
testing this on sparc-solaris where it broke
g++.old-deja/g++.eh/badalloc1.C.

Ok for trunk?


Yes, thanks.
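
The patch itself is not included in this message, so the following is only a
sketch of the idea, with hypothetical names: a member declared with a bare
__attribute__((aligned)) gets the largest alignment GCC considers useful for
the target, and allocation sizes can be rounded up to that instead of to the
alignment of std::size_t.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: the bare `aligned' attribute asks GCC for the
   maximum useful alignment on the target.  */
struct toy_max_aligned
{
  char data __attribute__ ((aligned));
};

/* The alignment that exception objects would need in this model.  */
size_t
toy_max_alignment (void)
{
  return __alignof__ (struct toy_max_aligned);
}

/* Round SIZE up to the next multiple of that alignment.  */
size_t
toy_round_up (size_t size)
{
  size_t align = toy_max_alignment ();
  assert ((align & (align - 1)) == 0);  /* Alignments are powers of two.  */
  return (size + align - 1) & ~(align - 1);
}
```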



[Committed] S/390: Increase register move costs for FPR->GPR moves

2015-01-27 Thread Andreas Krebbel
Hi,

I've committed the attached patch which fixes a 4.8 vs 4.9/5.0
performance regression introduced with the aggressive use of FPRs as
spill slots.

Committed to mainline and 4.9 branch.

Bye,

-Andreas-

2015-01-27  Andreas Krebbel  

* config/s390/s390.c (s390_register_move_cost): Increase costs for
FPR->GPR moves.

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 36b547d..fcde638 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -2393,16 +2393,29 @@ s390_float_const_zero_p (rtx value)
 /* Implement TARGET_REGISTER_MOVE_COST.  */
 
 static int
-s390_register_move_cost (machine_mode mode ATTRIBUTE_UNUSED,
+s390_register_move_cost (machine_mode mode,
  reg_class_t from, reg_class_t to)
 {
-  /* On s390, copy between fprs and gprs is expensive as long as no
- ldgr/lgdr can be used.  */
-  if ((!TARGET_Z10 || GET_MODE_SIZE (mode) != 8)
-  && ((reg_classes_intersect_p (from, GENERAL_REGS)
-  && reg_classes_intersect_p (to, FP_REGS))
- || (reg_classes_intersect_p (from, FP_REGS)
- && reg_classes_intersect_p (to, GENERAL_REGS))))
+  /* On s390, copy between fprs and gprs is expensive.  */
+
+  /* It becomes somewhat faster having ldgr/lgdr.  */
+  if (TARGET_Z10 && GET_MODE_SIZE (mode) == 8)
+{
+  /* ldgr is single cycle. */
+  if (reg_classes_intersect_p (from, GENERAL_REGS)
+ && reg_classes_intersect_p (to, FP_REGS))
+   return 1;
+  /* lgdr needs 3 cycles. */
+  if (reg_classes_intersect_p (to, GENERAL_REGS)
+ && reg_classes_intersect_p (from, FP_REGS))
+   return 3;
+}
+
+  /* Otherwise copying is done via memory.  */
+  if ((reg_classes_intersect_p (from, GENERAL_REGS)
+   && reg_classes_intersect_p (to, FP_REGS))
+  || (reg_classes_intersect_p (from, FP_REGS)
+ && reg_classes_intersect_p (to, GENERAL_REGS)))
 return 10;
 
   return 1;
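
The resulting cost table can be summarised with a small stand-alone model.
It is a simplification under stated assumptions: the toy_* names are made
up, there are only two register classes, and class equality stands in for
reg_classes_intersect_p.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical two-class model of the register file.  */
enum toy_reg_class { TOY_GENERAL_REGS, TOY_FP_REGS };

/* Model of the new cost function: HAS_LDGR_LGDR plays the role of
   TARGET_Z10 and MODE_SIZE the role of GET_MODE_SIZE (mode).  */
int
toy_register_move_cost (bool has_ldgr_lgdr, int mode_size,
                        enum toy_reg_class from, enum toy_reg_class to)
{
  assert (mode_size > 0);
  bool gpr_to_fpr = from == TOY_GENERAL_REGS && to == TOY_FP_REGS;
  bool fpr_to_gpr = from == TOY_FP_REGS && to == TOY_GENERAL_REGS;

  /* With ldgr/lgdr available, 8-byte copies avoid the trip via memory.  */
  if (has_ldgr_lgdr && mode_size == 8)
    {
      if (gpr_to_fpr)
        return 1;   /* ldgr is single cycle.  */
      if (fpr_to_gpr)
        return 3;   /* lgdr needs 3 cycles.  */
    }

  /* Otherwise a copy between the two classes goes through memory.  */
  if (gpr_to_fpr || fpr_to_gpr)
    return 10;

  return 1;
}
```

The visible change compared with the old code is the FPR->GPR case on z10
with 8-byte modes, which now costs 3 instead of 1.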



Re: [Patch, ARM/Thumb1]Add a Thumb1 insn pattern to legalize the instruction that moves pc to low register

2015-01-27 Thread Ramana Radhakrishnan
On Fri, Jan 9, 2015 at 7:43 AM, Terry Guo  wrote:
>
>
>> -Original Message-
>> From: Richard Earnshaw
>> Sent: Monday, December 08, 2014 7:31 PM
>> To: Terry Guo; gcc-patches@gcc.gnu.org
>> Cc: Ramana Radhakrishnan
>> Subject: Re: [Patch, ARM/Thumb1]Add a Thumb1 insn pattern to legalize the
>> instruction that moves pc to low register
>>
>> On 08/12/14 08:24, Terry Guo wrote:
>> > Hi there,
>> >
>> > When compiling the simple code below:
>> >
>> > terguo01@terry-pc01:mtpcs-frame$ cat test.c
>> > int main(void) {
>> > return 0;
>> > }
>> >
>> > I got an ICE with the option -mtpcs-leaf-frame (no error if this option
>> > is removed).
>> >
>> > terguo01@terry-pc01:mtpcs-frame$
>> > /work/terguo01/tools/gcc-arm-none-eabi-5_0-2014q4/bin/arm-none-eabi-gcc -mtpcs-leaf-frame test.c -c -mcpu=cortex-m0plus -mthumb -da
>> > test.c: In function 'main':
>> > test.c:4:1: error: unrecognizable insn:
>> >  }
>> >  ^
>> > (insn 20 19 21 (set (reg:SI 2 r2)
>> > (reg:SI 15 pc)) test.c:2 -1
>> >  (nil))
>> > test.c:4:1: internal compiler error: in extract_insn, at recog.c:2327
>> > Please submit a full bug report, with preprocessed source if
>> > appropriate.
>> > See <http://gcc.gnu.org/bugs.html> for instructions.
>> >
>> > This RTL is generated in function thumb1_expand_prologue. The expected
>> > insn pattern is thumb1_movsi_insn in thumb1.md, and an instruction like
>> > "mov r2, pc" is a legal instruction. Because gcc returns NO_REG for the
>> > PC register, there is no valid pattern to match an instruction that
>> > moves pc to a low register. This patch intends to add a new insn
>> > pattern to legalize this.
>> >
>> > Tested with GCC regression test. No regression. Is it OK to trunk?
>> >
>> > BR,
>> > Terry
>> >
>> > 2014-12-08  Terry Guo  terry@arm.com
>> >
>> >  * config/arm/predicates.md (pc_register): New to match PC register.
>> >  * config/arm/thumb1.md (*thumb1_movpc_insn): New insn pattern.
>> >
>> > gcc/testsuite/ChangeLog:
>> > 2014-12-08  Terry Guo  terry@arm.com
>> >
>> >  * gcc.target/arm/thumb1-mov-pc.c: New test.
>> >
>> >
>> > thumb1-move-pc-v1.txt
>> >
>> >
>> > diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
>> > index 032808c..c5ef5ed 100644
>> > --- a/gcc/config/arm/predicates.md
>> > +++ b/gcc/config/arm/predicates.md
>> > @@ -361,6 +361,10 @@
>> >(and (match_code "smin,smax,umin,umax")
>> > (match_test "mode == GET_MODE (op)")))
>> >
>> > +(define_special_predicate "pc_register"
>> > +  (and (match_code "reg")
>> > +   (match_test "REGNO (op) == PC_REGNUM")))
>> > +
>> >  (define_special_predicate "cc_register"
>> >(and (match_code "reg")
>> > (and (match_test "REGNO (op) == CC_REGNUM")
>> > diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
>> > index ddedc39..8e6057c 100644
>> > --- a/gcc/config/arm/thumb1.md
>> > +++ b/gcc/config/arm/thumb1.md
>> > @@ -1780,6 +1780,16 @@
>> >"
>> >  )
>> >
>> > +(define_insn "*thumb1_movpc_insn"
>> > +  [(set (match_operand:SI 0 "low_register_operand")
>>
>> This needs constraints.
>>
>
> The constraint is used now. Is this one OK?


This is OK now.

Ramana
>
> BR,
> Terry
>
> 2015-01-09  Terry Guo  terry@arm.com
>
>  * config/arm/thumb1.md (*thumb1_movpc_insn): New insn pattern.


[PATCH] Add comdat_group effective target (PR bootstrap/64612)

2015-01-27 Thread Jakub Jelinek
Hi!

This patch introduces a new effective target check and uses it to guard the
scan-assembler test in pr64612.C: if comdat groups aren't used, there is no
guarantee that the D2 dtor will always be emitted alongside the D1 dtor.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2015-01-27  Jakub Jelinek  

PR bootstrap/64612
* lib/target-supports.exp (check_effective_target_comdat_group): New.
* g++.dg/ipa/pr64612.C: Guard scan-assembler test with
{ target comdat_group }.

* doc/sourcebuild.texi (comdat_group): Document.

--- gcc/testsuite/lib/target-supports.exp.jj  2015-01-15 23:39:06.0 +0100
+++ gcc/testsuite/lib/target-supports.exp  2015-01-26 15:24:55.325236098 +0100
@@ -6198,3 +6198,13 @@ proc check_effective_target_pie_copyrelo
 
 return $pie_copyreloc_available_saved
 }
+
+# Return 1 if the target uses comdat groups.
+
+proc check_effective_target_comdat_group {} {
+return [check_no_messages_and_pattern comdat_group "\.section\[^\n\r]*,comdat" assembly {
+   // C++
+   inline int foo () { return 1; }
+   int (*fn) () = foo;
+}]
+}
--- gcc/testsuite/g++.dg/ipa/pr64612.C.jj  2015-01-26 15:25:43.301410027 +0100
+++ gcc/testsuite/g++.dg/ipa/pr64612.C  2015-01-26 15:23:11.380025863 +0100
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O3 -std=c++11" } */
-/* { dg-final { scan-assembler "_ZN5QListI7QStringED1Ev" } } */
+/* { dg-final { scan-assembler "_ZN5QListI7QStringED1Ev" { target comdat_group } } } */
 
 class A
 {
--- gcc/doc/sourcebuild.texi.jj 2015-01-15 23:39:02.0 +0100
+++ gcc/doc/sourcebuild.texi  2015-01-27 16:07:37.504081520 +0100
@@ -1930,6 +1930,9 @@ Target supports @code{wchar_t} that is c
 
 @item wchar_t_char32_t_compatible
 Target supports @code{wchar_t} that is compatible with @code{char32_t}.
+
+@item comdat_group
+Target uses comdat groups.
 @end table
 
 @subsubsection Local to tests in @code{gcc.target/i386}

Jakub


Re: [PATCH][AArch64] Improve bit-test-branch pattern to avoid unnecessary register clobber

2015-01-27 Thread Marcus Shawcroft
On 27 January 2015 at 14:31, Jiong Wang  wrote:

> 2015-01-19  Ramana Radhakrishnan  
> Jiong Wang  
>
>   gcc/
> * config/aarch64/aarch64.md (tb1): Clobber CC reg instead
> of scratch reg.
> (cb1): Likewise.
> * config/aarch64/iterators.md (bcond): New define_code_attr.

OK /Marcus


>   gcc/testsuite/
> * gcc.dg/long_branch.c: New testcase.


Re: [RFC PATCH] Avoid most of the BUILT_IN_*_CHKP enum values

2015-01-27 Thread Ilya Enkovich
2015-01-27 17:27 GMT+03:00 Jakub Jelinek :
> Hi!
>
> I've grepped for BUILT_IN_.*_CHKP in the sources and we actually need
> far fewer enum values than the 1204 that are being defined.
>
> This patch requires builtins.def to say explicitly (by using
> DEF_*BUILTIN_CHKP macro instead of corresponding DEF_*BUILTIN) which
> ones need that, for all the others only space in the enum is reserved and
> nothing else.
>
> I'd hope this could work around the buggy AIX stabs handling, but even
> on, say, x86_64-linux it has the benefit of decreasing cc1plus .debug_info
> by about 2.7MB (of course, with dwz that benefit goes to almost nothing,
> just ~7000 bytes or so, plus the .debug_str cost, which is merged between
> TUs even without dwz).  The cost without dwz obviously comes mainly
> from repeating that in most of the translation units.  But why declare
> BUILT_IN_*_CHKP enums that are never used by anything...

Enum values not mentioned in the code are not fully useless.  When we
have builtin functions defined as 'always_inline' functions, they are
instrumented and enum names may be used in dumps and debugging.
That's not a big value though.  Thanks a lot for taking care of it!

Ilya
>
> 2015-01-27  Jakub Jelinek  
>
> * builtins.def (DEF_BUILTIN_CHKP): Define if not defined.
> (DEF_LIB_BUILTIN_CHKP, DEF_EXT_LIB_BUILTIN_CHKP): Redefine.
> (DEF_CHKP_BUILTIN): Define using DEF_BUILTIN_CHKP instead
> of DEF_BUILTIN.
> (BUILT_IN_MEMCPY, BUILT_IN_MEMMOVE, BUILT_IN_MEMSET, BUILT_IN_STRCAT,
> BUILT_IN_STRCHR, BUILT_IN_STRCPY, BUILT_IN_STRLEN): Use
> DEF_LIB_BUILTIN_CHKP macro instead of DEF_LIB_BUILTIN.
> (BUILT_IN_MEMCPY_CHK, BUILT_IN_MEMMOVE_CHK, BUILT_IN_MEMPCPY_CHK,
> BUILT_IN_MEMPCPY, BUILT_IN_MEMSET_CHK, BUILT_IN_STPCPY_CHK,
> BUILT_IN_STPCPY, BUILT_IN_STRCAT_CHK, BUILT_IN_STRCPY_CHK): Use
> DEF_EXT_LIB_BUILTIN_CHKP macro instead of DEF_EXT_LIB_BUILTIN.
> * tree-core.h (enum built_in_function): In between
> BEGIN_CHKP_BUILTINS and END_CHKP_BUILTINS only define enum values
> for builtins that use DEF_BUILTIN_CHKP macro.
>
> --- gcc/builtins.def.jj 2015-01-15 23:39:10.0 +0100
> +++ gcc/builtins.def  2015-01-27 15:04:44.860924664 +0100
> @@ -63,6 +63,16 @@ along with GCC; see the file COPYING3.
>
> The builtins is registered only if COND is true.  */
>
> +/* A macro for builtins where the
> +   BUILT_IN_*_CHKP = BUILT_IN_* + BEGIN_CHKP_BUILTINS + 1
> +   enums should be defined too.  */
> +#ifndef DEF_BUILTIN_CHKP
> +#define DEF_BUILTIN_CHKP(ENUM, NAME, CLASS, TYPE, LIBTYPE, BOTH_P, \
> +FALLBACK_P, NONANSI_P, ATTRS, IMPLICIT, COND)  \
> +  DEF_BUILTIN(ENUM, NAME, CLASS, TYPE, LIBTYPE, BOTH_P, FALLBACK_P,\
> + NONANSI_P, ATTRS, IMPLICIT, COND)
> +#endif
> +
>  /* A GCC builtin (like __builtin_saveregs) is provided by the
> compiler, but does not correspond to a function in the standard
> library.  */
> @@ -87,6 +97,10 @@ along with GCC; see the file COPYING3.
>  #define DEF_LIB_BUILTIN(ENUM, NAME, TYPE, ATTRS)   \
>DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,   \
>true, true, false, ATTRS, true, true)
> +#undef DEF_LIB_BUILTIN_CHKP
> +#define DEF_LIB_BUILTIN_CHKP(ENUM, NAME, TYPE, ATTRS)  \
> +  DEF_BUILTIN_CHKP (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE,\
> +   TYPE, true, true, false, ATTRS, true, true)
>
>  /* Like DEF_LIB_BUILTIN, except that the function is not one that is
> specified by ANSI/ISO C.  So, when we're being fully conformant we
> @@ -96,6 +110,10 @@ along with GCC; see the file COPYING3.
>  #define DEF_EXT_LIB_BUILTIN(ENUM, NAME, TYPE, ATTRS)   \
>DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,   \
>true, true, true, ATTRS, false, true)
> +#undef DEF_EXT_LIB_BUILTIN_CHKP
> +#define DEF_EXT_LIB_BUILTIN_CHKP(ENUM, NAME, TYPE, ATTRS)  \
> +  DEF_BUILTIN_CHKP (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE,\
> +   TYPE, true, true, true, ATTRS, false, true)
>
>  /* Like DEF_LIB_BUILTIN, except that the function is only a part of
> the standard in C94 or above.  */
> @@ -199,8 +217,8 @@ along with GCC; see the file COPYING3.
>  /* Builtin used by the implementation of Pointer Bounds Checker.  */
>  #undef DEF_CHKP_BUILTIN
>  #define DEF_CHKP_BUILTIN(ENUM, NAME, TYPE, ATTRS) \
> -  DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,\
> -  true, true, false, ATTRS, true, true)
> +  DEF_BUILTIN_CHKP (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE,\
> +   TYPE, true, true, false, ATTRS, true, true)
>
>  /* Define an attribute list for math functions that are normally
> "impure" because some of them may write into global memory for
> @@ -595,22 +613,22 @@ DEF_EXT_LIB_BUILTIN(BUILT_IN_BZERO,
>  DEF_EXT_LIB_BUILTIN(BUILT_IN_INDEX, "index", 

Re: [Patch, Fortran] PR64771 - Fix coarray ICE

2015-01-27 Thread Jakub Jelinek
On Tue, Jan 27, 2015 at 03:55:17PM +0100, Rainer Orth wrote:
> Steve Kargl  writes:
> 
> > On Sat, Jan 24, 2015 at 06:13:04PM +0100, Tobias Burnus wrote:
> >>if (s1->as->type == AS_EXPLICIT)
> >> -  for (i = 0; i < s1->as->rank + s1->as->corank; i++)
> >> +  for (i = 0; i < s1->as->rank + std::max(0, s1->as->corank-1); i++)
> >
> > Doesn't this require '#include <algorithm>'?
> > I suspect that you are depending on namespace pollution
> > via some other header (coretypes.h?).
> 
> It was committed with that change, which unfortunately broke Solaris
> bootstrap:
> 
> In file included from ./config.h:6:0,
>  from /vol/gcc/src/hg/trunk/local/gcc/fortran/interface.c:68:
> ./auto-host.h:2055:0: error: "_FILE_OFFSET_BITS" redefined [-Werror]
>  #define _FILE_OFFSET_BITS 64
>  ^
> In file included from /usr/include/iso/stdlib_iso.h:24:0,
>  from /usr/include/stdlib.h:11,
>  from 
> /var/gcc/regression/trunk/11-gcc/build/prev-i386-pc-solaris2.11/libstdc++-v3/include/cstdlib:72,
>  from 
> /var/gcc/regression/trunk/11-gcc/build/prev-i386-pc-solaris2.11/libstdc++-v3/include/bits/stl_algo.h:59,
>  from 
> /var/gcc/regression/trunk/11-gcc/build/prev-i386-pc-solaris2.11/libstdc++-v3/include/algorithm:62,
>  from /vol/gcc/src/hg/trunk/local/gcc/fortran/interface.c:66:
> /var/gcc/regression/trunk/11-gcc/build/prev-gcc/include-fixed/sys/feature_tests.h:213:0:
>  note: this is the location of the previous definition
>  #define _FILE_OFFSET_BITS 32
>  ^
> 
> The problem is (as so often) that <algorithm> was included *before*
> config.h.  Moving it after the other includes allows interface.c to
> compile without warnings.

Why don't you use MAX macro instead of std::max as everywhere else
in the gcc sources?

Your change is wrong, you can't include system headers after including
system.h and other headers.

Jakub
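
For illustration, the loop bound from the quoted hunk can be written with a
MAX-style macro, so that no C++ standard header is needed at all. This is
only a minimal sketch: DEMO_MAX assumes the conventional ternary definition
of such a macro, and demo_array_spec and demo_dims_to_compare are
hypothetical stand-ins, not the real gfortran types or functions.

```cpp
// Hypothetical stand-in for a conventional MAX macro; defined locally so
// that no system header has to be pulled in after the project's own headers.
#define DEMO_MAX(X, Y) ((X) > (Y) ? (X) : (Y))

// Hypothetical, reduced stand-in for an array spec with rank and corank.
struct demo_array_spec
{
  int rank;
  int corank;
};

// Number of dimensions to walk: all of rank, plus every codimension except
// the last one (and never a negative number of codimensions).
int
demo_dims_to_compare (const demo_array_spec &as)
{
  return as.rank + DEMO_MAX (0, as.corank - 1);
}
```

With rank 2, a corank of 0 or 1 both give a bound of 2, and a corank of 3
gives 4. Note that, like any such macro, DEMO_MAX evaluates one of its
arguments twice, so the arguments should be free of side effects.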


Re: [Patch, Fortran] PR64771 - Fix coarray ICE

2015-01-27 Thread Rainer Orth
Steve Kargl  writes:

> On Sat, Jan 24, 2015 at 06:13:04PM +0100, Tobias Burnus wrote:
>>if (s1->as->type == AS_EXPLICIT)
>> -for (i = 0; i < s1->as->rank + s1->as->corank; i++)
>> +for (i = 0; i < s1->as->rank + std::max(0, s1->as->corank-1); i++)
>
> Doesn't this require '#include <algorithm>'?
> I suspect that you are depending on namespace pollution
> via some other header (coretypes.h?).

It was committed with that change, which unfortunately broke Solaris
bootstrap:

In file included from ./config.h:6:0,
 from /vol/gcc/src/hg/trunk/local/gcc/fortran/interface.c:68:
./auto-host.h:2055:0: error: "_FILE_OFFSET_BITS" redefined [-Werror]
 #define _FILE_OFFSET_BITS 64
 ^
In file included from /usr/include/iso/stdlib_iso.h:24:0,
 from /usr/include/stdlib.h:11,
 from 
/var/gcc/regression/trunk/11-gcc/build/prev-i386-pc-solaris2.11/libstdc++-v3/include/cstdlib:72,
 from 
/var/gcc/regression/trunk/11-gcc/build/prev-i386-pc-solaris2.11/libstdc++-v3/include/bits/stl_algo.h:59,
 from 
/var/gcc/regression/trunk/11-gcc/build/prev-i386-pc-solaris2.11/libstdc++-v3/include/algorithm:62,
 from /vol/gcc/src/hg/trunk/local/gcc/fortran/interface.c:66:
/var/gcc/regression/trunk/11-gcc/build/prev-gcc/include-fixed/sys/feature_tests.h:213:0:
 note: this is the location of the previous definition
 #define _FILE_OFFSET_BITS 32
 ^

The problem is (as so often) that <algorithm> was included *before*
config.h.  Moving it after the other includes allows interface.c to
compile without warnings.

Ok for mainline?

Rainer


2015-01-27  Rainer Orth  

gcc/fortran:
* interface.c: Include <algorithm> after config.h

diff --git a/gcc/fortran/interface.c b/gcc/fortran/interface.c
--- a/gcc/fortran/interface.c
+++ b/gcc/fortran/interface.c
@@ -63,8 +63,6 @@ along with GCC; see the file COPYING3.  
formal argument list points to symbols within the same namespace as
the program unit name.  */
 
-#include <algorithm>  /* For std::max.  */
-
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -73,6 +71,8 @@ along with GCC; see the file COPYING3.  
 #include "match.h"
 #include "arith.h"
 
+#include <algorithm>  /* For std::max.  */
+
 /* The current_interface structure holds information about the
interface currently being parsed.  This structure is saved and
restored during recursive interfaces.  */

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[PATCH PR64809]

2015-01-27 Thread Yuri Rumyantsev
Hi All,

Here is a simple patch that cures ICE - skip debug gimples.
Test is also included.

Bootstrap and regression testing did not show any new failures.

Is it OK for trunk?

ChangeLog:

2015-01-27  Yuri Rumyantsev  

PR tree-optimization/64809
* cfgexpand.c (reorder_operands): Skip debug gimples.

gcc/testsuite/ChangeLog

* gcc.dg/pr64809.c: New test.


patch
Description: Binary data


Re: [PATCH] Fix ada bootstrap under cygwin-64

2015-01-27 Thread Arnaud Charlet
> this patch fixes the ada bootstrap under cygwin-64.
> 
> Boot-strapped under x86_64-pc-cygwin.
> OK for trunk?

OK


Re: RFA: patch to fix a bad code generation for PR64110 -- new constraints addition

2015-01-27 Thread Jeff Law

On 01/27/15 07:08, Richard Sandiford wrote:


> Yeah, but in practice that's only ever going to be a partial transition.
> Many port maintainers won't look at this, so we'll have to support both
> versions indefinitely, even if the new behaviour turns out to be the
> best for all cases.
Yes, most likely.   I find myself pondering the related question of how 
we get ports to transition to LRA and if we could tie these together. 
Maintainers are going to need to transition to LRA if we're ever going 
to start removing blobs of reload.  As a part of that transition they're 
presumably going to be looking closely at their backend and could make 
the constraint transition.


In an ideal world, we'd declare release X.Y as a cut-off point.  Ports
that haven't transitioned to LRA get deprecated at that point.  Those 
ports are the ones most likely not to make the constraint transition as 
well.  I think we would have to consider any uses of ?! that remain 
after that point as intentional.




> I just think we're going to regret having two sets of constraints with
> such subtly different meanings.
But isn't that inevitable?  While I suspect that most instances of ?! 
should be converted, there may be some that should not.  If that's the 
case then we're going to have both forever.





> Looking back at the original PR, Jakub said:
>
>   The ! has been added by me for PR63594, so it isn't there from the era
>   when i?86 backend was using reload.  If there is a better way to
>   express that RA should prefer to use memory or xmm register and only
>   use r constraint if it already is in a r register and doesn't need to
>   be reloaded, I can use that.  Whether it is ?, ??? or something else.
>   ! description in gcc docs just fitted most what I wanted...
>
> In some ways this seems to match the intention of "*".  Originally I think
> it was just an RA-only thing and was ignored by reload, but LRA does take it
> into account too (which sounds like progress to me).
>
> If I revert the patch locally and change the *vec_dup pattern to
> use "*", it passes both the test for PR64110 and the tests for PR63594.
> Would that be OK as an alternative?

I think that's up to Uros and Jakub to sort out.

Jeff


[PATCH] Fix ada bootstrap under cygwin-64

2015-01-27 Thread Bernd Edlinger
Hi,


this patch fixes the ada bootstrap under cygwin-64.

Boot-strapped under x86_64-pc-cygwin.
OK for trunk?


Thanks
Bernd.
  2015-01-27  Bernd Edlinger  

Fix build under cygwin/64.
* adaint.h: Add check for __CYGWIN__.
* mingw32.h: Prevent windows.h from including x86intrin.h in GCC.



patch-ada-cygwin64.diff
Description: Binary data


Re: [PATCH][AArch64] Improve bit-test-branch pattern to avoid unnecessary register clobber

2015-01-27 Thread Jakub Jelinek
On Tue, Jan 27, 2015 at 02:31:14PM +, Jiong Wang wrote:
> testcase changed to execution version, and moved to gcc.dg. the compile time
> only takes several seconds. (previously I was using a cc1 built at -O0, which
> takes at most 24s)
> 
> ok to install?

Ok for the testcase.
The config/aarch64/ bits I'll defer to aarch64 maintainers.

> 2015-01-19  Ramana Radhakrishnan  
> Jiong Wang  
> 
>   gcc/
> * config/aarch64/aarch64.md (tb1): Clobber CC reg instead of 
> scratch reg.
> (cb1): Likewise.
> * config/aarch64/iterators.md (bcond): New define_code_attr.
> 
>   gcc/testsuite/
> * gcc.dg/long_branch.c: New testcase.

Jakub


Re: [PATCH][AArch64] Improve bit-test-branch pattern to avoid unnecessary register clobber

2015-01-27 Thread Jiong Wang

On 19/01/15 10:58, Jakub Jelinek wrote:
> On Mon, Jan 19, 2015 at 10:52:14AM +, Ramana Radhakrishnan wrote:
>>> What is aarch64 specific on the testcase?
>>
>> The number of if-then-else's required to get the compiler to generate
>> branch sequences rather than the tbnz instruction.
>
> That doesn't mean the same testcase couldn't be tested on other targets and
> perhaps find bugs in there.
> That said, if the testcase is too expensive to compile (several seconds is
> ok, minutes is not), then perhaps it shouldn't be included at all, or should
> be guarded with run_expensive_tests target.
>
> Jakub



testcase changed to execution version, and moved to gcc.dg. the compile time
only takes several seconds. (previously I was using a cc1 built at -O0, which
takes at most 24s)

ok to install?

Thanks.
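
As a rough illustration of why the long-branch form can clobber the flags
instead of a scratch register: extracting a bit into a temporary and testing
that temporary gives the same answer as AND-ing with a one-bit mask and
checking for zero. The sketch below is plain C++, not GCC internals; every
demo_* name is hypothetical, and it assumes 8-bit bytes and bit positions
below 64.

```cpp
#include <cstdint>

// Mask selecting one bit, built by shifting 1 left by the bit position.
// Assumes bit < 64.
std::uint64_t
demo_bit_mask (unsigned bit)
{
  return std::uint64_t (1) << bit;
}

// Mask selecting the sign bit of a value that is size_in_bytes wide.
// Assumes 8-bit bytes and 1 <= size_in_bytes <= 8.
std::uint64_t
demo_sign_bit_mask (unsigned size_in_bytes)
{
  return std::uint64_t (1) << (size_in_bytes * 8 - 1);
}

// Extract-then-test: the extracted bit has to live in a temporary.
bool
demo_test_via_extract (std::uint64_t x, unsigned bit)
{
  std::uint64_t tmp = (x >> bit) & 1;
  return tmp != 0;
}

// Mask-then-test: only the zero/non-zero outcome of the AND is needed.
bool
demo_test_via_mask (std::uint64_t x, unsigned bit)
{
  return (x & demo_bit_mask (bit)) != 0;
}
```

Both test functions agree for every input within the stated assumptions; the
second one needs no named intermediate value, which mirrors the idea of
relying on the condition flags alone.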

2015-01-19  Ramana Radhakrishnan  
Jiong Wang  

  gcc/
* config/aarch64/aarch64.md (tb1): Clobber CC reg instead of 
scratch reg.
(cb1): Likewise.
* config/aarch64/iterators.md (bcond): New define_code_attr.

  gcc/testsuite/
* gcc.dg/long_branch.c: New testcase.
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 597ff8c..1e00396 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -466,13 +466,17 @@
 		   (const_int 0))
 	 (label_ref (match_operand 2 "" ""))
 	 (pc)))
-   (clobber (match_scratch:DI 3 "=r"))]
+   (clobber (reg:CC CC_REGNUM))]
   ""
-  "*
-  if (get_attr_length (insn) == 8)
-return \"ubfx\\t%3, %0, %1, #1\;\\t%3, %l2\";
-  return \"\\t%0, %1, %l2\";
-  "
+  {
+if (get_attr_length (insn) == 8)
+  {
+	operands[1] = GEN_INT (HOST_WIDE_INT_1U << UINTVAL (operands[1]));
+	return "tst\t%0, %1\;\t%l2";
+  }
+else
+  return "\t%0, %1, %l2";
+  }
   [(set_attr "type" "branch")
(set (attr "length")
 	(if_then_else (and (ge (minus (match_dup 2) (pc)) (const_int -32768))
@@ -486,13 +490,21 @@
  (const_int 0))
 			   (label_ref (match_operand 1 "" ""))
 			   (pc)))
-   (clobber (match_scratch:DI 2 "=r"))]
+   (clobber (reg:CC CC_REGNUM))]
   ""
-  "*
-  if (get_attr_length (insn) == 8)
-return \"ubfx\\t%2, %0, , #1\;\\t%2, %l1\";
-  return \"\\t%0, , %l1\";
-  "
+  {
+if (get_attr_length (insn) == 8)
+  {
+	char buf[64];
+	uint64_t val = ((uint64_t ) 1)
+			<< (GET_MODE_SIZE (mode) * BITS_PER_UNIT - 1);
+	sprintf (buf, "tst\t%%0, %"PRId64, val);
+	output_asm_insn (buf, operands);
+	return "\t%l1";
+  }
+else
+  return "\t%0, , %l1";
+  }
   [(set_attr "type" "branch")
(set (attr "length")
 	(if_then_else (and (ge (minus (match_dup 1) (pc)) (const_int -32768))
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 7dd3917..bd144f9 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -823,6 +823,9 @@
 		  (smax "s") (umax "u")
 		  (smin "s") (umin "u")])
 
+;; Emit conditional branch instructions.
+(define_code_attr bcond [(eq "beq") (ne "bne") (lt "bne") (ge "beq")])
+
 ;; Emit cbz/cbnz depending on comparison type.
 (define_code_attr cbz [(eq "cbz") (ne "cbnz") (lt "cbnz") (ge "cbz")])
 
diff --git a/gcc/testsuite/gcc.dg/long_branch.c b/gcc/testsuite/gcc.dg/long_branch.c
new file mode 100644
index 000..f388a80
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/long_branch.c
@@ -0,0 +1,198 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-reorder-blocks" } */
+
+void abort ();
+
+__attribute__((noinline, noclone)) int
+restore (int a, int b)
+{
+  return a * b;
+}
+
+__attribute__((noinline, noclone)) void
+do_nothing (int *input)
+{
+  *input = restore (*input, 1);
+  return;
+}
+
+#define CASE_ENTRY(n) \
+  case n: \
+sum = sum / (n + 1); \
+sum = restore (sum, n + 1); \
+if (sum == (n + addend)) \
+  break;\
+sum = sum / (n + 2); \
+sum = restore (sum, n + 2); \
+sum = sum / (n + 3); \
+sum = restore (sum, n + 3); \
+sum = sum / (n + 4); \
+sum = restore (sum, n + 4); \
+sum = sum / (n + 5); \
+sum = restore (sum, n + 5); \
+sum = sum / (n + 6); \
+sum = restore (sum, n + 6); \
+sum = sum / (n + 7); \
+sum = restore (sum, n + 7); \
+sum = sum / (n + 8); \
+sum = restore (sum, n + 8); \
+sum = sum / (n + 9); \
+sum = restore (sum, n + 9); \
+sum = sum / (n + 10); \
+sum = restore (sum, n + 10); \
+sum = sum / (n + 11); \
+sum = restore (sum, n + 11); \
+sum = sum / (n + 12); \
+sum = restore (sum, n + 12); \
+sum = sum / (n + 13); \
+sum = restore (sum, n + 13); \
+sum = sum / (n + 14); \
+sum = restore (sum, n + 14); \
+sum = sum / (n + 15); \
+sum = restore (sum, n + 15); \
+sum = sum / (n + 16); \
+sum = restore (sum, n + 16); \
+sum = sum / (n + 17); \
+sum = restore (sum, n + 17); \
+sum = sum / (n + 18); \
+sum = restore (sum, n + 18); \
+sum = sum / (n + 19); \
+sum = restore (sum, n + 19); \
+sum = sum / (n + 20); \
+sum = restore 

[RFC PATCH] Avoid most of the BUILT_IN_*_CHKP enum values

2015-01-27 Thread Jakub Jelinek
Hi!

I've grepped for BUILT_IN_.*_CHKP in the sources and we actually need
far fewer enum values than the 1204 that are being defined.

This patch requires builtins.def to say explicitly (by using
DEF_*BUILTIN_CHKP macro instead of corresponding DEF_*BUILTIN) which
ones need that, for all the others only space in the enum is reserved and
nothing else.

I'd hope this could work around the buggy AIX stabs handling, but even
on, say, x86_64-linux it has the benefit of decreasing cc1plus .debug_info
by about 2.7MB (of course, with dwz that benefit goes to almost nothing,
just ~7000 bytes or so, plus the .debug_str cost, which is merged between
TUs even without dwz).  The cost without dwz obviously comes mainly
from repeating that in most of the translation units.  But why declare
BUILT_IN_*_CHKP enums that are never used by anything...
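
To make the idea concrete, here is a toy version of the scheme: a list is
expanded twice, every entry gets an ordinary enumerator, and only the marked
entries also get a _CHKP twin at ENUM + BEGIN + 1, while the range for the
others is merely reserved. This is a sketch under my own assumptions, not the
actual builtins.def or tree-core.h; all DEMO_* names are hypothetical.

```cpp
// Hypothetical miniature of a .def-style list: two ordinary builtins and
// two that also want a checked (_CHKP) twin.
#define DEMO_BUILTIN_LIST(DEF, DEF_CHKP) \
  DEF (DEMO_ABS)                         \
  DEF_CHKP (DEMO_MEMCPY)                 \
  DEF (DEMO_SIN)                         \
  DEF_CHKP (DEMO_STRLEN)

#define DEMO_PLAIN(ENUM) ENUM,
#define DEMO_SKIP(ENUM)
#define DEMO_TWIN(ENUM) ENUM##_CHKP = ENUM + DEMO_BEGIN_CHKP + 1,

enum demo_builtin
{
  // First pass: every entry gets an ordinary enumerator.
  DEMO_BUILTIN_LIST (DEMO_PLAIN, DEMO_PLAIN)
  DEMO_BEGIN_CHKP,
  // Second pass: only marked entries get a named twin; the slots of the
  // other entries stay reserved but unnamed.
  DEMO_BUILTIN_LIST (DEMO_SKIP, DEMO_TWIN)
  // Close the reserved range so later enumerators cannot collide with it.
  DEMO_END_CHKP = DEMO_BEGIN_CHKP * 2 + 1
};

static_assert (DEMO_MEMCPY_CHKP == DEMO_MEMCPY + DEMO_BEGIN_CHKP + 1,
               "twin value follows the offset formula");
```

In this toy list only two of the four possible twins are named, yet the
numeric value of each named twin is the same as if all four had been
declared, so code that computes a twin from its base enumerator keeps
working.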

2015-01-27  Jakub Jelinek  

* builtins.def (DEF_BUILTIN_CHKP): Define if not defined.
(DEF_LIB_BUILTIN_CHKP, DEF_EXT_LIB_BUILTIN_CHKP): Redefine.
(DEF_CHKP_BUILTIN): Define using DEF_BUILTIN_CHKP instead
of DEF_BUILTIN.
(BUILT_IN_MEMCPY, BUILT_IN_MEMMOVE, BUILT_IN_MEMSET, BUILT_IN_STRCAT,
BUILT_IN_STRCHR, BUILT_IN_STRCPY, BUILT_IN_STRLEN): Use
DEF_LIB_BUILTIN_CHKP macro instead of DEF_LIB_BUILTIN.
(BUILT_IN_MEMCPY_CHK, BUILT_IN_MEMMOVE_CHK, BUILT_IN_MEMPCPY_CHK,
BUILT_IN_MEMPCPY, BUILT_IN_MEMSET_CHK, BUILT_IN_STPCPY_CHK,
BUILT_IN_STPCPY, BUILT_IN_STRCAT_CHK, BUILT_IN_STRCPY_CHK): Use
DEF_EXT_LIB_BUILTIN_CHKP macro instead of DEF_EXT_LIB_BUILTIN.
* tree-core.h (enum built_in_function): In between
BEGIN_CHKP_BUILTINS and END_CHKP_BUILTINS only define enum values
for builtins that use DEF_BUILTIN_CHKP macro.

--- gcc/builtins.def.jj 2015-01-15 23:39:10.0 +0100
+++ gcc/builtins.def  2015-01-27 15:04:44.860924664 +0100
@@ -63,6 +63,16 @@ along with GCC; see the file COPYING3.
 
The builtins is registered only if COND is true.  */
 
+/* A macro for builtins where the
+   BUILT_IN_*_CHKP = BUILT_IN_* + BEGIN_CHKP_BUILTINS + 1
+   enums should be defined too.  */
+#ifndef DEF_BUILTIN_CHKP
+#define DEF_BUILTIN_CHKP(ENUM, NAME, CLASS, TYPE, LIBTYPE, BOTH_P, \
+FALLBACK_P, NONANSI_P, ATTRS, IMPLICIT, COND)  \
+  DEF_BUILTIN(ENUM, NAME, CLASS, TYPE, LIBTYPE, BOTH_P, FALLBACK_P,\
+ NONANSI_P, ATTRS, IMPLICIT, COND)
+#endif
+
 /* A GCC builtin (like __builtin_saveregs) is provided by the
compiler, but does not correspond to a function in the standard
library.  */
@@ -87,6 +97,10 @@ along with GCC; see the file COPYING3.
 #define DEF_LIB_BUILTIN(ENUM, NAME, TYPE, ATTRS)   \
   DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,   \
   true, true, false, ATTRS, true, true)
+#undef DEF_LIB_BUILTIN_CHKP
+#define DEF_LIB_BUILTIN_CHKP(ENUM, NAME, TYPE, ATTRS)  \
+  DEF_BUILTIN_CHKP (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE,\
+   TYPE, true, true, false, ATTRS, true, true)
 
 /* Like DEF_LIB_BUILTIN, except that the function is not one that is
specified by ANSI/ISO C.  So, when we're being fully conformant we
@@ -96,6 +110,10 @@ along with GCC; see the file COPYING3.
 #define DEF_EXT_LIB_BUILTIN(ENUM, NAME, TYPE, ATTRS)   \
   DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,   \
   true, true, true, ATTRS, false, true)
+#undef DEF_EXT_LIB_BUILTIN_CHKP
+#define DEF_EXT_LIB_BUILTIN_CHKP(ENUM, NAME, TYPE, ATTRS)  \
+  DEF_BUILTIN_CHKP (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE,\
+   TYPE, true, true, true, ATTRS, false, true)
 
 /* Like DEF_LIB_BUILTIN, except that the function is only a part of
the standard in C94 or above.  */
@@ -199,8 +217,8 @@ along with GCC; see the file COPYING3.
 /* Builtin used by the implementation of Pointer Bounds Checker.  */
 #undef DEF_CHKP_BUILTIN
 #define DEF_CHKP_BUILTIN(ENUM, NAME, TYPE, ATTRS) \
-  DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,\
-  true, true, false, ATTRS, true, true)
+  DEF_BUILTIN_CHKP (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE,\
+   TYPE, true, true, false, ATTRS, true, true)
 
 /* Define an attribute list for math functions that are normally
"impure" because some of them may write into global memory for
@@ -595,22 +613,22 @@ DEF_EXT_LIB_BUILTIN(BUILT_IN_BZERO,
 DEF_EXT_LIB_BUILTIN(BUILT_IN_INDEX, "index", 
BT_FN_STRING_CONST_STRING_INT, ATTR_PURE_NOTHROW_NONNULL_LEAF)
 DEF_LIB_BUILTIN(BUILT_IN_MEMCHR, "memchr", 
BT_FN_PTR_CONST_PTR_INT_SIZE, ATTR_PURE_NOTHROW_NONNULL_LEAF)
 DEF_LIB_BUILTIN(BUILT_IN_MEMCMP, "memcmp", 
BT_FN_INT_CONST_PTR_CONST_PTR_SIZE, ATTR_PURE_NOTHROW_NONNULL_LEAF)
-DEF_LIB_BUILTIN(BUILT_IN_MEMCPY, "memcpy", 
BT_FN_PTR_PTR_CONST_PTR_SIZE, ATTR_RET1_NOTHROW_NONNULL_LEAF)
-DEF_LIB_BUILTIN(BUILT_IN_MEMMOVE, "memmove", 
BT_FN_PTR_PTR_C

RE: [PATCH RFA MIPS] Prohibit vector modes in accumulators

2015-01-27 Thread Moore, Catherine


> -----Original Message-----
> From: Matthew Fortune [mailto:matthew.fort...@imgtec.com]
> Sent: Tuesday, January 27, 2015 7:19 AM
> To: Richard Sandiford
> Cc: Robert Suchanek; gcc-patches@gcc.gnu.org; Moore, Catherine
> Subject: RE: [PATCH RFA MIPS] Prohibit vector modes in accumulators
> 
> Richard Sandiford  writes:
> > Matthew Fortune  writes:
> > >> 2015-01-23  Robert Suchanek  
> > >>
> > >>  * config/mips/mips.c (mips_hard_regno_mode_ok_p): Prohibit
> > >> accumulators
> > >>  for all vector modes.
> > >
> > > This seems like a genuine bug and although it can only be triggered
> > > by loongson or paired-single support it probably qualifies for fixing.
> >
> > Agreed FWIW.  We shouldn't mark something as valid for a mode if even
> > the mode's move pattern can't handle it.
> >
> > I think this kind of thing should go in regardless of development stage.
> 
> Given that it was one of the pre-existing tests that failed I'm happy that we
> are covering this issue. All of these LRA related issues are likely to phase
> in and out with subtle changes to code-gen so I don't think we can always get
> a test case that fails on trunk.
> 
That's true.

> Since Catherine asked for further info then I will leave her to say if she is
> happy to accept on this basis.
> 

I withdraw my request for a testcase.

Catherine


Re: RFA: patch to fix a bad code generation for PR64110 -- new constraints addition

2015-01-27 Thread Richard Sandiford
Jeff Law  writes:
> On 01/24/15 04:29, Richard Sandiford wrote:
>>
>> Yeah.  I expect in practice most people who used "?" and "!" attached
>> them to a particular operand for a reason.  From a quick scan through
>> 386.exp it looked like almost all uses would either want this behaviour
>> or wouldn't care.  An interesting exception is:
>>
>> (define_insn "extendsidi2_1"
>>[(set (match_operand:DI 0 "nonimmediate_operand" "=*A,r,?r,?*o")
>>  (sign_extend:DI (match_operand:SI 1 "register_operand" "0,0,r,r")))
>> (clobber (reg:CC FLAGS_REG))
>> (clobber (match_scratch:SI 2 "=X,X,X,&r"))]
>>"!TARGET_64BIT"
>>"#")
>>
>> I don't know how effective the third alternative is with LRA.  Surely
>> a "r<-0" alternative is by definition a case where "r<-r" is possible
>> only with a "?"-cost reload?  Seems to me we could just delete it.
>> But assuming it does some good, I suppose the "?" really does apply to
>> the alternative as a whole.  If we had to reload operand 1 or operand 0,
>> there's an extra cost if it can't use the same register as the other
>> operand.
>>
>> Wouldn't it be better to make "?" and "!" behave the new way and only
>> add new constraints if it turns out that the old behaviour really is
>> useful in some cases?
>>
>> Maybe stage 4 isn't the time to be making that kind of change.
>> Still, it'd be great if someone who's set up to do x86_64 benchmarking
>> could measure the effect of making "?" and "!" behave like the
>> new constraints.
> My worry isn't the x86_64 port, but all the others that folks don't test 
> as regularly.
>
> I'd rather go the other direction, have folks familiar with the port go 
> through it changing the constraints where it makes sense.  That just 
> seems a hell of a lot safer.
>
> A port maintainer could certainly hack something together for testing 
> purposes to guide them as to whether or not there's something to be 
> gained by converting many/most of the ?! to the new constraints.

Yeah, but in practice that's only ever going to be a partial transition.
Many port maintainers won't look at this, so we'll have to support both
versions indefinitely, even if the new behaviour turns out to be the
best for all cases.

I just think we're going to regret having two sets of constraints with
such subtly different meanings.

Looking back at the original PR, Jakub said:

  The ! has been added by me for PR63594, so it isn't there from the era
  when i?86 backend was using reload.  If there is a better way to
  express that RA should prefer to use memory or xmm register and only
  use r constraint if it already is in a r register and doesn't need to
  be reloaded, I can use that.  Whether it is ?, ??? or something else.
  ! description in gcc docs just fitted most what I wanted...

In some ways this seems to match the intention of "*".  Originally I think
it was just an RA-only thing and was ignored by reload, but LRA does take it
into account too (which sounds like progress to me).

If I revert the patch locally and change the *vec_dup pattern to
use "*", it passes both the test for PR64110 and the tests for PR63594.
Would that be OK as an alternative?

Thanks,
Richard



Re: Merge current set of OpenACC changes from gomp-4_0-branch

2015-01-27 Thread Julian Brown
On Mon, 26 Jan 2015 17:34:26 +0300
Ilya Verbin  wrote:

> Here is my current patch, it works for OpenMP->MIC, but obviously
> will not work for PTX, since it requires symmetrical changes in the
> plugin.  Could you please take a look, whether it is possible to
> support this new interface in PTX plugin?

I think it can probably be made to work. I'll have a look in more
detail.

Thanks,

Julian


[PATCH] Fix PR64798

2015-01-27 Thread Richard Biener

The new exceptional EH allocator failed to align exception objects
properly (it ended up aligning to __alignof__((std::size_t))).  The
following fixes that by aligning to what __attribute__((aligned))
would align to (this is what _Unwind_Exception is aligned to, a
member of __cxa_refcounted_exception).
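
As a rough illustration of the idea (a sketch in plain C, not the libsupc++ code; the helper names round_up, entry_size, entry_new and entry_from_data are made up for this example), the header size is taken from offsetof of the aligned data member rather than from sizeof (size_t), and the same offset is used to get back from a payload pointer to its header:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the allocator's bookkeeping header.  */
struct allocated_entry {
  size_t size;
  /* GNU extension: align the payload to the target's default maximum.  */
  char data[] __attribute__((aligned));
};

/* Round SIZE up to a multiple of ALIGN, which must be a power of two.  */
static size_t round_up(size_t size, size_t align)
{
  assert(align != 0 && (align & (align - 1)) == 0);
  return (size + align - 1) & ~(align - 1);
}

/* Total bytes needed for PAYLOAD bytes of data plus the padded header.
   The struct's alignment equals that of its most-aligned member, which
   here is the data member.  */
static size_t entry_size(size_t payload)
{
  size_t align = __alignof__(struct allocated_entry);
  return round_up(payload + offsetof(struct allocated_entry, data), align);
}

/* Allocate an entry and record its total size in the header.  */
static struct allocated_entry *entry_new(size_t payload)
{
  size_t total = entry_size(payload);
  struct allocated_entry *e = malloc(total);
  if (e)
    e->size = total;
  return e;
}

/* Recover the header from a payload pointer, as a free routine must.  */
static struct allocated_entry *entry_from_data(void *data)
{
  return (struct allocated_entry *)
    ((char *) data - offsetof(struct allocated_entry, data));
}
```

With sizeof (size_t) as the offset instead, the payload would start right after the size field and would only be as aligned as size_t itself.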

Bootstrapped and tested on x86_64-unknown-linux-gnu - Rainer is
testing this on sparc-solaris where it broke 
g++.old-deja/g++.eh/badalloc1.C.

Ok for trunk?

Thanks,
Richard.

2015-01-27  Richard Biener  

PR libstdc++/64798
* libsupc++/eh_alloc.cc (struct allocated_entry): Align
data member.
(pool::allocate): Adjust allocation size and alignment to
that change.
(pool::free): Adjust pointer offsetting.

Index: libstdc++-v3/libsupc++/eh_alloc.cc
===
--- libstdc++-v3/libsupc++/eh_alloc.cc  (revision 220164)
+++ libstdc++-v3/libsupc++/eh_alloc.cc  (working copy)
@@ -94,7 +94,7 @@ namespace
   };
   struct allocated_entry {
std::size_t size;
-   char data[];
+   char data[] __attribute__((aligned));
   };
 
   // A single mutex controlling emergency allocations.
@@ -133,17 +133,18 @@ namespace
   void *pool::allocate (std::size_t size)
 {
   __gnu_cxx::__scoped_lock sentry(emergency_mutex);
-  // We need an additional size_t member.
-  size += sizeof (std::size_t);
+  // We need an additional size_t member plus the padding to
+  // ensure proper alignment of data.
+  size += offsetof (allocated_entry, data);
   // And we need to at least hand out objects of the size of
   // a freelist entry.
   if (size < sizeof (free_entry))
size = sizeof (free_entry);
-  // And we need to align objects we hand out to the required
-  // alignment of a freelist entry (this really aligns the
+  // And we need to align objects we hand out to the maximum
+  // alignment required on the target (this really aligns the
   // tail which will become a new freelist entry).
-  size = ((size + __alignof__(free_entry) - 1)
- & ~(__alignof__(free_entry) - 1));
+  size = ((size + __alignof__ (allocated_entry::data) - 1)
+ & ~(__alignof__ (allocated_entry::data) - 1));
   // Search for an entry of proper size on the freelist.
   free_entry **e;
   for (e = &first_free_entry;
@@ -185,7 +186,7 @@ namespace
 {
   __gnu_cxx::__scoped_lock sentry(emergency_mutex);
   allocated_entry *e = reinterpret_cast <allocated_entry *>
-   (reinterpret_cast <char *> (data) - sizeof (std::size_t));
+   (reinterpret_cast <char *> (data) - offsetof (allocated_entry, data));
   std::size_t sz = e->size;
   if (!first_free_entry)
{


Re: [patch] libstdc++/64368 add configure check for timed mutex operations

2015-01-27 Thread Jonathan Wakely

This isn't related to the last patch for this bug, except that the PR
is currently being used for all darwin FAILs.

We need to check a configure macro before using
pthread_rwlock_timedrdlock because Darwin doesn't define the
_POSIX_TIMEOUTS option.
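
The same kind of guard can be sketched in plain C (an illustration only, not the libstdc++ change, which tests _GTHREAD_USE_MUTEX_TIMEDLOCK; the function name rdlock_with_deadline and the try-lock fallback are assumptions of this example):

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/* Take a read lock, waiting until ABS_DEADLINE (CLOCK_REALTIME) when the
   POSIX Timeouts option is available.  Where it is not, fall back to a
   single non-blocking attempt.  Returns 0 on success or an errno value.  */
static int rdlock_with_deadline(pthread_rwlock_t *lock,
                                const struct timespec *abs_deadline)
{
  assert(lock != NULL && abs_deadline != NULL);
#if defined(_POSIX_TIMEOUTS) && _POSIX_TIMEOUTS > 0
  return pthread_rwlock_timedrdlock(lock, abs_deadline);
#else
  return pthread_rwlock_tryrdlock(lock);
#endif
}
```

The libstdc++ patch takes the simpler route of not declaring the timed member functions at all when the option is missing, rather than providing a fallback.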

Tested x86_64-linux, committed to trunk.
commit d48fe00ea96b3515a6a1f7a6926dfe2ff7db643c
Author: Jonathan Wakely 
Date:   Tue Jan 27 10:38:09 2015 +

	PR libstdc++/64368
	* include/std/shared_mutex (shared_timed_mutex::try_lock_for,
	shared_timed_mutex::try_lock_until): Only define when POSIX thread
	timeouts option is supported.
	(shared_timed_mutex::try_lock_shared_for,
	shared_timed_mutex::try_lock_shared_until): Likewise.

diff --git a/libstdc++-v3/include/std/shared_mutex b/libstdc++-v3/include/std/shared_mutex
index 643768c..47cfc64 100644
--- a/libstdc++-v3/include/std/shared_mutex
+++ b/libstdc++-v3/include/std/shared_mutex
@@ -108,6 +108,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return true;
 }
 
+#if _GTHREAD_USE_MUTEX_TIMEDLOCK
 template<typename _Rep, typename _Period>
   bool
   try_lock_for(const chrono::duration<_Rep, _Period>& __rel_time)
@@ -149,6 +150,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	const auto __s_atime = __s_entry + __delta;
 	return try_lock_until(__s_atime);
   }
+#endif
 
 void
 unlock()
@@ -186,6 +188,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return true;
 }
 
+#if _GTHREAD_USE_MUTEX_TIMEDLOCK
 template<typename _Rep, typename _Period>
   bool
   try_lock_shared_for(const chrono::duration<_Rep, _Period>& __rel_time)
@@ -230,6 +233,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	const auto __s_atime = __s_entry + __delta;
 	return try_lock_shared_until(__s_atime);
   }
+#endif
 
 void
 unlock_shared()


[PATCH, CHKP] Fix PR middle-end/64805

2015-01-27 Thread Ilya Enkovich
Hi,

Some time ago, removal of the non-instrumented versions of functions with 
'always_inline' was delayed to enable their inlining.  With this change we may 
now inline into a non-instrumented version of a function that also has an 
instrumented version (this happens when both of them have 'always_inline').  
This causes an ICE in the cgraph_node verifier, because we clear all references 
before inlining and the verifier expects an IPA_REF_CHKP reference for every 
function that has an instrumented version.  This patch fixes it by rebuilding 
the IPA_REF_CHKP reference.

Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
2015-01-27  Ilya Enkovich  

PR middle-end/64805
* ipa-inline.c (early_inliner): Rebuild IPA_REF_CHKP reference
to avoid error in cgraph node verification.

2015-01-27  Ilya Enkovich  

PR middle-end/64805
* gcc.target/i386/pr64805.c: New.


diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
index c0ff329..d341619 100644
--- a/gcc/ipa-inline.c
+++ b/gcc/ipa-inline.c
@@ -2464,6 +2464,13 @@ early_inliner (function *fun)
 #endif
   node->remove_all_references ();
 
+  /* Rebuild this reference because it doesn't depend on
+ function's body and it's required to pass cgraph_node
+ verification.  */
+  if (node->instrumented_version
+  && !node->instrumentation_clone)
+node->create_reference (node->instrumented_version, IPA_REF_CHKP, NULL);
+
   /* Even when not optimizing or not inlining inline always-inline
  functions.  */
   inlined = inline_always_inline_functions (node);
diff --git a/gcc/testsuite/gcc.target/i386/pr64805.c 
b/gcc/testsuite/gcc.target/i386/pr64805.c
new file mode 100644
index 000..8ba0a97
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr64805.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target mpx } */
+/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
+
+#include 
+
+static inline void __attribute ((always_inline)) functionA(void)
+{
+  return;
+}
+
+static inline void __attribute ((always_inline)) functionB(void)
+{
+  functionA();
+}
+
+int test(void)
+{
+  functionB();
+
+  return 0;
+}


Re: [RFC] PR64703, glibc sysdeps/powerpc/powerpc64/dl-machine.h miscompile

2015-01-27 Thread Alan Modra
On Mon, Jan 26, 2015 at 10:11:14AM +0100, Richard Biener wrote:
> On Sat, Jan 24, 2015 at 12:23 AM, Alan Modra  wrote:
> > How does this look as a potential fix for PR64703?  I haven't made
> > many forays into gimple code, so even though this patch passes
> > bootstrap and regression testing on powerpc64-linux it's quite
> > possible this is the wrong place to change.  If it does look to be OK,
> > then I'll fill out the targetm changes, include a testcase etc.
> 
> It looks mostly ok, comments below.

Thanks for looking!

> > PR target/64703
> > * tree-ssa-alias.c (pt_solution_includes_base): New function,
> > extracted from..
> > (ref_maybe_used_by_call_p_1): ..here.  Delete dead code checking
> > for NULL return from ao_ref_base.  Handle potential memory
> > reference by indirect calls on targets using function descriptors.
> >
> > Index: gcc/tree-ssa-alias.c
> > ===
> > --- gcc/tree-ssa-alias.c(revision 220025)
> > +++ gcc/tree-ssa-alias.c(working copy)
> > @@ -1532,6 +1532,23 @@ refs_output_dependent_p (tree store1, tree store2)
> >return refs_may_alias_p_1 (&r1, &r2, false);
> >  }
> >
> > +static bool
> > +pt_solution_includes_base (struct pt_solution *pt, tree base)
> 
> Needs a comment.

Sure, that was part of "etc." above. ;)

> > +{
> > +  if (DECL_P (base))
> > +return pt_solution_includes (pt, base);
> > +
> > +  if ((TREE_CODE (base) == MEM_REF
> > +   || TREE_CODE (base) == TARGET_MEM_REF)
> > +  && TREE_CODE (TREE_OPERAND (base, 0)) == SSA_NAME)
> > +{
> > +  struct ptr_info_def *pi = SSA_NAME_PTR_INFO (TREE_OPERAND (base, 0));
> > +  if (pi)
> > +   return pt_solutions_intersect (pt, &pi->pt);
> > +}
> > +  return true;
> > +}
> > +
> >  /* If the call CALL may use the memory reference REF return true,
> > otherwise return false.  */
> >
> > @@ -1542,15 +1559,24 @@ ref_maybe_used_by_call_p_1 (gcall *call, ao_ref *r
> >unsigned i;
> >int flags = gimple_call_flags (call);
> >
> > +  base = ao_ref_base (ref);
> 
> You dropped the
> 
>   if (!base)
> return true;
> 
> check - please put it back.

Hmm, calls to ao_ref_base in tree-ssa-alias.c are a mixed bag.  Some
check the return for NULL, others don't.  All the checks for NULL are
dead code since ao_ref_base never returns NULL.  OK, I'll put it back
and leave cleanup for another day.

> 
> > +  callee = gimple_call_fn (call);
> > +  if (callee && TREE_CODE (callee) == SSA_NAME
> 
> Do we never propagate the address of a function descriptor here?
> That is, can we generate a testcase like
> 
>descr fn;
>
>foo (&fn);
> 
> void foo (fn)
> {
>(*fn) (a, b);
> }
> 
> and then inline foo so the call becomes
> 
>   (*&fn) (a, b);
> 
> ?  We'd then still implicitly read from 'fn'.  I also wonder whether
> (and where!) we resolve such a descriptor reference to the actual
> function on GIMPLE?

Well, if we did end up with a direct call then the value in 'fn' won't
matter, I think.  A direct call implies gcc knows the value of a
function pointer, and can thus use the value rather than the pointer.
In this case it implies gcc has looked through 'fn' to its
initialization, and seen that 'fn' is a function address.
ie. your  above is something like
  fn = *(descr *) &some_function;
Now it appears that gcc isn't clever enough to do this, but if it did,
then surely the "value of the pointer" would be 'some_function', not
'fn'.

I can't see how we could end up with a direct call any other way.  As
far as I know, gcc doesn't have any knowledge that 'descr' has
anything to do with a function address unless that knowledge is
imparted by an initialization such as the above.  ie. The answer to
your "and where!" question is "nowhere".
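
For readers unfamiliar with the issue, the implicit read can be modelled in portable C (a sketch only: struct descr, its two fields and call_via_descr are hypothetical, and on a real function-descriptor ABI the load is performed by the call sequence itself rather than written out in the source):

```c
#include <assert.h>
#include <stddef.h>

typedef int (*binop) (int, int);

/* Hypothetical descriptor: an entry point plus some per-function
   context, standing in for what a descriptor-based ABI keeps in memory.  */
struct descr {
  binop entry;
  void *context;
};

static int add(int a, int b) { return a + b; }
static int sub(int a, int b) { return a - b; }

/* An indirect call through a descriptor.  The load of fn->entry is an
   ordinary memory read, so a store to *fn before the call must not be
   treated as dead or moved past the call.  */
static int call_via_descr(const struct descr *fn, int a, int b)
{
  assert(fn != NULL && fn->entry != NULL);
  return fn->entry (a, b);
}
```

Calling call_via_descr twice with a store to the descriptor's entry field in between yields different callees, which is why alias analysis has to assume that an indirect call may read the descriptor's memory.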

> That is - don't you simply want to use
> 
>   if (targetm.function_descriptors
>   && ptr_deref_may_alias_ref_p_1 (callee, ref))
> return true;
> 
> here?

No.  I don't want to do needless work on direct calls, particularly
since it appears that ptr_deref_may_alias_ref_p_1 does return true for
direct calls like memcpy.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] Workaround -Wmaybe-uninitialized false positives during profiledbootstrap

2015-01-27 Thread Martin Liška

On 01/27/2015 05:23 AM, DJ Delorie wrote:

+/* Workaround -Wstrict-overflow false positive during profiledbootstrap.  */
+
+# if GCC_VERSION >= 4004
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wstrict-overflow"
+#endif
+


#pragma diagnostic ignored was added in 4.4 but #pragma diagnostic
push/pop wasn't added until a later release (4.6 I think).  Attempts
to build with 4.4 (i.e. on RHEL 6) cause warnings on most files.


Hello.

Thank you for pointing that out.  Would changing it to 4006 be the right fix?
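
Assuming 4.6 is indeed the first release with push/pop, a guarded region could look like the sketch below.  GCC_VERSION is defined here the way GCC's own headers compute it (major * 1000 + minor); the function successor_is_larger is only a placeholder for the code that triggers the false positive.

```c
#include <assert.h>
#include <limits.h>

#ifndef GCC_VERSION
#define GCC_VERSION (__GNUC__ * 1000 + __GNUC_MINOR__)
#endif

/* Only touch the diagnostic state when both push and pop are available,
   so the suppression cannot leak into the rest of the file.  */
#if GCC_VERSION >= 4006
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wstrict-overflow"
#endif

/* Placeholder for code on which -Wstrict-overflow might fire.  */
static int successor_is_larger(int x)
{
  assert(x < INT_MAX);  /* keep the addition free of signed overflow */
  return x + 1 > x;
}

#if GCC_VERSION >= 4006
#pragma GCC diagnostic pop
#endif
```

With a 4004 threshold the push and pop lines would themselves be unknown pragmas on 4.4 and 4.5, which matches the warnings reported above.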

Thanks,
Martin


2010-06-21  DJ Delorie  

 * diagnostic.h (diagnostic_classification_change_t): New.
 (diagnostic_context): Add history and push/pop list.
 (diagnostic_push_diagnostics): Declare.
 (diagnostic_pop_diagnostics): Declare.
 * diagnostic.c (diagnostic_classify_diagnostic): Store changes
 from pragmas in a history chain instead of the global table.
 (diagnostic_push_diagnostics): New.
 (diagnostic_pop_diagnostics): New.
 (diagnostic_report_diagnostic): Scan history chain to find state
 of diagnostics as of the diagnostic location.
 * opts.c (set_option): Pass UNKNOWN_LOCATION to
 diagnostic_classify_diagnostic.
 (enable_warning_as_error): Likewise.
 * diagnostic-core.h (DK_POP): Add after "real" diagnostics, for
 use in the history chain.
 * doc/extend.texi: Document pragma GCC diagnostic changes.





Re: [PATCH, PR tree-optimization/64277] Improve loop iterations count estimation

2015-01-27 Thread Ilya Enkovich
On 27 Jan 12:29, Richard Biener wrote:
> On Tue, Jan 27, 2015 at 11:47 AM, Ilya Enkovich  
> wrote:
> > On 27 Jan 12:40, Ilya Enkovich wrote:
> >> Hi,
> >>
> >> This patch was supposed to fix PR tree-optimization/64277.  The tracker issue 
> >> is now fixed by disabling the warnings, but I think the patch is still useful 
> >> to avoid dead code generated by complete unrolling.
> >>
> >> Bootstrapped and tested on x86_64-unknown-linux-gnu.
> >>
> >> Thanks,
> >> Ilya
> >> --
> >> gcc/
> >>
> >> 2015-01-27  Ilya Enkovich  
> >>
> >>   * tree-ssa-loop-niter.c (record_nonwrapping_iv): Use base
> >>   range info when possible to refine estimation.
> >>
> >> gcc/testsuite/
> >>
> >> 2015-01-27  Ilya Enkovich  
> >>
> >>   * gcc.dg/pr64277.c: New.
> >>
> >>
> >
> > Here is a new version fixed according to comments in the tracker.  I also 
> > fixed a test to scan cunroll dumps.  Does it look OK?
> 
> Minor comments below.
> 
> > What are possible branches for this patch?
> 
> You can probably create a testcase that shows code-size regressions
> against a version that didn't peel completely (GCC 4.7).  Thus I'd say
> it would apply to 4.9 as well (4.8 doesn't have range information).
> 
> > Thanks,
> > Ilya
> > --
> > diff --git a/gcc/testsuite/gcc.dg/pr64277.c b/gcc/testsuite/gcc.dg/pr64277.c
> > new file mode 100644
> > index 000..c6ef331
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/pr64277.c
> > @@ -0,0 +1,23 @@
> > +/* PR tree-optimization/64277 */
> > +/* { dg-do compile } */
> > +/* { dg-options "-O3 -Wall -Werror -fdump-tree-cunroll-details" } */
> > +/* { dg-final { scan-tree-dump "loop with 5 iterations completely unrolled" "cunroll" } } */
> > +/* { dg-final { scan-tree-dump "loop with 6 iterations completely unrolled" "cunroll" } } */
> > +/* { dg-final { cleanup-tree-dump "cunroll" } } */
> > +
> > +int f1[10];
> > +void test1 (short a[], short m, unsigned short l)
> > +{
> > +  int i = l;
> > +  for (i = i + 5; i < m; i++)
> > +f1[i] = a[i]++;
> > +}
> > +
> > +void test2 (short a[], short m, short l)
> > +{
> > +  int i;
> > +  if (m > 5)
> > +m = 5;
> > +  for (i = m; i > l; i--)
> > +f1[i] = a[i]++;
> > +}
> > diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
> > index 919f5c0..1cd297d 100644
> > --- a/gcc/tree-ssa-loop-niter.c
> > +++ b/gcc/tree-ssa-loop-niter.c
> > @@ -2754,6 +2754,7 @@ record_nonwrapping_iv (struct loop *loop, tree base, 
> > tree step, gimple stmt,
> >  {
> >tree niter_bound, extreme, delta;
> >tree type = TREE_TYPE (base), unsigned_type;
> > +  tree orig_base = base;
> >
> >if (TREE_CODE (step) != INTEGER_CST || integer_zerop (step))
> >  return;
> > @@ -2777,16 +2778,30 @@ record_nonwrapping_iv (struct loop *loop, tree 
> > base, tree step, gimple stmt,
> >
> >if (tree_int_cst_sign_bit (step))
> >  {
> > +  wide_int min, max;
> >extreme = fold_convert (unsigned_type, low);
> > -  if (TREE_CODE (base) != INTEGER_CST)
> > +  if (TREE_CODE (orig_base) == SSA_NAME
> > + && TREE_CODE (high) == INTEGER_CST
> > + && INTEGRAL_TYPE_P (TREE_TYPE (orig_base))
> > + && get_range_info (orig_base, &min, &max) == VR_RANGE
> > + && wi::gts_p (wide_int (high), max))
> 
> For me a simple wi::gts_p (high, max) worked fine.
> 
> > +   base = wide_int_to_tree (unsigned_type, max);
> > +  else if (TREE_CODE (base) != INTEGER_CST)
> > base = fold_convert (unsigned_type, high);
> >delta = fold_build2 (MINUS_EXPR, unsigned_type, base, extreme);
> >step = fold_build1 (NEGATE_EXPR, unsigned_type, step);
> >  }
> >else
> >  {
> > +  wide_int min, max;
> >extreme = fold_convert (unsigned_type, high);
> > -  if (TREE_CODE (base) != INTEGER_CST)
> > +  if (TREE_CODE (orig_base) == SSA_NAME
> > + && TREE_CODE (low) == INTEGER_CST
> > + && INTEGRAL_TYPE_P (TREE_TYPE (orig_base))
> > + && get_range_info (orig_base, &min, &max) == VR_RANGE
> > + && wi::gts_p (min, wide_int (low)))
> 
> Likewise.
> 
> Ok for trunk with those changes.  For the 4.9 branch you need to adjust
> the patch to not use wide-ints.  I'd leave it on trunk for a while and
> eventually open a bugreport for the size regression to keep track of it.
> 
> Thanks,
> Richard.
> 

Thanks a lot for the review!  Here is the final version for GCC 5.0.  I will 
prepare a 4.9 version later.
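
To make the two counts in the test's dump scans concrete, here is a simplified model of the bound (not the code in record_nonwrapping_iv; max_accesses and its parameters are invented for this illustration).  In test1 the index starts at l + 5 with l unsigned, so it is at least 5 and can only run up to 9, the last valid index of f1[10].  In test2 the index starts at m, which is clamped to at most 5, and runs down to 0.

```c
#include <assert.h>

/* Upper bound on how often an access arr[i] can execute when I starts at
   FIRST (the tightest value known from range info), moves by STEP towards
   LAST, and LAST is the final valid index in that direction.  */
static unsigned max_accesses(unsigned first, unsigned last, unsigned step)
{
  unsigned distance = first > last ? first - last : last - first;
  assert(step > 0);
  return distance / step + 1;
}
```

Under this model max_accesses (5, 9, 1) is 5 for test1 and max_accesses (5, 0, 1) is 6 for test2, whereas a start that is only known to lie somewhere in the array gives max_accesses (0, 9, 1), i.e. 10.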

Thanks,
Ilya
--
diff --git a/gcc/testsuite/gcc.dg/pr64277.c b/gcc/testsuite/gcc.dg/pr64277.c
new file mode 100644
index 000..c6ef331
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr64277.c
@@ -0,0 +1,23 @@
+/* PR tree-optimization/64277 */
+/* { dg-do compile } */
+/* { dg-options "-O3 -Wall -Werror -fdump-tree-cunroll-details" } */
+/* { dg-final { scan-tree-dump "loop with 5 iterations completely unrolled" "cunroll" } } */
+/* { dg-final { scan-tree-dump "loop with 6 iterations completely unrolled" "cunroll" } } */
+/* { dg-final { cleanup-tree-dump "cunroll" } } */

RE: [PATCH RFA MIPS] Prohibit vector modes in accumulators

2015-01-27 Thread Matthew Fortune
Richard Sandiford  writes:
> Matthew Fortune  writes:
> >> 2015-01-23  Robert Suchanek  
> >>
> >>* config/mips/mips.c (mips_hard_regno_mode_ok_p): Prohibit
> >> accumulators
> >>for all vector modes.
> >
> > This seems like a genuine bug and although it can only be triggered by
> > loongson or paired-single support it probably qualifies for fixing.
> 
> Agreed FWIW.  We shouldn't mark something as valid for a mode if even
> the mode's move pattern can't handle it.
> 
> I think this kind of thing should go in regardless of development stage.

Given that it was one of the pre-existing tests that failed I'm happy that
we are covering this issue. All of these LRA related issues are likely
to phase in and out with subtle changes to code-gen so I don't think we
can always get a test case that fails on trunk.

Since Catherine asked for further info, I will leave it to her to say whether
she is happy to accept on this basis.

Matthew


Re: Merge current set of OpenACC changes from gomp-4_0-branch

2015-01-27 Thread Julian Brown
On Mon, 26 Jan 2015 14:44:19 +0100
Thomas Schwinge  wrote:

> > On 17 Jan 02:16, Ilya Verbin wrote:
> > > Unfortunately, it broke offloading from shared libraries (I mean
> > > common libs with NEEDED entries, not dlopened).
> 
> Sorry for that!
> 
> > > Such things are not covered by the
> > > testsuite, that's why you missed this issue.  Here is a simple
> > > testcase:
> 
> 
> 
> Probably a good motivation for adding such a test case.  ;-)
> 
> > > So, you don't assume that a device can have multiple images from
> > > multiple libs?
> > 
> > Ping?
> 
> This probably is "just" a bug that we introduced with our changes?
> (Julian?)

AFAICR, we haven't yet figured out how to make (shared) libraries work
with PTX. Actually I'm not entirely sure if static libraries containing
PTX code will work either. But, multiple images (e.g. from different
object files) are supported, via the loop in gomp_target_init.

(The semantics of gomp_register_image_for_device were changed, but not
-- intentionally! -- to limit the number of offloaded images to one.)

> > Also, could you please explain, why did you divide a device
> > initialization into two functions -- gomp_init_device and
> > gomp_init_tables?
> 
> As I understand it (again, Julian, please correct me if I got that
> wrong), the reason is that for OpenACC support, we need these as two
> separate (independent) actions.  Is this causing problems for OpenMP
> offloading?

This was certainly necessary at some point, when the support for
multiple devices of the same type in the OpenACC runtime was delegated
entirely to target-dependent code. Later (after one round of
refactoring), the gomp_device_descr and the memory map were still
separate, with the former possibly representing a number of devices,
and the latter having independent copies for each instance of a device.

That's largely been refactored (again) away now though -- a
gomp_device_descr and its memory map are stored together, per-device
instance. So this separation of their initialisation can probably go
away, although some (somewhat delicate) code in oacc-init.c would need
to be tweaked.

Julian


[PATCH] pr 64047 - explicitly handle target_option_default_node in rs6000_set_current_function

2015-01-27 Thread tbsaunde+gcc
From: Trevor Saunders 

Hi,

The compiler crashes on pr52429.c because this_target_ira_int gets initialized
with null x_init_costs and x_op_costs.  While I don't really understand this 
option handling mess, r217659 made the analogous change to i386 when it broke 
this, so it seems likely this is the right way to fix the regression.

Bootstrapped + regtested on ppc64-linux-gnu without regressions, and pr52429.c
is fixed.  OK?


Trev

gcc/

* config/rs6000/rs6000.c (rs6000_set_current_function): Handle
explicit default options.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 85eb0fd..207fc55 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -32609,7 +32609,7 @@ rs6000_set_current_function (tree fndecl)
   if (old_tree == new_tree)
;
 
-  else if (new_tree)
+  else if (new_tree && new_tree != target_option_default_node)
{
  cl_target_option_restore (&global_options,
TREE_TARGET_OPTION (new_tree));
@@ -32620,7 +32620,7 @@ rs6000_set_current_function (tree fndecl)
  = save_target_globals_default_opts ();
}
 
-  else if (old_tree)
+  else if (old_tree && old_tree != target_option_default_node)
{
  new_tree = target_option_current_node;
  cl_target_option_restore (&global_options,
-- 
2.1.4



Re: [PATCH][2/2] Improve array-bound warnings and VRP

2015-01-27 Thread Richard Biener
On Mon, 26 Jan 2015, Jakub Jelinek wrote:

> On Mon, Jan 26, 2015 at 04:18:32PM +0100, Richard Biener wrote:
> > > > Ok for trunk?  Or should I delay this to GCC 6?
> > > 
> > > Does this work even without the other patch?
> > 
> > Yes, I've actually developed 2/2 first.  The other patch only ever
> > emits more warnings...
> 
> Then it probably should be ok.  I'm really afraid of emitting more warnings
> with such a high false positive rate now.

As the patch also mitigates some of the code bloat we get with
the complete peeling (regression against 4.7) I have installed it.
It's also the easiest vehicle to verify range-info is not broken
by passes between vrp1 and vrp2.

Thanks,
Richard.


Re: [PATCH, PR tree-optimization/64277] Improve loop iterations count estimation

2015-01-27 Thread Richard Biener
On Tue, Jan 27, 2015 at 11:47 AM, Ilya Enkovich  wrote:
> On 27 Jan 12:40, Ilya Enkovich wrote:
>> Hi,
>>
>> This patch was supposed to fix PR tree-optimization/64277.  The tracker issue 
>> is now fixed by disabling the warnings, but I think the patch is still useful 
>> to avoid dead code generated by complete unrolling.
>>
>> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>>
>> Thanks,
>> Ilya
>> --
>> gcc/
>>
>> 2015-01-27  Ilya Enkovich  
>>
>>   * tree-ssa-loop-niter.c (record_nonwrapping_iv): Use base
>>   range info when possible to refine estimation.
>>
>> gcc/testsuite/
>>
>> 2015-01-27  Ilya Enkovich  
>>
>>   * gcc.dg/pr64277.c: New.
>>
>>
>
> Here is a new version fixed according to comments in the tracker.  I also 
> fixed a test to scan cunroll dumps.  Does it look OK?

Minor comments below.

> What are possible branches for this patch?

You can probably create a testcase that shows code-size regressions
against a version that didn't peel completely (GCC 4.7).  Thus I'd say
it would apply to 4.9 as well (4.8 doesn't have range information).

> Thanks,
> Ilya
> --
> diff --git a/gcc/testsuite/gcc.dg/pr64277.c b/gcc/testsuite/gcc.dg/pr64277.c
> new file mode 100644
> index 000..c6ef331
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr64277.c
> @@ -0,0 +1,23 @@
> +/* PR tree-optimization/64277 */
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -Wall -Werror -fdump-tree-cunroll-details" } */
> +/* { dg-final { scan-tree-dump "loop with 5 iterations completely unrolled" "cunroll" } } */
> +/* { dg-final { scan-tree-dump "loop with 6 iterations completely unrolled" "cunroll" } } */
> +/* { dg-final { cleanup-tree-dump "cunroll" } } */
> +
> +int f1[10];
> +void test1 (short a[], short m, unsigned short l)
> +{
> +  int i = l;
> +  for (i = i + 5; i < m; i++)
> +f1[i] = a[i]++;
> +}
> +
> +void test2 (short a[], short m, short l)
> +{
> +  int i;
> +  if (m > 5)
> +m = 5;
> +  for (i = m; i > l; i--)
> +f1[i] = a[i]++;
> +}
> diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
> index 919f5c0..1cd297d 100644
> --- a/gcc/tree-ssa-loop-niter.c
> +++ b/gcc/tree-ssa-loop-niter.c
> @@ -2754,6 +2754,7 @@ record_nonwrapping_iv (struct loop *loop, tree base, 
> tree step, gimple stmt,
>  {
>tree niter_bound, extreme, delta;
>tree type = TREE_TYPE (base), unsigned_type;
> +  tree orig_base = base;
>
>if (TREE_CODE (step) != INTEGER_CST || integer_zerop (step))
>  return;
> @@ -2777,16 +2778,30 @@ record_nonwrapping_iv (struct loop *loop, tree base, 
> tree step, gimple stmt,
>
>if (tree_int_cst_sign_bit (step))
>  {
> +  wide_int min, max;
>extreme = fold_convert (unsigned_type, low);
> -  if (TREE_CODE (base) != INTEGER_CST)
> +  if (TREE_CODE (orig_base) == SSA_NAME
> + && TREE_CODE (high) == INTEGER_CST
> + && INTEGRAL_TYPE_P (TREE_TYPE (orig_base))
> + && get_range_info (orig_base, &min, &max) == VR_RANGE
> + && wi::gts_p (wide_int (high), max))

For me a simple wi::gts_p (high, max) worked fine.

> +   base = wide_int_to_tree (unsigned_type, max);
> +  else if (TREE_CODE (base) != INTEGER_CST)
> base = fold_convert (unsigned_type, high);
>delta = fold_build2 (MINUS_EXPR, unsigned_type, base, extreme);
>step = fold_build1 (NEGATE_EXPR, unsigned_type, step);
>  }
>else
>  {
> +  wide_int min, max;
>extreme = fold_convert (unsigned_type, high);
> -  if (TREE_CODE (base) != INTEGER_CST)
> +  if (TREE_CODE (orig_base) == SSA_NAME
> + && TREE_CODE (low) == INTEGER_CST
> + && INTEGRAL_TYPE_P (TREE_TYPE (orig_base))
> + && get_range_info (orig_base, &min, &max) == VR_RANGE
> + && wi::gts_p (min, wide_int (low)))

Likewise.

Ok for trunk with those changes.  For the 4.9 branch you need to adjust
the patch to not use wide-ints.  I'd leave it on trunk for a while and
eventually open a bugreport for the size regression to keep track of it.

Thanks,
Richard.

> +   base = wide_int_to_tree (unsigned_type, min);
> +  else if (TREE_CODE (base) != INTEGER_CST)
> base = fold_convert (unsigned_type, low);
>delta = fold_build2 (MINUS_EXPR, unsigned_type, extreme, base);
>  }


Re: [PATCH] Fix PR64277

2015-01-27 Thread Ilya Enkovich
2015-01-27 13:59 GMT+03:00 Richard Biener :
> On Tue, 27 Jan 2015, Ilya Enkovich wrote:
>
>> 2015-01-27 12:47 GMT+03:00 Richard Biener :
>> > On Tue, 27 Jan 2015, Jakub Jelinek wrote:
>> >
>> >> On Tue, Jan 27, 2015 at 10:25:48AM +0100, Richard Biener wrote:
>> >> >
>> >> > This disables array-bound warnings from VRP2 as discussed.
>> >> >
>> >> > Bootstrapped and tested on x86_64-unknown-linux-gnu - ok for trunk?
>> >>
>> >> So nothing in the testsuite needed to change?  Nice.
>> >
>> > Yes.
>> >
>> >> Ok for trunk.
>> >>
>> >> > I'll search for duplicates and add a few testcases.
>> >>
>> >> Thanks.
>> >
>> > Committed as follows (first testcase in PR59124 not fixed - it warns
>> > from the first pass).
>>
>> Are you going to port it to 4.9 branch?
>
> I plan to do that (4.8 as well) after some time.

Great, thanks!

Ilya

>
> Richard.
>
>> Thanks,
>> Ilya
>>
>> >
>> > 2015-01-27  Richard Biener  
>> >
>> > PR tree-optimization/56273
>> > PR tree-optimization/59124
>> > PR tree-optimization/64277
>> > * tree-vrp.c (vrp_finalize): Emit array-bound warnings only
>> > from the first VRP pass.
>> >
>> > * g++.dg/warn/Warray-bounds-6.C: New testcase.
>> > * gcc.dg/Warray-bounds-12.c: Likewise.
>> > * gcc.dg/Warray-bounds-13.c: Likewise.
>> >
>> > Index: gcc/tree-vrp.c
>> > ===
>> > *** gcc/tree-vrp.c.orig 2015-01-27 10:34:26.453743828 +0100
>> > --- gcc/tree-vrp.c  2015-01-27 10:43:04.970610102 +0100
>> > *** vrp_finalize (void)
>> > *** 10229,10235 
>> > substitute_and_fold (op_with_constant_singleton_value_range,
>> >vrp_fold_stmt, false);
>> >
>> > !   if (warn_array_bounds)
>> >   check_all_array_refs ();
>> >
>> > /* We must identify jump threading opportunities before we release
>> > --- 10229,10235 
>> > substitute_and_fold (op_with_constant_singleton_value_range,
>> >vrp_fold_stmt, false);
>> >
>> > !   if (warn_array_bounds && first_pass_instance)
>> >   check_all_array_refs ();
>> >
>> > /* We must identify jump threading opportunities before we release
>> > Index: gcc/testsuite/g++.dg/warn/Warray-bounds-6.C
>> > ===
>> > *** /dev/null   1970-01-01 00:00:00.0 +
>> > --- gcc/testsuite/g++.dg/warn/Warray-bounds-6.C 2015-01-27 
>> > 10:40:31.311871855 +0100
>> > ***
>> > *** 0 
>> > --- 1,26 
>> > + // { dg-do compile }
>> > + // { dg-options "-O3 -Warray-bounds" }
>> > +
>> > + struct type {
>> > + bool a, b;
>> > + bool get_b() { return b; }
>> > + };
>> > +
>> > + type stuff[9u];
>> > +
>> > + void bar();
>> > +
>> > + void foo()
>> > + {
>> > +   for(unsigned i = 0u; i < 9u; i++)
>> > + {
>> > +   if(!stuff[i].a)
>> > +   continue;
>> > +
>> > +   bar();
>> > +
>> > +   for(unsigned j = i + 1u; j < 9u; j++)
>> > +   if(stuff[j].a && stuff[j].get_b()) // { dg-bogus "above array 
>> > bounds" }
>> > + return;
>> > + }
>> > + }
>> > Index: gcc/testsuite/gcc.dg/Warray-bounds-12.c
>> > ===
>> > *** /dev/null   1970-01-01 00:00:00.0 +
>> > --- gcc/testsuite/gcc.dg/Warray-bounds-12.c 2015-01-27 
>> > 10:40:58.196175989 +0100
>> > ***
>> > *** 0 
>> > --- 1,26 
>> > + /* { dg-do compile } */
>> > + /* { dg-options "-O3 -Warray-bounds" } */
>> > + /* { dg-additional-options "-mssse3" { target x86_64-*-* i?86-*-* } } */
>> > +
>> > + void foo(short a[], short m)
>> > + {
>> > +   int i, j;
>> > +   int f1[10];
>> > +   short nc;
>> > +
>> > +   nc = m + 1;
>> > +   if (nc > 3)
>> > + {
>> > +   for (i = 0; i <= nc; i++)
>> > +   {
>> > + f1[i] = f1[i] + 1;
>> > +   }
>> > + }
>> > +
>> > +   for (i = 0, j = m; i < nc; i++, j--)
>> > + {
>> > +   a[i] = f1[i]; /* { dg-bogus "above array bounds" } */
>> > +   a[j] = i;
>> > + }
>> > +   return;
>> > + }
>> > Index: gcc/testsuite/gcc.dg/Warray-bounds-13.c
>> > ===
>> > *** /dev/null   1970-01-01 00:00:00.0 +
>> > --- gcc/testsuite/gcc.dg/Warray-bounds-13.c 2015-01-27 
>> > 10:42:43.738369929 +0100
>> > ***
>> > *** 0 
>> > --- 1,18 
>> > + /* { dg-do compile } */
>> > + /* { dg-options "-O3 -Warray-bounds" } */
>> > +
>> > + extern char *bar[17];
>> > +
>> > + int foo(int argc, char **argv)
>> > + {
>> > +   int i;
>> > +   int n = 0;
>> > +
>> > +   for (i = 0; i < argc; i++)
>> > + n++;
>> > +
>> > +   for (i = 0; i < argc; i++)
>> > + argv[i] = bar[i + n]; /* { dg-bogus "above array bounds" } */
>> > +
>> > +   return 0;
>> > + }
>>
>>
>
> --
> Richard Biener 
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
> Dilip Upmanyu, Grah

Re: [PATCH] Fix PR64277

2015-01-27 Thread Richard Biener
On Tue, 27 Jan 2015, Ilya Enkovich wrote:

> 2015-01-27 12:47 GMT+03:00 Richard Biener :
> > On Tue, 27 Jan 2015, Jakub Jelinek wrote:
> >
> >> On Tue, Jan 27, 2015 at 10:25:48AM +0100, Richard Biener wrote:
> >> >
> >> > This disables array-bound warnings from VRP2 as discussed.
> >> >
> >> > Bootstrapped and tested on x86_64-unknown-linux-gnu - ok for trunk?
> >>
> >> So nothing in the testsuite needed to change?  Nice.
> >
> > Yes.
> >
> >> Ok for trunk.
> >>
> >> > I'll search for duplicates and add a few testcases.
> >>
> >> Thanks.
> >
> > Committed as follows (first testcase in PR59124 not fixed - it warns
> > from the first pass).
> 
> Are you going to port it to 4.9 branch?

I plan to do that (4.8 as well) after some time.

Richard.

> Thanks,
> Ilya
> 
> >
> > 2015-01-27  Richard Biener  
> >
> > PR tree-optimization/56273
> > PR tree-optimization/59124
> > PR tree-optimization/64277
> > * tree-vrp.c (vrp_finalize): Emit array-bound warnings only
> > from the first VRP pass.
> >
> > * g++.dg/warn/Warray-bounds-6.C: New testcase.
> > * gcc.dg/Warray-bounds-12.c: Likewise.
> > * gcc.dg/Warray-bounds-13.c: Likewise.
> >
> > Index: gcc/tree-vrp.c
> > ===
> > *** gcc/tree-vrp.c.orig 2015-01-27 10:34:26.453743828 +0100
> > --- gcc/tree-vrp.c  2015-01-27 10:43:04.970610102 +0100
> > *** vrp_finalize (void)
> > *** 10229,10235 
> > substitute_and_fold (op_with_constant_singleton_value_range,
> >vrp_fold_stmt, false);
> >
> > !   if (warn_array_bounds)
> >   check_all_array_refs ();
> >
> > /* We must identify jump threading opportunities before we release
> > --- 10229,10235 
> > substitute_and_fold (op_with_constant_singleton_value_range,
> >vrp_fold_stmt, false);
> >
> > !   if (warn_array_bounds && first_pass_instance)
> >   check_all_array_refs ();
> >
> > /* We must identify jump threading opportunities before we release
> > Index: gcc/testsuite/g++.dg/warn/Warray-bounds-6.C
> > ===
> > *** /dev/null   1970-01-01 00:00:00.0 +
> > > --- gcc/testsuite/g++.dg/warn/Warray-bounds-6.C 2015-01-27 10:40:31.311871855 +0100
> > ***
> > *** 0 
> > --- 1,26 
> > + // { dg-do compile }
> > + // { dg-options "-O3 -Warray-bounds" }
> > +
> > + struct type {
> > + bool a, b;
> > + bool get_b() { return b; }
> > + };
> > +
> > + type stuff[9u];
> > +
> > + void bar();
> > +
> > + void foo()
> > + {
> > +   for(unsigned i = 0u; i < 9u; i++)
> > + {
> > +   if(!stuff[i].a)
> > +   continue;
> > +
> > +   bar();
> > +
> > +   for(unsigned j = i + 1u; j < 9u; j++)
> > > +   if(stuff[j].a && stuff[j].get_b()) // { dg-bogus "above array bounds" }
> > + return;
> > + }
> > + }
> > Index: gcc/testsuite/gcc.dg/Warray-bounds-12.c
> > ===
> > *** /dev/null   1970-01-01 00:00:00.0 +
> > > --- gcc/testsuite/gcc.dg/Warray-bounds-12.c 2015-01-27 10:40:58.196175989 +0100
> > ***
> > *** 0 
> > --- 1,26 
> > + /* { dg-do compile } */
> > + /* { dg-options "-O3 -Warray-bounds" } */
> > + /* { dg-additional-options "-mssse3" { target x86_64-*-* i?86-*-* } } */
> > +
> > + void foo(short a[], short m)
> > + {
> > +   int i, j;
> > +   int f1[10];
> > +   short nc;
> > +
> > +   nc = m + 1;
> > +   if (nc > 3)
> > + {
> > +   for (i = 0; i <= nc; i++)
> > +   {
> > + f1[i] = f1[i] + 1;
> > +   }
> > + }
> > +
> > +   for (i = 0, j = m; i < nc; i++, j--)
> > + {
> > +   a[i] = f1[i]; /* { dg-bogus "above array bounds" } */
> > +   a[j] = i;
> > + }
> > +   return;
> > + }
> > Index: gcc/testsuite/gcc.dg/Warray-bounds-13.c
> > ===
> > *** /dev/null   1970-01-01 00:00:00.0 +
> > > --- gcc/testsuite/gcc.dg/Warray-bounds-13.c 2015-01-27 10:42:43.738369929 +0100
> > ***
> > *** 0 
> > --- 1,18 
> > + /* { dg-do compile } */
> > + /* { dg-options "-O3 -Warray-bounds" } */
> > +
> > + extern char *bar[17];
> > +
> > + int foo(int argc, char **argv)
> > + {
> > +   int i;
> > +   int n = 0;
> > +
> > +   for (i = 0; i < argc; i++)
> > + n++;
> > +
> > +   for (i = 0; i < argc; i++)
> > + argv[i] = bar[i + n]; /* { dg-bogus "above array bounds" } */
> > +
> > +   return 0;
> > + }
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)


Re: [PATCH, PR tree-optimization/64277] Improve loop iterations count estimation

2015-01-27 Thread Ilya Enkovich
On 27 Jan 12:40, Ilya Enkovich wrote:
> Hi,
> 
> This patch was supposed to fix PR tree-optimization/64277.  The tracker entry 
> is now fixed by disabling the warnings, but I think the patch is still useful 
> to avoid dead code generated by complete unrolling.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
> 
> Thanks,
> Ilya
> --
> gcc/
> 
> 2015-01-27  Ilya Enkovich  
> 
>   * tree-ssa-loop-niter.c (record_nonwrapping_iv): Use base
>   range info when possible to refine estimation.
> 
> gcc/testsuite/
> 
> 2015-01-27  Ilya Enkovich  
> 
>   * gcc.dg/pr64277.c: New.
> 
> 

Here is a new version, fixed according to the comments in the tracker.  I also 
changed the test to scan the cunroll dumps.  Does it look OK?

Which branches is this patch suitable for?

Thanks,
Ilya
--
diff --git a/gcc/testsuite/gcc.dg/pr64277.c b/gcc/testsuite/gcc.dg/pr64277.c
new file mode 100644
index 000..c6ef331
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr64277.c
@@ -0,0 +1,23 @@
+/* PR tree-optimization/64277 */
+/* { dg-do compile } */
+/* { dg-options "-O3 -Wall -Werror -fdump-tree-cunroll-details" } */
+/* { dg-final { scan-tree-dump "loop with 5 iterations completely unrolled" "cunroll" } } */
+/* { dg-final { scan-tree-dump "loop with 6 iterations completely unrolled" "cunroll" } } */
+/* { dg-final { cleanup-tree-dump "cunroll" } } */
+
+int f1[10];
+void test1 (short a[], short m, unsigned short l)
+{
+  int i = l;
+  for (i = i + 5; i < m; i++)
+f1[i] = a[i]++;
+}
+
+void test2 (short a[], short m, short l)
+{
+  int i;
+  if (m > 5)
+m = 5;
+  for (i = m; i > l; i--)
+f1[i] = a[i]++;
+}
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 919f5c0..1cd297d 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -2754,6 +2754,7 @@ record_nonwrapping_iv (struct loop *loop, tree base, tree step, gimple stmt,
 {
   tree niter_bound, extreme, delta;
   tree type = TREE_TYPE (base), unsigned_type;
+  tree orig_base = base;
 
   if (TREE_CODE (step) != INTEGER_CST || integer_zerop (step))
 return;
@@ -2777,16 +2778,30 @@ record_nonwrapping_iv (struct loop *loop, tree base, tree step, gimple stmt,
 
   if (tree_int_cst_sign_bit (step))
 {
+  wide_int min, max;
   extreme = fold_convert (unsigned_type, low);
-  if (TREE_CODE (base) != INTEGER_CST)
+  if (TREE_CODE (orig_base) == SSA_NAME
+ && TREE_CODE (high) == INTEGER_CST
+ && INTEGRAL_TYPE_P (TREE_TYPE (orig_base))
+ && get_range_info (orig_base, &min, &max) == VR_RANGE
+ && wi::gts_p (wide_int (high), max))
+   base = wide_int_to_tree (unsigned_type, max);
+  else if (TREE_CODE (base) != INTEGER_CST)
base = fold_convert (unsigned_type, high);
   delta = fold_build2 (MINUS_EXPR, unsigned_type, base, extreme);
   step = fold_build1 (NEGATE_EXPR, unsigned_type, step);
 }
   else
 {
+  wide_int min, max;
   extreme = fold_convert (unsigned_type, high);
-  if (TREE_CODE (base) != INTEGER_CST)
+  if (TREE_CODE (orig_base) == SSA_NAME
+ && TREE_CODE (low) == INTEGER_CST
+ && INTEGRAL_TYPE_P (TREE_TYPE (orig_base))
+ && get_range_info (orig_base, &min, &max) == VR_RANGE
+ && wi::gts_p (min, wide_int (low)))
+   base = wide_int_to_tree (unsigned_type, min);
+  else if (TREE_CODE (base) != INTEGER_CST)
base = fold_convert (unsigned_type, low);
   delta = fold_build2 (MINUS_EXPR, unsigned_type, extreme, base);
 }


Re: [PATCH] Fix PR64277

2015-01-27 Thread Ilya Enkovich
2015-01-27 12:47 GMT+03:00 Richard Biener :
> On Tue, 27 Jan 2015, Jakub Jelinek wrote:
>
>> On Tue, Jan 27, 2015 at 10:25:48AM +0100, Richard Biener wrote:
>> >
>> > This disables array-bound warnings from VRP2 as discussed.
>> >
>> > Bootstrapped and tested on x86_64-unknown-linux-gnu - ok for trunk?
>>
>> So nothing in the testsuite needed to change?  Nice.
>
> Yes.
>
>> Ok for trunk.
>>
>> > I'll search for duplicates and add a few testcases.
>>
>> Thanks.
>
> Committed as follows (first testcase in PR59124 not fixed - it warns
> from the first pass).

Are you going to port it to 4.9 branch?

Thanks,
Ilya

>
> 2015-01-27  Richard Biener  
>
> PR tree-optimization/56273
> PR tree-optimization/59124
> PR tree-optimization/64277
> * tree-vrp.c (vrp_finalize): Emit array-bound warnings only
> from the first VRP pass.
>
> * g++.dg/warn/Warray-bounds-6.C: New testcase.
> * gcc.dg/Warray-bounds-12.c: Likewise.
> * gcc.dg/Warray-bounds-13.c: Likewise.
>
> Index: gcc/tree-vrp.c
> ===
> *** gcc/tree-vrp.c.orig 2015-01-27 10:34:26.453743828 +0100
> --- gcc/tree-vrp.c  2015-01-27 10:43:04.970610102 +0100
> *** vrp_finalize (void)
> *** 10229,10235 
> substitute_and_fold (op_with_constant_singleton_value_range,
>vrp_fold_stmt, false);
>
> !   if (warn_array_bounds)
>   check_all_array_refs ();
>
> /* We must identify jump threading opportunities before we release
> --- 10229,10235 
> substitute_and_fold (op_with_constant_singleton_value_range,
>vrp_fold_stmt, false);
>
> !   if (warn_array_bounds && first_pass_instance)
>   check_all_array_refs ();
>
> /* We must identify jump threading opportunities before we release
> Index: gcc/testsuite/g++.dg/warn/Warray-bounds-6.C
> ===
> *** /dev/null   1970-01-01 00:00:00.0 +
> --- gcc/testsuite/g++.dg/warn/Warray-bounds-6.C 2015-01-27 10:40:31.311871855 +0100
> ***
> *** 0 
> --- 1,26 
> + // { dg-do compile }
> + // { dg-options "-O3 -Warray-bounds" }
> +
> + struct type {
> + bool a, b;
> + bool get_b() { return b; }
> + };
> +
> + type stuff[9u];
> +
> + void bar();
> +
> + void foo()
> + {
> +   for(unsigned i = 0u; i < 9u; i++)
> + {
> +   if(!stuff[i].a)
> +   continue;
> +
> +   bar();
> +
> +   for(unsigned j = i + 1u; j < 9u; j++)
> +   if(stuff[j].a && stuff[j].get_b()) // { dg-bogus "above array bounds" }
> + return;
> + }
> + }
> Index: gcc/testsuite/gcc.dg/Warray-bounds-12.c
> ===
> *** /dev/null   1970-01-01 00:00:00.0 +
> --- gcc/testsuite/gcc.dg/Warray-bounds-12.c 2015-01-27 10:40:58.196175989 +0100
> ***
> *** 0 
> --- 1,26 
> + /* { dg-do compile } */
> + /* { dg-options "-O3 -Warray-bounds" } */
> + /* { dg-additional-options "-mssse3" { target x86_64-*-* i?86-*-* } } */
> +
> + void foo(short a[], short m)
> + {
> +   int i, j;
> +   int f1[10];
> +   short nc;
> +
> +   nc = m + 1;
> +   if (nc > 3)
> + {
> +   for (i = 0; i <= nc; i++)
> +   {
> + f1[i] = f1[i] + 1;
> +   }
> + }
> +
> +   for (i = 0, j = m; i < nc; i++, j--)
> + {
> +   a[i] = f1[i]; /* { dg-bogus "above array bounds" } */
> +   a[j] = i;
> + }
> +   return;
> + }
> Index: gcc/testsuite/gcc.dg/Warray-bounds-13.c
> ===
> *** /dev/null   1970-01-01 00:00:00.0 +
> --- gcc/testsuite/gcc.dg/Warray-bounds-13.c 2015-01-27 10:42:43.738369929 +0100
> ***
> *** 0 
> --- 1,18 
> + /* { dg-do compile } */
> + /* { dg-options "-O3 -Warray-bounds" } */
> +
> + extern char *bar[17];
> +
> + int foo(int argc, char **argv)
> + {
> +   int i;
> +   int n = 0;
> +
> +   for (i = 0; i < argc; i++)
> + n++;
> +
> +   for (i = 0; i < argc; i++)
> + argv[i] = bar[i + n]; /* { dg-bogus "above array bounds" } */
> +
> +   return 0;
> + }


Re: [PATCH] Fix PR64277

2015-01-27 Thread Richard Biener
On Tue, 27 Jan 2015, Jakub Jelinek wrote:

> On Tue, Jan 27, 2015 at 10:25:48AM +0100, Richard Biener wrote:
> > 
> > This disables array-bound warnings from VRP2 as discussed.
> > 
> > Bootstrapped and tested on x86_64-unknown-linux-gnu - ok for trunk?
> 
> So nothing in the testsuite needed to change?  Nice.

Yes.

> Ok for trunk.
> 
> > I'll search for duplicates and add a few testcases.
> 
> Thanks.

Committed as follows (first testcase in PR59124 not fixed - it warns
from the first pass).

2015-01-27  Richard Biener  

PR tree-optimization/56273
PR tree-optimization/59124
PR tree-optimization/64277
* tree-vrp.c (vrp_finalize): Emit array-bound warnings only
from the first VRP pass.

* g++.dg/warn/Warray-bounds-6.C: New testcase.
* gcc.dg/Warray-bounds-12.c: Likewise.
* gcc.dg/Warray-bounds-13.c: Likewise.

Index: gcc/tree-vrp.c
===
*** gcc/tree-vrp.c.orig 2015-01-27 10:34:26.453743828 +0100
--- gcc/tree-vrp.c  2015-01-27 10:43:04.970610102 +0100
*** vrp_finalize (void)
*** 10229,10235 
substitute_and_fold (op_with_constant_singleton_value_range,
   vrp_fold_stmt, false);
  
!   if (warn_array_bounds)
  check_all_array_refs ();
  
/* We must identify jump threading opportunities before we release
--- 10229,10235 
substitute_and_fold (op_with_constant_singleton_value_range,
   vrp_fold_stmt, false);
  
!   if (warn_array_bounds && first_pass_instance)
  check_all_array_refs ();
  
/* We must identify jump threading opportunities before we release
Index: gcc/testsuite/g++.dg/warn/Warray-bounds-6.C
===
*** /dev/null   1970-01-01 00:00:00.0 +
--- gcc/testsuite/g++.dg/warn/Warray-bounds-6.C 2015-01-27 10:40:31.311871855 +0100
***
*** 0 
--- 1,26 
+ // { dg-do compile }
+ // { dg-options "-O3 -Warray-bounds" }
+ 
+ struct type {
+ bool a, b;
+ bool get_b() { return b; }
+ };
+ 
+ type stuff[9u];
+ 
+ void bar();
+ 
+ void foo()
+ {
+   for(unsigned i = 0u; i < 9u; i++)
+ {
+   if(!stuff[i].a)
+   continue;
+ 
+   bar();
+ 
+   for(unsigned j = i + 1u; j < 9u; j++)
+   if(stuff[j].a && stuff[j].get_b()) // { dg-bogus "above array bounds" }
+ return;
+ }
+ }
Index: gcc/testsuite/gcc.dg/Warray-bounds-12.c
===
*** /dev/null   1970-01-01 00:00:00.0 +
--- gcc/testsuite/gcc.dg/Warray-bounds-12.c 2015-01-27 10:40:58.196175989 +0100
***
*** 0 
--- 1,26 
+ /* { dg-do compile } */
+ /* { dg-options "-O3 -Warray-bounds" } */
+ /* { dg-additional-options "-mssse3" { target x86_64-*-* i?86-*-* } } */
+ 
+ void foo(short a[], short m)
+ {
+   int i, j;
+   int f1[10];
+   short nc;
+ 
+   nc = m + 1;
+   if (nc > 3)
+ {
+   for (i = 0; i <= nc; i++)
+   {
+ f1[i] = f1[i] + 1;
+   }
+ }
+ 
+   for (i = 0, j = m; i < nc; i++, j--)
+ {
+   a[i] = f1[i]; /* { dg-bogus "above array bounds" } */
+   a[j] = i;
+ }
+   return;
+ }
Index: gcc/testsuite/gcc.dg/Warray-bounds-13.c
===
*** /dev/null   1970-01-01 00:00:00.0 +
--- gcc/testsuite/gcc.dg/Warray-bounds-13.c 2015-01-27 10:42:43.738369929 +0100
***
*** 0 
--- 1,18 
+ /* { dg-do compile } */
+ /* { dg-options "-O3 -Warray-bounds" } */
+ 
+ extern char *bar[17];
+ 
+ int foo(int argc, char **argv)
+ {
+   int i;
+   int n = 0;
+ 
+   for (i = 0; i < argc; i++)
+ n++;
+ 
+   for (i = 0; i < argc; i++)
+ argv[i] = bar[i + n]; /* { dg-bogus "above array bounds" } */
+ 
+   return 0;
+ }


Re: [PATCH][AArch64] Use target builtin instead of __builtin_sqrt for vsqrt_f64

2015-01-27 Thread Kyrill Tkachov


On 19/01/15 15:46, Kyrill Tkachov wrote:
> On 19/01/15 15:44, James Greenhalgh wrote:
>> On Mon, Jan 12, 2015 at 05:30:46PM +, Andrew Pinski wrote:
>>> On Mon, Jan 12, 2015 at 7:52 AM, Kyrill Tkachov  wrote:
>>>> Hi all,
>>>>
>>>> As raised in https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01237.html and
>>>> discussed in that thread, using __builtin_sqrt for vsqrt_f64 may end up in a
>>>> call to the library sqrt at -O0. To avoid that this patch uses a target
>>>> builtin for sqrt on DF mode and uses that to implement the intrinsic.
>>>>
>>>> With this patch I don't see sqrt calls being created at -O0 on a large
>>>> arm_neon.h testcase where they were generated before.
>>>> aarch64-none-elf testing and the intrinsics testsuite in particular are
>>>> clean.
>>>> Ok for trunk?
>>>
>>> Maybe have a target fold which folds this into sqrt if -fno-math-errno
>>> is supplied.  This might be useful in the -ffast-math case.
>>> Maybe also fold it when a constant is supplied too.
>>
>> Given that we are now in Stage 4, I'd rather see this fixed for GCC 5.0
>> in the way Kyrill proposed than languishing on a TODO list. Though an
>> IOU ticket on bugzilla for the missed optimization seems a good idea
>> to me.
>>
>> Unless Kyrill already has something in the works to address your
>> comment, this looks like the right short-term solution to me
>> (Though Marcus/Richard will have to approve it).
>
> Sorry, this slipped through the cracks.
> I agree with James. A missed-optimization issue on bugzilla would be
> helpful to keep track of this.

I've filed PR 64821 to keep track of this for GCC 6.
Can I ping https://gcc.gnu.org/ml/gcc-patches/2015-01/msg00710.html then?
It's a regression fix at -O0, so it should be appropriate for stage 4.

Thanks,
Kyrill

> Kyrill
>
>> Thanks,
>> James
>>
>>>> 2015-01-12  Kyrylo Tkachov  
>>>>
>>>>   * config/aarch64/aarch64-simd-builtins.def (sqrt): Use BUILTIN_VDQF_DF.
>>>>   * config/aarch64/arm_neon.h (vsqrt_f64): Use __builtin_aarch64_sqrtdf
>>>>   instead of __builtin_sqrt.


[PATCH, PR tree-optimization/64277] Improve loop iterations count estimation

2015-01-27 Thread Ilya Enkovich
Hi,

This patch was supposed to fix PR tree-optimization/64277.  The tracker entry 
is now fixed by disabling the warnings, but I think the patch is still useful 
to avoid dead code generated by complete unrolling.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Thanks,
Ilya
--
gcc/

2015-01-27  Ilya Enkovich  

* tree-ssa-loop-niter.c (record_nonwrapping_iv): Use base
range info when possible to refine estimation.

gcc/testsuite/

2015-01-27  Ilya Enkovich  

* gcc.dg/pr64277.c: New.
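
The effect of the ChangeLog entry above can be modelled with a small
standalone sketch.  This is not the GCC code: the helper name is hypothetical,
and it assumes the bound is simply (start - low) / step for a decreasing
induction variable.  For test2 in the new testcase, f1[10] limits the index to
[0, 9], while range information says the start value is at most 5.

```c
#include <assert.h>

/* Hypothetical model, not the GCC implementation: bound on the number of
   steps a decreasing induction variable can take before it drops below
   LOW, given an upper bound START_UPPER on its initial value.  */
static unsigned
bound_for_decreasing_iv (int start_upper, int low, unsigned step)
{
  assert (step > 0 && start_upper >= low);
  return ((unsigned) start_upper - (unsigned) low) / step;
}
```

With only the array bounds known, bound_for_decreasing_iv (9, 0, 1) is used;
with the range of the start value it becomes bound_for_decreasing_iv (5, 0, 1),
which is the kind of tighter estimate the patch is after.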


diff --git a/gcc/testsuite/gcc.dg/pr64277.c b/gcc/testsuite/gcc.dg/pr64277.c
new file mode 100644
index 000..0d5ef11
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr64277.c
@@ -0,0 +1,21 @@
+/* PR tree-optimization/64277 */
+/* { dg-do compile } */
+/* { dg-options "-O3 -Wall -Werror" } */
+
+
+int f1[10];
+void test1 (short a[], short m, unsigned short l)
+{
+  int i = l;
+  for (i = i + 5; i < m; i++)
+f1[i] = a[i]++;
+}
+
+void test2 (short a[], short m, short l)
+{
+  int i;
+  if (m > 5)
+m = 5;
+  for (i = m; i > l; i--)
+f1[i] = a[i]++;
+}
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 919f5c0..6a55c6f 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -2754,6 +2754,7 @@ record_nonwrapping_iv (struct loop *loop, tree base, tree step, gimple stmt,
 {
   tree niter_bound, extreme, delta;
   tree type = TREE_TYPE (base), unsigned_type;
+  tree orig_base = base;
 
   if (TREE_CODE (step) != INTEGER_CST || integer_zerop (step))
 return;
@@ -2777,16 +2778,32 @@ record_nonwrapping_iv (struct loop *loop, tree base, tree step, gimple stmt,
 
   if (tree_int_cst_sign_bit (step))
 {
+  wide_int min, max;
   extreme = fold_convert (unsigned_type, low);
-  if (TREE_CODE (base) != INTEGER_CST)
+  if (TREE_CODE (orig_base) == SSA_NAME
+ && TREE_CODE (high) == INTEGER_CST
+ && !POINTER_TYPE_P (TREE_TYPE (orig_base))
+ && SSA_NAME_RANGE_INFO (orig_base)
+ && get_range_info (orig_base, &min, &max) == VR_RANGE
+ && wi::gts_p (wide_int (high), max))
+   base = wide_int_to_tree (unsigned_type, max);
+  else if (TREE_CODE (base) != INTEGER_CST)
base = fold_convert (unsigned_type, high);
   delta = fold_build2 (MINUS_EXPR, unsigned_type, base, extreme);
   step = fold_build1 (NEGATE_EXPR, unsigned_type, step);
 }
   else
 {
+  wide_int min, max;
   extreme = fold_convert (unsigned_type, high);
-  if (TREE_CODE (base) != INTEGER_CST)
+  if (TREE_CODE (orig_base) == SSA_NAME
+ && TREE_CODE (low) == INTEGER_CST
+ && !POINTER_TYPE_P (TREE_TYPE (orig_base))
+ && SSA_NAME_RANGE_INFO (orig_base)
+ && get_range_info (orig_base, &min, &max) == VR_RANGE
+ && wi::gts_p (min, wide_int (low)))
+   base = wide_int_to_tree (unsigned_type, min);
+  else if (TREE_CODE (base) != INTEGER_CST)
base = fold_convert (unsigned_type, low);
   delta = fold_build2 (MINUS_EXPR, unsigned_type, extreme, base);
 }


[PATCH, testsuite] Fix PR64796: bswap64 effective target should not cache its result

2015-01-27 Thread Thomas Preud'homme
As explained in PR64796, the code for the bswap64 effective target computes the 
answer once and then caches it. However, the result depends on the flags passed 
to the compiler, and with --target_board it is possible to test several sets of 
flags. Besides, this code assumes that only lp64 targets can do a 64-bit bswap, 
when all 32-bit targets can as well by virtue of expand_doubleword_bswap () 
being called in expand_unop (). This patch solves both problems by removing the 
caching of the result and changing the condition to include all targets with a 
wordsize of 32 bits or more.
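
To illustrate why a 32-bit target can still provide a 64-bit byte swap, here is
a minimal sketch in plain C.  It is not the expand_doubleword_bswap ()
implementation; the helper names are hypothetical and only show the idea of
swapping each 32-bit half and exchanging the halves.

```c
#include <stdint.h>

/* Hypothetical helper: reverse the four bytes of a 32-bit word.  */
static uint32_t
swap32 (uint32_t x)
{
  return (x >> 24) | ((x >> 8) & 0x0000ff00u)
         | ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Hypothetical helper: build a 64-bit byte swap from two 32-bit swaps.
   The swapped low word becomes the high word and vice versa.  */
static uint64_t
bswap64_from_words (uint64_t x)
{
  uint32_t lo = (uint32_t) x;
  uint32_t hi = (uint32_t) (x >> 32);
  return ((uint64_t) swap32 (lo) << 32) | swap32 (hi);
}
```

For example, bswap64_from_words (0x0102030405060708ULL) yields
0x0807060504030201ULL using only word-sized operations.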

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2015-01-27  Thomas Preud'homme  

PR testsuite/64796
* lib/target-supports.exp (check_effective_target_bswap64): Do not
cache result in a global variable.  Include all 32-bit targets for
bswap64 tests.


diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index e51d07d..9aaf229 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -5039,18 +5039,11 @@ proc check_effective_target_bswap32 { } {
 proc check_effective_target_bswap64 { } {
 global et_bswap64_saved
 
-if [info exists et_bswap64_saved] {
-verbose "check_effective_target_bswap64: using cached result" 2
-} else {
-   set et_bswap64_saved 0
-   if { [is-effective-target bswap]
-&& [is-effective-target lp64] } {
-  set et_bswap64_saved 1
-   }
+# expand_unop can expand 64-bit byte swap on 32-bit targets
+if { [is-effective-target bswap] && [is-effective-target int32plus] } {
+   return 1
 }
-
-verbose "check_effective_target_bswap64: returning $et_bswap64_saved" 2
-return $et_bswap64_saved
+return 0
 }
 
 # Return 1 if the target supports atomic operations on "int" and "long".


Testing done:
* arm-none-eabi-gcc cross-compiler was built and x86_64 GCC native compiler was 
bootstrapped. Both show no regressions when running the testsuite, and the 
optimize-bswapdi-* tests are run for arm-none-eabi-gcc.
* The optimize-bswapdi-* tests are also run when passing 
--target_board=unix/-m32 in RUNTESTFLAGS.


Is this ok for trunk?

Best regards,

Thomas





Re: [PATCH] Fix PR64277

2015-01-27 Thread Jakub Jelinek
On Tue, Jan 27, 2015 at 10:25:48AM +0100, Richard Biener wrote:
> 
> This disables array-bound warnings from VRP2 as discussed.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu - ok for trunk?

So nothing in the testsuite needed to change?  Nice.

Ok for trunk.

> I'll search for duplicates and add a few testcases.

Thanks.

> 2015-01-27  Richard Biener  
> 
>   PR tree-optimization/64277
>   * tree-vrp.c (vrp_finalize): Emit array-bound warnings only
>   from the first VRP pass.
> 
> Index: gcc/tree-vrp.c
> ===
> --- gcc/tree-vrp.c(revision 220107)
> +++ gcc/tree-vrp.c(working copy)
> @@ -10229,7 +10197,7 @@ vrp_finalize (void)
>substitute_and_fold (op_with_constant_singleton_value_range,
>  vrp_fold_stmt, false);
>  
> -  if (warn_array_bounds)
> +  if (warn_array_bounds && first_pass_instance)
>  check_all_array_refs ();
>  
>/* We must identify jump threading opportunities before we release

Jakub


Re: [Patch, Fortran, OOP] PR 64230: [4.9/5 Regression] Invalid memory reference in a compiler-generated finalizer for allocatable component

2015-01-27 Thread Jakub Jelinek
On Tue, Jan 27, 2015 at 10:24:47AM +0100, Andreas Schwab wrote:
> Janus Weil  writes:
> 
> > 2015-01-19  Janus Weil  
> >
> > PR fortran/64230
> > * gfortran.dg/class_allocate_18.f90: Extended.
> 
> FAIL: gfortran.dg/class_allocate_18.f90   -O0  (test for excess errors)
> Excess errors:
> /usr/ia64-suse-linux/bin/ld: cannot find -lubsan

Yeah, if you want to add ubsan tests, you need to add gfortran.dg/ubsan/
directory and hack up ubsan.exp in there, from say gcc.dg/ubsan/ubsan.exp
and gfortran.dg/dg.exp.

Jakub


Re: [Patch, Fortran, OOP] PR 64230: [4.9/5 Regression] Invalid memory reference in a compiler-generated finalizer for allocatable component

2015-01-27 Thread Andreas Schwab
Janus Weil  writes:

> 2015-01-19  Janus Weil  
>
> PR fortran/64230
> * gfortran.dg/class_allocate_18.f90: Extended.

FAIL: gfortran.dg/class_allocate_18.f90   -O0  (test for excess errors)
Excess errors:
/usr/ia64-suse-linux/bin/ld: cannot find -lubsan

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


[PATCH] Fix PR64277

2015-01-27 Thread Richard Biener

This disables array-bound warnings from VRP2 as discussed.

Bootstrapped and tested on x86_64-unknown-linux-gnu - ok for trunk?

I'll search for duplicates and add a few testcases.

Thanks,
Richard.

2015-01-27  Richard Biener  

PR tree-optimization/64277
* tree-vrp.c (vrp_finalize): Emit array-bound warnings only
from the first VRP pass.

Index: gcc/tree-vrp.c
===
--- gcc/tree-vrp.c  (revision 220107)
+++ gcc/tree-vrp.c  (working copy)
@@ -10229,7 +10197,7 @@ vrp_finalize (void)
   substitute_and_fold (op_with_constant_singleton_value_range,
   vrp_fold_stmt, false);
 
-  if (warn_array_bounds)
+  if (warn_array_bounds && first_pass_instance)
 check_all_array_refs ();
 
   /* We must identify jump threading opportunities before we release


Re: [PATCH] Update BBs in cleanup_barriers pass (PR rtl-optimization/61058)

2015-01-27 Thread Eric Botcazou
> Because reorder_insns doesn't handle the case of moving a barrier into the
> middle of a basic block.

Right, I should have read the audit trail. :-)  The patch is OK then, but add 
a ??? note at the end of the comment saying that the proper thing to do here 
is probably not to run cleanup_barriers for this back-end.

-- 
Eric Botcazou


Re: [PATCH] wide-int division fix (PR tree-optimization/64807)

2015-01-27 Thread Richard Biener
On Mon, 26 Jan 2015, Jakub Jelinek wrote:

> Hi!
> 
> On the following testcase we generate wrong code, because
> apparently divmod_internal_2 relies on 0 being the topmost
> element (at b_dividend[m]):
>algorithm.  M is the number of significant elements of U however
>there needs to be at least one extra element of B_DIVIDEND
>allocated, N is the number of elements of B_DIVISOR.  */
> The comment talks just about allocation, but from the code
> it seems it really relies on it being 0.
> There is space for it:
>   unsigned HOST_HALF_WIDE_INT
> b_dividend[(4 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_HALF_WIDE_INT) + 1];
>   unsigned HOST_HALF_WIDE_INT
> b_divisor[4 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_HALF_WIDE_INT];
> (the + 1), and usually there already is a zero in there:
>   m = dividend_blocks_needed;
>   while (m > 1 && b_dividend[m - 1] == 0)
> m--;
> so the only problematic case is if m isn't decreased.
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
> ok for trunk?

Ok.

Thanks,
Richard.
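
The invariant described in the quoted text can be sketched in standalone C.
This is only an illustration, not wide-int.cc: the helper name and the digit
type are assumptions.  It shows that the slot just above the significant
digits must be cleared explicitly, because the trimming loop leaves a zero
there only when it actually decreases m.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not GCC code: DIGITS holds BLOCKS_NEEDED digits
   (least significant first) followed by one spare slot.  Returns the
   number of significant digits and guarantees that DIGITS[result] is 0.  */
static size_t
trim_dividend (unsigned short *digits, size_t blocks_needed)
{
  size_t m = blocks_needed;

  assert (blocks_needed >= 1);
  /* Clear the spare slot; without this it is garbage when no digit
     is trimmed below.  */
  digits[m] = 0;
  while (m > 1 && digits[m - 1] == 0)
    m--;
  return m;
}
```

When every digit is non-zero, as for the all-ones dividend in the testcase,
the loop does not run and the explicit store is the only thing that zeroes
the top element.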

> 2015-01-26  Jakub Jelinek  
> 
>   PR tree-optimization/64807
>   * wide-int.cc (wi::divmod_internal): Clear
>   b_dividend[dividend_blocks_needed].
> 
>   * gcc.dg/pr64807.c: New test.
> 
> --- gcc/wide-int.cc.jj2015-01-09 21:59:38.0 +0100
> +++ gcc/wide-int.cc   2015-01-26 19:21:56.114316481 +0100
> @@ -1819,6 +1819,7 @@ wi::divmod_internal (HOST_WIDE_INT *quot
>divisor_blocks_needed, divisor_prec, sgn);
>  
>m = dividend_blocks_needed;
> +  b_dividend[m] = 0;
>while (m > 1 && b_dividend[m - 1] == 0)
>  m--;
>  
> --- gcc/testsuite/gcc.dg/pr64807.c.jj 2015-01-26 19:24:13.612943033 +0100
> +++ gcc/testsuite/gcc.dg/pr64807.c2015-01-26 19:32:34.502237566 +0100
> @@ -0,0 +1,19 @@
> +/* PR tree-optimization/64807 */
> +/* { dg-do run { target int128 } } */
> +/* { dg-options "-O2" } */
> +
> +__uint128_t
> +foo (void)
> +{
> +  __uint128_t a = -1;
> +  __uint128_t b = -1;
> +  return a / b;
> +}
> +
> +int
> +main ()
> +{
> +  if (foo () != 1)
> +__builtin_abort ();
> +  return 0;
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)


Re: [PATCH] Fix ICE due to invalid thunk (PR ipa/64776)

2015-01-27 Thread Richard Biener
On Mon, 26 Jan 2015, Jakub Jelinek wrote:

> Hi!
> 
> On x86_64-darwin, we ICE on one of the pr64307.c testcase, because
> expand_thunk doesn't load non-gimple_val arguments into registers
> for the first argument, only for all the other ones.
> Supposedly normally thunks were meant to have this argument as pointer first
> and thus it wasn't an issue, but in the -O0 -fipa-icf case a thunk is
> created even for a non-method.
> 
> This patch fixes it by special-casing the first argument only if
> this_adjusting - then we know it is a pointer that is being adjusted.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

> 2015-01-26  Jakub Jelinek  
> 
>   PR ipa/64776
>   * cgraphunit.c (cgraph_node::expand_thunk): If not this_adjusting,
>   handle the first argument in the same loop as all the other arguments.
> 
> --- gcc/cgraphunit.c.jj   2015-01-15 14:05:05.0 +0100
> +++ gcc/cgraphunit.c  2015-01-26 17:26:18.629818527 +0100
> @@ -1610,14 +1610,18 @@ cgraph_node::expand_thunk (bool output_a
>for (arg = a; arg; arg = DECL_CHAIN (arg))
>  nargs++;
>auto_vec vargs (nargs);
> +  i = 0;
> +  arg = a;
>if (this_adjusting)
> -vargs.quick_push (thunk_adjust (&bsi, a, 1, fixed_offset,
> - virtual_offset));
> -  else if (nargs)
> -vargs.quick_push (a);
> + {
> +   vargs.quick_push (thunk_adjust (&bsi, a, 1, fixed_offset,
> +   virtual_offset));
> +   arg = DECL_CHAIN (a);
> +   i = 1;
> + }
>  
>if (nargs)
> -for (i = 1, arg = DECL_CHAIN (a); i < nargs; i++, arg = DECL_CHAIN (arg))
> + for (; i < nargs; i++, arg = DECL_CHAIN (arg))
> {
>   tree tmp = arg;
>   if (!is_gimple_val (arg))
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)


Re: [PATCH] Fix ICE during ipa dumping (PR ipa/64730)

2015-01-27 Thread Richard Biener
On Mon, 26 Jan 2015, Jakub Jelinek wrote:

> Hi!
> 
> On various targets, %s in fprintf can't handle NULL arguments,
> and even when edge->call_stmt is non-NULL, it still might have
> UNKNOWN_LOCATION or BUILTINS_LOCATION, which have NULL filename.
> In this particular case it is a fnsplit created call.
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
> ok for trunk?

Ok.

Thanks,
Richard.
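
The underlying hazard, passing NULL to %s, can be shown with a small
standalone sketch.  It is not the ipa-inline.c code: the struct and function
names are hypothetical, and it tests the pointer directly instead of comparing
the location against BUILTINS_LOCATION as the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a call statement's location; FILE may be
   NULL for unknown or builtin locations.  */
struct call_site
{
  const char *file;
  int line;
};

/* Never hand NULL to %s: fall back to a fixed string instead.  */
static const char *
file_or_unknown (const struct call_site *site)
{
  return (site != NULL && site->file != NULL) ? site->file : "unknown";
}

/* Format "file:line" into BUF and return the length snprintf reports.  */
static int
describe_call_site (char *buf, size_t size, const struct call_site *site)
{
  assert (buf != NULL && size > 0);
  return snprintf (buf, size, "%s:%d", file_or_unknown (site),
                   site != NULL ? site->line : -1);
}
```

A call site with a NULL file name is then printed as "unknown:0" rather than
invoking undefined behaviour in the C library.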

> 2015-01-26  Jakub Jelinek  
> 
>   PR ipa/64730
>   * ipa-inline.c (inline_small_functions): Print "unknown" even
>   if edge->call_stmt is non-NULL, but has builtins or unknown
>   location.
> 
> --- gcc/ipa-inline.c.jj   2015-01-22 21:45:18.0 +0100
> +++ gcc/ipa-inline.c  2015-01-26 15:41:57.193640527 +0100
> @@ -1822,6 +1822,9 @@ inline_small_functions (void)
>  " Estimated badness is %f, frequency %.2f.\n",
>  edge->caller->name (), edge->caller->order,
>  edge->call_stmt
> +&& (LOCATION_LOCUS (gimple_location ((const_gimple)
> + edge->call_stmt))
> +> BUILTINS_LOCATION)
>  ? gimple_filename ((const_gimple) edge->call_stmt)
>  : "unknown",
>  edge->call_stmt
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)


Re: [PATCH] Update BBs in cleanup_barriers pass (PR rtl-optimization/61058)

2015-01-27 Thread Jakub Jelinek
On Tue, Jan 27, 2015 at 09:25:32AM +0100, Eric Botcazou wrote:
> > Yes, they do, that is why it crashed during final.
> 
> OK.  Why wouldn't it work to call reorder_insns instead of reorder_insns_nobb?

Because reorder_insns doesn't handle the case of moving a barrier into the
middle of a basic block.

  if (!BARRIER_P (from)
  && (bb2 = BLOCK_FOR_INSN (from)))
{
  if (BB_END (bb2) == to)
BB_END (bb2) = prev;
  df_set_bb_dirty (bb2);
}

  if (BB_END (bb) == after)
BB_END (bb) = to;

  for (x = from; x != NEXT_INSN (to); x = NEXT_INSN (x))
if (!BARRIER_P (x))
  df_insn_change_bb (x, bb);

from == to is a BARRIER in this case, BB_END (bb) != after (BB_END
is actually PREV_INSN (from)), so this doesn't do anything at all.

While what we need is:

1) set BB_END to after
2) clear BLOCK_FOR_INSN on the notes after AFTER (after addition of
   barrier after FROM == TO) until former PREV_INSN (FROM) (inclusive)

Jakub


[PATCH] S/390: -mhotpatch v2

2015-01-27 Thread Dominik Vogt
The attached patch updates the -mhotpatch option and the hotpatch
function attribute with (incompatible) new semantics.  Please
refer to the commit message in the patch for details.

--

2015-01-27  Dominik Vogt  

	* doc/extend.texi: S/390: Update documentation of hotpatch attribute.
	* doc/invoke.texi (-mhotpatch): S/390: Update documentation of
	-mhotpatch= option.
	* config/s390/s390.opt (mhotpatch): S/390: Remove -mhotpatch and
	-mno-hotpatch options.  Change syntax of -mhotpatch= option.
	* config/s390/s390.c (s390_hotpatch_trampoline_halfwords_default):
	Renamed.
	(s390_hotpatch_trampoline_halfwords_max): Renamed.
	(s390_hotpatch_hw_max): New name.
	(s390_hotpatch_trampoline_halfwords): Renamed.
	(s390_hotpatch_hw_before_label): New name.
	(get_hotpatch_attribute): Removed.
	(s390_hotpatch_hw_after_label): New name.
	(s390_handle_hotpatch_attribute): Add second parameter to hotpatch
	attribute.
	(s390_attribute_table): Ditto.
	(s390_function_num_hotpatch_trampoline_halfwords): Renamed.
	(s390_function_num_hotpatch_hw): New name.
	Remove special handling of inline functions and hotpatching.
	Return number of nops before and after the function label.
	(s390_can_inline_p): Removed.
	(s390_asm_output_function_label): Emit a configurable number of nops
	after the function label.
	(s390_option_override): Update -mhotpatch= syntax and remove -mhotpatch.
	(TARGET_CAN_INLINE_P): Removed.
	(TARGET_FUNCTION_ATTRIBUTE_INLINABLE_P): New.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
From 9123265bb1d6e325f4edc99a2d1f33a862b3ba53 Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Mon, 1 Dec 2014 15:59:42 +0100
Subject: [PATCH] S/390: -mhotpatch v2

Update the -mhotpatch option and the hotpatch function attribute to take
exactly two arguments.  The first is the number of halfwords to be filled with
two-byte-nops before the function label.  The second is the number of halfwords
to be filled with nops after the label (the biggest available nop instructions
are used).

Further changes are:

* Artificial functions and the main function are also patched.
* Functions selected for hotpatching can still be inlined.  It's the
  responsibility of the user to take care of this when patching, or to
  explicitly disable inlining.
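
As a rough illustration of the two-argument form described above, the
following sketch marks a function for hotpatching and also disables
inlining for it.  The HOTPATCH macro and the function name are
hypothetical; the target guard is an assumption added so the snippet
still compiles on targets that do not know the attribute.

```c
/* Assumption: the attribute is only understood by S/390 compilers that
   include this patch, so it is hidden behind a hypothetical macro.  */
#if defined (__s390__) || defined (__s390x__)
# define HOTPATCH(before, after) __attribute__ ((hotpatch (before, after)))
#else
# define HOTPATCH(before, after)
#endif

/* Request one halfword of two-byte nops before the function label and
   two halfwords of nops after it.  noinline keeps calls from being
   inlined, since hotpatched functions can otherwise still be inlined.  */
HOTPATCH (1, 2)
__attribute__ ((noinline))
int
patchable_add (int a, int b)
{
  return a + b;
}
```

The command-line counterpart would presumably be -mhotpatch=1,2, which
applies the same two counts to functions that do not carry the
attribute themselves.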
---
 gcc/config/s390/s390.c | 227 -
 gcc/config/s390/s390.opt   |  12 +-
 gcc/doc/extend.texi|  17 +-
 gcc/doc/invoke.texi|  16 +-
 gcc/testsuite/gcc.target/s390/hotpatch-1.c |  14 +-
 gcc/testsuite/gcc.target/s390/hotpatch-10.c|  15 +-
 gcc/testsuite/gcc.target/s390/hotpatch-11.c|  12 +-
 gcc/testsuite/gcc.target/s390/hotpatch-12.c|  14 +-
 gcc/testsuite/gcc.target/s390/hotpatch-13.c|  17 ++
 gcc/testsuite/gcc.target/s390/hotpatch-14.c|  17 ++
 gcc/testsuite/gcc.target/s390/hotpatch-15.c|  17 ++
 gcc/testsuite/gcc.target/s390/hotpatch-16.c|  17 ++
 gcc/testsuite/gcc.target/s390/hotpatch-17.c|  17 ++
 gcc/testsuite/gcc.target/s390/hotpatch-18.c|  16 ++
 gcc/testsuite/gcc.target/s390/hotpatch-19.c|  23 +++
 gcc/testsuite/gcc.target/s390/hotpatch-2.c |  12 +-
 gcc/testsuite/gcc.target/s390/hotpatch-20.c|  20 ++
 gcc/testsuite/gcc.target/s390/hotpatch-3.c |  10 +-
 gcc/testsuite/gcc.target/s390/hotpatch-4.c |  18 +-
 gcc/testsuite/gcc.target/s390/hotpatch-5.c |  15 +-
 gcc/testsuite/gcc.target/s390/hotpatch-6.c |  13 +-
 gcc/testsuite/gcc.target/s390/hotpatch-7.c |  13 +-
 gcc/testsuite/gcc.target/s390/hotpatch-8.c |  24 +--
 gcc/testsuite/gcc.target/s390/hotpatch-9.c |  15 +-
 gcc/testsuite/gcc.target/s390/hotpatch-compile-1.c |  24 +--
 .../gcc.target/s390/hotpatch-compile-10.c  |  12 ++
 .../gcc.target/s390/hotpatch-compile-11.c  |  12 ++
 .../gcc.target/s390/hotpatch-compile-12.c  |  12 ++
 .../gcc.target/s390/hotpatch-compile-13.c  |  29 +++
 .../gcc.target/s390/hotpatch-compile-14.c  |  11 +
 .../gcc.target/s390/hotpatch-compile-15.c  |  43 
 .../gcc.target/s390/hotpatch-compile-16.c  |  24 +++
 gcc/testsuite/gcc.target/s390/hotpatch-compile-2.c |  24 +--
 gcc/testsuite/gcc.target/s390/hotpatch-compile-3.c |  24 +--
 gcc/testsuite/gcc.target/s390/hotpatch-compile-4.c |   2 +-
 gcc/testsuite/gcc.target/s390/hotpatch-compile-5.c |  23 +--
 gcc/testsuite/gcc.target/s390/hotpatch-compile-6.c |   4 +-
 gcc/testsuite/gcc.target/s390/hotpatch-compile-7.c |  66 +-
 gcc/testsuite/gcc.target/s390/hotpatch-compile-8.c |  23 +--
 gcc/testsuite/gcc.target/s390/hotpatch-compile-9.c |  12 ++
 40 files changed, 532 insertions(+), 404 deletions(-)
 create mode 100644 gcc/testsuite/gcc

Re: [PATCH] Update BBs in cleanup_barriers pass (PR rtl-optimization/61058)

2015-01-27 Thread Eric Botcazou
> Yes, they do, that is why it crashed during final.

OK.  Why wouldn't it work to call reorder_insns instead of reorder_insns_nobb?

-- 
Eric Botcazou


Re: [Patch, Fortran] PR63861 - fix OpenMP/ACC's gfc_has_alloc_comps

2015-01-27 Thread Jakub Jelinek
On Tue, Jan 27, 2015 at 08:27:07AM +0100, Tobias Burnus wrote:
> 2015-01-27  Tobias Burnus  
> 
>   PR fortran/63861
> gcc/fortran/
>   * trans-openmp.c (gfc_has_alloc_comps, gfc_trans_omp_clauses):
>   Fix handling for scalar coarrays.
>   * trans-types.c (gfc_get_element_type): Add comment.
> 
> gcc/testsuite/
>   * gfortran.dg/goacc/coarray_2.f90: New.

Ok, thanks.

Jakub


Re: [PATCH] Fix for PR64741 (UBSan/ASan integration)

2015-01-27 Thread Jakub Jelinek
On Tue, Jan 27, 2015 at 09:19:20AM +0300, Yury Gribov wrote:
> As described in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64741 , ASan
> may currently report false positives for UBSan internal variables due to
> their incomplete type information. This patch fixes this.
> 
> Bootstrapped and regtested on Linux x64. Ok to commit?
> 
> -Y

> commit cf083510ece7b7bde1ab5a41e293b5a6a5bb4550
> Author: Yury Gribov 
> Date:   Mon Jan 26 10:19:03 2015 +0300
> 
> 2015-01-26  Yury Gribov  
> 
>   PR ubsan/64741
> 
>   * ubsan.c (ubsan_type_descriptor): Update type size.

Please drop the extra blank line between the PR and * ubsan.c lines.

> --- a/gcc/ubsan.c
> +++ b/gcc/ubsan.c
> @@ -504,6 +504,14 @@ ubsan_type_descriptor (tree type, enum ubsan_print_style pstyle)
>    tinfo = get_ubsan_type_info_for_type (type);
>  
>    /* Create a new VAR_DECL of type descriptor.  */
> +  const char *tmp = pp_formatted_text (&pretty_name);
> +  size_t len = strlen (tmp);
> +  tree str = build_string (len + 1, tmp);
> +  TREE_TYPE (str) = build_array_type (char_type_node,
> +                                      build_index_type (size_int (len)));
> +  TREE_READONLY (str) = 1;
> +  TREE_STATIC (str) = 1;

While touching this, could you please rewrite it as:
  const char *tmp = pp_formatted_text (&pretty_name);
  size_t len = strlen (tmp) + 1;
  tree str = build_string (len, tmp);
  TREE_TYPE (str) = build_array_type_nelts (char_type_node, len);
  TREE_READONLY (str) = 1;
  TREE_STATIC (str) = 1;
?  Or, if you want, do it as a follow-up.  There is another occurrence
of this in ubsan_source_location.
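
A standalone sketch of why the two spellings should give the same
array size, assuming an index type built from LEN covers 0..LEN
inclusive while the nelts variant takes an element count directly.
The toy_* helpers are hypothetical and only model the arithmetic; they
do not call into GCC.

```c
#include <stddef.h>
#include <string.h>

/* Element count of an array whose index type runs from 0 to MAX_INDEX
   inclusive, which is how the original hunk sizes the array.  */
size_t
toy_nelts_from_max_index (size_t max_index)
{
  return max_index + 1;
}

/* Original hunk: len = strlen (tmp), index type built from len.  */
size_t
toy_nelts_original (const char *tmp)
{
  size_t len = strlen (tmp);
  return toy_nelts_from_max_index (len);
}

/* Suggested form: len = strlen (tmp) + 1, used directly as the
   element count.  */
size_t
toy_nelts_suggested (const char *tmp)
{
  size_t len = strlen (tmp) + 1;
  return len;
}
```

Both helpers count the terminating NUL, so for a name such as "int"
each one yields 4 elements.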

Ok for trunk with or without this change.

Jakub