Re: [PATCH] Make strlen range computations more conservative

2018-08-08 Thread Richard Biener
On August 9, 2018 7:26:19 AM GMT+02:00, Jeff Law  wrote:
>On 07/24/2018 05:18 PM, Bernd Edlinger wrote:
>> On 07/24/18 23:46, Jeff Law wrote:
>>> On 07/24/2018 01:59 AM, Bernd Edlinger wrote:
 Hi!

 This patch makes strlen range computations more conservative.

 Firstly if there is a visible type cast from type A to B before
>passing
 then value to strlen, don't expect the type layout of B to restrict
>the
 possible return value range of strlen.
>>> Why do you think this is the right thing to do?  ie, is there
>language
>>> in the standards that makes you think the code as it stands today is
>>> incorrect from a conformance standpoint?  Is there a significant
>body of
>>> code that is affected in an adverse way by the current code?  If so,
>>> what code?
>>>
>>>
>> 
>> I think if you have an object, of an effective type A say char[100],
>then
>> you can cast the address of A to B, say typedef char (*B)[2] for
>instance
>> and then to const char *, say for use in strlen.  I may be wrong, but
>I think
>> that we should at least try not to pick up char[2] from B, but
>instead
>> use A for strlen ranges, or leave this range open.  Currently the
>range
>> info for strlen is [0..1] in this case, even if we see the type cast
>> in the generic tree.
>Coming back to this... I'd like to hope we can depend on the type of
>the
>strlen argument.  Obviously if it's a char *, we know nothing.  But if
>it's an ARRAY_TYPE it'd be advantageous if we could use the array
>bounds
>to bound the length.

 But the FE made it char * and only because pointer type conversions are 
useless we may see sth else. So we cannot use the type of the argument. 

>
>*But* we have code which will turn pointer access into array indexing.
>tree-ssa-forwprop.c can do that, there may be others.  So if we
>originally had a char * argument to strlen, but forwprop changed it
>into
>a char[N] type, then have we just broken things?  I'd totally forgotten
>about this behavior from forwprop.  PR 46393 contains sample code where
>this happens.

If we really still have this it must be very constrained. Because we used to 
have sorts of wrong code issues with this and data dependence analysis. 

>I also thought we had code to recover array indexing from pointer
>arithmetic in the C/C++ front-end, but I can't seem to find it tonight.
>But it would raise similar concerns.

I've removed that. What I did at some point is avoid decaying too much array to 
pointer conversions in the FEs to preserve array refs instead of trying to 
reconstruct them late. 

>
>
>
>> 
>> One other example I have found in one of the test cases:
>> 
>> char c;
>> 
>> if (strlen(&c) != 0) abort();
>> 
>> this is now completely elided, but why?  Is there a code base where
>> that is used?  I doubt, but why do we care to eliminate something
>> stupid like that?  If we would emit a warning for that I'm fine with
>it,
>> But if we silently remove code like that I don't think that it
>> will improve anything.  So I ask, where is the code base which
>> gets an improvement from that optimization?
>I think it falls out naturally from trying to get accurate
>computations.
>I don't think either Martin or I really care about optimizing strlen in
>this case.  In fact it's so clearly erroneous that it ought to generate
>a diagnostic on its own.  Knowing Martin it was probably included in
>the
>tests for completeness.
>
>However, there is a fair amount of code that passes addresses of
>characters to functions that want char * string arguments and those
>functions promptly walk off the end of the single character,
>unterminated string.  We actually just saw one of these in glibc that
>was detected by Martin's recent work.  So it's definitely useful to
>track how these kinds of values get used.
>
>> 
>>> Ultimately we want highly accurate string lengths to help improve
>the
>>> quality of the warnings we generate for potentially dangerous code.
>>> These changes seem to take us in the opposite direction.
>>>
>> 
>> No, I don't think so, we have full control on the direction, when
>> I do what Richi requested on his response, we will have one function
>> where the string length estimation is based upon, instead of several
>> open coded tree walks.
>I don't think anyone objects to consolidating length computation.  What
>I think we're hashing through is how does the object model in GIMPLE
>affect the assumptions that can be correctly made about lengths of
>objects.  When I ACK'd Martin's patches I'd totally forgotten about
>these issues in GIMPLE and the impact they'd have if they were used in
>areas that affect code generation.  That is absolutely and totally my
>mistake.
>
>I suspect that we're ultimately going to have to refine the design a
>bit
>so that the lengths that impact code generation/optimization are
>distinct from those that are used for warnings.  I'm not keen on this
>concept, but I believe it's better than just reverting all the work on
>issuing diagnostics f

Re: [RFC,PATCH] Introduce -msdata=explicit for powerpc

2018-08-08 Thread Alexandre Oliva
On Aug  8, 2018, Segher Boessenkool  wrote:

> Then you get sdata2 used (via srodata in generic code), and it is accessed
> via GPR2 (via the sda21 reloc and linker magic).  It is hard to trace down :-)

Aah, it didn't occur to me that the r2 uses could be introduced by the
linker.  I was cutting corners and looking at the asm output, after
building only gcc, without binutils.  Thanks for the explanation.  Mind
if I add a comment about that next to... hmm...  how about the
SMALL_DATA_* macros?

for  gcc/ChangeLog

* config/rs6000/rs6000.c (SMALL_DATA_RELOC, SMALL_DATA_REG): Add
a comment about how uses of r2 for .sdata2 come about.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index ddc61bdaffe7..bbc564de41ae 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -21240,6 +21240,9 @@ rs6000_output_function_entry (FILE *file, const char 
*fname)
 /* Print an operand.  Recognize special options, documented below.  */
 
 #if TARGET_ELF
+/* Access to .sdata2 through r2 (see -msdata=eabi in invoke.texi) is
+   only introduced by the linker, when applying the sda21
+   relocation.  */
 #define SMALL_DATA_RELOC ((rs6000_sdata == SDATA_EABI) ? "sda21" : "sdarel")
 #define SMALL_DATA_REG ((rs6000_sdata == SDATA_EABI) ? 0 : 13)
 #else


-- 
Alexandre Oliva, freedom fighter   https://FSFLA.org/blogs/lxo
Be the change, be Free! FSF Latin America board member
GNU Toolchain EngineerFree Software Evangelist


Re: [RFC][PATCH][mid-end] Optimize immediate choice in comparisons.

2018-08-08 Thread Jeff Law
On 08/07/2018 02:11 PM, Richard Sandiford wrote:
> Hi Vlad,
> 
> Thanks for the patch.
> 
> Vlad Lazar  writes:
>> Hi.
>>
>> This patch optimises the choice of immediates in integer comparisons. Integer
>> comparisons allow for different choices (e.g. a > b is equivalent to a >= 
>> b+1)
>> and there are cases where an incremented/decremented immediate can be loaded 
>> into a
>> register in fewer instructions. The cases are as follows:
>>
>> i)   a >  b or a >= b + 1
>> ii)  a <= b or a <  b + 1
>> iii) a >= b or a >  b - 1
>> iv)  a <  b or a <= b - 1
>>
>> For each comparison we check whether the equivalent can be performed in less 
>> instructions.
>> This is done on a statement by statement basis, right before the GIMPLE 
>> statement is expanded
>> to RTL. Therefore, it requires a correct implementation of the 
>> TARGET_INSN_COST hook.
>> The change is general and it applies to any integer comparison, regardless 
>> of types or location.
>>
>> For example, on AArch64 for the following code:
>>
>> int foo (int x)
>> {
>>return x > 0xfefe;
>> }
>>
>> it generates:
>>
>> mov  w1, -16777217
>> cmp  w0, w1
>> cset w0, cs
>>
>> instead of:
>>
>> mov  w1, 65534
>> movk w1, 0xfeff, lsl 16
>> cmp  w0, w1
>> cset w0, hi
>>
>> Bootstrapped and regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu 
>> and there are no regressions.
> 
> Looks like a useful feature.  I'm playing devil's advocate to some
> extent here, but did you consider instead doing this during the
> expansion functions themselves?  In some ways that feels more
> natural because we're already operating on rtxes at that point.
> It also has the advantage that it would trap comparisons that didn't
> exist at the gimple level, such as when computing the carry for a
> doubleword add/sub.
I've got no strong opinions on doing it in cfgexpand vs the expansion
functions themselves.  I'm happy to have you setting overall direction
here Richard.

I do worry about the amount of RTL we generate and throw away during
cost computation.  Though it's just for comparisons, so it may not be
terrible.

I wouldn't be surprised if ports aren't particularly accurate in their
costing computations for this kind of use -- but that's nothing new.  We
run into it every time we use rtx costing in a new place.  I'm
comfortable having targets fault in improvements for this kind of use.

Jeff


Re: [PATCH] Make strlen range computations more conservative

2018-08-08 Thread Jeff Law
On 08/02/2018 09:42 AM, Martin Sebor wrote:

> The warning bits are definitely not okay by me.  The purpose
> of the warnings (-W{format,sprintf}-{overflow,truncation} is
> to detect buffer overflows.  When a warning doesn't have access
> to string length information for dynamically created strings
> (like the strlen pass does) it uses array sizes as a proxy.
> This is useful both to detect possible buffer overflows and
> to prevent false positives for overflows that cannot happen
> in correctly written programs.
So how much of this falling-back to array sizes as a proxy would become
unnecessary if sprintf had access to the strlen pass as an analysis module?

As you know we've been kicking that around and from my investigations
that doesn't really look hard to do.  Encapsulate the data structures in
a class, break up the statement handling into analysis and optimization
and we should be good to go.  I'm hoping to start prototyping this week.

If we think that has a reasonable chance to eliminate the array-size
fallback, then that seems like the most promising path forward.


> 
> By changing the logic to not consider the bounds of the array
> the warnkngs will become prone to many false positives as is
> evident from your changes to the tests (you had to explicitly
> enable the new option).
ISTM that a reasonable goal to shoot for is to be able to run the
existing warning tests without the array-size fallback once we've got a
string length analysis module.

Jeff


Re: [PATCH] Make strlen range computations more conservative

2018-08-08 Thread Jeff Law
On 07/24/2018 05:18 PM, Bernd Edlinger wrote:
> On 07/24/18 23:46, Jeff Law wrote:
>> On 07/24/2018 01:59 AM, Bernd Edlinger wrote:
>>> Hi!
>>>
>>> This patch makes strlen range computations more conservative.
>>>
>>> Firstly if there is a visible type cast from type A to B before passing
>>> then value to strlen, don't expect the type layout of B to restrict the
>>> possible return value range of strlen.
>> Why do you think this is the right thing to do?  ie, is there language
>> in the standards that makes you think the code as it stands today is
>> incorrect from a conformance standpoint?  Is there a significant body of
>> code that is affected in an adverse way by the current code?  If so,
>> what code?
>>
>>
> 
> I think if you have an object, of an effective type A say char[100], then
> you can cast the address of A to B, say typedef char (*B)[2] for instance
> and then to const char *, say for use in strlen.  I may be wrong, but I think
> that we should at least try not to pick up char[2] from B, but instead
> use A for strlen ranges, or leave this range open.  Currently the range
> info for strlen is [0..1] in this case, even if we see the type cast
> in the generic tree.
Coming back to this... I'd like to hope we can depend on the type of the
strlen argument.  Obviously if it's a char *, we know nothing.  But if
it's an ARRAY_TYPE it'd be advantageous if we could use the array bounds
to bound the length.

*But* we have code which will turn pointer access into array indexing.
tree-ssa-forwprop.c can do that, there may be others.  So if we
originally had a char * argument to strlen, but forwprop changed it into
a char[N] type, then have we just broken things?  I'd totally forgotten
about this behavior from forwprop.  PR 46393 contains sample code where
this happens.

I also thought we had code to recover array indexing from pointer
arithmetic in the C/C++ front-end, but I can't seem to find it tonight.
But it would raise similar concerns.




> 
> One other example I have found in one of the test cases:
> 
> char c;
> 
> if (strlen(&c) != 0) abort();
> 
> this is now completely elided, but why?  Is there a code base where
> that is used?  I doubt, but why do we care to eliminate something
> stupid like that?  If we would emit a warning for that I'm fine with it,
> But if we silently remove code like that I don't think that it
> will improve anything.  So I ask, where is the code base which
> gets an improvement from that optimization?
I think it falls out naturally from trying to get accurate computations.
 I don't think either Martin or I really care about optimizing strlen in
this case.  In fact it's so clearly erroneous that it ought to generate
a diagnostic on its own.  Knowing Martin it was probably included in the
tests for completeness.

However, there is a fair amount of code that passes addresses of
characters to functions that want char * string arguments and those
functions promptly walk off the end of the single character,
unterminated string.  We actually just saw one of these in glibc that
was detected by Martin's recent work.  So it's definitely useful to
track how these kinds of values get used.

> 
>> Ultimately we want highly accurate string lengths to help improve the
>> quality of the warnings we generate for potentially dangerous code.
>> These changes seem to take us in the opposite direction.
>>
> 
> No, I don't think so, we have full control on the direction, when
> I do what Richi requested on his response, we will have one function
> where the string length estimation is based upon, instead of several
> open coded tree walks.
I don't think anyone objects to consolidating length computation.  What
I think we're hashing through is how does the object model in GIMPLE
affect the assumptions that can be correctly made about lengths of
objects.  When I ACK'd Martin's patches I'd totally forgotten about
these issues in GIMPLE and the impact they'd have if they were used in
areas that affect code generation.  That is absolutely and totally my
mistake.

I suspect that we're ultimately going to have to refine the design a bit
so that the lengths that impact code generation/optimization are
distinct from those that are used for warnings.  I'm not keen on this
concept, but I believe it's better than just reverting all the work on
issuing diagnostics for fishy code.


We're going to be kicking this around immediately -- there's a concern
that some of this may have gotten into the gcc-7 and/or gcc-8 codebase.
We need to get a sense of scale of the damage as well as a sense of
scale for how to go about fixing the codegen issues while still keeping
the benefits in the warning code.

If we go quiet, it's not from a lack of caring about this issue.  Quite
the opposite, we want to make sure we address these issues correctly
without just churning on the trunk.

Jeff


Re: [RFC,PATCH] Introduce -msdata=explicit for powerpc

2018-08-08 Thread Alexandre Oliva
On Aug  7, 2018, Segher Boessenkool  wrote:

> The approach looks like it should work, but it does not seem all that
> convenient to me.

And there's more, it's actually redundant.
-msdata -G 0 is equivalent to the proposed -msdata=explicit.
Patch withdrawn.  Sorry about the noise.

-- 
Alexandre Oliva, freedom fighter   https://FSFLA.org/blogs/lxo
Be the change, be Free! FSF Latin America board member
GNU Toolchain EngineerFree Software Evangelist


Re: [PATCH v2 01/10] Initial TI PRU GCC port

2018-08-08 Thread Dimitar Dimitrov
On Tuesday, 7 Aug 2018, 16:56:16 EEST Sandra Loosemore wrote:
> > * doc/extend.texi: Document PRU pragmas.
> > * doc/invoke.texi: Document PRU-specific options.
> > * doc/md.texi: Document PRU asm constraints.
> 
> I have a few nit-picky comments about the documentation parts.

Thank you for the review. I will fix those and send a new patch version.

Regards,
Dimitar


Re: [PATCH] libstdc++-v3: Have aligned_alloc() on Newlib

2018-08-08 Thread Sebastian Huber

On 08/08/18 16:33, Jonathan Wakely wrote:

On 08/08/18 16:22 +0200, Ulrich Weigand wrote:

Jonathan Wakely wrote:


Aha, so newlib was using memalign previously:

@@ -53,20 +54,24 @@ aligned_alloc (std::size_t al, std::size_t sz)
 #else
 extern "C" void *memalign(std::size_t boundary, std::size_t size);
 #endif
-#define aligned_alloc memalign


Yes, exactly ... this commit introduced the regression.


OK, I've regressed the branches then - I'll fix that.


This should fix it. I'll finish testing and commit it.

Sebastian, your patch to define HAVE_ALIGNED_ALLOC is OK for
gcc-7-branch and gcc-8-branch, because changing newlib from using
memalign to aligned_alloc is safe. 


Should I check in my patch in addition to your patch?

--
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP : Public key available on request.

Diese Nachricht ist keine geschäftliche Mitteilung im Sinne des EHUG.



Re: [PATCH] assume sprintf formatting of wide characters may fail (PR 86853)

2018-08-08 Thread Jeff Law
On 08/04/2018 12:46 PM, Martin Sebor wrote:
> The sprintf handling of wide characters neglects to consider
> that calling the function may fail due to a conversion error
> (when the wide character is invalid or not representable in
> the current locale).  The handling also misinterprets
> the POSIX %S wide string directive as a plain narrow %s and
> doesn't include %C (the POSIX equivalent of %lc).  The attached
> patch corrects these oversights by extending the data structures
> to indicate when a directive may fail, and extending the UNDER4K
> member of the format_result structure to also encode calls with
> such directives.
> 
> Tested on x86_64-linux.
> 
> Besides the trunk, since this bug can affect code correctness
> I'd like to backport this patch to both release branches (7
> and 8).
> 
> Martin
> 
> gcc-86853.diff
> 
> 
> PR tree-optimization/86853 - sprintf optimization for wide strings doesn't 
> account for conversion failure
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/86853
>   * gimple-ssa-sprintf.c (struct format_result): Rename member.
>   (struct fmtresult): Add member and initialize it in ctors.
>   (format_character): Handle %C.  Extend range to NUL.  Set MAYFAIL.
>   (format_string): Handle %S the same as %ls.  Set MAYFAIL.
>   (format_directive): Set POSUNDER4K when MAYFAIL is set.
>   (parse_directive): Handle %C same as %c.
>   (sprintf_dom_walker::compute_format_length): Adjust.
>   (is_call_safe): Adjust.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/86853
>   * gcc.dg/tree-ssa/builtin-sprintf-10.c: New test.
>   * gcc.dg/tree-ssa/builtin-sprintf-11.c: New test.
>   * gcc.dg/tree-ssa/builtin-sprintf-warn-18.c: Adjust.
Mostly OK.  One nit noted below and one question.  If you're
sufficiently confident that the charmap test is OK, then you just need
to address the nit in the comment and you're good to go.

> 
> Index: gcc/gimple-ssa-sprintf.c
> ===
> --- gcc/gimple-ssa-sprintf.c  (revision 263295)
> +++ gcc/gimple-ssa-sprintf.c  (working copy)
> @@ -2217,13 +2224,20 @@ format_character (const directive &dir, tree arg,
> res.range.likely = 0;
> res.range.unlikely = 0;
>   }
> -   else if (min > 0 && min < 128)
> +   else if (min >= 0 && min < 128)
>   {
> +   /* Be conservative if the target execution character set
> +  is not a 1-to-1 mapping to the source character set or
> +  if the source set is not ASCII.  */
> +   bool one_2_one_ascii
> + = (target_to_host_charmap[0] == 1 && target_to_host ('a') == 
> 97);
Hmm.  Is this really sufficient?I have nowhere near enough knowledge
of the potential target character sets to know if this is sufficiently
tight.

> @@ -2302,6 +2320,10 @@ format_string (const directive &dir, tree arg, vr_
> /* Even a non-empty wide character string need not convert into
>any bytes.  */
> res.range.min = 0;
> +
> +   /* A non-emtpy wide character conversion may fail.  */
s/emtpy/empty/

Jeff



Re: [PATCH] convert braced initializers to strings (PR 71625)

2018-08-08 Thread Martin Sebor

On 08/08/2018 02:48 PM, Joseph Myers wrote:

On Wed, 8 Aug 2018, Martin Sebor wrote:


Jason/Joseph, is this meant to be accepted?  It's rejected with
a hard error with -Wpedantic but I don't see any tests for it:

warning: ISO C forbids empty initializer braces [-Wpedantic]
   const char x[] = { };
^
error: zero or negative size array ‘b’
   const char x[] = { };
  ^

I'll avoid handling it but I'm not sure I should add a test case
for it if it's accepted by mistake (and if I should open a bug
to reject it instead).


It's a perfectly ordinary combination of two GNU extensions (zero-size
arrays, and empty initializer braces), so yes, it should be accepted, and
have size 0 (but -pedantic should produce a pedwarn, not a hard error).


Okay, let me fix that separately.

As an aside, while adding tests for the zero-size array, I found
the handling of these extensions less than intuitive.

The empty braced initializer makes the array complete and gives
it a zero size.  I suppose that makes sense.

But for a declaration at file scope and without an initializer,
GCC warns that the array is assumed to have one element, but
then gives an error when sizeof is applied to it:

  const char a[];   // warning: array ‘a’ assumed to have one element
  const int n = sizeof a;   // error: invalid application of ‘sizeof’ 
to incomplete type ‘const char[]’


even though the array does have a size of 1.  (I guess that's
because of the [] syntax but the warning sure makes the sizeof
error surprising.)  Auto declarations of arrays with no bound
and with no intializer are rejected with an error:

  char a[];   // error: array size missing in ‘a’

I'm sure there's some explanation for this too but that doesn't
make it intuitive to work with.

It would be nice to say something about how this works and maybe
even why in the manual.  The documentation for zero-length arrays
doesn't describe these kinds of extensions or the subtle nuances
(it only talks about the [0] form where the zero is explicit, or
flexible array members).  Without tests it's hard to tell what's
meant to be valid and what could be accepted by accident.  With
the patch accepted tests will exist, but unless it's discussed
somewhere I missed I'll put together a patch to update the docs.

Martin


Re: [PATCH] convert braced initializers to strings (PR 71625)

2018-08-08 Thread Martin Sebor

On 08/08/2018 05:08 AM, Jason Merrill wrote:

On Wed, Aug 8, 2018 at 9:04 AM, Martin Sebor  wrote:

On 08/07/2018 02:57 AM, Jason Merrill wrote:


On Wed, Aug 1, 2018 at 12:49 AM, Martin Sebor  wrote:


On 07/31/2018 07:38 AM, Jason Merrill wrote:



On Tue, Jul 31, 2018 at 9:51 AM, Martin Sebor  wrote:



The middle-end contains code to determine the lengths of constant
character arrays initialized by string literals.  The code is used
in a number of optimizations and warnings.

However, the code is unable to deal with constant arrays initialized
using the braced initializer syntax, as in

  const char a[] = { '1', '2', '\0' };

The attached patch extends the C and C++ front-ends to convert such
initializers into a STRING_CST form.

The goal of this work is to both enable existing optimizations for
such arrays, and to help detect bugs due to using non-nul terminated
arrays where nul-terminated strings are expected.  The latter is
an extension of the GCC 8 _Wstringop-overflow and
-Wstringop-truncation warnings that help detect or prevent reading
past the end of dynamically created character arrays.  Future work
includes detecting potential past-the-end reads from uninitialized
local character arrays.





  && TYPE_MAIN_VARIANT (TREE_TYPE (valtype)) == char_type_node)




Why? Don't we want this for other character types as well?



It suppresses narrowing warnings for things like

  signed char a[] = { 0xff };

(there are a couple of tests that exercise this).



Why is plain char different in this respect?  Presumably one of

char a[] = { -1 };
char b[] = { 0xff };

should give the same narrowing warning, depending on whether char is
signed.



Right.  I've added more tests to verify that it does.


At the same time, STRING_CST is supposed to be able to represent
strings of any integer type so there should be a way to make it
work.  On the flip side, recent discussions of changes in this
area suggest there may be bugs in the wide character handling of
STRING_CST so those would need to be fixed before relying on it
for robust support.

In any case, if you have a suggestion for how to make it work for
at least the narrow character types I'll adjust the patch.



I suppose braced_list_to_string should call check_narrowing for C++.



I see.  I've made that change.  That has made it possible to
convert arrays of all character types.  Thanks!


Currently it uses tree_fits_shwi_p (signed host_wide_int) and then
stores the extracted value in a host unsigned int, which is then
converted to host char.  Does the right thing happen for -fsigned-char
or targets with a different character set?


I believe so.  I've added tests for these too (ASCII and EBCDIC)
and also changed the type of the extracted value to HWI to match
(it doesn't change the results of the tests).

Attached is an updated patch with these changes plus more tests
as suggested by Joseph.


Great.  Can we also move the call to braced_list_to_string into
check_initializer, so it works for templates as well?  As a case just
before the block that calls reshape_init seems appropriate.


Done in the attached patch.  I've also avoided dealing with
zero-length arrays and added tests to make sure their size
stays is regardless of the form of their initializer and
the appropriate warnings are issued.

Using build_string() rather than build_string_literal() needed
a tweak in digest_init_r().  It didn't break anything but since
the array type may not have a domain yet, neither will the
string.  It looks like that may get adjusted later on but I've
temporarily guarded the code with #if 1.  If the change is
fine I'll remove the #if before committing.

This initial patch only handles narrow character initializers
(i.e., those with TYPE_STRING_FLAG set).  Once this gets some
exposure I'd like to extend it to other character types,
including wchar_t.

Martin
PR tree-optimization/71625 - missing strlen optimization on different array initialization style

gcc/c/ChangeLog:

	PR tree-optimization/71625
	* c-parser.c (c_parser_declaration_or_fndef): Call
	braced_list_to_string.

gcc/c-family/ChangeLog:

	PR tree-optimization/71625
	* c-common.c (braced_list_to_string): New function.
	* c-common.h (braced_list_to_string): Declare it.

gcc/cp/ChangeLog:

	PR tree-optimization/71625
	* decl.c (check_initializer):  Call braced_list_to_string.
	(eval_check_narrowing): New function.
	* gcc/cp/typeck2.c (digest_init_r): Accept strings literals
	as initilizers for all narrow character types.

gcc/testsuite/ChangeLog:

	PR tree-optimization/71625
	* g++.dg/init/string2.C: New test.
	* g++.dg/init/string3.C: New test.
	* g++.dg/init/string4.C: New test.
	* gcc.dg/init-string-3.c: New test.
	* gcc.dg/strlenopt-55.c: New test.
	* gcc.dg/strlenopt-56.c: New test.
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index d919605..b10d9c9 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -8509,4 +8509,102 @@ maybe_add_include_fixit (rich_location *richl

Re: [PATCH] convert braced initializers to strings (PR 71625)

2018-08-08 Thread Joseph Myers
On Wed, 8 Aug 2018, Martin Sebor wrote:

> Jason/Joseph, is this meant to be accepted?  It's rejected with
> a hard error with -Wpedantic but I don't see any tests for it:
> 
> warning: ISO C forbids empty initializer braces [-Wpedantic]
>const char x[] = { };
> ^
> error: zero or negative size array ‘b’
>const char x[] = { };
>   ^
> 
> I'll avoid handling it but I'm not sure I should add a test case
> for it if it's accepted by mistake (and if I should open a bug
> to reject it instead).

It's a perfectly ordinary combination of two GNU extensions (zero-size 
arrays, and empty initializer braces), so yes, it should be accepted, and 
have size 0 (but -pedantic should produce a pedwarn, not a hard error).

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: Ping for STRING_CST patches

2018-08-08 Thread Joseph Myers
On Wed, 8 Aug 2018, Bernd Edlinger wrote:

> [PATCH] Handle overlength strings in the C FE
> https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00142.html

At least this one had at least one review comment 
.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] convert braced initializers to strings (PR 71625)

2018-08-08 Thread Bernd Edlinger
On 08/08/18 21:50, Martin Sebor wrote:
>> Sorry, again, but could it be possible that this test case changed
>> with your patch?
>>
>> $ cat w.c
>> const char  x[] = {  };
>> int main()
>> {
>>    __builtin_printf("%ld\n", sizeof(x));
>>    return 0;
>> }
>> $ gcc w.c
>> $ ./a.out
>> 1
>>
>> without your patch
>> $ ./a.out
>> 0
>>
> 
> Jason/Joseph, is this meant to be accepted?  It's rejected with
> a hard error with -Wpedantic but I don't see any tests for it:
> 
> warning: ISO C forbids empty initializer braces [-Wpedantic]
>     const char x[] = { };
>      ^
> error: zero or negative size array ‘b’
>     const char x[] = { };
>    ^
> 
> I'll avoid handling it but I'm not sure I should add a test case
> for it if it's accepted by mistake (and if I should open a bug
> to reject it instead).
> 
> The comment above complete_array_type() in c-common.c and
> the code kind of suggest it's on purpose but the odd failure
> mode (hard error with -Wpedantic, silence otherwise) and
> the absence of tests make me wonder.
> 

This is also a bit odd:
$ cat w.cc
$ g++ w.cc
w.cc:1:26: error: initializer-string for array of chars is too long 
[-fpermissive]
  char   x[3] = { 1,2,3,4  };
$ cat w1.cc
char   x[3] = { 1,2,3,4,5  };
int main()
{
   __builtin_printf("%ld\n", sizeof(x));
   return 0;
}
$ g++ w.cc
w1.cc:1:28: error: too many initializers for 'char [3]'
  char   x[3] = { 1,2,3,4,5  };

This suggests, that it might be better to fold the initializer at where
my C/C++ FE patch shortens the string constants?

It would have the advantage, that the array size is already known there.
And it would avoid changing the array size accidentally, because the
decl is already laid out?

What do you think?


Bernd.


m68k: handle more cases of TLS symbols with offset

2018-08-08 Thread Andreas Schwab
This is a better fix for PR target/46179.  There are more DImode insns
that call adjust_operand during output processing.  Instead of scanning
the insns in the FINAL_PRESCAN_INSN hook, adjust them directly before
they are output by print_operand or print_operand_address.

Andreas.

PR target/46179
* config/m68k/m68k.h (FINAL_PRESCAN_INSN): Don't define.
* config/m68k/m68k.c (handle_move_double): Don't call
m68k_final_prescan_insn.
(m68k_adjust_decorated_operand): Renamed from
m68k_final_prescan_insn, remove first and third operand and
simplify.
(print_operand): Call it.
(print_operand_address): Call it.

PR target/46179
* gcc.target/m68k/tls-dimode.c: New file.
---
 gcc/ChangeLog  | 12 
 gcc/config/m68k/m68k.c | 66 ++
 gcc/config/m68k/m68k.h |  3 -
 gcc/testsuite/ChangeLog|  5 ++
 gcc/testsuite/gcc.target/m68k/tls-dimode.c | 15 +
 5 files changed, 62 insertions(+), 39 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/m68k/tls-dimode.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 64704bcd6d..e8f65c9791 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,15 @@
+2018-08-08  Andreas Schwab  
+
+   PR target/46179
+   * config/m68k/m68k.h (FINAL_PRESCAN_INSN): Don't define.
+   * config/m68k/m68k.c (handle_move_double): Don't call
+   m68k_final_prescan_insn.
+   (m68k_adjust_decorated_operand): Renamed from
+   m68k_final_prescan_insn, remove first and third operand and
+   simplify.
+   (print_operand): Call it.
+   (print_operand_address): Call it.
+
 2018-08-08  Nathan Sidwell  
 
* diagnostic.c (diagnostic_report_current_module): Use
diff --git a/gcc/config/m68k/m68k.c b/gcc/config/m68k/m68k.c
index 75a5a5b69b..303dfc1c0c 100644
--- a/gcc/config/m68k/m68k.c
+++ b/gcc/config/m68k/m68k.c
@@ -2329,11 +2329,10 @@ m68k_unwrap_symbol (rtx orig, bool unwrap_reloc32_p)
   return m68k_unwrap_symbol_1 (orig, unwrap_reloc32_p, NULL);
 }
 
-/* Prescan insn before outputing assembler for it.  */
+/* Adjust decorated address operand before outputing assembler for it.  */
 
-void
-m68k_final_prescan_insn (rtx_insn *insn ATTRIBUTE_UNUSED,
-rtx *operands, int n_operands)
+static void
+m68k_adjust_decorated_operand (rtx op)
 {
   int i;
 
@@ -2355,45 +2354,38 @@ m68k_final_prescan_insn (rtx_insn *insn 
ATTRIBUTE_UNUSED,
  to patch up anything outside of the operand.  */
 
   subrtx_var_iterator::array_type array;
-  for (i = 0; i < n_operands; ++i)
+  FOR_EACH_SUBRTX_VAR (iter, array, op, ALL)
 {
-  rtx op;
-
-  op = operands[i];
-
-  FOR_EACH_SUBRTX_VAR (iter, array, op, ALL)
+  rtx x = *iter;
+  if (m68k_unwrap_symbol (x, true) != x)
{
- rtx x = *iter;
- if (m68k_unwrap_symbol (x, true) != x)
-   {
- rtx plus;
+ rtx plus;
 
- gcc_assert (GET_CODE (x) == CONST);
- plus = XEXP (x, 0);
+ gcc_assert (GET_CODE (x) == CONST);
+ plus = XEXP (x, 0);
 
- if (GET_CODE (plus) == PLUS || GET_CODE (plus) == MINUS)
-   {
- rtx unspec;
- rtx addend;
+ if (GET_CODE (plus) == PLUS || GET_CODE (plus) == MINUS)
+   {
+ rtx unspec;
+ rtx addend;
 
- unspec = XEXP (plus, 0);
- gcc_assert (GET_CODE (unspec) == UNSPEC);
- addend = XEXP (plus, 1);
- gcc_assert (CONST_INT_P (addend));
+ unspec = XEXP (plus, 0);
+ gcc_assert (GET_CODE (unspec) == UNSPEC);
+ addend = XEXP (plus, 1);
+ gcc_assert (CONST_INT_P (addend));
 
- /* We now have all the pieces, rearrange them.  */
+ /* We now have all the pieces, rearrange them.  */
 
- /* Move symbol to plus.  */
- XEXP (plus, 0) = XVECEXP (unspec, 0, 0);
+ /* Move symbol to plus.  */
+ XEXP (plus, 0) = XVECEXP (unspec, 0, 0);
 
- /* Move plus inside unspec.  */
- XVECEXP (unspec, 0, 0) = plus;
+ /* Move plus inside unspec.  */
+ XVECEXP (unspec, 0, 0) = plus;
 
- /* Move unspec to top level of const.  */
- XEXP (x, 0) = unspec;
-   }
- iter.skip_subrtxes ();
+ /* Move unspec to top level of const.  */
+ XEXP (x, 0) = unspec;
}
+ iter.skip_subrtxes ();
}
 }
 }
@@ -3496,7 +3488,6 @@ handle_move_double (rtx operands[2],
 
   /* Normal case: do the two words, low-numbered first.  */
 
-  m68k_final_prescan_insn (NULL, operands, 2);
   handle_movsi (operands);
 
   /* Do the middle one of the three words for long double */
@@ -3507,7 +3498,6 @@ handle_mov

Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules

2018-08-08 Thread Paul Koning
Thanks.  Ok, so the expressions you gave are undefined for x==1, which says 
that substituting something that is also undefined for x==1 is permitted.  You 
can argue from "undefined" rather than relying on IEEE features like NaN or 
infinite.

paul

> On Aug 8, 2018, at 2:57 PM, Giuliano Augusto Faulin Belinassi 
>  wrote:
> 
> Sorry about that. In the e-mail text field I wrote sinh(tanh(x)) and
> cosh(tanh(x)) where it was supposed to be sinh(atanh(x)) and
> cosh(atanh(x)), thus I am talking about the inverse hyperbolic tangent
> function. The patch code and comments are still correct.
> 
> On Wed, Aug 8, 2018 at 10:58 AM, Paul Koning  wrote:
>> Now I'm puzzled.
>> 
>> I don't see how an infinite would show up in the original expression.  I 
>> don't know hyperbolic functions, so I just constructed a small test program, 
>> and the original vs. the substitution you mention are not at all similar.
>> 
>>paul
>> 
>> 
>>> On Aug 7, 2018, at 4:42 PM, Giuliano Augusto Faulin Belinassi 
>>>  wrote:
>>> 
>>> That is a good question because I didn't know that such targets
>>> exists. Any suggestion?
>>> 
>>> 
>>> On Tue, Aug 7, 2018 at 5:29 PM, Paul Koning  wrote:
 
 
> On Aug 7, 2018, at 4:00 PM, Giuliano Augusto Faulin Belinassi 
>  wrote:
> 
> Related with bug 86829, but for hyperbolic trigonometric functions.
> This patch adds substitution rules to both sinh(tanh(x)) -> x / sqrt(1
> - x*x) and cosh(tanh(x)) -> 1 / sqrt(1 - x*x). Notice that the both
> formulas has division by 0, but it causes no harm because 1/(+0) ->
> +infinity, thus the math is still safe.
 
 What about non-IEEE targets that don't have "infinite" in their float 
 representation?
 
   paul
 
 
>> 



Re: [PATCH] convert braced initializers to strings (PR 71625)

2018-08-08 Thread Martin Sebor

Sorry, again, but could it be possible that this test case changed
with your patch?

$ cat w.c
const char  x[] = {  };
int main()
{
   __builtin_printf("%ld\n", sizeof(x));
   return 0;
}
$ gcc w.c
$ ./a.out
1

without your patch
$ ./a.out
0



Jason/Joseph, is this meant to be accepted?  It's rejected with
a hard error with -Wpedantic but I don't see any tests for it:

warning: ISO C forbids empty initializer braces [-Wpedantic]
   const char x[] = { };
^
error: zero or negative size array ‘b’
   const char x[] = { };
  ^

I'll avoid handling it but I'm not sure I should add a test case
for it if it's accepted by mistake (and if I should open a bug
to reject it instead).

The comment above complete_array_type() in c-common.c and
the code kind of suggest it's on purpose but the odd failure
mode (hard error with -Wpedantic, silence otherwise) and
the absence of tests make me wonder.

Martin



Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules

2018-08-08 Thread Giuliano Augusto Faulin Belinassi
Sorry about that. In the e-mail text field I wrote sinh(tanh(x)) and
cosh(tanh(x)) where it was supposed to be sinh(atanh(x)) and
cosh(atanh(x)), thus I am talking about the inverse hyperbolic tangent
function. The patch code and comments are still correct.

On Wed, Aug 8, 2018 at 10:58 AM, Paul Koning  wrote:
> Now I'm puzzled.
>
> I don't see how an infinite would show up in the original expression.  I 
> don't know hyperbolic functions, so I just constructed a small test program, 
> and the original vs. the substitution you mention are not at all similar.
>
> paul
>
>
>> On Aug 7, 2018, at 4:42 PM, Giuliano Augusto Faulin Belinassi 
>>  wrote:
>>
>> That is a good question because I didn't know that such targets
>> exists. Any suggestion?
>>
>>
>> On Tue, Aug 7, 2018 at 5:29 PM, Paul Koning  wrote:
>>>
>>>
 On Aug 7, 2018, at 4:00 PM, Giuliano Augusto Faulin Belinassi 
  wrote:

 Related with bug 86829, but for hyperbolic trigonometric functions.
 This patch adds substitution rules to both sinh(tanh(x)) -> x / sqrt(1
 - x*x) and cosh(tanh(x)) -> 1 / sqrt(1 - x*x). Notice that the both
 formulas has division by 0, but it causes no harm because 1/(+0) ->
 +infinity, thus the math is still safe.
>>>
>>> What about non-IEEE targets that don't have "infinite" in their float 
>>> representation?
>>>
>>>paul
>>>
>>>
>


Re: [PATCH] convert braced initializers to strings (PR 71625)

2018-08-08 Thread Bernd Edlinger
On 08/08/18 19:33, Bernd Edlinger wrote:
> Hi Martin,
> 
> sorry, I hope you forgive me, when I add a few comments to your patch.
> 
>> +  unsigned HOST_WIDE_INT nelts = CONSTRUCTOR_NELTS (ctor);
>> +  tree eltype = TREE_TYPE (type);
> ...
>> +  /* Bail if the CTOR has a block of more than 256 embedded nuls
>> +    due to implicitly initialized elements.  */
>> +  unsigned nelts = (idx - str.length ()) + 1;
>> +  if (nelts > 256)
>> +   return NULL_TREE;
> 
> nelts shadows previous nelts.
> 
>> +  if (!nelts || str.length () < i)
> 
> I don't understand when is str.length () < i ?
> 
>> +    /* Append a nul for the empty initializer { } and for the last
>> +   explicit initializer in the loop above that is a nul.  */
>> +    str.safe_push (0);
>> +
>> +  /* Build a string literal but return the embedded STRING_CST.  */
>> +  tree res = build_string_literal (str.length (), str.begin ());
>> +  res = TREE_OPERAND (TREE_OPERAND (res, 0), 0);
>> +  return res;
>> +}
> 
> 
> res has a different type than the object you initialize.
> I think you should use:
> 
> tree res = build_string (str.length (), str.begin ());
> TREE_TYPE (res) = type;
> return res;
> 

Sorry, again, but could it be possible that this test case changed
with your patch?

$ cat w.c
const char  x[] = {  };
int main()
{
   __builtin_printf("%ld\n", sizeof(x));
   return 0;
}
$ gcc w.c
$ ./a.out
1

without your patch
$ ./a.out
0


Bernd.



Re: [PATCH] line-map include-from representation

2018-08-08 Thread Nathan Sidwell

On 08/08/2018 11:36 AM, David Malcolm wrote:


In r255786 I adjusted prune.exp to move the dg-regexp handling to
before the pruning.  Unfortunately, handle-multiline-outputs is still
after the pruning.  I guess we could try moving that to before as well,
but I suspect it might break some things.


Ah, that shows how to use dg-regexp, which does the trick.  I guess I 
can go kill the hack I did on the modules branch for this.


Here's what I'll commit

nathan

--
Nathan Sidwell
Index: gcc/ChangeLog
===
--- gcc/ChangeLog	(revision 263427)
+++ gcc/ChangeLog	(working copy)
@@ -1,3 +1,31 @@
+2018-08-08  Nathan Sidwell  
+
+	Make linemap::included_from a location
+	libcpp/
+	* include/line-map.h (struct line_map_ordinary): Replace
+	included_from map index with included_at source_location.
+	(ORDINARY_MAP_INCLUDER_FILE_INDEX): Delete.
+	(LAST_SOURCE_LINE_LOCATION): Delete.
+	(LAST_SOURCE_LINE, LAST_SOURCE_COLUMN): Delete.
+	(linemap_included_from): New.
+	(linemap_included_from_linemap): Declare.
+	(MAIN_FILE_P): Adjust.
+	* line-map.c (linemap_included_from_linemap): New.
+	(lonemap_check_files_exited): Use linemap_included_at.
+	(linemap_add): Adjust inclusion setting.
+	(linemap_dump, linemap_dump_location): Adjust.
+	* directives.c (do_linemarker): Use linemap_included_at.
+	gcc/
+	* diagnostic.c (diagnostic_report_current_module): Use
+	linemap_included_from & linemap_included_from_linemap.
+	gcc/c-family/
+	* c-common.c (try_to_locate_new_include_inertion_point): Use
+	linemap_included_from_linemap.
+	* c-lex.c (fe_file_change): Use linemap_included_from.
+	* c-ppoutput.c (pp_file_change): Likewise.
+	gcc/fortran/
+	* cpp.c (cb_file_change): Use linemap_included_from.
+
 2018-08-08  Hongbo Zhang  
 
 	* config/aarch64/aarch64-cores.def: Add phecda core.
Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c	(revision 263427)
+++ gcc/c-family/c-common.c	(working copy)
@@ -8413,8 +8413,8 @@ try_to_locate_new_include_insertion_poin
   const line_map_ordinary *ord_map
 	= LINEMAPS_ORDINARY_MAP_AT (line_table, i);
 
-  const line_map_ordinary *from = INCLUDED_FROM (line_table, ord_map);
-  if (from)
+  if (const line_map_ordinary *from
+	  = linemap_included_from_linemap (line_table, ord_map))
 	if (from->to_file == file)
 	  {
 	last_include_ord_map = from;
Index: gcc/c-family/c-lex.c
===
--- gcc/c-family/c-lex.c	(revision 263427)
+++ gcc/c-family/c-lex.c	(working copy)
@@ -199,7 +199,7 @@ fe_file_change (const line_map_ordinary
 	 we already did in compile_file.  */
   if (!MAIN_FILE_P (new_map))
 	{
-	  unsigned int included_at = LAST_SOURCE_LINE_LOCATION (new_map - 1);
+	  location_t included_at = linemap_included_from (new_map);
 	  int line = 0;
 	  if (included_at > BUILTINS_LOCATION)
 	line = SOURCE_LINE (new_map - 1, included_at);
Index: gcc/c-family/c-ppoutput.c
===
--- gcc/c-family/c-ppoutput.c	(revision 263427)
+++ gcc/c-family/c-ppoutput.c	(working copy)
@@ -663,11 +663,9 @@ pp_file_change (const line_map_ordinary
 	  /* Bring current file to correct line when entering a new file.  */
 	  if (map->reason == LC_ENTER)
 	{
-	  const line_map_ordinary *from = INCLUDED_FROM (line_table, map);
-	  maybe_print_line (LAST_SOURCE_LINE_LOCATION (from));
+	  maybe_print_line (linemap_included_from (map));
+	  flags = " 1";
 	}
-	  if (map->reason == LC_ENTER)
-	flags = " 1";
 	  else if (map->reason == LC_LEAVE)
 	flags = " 2";
 	  print_line (map->start_location, flags);
Index: gcc/diagnostic.c
===
--- gcc/diagnostic.c	(revision 263427)
+++ gcc/diagnostic.c	(working copy)
@@ -590,9 +590,10 @@ diagnostic_report_current_module (diagno
 	  bool first = true;
 	  do
 	{
-	  map = INCLUDED_FROM (line_table, map);
+	  where = linemap_included_from (map);
+	  map = linemap_included_from_linemap (line_table, map);
 	  const char *line_col
-		= maybe_line_and_column (LAST_SOURCE_LINE (map),
+		= maybe_line_and_column (SOURCE_LINE (map, where),
 	 first && context->show_column
 	 ? SOURCE_COLUMN (map, where) : 0);
 	  static const char *const msgs[] =
Index: gcc/fortran/cpp.c
===
--- gcc/fortran/cpp.c	(revision 263427)
+++ gcc/fortran/cpp.c	(working copy)
@@ -881,10 +881,7 @@ cb_file_change (cpp_reader * ARG_UNUSED
 	{
 	  /* Bring current file to correct line when entering a new file.  */
 	  if (map->reason == LC_ENTER)
-	{
-	  const line_map_ordinary *from = INCLUDED_FROM (line_table, map);
-	  maybe_print_line (LAST_SOURCE_LINE_LOCATION (from));
-	}
+	maybe_print_line (linemap_included_from (map));
 	  if (map->reason

Re: [PATCH] convert braced initializers to strings (PR 71625)

2018-08-08 Thread Bernd Edlinger
Hi Martin,

sorry, I hope you forgive me, when I add a few comments to your patch.

> +  unsigned HOST_WIDE_INT nelts = CONSTRUCTOR_NELTS (ctor);
> +  tree eltype = TREE_TYPE (type);
...
> +  /* Bail if the CTOR has a block of more than 256 embedded nuls
> +due to implicitly initialized elements.  */
> +  unsigned nelts = (idx - str.length ()) + 1;
> +  if (nelts > 256)
> +   return NULL_TREE;

nelts shadows previous nelts.

> +  if (!nelts || str.length () < i)

I don't understand when is str.length () < i ?

> +/* Append a nul for the empty initializer { } and for the last
> +   explicit initializer in the loop above that is a nul.  */
> +str.safe_push (0);
> +
> +  /* Build a string literal but return the embedded STRING_CST.  */
> +  tree res = build_string_literal (str.length (), str.begin ());
> +  res = TREE_OPERAND (TREE_OPERAND (res, 0), 0);
> +  return res;
> +}


res has a different type than the object you initialize.
I think you should use:

tree res = build_string (str.length (), str.begin ());
TREE_TYPE (res) = type;
return res;


Bernd.


Re: [PATCH] Make strlen range computations more conservative

2018-08-08 Thread Richard Biener
On August 8, 2018 5:51:16 PM GMT+02:00, Martin Sebor  wrote:
>On 08/07/2018 11:46 AM, Richard Biener wrote:
>> On August 7, 2018 6:31:36 PM GMT+02:00, Martin Sebor
> wrote:
>>> On 08/07/2018 09:33 AM, Bernd Edlinger wrote:
 On 08/07/18 17:02, Martin Sebor wrote:
> On 08/06/2018 11:45 PM, Richard Biener wrote:
>> On August 7, 2018 5:38:59 AM GMT+02:00, Martin Sebor
>>>  wrote:
>>> On 08/06/2018 11:40 AM, Jeff Law wrote:
 On 08/06/2018 11:15 AM, Martin Sebor wrote:
>>> These examples do not aim to be valid C, they just point out
>>> limitations
>>> of the middle-end design, and a good deal of the problems
>are
>>> due
>>> to trying to do things that are not safe within the
>boundaries
>>> given
>>> by the middle-end design.
>> I really think this is important -- and as such I think we
>need
>>> to
>>> move
>> away from trying to describe scenarios in C because doing so
>>> keeps
>> bringing us back to the "C doesn't allow XYZ" kinds of
>>> arguments
>>> when
>> what we're really discussing are GIMPLE semantic issues.
>>
>> So examples should be GIMPLE.  You might start with (possibly
>>> invalid) C
>> code to generate the GIMPLE, but the actual discussion needs
>to
>>> be
>> looking at GIMPLE.  We might include the C code in case
>someone
>>> wants to
>> look at things in a debugger, but bringing the focus to
>GIMPLE
>>> is
>>> really
>> important here.
>
> I don't understand the goal of this exercise.  Unless the
>GIMPLE
> code is the result of a valid test case (in some language GCC
> supports), what does it matter what it looks like?  The basis
>of
> every single transformation done by a compiler is that the
>>> source
> code is correct.  If it isn't then all bets are off.  I'm no
>>> GIMPLE
> expert but even I can come up with any number of GIMPLE
>>> expressions
> that have undefined behavior.  What would that prove?
 The GIMPLE IL is less restrictive than the original source
>>> language.
 The process of translation into GIMPLE and optimization can
>>> create
 situations in the GIMPLE IL that can't be validly represented
>in
>>> the
 original source language.  Subobject crossing being one such
>>> case,
>>> there
 are certainly others.  We have to handle these scenarios
>>> correctly.
>>>
>>> Sure, but a valid C test case still needs to exist to show that
>>> such a transformation is possible.  Until someone comes up with
>>> one it's all speculation.
>>
>> Jakub showed you one wrt CSE of addresses.
>
> Sorry, there have been so many examples I've lost track.  Can
> you please copy it here or point to it in the archive?
>

 This is based on Jakubs example here:
>>> https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00260.html

 $ cat y.cc
 typedef __typeof__ (sizeof 0) size_t;
 void *operator new (size_t, void *p) { return p; }
 void *operator new[] (size_t, void *p) { return p; }
 struct S { int x; unsigned char a[1]; char b[64]; };
 void baz (char *);

 size_t
 foo (S *p)
 {
char *q = new ((char*)p->a) char [16];
baz (q);
size_t x = __builtin_strlen (q);
if (x==0)
  __builtin_abort();
return x;
 }

 $ gcc -O3 -S y.ccup
 $ cat y.s
 .LFB2:
.cfi_startproc
subq$8, %rsp
.cfi_def_cfa_offset 16
addq$4, %rdi
call_Z3bazPc
callabort
.cfi_endproc



 I think this is not a correct optimization.
>>>
>>> I see.  This narrows it down to the placement new expression
>>> exposing the type of the original object rather than that of
>>> the newly constructed object.  We end up with strlen (_2)
>>> where _2 = &p_1(D)->a.
>>>
>>> The aggressive loop optimization trigger in this case because
>>> the access has been transformed to MEM[(char *)p_5(D) + 4B]
>>> early on which obviates the structure of the accessed type.
>>>
>>> This is the case that I think is worth solving -- ideally not
>>> by removing the optimization but by preserving the conversion
>>> to the type of the newly constructed object.
>>
>> Pointer types carry no information in GIMPLE.
>
>So what do you suggest as a solution?
>
>The strlen optimization can be decoupled from warnings and
>disabled, and the aggressive loop optimization can be disabled
>altogether.  But the same issue affects all string functions
>with _FORTIFY_SOURCE=2.  The modified example below aborts at
>runime (and gets diagnosed by -Wstringop-overflow).
>
>GCC certainly needs to generate valid object code for valid
>source code.  But keeping track of object type information
>is also important for correctness and security, as has been
>done by _FORTIFY_SOURCE and by many middle-end w

Re: dejagnu version update?

2018-08-08 Thread Segher Boessenkool
On Wed, Aug 08, 2018 at 01:17:49PM +0200, Bernhard Reutner-Fischer wrote:
> On 7 August 2018 18:34:30 CEST, Segher Boessenkool 
>  wrote:
> >On Mon, Aug 06, 2018 at 08:25:49AM -0700, Mike Stump wrote:
> >> Since g++ already requires 1.5.3, it make no sense to bump to
> >anything older that 1.5.3, so let's bump to 1.5.3.  Those packaging
> >systems and OSes that wanted to update by now, have had their chance to
> >update.  Those that punt until we bump the requirement, well, they will
> >now have to bump.  :-)
> >
> >"g++ requires it"?  In what way?  I haven't seen any issues with older
> >dejagnu versions.
> 
> I think 
> http://git.savannah.gnu.org/gitweb/?p=dejagnu.git;a=commit;h=5256bd82343000c76bc0e48139003f90b6184347

Ugh.

If there is a conflict between the test-specific options and the testsuite
run options, sometimes you should pick one, sometimes the other, and often
skipping the test is best.  Older dejagnu picked the run options, and now
newer dejagnu picks the test-specific options, so now we cannot rely on
*either* behaviour.  At least for many years to come: we share most
testcases with older GCC versions, which do not require dejagnu 1.5.3!

What a mess.


Segher


Ping for STRING_CST patches

2018-08-08 Thread Bernd Edlinger
Hi,

I'd like to ping the following patches:

[PATCH] Check the STRING_CSTs in varasm.c
https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00361.html

[PATCH] Fix not properly nul-terminated string constants in JIT
https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00370.html

[PATCH] Handle overlength strings in the C FE
https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00142.html

[PATCH] Make GO string literals properly NUL terminated
https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01931.html

[PATCH] [Ada] Make middle-end string literals NUL terminated
https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01929.html


Thanks
Bernd.


Re: [PATCH][AArch64] Implement new intrinsics vabsd_s64 and vnegd_s64

2018-08-08 Thread Vlad Lazar

On 01/08/18 18:35, James Greenhalgh wrote:

On Wed, Aug 01, 2018 at 07:13:53AM -0500, Vlad Lazar wrote:

On 31/07/18 22:48, James Greenhalgh wrote:

On Fri, Jul 20, 2018 at 04:37:34AM -0500, Vlad Lazar wrote:

Hi,

The patch adds implementations for the NEON intrinsics vabsd_s64 and vnegd_s64.
(https://developer.arm.com/products/architecture/cpu-architecture/a-profile/docs/ihi0073/latest/arm-neon-intrinsics-reference-architecture-specification)

Bootstrapped and regtested on aarch64-none-linux-gnu and there are no 
regressions.

OK for trunk?

+__extension__ extern __inline int64_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vnegd_s64 (int64_t __a)
+{
+  return -__a;
+}


Does this give the correct behaviour for the minimum value of int64_t? That
would be undefined behaviour in C, but well-defined under ACLE.

Thanks,
James



Hi. Thanks for the review.

For the minimum value of int64_t it behaves as the ACLE specifies:
"The negative of the minimum (signed) value is itself."


What should happen in this testcase? The spoiler is below, but try to work out
what should happen and what goes wrong with your implementation.

   int foo (int64_t x)
   {
 if (x < (int64_t) 0)
   return vnegd_s64(x) < (int64_t) 0;
 else
   return 0;
   }
   
   
   int bar (void)

   {
 return foo (INT64_MIN);
   }
  
Thanks,

James


-






INT64_MIN < 0 should be true, so we should return vnegd_s64(INT64_MIN) < 0.
vnegd_s64(INT64_MIN) is identity, so the return value should be
INT64_MIN < 0; i.e. True.

This isn't what the compiler thinks... The compiler makes use of the fact
that -INT64_MIN is undefined behaviour in C, and doesn't need to be considered
as a special case. The if statement gives you a range reduction to [-INF, -1],
negating that gives you a range [1, INF], and [1, INF] is never less than 0,
so the compiler folds the function to return false. We have a mismatch in
semantics


I see your point now. I have updated the vnegd_s64 intrinsic to convert to
unsigned before negating. This means that if the predicted range of x is
[INT64_MIN, y], then the predicted range of vnegd_s64 (x) will be
~[INT64_MIN + 1, y] which seems to resolve the issue. I've also added testcases
which reflect the issue you've pointed out. Note that I've change the vabsd_s64
intrinsic in order to avoid moves between integer and vector registers.

See the updated patch below. Ok for trunk?

---

diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 
2d18400040f031dfcdaf60269ad484647804e1be..fc734e1aa9e93c171c0670164e5a3a54209905d3
 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -11822,6 +11822,18 @@ vabsq_s64 (int64x2_t __a)
   return __builtin_aarch64_absv2di (__a);
 }
 
+/* Try to avoid moving between integer and vector registers.

+   For why the cast to unsigned is needed check the vnegd_s64 intrinsic.
+   There is a testcase related to this issue:
+   gcc.target/aarch64/vabsd_s64.c.  */
+
+__extension__ extern __inline int64_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vabsd_s64 (int64_t __a)
+{
+  return __a < 0 ? - (uint64_t) __a : __a;
+}
+
 /* vadd */
 
 __extension__ extern __inline int64_t

@@ -22907,6 +22919,25 @@ vneg_s64 (int64x1_t __a)
   return -__a;
 }
 
+/* According to the ACLE, the negative of the minimum (signed)

+   value is itself.  This leads to a semantics mismatch, as this is
+   undefined behaviour in C.  The value range predictor is not
+   aware that the negation of a negative number can still be negative
+   and it may try to fold the expression.  See the test in
+   gcc.target/aarch64/vnegd_s64.c for an example.
+
+   The cast below tricks the value range predictor to include
+   INT64_MIN in the range it computes.  So for x in the range
+   [INT64_MIN, y] the range prediction after vnegd_s64 (x) will
+   be ~[INT64_MIN + 1, y].  */
+
+__extension__ extern __inline int64_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vnegd_s64 (int64_t __a)
+{
+  return - (uint64_t) __a;
+}
+
 __extension__ extern __inline float32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vnegq_f32 (float32x4_t __a)
diff --git a/gcc/testsuite/gcc.target/aarch64/scalar_intrinsics.c 
b/gcc/testsuite/gcc.target/aarch64/scalar_intrinsics.c
index 
ea29066e369b967d0781d31c8a5208bda9e4f685..d943989768dd8c9aa87d9dcb899e199029ef3f8b
 100644
--- a/gcc/testsuite/gcc.target/aarch64/scalar_intrinsics.c
+++ b/gcc/testsuite/gcc.target/aarch64/scalar_intrinsics.c
@@ -627,6 +627,14 @@ test_vqabss_s32 (int32_t a)
   return vqabss_s32 (a);
 }
 
+/* { dg-final { scan-assembler-times "\\tneg\\tx\[0-9\]+" 1 } } */

+
+int64_t
+test_vnegd_s64 (int64_t a)
+{
+  return vnegd_s64 (a);
+}
+
 /* { dg-final { scan-assembler-times "\\tsqneg\\tb\[0-9\]+" 1 } } */
 
 int8_t

diff --git a/gcc/testsuite/gcc.target/aarch64/vabs_intrinsic_3.c 
b/gcc/testsuite/gcc.target/aarch64/vabs_intrinsi

[PING] [PATCH] Create internally nul terminated string literals in fortan FE

2018-08-08 Thread Bernd Edlinger
Hi,

I'd like to ping this patch: 
https://gcc.gnu.org/ml/fortran/2018-08/msg0.html

I attach a new version, which contains only a minor white-space change from
the previous version, in the function header of gfc_build_hollerith_string_const
to contain "static tree" on one line instead of two.

Thanks
Bernd.

On 08/01/18 13:32, Bernd Edlinger wrote:
> Hi,
> 
> this patch changes the Fortan FE to create NUL terminated STRING_CST
> objects.  This is a cleanup in preparation of a more thorough check
> on the STRING_CST objects in the middle-end.
> 
> 
> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
> Is it OK for trunk?
> 
> 
> Thanks
> Bernd.
2018-08-01  Bernd Edlinger  

	* trans-array.c (gfc_conv_array_initializer): Remove excess precision
	from overlength string initializers.
	* trans-const.c (gfc_build_wide_string_const): Make the internal
	representation of STRING_CST properly NUL terminated.
	(gfc_build_hollerith_string_const): New helper function.
	(gfc_conv_constant_to_tree): Use it.
	

diff -pur gcc/fortran/trans-array.c gcc/fortran/trans-array.c
--- gcc/fortran/trans-array.c	2018-07-02 09:24:43.0 +0200
+++ gcc/fortran/trans-array.c	2018-08-01 06:45:20.529923246 +0200
@@ -5964,6 +5964,32 @@ gfc_conv_array_initializer (tree type, g
 	{
 	case EXPR_CONSTANT:
 	  gfc_conv_constant (&se, c->expr);
+
+	  /* See gfortran.dg/charlen_15.f90 for instance.  */
+	  if (TREE_CODE (se.expr) == STRING_CST
+		  && TREE_CODE (type) == ARRAY_TYPE)
+		{
+		  tree atype = type;
+		  while (TREE_CODE (TREE_TYPE (atype)) == ARRAY_TYPE)
+		atype = TREE_TYPE (atype);
+		  if (TREE_CODE (TREE_TYPE (atype)) == INTEGER_TYPE
+		  && tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (se.expr)))
+			 > tree_to_uhwi (TYPE_SIZE_UNIT (atype)))
+		{
+		  unsigned HOST_WIDE_INT size
+			= tree_to_uhwi (TYPE_SIZE_UNIT (atype));
+		  unsigned unit
+			= TYPE_PRECISION (TREE_TYPE (atype)) / BITS_PER_UNIT;
+		  const char *p = TREE_STRING_POINTER (se.expr);
+		  char *q = (char *)xmalloc (size + unit);
+
+		  memcpy (q, p, size);
+		  memset (q + size, 0, unit);
+		  se.expr = build_string (size + unit, q);
+		  TREE_TYPE (se.expr) = atype;
+		  free (q);
+		}
+		}
 	  break;
 
 	case EXPR_STRUCTURE:
diff -pur gcc/fortran/trans-const.c gcc/fortran/trans-const.c
--- gcc/fortran/trans-const.c	2018-06-10 14:50:03.0 +0200
+++ gcc/fortran/trans-const.c	2018-07-31 20:15:21.721153877 +0200
@@ -93,13 +93,16 @@ gfc_build_wide_string_const (int kind, s
   int i;
   tree str, len;
   size_t size;
+  size_t elem;
   char *s;
 
   i = gfc_validate_kind (BT_CHARACTER, kind, false);
-  size = length * gfc_character_kinds[i].bit_size / 8;
+  elem = gfc_character_kinds[i].bit_size / 8;
+  size = (length + 1) * elem;
 
   s = XCNEWVAR (char, size);
   gfc_encode_character (kind, length, string, (unsigned char *) s, size);
+  memset (s + size - elem, 0, elem);
 
   str = build_string (size, s);
   free (s);
@@ -131,6 +134,30 @@ gfc_build_localized_cstring_const (const char *msg
 }
 
 
+/* Build a hollerith string constant.  */
+
+static tree
+gfc_build_hollerith_string_const (size_t length, const char *s)
+{
+  tree str;
+  tree len;
+  char *p;
+
+  p = XCNEWVAR (char, length + 1);
+  memcpy (p, s, length);
+  p[length] = '\0';
+  str = build_string (length + 1, p);
+  free (p);
+  len = size_int (length);
+  TREE_TYPE (str) =
+build_array_type (gfc_character1_type_node,
+		  build_range_type (gfc_charlen_type_node,
+	size_one_node, len));
+  TYPE_STRING_FLAG (TREE_TYPE (str)) = 1;
+  return str;
+}
+
+
 /* Return a string constant with the given length.  Used for static
initializers.  The constant will be padded or truncated to match
length.  */
@@ -363,8 +390,8 @@ gfc_conv_constant_to_tree (gfc_expr * expr)
   return res;
 
 case BT_HOLLERITH:
-  return gfc_build_string_const (expr->representation.length,
- expr->representation.string);
+  return gfc_build_hollerith_string_const (expr->representation.length,
+	   expr->representation.string);
 
 default:
   gcc_unreachable ();


Re: [PATCH] Make strlen range computations more conservative

2018-08-08 Thread Bernd Edlinger
On 08/08/18 17:51, Martin Sebor wrote:
> On 08/07/2018 11:46 AM, Richard Biener wrote:
>>
>> Pointer types carry no information in GIMPLE.
> 
> So what do you suggest as a solution?
> 
> The strlen optimization can be decoupled from warnings and
> disabled, and the aggressive loop optimization can be disabled
> altogether.  But the same issue affects all string functions
> with _FORTIFY_SOURCE=2.  The modified example below aborts at
> runime (and gets diagnosed by -Wstringop-overflow).
> 
> GCC certainly needs to generate valid object code for valid
> source code.  But keeping track of object type information
> is also important for correctness and security, as has been
> done by _FORTIFY_SOURCE and by many middle-end warnings.
> When accurate, it also benefits optimization.
> 
> What can we do to make it more reliable?  Would annotating
> placement new solve the problem?  If not, what would?
> 
> Martin
> 
> #include 
> #include 
> 
> struct S { int x; unsigned char a[1]; char b[64]; };
> 
> void foo (S *p)
> {
>     char *q = new ((char*)p->a) char [16];
> 
>     strcpy (q, "12345678");   // abort here
> }
> 
> int main ()
> {
>    foo (new S);
> }
> 

I quote Jakub's E-mail from 07/31/18 08:38:

"Note that _FORTIFY_SOURCE=2 is the mode that goes beyond what the standard
requires, imposes extra requirements.  So from what this mode accepts or
rejects we shouldn't determine what is or isn't considered valid."


So just use _FORTIFY_SOURCE=1 ?

But what would be good to improve in _FORTIFY_SOURCE is, intercepting
strlen to catch cases early, where the char buffer is not zero terminated.


Bernd.


Re: [PATCH] libstdc++-v3: Have aligned_alloc() on Newlib

2018-08-08 Thread Jonathan Wakely

On 08/08/18 15:46 +0100, Jonathan Wakely wrote:

On 08/08/18 15:33 +0100, Jonathan Wakely wrote:

On 08/08/18 16:22 +0200, Ulrich Weigand wrote:

Jonathan Wakely wrote:


Aha, so newlib was using memalign previously:

@@ -53,20 +54,24 @@ aligned_alloc (std::size_t al, std::size_t sz)
#else
extern "C" void *memalign(std::size_t boundary, std::size_t size);
#endif
-#define aligned_alloc memalign


Yes, exactly ... this commit introduced the regression.


OK, I've regressed the branches then - I'll fix that.


This should fix it. I'll finish testing and commit it.

Sebastian, your patch to define HAVE_ALIGNED_ALLOC is OK for
gcc-7-branch and gcc-8-branch, because changing newlib from using
memalign to aligned_alloc is safe.


With the patch this time ...


Committed to trunk, gcc-8-branch and gcc-7-branch.

Rainer, please let me know if this still works on Solaris 10, thanks.


commit 4f61feff8ad3615c0b686deac3852d98734ef964
Author: redi 
Date:   Wed Aug 8 15:16:43 2018 +

Prevent internal aligned_alloc clashing with libc version

If configure fails to detect aligned_alloc we will try to define our
own in new_opa.cc but that could clash with the libcversion in
. Use a namespace to keep them distinct.

* libsupc++/new_opa.cc (aligned_alloc): Declare inside namespace to
avoid clashing with an ::aligned_alloc function that was not detected
by configure.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@263409 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/libstdc++-v3/libsupc++/new_opa.cc b/libstdc++-v3/libsupc++/new_opa.cc
index 5be0cc2ca65..68eac5b8ceb 100644
--- a/libstdc++-v3/libsupc++/new_opa.cc
+++ b/libstdc++-v3/libsupc++/new_opa.cc
@@ -25,15 +25,30 @@
 
 #include 
 #include 
+#include 
 #include 
 #include "new"
 
+#if !_GLIBCXX_HAVE_ALIGNED_ALLOC && !_GLIBCXX_HAVE__ALIGNED_MALLOC \
+  && !_GLIBCXX_HAVE_POSIX_MEMALIGN && _GLIBCXX_HAVE_MEMALIGN
+# if _GLIBCXX_HOSTED && __has_include()
+// Some C libraries declare memalign in 
+#  include 
+# else
+extern "C" void *memalign(std::size_t boundary, std::size_t size);
+# endif
+#endif
+
 using std::new_handler;
 using std::bad_alloc;
 
-#if !_GLIBCXX_HAVE_ALIGNED_ALLOC
-#if _GLIBCXX_HAVE__ALIGNED_MALLOC
-#define aligned_alloc(al,sz) _aligned_malloc(sz,al)
+namespace __gnu_cxx {
+#if _GLIBCXX_HAVE_ALIGNED_ALLOC
+using ::aligned_alloc;
+#elif _GLIBCXX_HAVE__ALIGNED_MALLOC
+static inline void*
+aligned_alloc (std::size_t al, std::size_t sz)
+{ return _aligned_malloc(sz, al); }
 #elif _GLIBCXX_HAVE_POSIX_MEMALIGN
 static inline void*
 aligned_alloc (std::size_t al, std::size_t sz)
@@ -49,11 +64,6 @@ aligned_alloc (std::size_t al, std::size_t sz)
   return nullptr;
 }
 #elif _GLIBCXX_HAVE_MEMALIGN
-#if _GLIBCXX_HOSTED
-#include 
-#else
-extern "C" void *memalign(std::size_t boundary, std::size_t size);
-#endif
 static inline void*
 aligned_alloc (std::size_t al, std::size_t sz)
 {
@@ -66,7 +76,6 @@ aligned_alloc (std::size_t al, std::size_t sz)
   return memalign (al, sz);
 }
 #else // !HAVE__ALIGNED_MALLOC && !HAVE_POSIX_MEMALIGN && !HAVE_MEMALIGN
-#include 
 // The C library doesn't provide any aligned allocation functions, define one.
 // This is a modified version of code from gcc/config/i386/gmm_malloc.h
 static inline void*
@@ -87,7 +96,7 @@ aligned_alloc (std::size_t al, std::size_t sz)
   return aligned_ptr;
 }
 #endif
-#endif
+} // namespace __gnu_cxx
 
 _GLIBCXX_WEAK_DEFINITION void *
 operator new (std::size_t sz, std::align_val_t al)
@@ -116,6 +125,7 @@ operator new (std::size_t sz, std::align_val_t al)
 sz += align - rem;
 #endif
 
+  using __gnu_cxx::aligned_alloc;
   while (__builtin_expect ((p = aligned_alloc (align, sz)) == 0, false))
 {
   new_handler handler = std::get_new_handler ();


Re: [PATCH] Make strlen range computations more conservative

2018-08-08 Thread Martin Sebor

On 08/07/2018 11:46 AM, Richard Biener wrote:

On August 7, 2018 6:31:36 PM GMT+02:00, Martin Sebor  wrote:

On 08/07/2018 09:33 AM, Bernd Edlinger wrote:

On 08/07/18 17:02, Martin Sebor wrote:

On 08/06/2018 11:45 PM, Richard Biener wrote:

On August 7, 2018 5:38:59 AM GMT+02:00, Martin Sebor

 wrote:

On 08/06/2018 11:40 AM, Jeff Law wrote:

On 08/06/2018 11:15 AM, Martin Sebor wrote:

These examples do not aim to be valid C, they just point out

limitations

of the middle-end design, and a good deal of the problems are

due

to trying to do things that are not safe within the boundaries

given

by the middle-end design.

I really think this is important -- and as such I think we need

to

move

away from trying to describe scenarios in C because doing so

keeps

bringing us back to the "C doesn't allow XYZ" kinds of

arguments

when

what we're really discussing are GIMPLE semantic issues.

So examples should be GIMPLE.  You might start with (possibly

invalid) C

code to generate the GIMPLE, but the actual discussion needs to

be

looking at GIMPLE.  We might include the C code in case someone

wants to

look at things in a debugger, but bringing the focus to GIMPLE

is

really

important here.


I don't understand the goal of this exercise.  Unless the GIMPLE
code is the result of a valid test case (in some language GCC
supports), what does it matter what it looks like?  The basis of
every single transformation done by a compiler is that the

source

code is correct.  If it isn't then all bets are off.  I'm no

GIMPLE

expert but even I can come up with any number of GIMPLE

expressions

that have undefined behavior.  What would that prove?

The GIMPLE IL is less restrictive than the original source

language.

The process of translation into GIMPLE and optimization can

create

situations in the GIMPLE IL that can't be validly represented in

the

original source language.  Subobject crossing being one such

case,

there

are certainly others.  We have to handle these scenarios

correctly.


Sure, but a valid C test case still needs to exist to show that
such a transformation is possible.  Until someone comes up with
one it's all speculation.


Jakub showed you one wrt CSE of addresses.


Sorry, there have been so many examples I've lost track.  Can
you please copy it here or point to it in the archive?



This is based on Jakubs example here:

https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00260.html


$ cat y.cc
typedef __typeof__ (sizeof 0) size_t;
void *operator new (size_t, void *p) { return p; }
void *operator new[] (size_t, void *p) { return p; }
struct S { int x; unsigned char a[1]; char b[64]; };
void baz (char *);

size_t
foo (S *p)
{
   char *q = new ((char*)p->a) char [16];
   baz (q);
   size_t x = __builtin_strlen (q);
   if (x==0)
 __builtin_abort();
   return x;
}

$ gcc -O3 -S y.ccup
$ cat y.s
.LFB2:
.cfi_startproc
subq$8, %rsp
.cfi_def_cfa_offset 16
addq$4, %rdi
call_Z3bazPc
callabort
.cfi_endproc



I think this is not a correct optimization.


I see.  This narrows it down to the placement new expression
exposing the type of the original object rather than that of
the newly constructed object.  We end up with strlen (_2)
where _2 = &p_1(D)->a.

The aggressive loop optimization trigger in this case because
the access has been transformed to MEM[(char *)p_5(D) + 4B]
early on which obviates the structure of the accessed type.

This is the case that I think is worth solving -- ideally not
by removing the optimization but by preserving the conversion
to the type of the newly constructed object.


Pointer types carry no information in GIMPLE.


So what do you suggest as a solution?

The strlen optimization can be decoupled from warnings and
disabled, and the aggressive loop optimization can be disabled
altogether.  But the same issue affects all string functions
with _FORTIFY_SOURCE=2.  The modified example below aborts at
runime (and gets diagnosed by -Wstringop-overflow).

GCC certainly needs to generate valid object code for valid
source code.  But keeping track of object type information
is also important for correctness and security, as has been
done by _FORTIFY_SOURCE and by many middle-end warnings.
When accurate, it also benefits optimization.

What can we do to make it more reliable?  Would annotating
placement new solve the problem?  If not, what would?

Martin

#include 
#include 

struct S { int x; unsigned char a[1]; char b[64]; };

void foo (S *p)
{
   char *q = new ((char*)p->a) char [16];

   strcpy (q, "12345678");   // abort here
}

int main ()
{
  foo (new S);
}



Re: [PATCH] line-map include-from representation

2018-08-08 Thread David Malcolm
On Wed, 2018-08-08 at 11:36 -0400, David Malcolm wrote:
> On Wed, 2018-08-08 at 11:28 -0400, Nathan Sidwell wrote:
> > On 08/08/2018 10:46 AM, David Malcolm wrote:
> > 
> > > [...snip...]
> > > 
> > > We don't seem to have much test coverage for this code.  Sorry to
> > > be a
> > > pain, but could you please try adding a testcase of a diagnostic
> > > issued
> > > from within a couple of levels of nested includes, perhaps using
> > > 
> > > /* { dg-options "-fdiagnostics-show-caret" } */
> > > 
> > > and
> > > 
> > > /* { dg-begin-multiline-output "" }
> > > { dg-end-multiline-output "" } */
> > > 
> > > (or dg-regexp if need be for dealing with arbitrary paths in
> > > filenames)
> > 
> > I could never get existing dejagnu check this particular piece of 
> > output.  AFAICT this is all unconditionally pruned before the
> > output 
> > scanners get a look-see.  On the modules branch I ended up hacking
> > in
> > a 
> > pre-pruned-output hook.  It's not pretty.
> 
> In r255786 I adjusted prune.exp to move the dg-regexp handling to
> before the pruning.  Unfortunately, handle-multiline-outputs is still
> after the pruning.  I guess we could try moving that to before as
> well,
> but I suspect it might break some things.

Alternatively, could the test cases in r255786 (aka
6d8c9f39007790df9c08fca1e4c458a22b04e9c4) be adapted to provide some
test coverage of this?  (they use dg-regexp for the awkward lines)


Re: [PATCH] line-map include-from representation

2018-08-08 Thread David Malcolm
On Wed, 2018-08-08 at 11:28 -0400, Nathan Sidwell wrote:
> On 08/08/2018 10:46 AM, David Malcolm wrote:
> 
> > [...snip...]
> > 
> > We don't seem to have much test coverage for this code.  Sorry to
> > be a
> > pain, but could you please try adding a testcase of a diagnostic
> > issued
> > from within a couple of levels of nested includes, perhaps using
> > 
> > /* { dg-options "-fdiagnostics-show-caret" } */
> > 
> > and
> > 
> > /* { dg-begin-multiline-output "" }
> > { dg-end-multiline-output "" } */
> > 
> > (or dg-regexp if need be for dealing with arbitrary paths in
> > filenames)
> 
> I could never get existing dejagnu check this particular piece of 
> output.  AFAICT this is all unconditionally pruned before the output 
> scanners get a look-see.  On the modules branch I ended up hacking in
> a 
> pre-pruned-output hook.  It's not pretty.

In r255786 I adjusted prune.exp to move the dg-regexp handling to
before the pruning.  Unfortunately, handle-multiline-outputs is still
after the pruning.  I guess we could try moving that to before as well,
but I suspect it might break some things.

Dave


Re: [PATCH] line-map include-from representation

2018-08-08 Thread Nathan Sidwell

On 08/08/2018 10:46 AM, David Malcolm wrote:


[...snip...]

We don't seem to have much test coverage for this code.  Sorry to be a
pain, but could you please try adding a testcase of a diagnostic issued
from within a couple of levels of nested includes, perhaps using

/* { dg-options "-fdiagnostics-show-caret" } */

and

/* { dg-begin-multiline-output "" }
{ dg-end-multiline-output "" } */

(or dg-regexp if need be for dealing with arbitrary paths in filenames)


I could never get existing dejagnu check this particular piece of 
output.  AFAICT this is all unconditionally pruned before the output 
scanners get a look-see.  On the modules branch I ended up hacking in a 
pre-pruned-output hook.  It's not pretty.


nathan

--
Nathan Sidwell


Re: [PATCH,nvptx] Use CUDA driver API to select default runtime launch, geometry

2018-08-08 Thread Tom de Vries
On Wed, Aug 08, 2018 at 07:09:16AM -0700, Cesar Philippidis wrote:
> On 08/07/2018 06:52 AM, Cesar Philippidis wrote:
> 
> > I attached an updated version of the CUDA driver patch, although I
> > haven't rebased it against your changes yet. It still needs to be tested
> > against CUDA 5.5 using the systems/Nvidia's cuda.h. But I wanted to give
> > you an update.
> > 
> > Does this patch look OK, at least after testing competes? I removed the
> > tests for CUDA_ONE_CALL_MAYBE_NULL, because the newer CUDA API isn't
> > supported in the older drivers.
> 
> I've finally finished testing this patch. Besides for a couple of
> regressions with CUDA 5.5 in libgomp.oacc-c-c++-common/lib-75.c,
> lib-76.c and lib-79.c, the results came back clean.
> 
> This patch has been tested the following ways using a K40 GPU:
> 
>   * Using GCC's cuda.h with CUDA 9.2 drivers.
>   * Using cuda.h from CUDA 5.5 and Nvidia drivers 331.133 (supports CUDA
> 6.0) and the driver from CUDA 8.0.
>   * Using cuda.h from CUDA 8.0.
> 
> As mentioned before, because GCC's cuda.h defines CUDA_VERSION as 8000,
> there was a conflict with using it against CUDA 5.5, because of the
> missing cuLinkAddData_v2 symbol.
> 
> Note how the usage of cuOccupancyMaxPotentialBlockSize is guarded by
> checking for the version of CUDA_VERSION. I don't really like this, but
> it's a necessary evil of maintaining backwards compatibility.
> 
> Is this patch OK for trunk?
> 
> Thanks,
> Cesar

> [nvptx] Use CUDA driver API to select default runtime launch geometry
> 
> 2018-08-YY  Cesar Philippidis  
> 
>   libgomp/
>   plugin/cuda/cuda.h (CUoccupancyB2DSize): New typedef.
>   (cuDriverGetVersion): Declare.
>   (cuOccupancyMaxPotentialBlockSizeWithFlags): Declare.
>   plugin/plugin-nvptx.c (CUDA_ONE_CALL): Add entries for
>   cuDriverGetVersion and cuOccupancyMaxPotentialBlockSize.
>   (ptx_device): Add driver_version member.
>   (nvptx_open_device): Initialize it.
>   (nvptx_exec): Use cuOccupancyMaxPotentialBlockSize to set the
>   default num_gangs and num_workers when the driver supports it.
> ---
>  libgomp/plugin/cuda-lib.def   |  2 ++
>  libgomp/plugin/cuda/cuda.h|  4 
>  libgomp/plugin/plugin-nvptx.c | 40 +++-
>  3 files changed, 45 insertions(+), 1 deletion(-)
> 
> diff --git a/libgomp/plugin/cuda-lib.def b/libgomp/plugin/cuda-lib.def
> index be8e3b3..f2433e1 100644
> --- a/libgomp/plugin/cuda-lib.def
> +++ b/libgomp/plugin/cuda-lib.def
> @@ -2,6 +2,7 @@ CUDA_ONE_CALL (cuCtxCreate)
>  CUDA_ONE_CALL (cuCtxDestroy)
>  CUDA_ONE_CALL (cuCtxGetCurrent)
>  CUDA_ONE_CALL (cuCtxGetDevice)
> +CUDA_ONE_CALL (cuDriverGetVersion)

Don't use cuDriverGetVersion.

>  CUDA_ONE_CALL (cuCtxPopCurrent)
>  CUDA_ONE_CALL (cuCtxPushCurrent)
>  CUDA_ONE_CALL (cuCtxSynchronize)
> @@ -39,6 +40,7 @@ CUDA_ONE_CALL (cuModuleGetGlobal)
>  CUDA_ONE_CALL (cuModuleLoad)
>  CUDA_ONE_CALL (cuModuleLoadData)
>  CUDA_ONE_CALL (cuModuleUnload)
> +CUDA_ONE_CALL (cuOccupancyMaxPotentialBlockSize)

Use CUDA_ONE_CALL_MAYBE_NULL.

>  CUDA_ONE_CALL (cuStreamCreate)
>  CUDA_ONE_CALL (cuStreamDestroy)
>  CUDA_ONE_CALL (cuStreamQuery)
> diff --git a/libgomp/plugin/cuda/cuda.h b/libgomp/plugin/cuda/cuda.h
> index 4799825..3a790e6 100644
> --- a/libgomp/plugin/cuda/cuda.h
> +++ b/libgomp/plugin/cuda/cuda.h
> @@ -44,6 +44,7 @@ typedef void *CUevent;
>  typedef void *CUfunction;
>  typedef void *CUlinkState;
>  typedef void *CUmodule;
> +typedef size_t (*CUoccupancyB2DSize)(int);
>  typedef void *CUstream;
>  
>  typedef enum {
> @@ -123,6 +124,7 @@ CUresult cuCtxSynchronize (void);
>  CUresult cuDeviceGet (CUdevice *, int);
>  CUresult cuDeviceGetAttribute (int *, CUdevice_attribute, CUdevice);
>  CUresult cuDeviceGetCount (int *);
> +CUresult cuDriverGetVersion(int *);
>  CUresult cuEventCreate (CUevent *, unsigned);
>  #define cuEventDestroy cuEventDestroy_v2
>  CUresult cuEventDestroy (CUevent);
> @@ -170,6 +172,8 @@ CUresult cuModuleGetGlobal (CUdeviceptr *, size_t *, 
> CUmodule, const char *);
>  CUresult cuModuleLoad (CUmodule *, const char *);
>  CUresult cuModuleLoadData (CUmodule *, const void *);
>  CUresult cuModuleUnload (CUmodule);
> +CUresult cuOccupancyMaxPotentialBlockSize(int *, int *, CUfunction,
> +   CUoccupancyB2DSize, size_t, int);
>  CUresult cuStreamCreate (CUstream *, unsigned);
>  #define cuStreamDestroy cuStreamDestroy_v2
>  CUresult cuStreamDestroy (CUstream);
> diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
> index 825470a..b0ccf0b 100644
> --- a/libgomp/plugin/plugin-nvptx.c
> +++ b/libgomp/plugin/plugin-nvptx.c
> @@ -376,6 +376,7 @@ struct ptx_device
>int max_threads_per_block;
>int max_threads_per_multiprocessor;
>int default_dims[GOMP_DIM_MAX];
> +  int driver_version;
>  
>struct ptx_image_data *images;  /* Images loaded on device.  */
>pthread_mutex_t image_lock; /* Lock for above list.  */
> @

Re: [PATCH] libstdc++-v3: Have aligned_alloc() on Newlib

2018-08-08 Thread Jonathan Wakely

On 08/08/18 15:33 +0100, Jonathan Wakely wrote:

On 08/08/18 16:22 +0200, Ulrich Weigand wrote:

Jonathan Wakely wrote:


Aha, so newlib was using memalign previously:

@@ -53,20 +54,24 @@ aligned_alloc (std::size_t al, std::size_t sz)
#else
extern "C" void *memalign(std::size_t boundary, std::size_t size);
#endif
-#define aligned_alloc memalign


Yes, exactly ... this commit introduced the regression.


OK, I've regressed the branches then - I'll fix that.


This should fix it. I'll finish testing and commit it.

Sebastian, your patch to define HAVE_ALIGNED_ALLOC is OK for
gcc-7-branch and gcc-8-branch, because changing newlib from using
memalign to aligned_alloc is safe.


With the patch this time ...


commit cb2a7aae2668b690cfaed5093e0107e0ee64bb0e
Author: Jonathan Wakely 
Date:   Wed Aug 8 15:28:48 2018 +0100

Prevent internal aligned_alloc clashing with libc version

If configure fails to detect aligned_alloc we will try to define our
own in new_opa.cc but that could clash with the libcversion in
. Use a namespace to keep them distinct.

* libsupc++/new_opa.cc (aligned_alloc): Declare inside namespace to
avoid clashing with an ::aligned_alloc function that was not detected
by configure.

diff --git a/libstdc++-v3/libsupc++/new_opa.cc b/libstdc++-v3/libsupc++/new_opa.cc
index 5be0cc2ca65..68eac5b8ceb 100644
--- a/libstdc++-v3/libsupc++/new_opa.cc
+++ b/libstdc++-v3/libsupc++/new_opa.cc
@@ -25,15 +25,30 @@
 
 #include 
 #include 
+#include 
 #include 
 #include "new"
 
+#if !_GLIBCXX_HAVE_ALIGNED_ALLOC && !_GLIBCXX_HAVE__ALIGNED_MALLOC \
+  && !_GLIBCXX_HAVE_POSIX_MEMALIGN && _GLIBCXX_HAVE_MEMALIGN
+# if _GLIBCXX_HOSTED && __has_include()
+// Some C libraries declare memalign in 
+#  include 
+# else
+extern "C" void *memalign(std::size_t boundary, std::size_t size);
+# endif
+#endif
+
 using std::new_handler;
 using std::bad_alloc;
 
-#if !_GLIBCXX_HAVE_ALIGNED_ALLOC
-#if _GLIBCXX_HAVE__ALIGNED_MALLOC
-#define aligned_alloc(al,sz) _aligned_malloc(sz,al)
+namespace __gnu_cxx {
+#if _GLIBCXX_HAVE_ALIGNED_ALLOC
+using ::aligned_alloc;
+#elif _GLIBCXX_HAVE__ALIGNED_MALLOC
+static inline void*
+aligned_alloc (std::size_t al, std::size_t sz)
+{ return _aligned_malloc(sz, al); }
 #elif _GLIBCXX_HAVE_POSIX_MEMALIGN
 static inline void*
 aligned_alloc (std::size_t al, std::size_t sz)
@@ -49,11 +64,6 @@ aligned_alloc (std::size_t al, std::size_t sz)
   return nullptr;
 }
 #elif _GLIBCXX_HAVE_MEMALIGN
-#if _GLIBCXX_HOSTED
-#include 
-#else
-extern "C" void *memalign(std::size_t boundary, std::size_t size);
-#endif
 static inline void*
 aligned_alloc (std::size_t al, std::size_t sz)
 {
@@ -66,7 +76,6 @@ aligned_alloc (std::size_t al, std::size_t sz)
   return memalign (al, sz);
 }
 #else // !HAVE__ALIGNED_MALLOC && !HAVE_POSIX_MEMALIGN && !HAVE_MEMALIGN
-#include 
 // The C library doesn't provide any aligned allocation functions, define one.
 // This is a modified version of code from gcc/config/i386/gmm_malloc.h
 static inline void*
@@ -87,7 +96,7 @@ aligned_alloc (std::size_t al, std::size_t sz)
   return aligned_ptr;
 }
 #endif
-#endif
+} // namespace __gnu_cxx
 
 _GLIBCXX_WEAK_DEFINITION void *
 operator new (std::size_t sz, std::align_val_t al)
@@ -116,6 +125,7 @@ operator new (std::size_t sz, std::align_val_t al)
 sz += align - rem;
 #endif
 
+  using __gnu_cxx::aligned_alloc;
   while (__builtin_expect ((p = aligned_alloc (align, sz)) == 0, false))
 {
   new_handler handler = std::get_new_handler ();


Re: [PATCH] line-map include-from representation

2018-08-08 Thread David Malcolm
On Wed, 2018-08-08 at 09:06 -0400, Nathan Sidwell wrote:
> This patch changes the way line-maps hold the included-from
> information. 
>   Currently this is an index to another (earlier) line-map, and
> relies 
> on the fact that #include cause the termination of the current map
> and 
> emission of a new map.  It's then possible to determine the location
> of 
> the include directive as the last line of the just terminated line
> map 
> (by examining the first location of the next map).
> 
> With modules, I'd like to show the import path in a similar
> way.  But 
> importing a module doesn't (necessarily) cause termination of a line
> map 
> -- the imported module's 'imported-from' location is in the middle
> of 
> some other map.  Here's what that looks like:
> In file of module bob,
>imported at loc-2_b.C:5,
>  of module stuart,
>imported at loc-2_d.C:2:
> loc-2_a.C:5:18: note:   initializing argument 1 of 'int frob(int*)'
> 
> [included-from lines may appear in the middle of that stack too]
> 
> Thus, this patch uses a source_location to represent the included
> from 
> location.  No data size changes, as old and new forms are really
> ints. 
> IMHO it makes the interface somewhat cleaner, as we stop exposing
> the 
> map indexes to the library users.
> 
> [I noticed that a piece of this patch already escaped, we were using 
> LAST_SOURCE_COLUMN in diagnostic_report_current_module, but I'd 
> inadvertently changed that in the prepatch that was intended to
> prepare 
> the ground for this one.  Sorry.]
> 
> booted & tested on x86_64-linux, ok?

> 2018-08-08  Nathan Sidwell  
> 
>   Make linemap::included_at a location
>   libcpp/
>   * include/line-map.h (struct line_map_ordinary): Replace
>   included_from map index with included_at source_location.
>   (ORDINARY_MAP_INCLUDER_FILE_INDEX): Delete.
>   (LAST_SOURCE_LINE_LOCATION): Move to line-map.c.

Looks like LAST_SOURCE_LINE was also deleted.

>   (LAST_SOURCE_COLUMN): Delete.
>   (INCLUDED_AT): New.
>   (linemap_included_at): Declare.
>   (MAIN_FILE_P): Adjust.
>   * line-map.c (linemap_included_at): New.
>   (lonemap_check_files_exited): Use linemap_included_at.
>   (LAST_SOURCE_LINE_LOCATION): Made internal from header file.
>   (linemap_add): Adjust inclusion setting.
>   (linemap_dump, linemap_dump_location): Adjust.
>   * directives.c (do_linemarker): Use linemap_included_at.
>   gcc/
>   * diagnostic.c (diagnostic_report_current_module): Use INCLUDED_AT
>   & linemap_included_at.
>   gcc/c-family/
>   * c-common.c (try_to_locate_new_include_inertion_point): Use
>   linemap_included_at.
>   * c-lex.c (fe_file_change): Use INCLUDED_AT.
>   * c-ppoutput.c (pp_file_change): Likewise.
>   gcc/fortran/
>   * cpp.c (cb_file_change): Use INCLUDED_AT.
> 
> Index: gcc/c-family/c-common.c
> ===
> --- gcc/c-family/c-common.c   (revision 263365)
> +++ gcc/c-family/c-common.c   (working copy)
> @@ -8413,8 +8413,8 @@ try_to_locate_new_include_insertion_poin
>const line_map_ordinary *ord_map
>   = LINEMAPS_ORDINARY_MAP_AT (line_table, i);
>  
> -  const line_map_ordinary *from = INCLUDED_FROM (line_table, ord_map);
> -  if (from)
> +  if (const line_map_ordinary *from
> +   = linemap_included_at (line_table, ord_map))

I think "linemap_included_from" would be a better name for what this
patch calls "linemap_included_at".

Doing so would make a better distinction between the location at which
the include happened, vs the linemap in which it happened.

[...snip...]

> Index: libcpp/include/line-map.h
> ===
> --- libcpp/include/line-map.h (revision 263365)
> +++ libcpp/include/line-map.h (working copy)

[...snip...]

> -/* Return the last column number within an ordinary map.  */
> -
> -inline linenum_type
> -LAST_SOURCE_COLUMN (const line_map_ordinary *map)
> +inline source_location
> +INCLUDED_AT (const line_map_ordinary *ord_map)
>  {
> -  return SOURCE_COLUMN (map, LAST_SOURCE_LINE_LOCATION (map));
> +  return ord_map->included_at;
>  }

These used to be macros, hence the SCARY_UPPER_CASE; I converted them
to inline functions as a precursor to adding range support to
locations; I didn't rename them at the time, to minimize churn.

This is a new inline function, and so doesn't need to be in upper case.
How about "linemap_included_at" - which you're already using for
something else, but which I think should be renamed...

> -/* Returns the map a given map was included from, or NULL if the map
> -   belongs to the main file, i.e, a file that wasn't included by
> -   another one.  */
> -inline line_map_ordinary *
> -INCLUDED_FROM (struct line_maps *set, const line_map_ordinary *ord_map)
> -{
> -  return ((ord_map->included_from == -1)
> -   ? NULL
> -   : LINEMAPS_ORD

Re: dejagnu version update?

2018-08-08 Thread Michael Matz
Hi,

On Wed, 8 Aug 2018, Bernhard Reutner-Fischer wrote:

> How come?
> 
> If one wants to develop on a distro that is notoriously outdated then 
> you have to obtain the missing pieces yourself.

It's not about developing on an "notoriously outdated" distro, but about 
_testing_ on it.  There are very good reasons to test the quality of a 
compiler also on older distros.

> I wouldn't call three years "aggressive".

But even independend from the above I would.


Ciao,
Michael.


[committed][libgomp, nvptx] Fall back to cuLinkAddData/cuLinkCreate if _v2 not found

2018-08-08 Thread Tom de Vries
On Tue, Aug 07, 2018 at 06:52:59AM -0700, Cesar Philippidis wrote:

> I spotted an error
> with the patch; I realized that the cuda.h that ships with libgomp
> emulates version CUDA 8.0. That lead to problems using cuLinkAddData,
> because that function gets remapped to cuLinkAddData_v2 in CUDA 6.5 and
> newer.
> 

Right. [ I found that problem is mentioned here already:
( https://gcc.gnu.org/ml/gcc-patches/2017-01/msg01670.html ). ]

This patch should fix it.

Committed.

Thanks,
- Tom

[libgomp, nvptx] Fall back to cuLinkAddData/cuLinkCreate if _v2 not found

Cuda driver api functions cuLinkAddData and cuLinkCreate are available starting
version 5.5.  In version 6.5, they are remapped onto _v2 versions.

The dlopen interface of the libgomp nvptx plugin uses the _v2 versions, so it
won't work with a cuda driver with driver api version lower than 6.5.

This patch fixes the problem by testing for the presence of the _v2 versions,
and falling back to the original versions in case of absence of the _v2
versions.

Build on x86_64 with nvptx accelerator and reg-tested libgomp, both with and
without --without-cuda-driver.

2018-08-08  Tom de Vries  

* plugin/cuda-lib.def (cuLinkAddData_v2, cuLinkCreate_v2): Declare using
CUDA_ONE_CALL_MAYBE_NULL.
* plugin/plugin-nvptx.c (cuLinkAddData, cuLinkCreate): Undef and 
declare.
(cuLinkAddData_v2, cuLinkCreate_v2): Declare.
(link_ptx): Fall back to cuLinkAddData/cuLinkCreate if the _v2 versions
are not found.

---
 libgomp/plugin/cuda-lib.def   |  2 ++
 libgomp/plugin/plugin-nvptx.c | 28 
 2 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/libgomp/plugin/cuda-lib.def b/libgomp/plugin/cuda-lib.def
index 6365cdbfcbe4..29028b504a05 100644
--- a/libgomp/plugin/cuda-lib.def
+++ b/libgomp/plugin/cuda-lib.def
@@ -19,8 +19,10 @@ CUDA_ONE_CALL_MAYBE_NULL (cuGetErrorString)
 CUDA_ONE_CALL (cuInit)
 CUDA_ONE_CALL (cuLaunchKernel)
 CUDA_ONE_CALL (cuLinkAddData)
+CUDA_ONE_CALL_MAYBE_NULL (cuLinkAddData_v2)
 CUDA_ONE_CALL (cuLinkComplete)
 CUDA_ONE_CALL (cuLinkCreate)
+CUDA_ONE_CALL_MAYBE_NULL (cuLinkCreate_v2)
 CUDA_ONE_CALL (cuLinkDestroy)
 CUDA_ONE_CALL (cuMemAlloc)
 CUDA_ONE_CALL (cuMemAllocHost)
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index b549b7740039..6799a264976d 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -54,6 +54,18 @@ extern CUresult cuGetErrorString (CUresult, const char **);
 #define CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR 82
 #endif
 
+#if CUDA_VERSION >= 6050
+#undef cuLinkCreate
+#undef cuLinkAddData
+CUresult cuLinkAddData (CUlinkState, CUjitInputType, void *, size_t,
+   const char *, unsigned, CUjit_option *, void **);
+CUresult cuLinkCreate (unsigned, CUjit_option *, void **, CUlinkState *);
+#else
+CUresult cuLinkAddData_v2 (CUlinkState, CUjitInputType, void *, size_t,
+  const char *, unsigned, CUjit_option *, void **);
+CUresult cuLinkCreate_v2 (unsigned, CUjit_option *, void **, CUlinkState *);
+#endif
+
 #define DO_PRAGMA(x) _Pragma (#x)
 
 #if PLUGIN_NVPTX_DYNAMIC
@@ -938,16 +950,24 @@ link_ptx (CUmodule *module, const struct targ_ptx_obj 
*ptx_objs,
   nopts++;
 }
 
-  CUDA_CALL (cuLinkCreate, nopts, opts, optvals, &linkstate);
+  if (CUDA_CALL_EXISTS (cuLinkCreate_v2))
+CUDA_CALL (cuLinkCreate_v2, nopts, opts, optvals, &linkstate);
+  else
+CUDA_CALL (cuLinkCreate, nopts, opts, optvals, &linkstate);
 
   for (; num_objs--; ptx_objs++)
 {
   /* cuLinkAddData's 'data' argument erroneously omits the const
 qualifier.  */
   GOMP_PLUGIN_debug (0, "Loading:\n---\n%s\n---\n", ptx_objs->code);
-  r = CUDA_CALL_NOCHECK (cuLinkAddData, linkstate, CU_JIT_INPUT_PTX,
-(char *) ptx_objs->code, ptx_objs->size,
-0, 0, 0, 0);
+  if (CUDA_CALL_EXISTS (cuLinkAddData_v2))
+   r = CUDA_CALL_NOCHECK (cuLinkAddData_v2, linkstate, CU_JIT_INPUT_PTX,
+  (char *) ptx_objs->code, ptx_objs->size,
+  0, 0, 0, 0);
+  else
+   r = CUDA_CALL_NOCHECK (cuLinkAddData, linkstate, CU_JIT_INPUT_PTX,
+  (char *) ptx_objs->code, ptx_objs->size,
+  0, 0, 0, 0);
   if (r != CUDA_SUCCESS)
{
  GOMP_PLUGIN_error ("Link error log %s\n", &elog[0]);


Re: [PATCH] libstdc++-v3: Have aligned_alloc() on Newlib

2018-08-08 Thread Jonathan Wakely

On 08/08/18 16:22 +0200, Ulrich Weigand wrote:

Jonathan Wakely wrote:


Aha, so newlib was using memalign previously:

@@ -53,20 +54,24 @@ aligned_alloc (std::size_t al, std::size_t sz)
 #else
 extern "C" void *memalign(std::size_t boundary, std::size_t size);
 #endif
-#define aligned_alloc memalign


Yes, exactly ... this commit introduced the regression.


OK, I've regressed the branches then - I'll fix that.


This should fix it. I'll finish testing and commit it.

Sebastian, your patch to define HAVE_ALIGNED_ALLOC is OK for
gcc-7-branch and gcc-8-branch, because changing newlib from using
memalign to aligned_alloc is safe.

Sorry for the mess.




[committed][libgomp, nvptx] Allow cuGetErrorString to be NULL

2018-08-08 Thread Tom de Vries
Hi,

Cuda driver api function cuGetErrorString is available in version 6.0 and
higher.

Currently, when the driver that is used does not contain this function, the
libgomp nvptx plugin will not build (PLUGIN_NVPTX_DYNAMIC == 0) or run
(PLUGIN_NVPTX_DYNAMIC == 1).

This patch fixes this problem by testing for the presence of the function, and
handling absence.

Build on x86_64 with nvptx accelerator and reg-tested libgomp, both with and
without --without-cuda-driver.

Committed to trunk.

Thanks,
- Tom

[libgomp, nvptx] Allow cuGetErrorString to be NULL

2018-08-08  Tom de Vries  

* plugin/cuda-lib.def (cuGetErrorString): Use CUDA_ONE_CALL_MAYBE_NULL.
* plugin/plugin-nvptx.c (cuda_error): Handle if cuGetErrorString is not
present.

---
 libgomp/plugin/cuda-lib.def   |  2 +-
 libgomp/plugin/plugin-nvptx.c | 10 +++---
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/libgomp/plugin/cuda-lib.def b/libgomp/plugin/cuda-lib.def
index be8e3b3ec4d6..6365cdbfcbe4 100644
--- a/libgomp/plugin/cuda-lib.def
+++ b/libgomp/plugin/cuda-lib.def
@@ -15,7 +15,7 @@ CUDA_ONE_CALL (cuEventQuery)
 CUDA_ONE_CALL (cuEventRecord)
 CUDA_ONE_CALL (cuEventSynchronize)
 CUDA_ONE_CALL (cuFuncGetAttribute)
-CUDA_ONE_CALL (cuGetErrorString)
+CUDA_ONE_CALL_MAYBE_NULL (cuGetErrorString)
 CUDA_ONE_CALL (cuInit)
 CUDA_ONE_CALL (cuLaunchKernel)
 CUDA_ONE_CALL (cuLinkAddData)
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index 589d6596cc2f..b549b7740039 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -161,13 +161,17 @@ init_cuda_lib (void)
 static const char *
 cuda_error (CUresult r)
 {
+  const char *fallback = "unknown cuda error";
   const char *desc;
 
+  if (!CUDA_CALL_EXISTS (cuGetErrorString))
+return fallback;
+
   r = CUDA_CALL_NOCHECK (cuGetErrorString, r, &desc);
-  if (r != CUDA_SUCCESS)
-desc = "unknown cuda error";
+  if (r == CUDA_SUCCESS)
+return desc;
 
-  return desc;
+  return fallback;
 }
 
 static unsigned int instantiated_devices = 0;


[committed][libgomp, nvptx] Remove hard-coded const in nvptx_open_device

2018-08-08 Thread Tom de Vries
Hi,

CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR is defined in cuda driver
api version 6.0 and higher.

Currently nvptx_open_device uses a hard-coded constant instead.

This patch fixes that by:
- defining CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR to the hardcoded
  constant at toplevel, if not present in cuda.h, and
- using CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR in 
nvptx_open_device

Build on x86_64 with nvptx accelerator and reg-tested libgomp.

Committed to trunk.

Thanks,
- Tom

[libgomp, nvptx] Remove hard-coded const in nvptx_open_device

2018-08-08  Tom de Vries  

* plugin/plugin-nvptx.c
(CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR): Define.
(nvptx_open_device): Use
CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR.

---
 libgomp/plugin/plugin-nvptx.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index e2ea542049b0..589d6596cc2f 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -51,6 +51,7 @@
 
 #if CUDA_VERSION < 6000
 extern CUresult cuGetErrorString (CUresult, const char **);
+#define CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR 82
 #endif
 
 #define DO_PRAGMA(x) _Pragma (#x)
@@ -741,9 +742,11 @@ nvptx_open_device (int n)
  &pi, CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_BLOCK, dev);
   ptx_dev->regs_per_block = pi;
 
-  /* CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR = 82 is defined only
+  /* CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR is defined only
  in CUDA 6.0 and newer.  */
-  r = CUDA_CALL_NOCHECK (cuDeviceGetAttribute, &pi, 82, dev);
+  r = CUDA_CALL_NOCHECK (cuDeviceGetAttribute, &pi,
+CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR,
+dev);
   /* Fallback: use limit of registers per block, which is usually equal.  */
   if (r == CUDA_ERROR_INVALID_VALUE)
 pi = ptx_dev->regs_per_block;


[committed][libgomp, nvptx] Note that cuGetErrorString is in CUDA_VERSION >= 6000

2018-08-08 Thread Tom de Vries
Hi,

Cuda driver api function cuGetErrorString is available in version 6.0 and
higher.

This patch:
- removes a comment saying the declaration is not available in cuda.h 6.0
- fixes the presence test to use CUDA_VERSION < 6000
- moves the declaration to toplevel

Build on x86_64 with nvptx accelerator and reg-tested libgomp.

Committed to trunk.

Thanks,
- Tom

[libgomp, nvptx] Note that cuGetErrorString is in CUDA_VERSION >= 6000

2018-08-08  Tom de Vries  

* plugin/plugin-nvptx.c (cuda_error): Move declaration of 
cuGetErrorString ...
(cuGetErrorString): ... here.  Guard with CUDA_VERSION < 6000.

---
 libgomp/plugin/plugin-nvptx.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index 825470adce3e..e2ea542049b0 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -49,6 +49,10 @@
 #include 
 #include 
 
+#if CUDA_VERSION < 6000
+extern CUresult cuGetErrorString (CUresult, const char **);
+#endif
+
 #define DO_PRAGMA(x) _Pragma (#x)
 
 #if PLUGIN_NVPTX_DYNAMIC
@@ -156,11 +160,6 @@ init_cuda_lib (void)
 static const char *
 cuda_error (CUresult r)
 {
-#if CUDA_VERSION < 7000
-  /* Specified in documentation and present in library from at least
- 5.5.  Not declared in header file prior to 7.0.  */
-  extern CUresult cuGetErrorString (CUresult, const char **);
-#endif
   const char *desc;
 
   r = CUDA_CALL_NOCHECK (cuGetErrorString, r, &desc);


Re: [PATCH v3] Add HXT Phecda core support

2018-08-08 Thread James Greenhalgh
On Wed, Aug 08, 2018 at 05:38:09AM -0500, Hongbo Zhang wrote:
> HXT semiconductor's CPU core Phecda, as a variant of Qualcomm qdf24xx,
> reuses the same tuning structure and pipeline with it.

Thank you. This patch is still OK.

I've applied it on your behalf as r263404.

Thanks,
James

> 
> 2018-08-08  Hongbo Zhang  
> 
>   * config/aarch64/aarch64-cores.def: Add phecda core.
>   * config/aarch64/aarch64-tune.md: Regenerate.
>   * doc/invoke.texi: Add phecda core.



Re: [PATCH] libstdc++-v3: Have aligned_alloc() on Newlib

2018-08-08 Thread Ulrich Weigand
Jonathan Wakely wrote:

> Aha, so newlib was using memalign previously:
> 
> @@ -53,20 +54,24 @@ aligned_alloc (std::size_t al, std::size_t sz)
>  #else
>  extern "C" void *memalign(std::size_t boundary, std::size_t size);
>  #endif
> -#define aligned_alloc memalign

Yes, exactly ... this commit introduced the regression.

> OK, I've regressed the branches then - I'll fix that.

Thanks,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: [PATCH] libstdc++-v3: Have aligned_alloc() on Newlib

2018-08-08 Thread Jonathan Wakely

On 08/08/18 16:04 +0200, Sebastian Huber wrote:

On 08/08/18 16:01, Jonathan Wakely wrote:

On 08/08/18 15:57 +0200, Ulrich Weigand wrote:

Jonathan Wakely wrote:

On 08/08/18 10:52 +0200, Sebastian Huber wrote:

While building for Newlib, some configure checks must be hard coded.
The aligned_alloc() is supported since 2015 in Newlib.

libstdc++-v3

    PR target/85904
    * configure.ac): Define HAVE_ALIGNED_ALLOC if building for


There's a stray closing parenthesis here.


    Newlib.
    * configure: Regnerate.


Typo "Regnerate".

But the patch itself is fine - OK for trunk.

I'm ambivalent about this being backported to gcc-7 and gcc-8 branches
(gcc-6 is unaffected as it doesn't use aligned_alloc).

It's strictly speaking an ABI change, because HAVE_ALIGNED_ALLOC
affects the memory layout for allocations from operator new(size_t,
align_val_t) (in new_opa.cc) which needs to agree with the
corresponding operator delete (in del_opa.cc). Using static linking it
might be possible to create a binary that has operator new using
aligned_alloc, but operator delete expecting to do ((void**)ptr)[-1],
which would be bad.

Those operators are C++17, so "experimental", but maybe we shouldn't
make the change on release branches.


The way it is now I'm getting build failures on new SPU target
(which is newlib based):

/home/uweigand/dailybuild/spu-tc-2018-08-07/gcc-head/src/libstdc++-v3/libsupc++/new_opa.cc:58:1: 
error: 'void* aligned_alloc(std::size_t, std::size_t)' was 
declared 'extern' and later 'static' [-fpermissive]

aligned_alloc (std::size_t al, std::size_t sz)
^
In file included from 
/home/uweigand/dailybuild/spu-tc-2018-08-07/gcc-build/spu/libstdc++-v3/include/cstdlib:75:62,
    from 
/home/uweigand/dailybuild/spu-tc-2018-08-07/gcc-build/spu/libstdc++-v3/include/stdlib.h:36,
    from 
/home/uweigand/dailybuild/spu-tc-2018-08-07/gcc-head/src/libstdc++-v3/libsupc++/new_opa.cc:27:
/home/uweigand/dailybuild/spu-tc-2018-08-07/spu-toolchain/spu/include/stdlib.h:328:8: 
note: previous declaration of 'void* aligned_alloc(size_t, 
size_t)'

void * aligned_alloc(size_t, size_t) __malloc_like __alloc_align(1)
   ^

This seems to be because configure is hard-coded to assume aligned_alloc
is not available, but then the new_opc.cc file tries to define its own
version, conflicting with the newlib prototype that is actually there.

So one way or the other this needs to be fixed ...


But that's on trunk, right? Sebastian is about to fix that.

I expect the same problem exists on the branches too though.

I'm surprised nobody has reported this until now. It's been like that
since r244933 in Jan 2017.


I notices this problem today with a build of the GCC 7 branch. I am 
able to build a GCC 7.3.0.


Aha, so newlib was using memalign previously:

@@ -53,20 +54,24 @@ aligned_alloc (std::size_t al, std::size_t sz)
#else
extern "C" void *memalign(std::size_t boundary, std::size_t size);
#endif
-#define aligned_alloc memalign
-#else
+static inline void*
+aligned_alloc (std::size_t al, std::size_t sz)
+{
+#ifdef __sun
+  // Solaris 10 memalign requires that alignment is greater than or equal to
+  // the size of a word.
+  if (al < sizeof(int))
+al = sizeof(int);
+#endif
+  return memalign (al, sz);
+}
+#else // !HAVE__ALIGNED_MALLOC && !HAVE_POSIX_MEMALIGN && !HAVE_MEMALIGN
#include 
// The C library doesn't provide any aligned allocation functions, define one.
// This is a modified version of code from gcc/config/i386/gmm_malloc.h


OK, I've regressed the branches then - I'll fix that.





Re: [PATCH,nvptx] Use CUDA driver API to select default runtime launch, geometry

2018-08-08 Thread Cesar Philippidis
On 08/07/2018 06:52 AM, Cesar Philippidis wrote:

> I attached an updated version of the CUDA driver patch, although I
> haven't rebased it against your changes yet. It still needs to be tested
> against CUDA 5.5 using the systems/Nvidia's cuda.h. But I wanted to give
> you an update.
> 
> Does this patch look OK, at least after testing competes? I removed the
> tests for CUDA_ONE_CALL_MAYBE_NULL, because the newer CUDA API isn't
> supported in the older drivers.

I've finally finished testing this patch. Besides for a couple of
regressions with CUDA 5.5 in libgomp.oacc-c-c++-common/lib-75.c,
lib-76.c and lib-79.c, the results came back clean.

This patch has been tested the following ways using a K40 GPU:

  * Using GCC's cuda.h with CUDA 9.2 drivers.
  * Using cuda.h from CUDA 5.5 and Nvidia drivers 331.133 (supports CUDA
6.0) and the driver from CUDA 8.0.
  * Using cuda.h from CUDA 8.0.

As mentioned before, because GCC's cuda.h defines CUDA_VERSION as 8000,
there was a conflict with using it against CUDA 5.5, because of the
missing cuLinkAddData_v2 symbol.

Note how the usage of cuOccupancyMaxPotentialBlockSize is guarded by
checking for the version of CUDA_VERSION. I don't really like this, but
it's a necessary evil of maintaining backwards compatibility.

Is this patch OK for trunk?

Thanks,
Cesar
[nvptx] Use CUDA driver API to select default runtime launch geometry

2018-08-YY  Cesar Philippidis  

	libgomp/
	plugin/cuda/cuda.h (CUoccupancyB2DSize): New typedef.
	(cuDriverGetVersion): Declare.
	(cuOccupancyMaxPotentialBlockSizeWithFlags): Declare.
	plugin/plugin-nvptx.c (CUDA_ONE_CALL): Add entries for
	cuDriverGetVersion and cuOccupancyMaxPotentialBlockSize.
	(ptx_device): Add driver_version member.
	(nvptx_open_device): Initialize it.
	(nvptx_exec): Use cuOccupancyMaxPotentialBlockSize to set the
	default num_gangs and num_workers when the driver supports it.
---
 libgomp/plugin/cuda-lib.def   |  2 ++
 libgomp/plugin/cuda/cuda.h|  4 
 libgomp/plugin/plugin-nvptx.c | 40 +++-
 3 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/libgomp/plugin/cuda-lib.def b/libgomp/plugin/cuda-lib.def
index be8e3b3..f2433e1 100644
--- a/libgomp/plugin/cuda-lib.def
+++ b/libgomp/plugin/cuda-lib.def
@@ -2,6 +2,7 @@ CUDA_ONE_CALL (cuCtxCreate)
 CUDA_ONE_CALL (cuCtxDestroy)
 CUDA_ONE_CALL (cuCtxGetCurrent)
 CUDA_ONE_CALL (cuCtxGetDevice)
+CUDA_ONE_CALL (cuDriverGetVersion)
 CUDA_ONE_CALL (cuCtxPopCurrent)
 CUDA_ONE_CALL (cuCtxPushCurrent)
 CUDA_ONE_CALL (cuCtxSynchronize)
@@ -39,6 +40,7 @@ CUDA_ONE_CALL (cuModuleGetGlobal)
 CUDA_ONE_CALL (cuModuleLoad)
 CUDA_ONE_CALL (cuModuleLoadData)
 CUDA_ONE_CALL (cuModuleUnload)
+CUDA_ONE_CALL (cuOccupancyMaxPotentialBlockSize)
 CUDA_ONE_CALL (cuStreamCreate)
 CUDA_ONE_CALL (cuStreamDestroy)
 CUDA_ONE_CALL (cuStreamQuery)
diff --git a/libgomp/plugin/cuda/cuda.h b/libgomp/plugin/cuda/cuda.h
index 4799825..3a790e6 100644
--- a/libgomp/plugin/cuda/cuda.h
+++ b/libgomp/plugin/cuda/cuda.h
@@ -44,6 +44,7 @@ typedef void *CUevent;
 typedef void *CUfunction;
 typedef void *CUlinkState;
 typedef void *CUmodule;
+typedef size_t (*CUoccupancyB2DSize)(int);
 typedef void *CUstream;
 
 typedef enum {
@@ -123,6 +124,7 @@ CUresult cuCtxSynchronize (void);
 CUresult cuDeviceGet (CUdevice *, int);
 CUresult cuDeviceGetAttribute (int *, CUdevice_attribute, CUdevice);
 CUresult cuDeviceGetCount (int *);
+CUresult cuDriverGetVersion(int *);
 CUresult cuEventCreate (CUevent *, unsigned);
 #define cuEventDestroy cuEventDestroy_v2
 CUresult cuEventDestroy (CUevent);
@@ -170,6 +172,8 @@ CUresult cuModuleGetGlobal (CUdeviceptr *, size_t *, CUmodule, const char *);
 CUresult cuModuleLoad (CUmodule *, const char *);
 CUresult cuModuleLoadData (CUmodule *, const void *);
 CUresult cuModuleUnload (CUmodule);
+CUresult cuOccupancyMaxPotentialBlockSize(int *, int *, CUfunction,
+	  CUoccupancyB2DSize, size_t, int);
 CUresult cuStreamCreate (CUstream *, unsigned);
 #define cuStreamDestroy cuStreamDestroy_v2
 CUresult cuStreamDestroy (CUstream);
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index 825470a..b0ccf0b 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -376,6 +376,7 @@ struct ptx_device
   int max_threads_per_block;
   int max_threads_per_multiprocessor;
   int default_dims[GOMP_DIM_MAX];
+  int driver_version;
 
   struct ptx_image_data *images;  /* Images loaded on device.  */
   pthread_mutex_t image_lock; /* Lock for above list.  */
@@ -687,6 +688,7 @@ nvptx_open_device (int n)
   ptx_dev->ord = n;
   ptx_dev->dev = dev;
   ptx_dev->ctx_shared = false;
+  ptx_dev->driver_version = 0;
 
   r = CUDA_CALL_NOCHECK (cuCtxGetDevice, &ctx_dev);
   if (r != CUDA_SUCCESS && r != CUDA_ERROR_INVALID_CONTEXT)
@@ -780,6 +782,9 @@ nvptx_open_device (int n)
   for (int i = 0; i != GOMP_DIM_MAX; i++)
 ptx_dev->default_dims[i] = 0;
 
+  CUDA_CALL_ERET (NULL, cuDriverGetVersion, &pi);
+  

Re: [PATCH] libstdc++-v3: Have aligned_alloc() on Newlib

2018-08-08 Thread Sebastian Huber

On 08/08/18 16:01, Jonathan Wakely wrote:

On 08/08/18 15:57 +0200, Ulrich Weigand wrote:

Jonathan Wakely wrote:

On 08/08/18 10:52 +0200, Sebastian Huber wrote:
>While building for Newlib, some configure checks must be hard coded.
>The aligned_alloc() is supported since 2015 in Newlib.
>
>libstdc++-v3
>
>    PR target/85904
>    * configure.ac): Define HAVE_ALIGNED_ALLOC if building for

There's a stray closing parenthesis here.

>    Newlib.
>    * configure: Regnerate.

Typo "Regnerate".

But the patch itself is fine - OK for trunk.

I'm ambivalent about this being backported to gcc-7 and gcc-8 branches
(gcc-6 is unaffected as it doesn't use aligned_alloc).

It's strictly speaking an ABI change, because HAVE_ALIGNED_ALLOC
affects the memory layout for allocations from operator new(size_t,
align_val_t) (in new_opa.cc) which needs to agree with the
corresponding operator delete (in del_opa.cc). Using static linking it
might be possible to create a binary that has operator new using
aligned_alloc, but operator delete expecting to do ((void**)ptr)[-1],
which would be bad.

Those operators are C++17, so "experimental", but maybe we shouldn't
make the change on release branches.


The way it is now I'm getting build failures on new SPU target
(which is newlib based):

/home/uweigand/dailybuild/spu-tc-2018-08-07/gcc-head/src/libstdc++-v3/libsupc++/new_opa.cc:58:1: 
error: 'void* aligned_alloc(std::size_t, std::size_t)' was declared 
'extern' and later 'static' [-fpermissive]

aligned_alloc (std::size_t al, std::size_t sz)
^
In file included from 
/home/uweigand/dailybuild/spu-tc-2018-08-07/gcc-build/spu/libstdc++-v3/include/cstdlib:75:62,
    from 
/home/uweigand/dailybuild/spu-tc-2018-08-07/gcc-build/spu/libstdc++-v3/include/stdlib.h:36,
    from 
/home/uweigand/dailybuild/spu-tc-2018-08-07/gcc-head/src/libstdc++-v3/libsupc++/new_opa.cc:27:
/home/uweigand/dailybuild/spu-tc-2018-08-07/spu-toolchain/spu/include/stdlib.h:328:8: 
note: previous declaration of 'void* aligned_alloc(size_t, size_t)'

void * aligned_alloc(size_t, size_t) __malloc_like __alloc_align(1)
   ^

This seems to be because configure is hard-coded to assume aligned_alloc
is not available, but then the new_opc.cc file tries to define its own
version, conflicting with the newlib prototype that is actually there.

So one way or the other this needs to be fixed ...


But that's on trunk, right? Sebastian is about to fix that.

I expect the same problem exists on the branches too though.

I'm surprised nobody has reported this until now. It's been like that
since r244933 in Jan 2017. 


I notices this problem today with a build of the GCC 7 branch. I am able 
to build a GCC 7.3.0.


--
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP : Public key available on request.

Diese Nachricht ist keine geschäftliche Mitteilung im Sinne des EHUG.



Re: [PATCH] libstdc++-v3: Have aligned_alloc() on Newlib

2018-08-08 Thread Jonathan Wakely

On 08/08/18 15:57 +0200, Ulrich Weigand wrote:

Jonathan Wakely wrote:

On 08/08/18 10:52 +0200, Sebastian Huber wrote:
>While building for Newlib, some configure checks must be hard coded.
>The aligned_alloc() is supported since 2015 in Newlib.
>
>libstdc++-v3
>
>PR target/85904
>* configure.ac): Define HAVE_ALIGNED_ALLOC if building for

There's a stray closing parenthesis here.

>Newlib.
>* configure: Regnerate.

Typo "Regnerate".

But the patch itself is fine - OK for trunk.

I'm ambivalent about this being backported to gcc-7 and gcc-8 branches
(gcc-6 is unaffected as it doesn't use aligned_alloc).

It's strictly speaking an ABI change, because HAVE_ALIGNED_ALLOC
affects the memory layout for allocations from operator new(size_t,
align_val_t) (in new_opa.cc) which needs to agree with the
corresponding operator delete (in del_opa.cc). Using static linking it
might be possible to create a binary that has operator new using
aligned_alloc, but operator delete expecting to do ((void**)ptr)[-1],
which would be bad.

Those operators are C++17, so "experimental", but maybe we shouldn't
make the change on release branches.


The way it is now I'm getting build failures on new SPU target
(which is newlib based):

/home/uweigand/dailybuild/spu-tc-2018-08-07/gcc-head/src/libstdc++-v3/libsupc++/new_opa.cc:58:1:
 error: 'void* aligned_alloc(std::size_t, std::size_t)' was declared 'extern' 
and later 'static' [-fpermissive]
aligned_alloc (std::size_t al, std::size_t sz)
^
In file included from 
/home/uweigand/dailybuild/spu-tc-2018-08-07/gcc-build/spu/libstdc++-v3/include/cstdlib:75:62,
from 
/home/uweigand/dailybuild/spu-tc-2018-08-07/gcc-build/spu/libstdc++-v3/include/stdlib.h:36,
from 
/home/uweigand/dailybuild/spu-tc-2018-08-07/gcc-head/src/libstdc++-v3/libsupc++/new_opa.cc:27:
/home/uweigand/dailybuild/spu-tc-2018-08-07/spu-toolchain/spu/include/stdlib.h:328:8:
 note: previous declaration of 'void* aligned_alloc(size_t, size_t)'
void * aligned_alloc(size_t, size_t) __malloc_like __alloc_align(1)
   ^

This seems to be because configure is hard-coded to assume aligned_alloc
is not available, but then the new_opc.cc file tries to define its own
version, conflicting with the newlib prototype that is actually there.

So one way or the other this needs to be fixed ...


But that's on trunk, right? Sebastian is about to fix that.

I expect the same problem exists on the branches too though.

I'm surprised nobody has reported this until now. It's been like that
since r244933 in Jan 2017.

Maybe we should put the fallback implementation into a namespace to
avoid the same problem on other targets where HAVE_ALIGNED_ALLOC isn't
set correctly.





Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules

2018-08-08 Thread Paul Koning
Now I'm puzzled.

I don't see how an infinite would show up in the original expression.  I don't 
know hyperbolic functions, so I just constructed a small test program, and the 
original vs. the substitution you mention are not at all similar.

paul


> On Aug 7, 2018, at 4:42 PM, Giuliano Augusto Faulin Belinassi 
>  wrote:
> 
> That is a good question because I didn't know that such targets
> exists. Any suggestion?
> 
> 
> On Tue, Aug 7, 2018 at 5:29 PM, Paul Koning  wrote:
>> 
>> 
>>> On Aug 7, 2018, at 4:00 PM, Giuliano Augusto Faulin Belinassi 
>>>  wrote:
>>> 
>>> Related with bug 86829, but for hyperbolic trigonometric functions.
>>> This patch adds substitution rules to both sinh(tanh(x)) -> x / sqrt(1
>>> - x*x) and cosh(tanh(x)) -> 1 / sqrt(1 - x*x). Notice that the both
>>> formulas has division by 0, but it causes no harm because 1/(+0) ->
>>> +infinity, thus the math is still safe.
>> 
>> What about non-IEEE targets that don't have "infinite" in their float 
>> representation?
>> 
>>paul
>> 
>> 



Re: [PATCH] libstdc++-v3: Have aligned_alloc() on Newlib

2018-08-08 Thread Ulrich Weigand
Jonathan Wakely wrote:
> On 08/08/18 10:52 +0200, Sebastian Huber wrote:
> >While building for Newlib, some configure checks must be hard coded.
> >The aligned_alloc() is supported since 2015 in Newlib.
> >
> >libstdc++-v3
> >
> > PR target/85904
> > * configure.ac): Define HAVE_ALIGNED_ALLOC if building for
> 
> There's a stray closing parenthesis here.
> 
> > Newlib.
> > * configure: Regnerate.
> 
> Typo "Regnerate".
> 
> But the patch itself is fine - OK for trunk.
> 
> I'm ambivalent about this being backported to gcc-7 and gcc-8 branches
> (gcc-6 is unaffected as it doesn't use aligned_alloc).
> 
> It's strictly speaking an ABI change, because HAVE_ALIGNED_ALLOC
> affects the memory layout for allocations from operator new(size_t,
> align_val_t) (in new_opa.cc) which needs to agree with the
> corresponding operator delete (in del_opa.cc). Using static linking it
> might be possible to create a binary that has operator new using
> aligned_alloc, but operator delete expecting to do ((void**)ptr)[-1],
> which would be bad.
> 
> Those operators are C++17, so "experimental", but maybe we shouldn't
> make the change on release branches.

The way it is now I'm getting build failures on new SPU target
(which is newlib based):

/home/uweigand/dailybuild/spu-tc-2018-08-07/gcc-head/src/libstdc++-v3/libsupc++/new_opa.cc:58:1:
 error: 'void* aligned_alloc(std::size_t, std::size_t)' was declared 'extern' 
and later 'static' [-fpermissive]
 aligned_alloc (std::size_t al, std::size_t sz)
 ^
In file included from 
/home/uweigand/dailybuild/spu-tc-2018-08-07/gcc-build/spu/libstdc++-v3/include/cstdlib:75:62,
 from 
/home/uweigand/dailybuild/spu-tc-2018-08-07/gcc-build/spu/libstdc++-v3/include/stdlib.h:36,
 from 
/home/uweigand/dailybuild/spu-tc-2018-08-07/gcc-head/src/libstdc++-v3/libsupc++/new_opa.cc:27:
/home/uweigand/dailybuild/spu-tc-2018-08-07/spu-toolchain/spu/include/stdlib.h:328:8:
 note: previous declaration of 'void* aligned_alloc(size_t, size_t)'
 void * aligned_alloc(size_t, size_t) __malloc_like __alloc_align(1)
^

This seems to be because configure is hard-coded to assume aligned_alloc
is not available, but then the new_opc.cc file tries to define its own
version, conflicting with the newlib prototype that is actually there.

So one way or the other this needs to be fixed ...

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: [RFC,PATCH] Introduce -msdata=explicit for powerpc

2018-08-08 Thread Segher Boessenkool
On Wed, Aug 08, 2018 at 12:38:54AM -0300, Alexandre Oliva wrote:
> On Aug  7, 2018, Segher Boessenkool  wrote:
> > On Tue, Aug 07, 2018 at 02:18:59AM -0300, Alexandre Oliva wrote:
> 
> >> I saw comments, docs and init code that suggested the possibility of
> >> using r2/.sdata2 for small data, but I couldn't get code to be generated
> >> for such access, even with very old toolchains.  I'm not sure I'm just
> >> missing how this magic comes about, or whether it hasn't been available
> >> for a very long time but nobody removed the comments/docs.  Assuming I'm
> >> missing something, I put in the possibility of using r2 in the test in
> >> the patch, but I'm sure I have not exercised it to make sure I got it
> >> right.  Help?
> 
> > attribute(section("sdata2")))  works, perhaps.  Nothing is put there
> > implicitly afais.
> 
> Yeah, that's the point.  Docs say:
> 
> @item -msdata=eabi
> @opindex msdata=eabi
> On System V.4 and embedded PowerPC systems, put small initialized
> @code{const} global and static data in the @code{.sdata2} section, which
> is pointed to by register @code{r2}.  Put small initialized
> non-@code{const} global and static data in the @code{.sdata} section,
> which is pointed to by register @code{r13}.  [...]
> 
> and rs6000.c contains:
> 
> rs6000_elf_asm_init_sections (void)
> {
> [...]
>   sdata2_section
> = get_unnamed_section (SECTION_WRITE, output_section_asm_op,
>  SDATA2_SECTION_ASM_OP);
> 
> but sdata2_section seems to be unused, and I don't see anything that
> selects r2 over SMALL_DATA_REG.

So, you need to build a powerpc-eabi- toolchain, and use -msdata=eabi.
Then you get sdata2 used (via srodata in generic code), and it is accessed
via GPR2 (via the sda21 reloc and linker magic).  It is hard to trace down :-)

> >> I have not YET given this much testing.  I'm posting it so as to give
> >> ppc maintainers an opportunity to comment on the proposed approach, in
> >> hopes of getting buy-in for the idea, if not necessarily for the patch,
> >> but I welcome alternate suggestions to enable users to choose what goes
> >> in faster sdata when there's too much data for size-based assignment to
> >> place interesting variables in sdata without overflowing its range.
> 
> > There is 64kB of sdata, and the maximum size of an object to be put there
> > is 8 bytes by default.  That will require an awful lot of manual markup.
> 
> In this case, it's machine-generated code we're talking about.  IIUC one
> large-ish data set is performance critical, and another data set that
> would take them both over the edge isn't, though they're both used
> together in the same translation units.
> 
> I suppose defining a fastint C macro / Fast_Integer Ada subtype in some
> C header file or Ada package could make it a lot more appealing and easy
> to use ;-)
> 
> > Well I'm sure you try it out on a reasonably sized
> > project, and you'll find out if it is handy or not :-)
> 
> I wanted to run this design through port maintainers to check for
> objections before offering it to the interested customer.  If you tell
> me there's no fundamental objection, I will have it run through the
> customer to confirm it will indeed meet their needs.

It looks fine, not intrusive, and your use case is reasonable as well.
Shouldn't be a problem.


Segher


Re: [PATCH] libstdc++-v3: Have aligned_alloc() on Newlib

2018-08-08 Thread Sebastian Huber

On 08/08/18 13:18, Jonathan Wakely wrote:

On 08/08/18 10:52 +0200, Sebastian Huber wrote:

While building for Newlib, some configure checks must be hard coded.
The aligned_alloc() is supported since 2015 in Newlib.

libstdc++-v3

PR target/85904
* configure.ac): Define HAVE_ALIGNED_ALLOC if building for


There's a stray closing parenthesis here.


Newlib.
* configure: Regnerate.


Typo "Regnerate".

But the patch itself is fine - OK for trunk.


Oh, sorry, for all the typos.



I'm ambivalent about this being backported to gcc-7 and gcc-8 branches
(gcc-6 is unaffected as it doesn't use aligned_alloc).

It's strictly speaking an ABI change, because HAVE_ALIGNED_ALLOC
affects the memory layout for allocations from operator new(size_t,
align_val_t) (in new_opa.cc) which needs to agree with the
corresponding operator delete (in del_opa.cc). Using static linking it
might be possible to create a binary that has operator new using
aligned_alloc, but operator delete expecting to do ((void**)ptr)[-1],
which would be bad.

Those operators are C++17, so "experimental", but maybe we shouldn't
make the change on release branches.



I was able to build a GCC 7.3.0 compiler, but now the build fails on the 
GCC 7 branch.


--
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP : Public key available on request.

Diese Nachricht ist keine geschäftliche Mitteilung im Sinne des EHUG.



Re: [PATCH] Improve libstdc++ docs w.r.t newer C++ standards

2018-08-08 Thread Jonathan Wakely

On 08/08/18 14:29 +0100, Jonathan Wakely wrote:

On 31/07/18 16:03 +0100, Jonathan Wakely wrote:

Instead of repeating all the old headers for every new standard I've
changed the docs to only list the new headers for each standard.

* doc/xml/manual/test.xml: Improve documentation on writing tests for
newer standards.
* doc/xml/manual/using.xml: Document all headers for C++11 and later.
* doc/html/*: Regenerate.


This fixes a missing header and typo in the LFTS table.


And another little tweak to the table markup.

Committed to trunk.


commit cf1fda4055fae2fa4f4f4a3f3153c3672438eaf0
Author: Jonathan Wakely 
Date:   Wed Aug 8 14:36:35 2018 +0100

Fix Docbook markup for table entry

* doc/xml/manual/using.xml: Fix markup for empty table entry.
* doc/html/*: Regenerate.

diff --git a/libstdc++-v3/doc/xml/manual/using.xml b/libstdc++-v3/doc/xml/manual/using.xml
index 7512b39afaf..63031c8a86d 100644
--- a/libstdc++-v3/doc/xml/manual/using.xml
+++ b/libstdc++-v3/doc/xml/manual/using.xml
@@ -574,7 +574,7 @@ compilation errors, but will not define anything.
 experimental/unordered_set
 experimental/utility
 experimental/vector
-
+
 
 
 


Re: dejagnu version update?

2018-08-08 Thread Richard Earnshaw (lists)
On 08/08/18 12:17, Bernhard Reutner-Fischer wrote:
> On 7 August 2018 18:34:30 CEST, Segher Boessenkool 
>  wrote:
>> On Mon, Aug 06, 2018 at 08:25:49AM -0700, Mike Stump wrote:
>>> Since g++ already requires 1.5.3, it make no sense to bump to
>> anything older that 1.5.3, so let's bump to 1.5.3.  Those packaging
>> systems and OSes that wanted to update by now, have had their chance to
>> update.  Those that punt until we bump the requirement, well, they will
>> now have to bump.  :-)
>>
>> "g++ requires it"?  In what way?  I haven't seen any issues with older
>> dejagnu versions.
> 
> I think 
> http://git.savannah.gnu.org/gitweb/?p=dejagnu.git;a=commit;h=5256bd82343000c76bc0e48139003f90b6184347
> 
>>
>>> Ok to update to 1.5.3.
>>
>> 1.5.3 is only three years old, and not all distros carry it.  This is
>> rather aggressive...
> 
> How come?
> If one wants to develop on a distro that is notoriously outdated then you 
> have to obtain the missing pieces yourself. I wouldn't call three years 
> "aggressive".
> 

I would.

IT departments don't upgrade every machine each time a new distribution
comes out.  They expect to install one version (plus the security
updates, of course) on that machine for its lifetime.  Assuming new
distros are released every couple of years (quite aggressive) and that
IT groups also start installing the new version immediately it is
released on those new machines (extremely aggressive), you've got a 5
year life-cycle for software if you work on the basis that a machine is
expected to last three years.

So in practice, I think 6 years is more like that timeframe that needs
to be considered for these things and even that is quite aggressive.
Some machines have to run older versions of the OS simply because other
software running on them *has* to use an older OS release.

R.


[PATCH] Relax SUPPORTS_STACK_ALIGNMENT with !crtl->stack_realign_tried

2018-08-08 Thread H.J. Lu
Assert for SUPPORTS_STACK_ALIGNMENT was added for dynamic stack
alignment.  At the time, arg_pointer_rtx would only be eliminated
by either hard_frame_pointer_rtx or stack_pointer_rtx only when
dynamic stack alignment is supported.  With

commit cd557ff63f388ad27c376d0a225e74d3594a6f9d
Author: hjl 
Date:   Thu Aug 10 15:29:05 2017 +

i386: Don't use frame pointer without stack access

When there is no stack access, there is no need to use frame pointer
even if -fno-omit-frame-pointer is used and caller's frame pointer is
unchanged.

this can happen when there is no dynamic stack alignment.  This patch
relaxes SUPPORTS_STACK_ALIGNMENT with !crtl->stack_realign_tried to
allow arg_pointer_rtx to be eliminated by either hard_frame_pointer_rtx
or stack_pointer_rtx when there is no dynamic stack alignment at all.

gcc/

PR debug/86593
* dwarf2out.c (based_loc_descr): Replace SUPPORTS_STACK_ALIGNMENT
with (SUPPORTS_STACK_ALIGNMENT || !crtl->stack_realign_tried).
(compute_frame_pointer_to_fb_displacement): Likewise.

gcc/testsuite/

PR debug/86593
* g++.dg/pr86593.C: New test.
---
 gcc/dwarf2out.c|  6 --
 gcc/testsuite/g++.dg/pr86593.C | 11 +++
 2 files changed, 15 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/pr86593.C

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index b67481dd2db..6ecdc4562d0 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -14304,7 +14304,8 @@ based_loc_descr (rtx reg, poly_int64 offset,
   if (elim != reg)
{
  elim = strip_offset_and_add (elim, &offset);
- gcc_assert ((SUPPORTS_STACK_ALIGNMENT
+ gcc_assert (((SUPPORTS_STACK_ALIGNMENT
+   || !crtl->stack_realign_tried)
   && (elim == hard_frame_pointer_rtx
   || elim == stack_pointer_rtx))
  || elim == (frame_pointer_needed
@@ -20492,7 +20493,8 @@ compute_frame_pointer_to_fb_displacement (poly_int64 
offset)
  this, assume that while we cannot provide a proper value for
  frame_pointer_fb_offset, we won't need one either.  */
   frame_pointer_fb_offset_valid
-= ((SUPPORTS_STACK_ALIGNMENT
+= (((SUPPORTS_STACK_ALIGNMENT
+|| !crtl->stack_realign_tried)
&& (elim == hard_frame_pointer_rtx
|| elim == stack_pointer_rtx))
|| elim == (frame_pointer_needed
diff --git a/gcc/testsuite/g++.dg/pr86593.C b/gcc/testsuite/g++.dg/pr86593.C
new file mode 100644
index 000..f4de0c1166a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr86593.C
@@ -0,0 +1,11 @@
+// { dg-options "-O -g -fno-omit-frame-pointer" }
+
+struct Foo
+{
+int bar(int a, int b, int c, int i1, int i2, int i3, int d);
+};
+
+int Foo::bar(int a, int b, int c, int i1, int i2, int i3, int d)
+{
+  return 0;
+}
-- 
2.17.1



Re: [PATCH 2/2] S/390: Remove TARGET_CPU_ZARCH

2018-08-08 Thread Ulrich Weigand
Ilya Leoshkevich wrote:

> TARGET_CPU_ZARCH allowed to distinguish between g5/g6 and newer
> machines.  Since the former are now gone, we can assume that
> TARGET_CPU_ZARCH is always true.  As a side-effect, branch splitting
> is now completely gone.  Some parts of literal pool splitting are
> also gone, but it's still there: we need to support it because
> floating point and vector instructions still cannot use relative
> addressing.

Great to see this finally go!

B.t.w. the literal pool handling can be simplified further now.
The current code was made complicated due to the potential of
interaction between branch splitting and literal pool splitting,
see the long comment:

 However, the two problems are interdependent: splitting the
 literal pool can move a branch further away from its target,
 causing the 64K limit to overflow, and on the other hand,
 replacing a PC-relative branch by an absolute branch means
 we need to put the branch target address into the literal
 pool, possibly causing it to overflow.

 So, we loop trying to fix up both problems until we manage
 to satisfy both conditions at the same time.  [...]

Since branch splitting is gone, this whole logic is superfluous;
the loop will never execute more than once.  (It may not be
necessary any longer to split the logic into the various
chunkify_start/finish routines either ...)

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: [PATCH] Improve libstdc++ docs w.r.t newer C++ standards

2018-08-08 Thread Jonathan Wakely

On 31/07/18 16:03 +0100, Jonathan Wakely wrote:

Instead of repeating all the old headers for every new standard I've
changed the docs to only list the new headers for each standard.

* doc/xml/manual/test.xml: Improve documentation on writing tests for
newer standards.
* doc/xml/manual/using.xml: Document all headers for C++11 and later.
* doc/html/*: Regenerate.


This fixes a missing header and typo in the LFTS table.

Committed to trunk.

commit 6d6f1e9d3323eb538bae7d5dc70f67f38db854d4
Author: Jonathan Wakely 
Date:   Wed Aug 8 14:26:47 2018 +0100

Add missing  header to docs

* doc/xml/manual/using.xml: Add missing header to table and fix typo.
* doc/html/*: Regenerate.

diff --git a/libstdc++-v3/doc/xml/manual/using.xml b/libstdc++-v3/doc/xml/manual/using.xml
index a5f2a2d074d..7512b39afaf 100644
--- a/libstdc++-v3/doc/xml/manual/using.xml
+++ b/libstdc++-v3/doc/xml/manual/using.xml
@@ -551,25 +551,26 @@ compilation errors, but will not define anything.
 
 experimental/memory
 experimental/memory_resource
+experimental/numeric
 experimental/optional
 experimental/propagate_const
-experimental/random
 
 
+experimental/random
 experimental/ratio
 experimental/regex
 experimental/set
 experimental/source_location
-experimental/string
 
 
+experimental/string
 experimental/string_view
-experimental/ssytem_error
+experimental/system_error
 experimental/tuple
 experimental/type_traits
-experimental/unordered_map
 
 
+experimental/unordered_map
 experimental/unordered_set
 experimental/utility
 experimental/vector


Re: [PATCH] PR libstdc++/86597 directory_entry observers should clear error_code

2018-08-08 Thread Jonathan Wakely

On 08/08/18 14:08 +0100, Jonathan Wakely wrote:

PR libstdc++/86597
* include/bits/fs_dir.h (directory_entry::_M_file_type(error_code&)):
Clear error_code when cached type is used.
* testsuite/27_io/filesystem/directory_entry/86597.cc: New test.

Committed to trunk, gcc-8 backport to follow.


Oops, wrong patch, this is the right one.


commit 40da21dc3defc7b09ed7783bb34bdf21b2cd0b9c
Author: Jonathan Wakely 
Date:   Wed Aug 8 13:46:06 2018 +0100

PR libstdc++/86597 directory_entry observers should clear error_code

PR libstdc++/86597
* include/bits/fs_dir.h (directory_entry::_M_file_type(error_code&)):
Clear error_code when cached type is used.
* testsuite/27_io/filesystem/directory_entry/86597.cc: New test.

diff --git a/libstdc++-v3/include/bits/fs_dir.h b/libstdc++-v3/include/bits/fs_dir.h
index cf7a3c29376..9ee1cb66b61 100644
--- a/libstdc++-v3/include/bits/fs_dir.h
+++ b/libstdc++-v3/include/bits/fs_dir.h
@@ -318,7 +318,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 _M_file_type(error_code& __ec) const noexcept
 {
   if (_M_type != file_type::none && _M_type != file_type::symlink)
-	return _M_type;
+	{
+	  __ec.clear();
+	  return _M_type;
+	}
   return status(__ec).type();
 }
 
diff --git a/libstdc++-v3/testsuite/27_io/filesystem/directory_entry/86597.cc b/libstdc++-v3/testsuite/27_io/filesystem/directory_entry/86597.cc
new file mode 100644
index 000..6ca635e16e3
--- /dev/null
+++ b/libstdc++-v3/testsuite/27_io/filesystem/directory_entry/86597.cc
@@ -0,0 +1,74 @@
+// Copyright (C) 2018 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-options "-std=gnu++17 -lstdc++fs" }
+// { dg-do run { target c++17 } }
+// { dg-require-filesystem-ts "" }
+
+#include 
+#include 
+#include 
+#include 
+
+// PR libstdc++/86597
+
+void
+test01()
+{
+  const auto bad_ec = make_error_code(std::errc::address_in_use);
+  std::error_code ec;
+  std::filesystem::directory_entry ent(".");
+
+  ec = bad_ec;
+  VERIFY( ent.exists(ec) );
+  VERIFY( !ec );
+
+  ec = bad_ec;
+  VERIFY( ent.is_directory(ec) );
+  VERIFY( !ec );
+
+  ec = bad_ec;
+  VERIFY( !ent.is_regular_file(ec) );
+  VERIFY( !ec );
+}
+
+void
+test02()
+{
+  const auto bad_ec = make_error_code(std::errc::address_in_use);
+  std::error_code ec;
+  std::filesystem::directory_entry ent(__gnu_test::nonexistent_path());
+
+  ec = bad_ec;
+  VERIFY( !ent.exists(ec) );
+  VERIFY( !ec );
+
+  ec = bad_ec;
+  VERIFY( !ent.is_directory(ec) );
+  VERIFY( !ec );
+
+  ec = bad_ec;
+  VERIFY( !ent.is_regular_file(ec) );
+  VERIFY( !ec );
+}
+
+int
+main()
+{
+  test01();
+  test02();
+}


[PATCH] PR libstdc++/86597 directory_entry observers should clear error_code

2018-08-08 Thread Jonathan Wakely

PR libstdc++/86597
* include/bits/fs_dir.h (directory_entry::_M_file_type(error_code&)):
Clear error_code when cached type is used.
* testsuite/27_io/filesystem/directory_entry/86597.cc: New test.

Committed to trunk, gcc-8 backport to follow.

commit 5849d364a0f81686e70b2a277bf2517fdd1bc968
Author: Jonathan Wakely 
Date:   Wed Aug 8 13:46:06 2018 +0100

PR libstdc++/86597 directory_entry observers should clear error_code

PR libstdc++/86597
* include/bits/fs_dir.h 
(directory_entry::_M_file_type(error_code&)):
Clear error_code when cached type is used.
* testsuite/27_io/filesystem/directory_entry/86597.cc: New test.

diff --git a/libstdc++-v3/include/bits/fs_dir.h 
b/libstdc++-v3/include/bits/fs_dir.h
index cf7a3c29376..9ee1cb66b61 100644
--- a/libstdc++-v3/include/bits/fs_dir.h
+++ b/libstdc++-v3/include/bits/fs_dir.h
@@ -318,7 +318,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 _M_file_type(error_code& __ec) const noexcept
 {
   if (_M_type != file_type::none && _M_type != file_type::symlink)
-   return _M_type;
+   {
+ __ec.clear();
+ return _M_type;
+   }
   return status(__ec).type();
 }
 
diff --git a/libstdc++-v3/testsuite/27_io/filesystem/directory_entry/86597.cc 
b/libstdc++-v3/testsuite/27_io/filesystem/directory_entry/86597.cc
new file mode 100644
index 000..3b2acbe5842
--- /dev/null
+++ b/libstdc++-v3/testsuite/27_io/filesystem/directory_entry/86597.cc
@@ -0,0 +1,52 @@
+// Copyright (C) 2018 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-options "-std=gnu++17 -lstdc++fs" }
+// { dg-do run { target c++17 } }
+// { dg-require-filesystem-ts "" }
+
+#include 
+#include 
+#include 
+
+// PR libstdc++/86597
+
+void
+test01()
+{
+  const auto bad_ec = make_error_code(std::errc::address_in_use);
+  std::error_code ec;
+  std::filesystem::directory_entry ent(".");
+
+  ec = bad_ec;
+  VERIFY( ent.exists(ec) );
+  VERIFY( !ec );
+
+  ec = bad_ec;
+  VERIFY( ent.is_directory(ec) );
+  VERIFY( !ec );
+
+  ec = bad_ec;
+  VERIFY( !ent.is_regular_file(ec) );
+  VERIFY( !ec );
+}
+
+int
+main()
+{
+  test01();
+}


[PATCH] line-map include-from representation

2018-08-08 Thread Nathan Sidwell
This patch changes the way line-maps hold the included-from information. 
 Currently this is an index to another (earlier) line-map, and relies 
on the fact that #include cause the termination of the current map and 
emission of a new map.  It's then possible to determine the location of 
the include directive as the last line of the just terminated line map 
(by examining the first location of the next map).


With modules, I'd like to show the import path in a similar way.  But 
importing a module doesn't (necessarily) cause termination of a line map 
-- the imported module's 'imported-from' location is in the middle of 
some other map.  Here's what that looks like:

In file of module bob,
  imported at loc-2_b.C:5,
of module stuart,
  imported at loc-2_d.C:2:
loc-2_a.C:5:18: note:   initializing argument 1 of 'int frob(int*)'

[included-from lines may appear in the middle of that stack too]

Thus, this patch uses a source_location to represent the included from 
location.  No data size changes, as old and new forms are really ints. 
IMHO it makes the interface somewhat cleaner, as we stop exposing the 
map indexes to the library users.


[I noticed that a piece of this patch already escaped, we were using 
LAST_SOURCE_COLUMN in diagnostic_report_current_module, but I'd 
inadvertently changed that in the prepatch that was intended to prepare 
the ground for this one.  Sorry.]


booted & tested on x86_64-linux, ok?

nathan
--
Nathan Sidwell
2018-08-08  Nathan Sidwell  

	Make linemap::included_at a location
	libcpp/
	* include/line-map.h (struct line_map_ordinary): Replace
	included_from map index with included_at source_location.
	(ORDINARY_MAP_INCLUDER_FILE_INDEX): Delete.
	(LAST_SOURCE_LINE_LOCATION): Move to line-map.c.
	(LAST_SOURCE_COLUMN): Delete.
	(INCLUDED_AT): New.
	(linemap_included_at): Declare.
	(MAIN_FILE_P): Adjust.
	* line-map.c (linemap_included_at): New.
	(lonemap_check_files_exited): Use linemap_included_at.
	(LAST_SOURCE_LINE_LOCATION): Made internal from header file.
	(linemap_add): Adjust inclusion setting.
	(linemap_dump, linemap_dump_location): Adjust.
	* directives.c (do_linemarker): Use linemap_included_at.
	gcc/
	* diagnostic.c (diagnostic_report_current_module): Use INCLUDED_AT
	& linemap_included_at.
	gcc/c-family/
	* c-common.c (try_to_locate_new_include_inertion_point): Use
	linemap_included_at.
	* c-lex.c (fe_file_change): Use INCLUDED_AT.
	* c-ppoutput.c (pp_file_change): Likewise.
	gcc/fortran/
	* cpp.c (cb_file_change): Use INCLUDED_AT.

Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c	(revision 263365)
+++ gcc/c-family/c-common.c	(working copy)
@@ -8413,8 +8413,8 @@ try_to_locate_new_include_insertion_poin
   const line_map_ordinary *ord_map
 	= LINEMAPS_ORDINARY_MAP_AT (line_table, i);
 
-  const line_map_ordinary *from = INCLUDED_FROM (line_table, ord_map);
-  if (from)
+  if (const line_map_ordinary *from
+	  = linemap_included_at (line_table, ord_map))
 	if (from->to_file == file)
 	  {
 	last_include_ord_map = from;
Index: gcc/c-family/c-lex.c
===
--- gcc/c-family/c-lex.c	(revision 263365)
+++ gcc/c-family/c-lex.c	(working copy)
@@ -199,7 +199,7 @@ fe_file_change (const line_map_ordinary
 	 we already did in compile_file.  */
   if (!MAIN_FILE_P (new_map))
 	{
-	  unsigned int included_at = LAST_SOURCE_LINE_LOCATION (new_map - 1);
+	  location_t included_at = INCLUDED_AT (new_map);
 	  int line = 0;
 	  if (included_at > BUILTINS_LOCATION)
 	line = SOURCE_LINE (new_map - 1, included_at);
Index: gcc/c-family/c-ppoutput.c
===
--- gcc/c-family/c-ppoutput.c	(revision 263365)
+++ gcc/c-family/c-ppoutput.c	(working copy)
@@ -663,11 +663,9 @@ pp_file_change (const line_map_ordinary
 	  /* Bring current file to correct line when entering a new file.  */
 	  if (map->reason == LC_ENTER)
 	{
-	  const line_map_ordinary *from = INCLUDED_FROM (line_table, map);
-	  maybe_print_line (LAST_SOURCE_LINE_LOCATION (from));
+	  maybe_print_line (INCLUDED_AT (map));
+	  flags = " 1";
 	}
-	  if (map->reason == LC_ENTER)
-	flags = " 1";
 	  else if (map->reason == LC_LEAVE)
 	flags = " 2";
 	  print_line (map->start_location, flags);
Index: gcc/diagnostic.c
===
--- gcc/diagnostic.c	(revision 263365)
+++ gcc/diagnostic.c	(working copy)
@@ -590,9 +590,10 @@ diagnostic_report_current_module (diagno
 	  bool first = true;
 	  do
 	{
-	  map = INCLUDED_FROM (line_table, map);
+	  where = INCLUDED_AT (map);
+	  map = linemap_included_at (line_table, map);
 	  const char *line_col
-		= maybe_line_and_column (LAST_SOURCE_LINE (map),
+		= maybe_line_and_column (SOURCE_LINE (map, where),
 	 first && context->show_column
 			

[Committed] S/390: Fix PR85295

2018-08-08 Thread Andreas Krebbel
Bootstrapped and regression tested on s390x with z900 and z13 default.

gcc/ChangeLog:

2018-08-08  Andreas Krebbel  

PR target/85295
* config/s390/constraints.md ("NxHD0", "NxSD0"): New constraint
definitions.
* config/s390/s390.md ("movti"): Add more alternatives for
constant to GPR copies.

gcc/testsuite/ChangeLog:

2018-08-08  Andreas Krebbel  

PR target/85295
* gcc.target/s390/TI-constants-lra.c: New testcase.
* gcc.target/s390/TI-constants-nolra.c: New testcase.

diff --git a/gcc/config/s390/constraints.md b/gcc/config/s390/constraints.md
index b18622d8dac..b8ba8510096 100644
--- a/gcc/config/s390/constraints.md
+++ b/gcc/config/s390/constraints.md
@@ -51,7 +51,7 @@
 ;;M -- Constant integer with a value of 0x7fff.
 ;;N -- Multiple letter constraint followed by 4 parameter letters.
 ;; 0..9,x:  number of the part counting from most to least significant
-;; H,Q: mode of the part
+;; S,H,Q:   mode of the part
 ;; D,S,H:   mode of the containing operand
 ;; 0,F: value of the other parts (F - all bits set)
 ;; --
@@ -204,7 +204,7 @@
 
 ;;N -- Multiple letter constraint followed by 4 parameter letters.
 ;; 0..9,x:  number of the part counting from most to least significant
-;; H,Q: mode of the part
+;; S,H,Q:   mode of the part
 ;; D,S,H:   mode of the containing operand
 ;; 0,F: value of the other parts (F = all bits set)
 ;;
@@ -226,6 +226,18 @@
(match_test "s390_N_constraint_str (\"xQS0\", ival)")))
 
 
+(define_constraint "NxHD0"
+  "@internal"
+   (and (match_code "const_int")
+(match_test "s390_N_constraint_str (\"xHD0\", ival)")))
+
+
+(define_constraint "NxSD0"
+  "@internal"
+   (and (match_code "const_int")
+(match_test "s390_N_constraint_str (\"xSD0\", ival)")))
+
+
 (define_constraint "NxQD0"
   "@internal"
(and (match_code "const_int")
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index c4d391bc9b5..1296e26e536 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -1516,8 +1516,8 @@
 ; FIXME: More constants are possible by enabling jxx, jyy constraints
 ; for TImode (use double-int for the calculations)
 (define_insn "movti"
-  [(set (match_operand:TI 0 "nonimmediate_operand" "=d,S,v,  v,  v,v,d,v,R, 
d,o")
-(match_operand:TI 1 "general_operand"  " 
S,d,v,j00,jm1,d,v,R,v,dT,d"))]
+  [(set (match_operand:TI 0 "nonimmediate_operand" "=d,S,v,  v,  v,v,d,v,R,d,  
  d, d,d, d,o")
+(match_operand:TI 1 "general_operand"  " 
S,d,v,j00,jm1,d,v,R,v,K,NxHD0,Os,NxSD0,dT,d"))]
   "TARGET_ZARCH"
   "@
lmg\t%0,%N0,%S1
@@ -1530,10 +1530,14 @@
vl\t%v0,%1
vst\t%v1,%0
#
+   #
+   #
+   #
+   #
#"
-  [(set_attr "op_type" "RSY,RSY,VRR,VRI,VRI,VRR,*,VRX,VRX,*,*")
-   (set_attr "type" "lm,stm,*,*,*,*,*,*,*,*,*")
-   (set_attr "cpu_facility" "*,*,vx,vx,vx,vx,vx,vx,vx,*,*")])
+  [(set_attr "op_type" "RSY,RSY,VRR,VRI,VRI,VRR,*,VRX,VRX,*,*,*,*,*,*")
+   (set_attr "type" "lm,stm,*,*,*,*,*,*,*,*,*,*,*,*,*")
+   (set_attr "cpu_facility" "*,*,vx,vx,vx,vx,vx,vx,vx,*,*,*,extimm,*,*")])
 
 (define_split
   [(set (match_operand:TI 0 "nonimmediate_operand" "")
diff --git a/gcc/testsuite/gcc.target/s390/TI-constants-lra.c 
b/gcc/testsuite/gcc.target/s390/TI-constants-lra.c
new file mode 100644
index 000..cc52a62182b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/TI-constants-lra.c
@@ -0,0 +1,47 @@
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O3" } */
+
+/* 2x lghi */
+__int128 a() {
+  return 0;
+}
+
+/* 2x lghi */
+__int128 b() {
+  return -1;
+}
+
+/* 2x lghi */
+__int128 c() {
+  return -2;
+}
+
+/* lghi + llilh */
+__int128 d() {
+  return 16000 << 16;
+}
+
+/* lghi + llihf */
+__int128 e() {
+  return (unsigned long long)8 << 32;
+}
+
+/* lghi + llihf */
+__int128 f() {
+  return (unsigned __int128)8 << 96;
+}
+
+/* llihf + llihf - this is handled via movti_bigconst pattern */
+__int128 g() {
+  return ((unsigned __int128)8 << 96) | ((unsigned __int128)8 << 32);
+}
+
+/* Literal pool */
+__int128 h() {
+  return ((unsigned __int128)8 << 32) | 1;
+}
+
+/* Literal pool */
+__int128 i() {
+  return (((unsigned __int128)8 << 32) | 1) << 64;
+}
diff --git a/gcc/testsuite/gcc.target/s390/TI-constants-nolra.c 
b/gcc/testsuite/gcc.target/s390/TI-constants-nolra.c
new file mode 100644
index 000..b9948fc4aa5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/TI-constants-nolra.c
@@ -0,0 +1,47 @@
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O3 -mno-lra" } */
+
+/* 2x lghi */
+__int128 a() {
+  return 0;
+}
+
+/* 2x lghi */
+__int128 b() {
+  return -1;
+}
+
+/* 2x lghi */
+__int128 c() {
+  return -2;
+}
+
+/* lghi + llilh */
+__int128 d() {
+  return 16000 << 16;
+}
+
+/* lghi + llihf */
+__int128 e() {
+  return (unsigned long long)8 << 32;
+}
+
+/* lghi + llihf */

Re: [PATCH 1/2] S/390: Remove support for g5 and g6 machines

2018-08-08 Thread Andreas Krebbel
On 08/07/2018 10:13 AM, Ilya Leoshkevich wrote:
> g5 and g6 were deprecated since gcc 6.1.0 (commit 3bd8520f).
> 
> gcc/ChangeLog:
> 
> 2018-08-02  Ilya Leoshkevich  
> 
>   * common/config/s390/s390-common.c (processor_flags_table):
> Remove flags.
>   * config.gcc: Remove with_arch/with_tune support.
>   * config/s390/2064.md: Remove cpu attribute comparisons.
>   * config/s390/driver-native.c (s390_host_detect_local_cpu):
> Remove MTN.
>   * config/s390/linux.h (ASM_SPEC):
> Remove -march support.
>   * config/s390/s390-c.c (s390_cpu_cpp_builtins_internal):
> Use a table to get an arch level.
>   * config/s390/s390-opts.h (enum processor_type):
> Remove enum values.
>   * config/s390/s390.c
> (processor_table): Remove entries, add arch_level values.
> (s390_issue_rate): Remove cases.
> (s390_option_override): Adjust
> s390_option_override_internal() call.
>   (s390_option_override_internal): Remove deprecation warning.
> (s390_valid_target_attribute_tree): Adjust
> s390_option_override_internal() call.
>   * config/s390/s390.h (struct s390_processor):
> Share with s390-c.c, add arch_level field.
>   * config/s390/s390.md:
> Remove occurrences in cpu attribute.
>   * config/s390/s390.opt: Remove -march/-mtune support.
>   * config/s390/tpf.h (ASM_SPEC): Remove -march support.
>   * doc/invoke.texi: Remove deprecation warning.
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-08-02  Ilya Leoshkevich  
> 
>   * gcc.target/s390/hotpatch-8.c: Remove.
>   * gcc.target/s390/hotpatch-9.c: Remove.
>   * gcc.target/s390/mnop-mcount-m31-fpic.c: Remove.
>   * gcc.target/s390/mnop-mcount-m31.c: Remove.

Applied. Thanks!



Re: [PATCH 2/2] S/390: Remove TARGET_CPU_ZARCH

2018-08-08 Thread Andreas Krebbel
On 08/07/2018 10:13 AM, Ilya Leoshkevich wrote:
> TARGET_CPU_ZARCH allowed to distinguish between g5/g6 and newer
> machines.  Since the former are now gone, we can assume that
> TARGET_CPU_ZARCH is always true.  As a side-effect, branch splitting
> is now completely gone.  Some parts of literal pool splitting are
> also gone, but it's still there: we need to support it because
> floating point and vector instructions still cannot use relative
> addressing.
> 
> gcc/ChangeLog:
> 
> 2018-08-07  Ilya Leoshkevich  
> 
>   * config/s390/s390.c (s390_loadrelative_operand_p):
> Remove TARGET_CPU_ZARCH usages.
>   (s390_rtx_costs): Likewise.
>   (s390_legitimate_constant_p): Likewise.
>   (s390_cannot_force_const_mem): Likewise.
>   (legitimate_reload_constant_p): Likewise.
>   (s390_preferred_reload_class): Likewise.
>   (legitimize_pic_address): Likewise.
>   (legitimize_tls_address): Likewise.
>   (s390_split_branches): Removed.
>   (s390_add_execute): Removed.
>   (s390_dump_pool): Remove TARGET_CPU_ZARCH usages.
>   (s390_mainpool_start): Likewise.
>   (s390_mainpool_finish): Likewise.
>   (s390_mainpool_cancel): Removed.
>   (s390_chunkify_start): Remove TARGET_CPU_ZARCH usages.
>   (s390_chunkify_cancel): Likewise.
>   (s390_return_addr_rtx): Likewise.
>   (s390_register_info): Remove split_branches_pending_p uages.
>   (s390_optimize_register_info): Likewise.
>   (s390_init_frame_layout): Remove TARGET_CPU_ZARCH and
> split_branches_pending_p usages.
>   (s390_can_eliminate): Remove TARGET_CPU_ZARCH usages.
>   (s390_load_got): Likewise.
>   (s390_expand_split_stack_prologue): Likewise.
>   (output_asm_nops): Likewise.
>   (s390_function_profiler): Likewise.
>   (s390_emit_call): Likewise.
>   (s390_conditional_register_usage): Likewise.
>   (s390_optimize_prologue): Likewise.
>   (s390_reorg): Remove TARGET_CPU_ZARCH and
> split_branches_pending_p usages.
>   (s390_option_override_internal): Remove TARGET_CPU_ZARCH
> usages.
>   (s390_output_indirect_thunk_function): Likewise.
>   * config/s390/s390.h (TARGET_CPU_ZARCH): Removed.
>   (TARGET_CPU_ZARCH_P): Removed.
>   (struct machine_function): Remove split_branches_pending_p.
>   * config/s390/s390.md: Remove TARGET_CPU_ZARCH usages.

Applied. Thanks!

Andreas



Re: [PATCH] libstdc++-v3: Have aligned_alloc() on Newlib

2018-08-08 Thread Jonathan Wakely

On 08/08/18 10:52 +0200, Sebastian Huber wrote:

While building for Newlib, some configure checks must be hard coded.
The aligned_alloc() is supported since 2015 in Newlib.

libstdc++-v3

PR target/85904
* configure.ac): Define HAVE_ALIGNED_ALLOC if building for


There's a stray closing parenthesis here.


Newlib.
* configure: Regnerate.


Typo "Regnerate".

But the patch itself is fine - OK for trunk.

I'm ambivalent about this being backported to gcc-7 and gcc-8 branches
(gcc-6 is unaffected as it doesn't use aligned_alloc).

It's strictly speaking an ABI change, because HAVE_ALIGNED_ALLOC
affects the memory layout for allocations from operator new(size_t,
align_val_t) (in new_opa.cc) which needs to agree with the
corresponding operator delete (in del_opa.cc). Using static linking it
might be possible to create a binary that has operator new using
aligned_alloc, but operator delete expecting to do ((void**)ptr)[-1],
which would be bad.

Those operators are C++17, so "experimental", but maybe we shouldn't
make the change on release branches.




Re: dejagnu version update?

2018-08-08 Thread Bernhard Reutner-Fischer
On 7 August 2018 18:34:30 CEST, Segher Boessenkool  
wrote:
>On Mon, Aug 06, 2018 at 08:25:49AM -0700, Mike Stump wrote:
>> Since g++ already requires 1.5.3, it make no sense to bump to
>anything older that 1.5.3, so let's bump to 1.5.3.  Those packaging
>systems and OSes that wanted to update by now, have had their chance to
>update.  Those that punt until we bump the requirement, well, they will
>now have to bump.  :-)
>
>"g++ requires it"?  In what way?  I haven't seen any issues with older
>dejagnu versions.

I think 
http://git.savannah.gnu.org/gitweb/?p=dejagnu.git;a=commit;h=5256bd82343000c76bc0e48139003f90b6184347

>
>> Ok to update to 1.5.3.
>
>1.5.3 is only three years old, and not all distros carry it.  This is
>rather aggressive...

How come?
If one wants to develop on a distro that is notoriously outdated then you have 
to obtain the missing pieces yourself. I wouldn't call three years "aggressive".


Re: [PATCH] convert braced initializers to strings (PR 71625)

2018-08-08 Thread Jason Merrill
On Wed, Aug 8, 2018 at 9:04 AM, Martin Sebor  wrote:
> On 08/07/2018 02:57 AM, Jason Merrill wrote:
>>
>> On Wed, Aug 1, 2018 at 12:49 AM, Martin Sebor  wrote:
>>>
>>> On 07/31/2018 07:38 AM, Jason Merrill wrote:


 On Tue, Jul 31, 2018 at 9:51 AM, Martin Sebor  wrote:
>
>
> The middle-end contains code to determine the lengths of constant
> character arrays initialized by string literals.  The code is used
> in a number of optimizations and warnings.
>
> However, the code is unable to deal with constant arrays initialized
> using the braced initializer syntax, as in
>
>   const char a[] = { '1', '2', '\0' };
>
> The attached patch extends the C and C++ front-ends to convert such
> initializers into a STRING_CST form.
>
> The goal of this work is to both enable existing optimizations for
> such arrays, and to help detect bugs due to using non-nul terminated
> arrays where nul-terminated strings are expected.  The latter is
> an extension of the GCC 8 _Wstringop-overflow and
> -Wstringop-truncation warnings that help detect or prevent reading
> past the end of dynamically created character arrays.  Future work
> includes detecting potential past-the-end reads from uninitialized
> local character arrays.



>   && TYPE_MAIN_VARIANT (TREE_TYPE (valtype)) == char_type_node)



 Why? Don't we want this for other character types as well?
>>>
>>>
>>> It suppresses narrowing warnings for things like
>>>
>>>   signed char a[] = { 0xff };
>>>
>>> (there are a couple of tests that exercise this).
>>
>>
>> Why is plain char different in this respect?  Presumably one of
>>
>> char a[] = { -1 };
>> char b[] = { 0xff };
>>
>> should give the same narrowing warning, depending on whether char is
>> signed.
>
>
> Right.  I've added more tests to verify that it does.
>
>>> At the same time, STRING_CST is supposed to be able to represent
>>> strings of any integer type so there should be a way to make it
>>> work.  On the flip side, recent discussions of changes in this
>>> area suggest there may be bugs in the wide character handling of
>>> STRING_CST so those would need to be fixed before relying on it
>>> for robust support.
>>>
>>> In any case, if you have a suggestion for how to make it work for
>>> at least the narrow character types I'll adjust the patch.
>>
>>
>> I suppose braced_list_to_string should call check_narrowing for C++.
>
>
> I see.  I've made that change.  That has made it possible to
> convert arrays of all character types.  Thanks!
>
>> Currently it uses tree_fits_shwi_p (signed host_wide_int) and then
>> stores the extracted value in a host unsigned int, which is then
>> converted to host char.  Does the right thing happen for -fsigned-char
>> or targets with a different character set?
>
> I believe so.  I've added tests for these too (ASCII and EBCDIC)
> and also changed the type of the extracted value to HWI to match
> (it doesn't change the results of the tests).
>
> Attached is an updated patch with these changes plus more tests
> as suggested by Joseph.

Great.  Can we also move the call to braced_list_to_string into
check_initializer, so it works for templates as well?  As a case just
before the block that calls reshape_init seems appropriate.

Jason


Re: [PATCH v3] Add HXT Phecda core support

2018-08-08 Thread Hongbo Zhang
Hi James,
You said OK to this previous v2 patch:
https://patchwork.ozlabs.org/patch/933135/
Now I re-base it to the latest GCC commit.

If this is OK, please apply this patch on my behalf.
Thanks.

On 8 August 2018 at 18:38, Hongbo Zhang  wrote:
> HXT semiconductor's CPU core Phecda, as a variant of Qualcomm qdf24xx,
> reuses the same tuning structure and pipeline with it.
>
> 2018-08-08  Hongbo Zhang  
>
> * config/aarch64/aarch64-cores.def: Add phecda core.
> * config/aarch64/aarch64-tune.md: Regenerate.
> * doc/invoke.texi: Add phecda core.
> ---
> v3 changes:
>  - use 'h' instead of 'H' in the comments.
>  - rebased to the latest commit.
>
>  gcc/config/aarch64/aarch64-cores.def | 3 +++
>  gcc/config/aarch64/aarch64-tune.md   | 2 +-
>  gcc/doc/invoke.texi  | 2 +-
>  3 files changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/config/aarch64/aarch64-cores.def 
> b/gcc/config/aarch64/aarch64-cores.def
> index 3d876b8..437ed1e 100644
> --- a/gcc/config/aarch64/aarch64-cores.def
> +++ b/gcc/config/aarch64/aarch64-cores.def
> @@ -71,6 +71,9 @@ AARCH64_CORE("qdf24xx", qdf24xx,   falkor,8A,  
> AARCH64_FL_FOR_ARCH8 | AA
>  /* Samsung ('S') cores. */
>  AARCH64_CORE("exynos-m1",   exynosm1,  exynosm1,  8A,  AARCH64_FL_FOR_ARCH8 
> | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, exynosm1,  0x53, 0x001, -1)
>
> +/* HXT ('h') cores. */
> +AARCH64_CORE("phecda",  phecda,falkor,8A,  AARCH64_FL_FOR_ARCH8 
> | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, qdf24xx,   0x68, 0x000, -1)
> +
>  /* ARMv8.1-A Architecture Processors.  */
>
>  /* Broadcom ('B') cores. */
> diff --git a/gcc/config/aarch64/aarch64-tune.md 
> b/gcc/config/aarch64/aarch64-tune.md
> index f8c..ec75b45 100644
> --- a/gcc/config/aarch64/aarch64-tune.md
> +++ b/gcc/config/aarch64/aarch64-tune.md
> @@ -1,5 +1,5 @@
>  ;; -*- buffer-read-only: t -*-
>  ;; Generated automatically by gentune.sh from aarch64-cores.def
>  (define_attr "tune"
> -   
> "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
> +   
> "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
> (const (symbol_ref "((enum attr_tune) aarch64_tune)")))
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 841cb47..3944fe4 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -14840,7 +14840,7 @@ performance of the code.  Permissible values for this 
> option are:
>  @samp{generic}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a55},
>  @samp{cortex-a57}, @samp{cortex-a72}, @samp{cortex-a73}, @samp{cortex-a75},
>  @samp{cortex-a76}, @samp{exynos-m1}, @samp{falkor}, @samp{qdf24xx},
> -@samp{saphira}, @samp{xgene1}, @samp{vulcan}, @samp{thunderx},
> +@samp{saphira}, @samp{phecda}, @samp{xgene1}, @samp{vulcan}, @samp{thunderx},
>  @samp{thunderxt88}, @samp{thunderxt88p1}, @samp{thunderxt81},
>  @samp{thunderxt83}, @samp{thunderx2t99}, @samp{cortex-a57.cortex-a53},
>  @samp{cortex-a72.cortex-a53}, @samp{cortex-a73.cortex-a35},
> --
> 2.7.4
>


[PATCH v3] Add HXT Phecda core support

2018-08-08 Thread Hongbo Zhang
HXT semiconductor's CPU core Phecda, as a variant of Qualcomm qdf24xx,
reuses the same tuning structure and pipeline with it.

2018-08-08  Hongbo Zhang  

* config/aarch64/aarch64-cores.def: Add phecda core.
* config/aarch64/aarch64-tune.md: Regenerate.
* doc/invoke.texi: Add phecda core.
---
v3 changes:
 - use 'h' instead of 'H' in the comments.
 - rebased to the latest commit.

 gcc/config/aarch64/aarch64-cores.def | 3 +++
 gcc/config/aarch64/aarch64-tune.md   | 2 +-
 gcc/doc/invoke.texi  | 2 +-
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-cores.def 
b/gcc/config/aarch64/aarch64-cores.def
index 3d876b8..437ed1e 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -71,6 +71,9 @@ AARCH64_CORE("qdf24xx", qdf24xx,   falkor,8A,  
AARCH64_FL_FOR_ARCH8 | AA
 /* Samsung ('S') cores. */
 AARCH64_CORE("exynos-m1",   exynosm1,  exynosm1,  8A,  AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC | AARCH64_FL_CRYPTO, exynosm1,  0x53, 0x001, -1)
 
+/* HXT ('h') cores. */
+AARCH64_CORE("phecda",  phecda,falkor,8A,  AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC | AARCH64_FL_CRYPTO, qdf24xx,   0x68, 0x000, -1)
+
 /* ARMv8.1-A Architecture Processors.  */
 
 /* Broadcom ('B') cores. */
diff --git a/gcc/config/aarch64/aarch64-tune.md 
b/gcc/config/aarch64/aarch64-tune.md
index f8c..ec75b45 100644
--- a/gcc/config/aarch64/aarch64-tune.md
+++ b/gcc/config/aarch64/aarch64-tune.md
@@ -1,5 +1,5 @@
 ;; -*- buffer-read-only: t -*-
 ;; Generated automatically by gentune.sh from aarch64-cores.def
 (define_attr "tune"
-   
"cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
+   
"cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
(const (symbol_ref "((enum attr_tune) aarch64_tune)")))
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 841cb47..3944fe4 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -14840,7 +14840,7 @@ performance of the code.  Permissible values for this 
option are:
 @samp{generic}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a55},
 @samp{cortex-a57}, @samp{cortex-a72}, @samp{cortex-a73}, @samp{cortex-a75},
 @samp{cortex-a76}, @samp{exynos-m1}, @samp{falkor}, @samp{qdf24xx},
-@samp{saphira}, @samp{xgene1}, @samp{vulcan}, @samp{thunderx},
+@samp{saphira}, @samp{phecda}, @samp{xgene1}, @samp{vulcan}, @samp{thunderx},
 @samp{thunderxt88}, @samp{thunderxt88p1}, @samp{thunderxt81},
 @samp{thunderxt83}, @samp{thunderx2t99}, @samp{cortex-a57.cortex-a53},
 @samp{cortex-a72.cortex-a53}, @samp{cortex-a73.cortex-a35},
-- 
2.7.4



[PATCH] libstdc++-v3: Have aligned_alloc() on Newlib

2018-08-08 Thread Sebastian Huber
While building for Newlib, some configure checks must be hard coded.
The aligned_alloc() is supported since 2015 in Newlib.

libstdc++-v3

PR target/85904
* configure.ac): Define HAVE_ALIGNED_ALLOC if building for
Newlib.
* configure: Regnerate.
---
 libstdc++-v3/configure| 2 ++
 libstdc++-v3/configure.ac | 1 +
 2 files changed, 3 insertions(+)

diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
index d33081d544c..53803b7e599 100755
--- a/libstdc++-v3/configure
+++ b/libstdc++-v3/configure
@@ -28967,6 +28967,8 @@ else
 $as_echo "#define HAVE_TANHF 1" >>confdefs.h
 
 
+$as_echo "#define HAVE_ALIGNED_ALLOC 1" >>confdefs.h
+
 $as_echo "#define HAVE_ICONV 1" >>confdefs.h
 
 $as_echo "#define HAVE_MEMALIGN 1" >>confdefs.h
diff --git a/libstdc++-v3/configure.ac b/libstdc++-v3/configure.ac
index 332af3706d3..e15228dde5e 100644
--- a/libstdc++-v3/configure.ac
+++ b/libstdc++-v3/configure.ac
@@ -330,6 +330,7 @@ else
 AC_DEFINE(HAVE_TANF)
 AC_DEFINE(HAVE_TANHF)
 
+AC_DEFINE(HAVE_ALIGNED_ALLOC)
 AC_DEFINE(HAVE_ICONV)
 AC_DEFINE(HAVE_MEMALIGN)
   else
-- 
2.13.7



Re: [PATCH] [AArch64, Falkor] Switch to using Falkor-specific vector costs

2018-08-08 Thread Siddhesh Poyarekar

On 08/01/2018 04:23 AM, James Greenhalgh wrote:

On Wed, Jul 25, 2018 at 01:10:34PM -0500, Luis Machado wrote:

The adjusted vector costs give Falkor a reasonable boost in performance for FP
benchmarks (both CPU2017 and CPU2006) and doesn't change INT benchmarks that
much. About 0.7% for CPU2017 FP and 1.54% for CPU2006 FP.

OK for trunk?


OK if this is what works best for your subtarget.



I have pushed this for Luis since he is on holiday.

Thanks,
Siddhesh


Re: [PATCH] [AArch64, Falkor] Adjust Falkor's sign extend reg+reg address cost

2018-08-08 Thread Siddhesh Poyarekar

On 08/01/2018 04:24 AM, James Greenhalgh wrote:

OK if this is what is best for your subtarget.



I have pushed this on behalf of Luis since he is on holiday.

Thanks,
Siddhesh