Re: Undefined behavior due to 6.5.16.1p3

2015-03-11 Thread Jakub Jelinek
On Wed, Mar 11, 2015 at 05:31:01PM +0100, Vincent Lefevre wrote:
> > (in C only one union member can be active at any time,
> > we as extension allow type punning through unions etc.)
> 
> I disagree that it is an extension. The standard does not say
> that "one union member can be active at any time".

That is not the standard's wording, but what I meant is
6.2.6.1p7 - that when you store some union member, other union members take
unspecified values.

Jakub
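
For illustration, a minimal sketch of the two ways of reading an object's bits that this exchange turns on. The names (union word, float_bits_union, float_bits_memcpy) are made up for this sketch and do not come from the thread. The union version relies on GCC's documented support for type punning through unions; the memcpy version does not depend on how the union wording is read.

```c
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: float and uint32_t have the same size. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Hypothetical union used only for illustration. */
union word {
    uint32_t u;
    float    f;
};

/* Store one member, read the other: type punning through a union,
   which GCC documents as supported. */
uint32_t float_bits_union(float x)
{
    union word w;
    w.f = x;
    return w.u;
}

/* Alternative that copies the object representation instead of
   reading a member other than the one last stored. */
uint32_t float_bits_memcpy(float x)
{
    uint32_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}
```

With GCC both functions are expected to return the same bit pattern for the same input.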


Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference

2015-02-20 Thread Jakub Jelinek
On Fri, Feb 20, 2015 at 12:06:28PM +0100, Florian Weimer wrote:
> On 02/19/2015 09:56 PM, Sandra Loosemore wrote:
> > H,  Passing the additional option in user code would be one thing,
> > but what about library code?  E.g., using memcpy (either explicitly or
> > implicitly for a structure copy)?
> 
> The memcpy problem isn't restricted to embedded architectures.
> 
>   size_t size;
>   const unsigned char *source;
>   std::vector<unsigned char> vec;
>   …
>   vec.resize(size);
>   memcpy(vec.data(), source, size);
> 
> std::vector::data() can return a null pointer if the vector is empty,
> which means that this code is invalid for empty inputs.
> 
> I think the C standard is wrong here.  We should extend it, as a QoI
> matter, and support null pointers for variable-length inputs and outputs
> if the size is 0.  But I suspect this is still a minority view.

I disagree.  If you want a function that will have that different property,
don't call it memcpy.

Jakub
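
A sketch of what "a function with that different property" under another name could look like. copy_bytes is a hypothetical name chosen for this example; it simply skips the memcpy call when the size is 0, so null pointers are tolerated in that one case.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a standard function: like memcpy, except that
   a size of 0 is accepted with any pointer values, including NULL. */
void *copy_bytes(void *dst, const void *src, size_t n)
{
    if (n == 0)
        return dst;              /* never reaches memcpy with n == 0 */
    return memcpy(dst, src, n);  /* usual memcpy requirements apply */
}
```

Callers that may see empty inputs, such as the vector example quoted above, could call such a helper instead of memcpy directly.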


Re: Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference

2015-02-19 Thread Jakub Jelinek
On Thu, Feb 19, 2015 at 06:16:05PM -0300, Daniel Gutson wrote:
> what about then two warnings (disabled by default), one intended to
> tell the user each time the compiler removes a conditional
> (-fdelete-null-pointer-checks)
> and another intended to tell the user each time the compiler adds a trap due
> to dereferencing address 0?
> 
> E.g.
>    -Wnull-pointer-check-deleted
>    -Wnull-dereference-considered-erroneous
> 
> or alike

That would be extremely difficult.  The -fdelete-null-pointer-checks option
is used in many places, like the path isolation, value range propagation,
alias oracle, number of iteration analysis etc.  E.g. in case of value
range propagation, it is really hard to warn if something has been optimized
some way because of it, because you really don't know the reason why after
all the propagation some SSA_NAME got certain range, to warn you'd
essentially have to do all of VRP twice, once with
-fdelete-null-pointer-checks and once without, and then compare that when
actually performing optimizations.  But some optimizations are also done far
later than directly in the VRP pass.

If you have hw where NULL is mapped and you know your code violates the
C/C++ standards by placing objects at that address, simply do use
the option that is designed for that purpose.

Jakub
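
A small sketch of the kind of code where a null pointer check can become removable. The function names are invented for this example. In read_first the pointer is dereferenced before it is tested, so a compiler that assumes dereferenced pointers are non-null may treat the later test as dead; which pass ends up removing it (VRP or a later one) is not something the source shows.

```c
#include <stddef.h>

/* Dereference first, check afterwards: the check may be optimized away
   when the compiler assumes a dereferenced pointer cannot be null. */
int read_first(const int *p)
{
    int v = *p;
    if (p == NULL)      /* candidate for deletion */
        return -1;
    return v;
}

/* Check first, dereference afterwards: the check is meaningful here. */
int read_first_checked(const int *p)
{
    if (p == NULL)
        return -1;
    return *p;
}
```

Only read_first_checked may be called with a null pointer; calling read_first with one is undefined behavior.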


Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference

2015-02-18 Thread Jakub Jelinek
On Wed, Feb 18, 2015 at 11:21:56AM -0800, Jeff Prothero wrote:
> Starting with gcc 4.9, -O2 implicitly invokes
> 
> -fisolate-erroneous-paths-dereference:
> 
> which
> 
> https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
> 
> documents as
> 
> Detect paths that trigger erroneous or undefined behavior due to
> dereferencing a null pointer. Isolate those paths from the main control
> flow and turn the statement with erroneous or undefined behavior into a
> trap. This flag is enabled by default at -O2 and higher.
> 
> This results in a sizable number of previously working embedded programs
> mysteriously crashing when recompiled under gcc 4.9.  The problem is that
> embedded programs will often have RAM starting at address zero (think
> hardware-defined interrupt vectors, say) which gets initialized by code which
> the -fisolate-erroneous-paths-dereference logic can recognize as reading
> and/or writing address zero.

If you have some pages mapped at address 0, you really should compile your
code with -fno-delete-null-pointer-checks, otherwise you can run into tons
of other issues.
Also, there is -fsanitize=undefined that allows discovery of such invalid
calls at runtime, though admittedly it isn't supported for all targets.

Jakub


Re: Serious Regressions tables on https://gcc.gnu.org

2015-02-13 Thread Jakub Jelinek
On Fri, Feb 13, 2015 at 03:31:22PM -0500, Jack Howarth wrote:
> On Fri, Feb 13, 2015 at 3:16 PM, Marek Polacek  wrote:
> > On Fri, Feb 13, 2015 at 03:12:22PM -0500, Jack Howarth wrote:
> >> Is there a reason why the Serious Regressions tables, displayed by
> >> the links in the 'Release Series and Status' section at
> >> https://gcc.gnu.org, no longer have a column for the priority
> >> (importance) of each bug? We used to have that and it was quite nice
> >> to be able to click on the priority column header to regenerate the
> >> table sorted by bug priority (to quickly see how many P1s are open).
> >> Any chance of getting that functionality back?
> >
> > You probably need to enable the Priority column in Change Columns at
> > the bottom of the page.
> >
> > Marek
> 
> That works but we used to have the Priority column listed in the GCC
> Bugzilla default column listing.

Which columns you get in the standard queries is whatever you've configured,
in the sorting order you've configured.

Jakub


Re: Failure to dlopen libgomp due to static TLS data

2015-02-12 Thread Jakub Jelinek
On Thu, Feb 12, 2015 at 11:09:59AM -0500, Rich Felker wrote:
> On Thu, Feb 12, 2015 at 04:18:57PM +0100, Ulrich Weigand wrote:
> > Hello,
> > 
> > we're running into a problem related to use of initial-exec access to
> > TLS variables in dynamically-loaded libraries.  Now, in general, this
> > is actually not supported.  However, there seems to be an "unofficial"
> > extension that allows selected system libraries to use small amounts
> > of static TLS space to allow critical variables to be defined to use
> > the initial-exec model even in dynamically-loaded libraries.
> 
> This usage is supposed to be deprecated. Why isn't libgomp using
> TLSDESC/gnu2 model?

Because it is significantly slower.

Jakub


Re: GCC 5.0 and OpenMP 4.0 accelerator : Adapteva/Parallella board

2015-02-12 Thread Jakub Jelinek
On Thu, Feb 12, 2015 at 06:42:17PM +0300, Ilya Verbin wrote:
> Hi,
> 
> On Wed, Feb 11, 2015 at 21:33:47 -0800, Nicholas Yue wrote:
> > I would like to find out if this is the correct forum to
> > ask/discuss about GCC 5's OpenMP 4.0 implementation, in particular
> > the new accelerator feature which from what I understand, allows the
> > compute to be offloaded to external GPU/accelerator.
> > 
> > I have a Parallella board (ARM dual core) which has an Adapteva
> > chip (16 cores) and I would like to build a GCC 5 version for it.
> > 
> > I recall that the Adapteva is a supported CPU with GCC.
> 
> Currently offloading to Epiphany targets is not supported by GCC.
> 
> To support it, one needs to implement at least 2 things:
> 
> 1. mkoffload tool, like gcc/config/i386/intelmic-mkoffload.c or
> gcc/config/nvptx/mkoffload.c
> 
> 2. libgomp plugin, like liboffloadmic/plugin/libgomp-plugin-intelmic.cpp or
> libgomp/plugin/plugin-nvptx.c

And likely
3. port libgomp to the epiphany which supposedly doesn't have pthread
support, but some other way to spawn threads (this is similar to nvptx).

Jakub


Re: Failure to dlopen libgomp due to static TLS data

2015-02-12 Thread Jakub Jelinek
On Thu, Feb 12, 2015 at 04:18:57PM +0100, Ulrich Weigand wrote:
> we're running into a problem related to use of initial-exec access to
> TLS variables in dynamically-loaded libraries.  Now, in general, this
> is actually not supported.  However, there seems to be an "unofficial"
> extension that allows selected system libraries to use small amounts
> of static TLS space to allow critical variables to be defined to use
> the initial-exec model even in dynamically-loaded libraries.

You can always LD_PRELOAD libgomp or link the main app with it if you need
it.  Otherwise, sure, there is no guarantee it will work, but usually it
does, and the performance difference is significant enough to make it
worthwhile.  Making libgomp -Wl,-z,nodlopen would just make it a problem for
everyone, even when it works fine for most people.
And, the restriction you are mentioning is there only if
!RTLD_SINGLE_THREAD_P, so you can also avoid it by dlopening libgomp before
you spawn first threads rather than after that.

Jakub
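
For readers unfamiliar with the TLS models under discussion, a minimal sketch using GCC's __thread keyword and tls_model attribute. The variable and function names are invented here, and this is not libgomp's actual code. The TLSDESC/gnu2 dialect mentioned in the thread is selected on the command line (-mtls-dialect=gnu2 on x86) rather than per variable, so it is not shown.

```c
/* GCC-specific extensions: __thread and __attribute__((tls_model(...))). */

/* Initial-exec: the fast model the thread is about; in a dlopen'ed
   library it depends on static TLS space being available. */
static __thread int counter_ie __attribute__((tls_model("initial-exec")));

/* Global-dynamic: the general model that always works with dlopen. */
static __thread int counter_gd __attribute__((tls_model("global-dynamic")));

/* Each thread sees its own copy of the counters. */
int bump_ie(void) { return ++counter_ie; }
int bump_gd(void) { return ++counter_gd; }
```

Both functions behave the same from the caller's point of view; the models differ in the code generated to locate the variable.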


Re: pass_stdarg problem when run after pass_lim

2015-02-03 Thread Jakub Jelinek
On Tue, Feb 03, 2015 at 02:36:53PM +0100, Michael Matz wrote:
> Hi,
> 
> On Tue, 3 Feb 2015, Tom de Vries wrote:
> 
> > Ironically, that fix breaks the va_list_gpr/fpr_size optimization, so 
> > I've disabled that by default for now.
> > 
> > I've done a non-bootstrap and bootstrap build using all languages.
> > 
> > The non-bootstrap test shows (at least) two classes of real failures:
> > > - gcc.c-torture/execute/20020412-1.c, gcc.target/i386/memcpy-strategy-4.c
> > >   and gcc.dg/lto/20090706-1_0.c.
> >   These are test-cases with vla as va_arg argument. It ICEs in
> >   force_constant_size with call stack
> >   gimplify_va_arg_expr -> create_tmp_var -> gimple_add_tmp_var ->
> >   force_constant_size
> 
> Hah, yeah, that's the issue I remembered with create_tmp_var.  This needs 
> a change in how to represent the va_arg "call", because the LHS can't be a 
> temporary that's copied to the real LHS afterwards.

It can be lowered during gimplification to some internal call.  What
arguments and return values it will have can be decided based on what will
be most suitable for the lowering.

Jakub


Re: libgomp support for RTEMS

2015-01-30 Thread Jakub Jelinek
On Fri, Jan 30, 2015 at 12:14:26PM +0100, Sebastian Huber wrote:
> Hello,
> 
> I would like to add support for libgomp for the RTEMS operating system. I
> likely cannot use the standard Pthread API for this in some places since I
> have to account for RTEMS specifics related to partitioned/clustered
> scheduling and the priority based scheduler. If I implement for example
> functions like this
> 
> void gomp_init_thread_affinity (pthread_attr_t *attr, unsigned int place)
> 
> outside of the GCC provided libgomp (e.g. in the RTEMS sources) can I choose
> an arbitrary license for this or do I have to use the GPLv3 with the GCC
> Runtime Library Exception for it?
> 
> Is it possible to add a gomp_free() to complement the gomp_malloc() etc.?
> This would enable the usage of a dedicated heap for OpenMP in RTEMS.

Why would you want to implement it outside of libgomp?
libgomp has a config/ tree, so just add config/rtems/ in there and implement
it in there.

Jakub


Re: pass_stdarg problem when run after pass_lim

2015-01-29 Thread Jakub Jelinek
On Thu, Jan 29, 2015 at 07:44:29PM +0100, Richard Biener wrote:
> On January 29, 2015 6:25:35 PM CET, Jakub Jelinek  wrote:
> >On Thu, Jan 29, 2015 at 06:19:45PM +0100, Tom de Vries wrote:
> >> consider attached patch, which adds pass_lim after fre1 (a simplification
> >> of my oacc kernels patch series).
> >> 
> >> The included testcase lim-before-stdarg.c fails.
> >> 
> >> The first sign of trouble is in lim-before-stdarg.c.088t.stdarg (attached):
> >> ...
> >> gen_rtvec: va_list escapes 0, needs to save 0 GPR units and 0 FPR units.
> >> ...
> >> 
> >> Because of the 'need to save 0 GPRs units', at expand no prologue is
> >> generated to dump the varargs in registers onto stack.
> >
> >The stdarg pass can't grok too heavy optimizations, so if at all possible,
> >don't schedule such passes early, and if you for some reason do, avoid
> >optimizing in there the va_list related accesses.  I'm afraid that is the
> >only recommendation I can give here for that.
> 
> The other possibility (Matz has patches for that) is to delay vaarg lowering
> currently done by gimplification and combine it with the stdarg pass.

Yeah, that should work too.  But stage1 material probably.

Jakub


Re: pass_stdarg problem when run after pass_lim

2015-01-29 Thread Jakub Jelinek
On Thu, Jan 29, 2015 at 06:19:45PM +0100, Tom de Vries wrote:
> consider attached patch, which adds pass_lim after fre1 (a simplification of
> my oacc kernels patch series).
> 
> The included testcase lim-before-stdarg.c fails.
> 
> The first sign of trouble is in lim-before-stdarg.c.088t.stdarg (attached):
> ...
> gen_rtvec: va_list escapes 0, needs to save 0 GPR units and 0 FPR units.
> ...
> 
> Because of the 'need to save 0 GPRs units', at expand no prologue is
> generated to dump the varargs in registers onto stack.

The stdarg pass can't grok too heavy optimizations, so if at all possible,
don't schedule such passes early, and if you for some reason do, avoid
optimizing in there the va_list related accesses.  I'm afraid that is the
only recommendation I can give here for that.

Jakub
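
For orientation, a sketch of an ordinary variadic function of the sort whose va_list accesses are being discussed. sum_ints is a made-up example, not the lim-before-stdarg.c testcase. Going by the quoted dump line, the stdarg pass tracks whether the va_list escapes and how many GPR/FPR units need saving; here the va_list stays local to the function.

```c
#include <stdarg.h>

/* Sums `count` int arguments passed after the named parameter.
   The va_list never leaves this function. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);   /* the va_list-related accesses */
    va_end(ap);
    return total;
}
```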


Re: GCC 5 Status Report (2015-01-19), Trunk in Stage 4

2015-01-27 Thread Jakub Jelinek
On Tue, Jan 27, 2015 at 10:04:38AM +0100, Richard Biener wrote:
> On Tue, 27 Jan 2015, Andreas Krebbel wrote:
> > I would like to apply the following patch:
> > 
> > [PATCH] S/390: -mhotpatch v2
> > https://gcc.gnu.org/ml/gcc-patches/2015-01/msg02370.html
> > 
> > It is a backend only change to our existing -mhotpatch feature
> > requested by the Linux kernel guys for the ftrace implementation:
> > https://lkml.org/lkml/2015/1/26/320
> > 
> > They need it in an upstream GCC asap. If we don't get it into 5.0 we
> > probably would need to commit it onto 5.1 branch right after the
> > release. I would rather try to avoid this since it would make the
> > hotpatch feature incompatible between 5.0 and 5.1.
> > 
> > Ok to do it now?
> 
> Ok.  It needs an entry in changes.html.
> 
> Do you plan to backport this change?
> 
> Did you consider using an alternate option name instead of changing
> it in an incompatible way?  I realize SUSE will need to backport this

Yeah, the option incompatibility worries me.  Can't -mhotpatch without =
stand for the old behavior?  Does it map to some -mhotpatch=X,Y value,
or is it not worth supporting both?

Jakub


Re: GCC 5 Status Report (2015-01-19), Trunk in Stage 4

2015-01-19 Thread Jakub Jelinek
On Mon, Jan 19, 2015 at 05:32:39PM +, Jonathan Wakely wrote:
> I would like to commit these two patches which complete the C++11
> library implementation:
> 
> https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01694.html
> https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01648.html
> 
> And this one to complete C++14 library implementation:
> 
> https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01704.html
> 
> This one is tiny, and only affects the  header,
> which is, as the name suggests, experimental:
> 
> https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01684.html
> 
> OK?

I think this is worth making an exception for.
So ok from me, but please give Richard and/or Joseph a chance to chime in if
they disagree.

Jakub


Re: gcc Digest 26 Dec 2014 16:51:42 -0000 Issue 7953

2015-01-13 Thread Jakub Jelinek
On Tue, Jan 13, 2015 at 02:14:30PM +0300, Andrew Senkevich wrote:
> >> Consensus is required to commit x86_64 vector math functions by Glibc
> >> maintainer.
> >
> > With the difference that b stands for SSE2, not SSE4, and the fact
> > that those functions do not use the __regcall calling conventions, but
> > normal psABI calling conventions after replacing the arguments/return values
> > with the vectors documented in the 0.9.5 pdf (and/or adding the vector mask
> > arg) it describes what has been implemented, yes.
> 
> But which name use for SSE4?
> Gcc generates the same as for SSE2, and we now have SSE4 implementations.

You probably need to use IFUNC for that.  The problem is that the
_Z*b* symbol can be called even in code that requires only SSE2 HW, so you
can't assume that because somebody called you through this symbol you have
SSE4 available.  You know you have at least SSE2 or higher available.

Jakub
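
A sketch of the runtime-selection idea behind the IFUNC suggestion. This deliberately uses a plain function pointer chosen at run time as a stand-in; it is not an actual IFUNC resolver, and the function names and bodies are placeholders invented for this example. It uses the GCC builtins __builtin_cpu_init and __builtin_cpu_supports on x86 and falls back to the baseline everywhere else.

```c
typedef double (*scalar_fn)(double);

/* Placeholder bodies: in a real library these would be the SSE2 and
   SSE4 implementations of the same function. */
static double impl_sse2(double x) { return x * 2.0; }
static double impl_sse4(double x) { return x * 2.0; }

/* Picks an implementation based on what the running CPU supports.
   The caller only guarantees the SSE2 baseline, so anything better
   has to be detected at run time. */
scalar_fn pick_impl(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse4.1"))
        return impl_sse4;
#endif
    return impl_sse2;
}
```

With IFUNC the same decision would be made once by a resolver when the symbol is bound, instead of by an explicit call such as pick_impl().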


Re: gcc Digest 26 Dec 2014 16:51:42 -0000 Issue 7953

2015-01-12 Thread Jakub Jelinek
On Mon, Jan 12, 2015 at 07:38:10PM +0300, Andrew Senkevich wrote:
> > during work on addition vector math functions to Glibc and discussions
> > with community was found an issue with meaning of “#pragma omp declare
> > simd” (which will appear in math.h).
> >
> > Issue is there is no working way to specify ISA of vector function
> > in GCC 5.0, and hence no way to determine exact vector function name.
> >
> > Here is description of exact meaning of “#pragma omp declare simd” for
> > x86_64.
> >
> > This is proposed as agreement between compilers supporting OpenMP.
> >
> > *** OpenMP vector function ABI for x86_64 ***
> >
> > Name of vector math function is based on Intel Vector Function ABI
> > (http://www.cilkplus.org/sites/default/files/open_specifications/Intel-ABI-Vector-Function-2012-v0.9.5.pdf)
> > with a little difference in part of name specifying ISA – namely
> > letters b, c, d instead of x, y, Y.
> >
> > #pragma omp declare simd notinbranch simdlen(2) for some function
> > “func” means that the name of vector version is:
> >
> > _ZGVbN2v_func (it is SSE4 implementation).
> >
> > #pragma omp declare simd notinbranch simdlen(4) for some function
> > “func” means that the following names are available:
> >
> >  _ZGVcN4v_func (it is AVX implementation)
> > and
> > _ZGVdN4v_func (it is AVX2 implementation).
> >
> > Every vector function should be provided by math library for each
> > supported ISA (currently SSE4, AVX and AVX2).
> > Semantics of those pragmas are independent of the processor for which
> > code is being generated.
> > Those pragmas must not be interpreted as meaning version of other ISA
> > of functions are available even if code is being built for a processor
> > with such ISA support.
> > Any future ABI extension that defines additional vector function
> > versions will also define a different pragma to declare their
> > availability.
> >
> > *
> >
> > Any feedback?
> 
> is this agreement OK?
> 
> Consensus is required to commit x86_64 vector math functions by Glibc
> maintainer.

With the difference that b stands for SSE2, not SSE4, and the fact
that those functions do not use the __regcall calling conventions, but
normal psABI calling conventions after replacing the arguments/return values
with the vectors documented in the 0.9.5 pdf (and/or adding the vector mask
arg) it describes what has been implemented, yes.

Jakub
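
A small sketch of how the names quoted above are put together, using only the pieces visible in the thread: the _ZGV prefix, an ISA letter (b for SSE2 per the correction above, c for AVX, d for AVX2), N for notinbranch, the simdlen, one v per vector argument, and the scalar function name. vector_variant_name is a hypothetical helper written for this example, not part of GCC or glibc.

```c
#include <stdio.h>

/* Hypothetical helper: formats a vector-variant symbol name such as
   "_ZGVbN2v_func" into buf. Returns what snprintf returns, i.e. the
   length the full name needs (excluding the terminating NUL). */
int vector_variant_name(char *buf, size_t size, char isa, char mask,
                        unsigned simdlen, const char *params,
                        const char *name)
{
    return snprintf(buf, size, "_ZGV%c%c%u%s_%s",
                    isa, mask, simdlen, params, name);
}
```

For example, isa 'b', mask 'N', simdlen 2, params "v" and name "func" give _ZGVbN2v_func, and isa 'd' with simdlen 4 gives _ZGVdN4v_func.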


GCC 5 Status Report (2015-01-08), Stage 4 to start soon

2015-01-08 Thread Jakub Jelinek
The trunk is still in Stage 3 now, which means it is open for general
bugfixing, but will enter Stage 4 on Friday, 16th, end of day (timezone
of your preference).  Once that happens, only wrong-code fixes, regression
bugfixes and documentation fixes will be allowed, as is normal for
our release branches too.

There are still a few patches that have been posted during Stage 1,
please get them committed into trunk before Stage 4 starts.

Still misleading quality data below - some P3 bugs have not been
re-prioritized.

Quality Data
============

Priority          #   Change from last report
--------        ---   -----------------------
P1               39   +  24
P2               98   +  15
P3               48   -  84
--------        ---   -----------------------
Total           185   -  45


Previous Report
===============

https://gcc.gnu.org/ml/gcc/2014-11/msg00249.html


Re: GCC 4.9.2 -O3 gives a seg fault / GCC 4.8.2 -O3 works

2015-01-06 Thread Jakub Jelinek
On Tue, Jan 06, 2015 at 08:50:58AM -0500, Paul Smith wrote:
> On Tue, 2015-01-06 at 09:43 +0100, Jakub Jelinek wrote:
> > On Tue, Jan 06, 2015 at 03:18:48AM -0500, Paul Smith wrote:
> > > Hi all.  It's possible my code is doing something illegal, but it's also
> > > possible I've found a problem with -O3 optimization in GCC 4.9.2.  I've
> > > built this same code with GCC 4.8.2 -O3 on GNU/Linux and it works fine.
> > > It also works with GCC 4.9.2 with lower -O (-O2 for example).
> > 
> > Your testcase is invalid.
> > GCC trunk -fsanitize=undefined (in particular -fsanitize=nonnull-attribute)
> > diagnoses it:
> > /tmp/mystring.cpp:103:26: runtime error: null pointer passed as argument 2, which is declared to never be null
> > LD_PRELOAD=libmemstomp.so detects it too.
> > 
> > Calling memcpy (p, NULL, 0); is invalid according to C and C++
> > standards, you need to guard it, e.g. with if (data) memcpy (p, data, len1);
> > or if (len1) memcpy (p, data, len1);
> 
> Ah interesting.  You're right, this is definitely not correct.  But
> since len1 is 0 in this case, no implementation of memcpy() actually
> tried to dereference the data pointer and so there was no failure (we
> build and test with clang on OSX and MSVC on Windows, and run with
> valgrind and ASAN (clang)).
> 
> I'll have to look at other possible failure situations.

Note, it is even mentioned in the GCC 4.9 "Porting to" documentation:
https://gcc.gnu.org/gcc-4.9/porting_to.html

Jakub


Re: Bad link on gcc.gnu.org front page

2015-01-06 Thread Jakub Jelinek
On Tue, Jan 06, 2015 at 10:27:29AM +, Jonathan Wakely wrote:
> On 6 January 2015 at 09:34,   wrote:
> >
> > Hello!
> > There's a link to "GCC5" right on top of the News section on the home page 
> > of gcc.gnu.org that takes me to a 403 forbidden access page: 
> > https://gcc.gnu.org/gcc-5/ .
> > I think it's a bug.
> 
> Yes
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64469
> https://gcc.gnu.org/ml/gcc/2014-12/msg00160.html

I've already changed it to gcc-5/changes.html earlier today.
gcc-5/ has not been released and won't be for 2-3 months, so
it is correct that gcc-5/ is inaccessible.

Jakub


Re: GCC 4.9.2 -O3 gives a seg fault / GCC 4.8.2 -O3 works

2015-01-06 Thread Jakub Jelinek
On Tue, Jan 06, 2015 at 03:18:48AM -0500, Paul Smith wrote:
> Hi all.  It's possible my code is doing something illegal, but it's also
> possible I've found a problem with -O3 optimization in GCC 4.9.2.  I've
> built this same code with GCC 4.8.2 -O3 on GNU/Linux and it works fine.
> It also works with GCC 4.9.2 with lower -O (-O2 for example).

Your testcase is invalid.
GCC trunk -fsanitize=undefined (in particular -fsanitize=nonnull-attribute)
diagnoses it:
/tmp/mystring.cpp:103:26: runtime error: null pointer passed as argument 2, which is declared to never be null
LD_PRELOAD=libmemstomp.so detects it too.

Calling memcpy (p, NULL, 0); is invalid according to C and C++
standards, you need to guard it, e.g. with if (data) memcpy (p, data, len1);
or if (len1) memcpy (p, data, len1);

Jakub
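
A sketch of the call-site guard suggested above, in the shape of a small append routine. struct buf and buf_append are invented for this example and are not the poster's mystring.cpp code. The memcpy call is skipped when len1 is 0, so a null data pointer with zero length never reaches it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size buffer used only for this illustration. */
struct buf {
    char   data[64];
    size_t len;
};

/* Appends len1 bytes from data. Returns 0 on success, -1 if the bytes
   do not fit. data may be NULL only when len1 is 0. */
int buf_append(struct buf *b, const char *data, size_t len1)
{
    if (len1 > sizeof b->data - b->len)
        return -1;
    if (len1)                                   /* the guard */
        memcpy(b->data + b->len, data, len1);
    b->len += len1;
    return 0;
}
```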


GCC 4.8.5 Status Report (2014-12-19)

2014-12-19 Thread Jakub Jelinek
Status
======

GCC 4.8.4 has been released, the branch is again open for regression
bugfixes and documentation fixes.  GCC 4.8.5 could be tentatively released
in April next year.


Quality Data
============

Priority          #   Change from Last Report
--------        ---   -----------------------
P1                0   +-  0
P2              123   +-  0
P3                6   +-  0
--------        ---   -----------------------
Total           129   +-  0


Previous Report
===============

https://gcc.gnu.org/ml/gcc/2014-12/msg00080.html


GCC 4.8.4 Released

2014-12-19 Thread Jakub Jelinek
The GNU Compiler Collection version 4.8.4 has been released.

GCC 4.8.4 is the fourth bug-fix release containing important fixes for
regressions and serious bugs in GCC 4.8.3 with over 80 bugs fixed since
the previous release.

This release is available from the FTP servers listed at:

  http://www.gnu.org/order/ftp.html

Please do not contact me directly regarding questions or comments about
this release.  Instead, use the resources available from
http://gcc.gnu.org.

As always, a vast number of people contributed to this GCC release -- far
too many to thank them individually!


Re: GCC 4.8.4 Status Report (2014-12-05)

2014-12-17 Thread Jakub Jelinek
On Wed, Dec 17, 2014 at 11:16:18AM +0100, Dominique Dhumieres wrote:
> Currently gcc 4.8.4 does not bootstrap on darwin14 (Yosemite) due to pr61407.

Why has it not been pushed in earlier?
I guess if you test it sufficiently and if the Darwin maintainer acks it,
perhaps.  Note that the patch violates coding conventions by using
CamelCase minorDigitIdx.

Jakub


Re: trying out openacc 2.0

2014-12-16 Thread Jakub Jelinek
On Wed, Dec 17, 2014 at 08:54:06AM +1300, Mark Farnell wrote:
> That's good news.  Does it mean that if I want to try out openACC with
> KNL and PTX support, then all I need to do is to compile the
> gomp-4_0-branch *without* any extra parameters into ./configure ?

No.  Please read the wiki page Tobias mentioned, you need to build 2
compilers and pass some configure options to get OpenMP + KNL support.
OpenACC has not been committed to trunk yet, but even once it makes it
in, you'll similarly still need to build 2 compilers and configure them
in a non-default way.

> Also, are other GPUs such as the AMD ATI and the built-in GPUs such as
> the Intel GPU and AMD fusion supported?  If so, are they already
> supported in the trunk, or only specific branch?

Some AMD HSA support is on the hsa svn branch, but AFAIK neither OpenMP 4.0
offloading nor OpenACC is supported there yet (instead just auto-offloading
some OpenMP 3.x loops I believe).

> Finally, when will support of Knights Corner (knc) be added to the
> trunk and/or one of the branches?

Unlikely.  The advantage of KNL is that it uses the same vector ISA as
the future desktop/server CPUs, not a different one; to support KNC we'd
need to make the i?86 backend larger and more complicated for something that
is not going to be used in any? future CPUs.

Jakub


Re: GCC 4.8.4 Status Report (2014-12-05)

2014-12-14 Thread Jakub Jelinek
On Sun, Dec 14, 2014 at 06:50:55AM -0800, H.J. Lu wrote:
> On Fri, Dec 5, 2014 at 4:08 AM, H.J. Lu  wrote:
> > On Fri, Dec 5, 2014 at 1:18 AM, Jakub Jelinek  wrote:
> >> Status
> >> ======
> >>
> >> It is time for another 4.8 release, I'd like to create 4.8.4 release
> >> candidate at the end of the next week and if all goes well, 4.8.4 release
> >> a week after that.  If you have any safe fixes you'd like to be backported,
> >> please do so soon, and if there are any known issues on the branch, please
> >> make sure they are reported in bugzilla and let us RMs know about those.
> >>
> >
> > I backported the fix for
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64037
> >
> 
> My fix was reverted on trunk and 4.8 branch.  I submitted an updated
> patch:
> 
> https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01183.html
> 
> I'd like to see it get fixed in 4.8.4.

Given that the previous version broke some targets, I think we shouldn't
rush this in.  The bug has been there for several years, 4.8.5 will be
released in a few months.

Jakub


Re: GCC 4.8.4 Status Report (2014-12-12)

2014-12-13 Thread Jakub Jelinek
On Sat, Dec 13, 2014 at 08:24:48AM -0500, David Edelsohn wrote:
> Jakub,
> 
> I would like to backport this fixincludes patch to the GCC 4.8 branch.
> 
> https://gcc.gnu.org/ml/gcc-patches/2013-08/msg01975.html

Ok.

Jakub


Re: GCC 4.8.4 Status Report (2014-12-12)

2014-12-12 Thread Jakub Jelinek
On Fri, Dec 12, 2014 at 04:51:21PM -0500, David Edelsohn wrote:
> GCC 4.8 branch has degraded from 14 libstdc++ failures to 153.  This

On which target?  When was it reported, and are the libstdc++ folks aware of
that?
I see zero regressions on x86_64-linux and i686-linux from a build 14 days
ago, compared to a build from early September there is
+FAIL: g++.dg/tls/thread_local10.C -std=c++11 (test for excess errors)
+UNRESOLVED: g++.dg/tls/thread_local10.C -std=c++11 compilation failed to produce executable
but that has been reported and is just a testsuite issue, not compiler or
library bug.

Jakub


GCC 4.8.4 Status Report (2014-12-12)

2014-12-12 Thread Jakub Jelinek
Status
======

The GCC 4.8.4-rc1 release candidate has been released.
The branch is frozen now, all changes require release manager approval
until the final release of GCC 4.8.4 which should happen roughly
one week after the release candidate.


Quality Data
============

Priority          #   Change from Last Report
--------        ---   -----------------------
P1                0   +-  0
P2              123   +  28
P3                6   -  39
--------        ---   -----------------------
Total           129   -  11


Previous Report
===============

http://gcc.gnu.org/ml/gcc/2014-12/msg00035.html

The next report will be sent by me again.


GCC 4.8.4 Release Candidate available from gcc.gnu.org

2014-12-12 Thread Jakub Jelinek
The first release candidate for GCC 4.8.4 is available from

 ftp://gcc.gnu.org/pub/gcc/snapshots/4.8.4-RC-20141212

and shortly its mirrors.  It has been generated from SVN revision 218649.

I have so far bootstrapped and tested the release candidate on
x86_64-linux and i686-linux.  Please test it and report any issues to
bugzilla.

If all goes well, I'd like to release 4.8.4 at the end of next week.


Re: [patch, build] Restore bootstrap in building libcc1 on darwin

2014-12-05 Thread Jakub Jelinek
On Fri, Dec 05, 2014 at 11:34:28PM +0100, Dominique Dhumieres wrote:
> > As I've tried to explain, that is IMHO wrong though.
> > If what you are after is the -B stuff too, then perhaps:
> > ...
> 
> Sorry but it does not work:

Sorry, make that (just removed 4x ' in each file):

2014-12-05  Jakub Jelinek  

PR bootstrap/64023
* Makefile.tpl (EXTRA_TARGET_FLAGS): Set STAGE1_LDFLAGS
to POSTSTAGE1_LDFLAGS and STAGE1_LIBS to POSTSTAGE1_LIBS.
Add -B to libstdc++-v3/src/.libs and libstdc++-v3/libsupc++/.libs
to CXX.
* Makefile.in: Regenerated.

--- Makefile.tpl.jj 2014-11-12 09:31:59.0 +0100
+++ Makefile.tpl    2014-12-05 21:12:21.486031062 +0100
@@ -641,7 +641,9 @@ EXTRA_TARGET_FLAGS = \
'AS=$(COMPILER_AS_FOR_TARGET)' \
'CC=$$(CC_FOR_TARGET) $$(XGCC_FLAGS_FOR_TARGET) $$(TFLAGS)' \
'CFLAGS=$$(CFLAGS_FOR_TARGET)' \
-   'CXX=$$(CXX_FOR_TARGET) $$(XGCC_FLAGS_FOR_TARGET) $$(TFLAGS)' \
+   'CXX=$$(CXX_FOR_TARGET) -B$$r/$$(TARGET_SUBDIR)/libstdc++-v3/src/.libs \
+-B$$r/$$(TARGET_SUBDIR)/libstdc++-v3/libsupc++/.libs \
+$$(XGCC_FLAGS_FOR_TARGET) $$(TFLAGS)' \
'CXXFLAGS=$$(CXXFLAGS_FOR_TARGET)' \
'DLLTOOL=$$(DLLTOOL_FOR_TARGET)' \
'GCJ=$$(GCJ_FOR_TARGET) $$(XGCC_FLAGS_FOR_TARGET) $$(TFLAGS)' \
@@ -659,6 +661,8 @@ EXTRA_TARGET_FLAGS = \
'WINDRES=$$(WINDRES_FOR_TARGET)' \
'WINDMC=$$(WINDMC_FOR_TARGET)' \
'XGCC_FLAGS_FOR_TARGET=$(XGCC_FLAGS_FOR_TARGET)' \
+   'STAGE1_LDFLAGS=$$(POSTSTAGE1_LDFLAGS)' \
+   'STAGE1_LIBS=$$(POSTSTAGE1_LIBS)' \
"TFLAGS=$$TFLAGS"
 
 TARGET_FLAGS_TO_PASS = $(BASE_FLAGS_TO_PASS) $(EXTRA_TARGET_FLAGS)
--- Makefile.in.jj  2014-11-28 14:40:52.0 +0100
+++ Makefile.in 2014-12-05 21:11:48.276616008 +0100
@@ -835,7 +835,9 @@ EXTRA_TARGET_FLAGS = \
'AS=$(COMPILER_AS_FOR_TARGET)' \
'CC=$$(CC_FOR_TARGET) $$(XGCC_FLAGS_FOR_TARGET) $$(TFLAGS)' \
'CFLAGS=$$(CFLAGS_FOR_TARGET)' \
-   'CXX=$$(CXX_FOR_TARGET) $$(XGCC_FLAGS_FOR_TARGET) $$(TFLAGS)' \
+   'CXX=$$(CXX_FOR_TARGET) -B$$r/$$(TARGET_SUBDIR)/libstdc++-v3/src/.libs \
+-B$$r/$$(TARGET_SUBDIR)/libstdc++-v3/libsupc++/.libs \
+$$(XGCC_FLAGS_FOR_TARGET) $$(TFLAGS)' \
'CXXFLAGS=$$(CXXFLAGS_FOR_TARGET)' \
'DLLTOOL=$$(DLLTOOL_FOR_TARGET)' \
'GCJ=$$(GCJ_FOR_TARGET) $$(XGCC_FLAGS_FOR_TARGET) $$(TFLAGS)' \
@@ -853,6 +855,8 @@ EXTRA_TARGET_FLAGS = \
'WINDRES=$$(WINDRES_FOR_TARGET)' \
'WINDMC=$$(WINDMC_FOR_TARGET)' \
'XGCC_FLAGS_FOR_TARGET=$(XGCC_FLAGS_FOR_TARGET)' \
+   'STAGE1_LDFLAGS=$$(POSTSTAGE1_LDFLAGS)' \
+   'STAGE1_LIBS=$$(POSTSTAGE1_LIBS)' \
"TFLAGS=$$TFLAGS"
 
 TARGET_FLAGS_TO_PASS = $(BASE_FLAGS_TO_PASS) $(EXTRA_TARGET_FLAGS)

Jakub


Re: [patch, build] Restore bootstrap in building libcc1 on darwin

2014-12-05 Thread Jakub Jelinek
On Fri, Dec 05, 2014 at 08:11:53PM +0100, Dominique Dhumieres wrote:
> > ...
> > Can you please test it on Darwin (or whatever other target has similar
> > issues with bootstrapping libcc1)?
> >
> > 2014-12-05  Jakub Jelinek  
> > ...
> 
> The patch does not work for x86_64-apple-darwin14.0.0. However the following 
> patch does:

As I've tried to explain, that is IMHO wrong though.
If what you are after is the -B stuff too, then perhaps:

2014-12-05  Jakub Jelinek  

PR bootstrap/64023
* Makefile.tpl (EXTRA_TARGET_FLAGS): Set STAGE1_LDFLAGS
to POSTSTAGE1_LDFLAGS and STAGE1_LIBS to POSTSTAGE1_LIBS.
Add -B to libstdc++-v3/src/.libs and libstdc++-v3/libsupc++/.libs
to CXX.
* Makefile.in: Regenerated.

--- Makefile.tpl.jj 2014-11-12 09:31:59.0 +0100
+++ Makefile.tpl 2014-12-05 21:12:21.486031062 +0100
@@ -641,7 +641,9 @@ EXTRA_TARGET_FLAGS = \
'AS=$(COMPILER_AS_FOR_TARGET)' \
'CC=$$(CC_FOR_TARGET) $$(XGCC_FLAGS_FOR_TARGET) $$(TFLAGS)' \
'CFLAGS=$$(CFLAGS_FOR_TARGET)' \
-   'CXX=$$(CXX_FOR_TARGET) $$(XGCC_FLAGS_FOR_TARGET) $$(TFLAGS)' \
+   'CXX=$$(CXX_FOR_TARGET) -B$$r/$$(TARGET_SUBDIR)/libstdc++-v3/src/.libs' \
+   ' -B$$r/$$(TARGET_SUBDIR)/libstdc++-v3/libsupc++/.libs' \
+   ' $$(XGCC_FLAGS_FOR_TARGET) $$(TFLAGS)' \
'CXXFLAGS=$$(CXXFLAGS_FOR_TARGET)' \
'DLLTOOL=$$(DLLTOOL_FOR_TARGET)' \
'GCJ=$$(GCJ_FOR_TARGET) $$(XGCC_FLAGS_FOR_TARGET) $$(TFLAGS)' \
@@ -659,6 +661,8 @@ EXTRA_TARGET_FLAGS = \
'WINDRES=$$(WINDRES_FOR_TARGET)' \
'WINDMC=$$(WINDMC_FOR_TARGET)' \
'XGCC_FLAGS_FOR_TARGET=$(XGCC_FLAGS_FOR_TARGET)' \
+   'STAGE1_LDFLAGS=$$(POSTSTAGE1_LDFLAGS)' \
+   'STAGE1_LIBS=$$(POSTSTAGE1_LIBS)' \
"TFLAGS=$$TFLAGS"
 
 TARGET_FLAGS_TO_PASS = $(BASE_FLAGS_TO_PASS) $(EXTRA_TARGET_FLAGS)
--- Makefile.in.jj  2014-11-28 14:40:52.0 +0100
+++ Makefile.in 2014-12-05 21:11:48.276616008 +0100
@@ -835,7 +835,9 @@ EXTRA_TARGET_FLAGS = \
'AS=$(COMPILER_AS_FOR_TARGET)' \
'CC=$$(CC_FOR_TARGET) $$(XGCC_FLAGS_FOR_TARGET) $$(TFLAGS)' \
'CFLAGS=$$(CFLAGS_FOR_TARGET)' \
-   'CXX=$$(CXX_FOR_TARGET) $$(XGCC_FLAGS_FOR_TARGET) $$(TFLAGS)' \
+   'CXX=$$(CXX_FOR_TARGET) -B$$r/$$(TARGET_SUBDIR)/libstdc++-v3/src/.libs' \
+   ' -B$$r/$$(TARGET_SUBDIR)/libstdc++-v3/libsupc++/.libs' \
+   ' $$(XGCC_FLAGS_FOR_TARGET) $$(TFLAGS)' \
'CXXFLAGS=$$(CXXFLAGS_FOR_TARGET)' \
'DLLTOOL=$$(DLLTOOL_FOR_TARGET)' \
'GCJ=$$(GCJ_FOR_TARGET) $$(XGCC_FLAGS_FOR_TARGET) $$(TFLAGS)' \
@@ -853,6 +855,8 @@ EXTRA_TARGET_FLAGS = \
'WINDRES=$$(WINDRES_FOR_TARGET)' \
'WINDMC=$$(WINDMC_FOR_TARGET)' \
'XGCC_FLAGS_FOR_TARGET=$(XGCC_FLAGS_FOR_TARGET)' \
+   'STAGE1_LDFLAGS=$$(POSTSTAGE1_LDFLAGS)' \
+   'STAGE1_LIBS=$$(POSTSTAGE1_LIBS)' \
"TFLAGS=$$TFLAGS"
 
 TARGET_FLAGS_TO_PASS = $(BASE_FLAGS_TO_PASS) $(EXTRA_TARGET_FLAGS)


Jakub


Re: [patch, build] Restore bootstrap in building libcc1 on darwin

2014-12-05 Thread Jakub Jelinek
On Mon, Nov 24, 2014 at 01:06:45AM +0100, FX wrote:
> tl;dr: One question to build maintainers, and one patch submitted to toplevel 
> configure.ac
> 
> ---
> 
> I’ve looked into the issue some more, and am comparing two builds of trunk 
> (exact same source), one configured with system compiler (clang) in PATH, the 
> other with GCC 4.9.2 in PATH.
> At the toplevel configure, the only meaningful difference is that the 
> gcc-based build sets stage1_ldflags='-static-libstdc++ -static-libgcc' while 
> the clang-based has stage1_ldflags='' (clang doesn’t recognize 
> -static-libstdc++).
> 
> This is included into the toplevel Makefile as STAGE1_LDFLAGS (the comment 
> appropriately says "Linker flags to use on the host, for stage1 or when not 
> bootstrapping").
> Those are exported by HOST_EXPORTS, which are then used by 
> configure-libcc1, all-libcc1, etc. Thus, we end up using STAGE1_LDFLAGS, 
> which correspond to the system compiler, instead of the stage3 compiler (as 
> we should).
> 
> So, this is “false negative” part of the problem (namely, why we don’t see 
> the failure when bootstrapping with clang): we use STAGE1_LDFLAGS in building 
> libcc1, and with clang as system compiler we don’t use static linking of the 
> C++ library. This part, I don’t know how to fix: it is for the build experts 
> to address. It is a real problem: it leads to libcc1.so being linked 
> dynamically to libstdc++ and libgcc, instead of statically (as it should).
> 
> ---
> 
> Second part of the question: when the freshly built g++ is used, we need to 
> pass the appropriate -B options. As I understand it, the appropriate place 
> for that is in the toplevel configure.ac, where we already pass down the 
> respective -L options. Indeed, the attached patch restores bootstrap on 
> x86_64-apple-darwin14 with gcc as system compiler (and doesn’t break the 
> bootstrap with clang as system compiler).
> 
> OK to commit?

Reading the toplevel Makefile and trying to understand how things work
for non-bootstrap vs. bootstrap host dirs that aren't bootstrapped,
I'd say the right fix should be something like the following.  I'm
bootstrapping/regtesting it right now on x86_64-linux and i686-linux,
though it won't make much difference there: on x86_64-linux
STAGE1_LDFLAGS is equal to POSTSTAGE1_LDFLAGS and STAGE1_LIBS is equal
to POSTSTAGE1_LIBS.  On i686-linux there is at least a difference
for some reason (possibly related to my setarch and gcc -m32 wrapper
hacks to make i686-linux bootstrap work on an x86_64-linux box) in
*STAGE1_LDFLAGS: only POSTSTAGE1_LDFLAGS is -static-libstdc++
-static-libgcc.

From my reading, POSTSTAGE1_HOST_EXPORTS is clearly inappropriate for
modules like libcc1, because it uses prev-gcc/, while we want to use gcc/,
but otherwise, looking at the HOST_EXPORTS vs. POSTSTAGE1_HOST_EXPORTS
differences, LDFLAGS and HOST_LIBS are what need changing.
For some reason POSTSTAGE1_HOST_EXPORTS sets LDFLAGS to
$(POSTSTAGE1_LDFLAGS) $(BOOT_LDFLAGS)
(the first part is ok and clear, the latter differs from the HOST_EXPORTS
$(STAGE1_LDFLAGS) $(LDFLAGS)).
With my patch below, one actually ends up with
$(POSTSTAGE1_LDFLAGS) $(LDFLAGS_FOR_TARGET)
for libcc1 when bootstrapping in LDFLAGS, while previously
$(STAGE1_LDFLAGS) $(LDFLAGS_FOR_TARGET)
was used.  STAGE1_L{DFLAGS,IBS} is only used in $(HOST_EXPORTS),
so at least in theory I think my patch should DTRT.

Can you please test it on Darwin (or whatever other target has similar
issues with bootstrapping libcc1)?

2014-12-05  Jakub Jelinek  

PR bootstrap/64023
* Makefile.tpl (EXTRA_TARGET_FLAGS): Set STAGE1_LDFLAGS
to POSTSTAGE1_LDFLAGS and STAGE1_LIBS to POSTSTAGE1_LIBS.
* Makefile.in: Regenerated.

--- Makefile.tpl.jj 2014-11-12 09:31:59.0 +0100
+++ Makefile.tpl 2014-12-05 17:14:16.115295667 +0100
@@ -659,6 +659,8 @@ EXTRA_TARGET_FLAGS = \
'WINDRES=$$(WINDRES_FOR_TARGET)' \
'WINDMC=$$(WINDMC_FOR_TARGET)' \
'XGCC_FLAGS_FOR_TARGET=$(XGCC_FLAGS_FOR_TARGET)' \
+   'STAGE1_LDFLAGS=$$(POSTSTAGE1_LDFLAGS)' \
+   'STAGE1_LIBS=$$(POSTSTAGE1_LIBS)' \
"TFLAGS=$$TFLAGS"
 
 TARGET_FLAGS_TO_PASS = $(BASE_FLAGS_TO_PASS) $(EXTRA_TARGET_FLAGS)
--- Makefile.in.jj  2014-11-28 14:40:52.0 +0100
+++ Makefile.in 2014-12-05 17:15:04.322439003 +0100
@@ -853,6 +853,8 @@ EXTRA_TARGET_FLAGS = \
'WINDRES=$$(WINDRES_FOR_TARGET)' \
'WINDMC=$$(WINDMC_FOR_TARGET)' \
'XGCC_FLAGS_FOR_TARGET=$(XGCC_FLAGS_FOR_TARGET)' \
+   'STAGE1_LDFLAGS=$$(POSTSTAGE1_LDFLAGS)' \
+   'STAGE1_LIBS=$$(POSTSTAGE1_LIBS)' \
"TFLAGS=$$TFLAGS"
 
 TARGET_FLAGS_TO_PASS = $(BASE_FLAGS_TO_PASS) $(EXTRA_TARGET_FLAGS)


Jakub


GCC 4.8.4 Status Report (2014-12-05)

2014-12-05 Thread Jakub Jelinek
Status
======

It is time for another 4.8 release.  I'd like to create a 4.8.4 release
candidate at the end of next week and, if all goes well, the 4.8.4 release
a week after that.  If you have any safe fixes you'd like to see backported,
please backport them soon, and if there are any known issues on the branch,
please make sure they are reported in bugzilla and let us RMs know about them.


Quality Data
============

Priority          #   Change from last report
--------        ---   -----------------------
P1                0
P2               95   +   3
P3               45   +   2
--------        ---   -----------------------
Total           140   +   5


Previous Report
===============

https://gcc.gnu.org/ml/gcc/2014-05/msg00263.html


Re: fn spec attribute on builtin function in fortran

2014-12-01 Thread Jakub Jelinek
On Mon, Dec 01, 2014 at 09:35:25AM +0100, Tom de Vries wrote:
> I've been adding an fn spec function attribute to some openacc builtin 
> functions:
> ...
> diff --git a/gcc/builtin-attrs.def b/gcc/builtin-attrs.def
> index 9c05a94..4e34192 100644
> --- a/gcc/builtin-attrs.def
> +++ b/gcc/builtin-attrs.def
> @@ -64,6 +64,7 @@ DEF_ATTR_FOR_INT (6)
>DEF_ATTR_TREE_LIST (ATTR_LIST_##ENUM, ATTR_NULL, \
> ATTR_##ENUM, ATTR_NULL)
>  DEF_ATTR_FOR_STRING (STR1, "1")
> +DEF_ATTR_FOR_STRING (DOT_DOT_DOT_r_r_r, "...rrr")
>  #undef DEF_ATTR_FOR_STRING
> 
>  /* Construct a tree for a list of two integers.  */
> @@ -127,6 +128,8 @@ DEF_ATTR_TREE_LIST (ATTR_PURE_NOTHROW_LIST, ATTR_PURE,\
>   ATTR_NULL, ATTR_NOTHROW_LIST)
>  DEF_ATTR_TREE_LIST (ATTR_PURE_NOTHROW_LEAF_LIST, ATTR_PURE,\
>   ATTR_NULL, ATTR_NOTHROW_LEAF_LIST)
> +DEF_ATTR_TREE_LIST 
> (ATTR_FNSPEC_DOT_DOT_DOT_NOCLOB_NOCLOB_NOCLOB_NOTHROW_LIST,\
> +   ATTR_FNSPEC, ATTR_LIST_DOT_DOT_DOT_r_r_r, 
> ATTR_NOTHROW_LIST)
>  DEF_ATTR_TREE_LIST (ATTR_NORETURN_NOTHROW_LIST, ATTR_NORETURN, \
>   ATTR_NULL, ATTR_NOTHROW_LIST)
>  DEF_ATTR_TREE_LIST (ATTR_NORETURN_NOTHROW_LEAF_LIST, ATTR_NORETURN,\
> ...
> 
> That worked well for c. When compiling the fortran compiler, I ran into this 
> error:
> ...
> In file included from gcc/fortran/f95-lang.c:1194:0:
> gcc/fortran/../oacc-builtins.def: In function 'void 
> gfc_init_builtin_functions()':
> gcc/fortran/../oacc-builtins.def:32:1: error:
> 'ATTR_FNSPEC_DOT_DOT_DOT_NOCLOB_NOCLOB_NOCLOB_NOTHROW_LIST' was not declared
> in this scope
> make[2]: *** [fortran/f95-lang.o] Error 1

Fortran FE uses gfc_build_library_function_decl_with_spec to build these.

Jakub


Re: [PATCH] gcc parallel make check

2014-11-25 Thread Jakub Jelinek
On Tue, Nov 25, 2014 at 03:27:40PM +0100, Tom de Vries wrote:
> This patch fixes that by ensuring that we print that unsupported message only 
> once.
> 
> The resulting test result comparison diff is:
> 2014-11-25  Tom de Vries  
> 
>   * testsuite/libstdc++-prettyprinters/prettyprinters.exp: Add missing
>   dg-finish.  Only print unsupported message once.

LGTM.

> --- a/libstdc++-v3/testsuite/libstdc++-prettyprinters/prettyprinters.exp
> +++ b/libstdc++-v3/testsuite/libstdc++-prettyprinters/prettyprinters.exp
> @@ -30,7 +30,14 @@ if ![info exists ::env(GUALITY_GDB_NAME)] {
>  }
>  
>  if {! [gdb_version_check]} {
> +dg-finish
> +# Only print unsupported message in one instance.
> +if ![gcc_parallel_test_run_p prettyprinters] {
> + return
> +}
> +gcc_parallel_test_enable 0
>  unsupported "prettyprinters.exp"
> +gcc_parallel_test_enable 1
>  return
>  }
>  
> -- 
> 1.9.1
> 


Jakub


Re: [PATCH] gcc/testsuite: guality.exp: Fix `test_counts' restoration

2014-11-14 Thread Jakub Jelinek
On Fri, Nov 14, 2014 at 09:01:25PM +, Maciej W. Rozycki wrote:
> 2014-11-14  Maciej W. Rozycki  
> 
>   gcc/testsuite/ 
>   * g++.dg/guality/guality.exp (check_guality): Fix `test_counts' 
>   restoration.

Ok, thanks.

> --- gcc-fsf-trunk-quilt.orig/gcc/testsuite/g++.dg/guality/guality.exp 
> 2014-11-14 18:33:47.0 +
> +++ gcc-fsf-trunk-quilt/gcc/testsuite/g++.dg/guality/guality.exp  
> 2014-11-14 20:18:35.038856372 +
> @@ -28,7 +28,7 @@ proc check_guality {args} {
>set ret [string match "*1 PASS, 0 FAIL, 0 UNRESOLVED*" $execout]
>  }
>  remote_file build delete $output
> -array get test_counts [array get saved_test_counts]
> +array set test_counts [array get saved_test_counts]
>  return $ret
>  }
>  

Jakub


[committed] Improve e.54.2.c testcase

2014-11-14 Thread Jakub Jelinek
On Fri, Nov 14, 2014 at 11:14:59AM +0100, Richard Biener wrote:
> On Thu, Nov 13, 2014 at 10:47 PM, Jakub Jelinek  wrote:
> > On Thu, Nov 13, 2014 at 11:53:53PM +0300, Ilya Verbin wrote:
> >> > Don't you need another plugin to claim those offload IR sections?
> >>
> >> No, the plan was that a regular plugin will just ignore offload IR
> >> sections by default.  In your configuration ld detects a __gnu_lto_slim
> >> symbol and decided that the object file contains only LTO IR without asm.
> >> I am going to investigate where is the difference with my configuration
> >> and fix the bug.
> >
> > FYI, I'm getting
> > +WARNING: program timed out.
> > +FAIL: libgomp.c/examples-4/e.54.2.c execution test
> > on both x86_64-linux and i686-linux (normal --enable-checking=yes,rtl
> > bootstrap, no offloading configure options).
> > binutils-2.24, ld.bfd.
> 
> Me too, with default checking (timeouts are really annoying).

So, it turns out that the problem shows up in heavily parallel testing:
these two testcases spawn way too many #pragma omp parallels in
sequence (32768), unnecessarily so.  With the default partial busy
waiting (something in between OMP_WAIT_POLICY=passive and active), if
large numbers of other jobs (other tests) too often contend for the CPUs
while some threads busy wait on a barrier, the test can run for minutes,
while normally it takes 0.5s.  Fixed thusly, committed to trunk.

With all the additions to the libgomp testsuite, it seems that even on a
fast box (but with heavy contention) libgomp testing now takes 30 minutes
or so.  I guess I should play with the possibility to set
OMP_WAIT_POLICY=passive for libgomp testing in the environment of the
tests, e.g. if make -jN was used (and keep the default wait policy if it
wasn't).
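
To put rough numbers on the block_size change, here is a minimal sketch.
It assumes that dotprod starts one parallel region per block of
block_size elements and that N is 1 << 20; neither is quoted in this
message (the N value is only inferred from the 32768 figure above), and
region_count is a hypothetical helper, not part of the testcase.

```c
#include <stddef.h>

/* Hypothetical helper, not part of e.54.2.c: number of blocks needed
   to cover n elements, which under the assumption stated above is
   also the number of parallel regions started in sequence.  */
static size_t
region_count (size_t n, size_t block_size)
{
  return (n + block_size - 1) / block_size;
}
```

Under those assumptions, region_count (n, 32) is 32768 for n = 1 << 20,
while region_count (n, n / 8) is 8.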

2014-11-14  Jakub Jelinek  

* libgomp.c/examples-4/e.54.2.c (main): Use N / 8 instead
of 32 as block_size.
* libgomp.fortran/examples-4/e.54.2.f90 (e_54_1): Use n / 8
instead of 32 as block_size.

--- libgomp/testsuite/libgomp.c/examples-4/e.54.2.c.jj  2014-11-13 15:13:18.0 +0100
+++ libgomp/testsuite/libgomp.c/examples-4/e.54.2.c 2014-11-14 15:07:07.428485712 +0100
@@ -61,7 +61,7 @@ int main ()
   init (v1, v2, N);
 
   p1 = dotprod_ref (v1, v2, N);
-  p2 = dotprod (v1, v2, N, 32, 2, 8);
+  p2 = dotprod (v1, v2, N, N / 8, 2, 8);
 
   check (p1, p2);
 
--- libgomp/testsuite/libgomp.fortran/examples-4/e.54.2.f90.jj  2014-11-13 15:13:17.0 +0100
+++ libgomp/testsuite/libgomp.fortran/examples-4/e.54.2.f90 2014-11-14 15:08:41.471776859 +0100
@@ -59,7 +59,7 @@ program e_54_1
   allocate (B(n), C(n))
   call init (B, C, n)
   ref = dotprod_ref (B, C, n)
-  d = dotprod (B, C, n, 32, 2, 8)
+  d = dotprod (B, C, n, n / 8, 2, 8)
   call check (ref, d)
   deallocate (B, C)
 end program

Jakub


Re: [PATCH 0/4] OpenMP 4.0 offloading to Intel MIC

2014-11-13 Thread Jakub Jelinek
On Thu, Nov 13, 2014 at 11:53:53PM +0300, Ilya Verbin wrote:
> > Don't you need another plugin to claim those offload IR sections?
> 
> No, the plan was that a regular plugin will just ignore offload IR
> sections by default.  In your configuration ld detects a __gnu_lto_slim
> symbol and decided that the object file contains only LTO IR without asm. 
> I am going to investigate where is the difference with my configuration
> and fix the bug.

FYI, I'm getting
+WARNING: program timed out.
+FAIL: libgomp.c/examples-4/e.54.2.c execution test
on both x86_64-linux and i686-linux (normal --enable-checking=yes,rtl
bootstrap, no offloading configure options).
binutils-2.24, ld.bfd.

Jakub


Re: [PATCH 0/4] OpenMP 4.0 offloading to Intel MIC

2014-11-13 Thread Jakub Jelinek
On Thu, Nov 13, 2014 at 04:15:48PM +0100, Tobias Burnus wrote:
> Question: Is the latter up to date - and the item above correct?

Will leave that to Kirill.

> BTW: you could update gcc.gnu.org ->news and gcc.gnu.org/gcc-5/changes.html

Indeed, that should be updated.

> Otherwise:
> * OpenACC support is about to be merged (as alternative to OpenMP 4)

I hope so.

> * Support for offloading to NVidia GPUs via PTX is also about to be merged.

Ditto.

Then the question is how hard will it be to get OpenACC offloading to
XeonPhi (real hw or emulation) - I guess it is a matter of whether the
plugin needs to implement some extra hooks for OpenACC, and also
whether we can get OpenMP offloading to PTX (dunno if Thomas or his
colleagues have actually tried it on simple testcases; I bet the hardest part
will be porting libgomp away from pthread_* to optionally be supported
by the limited nvptx target and use its threading model; whether __thread
is already supported by nvptx etc.).  I'm willing to help with this once I
have some hw, but some help from people familiar with PTX would be certainly
appreciated.  Because without libgomp ported to nvptx-*-* target (or some
way to inline all the GOMP_*/omp_* calls in offloading regions for nvptx,
but the latter might be too hard), I guess one could offload very simple
target regions, but not anything using #pragma omp inside of them.

Jakub


Re: libgomp: "GNU OpenMP Runtime Library" (was: [PATCH 1/5] OpenACC 2.0 support for libgomp - OpenACC runtime, NVidia PTX/CUDA plugin (repost))

2014-11-12 Thread Jakub Jelinek
On Wed, Nov 12, 2014 at 03:22:21PM -0500, David Malcolm wrote:
> On Wed, 2014-11-12 at 14:47 +0100, Jakub Jelinek wrote:
> > On Wed, Nov 12, 2014 at 08:33:34AM -0500, David Malcolm wrote:
> > > Apologies for bikeshedding, and I normally dislike "cute" names, but
> > > renaming it to
> > > 
> > >"GNU Offloading and Multi Processing library"
> > > 
> > > would allow a backronym of "libgomp", thus preserving the existing
> > > filenames/SONAME etc.
> > 
> > I think this is fine, can you change it both in libgomp/configure.ac
> > and texi docs?
> 
> Am attaching a patch that does so, though I suspect the wording in the
> texi may need some more work (not my area of expertise).

Oops, I didn't mean by "you" above you, but the OpenACC folks, sorry for
confusion.  Anyway, your patch is ok for trunk.  Thanks.

> From f52f7d0e2115d3f88e8662cab650f8746a2c147d Mon Sep 17 00:00:00 2001
> From: David Malcolm 
> Date: Wed, 12 Nov 2014 12:25:25 -0500
> Subject: [PATCH] Change "human" name of libgomp
> 
> libgomp/ChangeLog:
>   * configure.ac (AC_INIT): Rename from "GNU OpenMP Runtime Library"
>   to "GNU Offloading and Multi Processing Runtime Library".
>   * libgomp.texi (direntry): Likewise.  Reword to refer to both
>   OpenMP and OpenACC.
>   (Introduction): Reword.
>   (Runtime Library Routines): Reword.
> ---
>  libgomp/configure.ac |  2 +-
>  libgomp/libgomp.texi | 14 ++++++++------
>  2 files changed, 9 insertions(+), 7 deletions(-)
> 
> diff --git a/libgomp/configure.ac b/libgomp/configure.ac
> index 84d250f..1a70058 100644
> --- a/libgomp/configure.ac
> +++ b/libgomp/configure.ac
> @@ -2,7 +2,7 @@
>  # aclocal -I ../config && autoconf && autoheader && automake
>  
>  AC_PREREQ(2.64)
> -AC_INIT([GNU OpenMP Runtime Library], 1.0,,[libgomp])
> +AC_INIT([GNU Offloading and Multi Processing Runtime Library], 1.0,,[libgomp])
>  AC_CONFIG_HEADER(config.h)
>  
>  # ---
> diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
> index 254be57..78e8404 100644
> --- a/libgomp/libgomp.texi
> +++ b/libgomp/libgomp.texi
> @@ -31,11 +31,12 @@ texts being (a) (see below), and with the Back-Cover 
> Texts being (b)
>  @ifinfo
>  @dircategory GNU Libraries
>  @direntry
> -* libgomp: (libgomp).GNU OpenMP runtime library
> +* libgomp: (libgomp).   GNU Offloading and Multi Processing Runtime library
>  @end direntry
>  
> -This manual documents the GNU implementation of the OpenMP API for 
> -multi-platform shared-memory parallel programming in C/C++ and Fortran.
> +This manual documents libgomp, the GNU Offloading and Multi
> +Processing Runtime library.  This is the GNU implementation of the OpenMP
> +and OpenACC APIs for parallel programming in C/C++ and Fortran.
>  
>  Published by the Free Software Foundation
>  51 Franklin Street, Fifth Floor
> @@ -69,7 +70,8 @@ Boston, MA 02110-1301, USA@*
>  @top Introduction
>  @cindex Introduction
>  
> -This manual documents the usage of libgomp, the GNU implementation of the 
> +This manual documents the usage of libgomp, the GNU Offloading and Multi
> +Processing Runtime library.  This is the GNU implementation of the
>  @uref{http://www.openmp.org, OpenMP} Application Programming Interface (API)
>  for multi-platform shared-memory parallel programming in C/C++ and Fortran.
>  
> @@ -82,8 +84,8 @@ for multi-platform shared-memory parallel programming in 
> C/C++ and Fortran.
>  @comment
>  @menu
>  * Enabling OpenMP::How to enable OpenMP for your applications.
> -* Runtime Library Routines::   The OpenMP runtime application programming 
> -   interface.
> +* Runtime Library Routines::   The offloading and multiprocessing runtime
> +   application programming interface.
>  * Environment Variables::  Influencing runtime behavior with environment 
> variables.
>  * The libgomp ABI::Notes on the external ABI presented by 
> libgomp.
> -- 
> 1.8.5.3
> 


Jakub


Re: libgomp: "GNU OpenMP Runtime Library" (was: [PATCH 1/5] OpenACC 2.0 support for libgomp - OpenACC runtime, NVidia PTX/CUDA plugin (repost))

2014-11-12 Thread Jakub Jelinek
On Wed, Nov 12, 2014 at 08:33:34AM -0500, David Malcolm wrote:
> Apologies for bikeshedding, and I normally dislike "cute" names, but
> renaming it to
> 
>"GNU Offloading and Multi Processing library"
> 
> would allow a backronym of "libgomp", thus preserving the existing
> filenames/SONAME etc.

I think this is fine, can you change it both in libgomp/configure.ac
and texi docs?

Jakub


Re: [RFC] UBSan unsafely uses VRP

2014-11-12 Thread Jakub Jelinek
On Wed, Nov 12, 2014 at 12:58:37PM +0300, Yury Gribov wrote:
> On 11/12/2014 11:45 AM, Marek Polacek wrote:
> >On Wed, Nov 12, 2014 at 11:42:39AM +0300, Yury Gribov wrote:
> >>On 11/11/2014 05:15 PM, Jakub Jelinek wrote:
> >>>>There are also some unsafe code in functions
> >>>>ubsan_expand_si_overflow_addsub_check, ubsan_expand_si_overflow_mul_check
> >>>>which uses get_range_info to reduce checks number. As seen before vrp 
> >>>>usage
> >>>>for sanitizers may decrease quality of error detection.
> >>>
> >>>Using VRP is completely intentional there, we don't want to generate too
> >>>slow code if you decide you want to optimize your code (for -O0 VRP isn't
> >>>performed of course).
> >>
> >>On the other hand detection quality is probably more important than
> >>important regardless of optimization level. When I use a checker, I don't
> >>want it to miss bugs due to overly aggressive optimization.
> >
> >Yes, but as said above, VRP is only run with >-O2 and -Os.
> 
> Hm, I must be missing something.  99% of users will only run their code
> under -O2 because it'll be too slow otherwise.  Why should we penalize them
> for this by lowering analysis quality?  Isn't error detection the main goal
> of sanitizers (performance being the secondary at best)?

But if -O0 isn't too slow for them, having unnecessary bloat even at -O2
is just as bad.  By not using VRP at all, you are giving up all the cases
where you know something won't overflow because you e.g. sign extend
or zero extend from some smaller type, sum up such values and something
with a constant, or can use cheaper code to multiply etc.
Turning off -faggressive-loop-optimizations is certainly the right thing for
-fsanitize=undefined (any undefined I'd say), and so perhaps is turning off
selected other optimizations.

Jakub


Re: libgomp: "GNU OpenMP Runtime Library" (was: [PATCH 1/5] OpenACC 2.0 support for libgomp - OpenACC runtime, NVidia PTX/CUDA plugin (repost))

2014-11-12 Thread Jakub Jelinek
On Wed, Nov 12, 2014 at 12:18:13PM +0100, Thomas Schwinge wrote:
> On Wed, 12 Nov 2014 11:06:26 +0100, Jakub Jelinek  wrote:
> > On Tue, Nov 11, 2014 at 01:53:23PM +, Julian Brown wrote:
> > > --- a/libgomp/configure.ac
> > > +++ b/libgomp/configure.ac
> > > @@ -2,6 +2,8 @@
> > >  # aclocal -I ../config && autoconf && autoheader && automake
> > >  
> > >  AC_PREREQ(2.64)
> > > +#TODO: Update for OpenACC?  But then also have to update copyright 
> > > notices in
> > > +#all source files...
> > >  AC_INIT([GNU OpenMP Runtime Library], 1.0,,[libgomp])
> > 
> > Please drop this.
> 
> (I agree to drop the TODO marker, obviously.)  Note that I'm not trying
> to drive this into a "bikeshedding" discussion, and neither is my
> intention to discredit the lots of pioneering OpenMP work in GCC (which
> we're largely basing our OpenACC work on -- thanks!).
> 
> The underlying question here is, with offloading generally as well as the
> OpenACC Runtime Library also to be living in libgomp, calling it "GNU
> OpenMP Runtime Library" is no longer accurate.  (Also, I'm not proposing
> to change the libgomp library name -- that would probably be too much of
> a hassle?)  Do we want a new "verbose" name for libgomp, "GNU Offloading,
> OpenACC, and OpenMP Runtime Library" (sorting alphabetically), or
> something else, or no change.  I'm afraid that not changing it will be
> confusing to users who are looking for the GCC implementation of the
> OpenACC Runtime Library, for example?

Yeah, it is something I wanted to mention in the review of the documentation
patch, calling it just GNU OpenMP Runtime Library is not right after
it handles OpenACC too, but GNU Offloading, OpenACC and OpenMP Runtime
Library sounds bad to me too, because offloading (both OpenMP offloading and
OpenACC offloading) is actually only a small part of what the library is
about, I still view the library primarily as being a runtime for
OpenMP parallelization, tasking etc.; that's how it started and even OpenMP
offloading is just a matter of the last year (and until today in upstream
not even any actual offloading), for OpenACC it is solely about
offloading and directives in the offloaded code, right?

So, I don't want to bikeshed, but I'd call it
GNU OpenMP and OpenACC Runtime Library, simply because of what it does
and how it evolved.  I know it isn't alphabetically sorted that way, but
the library has more than 9 years of history now and tons of users
already.

Jakub


Re: [RFC] UBSan unsafely uses VRP

2014-11-11 Thread Jakub Jelinek
On Tue, Nov 11, 2014 at 07:12:55PM +0300, Marat Zakirov wrote:
> It seems that -fsanitize=something does not set
> flag_aggressive_loop_optimizations to 0 in current GCC version. I made a
> watchpoint on it but changes after init_options_struct weren't found. I will
> make fix for both flag_aggressive_loop_optimizationsno-strict-overflow and
> flag_strict_overflow.

Ah, you're right, we only have:
/* When instrumenting the pointers, we don't want to remove
   the null pointer checks.  */
if (opts->x_flag_sanitize & (SANITIZE_NULL | SANITIZE_NONNULL_ATTRIBUTE
 | SANITIZE_RETURNS_NONNULL_ATTRIBUTE))
  opts->x_flag_delete_null_pointer_checks = 0;
and as Joseph said, even that is misplaced, it should be done after all
options are processed, so that it isn't dependent on whether
-fsanitize=undefined or
-fdelete-null-pointer-checks/-faggressive-loop-optimizations come first.
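
The ordering problem can be sketched in isolation.  Everything below is
hypothetical illustration code (struct opts, record_option and
apply_implied_options are not GCC's real option machinery); it only
shows why implications should be applied once, after the whole command
line has been seen.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical option state; not GCC's real gcc_options.  */
struct opts
{
  bool sanitize_null;
  bool delete_null_pointer_checks;
};

/* Record one option as written, without applying any implications.  */
static void
record_option (struct opts *o, const char *arg)
{
  if (strcmp (arg, "-fsanitize=null") == 0)
    o->sanitize_null = true;
  else if (strcmp (arg, "-fdelete-null-pointer-checks") == 0)
    o->delete_null_pointer_checks = true;
}

/* Apply implications after all options have been recorded, so the
   result does not depend on the order of the options.  */
static void
apply_implied_options (struct opts *o)
{
  if (o->sanitize_null)
    o->delete_null_pointer_checks = false;
}

/* Process a whole command line and return the final setting.  */
static bool
final_delete_null_checks (int argc, const char *const *argv)
{
  struct opts o = { false, false };
  for (int i = 0; i < argc; i++)
    record_option (&o, argv[i]);
  apply_implied_options (&o);
  return o.delete_null_pointer_checks;
}
```

With this structure, both orderings of the two options end up with the
null pointer check deletion disabled.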

Jakub


Re: [RFC] UBSan unsafely uses VRP

2014-11-11 Thread Jakub Jelinek
On Tue, Nov 11, 2014 at 05:02:55PM +0300, Marat Zakirov wrote:
> I found that UBSan uses vrp pass to optimize generated checks. Keeping in
> mind that vrp pass is about performance not stability I found example where
> UBSan may skip true positive.
> 
> Example came from spec2006 perlbench:
> 
> int ext;
> 
> int
> Perl_do_sv_dump()
> {
> int freq[10];
> int i;
> int max = 0;
> int t = INT_MAX - 20;
> 
> if (max < ext)
>   max = ext;
> 
> for (i = 0; i <= max; i++)
> if (freq[i])
>   ext = 0;
> 
> t += i;  <<< (*)
> return t;
> }
> 
> vrp pass here sets vrp('i') to [0..10] in assumption that 'freq[i]' wont
> violate array bound (vrp uses loop iteration number calculation, see
> adjust_range_with_scev in tree-vrp.c). This means that UBSAN_CHECK_ADD build
> for (*) should be deleted as redundant (and actually it is deleted by vrp
> pass). So if at the execution max = 30, freq[5] != 0 uncaught overflow will
> occur.

Well, if max is >= 10, then you should get -fsanitize=bounds error already.
-fsanitize=undefined already disables -faggressive-loop-optimizations,
perhaps it can also disable other optimizations (I thought deriving number
of iterations from assuming undefined behavior doesn't occur in loop stmts
is already guarded by -faggressive-loop-optimizations though).

> There are also some unsafe code in functions
> ubsan_expand_si_overflow_addsub_check, ubsan_expand_si_overflow_mul_check
> which uses get_range_info to reduce checks number. As seen before vrp usage
> for sanitizers may decrease quality of error detection.

Using VRP is completely intentional there, we don't want to generate too
slow code if you decide you want to optimize your code (for -O0 VRP isn't
performed of course).

Jakub


Re: [gimple-classes, committed 4/6] tree-ssa-tail-merge.c: Use gassign

2014-11-10 Thread Jakub Jelinek
On Mon, Nov 10, 2014 at 05:27:50PM -0500, David Malcolm wrote:
> On Sat, 2014-11-08 at 14:56 +0100, Jakub Jelinek wrote:
> > On Sat, Nov 08, 2014 at 01:07:28PM +0100, Richard Biener wrote:
> > > To be constructive here - the above case is from within a
> > > GIMPLE_ASSIGN case label
> > > and thus I'd have expected
> > > 
> > > case GIMPLE_ASSIGN:
> > >   {
> > > > gassign *a1 = as_a <gassign *> (s1);
> > > > gassign *a2 = as_a <gassign *> (s2);
> > >   lhs1 = gimple_assign_lhs (a1);
> > >   lhs2 = gimple_assign_lhs (a2);
> > >   if (TREE_CODE (lhs1) != SSA_NAME
> > >   && TREE_CODE (lhs2) != SSA_NAME)
> > > return (operand_equal_p (lhs1, lhs2, 0)
> > > && gimple_operand_equal_value_p (gimple_assign_rhs1 (a1),
> > > >  gimple_assign_rhs1 (a2)));
> > >   else if (TREE_CODE (lhs1) == SSA_NAME
> > >&& TREE_CODE (lhs2) == SSA_NAME)
> > > return vn_valueize (lhs1) == vn_valueize (lhs2);
> > >   return false;
> > >   }
> > > 
> > > instead.  That's the kind of changes I have expected and have approved of.
> > 
> > But even that looks like just adding extra work for all developers, with no
> > gain.  You only have to add extra code and extra temporaries, in switches
> > typically also have to add {} because of the temporaries and thus extra
> > indentation level, and it doesn't simplify anything in the code.
> 
> The branch attempts to use the C++ typesystem to capture information
> about the kinds of gimple statement we expect, both:
>   (A) so that the compiler can detect type errors, and
>   (B) as a comprehension aid to the human reader of the code
> 
> The ideal here is when function params and struct field can be
> strengthened from "gimple" to a subclass ptr.  This captures the
> knowledge that every use of a function or within a struct has a given
> gimple code.

I just don't like all the as_a/is_a stuff enforced everywhere,
it means more typing, more temporaries, more indentation.
So, as I view it, instead of the checks being done cheaply (yes, I think
the gimple checking as we have right now is very cheap) under the
hood by the accessors (gimple_assign_{lhs,rhs1} etc.), those changes
put the burden on the developers, who have to check that manually through
the as_a/is_a stuff everywhere, with more typing and uglier syntax.
I just don't see that as a step forward, instead a huge step backwards.
But perhaps I'm alone with this.
Can you e.g. compare the size of - lines in your patchset combined, and
size of + lines in your patchset?  That is, do your changes lead to less
typing or more?
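
The two styles being argued about can be sketched in plain C.  This is
an analogy with made-up types and names, not GCC's gimple API: style 1
keeps the check inside the accessor, style 2 makes the caller convert
to a typed pointer first.

```c
#include <assert.h>

enum stmt_code { STMT_ASSIGN, STMT_CALL };

struct stmt { enum stmt_code code; };
struct assign_stmt { struct stmt base; int lhs; };

/* Style 1: the accessor takes the generic pointer and checks the
   statement code itself, under the hood.  */
static int
assign_lhs_checked (const struct stmt *s)
{
  assert (s->code == STMT_ASSIGN);
  return ((const struct assign_stmt *) s)->lhs;
}

/* Style 2: the caller converts once to a typed pointer...  */
static const struct assign_stmt *
as_assign (const struct stmt *s)
{
  assert (s->code == STMT_ASSIGN);
  return (const struct assign_stmt *) s;
}

/* ...and the accessor then takes the typed pointer and needs no
   check of its own.  */
static int
assign_lhs (const struct assign_stmt *a)
{
  return a->lhs;
}
```

A caller writes assign_lhs_checked (s) in the first style and
assign_lhs (as_assign (s)) in the second, typically with an extra
temporary when the typed pointer is reused.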

Jakub


Re: missing warnings with -Warray-bounds

2014-11-10 Thread Jakub Jelinek
On Mon, Nov 10, 2014 at 12:52:02AM -0800, Martin Uecker wrote:
> Jakub Jelinek :
> > On Mon, Nov 10, 2014 at 12:20:03AM -0800, Martin Uecker wrote:
> > > There is also no warning in the following example
> > > when the array is the last element of a struct.
> > > 
> > > struct h3 {
> > > int i;
> > > int j[3];
> > > };
> > > 
> > > struct h3* h3 = malloc(sizeof(struct h) + 3 * sizeof(int));
> > > h3->j[4] = 1;
> > > 
> > > I guess this is to avoid warnings for the 'struct hack', but why 
> > > is this not limited to arrays with size 0 (and maybe 1) and 
> > > flexible array members?
> > 
> > Because 0 or 1 are not the only ones recognized as poor man's flexible array
> > members, any trailing arrays are, whatever the constant is.  So it is very
> > much intentional we don't warn above.  
> 
> Is such code common?

Yes.

> Clang does warn in this case. 

Clang clearly doesn't care about false positives, it is driven by the desire
to emit as many warnings as possible.

> The warning seems very useful to me and can easily be turned off. 
> Or one could add -W(no-)warn-struct-hack if really needed.
> 
> Another odd case is:
> 
> struct h0b {
>   int i;
>   int j[0];
>   int k;
> };
> 
> struct h0b* h0b = ...
> 
> h0b->j[4] = 1;  

-fsanitize=undefined should catch this.

> > You haven't provided struct h definition,
> 
> Sorry, this should have been sizeof(struct h3).

In that case the code you've posted is valid, there should be no warnings or
runtime error messages.

Jakub
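
A minimal sketch of the trailing-array idiom discussed above, next to the C99 flexible array member that expresses the same intent directly. The names `h3_trailing_slots`, `struct hflex` and `hflex_new` are hypothetical and only serve the illustration; `struct h3` is the one from the thread, and the allocation size is the corrected `sizeof(struct h3) + 3 * sizeof(int)`.

```c
#include <stddef.h>
#include <stdlib.h>

/* Trailing fixed-size array, as in the thread: GCC treats any trailing
   array as a "poor man's" flexible array member. */
struct h3 {
    int i;
    int j[3];
};

/* The same intent written with a C99 flexible array member. */
struct hflex {
    int i;
    int j[];
};

/* Number of int slots available from j[0] onwards when the object is
   allocated as sizeof(struct h3) + 3 * sizeof(int) bytes. */
size_t h3_trailing_slots(void)
{
    size_t bytes = sizeof(struct h3) + 3 * sizeof(int);
    return (bytes - offsetof(struct h3, j)) / sizeof(int);
}

/* Allocate a flexible-array object with room for n elements of j,
   all initialized to zero.  Returns NULL on allocation failure. */
struct hflex *hflex_new(size_t n)
{
    struct hflex *p = malloc(sizeof *p + n * sizeof p->j[0]);
    if (p != NULL) {
        p->i = (int) n;
        for (size_t k = 0; k < n; k++)
            p->j[k] = 0;
    }
    return p;
}
```

With the allocation from the thread, the slot count is larger than 4, which is why the store to `j[4]` stays inside the allocated object and neither -Warray-bounds nor the sanitizers are expected to complain.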


Re: missing warnings with -Warray-bounds

2014-11-10 Thread Jakub Jelinek
On Mon, Nov 10, 2014 at 12:20:03AM -0800, Martin Uecker wrote:
> There is also no warning in the following example
> when the array is the last element of a struct.
> 
> struct h3 {
> int i;
> int j[3];
> };
> 
> struct h3* h3 = malloc(sizeof(struct h) + 3 * sizeof(int));
> h3->j[4] = 1;
> 
> I guess this is to avoid warnings for the 'struct hack', but why 
> is this not limited to arrays with size 0 (and maybe 1) and 
> flexible array members?

Because 0 or 1 are not the only ones recognized as poor man's flexible array
members, any trailing arrays are, whatever the constant is.  So it is very
much intentional we don't warn above.  You haven't provided struct h definition,
if you meant offsetof(struct h3, j[0]) or similar instead, then I think
-fsanitize=undefined should diagnose this at runtime (and of course
-fsanitize=address too).

Jakub


Re: [gimple-classes, committed 4/6] tree-ssa-tail-merge.c: Use gassign

2014-11-08 Thread Jakub Jelinek
On Sat, Nov 08, 2014 at 01:07:28PM +0100, Richard Biener wrote:
> To be constructive here - the above case is from within a
> GIMPLE_ASSIGN case label
> and thus I'd have expected
> 
> case GIMPLE_ASSIGN:
>   {
>     gassign *a1 = as_a <gassign *> (s1);
>     gassign *a2 = as_a <gassign *> (s2);
>     lhs1 = gimple_assign_lhs (a1);
>     lhs2 = gimple_assign_lhs (a2);
>     if (TREE_CODE (lhs1) != SSA_NAME
>         && TREE_CODE (lhs2) != SSA_NAME)
>       return (operand_equal_p (lhs1, lhs2, 0)
>               && gimple_operand_equal_value_p (gimple_assign_rhs1 (a1),
>                                                gimple_assign_rhs1 (a2)));
>     else if (TREE_CODE (lhs1) == SSA_NAME
>              && TREE_CODE (lhs2) == SSA_NAME)
>       return vn_valueize (lhs1) == vn_valueize (lhs2);
>     return false;
>   }
> 
> instead.  That's the kind of changes I have expected and have approved of.

But even that looks like just adding extra work for all developers, with no
gain.  You only have to add extra code and extra temporaries, in switches
typically also have to add {} because of the temporaries and thus extra
indentation level, and it doesn't simplify anything in the code.

Jakub


Re: [gimple-classes, committed 4/6] tree-ssa-tail-merge.c: Use gassign

2014-11-07 Thread Jakub Jelinek
On Fri, Nov 07, 2014 at 10:01:45PM +0100, Richard Biener wrote:
> > --- a/gcc/tree-ssa-tail-merge.c
> > +++ b/gcc/tree-ssa-tail-merge.c
> > @@ -484,7 +484,7 @@ same_succ_hash (const_same_succ e)
> >
> >hstate.add_int (gimple_code (stmt));
> >if (is_gimple_assign (stmt))
> > -   hstate.add_int (gimple_assign_rhs_code (stmt));
> > +   hstate.add_int (gimple_assign_rhs_code (as_a <gassign *> (stmt)));
> >if (!is_gimple_call (stmt))
> > continue;
> >if (gimple_call_internal_p (stmt))
> > @@ -1172,8 +1172,10 @@ gimple_equal_p (same_succ same_succ, gimple s1, 
> > gimple s2)
> >if (TREE_CODE (lhs1) != SSA_NAME
> >   && TREE_CODE (lhs2) != SSA_NAME)
> > return (operand_equal_p (lhs1, lhs2, 0)
> > -   && gimple_operand_equal_value_p (gimple_assign_rhs1 (s1),
> > -gimple_assign_rhs1 (s2)));
> > +   && gimple_operand_equal_value_p (gimple_assign_rhs1 (
> > +  as_a <gassign *> (s1)),
> > +gimple_assign_rhs1 (
> > +  as_a <gassign *> (s2))));
> 
> Just a comment as these patches flow by - I think this is a huge step
> backwards from "enforcing" s1/s2 being a gimple_assign inside
> gimple_assign_rhs1 to this as_a <gassign *> boilerplate at _each_ callsite!
> 
> Which means this step of the refactoring is totally broken and probably
> requires much more manual work to avoid this kind of ugliness.
> 
> I definitely won't approve of this kind of changes.

I have to agree with this, this is too ugly to live with.
I must say I don't find anything wrong with what we have right now,
unlike RTL checking, the gimple checking is inexpensive, and much better
to do it that way than to force all developers to write it this way.
Otherwise we'll end up with code as ugly as in LLVM :(.

Jakub


GCC 5.0 Status Report (2014-11-03), Stage 1 ends Nov 15th

2014-11-03 Thread Jakub Jelinek
Status
======

The trunk is scheduled to transition from Stage 1 to Stage 3 at the end
of Saturday, November 15th (use your timezone to your advantage).

We have been in Stage 1 for almost 7 months now with a fortnight
still to go.  Still, now is a good time to look into bugzilla
and pick one or two regressions in your area of expertise and fix them
(you may want to prioritize regressions against both 4.9 and 5).

What larger merges are still planned for GCC 5?
I'm aware of pending merges from match-and-simplify branch, there
are the JIT changes partially? approved, MPX also partially? approved,
Intel offloading patches partially approved, PTX support partially
reviewed.  Thomas, do you plan to post OpenACC changes for review
still during stage1?  Do you have any dependencies there (PTX and/or
Intel offloading being merged first?)?  What else have people been working
on that can get posted for review before stage1 closes?
As before, when new features are posted for review during stage 1 and only
acked early during stage 3, they can still be accepted for GCC 5.

Somewhat misleading quality data below, P3 bugs have not been
re-prioritized for quite some time now.  We promise to do this shortly
after entering Stage 3.

Quality Data
============

Priority          #   Change from last report
--------        ---   -----------------------
P1               10   +  10
P2               82   +   6
P3               92   +  86
--------        ---   -----------------------
Total           184   + 102

Previous Report
===============

https://gcc.gnu.org/ml/gcc/2014-04/msg00090.html

The next report will be sent by Richard, announcing transition to stage 3.


Re: [RFC] Adjusted VRP

2014-10-30 Thread Jakub Jelinek
On Thu, Oct 30, 2014 at 04:19:24PM +0300, Marat Zakirov wrote:
> We didn't find reasonable performance gains to use VRP in asan. But even if
> we found we couldn't use it because it is not safe for asan. It make some
> optimistic conclusions invalid for asan.
> 
> Adjusted VRP memory upper bound is #{trees that are compared} x nblocks
> which could be reduced by some threshold.
> 
> If making stuff inside VRP is a right thing why can't we do all
> VRP-dependent optimizations in the VRP transform phase? Why do we need
> range_infos if they are so imprecise?!

It is not imprecise, and really isn't (so far) used that heavily in the
compiler, most of the optimizations based on value ranges really do happen
in the VRP transform phase.

Jakub


Re: [RFC] Adjusted VRP

2014-10-30 Thread Jakub Jelinek
On Thu, Oct 30, 2014 at 02:16:04PM +0300, Yury Gribov wrote:
> On 10/30/2014 01:27 PM, Richard Biener wrote:
> >Well, VRP is not path-insensitive - it is the value-ranges we are able
> >to retain after removing the ASSERT_EXPRs VRP inserts.
> >
> >Why can't you do the ASAN optimizations in the VRP transform phase?
> 
> I think this is not Asan-specific: Marat's point was that allowing
> basic-block-precise ranges would generally allow middle-end to produce
> better code.

The reason for get_range_info in the current form is that it is cheap, and
unless we want to make some SSA_NAMEs non-propagatable [*], IMHO it should
stay that way.  Now that we have ASAN_ internal calls, if you want to
optimize away ASAN_CHECK calls where VRP suggests that e.g. array
index will be within the right bounds and you'd optimize away ASAN_CHECK to
a VAR_DECL access if the index was constant (say minimum or maximum of the
range), you can do so in VRP and it is the right thing to do it there.

[*] - that is something I've been talking about for __builtin_unreachable ()
etc., whether it would be worth it, if range_info of a certain SSA_NAME that
VRP would want to remove again was significantly better than range info of
the base SSA_NAME, to keep that SSA_NAME around and e.g. block forwprop etc.
from propagating the SSA_NAME copy, unless something other than SSA_NAME has
been propagated into it.  Richard was against that though.

Jakub


GCC 4.9.2 Status Report (2014-10-30)

2014-10-30 Thread Jakub Jelinek
Status
======

GCC 4.9.2 has been released, the branch is now open again
for regression and documentation fixes.
Unless some blocker bug is found, GCC 4.9.3 should be released
in March or April.

Quality Data
============

Priority          #   Change from Last Report
--------        ---   -----------------------
P1                0        0
P2               82        0
P3               46   +    6
--------        ---   -----------------------
Total           128   +    6


Previous Report
===============

https://gcc.gnu.org/ml/gcc/2014-10/msg00195.html

The next report will be sent by Richard.


GCC 4.9.2 Released

2014-10-30 Thread Jakub Jelinek
The GNU Compiler Collection version 4.9.2 has been released.

GCC 4.9.2 is a bug-fix release from the GCC 4.9 branch
containing important fixes for regressions and serious bugs in
GCC 4.9.1 with more than 65 bugs fixed since the previous release.
This release is available from the FTP servers listed at:

  http://www.gnu.org/order/ftp.html

Please do not contact me directly regarding questions or comments
about this release.  Instead, use the resources available from
http://gcc.gnu.org.

As always, a vast number of people contributed to this GCC release
-- far too many to thank them individually!


Re: PR63633: May middle-end come up width hard regs for insn expanders?

2014-10-28 Thread Jakub Jelinek
On Tue, Oct 28, 2014 at 01:06:38PM +0100, Georg-Johann Lay wrote:
> The C test case is
> 
> 
> __attribute__((noinline,noclone))
> unsigned bug_mulhi_highpart (unsigned x)
> {
> register unsigned reg26 asm ("26") = 10;
> a = x / 10;
> __asm volatile ("; %0 " : "+r" (reg26));
> return reg26;
> }
> 
> int main (void)
> {
> if (10 != bug_mulhi_highpart (0))
> __builtin_abort();
> return 0;
> }
> 
> 
> The specification guarantees reg26 to be allocated to R26/27 only at the
> time it is used as operand to the inline asm.

I'd say if on the target reg26 or reg27 is used or clobbered by
multiplication, then it is a user error to use it this way, the register
then isn't suitable for the local hard register usage.
It is the same thing as trying to use asm ("eax") on i?86 around code that
requires clobbering/setting of that register (e.g. set flags,
division/multiplication etc.).
E.g. glibc when using register asm ("...") carefully puts all needed
computations into temporary variables before entering code with
the local register vars, which are just initialized to the temporaries,
then some inline asm that needs those vars, perhaps save result of some of
the local reg vars into temporaries again and leave the scope with them.
Of course if you use a fixed reg or some really general purpose one where you
have several other regs usable for the same purpose, it is not as big an
issue as when you use some specialized regs.

Jakub


Re: PR63633: May middle-end come up width hard regs for insn expanders?

2014-10-24 Thread Jakub Jelinek
On Fri, Oct 24, 2014 at 08:19:57PM +0200, Georg-Johann Lay wrote:
> Yes, that's the straight forward approach which works so far.  Bit tedious,
> but well...
> 
> In one case expmed generated different code, though: divmodhi instead of
> mulhi_highpart for HI division by const_int.  Cheating with costs did not
> help.  However for now I am mostly after correct, ICE-less code.
> 
> What I am concerned about is:
> 
> 1) May it happen that a value lives in a hard-reg across the expander? The
> expander has no means to detect that situation and interfere, e.g.
> 
> hard-reg = source_value // middle-end
> expand-code // back-end
> sink_value = hard-reg   // middle-end
> 
> where "expand-code" is one define_expand / define_insn that clobbers /
> changes (parts of) hard-reg but does *not* get hard-reg as operand. This is
> wrong code obviously.

It can happen, but if it happens, that would mean a user code bug, like using
register asm with a register that is unsuitable for use as a global or local
register var on the target, or it could be a backend bug (expansion of some
pattern clobbering a register that has other fixed uses).
You shouldn't ICE on it, but what happens is undefined.

Before RA, the use of hard regs should be limited (pretty much just fixed
regs where really necessary, global and local register variables (user needs
to use with care), function arguments and return values (short lived around
the call patterns)).

Jakub


Re: Restricting arguments to intrinsic functions

2014-10-24 Thread Jakub Jelinek
On Fri, Oct 24, 2014 at 06:25:29PM +0100, Charles Baylis wrote:
> On 24 October 2014 17:05, Andrew Pinski  wrote:
> > On Fri, Oct 24, 2014 at 8:11 AM, Tejas Belagod  
> > wrote:
> 
> >> The diagnostic issued points to the line in arm_neon.h, but we expect this
> >> to point to the line in cr.c. I suspect we need something closer to the
> >> front-end?
> >
> >
> > You need to change arm_neon.h to use the __artificial__ attribute as I
> > mentioned before.  Also please move away from "static inline" since
> > they cannot be used from templates in C++98/03.
> 
> The __artificial__ attribute seems like it would improve the debug
> view, but it does not seem to affect the location of error reporting.

Note, for that case you can't emit the error too early, you need to wait
until inlining and constant propagation are performed, otherwise the
argument would never be constant.  And for -O0, you need to use
macros instead of inlines, see what i?86 *intrin.h headers are doing for
that.

Jakub
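
A small sketch of the macro approach mentioned above for -O0, where an inline function's argument is never seen as a constant. Everything here is hypothetical (`vec2f`, `vec2f_get_lane_fn`, `VEC2F_GET_LANE`), and the `_Static_assert` inside a GNU statement expression is an assumption added for the illustration, not what the i?86 headers do; those simply hand the macro argument to the builtin.

```c
/* Hypothetical two-lane vector type; not the real arm_neon.h API. */
typedef struct {
    float lane[2];
} vec2f;

/* Inline-function form: at -O0 the compiler only sees "int lane" here,
   so a constant-ness or range check cannot be done at this point. */
static inline float vec2f_get_lane_fn(vec2f v, int lane)
{
    return v.lane[lane];
}

/* Macro form: the lane expression is substituted textually, so the check
   is evaluated at the call site even without inlining or constant
   propagation.  Uses a GNU statement expression and C11 _Static_assert. */
#define VEC2F_GET_LANE(v, lane)                                          \
    __extension__ ({                                                     \
        _Static_assert((lane) >= 0 && (lane) < 2,                        \
                       "lane must be a constant in the range 0..1");     \
        vec2f_get_lane_fn((v), (lane));                                  \
    })
```

Passing a non-constant or out-of-range lane to `VEC2F_GET_LANE` is rejected at compile time, and the diagnostic points at the line of the caller rather than at the header.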


Re: Restricting arguments to intrinsic functions

2014-10-24 Thread Jakub Jelinek
On Fri, Oct 24, 2014 at 06:25:29PM +0100, Charles Baylis wrote:
> On 24 October 2014 17:05, Andrew Pinski  wrote:
> > On Fri, Oct 24, 2014 at 8:11 AM, Tejas Belagod  
> > wrote:
> 
> >> The diagnostic issued points to the line in arm_neon.h, but we expect this
> >> to point to the line in cr.c. I suspect we need something closer to the
> >> front-end?
> >
> >
> > You need to change arm_neon.h to use the __artificial__ attribute as I
> > mentioned before.  Also please move away from "static inline" since
> > they cannot be used from templates in C++98/03.
> 
> The __artificial__ attribute seems like it would improve the debug
> view, but it does not seem to affect the location of error reporting.

You can use %K for that, at least with -g you'll get all the details you
want.

> As an example (static inline from current code - will fix later)
> === t_artificial.h ===
> /* extracted from arm_neon.h */
> typedef float float32_t;
> typedef __builtin_neon_sf float32x2_t   __attribute__ ((__vector_size__ (8)));
> 
> __extension__ static __inline float32_t __attribute__
> ((__always_inline__,__artificial__)) vget_lane_f32 (float32x2_t __a,
> const int __b)
> {
>   return (float32_t)__builtin_neon_vget_lanev2sf (__a, __b, 3);
> }
> 
> === t_artificial.c ===
> #include "t_artificial.h"
> 
> int f(float32x2_t x) {
> int i = 3;
> float a = vget_lane_f32(x, i);
> }
> === end ===
> 
> $ arm-unknown-linux-gnueabihf-gcc -c - t_artificial.c -mfpu=neon
> 
> In file included from t_artificial.c:1:0:
> t_artificial.h: In function ‘f’:
> t_artificial.h:7:10: error: argument must be a constant
>return (float32_t)__builtin_neon_vget_lanev2sf (__a, __b, 3);
>   ^
> (...followed by an ICE because we allow this to result in an invalid
> insn, Jakub has explained how to address this)
> 
> 
> The error is reported in t_artificial.h, rather than t_artificial.c:5
> which is where the user has made the mistake.
> https://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html says that
> the artificial attribute affects generated debug info and doesn't
> mention error reporting at all.

Jakub


Re: PR63633: May middle-end come up width hard regs for insn expanders?

2014-10-24 Thread Jakub Jelinek
On Fri, Oct 24, 2014 at 10:23:22AM -0600, Jeff Law wrote:
> On 10/24/14 08:03, Georg-Johann Lay wrote:
> >Investigating PR63633 turned out that the middle-end calls insn
> >expanders with hard registers, namely mulsi3 from avr back-end:
> >
> >(define_expand "mulsi3"
> >   [(parallel [(set (match_operand:SI 0 "register_operand" "")
> >(mult:SI (match_operand:SI 1 "register_operand" "")
> > (match_operand:SI 2 "nonmemory_operand" "")))
> >   (clobber (reg:HI 26))
> >   (clobber (reg:DI 18))])]
> >...
> >
> >is being called with operands[0] = (reg:SI 22).  This overlaps hard
> >(reg:DI 18) which extends from R18...R25.
> >
> >mulsi3 assumes all operands are pseudo registers and consequently the
> >generated insn raises an ICE in the remainder, and there are other cases
> >for other expanders where wrong code gets generated because the clobbers
> >clobber a hard-reg input operand.
> >
> >Is it in order to have hard registers as arguments to expanders (and
> >maybe also to insns) like that at expand time (pass .expand) ?
> >
> >It is easy enough to kick these hard regs into new pseudos in the
> >respective expanders, however the question arises where the culprit is:
> >
> >back-end or middle-end?
> >
> >Until PR63633 I've never seen middle-end coming up with hard regs that
> >way, thus bunch of avr insns have been written under that -- maybe wrong
> >-- assumption...
> I don't think you can assume you're always going to get a pseudo, though the
> vast majority of the time you will.  Normally it won't matter unless (as
> noted above) the expander/pattern has references (particularly
> sets/clobbers) of overlapping hard registers.
> 
> It's also probably worth investigating why you got the hard register in the
> first place as that may indicate something that's wrong or suboptimal either
> in the expanders or in the target dependent bits.

But I'd say, if you can't handle hard regs in the operands (either general,
or some specific ones), you should
force the hard regs into pseudos (all hard regs, or just the problematic
ones) in the expander.
So in this case, check if they overlap with those 2 regs, and force the
input operands into pseudos if they do; the output will be harder, guess
you'd need to emit the pattern into a pseudo and emit_move_insn it
afterwards to the hard reg.

Jakub


Re: common subexpression elimination no longer working for asm()?

2014-10-24 Thread Jakub Jelinek
On Fri, Oct 24, 2014 at 10:01:52AM +0100, Jan Beulich wrote:
> > This changed because of my http://gcc.gnu.org/PR60663 fix.
> > In your testcase the inline asm doesn't have more than one output
> > (which IMNSHO is very much desirable not to CSE), and doesn't have explicit
> > clobbers either, but happens to have implicit clobbers (fprs and cc),
> > so CSE still could generate invalid code out of that without the fix
> > (if it decided to materialize the inline asm somewhere, instead of reusing
> > existing inline asm).
> > So, if we e.g. weakened the PR60663 fix so that it only bails out
> > if the inline asm contains more than one output, we'd need to fix up CSE so
> > that it analyzes all the clobbers and doesn't consider asms equivalent
> > just based on the ASM_OPERANDS (they need to have the same clobbers too),
> > and so that it doesn't try to materialize an asm without a preexisting insn
> > if it has any clobbers.
> 
> So why would clobbers in general matter? I can see memory clobbers
> to need special care, but any others? If two asm()-s only differ in the

Please start by looking at the PR the change fixed.
There CSE decided (ok, with the help of not very smart costs, but as the
testcase shows, it clearly can happen) to rematerialize the asm in a place
where the asm wasn't originally at all.  At that point it just inserted the
single ASM_OPERANDS, without anything else, leaving the other ASM_OPERANDS
(the testcase had asm with two outputs) and in theory anything else (like
clobbers) out.  Leaving the clobbers out completely is definitely not
desirable.

IMHO we should never CSE together asm with different clobbers, GCC
intentionally does not try to think what exactly the asm pattern does,
it is a black box, and if the programmer decides to use one set of clobbers
in one case and a different one in another case, he might have a reason for
that.

Jakub


Re: Restricting arguments to intrinsic functions

2014-10-23 Thread Jakub Jelinek
On Thu, Oct 23, 2014 at 11:06:24AM -0700, Andrew Pinski wrote:
> On Thu, Oct 23, 2014 at 11:00 AM, Andrew Pinski  wrote:
> > On Thu, Oct 23, 2014 at 10:52 AM, Charles Baylis
> >  wrote:
> >> Hi
> >>
> >> ( tl;dr: How do I handle intrinsic or builtin functions where there
> >> are restrictions on the arguments which can't be represented in a C
> >> function prototype? Do other ports have this problem, how do they
> >> solve it? Language extension for C++98 to provide static_assert?)
> >
> > The attribute __artificial__ .
> 
> Long example:
> extern __inline void __attribute__((__gnu_inline__, __always_inline__,
> __artificial__))
> _mm_stream_sd (double * __P, __m128d __Y)
> {
>   __builtin_ia32_movntsd (__P, (__v2df) __Y);
> }
> 
> Don't use static inline either because it is not valid thing to do
> from a template in C++98.

And the argument checking (compile time constant, what range etc.) can be
done either when expanding the builtin, or when folding it (there are
target hooks for both generic and gimple foldings of builtin), you can
report it in either of those (and fold to constant or similar if there are
errors).

Jakub


GCC 4.9.2 Status Report (2014-10-23)

2014-10-23 Thread Jakub Jelinek
Status
======

GCC 4.9.2-rc1 has been created and announced, the branch is now frozen
for blocking regressions and documentation fixes only and all changes
to the branch require a RM approval.
If all goes well, 4.9.2 will be released in a week.

Quality Data
============

Priority          #   Change from Last Report
--------        ---   -----------------------
P1                0        0
P2               82   -    6
P3               40   +   19
--------        ---   -----------------------
Total           122   +   13


Previous Report
===============

https://gcc.gnu.org/ml/gcc/2014-07/msg00163.html

The next report will be sent by me again, announcing the
4.9.2 release or another release candidate if needed.


GCC 4.9.2 Release Candidate available from gcc.gnu.org

2014-10-23 Thread Jakub Jelinek
GCC 4.9.2 Release Candidate available from gcc.gnu.org

The first release candidate for GCC 4.9.2 is available from

 ftp://gcc.gnu.org/pub/gcc/snapshots/4.9.2-RC-20141023

and shortly its mirrors.  It has been generated from SVN revision 216570.

I have so far bootstrapped and tested the release candidate on
x86_64-linux and i686-linux.  Please test it and report any issues to
bugzilla.

If all goes well, I'd like to release 4.9.2 on Thursday, 30th.


Re: common subexpression elimination no longer working for asm()?

2014-10-23 Thread Jakub Jelinek
On Wed, Oct 22, 2014 at 04:28:52PM +0100, Jan Beulich wrote:
> I noticed the issue with 4.9.1 (in that x86 Linux'es
> this_cpu_read_stable() no longer does what the comment preceding
> its definition promises), and the example below demonstrates this in
> a simplified (but contrived) way. I just now verified that trunk has
> the same issue; 4.8.3 still folds redundant ones as expected. Is this
> known, or possibly even intended (in which case I'd be curious as to
> what the reasons are, and how the functionality Linux wants can be
> gained back)?

This changed because of my http://gcc.gnu.org/PR60663 fix.
In your testcase the inline asm doesn't have more than one output
(which IMNSHO is very much desirable not to CSE), and doesn't have explicit
clobbers either, but happens to have implicit clobbers (fprs and cc),
so CSE still could generate invalid code out of that without the fix
(if it decided to materialize the inline asm somewhere, instead of reusing
existing inline asm).
So, if we e.g. weakened the PR60663 fix so that it only bails out
if the inline asm contains more than one output, we'd need to fix up CSE so
that it analyzes all the clobbers and doesn't consider asms equivalent
just based on the ASM_OPERANDS (they need to have the same clobbers too),
and so that it doesn't try to materialize an asm without a preexisting insn
if it has any clobbers.

Feel free to file a PR about this.

Jakub


Re: [PATCH, fixincludes]: Add pthread.h to glibc_c99_inline_4 fix

2014-10-21 Thread Jakub Jelinek
On Tue, Oct 21, 2014 at 11:30:49AM +0200, Uros Bizjak wrote:
> At the end of the day, adding pthread.h to glibc_c99_inline_4 fix
> fixes the bootstrap. The fix applies __attribute__((__gnu_inline__))
> to the declaration:
> 
> extern __inline __attribute__ ((__gnu_inline__)) void
> __pthread_cleanup_routine (struct __pthread_cleanup_frame *__frame)
> 
> 2014-10-21  Uros Bizjak  
> 
> * inclhack.def (glibc_c99_inline_4): Add pthread.h to files.
> * fixincl.x: Regenerate.
> 
> Bootstrapped and regression tested on CentOS 5.11 x86_64-linux-gnu {,-m32}.
> 
> OK for mainline?

Ok, thanks.

> --- inclhack.def  (revision 216501)
> +++ inclhack.def  (working copy)
> @@ -1687,7 +1687,8 @@
>   */
>  fix = {
>  hackname  = glibc_c99_inline_4;
> -files = sys/sysmacros.h, '*/sys/sysmacros.h', wchar.h, '*/wchar.h';
> +files = sys/sysmacros.h, '*/sys/sysmacros.h', wchar.h, '*/wchar.h',
> +pthread.h, '*/pthread.h';
>  bypass= "__extern_inline|__gnu_inline__";
>  select= "(^| )extern __inline";
>  c_fix = format;


Jakub


Re: Recent bootstrap failure on CentOS 5.11, /usr/bin/ld: Dwarf Error: found dwarf version '4' ...

2014-10-16 Thread Jakub Jelinek
On Thu, Oct 16, 2014 at 01:57:51PM +0200, Uros Bizjak wrote:
> On Thu, Oct 16, 2014 at 11:25 AM, Uros Bizjak  wrote:
> 
> > Recent change caused bootstrap failure on CentOS 5.11:
> >
> > /usr/bin/ld: Dwarf Error: found dwarf version '4', this reader only
> > handles version 2 information.
> > unwind-dw2-fde-dip_s.o: In function `__pthread_cleanup_routine':
> > unwind-dw2-fde-dip.c:(.text+0x1590): multiple definition of
> > `__pthread_cleanup_routine'
> > /usr/bin/ld: Dwarf Error: found dwarf version '4', this reader only
> > handles version 2 information.
> > unwind-dw2_s.o:unwind-dw2.c:(.text+0x270): first defined here
> > /usr/bin/ld: Dwarf Error: found dwarf version '4', this reader only
> > handles version 2 information.
> > unwind-sjlj_s.o: In function `__pthread_cleanup_routine':
> > unwind-sjlj.c:(.text+0x0): multiple definition of 
> > `__pthread_cleanup_routine'
> > unwind-dw2_s.o:unwind-dw2.c:(.text+0x270): first defined here
> > /usr/bin/ld: Dwarf Error: found dwarf version '4', this reader only
> > handles version 2 information.
> > emutls_s.o: In function `__pthread_cleanup_routine':
> > emutls.c:(.text+0x170): multiple definition of `__pthread_cleanup_routine'
> > unwind-dw2_s.o:unwind-dw2.c:(.text+0x270): first defined here
> > collect2: error: ld returned 1 exit status
> > gmake[5]: *** [libgcc_s.so] Error 1
> >
> > $ ld --version
> > GNU ld version 2.17.50.0.6-26.el5 20061020
> 
> It looks like a switch-to-c11 fallout. Older glibc versions have
> issues with c99 (and c11) conformance [1].
> 
> Changing "extern __inline void __pthread_cleanup_routine (...)" in
> system /usr/include/pthread.h to
> 
> #if __STDC_VERSION__ < 199901L
> extern
> #endif
> __inline__ void __pthread_cleanup_routine (...)
> 
> fixes this issue and allows bootstrap to proceed.
> 
> However, fixincludes is not yet built in stage1 bootstrap. Is there a
> way to fix this issue without changing system headers?
> 
> [1] https://gcc.gnu.org/ml/gcc-patches/2006-11/msg01030.html

Yeah, old glibcs are totally incompatible with -fno-gnu89-inline.
Not sure if it is easily fixincludable, if yes, then -fgnu89-inline should
be used for code like libgcc which is built with the newly built compiler
before it is fixincluded.
Or we need -fgnu89-inline by default for old glibcs (that is pretty
much what we do e.g. in Developer Toolset for RHEL5).

Jakub
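
A minimal sketch of what marking such a header function with `__gnu_inline__` buys: the `extern __inline` definition keeps its GNU89 meaning (a body used for inlining only, with no out-of-line copy emitted in each object file) even when the compiler defaults to C99/C11 inline semantics. The names `cleanup_cost` and `total_cleanup_cost` are hypothetical, and `__always_inline__` is an assumption added only so the sketch links without optimization; the glibc declaration quoted in the thread does not carry it.

```c
/* Header-style definition.  With __gnu_inline__, "extern inline" means
   "use this body for inlining, never emit an external definition", so
   including it in many translation units cannot produce the
   "multiple definition" link errors shown above. */
extern __inline __attribute__ ((__gnu_inline__, __always_inline__))
int cleanup_cost(int frames)
{
    return 2 * frames;
}

/* An ordinary caller: the call is expanded inline, so no symbol named
   cleanup_cost has to exist at link time. */
int total_cleanup_cost(int frames)
{
    return cleanup_cost(frames) + 1;
}
```

Without the attribute, the same `extern __inline` definition compiled with -std=gnu99 or -std=gnu11 provides an external definition in every translation unit that includes it, which matches the libgcc_s.so failure reported above.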


Re: Towards GNU11

2014-10-15 Thread Jakub Jelinek
On Wed, Oct 15, 2014 at 11:25:26PM +0200, Marek Polacek wrote:
> On Wed, Oct 15, 2014 at 11:05:45PM +0200, Jakub Jelinek wrote:
> > On Wed, Oct 15, 2014 at 09:28:09PM +0200, Uros Bizjak wrote:
> > > i686-linux-gnu testsuite trivially regressed [1]:
> 
> Thanks for the log Uros.
>  
> > I have half of that already in patch form, will test and send either later
> > tonight or tomorrow.
> 
> Please don't force yourself into doing that, it's more up to me to fix
> my fallout ;).  Feel free to send me the partial patch and I will finish
> it tomorrow morning (reproducing is easy with --target_board=unix/-m32).
> In any case, thanks.

If you want to finish it up, here it is (I think I have covered
all the GNU11 related new FAILs, but haven't tested it).

Where missing prototypes or return types looked like an omission, I've tried
to tweak it, in various tests -std=gnu89 looked like better option.
In several tests __builtin_ia32_crc32di was used, which is fine for 64-bit
tests where it is a builtin, but not in 32-bit tests where it is not.
Some of them looked like they test or just expect it to be parsed as
implicitly prototyped function, so I've used there -std=gnu89 too.
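
A short sketch of the kind of tweak described above, assuming a test that calls `memcpy` and `memcmp` without including any header. With -std=gnu11 as the default, GCC diagnoses implicit function declarations and implicit `int`, which the testsuite reports as excess errors; spelling out the prototypes and the return type avoids that without -std=gnu89. The function `buffers_match` is hypothetical.

```c
/* Explicit prototypes, written with GCC's predefined __SIZE_TYPE__ so
   that no header needs to be included, as in the patch below. */
extern void *memcpy (void *, const void *, __SIZE_TYPE__);
extern int memcmp (const void *, const void *, __SIZE_TYPE__);

/* Explicit return type: "implicit int" is not part of C99 and later. */
int buffers_match (void)
{
    static const char init[4] = { 3, 5, 7, 9 };
    char copy[4];

    memcpy (copy, init, sizeof copy);
    return memcmp (copy, init, sizeof copy) == 0;
}
```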

--- gcc/testsuite/gcc.dg/builtin-apply4.c.jj   2014-10-10 10:20:08.0 +0200
+++ gcc/testsuite/gcc.dg/builtin-apply4.c   2014-10-15 19:33:23.140595264 +0200
@@ -1,6 +1,6 @@
 /* PR tree-optimization/20076 */
 /* { dg-options "-O2 -Wmissing-noreturn -fgnu89-inline" } */
-/* { dg-options "-O2 -mno-mmx" { target { { i?86-*-* x86_64-*-* } && ia32 } } } */
+/* { dg-additional-options "-mno-mmx" { target { { i?86-*-* x86_64-*-* } && ia32 } } } */
 /* { dg-do run } */
 
 extern void abort (void);
--- gcc/testsuite/gcc.dg/sync-2.c.jj   2014-09-25 15:02:31.0 +0200
+++ gcc/testsuite/gcc.dg/sync-2.c   2014-10-15 19:38:42.361989299 +0200
@@ -11,6 +11,7 @@
 
 extern void abort (void);
 extern void *memcpy (void *, const void *, __SIZE_TYPE__);
+extern int memcmp (const void *, const void *, __SIZE_TYPE__);
 
 static char AI[18];
 static char init_qi[18] = { 3,5,7,9,0,0,0 ,0  ,-1,0,0,-1,0,0  ,-1,0,0,-1 };
--- gcc/testsuite/gcc.dg/pr32176.c.jj   2014-09-25 15:02:30.0 +0200
+++ gcc/testsuite/gcc.dg/pr32176.c  2014-10-15 19:37:48.439990744 +0200
@@ -2,7 +2,9 @@
 
 /* { dg-do compile } */
 /* { dg-options "-O2 -fprefetch-loop-arrays -w" } */
-/* { dg-options "-O2 -fprefetch-loop-arrays -march=i686 -msse" { target { { i?86-*-* x86_64-*-* } && ia32 } } } */
+/* { dg-additional-options "-march=i686 -msse" { target { { i?86-*-* x86_64-*-* } && ia32 } } } */
+
+extern void _gfortran_abort ();
 
 void foo (void)
 {
--- gcc/testsuite/gcc.dg/sync-3.c.jj   2014-09-25 15:02:24.0 +0200
+++ gcc/testsuite/gcc.dg/sync-3.c   2014-10-15 19:38:57.266714706 +0200
@@ -8,6 +8,7 @@
 
 extern void abort (void);
 extern void *memcpy (void *, const void *, __SIZE_TYPE__);
+extern int memcmp (const void *, const void *, __SIZE_TYPE__);
 
 static char AI[18] __attribute__((__aligned__ (4)));
 static char init_qi[18] = { 3,5,7,9,0,0,0 ,0  ,-1,0,0,-1,0,0  ,-1,0,0,-1 };
--- gcc/testsuite/gcc.dg/ia64-sync-1.c.jj   2014-09-25 15:02:25.0 +0200
+++ gcc/testsuite/gcc.dg/ia64-sync-1.c  2014-10-15 19:34:53.619177553 +0200
@@ -13,6 +13,7 @@ __extension__ typedef __SIZE_TYPE__ size
 
 extern void abort (void);
 extern void *memcpy (void *, const void *, size_t);
+extern int memcmp (const void *, const void *, size_t);
 
 static int AI[12];
 static int init_noret_si[12] = { 0, 0, 0, 1, 0, 0, 0 , 0  , -1, 0, 0, -1 };
--- gcc/testsuite/gcc.dg/ia64-sync-2.c.jj   2014-09-25 15:02:27.0 +0200
+++ gcc/testsuite/gcc.dg/ia64-sync-2.c  2014-10-15 19:35:28.854628785 +0200
@@ -13,6 +13,7 @@ __extension__ typedef __SIZE_TYPE__ size
 
 extern void abort (void);
 extern void *memcpy (void *, const void *, size_t);
+extern int memcmp (const void *, const void *, size_t);
 
 static int AI[18];
 static int init_si[18] = { 0,0,0,1,0,0, 0,0  ,-1,0,0,-1,0,0  ,-1,0,0,-1 };
--- gcc/testsuite/gcc.dg/ia64-sync-3.c.jj   2014-09-25 15:02:31.0 +0200
+++ gcc/testsuite/gcc.dg/ia64-sync-3.c  2014-10-15 19:35:41.882351595 +0200
@@ -10,6 +10,7 @@ __extension__ typedef __SIZE_TYPE__ size
 
 extern void abort (void);
 extern void *memcpy (void *, const void *, size_t);
+extern int memcmp (const void *, const void *, size_t);
 
 static int AI[4];
 static int init_si[4] = { -30,-30,-50,-50 };
--- gcc/testsuite/gcc.dg/20020122-2.c.jj   2014-09-25 15:02:30.0 +0200
+++ gcc/testsuite/gcc.dg/20020122-2.c   2014-10-15 19:32:21.202599849 +0200
@@ -3,9 +3,10 @@
   
 /* { dg-do compile } */
 /* { dg-options "-O2 -fprefetch-loop-arrays -w" } */
-/* { dg-options "-O2 -fprefetch-loop-arrays -march=athlon" { target { { 
i?86-*-* x86_64-*-* } &am

Re: Towards GNU11

2014-10-15 Thread Jakub Jelinek
On Wed, Oct 15, 2014 at 09:28:09PM +0200, Uros Bizjak wrote:
> Hello!
> 
> >> The consensus seems to be to go forward with this change.  I will
> >> commit the patch in 24 hours unless I hear objections.
> >
> > I made the change.  Please report any fallout to me.
> 
> i686-linux-gnu testsuite trivially regressed [1]:

I have half of that already in patch form, will test and send either later
tonight or tomorrow.

> FAIL: gcc.dg/20020122-2.c (test for excess errors)
> FAIL: gcc.dg/builtin-apply4.c (test for excess errors)
> FAIL: gcc.dg/ia64-sync-1.c (test for excess errors)
> FAIL: gcc.dg/ia64-sync-2.c (test for excess errors)
> FAIL: gcc.dg/ia64-sync-3.c (test for excess errors)
> FAIL: gcc.dg/pr32176.c (test for excess errors)
> FAIL: gcc.dg/sync-2.c (test for excess errors)
> FAIL: gcc.dg/sync-3.c (test for excess errors)
> FAIL: gcc.target/i386/20060125-1.c (test for excess errors)
> FAIL: gcc.target/i386/20060125-2.c (test for excess errors)
> FAIL: gcc.target/i386/980312-1.c (test for excess errors)
> FAIL: gcc.target/i386/980313-1.c (test for excess errors)
> FAIL: gcc.target/i386/990524-1.c (test for excess errors)
> FAIL: gcc.target/i386/avx512f-pr57233.c (test for excess errors)
> FAIL: gcc.target/i386/avx512f-typecast-1.c (test for excess errors)
> FAIL: gcc.target/i386/builtin-apply-mmx.c (test for excess errors)
> FAIL: gcc.target/i386/crc32-2.c (test for excess errors)
> FAIL: gcc.target/i386/crc32-3.c (test for excess errors)
> FAIL: gcc.target/i386/intrinsics_3.c (test for excess errors)
> FAIL: gcc.target/i386/loop-1.c (test for excess errors)
> FAIL: gcc.target/i386/memcpy-1.c (test for excess errors)
> FAIL: gcc.target/i386/pr26826.c (test for excess errors)
> FAIL: gcc.target/i386/pr37184.c (test for excess errors)
> FAIL: gcc.target/i386/pr40934.c (test for excess errors)
> FAIL: gcc.target/i386/pr44948-2a.c (test for excess errors)
> FAIL: gcc.target/i386/pr47564.c (test for excess errors)
> FAIL: gcc.target/i386/pr50712.c (test for excess errors)
> FAIL: gcc.target/i386/sse-5.c (test for excess errors)
> FAIL: gcc.target/i386/stackalign/asm-1.c -mno-stackrealign (test for
> excess errors)
> FAIL: gcc.target/i386/stackalign/asm-1.c -mstackrealign (test for excess 
> errors)
> FAIL: gcc.target/i386/stackalign/return-2.c -mno-stackrealign (test
> for excess errors)
> FAIL: gcc.target/i386/stackalign/return-2.c -mstackrealign (test for
> excess errors)
> FAIL: gcc.target/i386/vectorize4.c (test for excess errors)

Jakub


Re: Backporting KAsan patches to 4.9 branch

2014-10-14 Thread Jakub Jelinek
On Tue, Oct 14, 2014 at 03:19:10PM +0400, Dmitry Vyukov wrote:
> > One problem is that for BUILT_IN_ASAN_REPORT_{LOAD,STORE}_N patch I need
> > libsanitizer APIs (__asan_loadN, __asan_storeN) which were introduced in a
> > giant libsanitizer merge in 5.0. In current patchset I backport the whole
> > merge patch (and a bunch of cherry-picks which followed it) but it changes
> > libsanitizer ABI (new version of __asan_init_vXXX, etc.) which is probably
> > undesirable. Another option would be to backport just the necessary minimum
> > (__asan_loadN, __asan_storeN). How should I proceed?
> 
> Backporting only __asan_loadN/__asan_storeN looks like the safest option to 
> me.

That's still an ABI change, libasan is not symbol versioned (perhaps we
should change that).

Jakub


Re: Backporting KAsan patches to 4.9 branch

2014-10-14 Thread Jakub Jelinek
On Tue, Oct 14, 2014 at 03:07:42PM +0400, Yury Gribov wrote:
> Finally got time to look into this. I've successfully backported 22 patches
> to 4.9:
> * bugfixes (12 patches)
> * install Asan headers (1 patch)
> * libsanitizer merge (1 patch) - this is questionable, see below for
> discussion
> * BUILT_IN_ASAN_REPORT_{LOAD,STORE}_N (2 patches)
> * instrumentation with calls (1 patch)
> * optimize strlen instrumentation (1 patch)
> * move inlining to sanopt pass (2 patches)
> * Kasan (2 patches)
> 
> One problem is that for BUILT_IN_ASAN_REPORT_{LOAD,STORE}_N patch I need
> libsanitizer APIs (__asan_loadN, __asan_storeN) which were introduced in a
> giant libsanitizer merge in 5.0. In current patchset I backport the whole
> merge patch (and a bunch of cherry-picks which followed it) but it changes
> libsanitizer ABI (new version of __asan_init_vXXX, etc.) which is probably
> undesirable. Another option would be to backport just the necessary minimum
> (__asan_loadN, __asan_storeN). How should I proceed?

Yeah, such library changes are definitely not acceptable for 4.9 branch.
So, can you throw away all the patches which depend on the new libsanitizer
entry points?  If you need those for KAsan which doesn't need the library
(right?), then tweak the patches so that the new functions are used in
-fsanitize=kernel-address mode only, and never in 4.9 for
-fsanitize=address?

> Another question: Should I update patch CL dates for backported patches? If
> not - should I insert them to CLs in chronological order or just stack on
> top of previous contents?

Usually, if you backport patches (not the same committer or committed
on later date to branch from the commit date to trunk), one writes
2014-xx-yy  You  

Backported from mainline
2014-zz-nn  Somebody  
...
See branch ChangeLog for examples.

Jakub


Re: [Consult] g++: About "-Wunused-variable" for constant variable initialized by function

2014-10-13 Thread Jakub Jelinek
On Mon, Oct 13, 2014 at 10:01:31PM +0800, Chen Gang wrote:
> > Is it correct?

This mailing list is for development of GCC, not the right place to learn
C++.  Please ask either on gcc-help mailing list, or on some C++ user
forums.

Jakub


Re: [Consult] g++: About "-Wunused-variable" for constant variable initialized by function

2014-10-13 Thread Jakub Jelinek
On Mon, Oct 13, 2014 at 09:10:31PM +0800, Chen Gang wrote:
> Oh, yes. Originally I got this warning when compiling Qemu. And sorry, my
> sample (test.cc) may not be quite precise.
> 
> For me, I guess:
> 
>  - If the constant is defined in a header file and is never used, g++ need
>    not report a [-Wunused-variable] warning.

That is nonsense, even if you define such a "constant" in a header file,
it still means runtime overhead (the variable needs to be constructed at
runtime, const is not the same thing as constexpr).
So, IMHO the warning is desirable even if it is in headers, it is something
you should reconsider.  Making the function constexpr makes the warning of
course go away, then there is no runtime overhead associated with it; but
you'll need C++11 for that.

Jakub


Re: [PATCH] gcc parallel make check

2014-10-10 Thread Jakub Jelinek
On Fri, Oct 10, 2014 at 04:50:47PM +0200, Christophe Lyon wrote:
> On 10 October 2014 16:19, Jakub Jelinek  wrote:
> > On Fri, Oct 10, 2014 at 04:09:39PM +0200, Christophe Lyon wrote:
> >> my.exp contains the following construct which is often used in the 
> >> testsuite:
> >> ==
> >> foreach src [lsort [glob -nocomplain $srcdir/$subdir/*.c]] {
> >> # If we're only testing specific files and this isn't one of them,
> >> skip it.
> >> if ![runtest_file_p $runtests $src] then {
> >> continue
> >> }
> >> c-torture-execute $src $additional_flags
> >> gcc-dg-runtest $src "" $additional_flags
> >> }
> >> ==
> >> Note that gcc-dg-runtest calls runtest_file_p too.
> >
> > Such my.exp is invalid, you need to guarantee gcc_parallel_test_run_p
> > is run the same number of times in all instances unless
> > gcc_parallel_test_enable has been disabled.
> 
> Thanks for your prompt answer.
> 
> Is this documented somewhere, so that such cases do not happen in the future?

Feel free to submit a documentation patch.

> It's in a patch which has been under review for quite some time
> (started before your change), that's why you missed it.

Ah, ok.

> What about my remark about:
> >  # For parallelized check-% targets, this decides whether parallelization
> >  # is desirable (if -jN is used and RUNTESTFLAGS doesn't contain anything
> >  # but optional --target_board or --extra_opts arguments).  If desirable,
> I think it should be removed from gcc/Makefile.in

Only the " and RUNTESTFLAGS ... arguments" part of that.  Patch preapproved.

Jakub


Re: [PATCH] gcc parallel make check

2014-10-10 Thread Jakub Jelinek
On Fri, Oct 10, 2014 at 04:09:39PM +0200, Christophe Lyon wrote:
> my.exp contains the following construct which is often used in the testsuite:
> ==
> foreach src [lsort [glob -nocomplain $srcdir/$subdir/*.c]] {
> # If we're only testing specific files and this isn't one of them,
> skip it.
> if ![runtest_file_p $runtests $src] then {
> continue
> }
> c-torture-execute $src $additional_flags
> gcc-dg-runtest $src "" $additional_flags
> }
> ==
> Note that gcc-dg-runtest calls runtest_file_p too.

Such my.exp is invalid, you need to guarantee gcc_parallel_test_run_p
is run the same number of times in all instances unless
gcc_parallel_test_enable has been disabled.

See the patches I've posted when adding the fine-grained parallelization,
e.g. go testsuite has been fixed that way, etc.
So, in your above example, you'd need:
gcc_parallel_test_enable 0
line before c-torture-execute and
gcc_parallel_test_enable 1
line after gcc-dg-runtest.  That way, if runtest_file_p says the test should
be scheduled by current instance, all the subtests will be run there.

If my.exp is part of gcc/testsuite, I'm sorry for missing it, if it is
elsewhere, just fix it up.

Note, there are #verbose lines in gcc_parallel_test_run_p, you can uncomment
them and through sed on the log files verify that each instance performs the
same parallelization checks (same strings).

Jakub


Re: fast-math optimization question

2014-10-10 Thread Jakub Jelinek
On Thu, Oct 09, 2014 at 03:55:34PM -0700, Steve Ellcey wrote:
> On Thu, 2014-10-09 at 19:50 +, Joseph S. Myers wrote:
> > On Thu, 9 Oct 2014, Steve Ellcey wrote:
> > 
> > > Do you know which pass does the simple
> > > '(float)function((double)float_val)' demotion?  Maybe that would be a
> > > good place to extend things.
> > 
> > convert.c does such transformations.  Maybe the transformations in there 
> > could move to the match-and-simplify infrastructure - convert.c is not a 
> > particularly good place for optimization, and having similar 
> > transformations scattered around (fold-const, convert.c, front ends, SSA 
> > optimizers) isn't helpful; hopefully match-and-simplify will allow some 
> > unification of this sort of optimization.
> 
> I did a quick and dirty experiment with the match-and-simplify branch
> just to get an idea of what it might be like.  The branch built for MIPS
> right out of the box so that was great and I added a couple of rules
> (see below) just to see if it would trigger the optimization I wanted
> and it did.  I was impressed with the match-and-simplify infrastructure,
> it seemed to work quite well.  Will this branch be included in GCC 5.0?

Though, is such optimization desirable even for fast-math?
I mean, in the normal demotion, all that changes compared to the original
source is the possibility of double rounding, or a precision difference if,
as right now in glibc, the *f suffixed variants aren't 0.5ulp precise while
the double ones are.

If you want to demote a chain of calls, you add roundings in the middle too,
and depending on which function it is and which exact argument, I'd worry
the maximum error would already be not just slightly higher, but significantly
worse.  Even for -ffast-math we want only slightly worse precision, not
significantly worse one.
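
To make the extra roundings in a demoted chain concrete, here is a minimal
sketch.  third and thirdf are hypothetical stand-ins for a libm function and
its *f variant (they are not from the thread), chosen so that the only error
source is the rounding itself; real *f functions add their own error on top.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical stand-ins for a double libm function and its float variant.  */
static double third (double x) { return x / 3.0; }
static float thirdf (float x) { return x / 3.0f; }

/* Chain evaluated in double: a single rounding to float at the very end.  */
float
chain_in_double (float x)
{
  return (float) third (third ((double) x));
}

/* Demoted chain: every call rounds its result to float.  */
float
chain_demoted (float x)
{
  return thirdf (thirdf (x));
}

/* Relative difference between the two chains, measured in double.  */
double
relative_gap (float x)
{
  double a = chain_in_double (x);
  double b = chain_demoted (x);
  double d = a > b ? a - b : b - a;
  double m = a < 0 ? -a : a;
  return m == 0 ? 0 : d / m;
}

void
demotion_examples (void)
{
  /* Two calls stay within a few float epsilons of each other here; each
     further demoted call in the chain can add another rounding.  */
  assert (relative_gap (1.0f) <= 4 * FLT_EPSILON);
}
```

With only two cheap operations the gap stays tiny; the worry above is about
longer chains and functions that amplify the intermediate error.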

Jakub


Re: msan and gcc ?

2014-10-02 Thread Jakub Jelinek
On Thu, Oct 02, 2014 at 11:30:50AM +0400, Yury Gribov wrote:
> On 10/01/2014 10:39 PM, Kostya Serebryany wrote:
> >On Wed, Oct 1, 2014 at 11:38 AM, Toon Moene  wrote:
> >>On 10/01/2014 08:00 PM, Kostya Serebryany wrote:
> >>>
> >>>-gcc folks.
> >>>
> >>>Why not use clang then?
> >>>It offers many more nice features.
> >>
> >>
> >>What's the Fortran front-end called for clang (or do you really think we are
> >>going to write Weather Forecasting codes in C :-) )
> >
> >Oh, crap. :)
> 
> Well, there's always f2c ;)

You mean for performance critical code?  Fortran has different aliasing
rules than C, so it is hard to express those in C...

Jakub


Re: Enable EBX for x86 in 32bits PIC code

2014-09-29 Thread Jakub Jelinek
On Wed, Sep 24, 2014 at 03:20:44PM -0600, Jeff Law wrote:
> On 09/24/14 14:32, Ilya Enkovich wrote:
> >2014-09-24 19:27 GMT+04:00 Jeff Law :
> >>On 09/24/14 00:56, Ilya Enkovich wrote:
> 
> >>>
> >>>After register allocation we have no idea where GOT address is and
> >>>therefore delegitimize_address target hook becomes less efficient and
> >>>cannot remove UNSPECs. That's what I see now when build GCC with patch
> >>>applied:
> >>
> >>In theory this shouldn't be too hard to fix.
> >>
> >>I haven't looked at the code, but it might be something looking explicitly
> >>for ebx by register #, or something similar.  Which case within
> >>delegitimize_address isn't firing as it should after your changes?
> >
> >It is the case I had to fix:
> >
> >@@ -14415,7 +14433,8 @@ ix86_delegitimize_address (rtx x)
> >  ...
> >  movl foo@GOTOFF(%ecx), %edx
> >  in which case we return (%ecx - %ebx) + foo.  */
> >-  if (pic_offset_table_rtx)
> >+  if (pic_offset_table_rtx
> >+ && (!reload_completed || !ix86_use_pseudo_pic_reg ()))
> >  result = gen_rtx_PLUS (Pmode, gen_rtx_MINUS (Pmode, copy_rtx 
> > (addend),
> >  pic_offset_table_rtx),
> >result);
> >
> >Originally if there is a UNSPEC_GOTOFFSET but no EBX usage then we
> >just remove this UNSPEC and subtract EBX value.  With pseudo PIC reg
> >we should use PIC register instead of EBX but it is unclear what to
> >use after register allocation.
> What's the RTL before & after allocation?  Feel free to just pass along the
> dump files for sum_r4 that you referenced in a prior message.

I wonder if during/after reload we just couldn't look at
ORIGINAL_REGNO of hard regs if ix86_use_pseudo_pic_reg.  Or is that
the other case, where you don't have any PIC register replacement around,
and want to subtract something?  Perhaps in that case we could just
subtract the value of _GLOBAL_OFFSET_TABLE_ symbol if we have nothing better
around.

Jakub


Re: Listing a maintainer for libcilkrts, and GCC's Cilk Plus implementation generally?

2014-09-29 Thread Jakub Jelinek
On Mon, Sep 29, 2014 at 12:56:06PM +0200, Thomas Schwinge wrote:
> Hi!
> 
> On Tue, 23 Sep 2014 11:02:30 +, "Zamyatin, Igor" 
>  wrote:
> > Jeff Law wrote:
> > > The original plan was for Balaji to take on this role; however, his 
> > > assignment
> > > within Intel has changed and thus he's not going to have time to work on
> > > Cilk+ anymore.
> > > 
> > > Igor Zamyatin has been doing a fair amount of Cilk+ maintenance/bugfixing
> > > and it might make sense for him to own it in the long term if he's 
> > > interested.
> > 
> > That's right. 
> 
> Thanks!
> 
> > Can I add 2 records (cilk plus and libcilkrts) to Various Maintainers 
> > section?
> 
> I understand Jeff's email as a pre-approval of such a patch.

I think only SC can appoint maintainers, and while Jeff is in the SC,
my reading of that mail wasn't that it was the SC that has acked that, but
rather a question if Igor is willing to take that role, which then would
need to be acked by SC.

Jakub


Re: Problem with accessing built-ins from Fortran front-end

2014-09-26 Thread Jakub Jelinek
On Fri, Sep 26, 2014 at 11:07:28AM +0200, FX wrote:
> > Thus, the middle-end assumes that if you have __builtin_{isfinite,isnormal},
> > you also have __builtin_is{less,greater}equal builtins too.
> 
> Many thanks to both of you! I wasn’t looking into the right place at all. I 
> defined them, and now it works.
> 
> One related question: the __builtin_signbit is not type generic, is there
> any reason for it not to be?  I ask

I guess history, builtins that were added earlier when we didn't have any
typegeneric builtins were all *{f,,l} etc. suffixed.
As they are used quite heavily in user code, it is too late to change that.
Just look at what glibc  does for years:
/* Return nonzero value if sign of X is negative.  */
# ifdef __NO_LONG_DOUBLE_MATH
#  define signbit(x) \
 (sizeof (x) == sizeof (float) ? __signbitf (x) : __signbit (x))
# else
#  define signbit(x) \
 (sizeof (x) == sizeof (float) \
  ? __signbitf (x) \
  : sizeof (x) == sizeof (double) \
  ? __signbit (x) : __signbitl (x))
# endif
We can't stop supporting that.
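
As a rough illustration of that sizeof-based dispatch (a sketch only:
MY_SIGNBIT and the is_negative_* helpers are made-up names, while
__builtin_signbitf, __builtin_signbit and __builtin_signbitl are the existing
suffixed GCC builtins):

```c
#include <assert.h>

/* Hypothetical macro; the dispatch mirrors the glibc signbit macro quoted
   above, but calls the suffixed GCC builtins directly.  */
#define MY_SIGNBIT(x) \
  (sizeof (x) == sizeof (float) \
   ? __builtin_signbitf (x) \
   : sizeof (x) == sizeof (double) \
   ? __builtin_signbit (x) : __builtin_signbitl (x))

/* Each helper returns 1 when the sign bit is set, 0 otherwise.  */
int
is_negative_float (float f)
{
  return MY_SIGNBIT (f) != 0;
}

int
is_negative_double (double d)
{
  return MY_SIGNBIT (d) != 0;
}

int
is_negative_long_double (long double ld)
{
  return MY_SIGNBIT (ld) != 0;
}

void
signbit_examples (void)
{
  assert (is_negative_float (-1.0f));
  assert (!is_negative_double (2.5));
  /* Negative zero has its sign bit set too.  */
  assert (is_negative_long_double (-0.0L));
}
```

Only the selected branch of the conditional is evaluated, so the argument is
never actually passed through a narrower type.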

Jakub


Re: Problem with accessing built-ins from Fortran front-end

2014-09-26 Thread Jakub Jelinek
On Fri, Sep 26, 2014 at 10:29:57AM +0200, FX wrote:
> Hi,
> 
> I’m trying to make the Fortran front-end emit calls to some builtins we don’t 
> currently use (isfinite, isnormal). However, trying to use the same code as 
> isnan doesn’t work at all. Our gfc_define_builtin does three things:
> 
>   decl = add_builtin_function (name, type, code, BUILT_IN_NORMAL, 
> library_name, NULL_TREE);
>   set_call_expr_flags (decl, attr);
>   set_builtin_decl (code, decl, true);
> 
> While doing so works with isnan, it fails with isfinite or isnormal. When I 
> try to, I get an ICE in tree checking:
> 
>   * frame #0: 0x000100c02338 f951`build_call_expr_loc_array(loc=0, 
> fndecl=0x, n=2, argarray=0x7fff5fbfeb20) + 24 at 
> tree.h:2846
> frame #1: 0x000100c0259a f951`build_call_expr(fndecl=, 
> n=) + 186 at tree.c:10550
> frame #2: 0x00010046f671 
> f951`fold_builtin_interclass_mathfn(loc=, fndecl=, 
> arg=0x00014340c990) + 337 at builtins.c:9393
> frame #3: 0x000100485e34 f951`fold_builtin_1(loc=260, 
> fndecl=0x000143449360, arg0=0x00014340c990, ignore=) + 
> 2964 at builtins.c:10050
> frame #4: 0x00010049030c f951`fold_builtin_n(loc=, 
> fndecl=, args=, nargs=, 
> ignore=) + 1116 at builtins.c:10409
> frame #5: 0x000100491c33 f951`fold_builtin_call_array(loc=260, 
> type=0x000143405690, fn=0x00014351fc40, n=1, 
> argarray=0x7fff5fbfeeb0) + 355 at builtins.c:10575
> frame #6: 0x000100c024c5 f951`build_call_expr_loc(loc=, 
> fndecl=, n=) + 181 at tree.c:10533
> 
> 
> I don’t understand how the middle-end builtins decl should work, would 
> someone have a hint of how to fix this?

Just look at what fold_builtin_interclass_mathfn does:
isfinite(x) -> islessequal(fabs(x),DBL_MAX)
isnormal(x) -> isgreaterequal(fabs(x),DBL_MIN) & islessequal(fabs(x),DBL_MAX)

Thus, the middle-end assumes that if you have __builtin_{isfinite,isnormal},
you also have __builtin_is{less,greater}equal builtins too.
Just create those too and you'll be fine (they are also typegeneric, like
__builtin_is{finite,normal}), i.e. they take ... arguments and the FE is
supposed to verify they are ok (or, if the FE creates them artificially, it
will ensure that too).
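
Written out in plain C for double, the two folds above look roughly like this.
This is only a sketch of the equivalence, not the middle-end code; my_isfinite
and my_isnormal are names invented for the example.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* isfinite(x) -> islessequal(fabs(x),DBL_MAX)  */
int
my_isfinite (double x)
{
  return islessequal (fabs (x), DBL_MAX);
}

/* isnormal(x) -> isgreaterequal(fabs(x),DBL_MIN)
                  & islessequal(fabs(x),DBL_MAX)  */
int
my_isnormal (double x)
{
  return isgreaterequal (fabs (x), DBL_MIN) & islessequal (fabs (x), DBL_MAX);
}

void
interclass_examples (void)
{
  assert (my_isfinite (1.0));
  assert (!my_isfinite (INFINITY));
  /* The is{less,greater}equal comparisons are false for unordered
     operands, so NaN is neither finite nor normal.  */
  assert (!my_isfinite (NAN));
  /* Half of DBL_MIN is subnormal.  */
  assert (!my_isnormal (DBL_MIN / 2));
}
```

This is why a front end that registers only the isfinite/isnormal decls runs
into trouble: the folded form needs the comparison builtins as well.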

Jakub


Re: Enable EBX for x86 in 32bits PIC code

2014-09-23 Thread Jakub Jelinek
On Tue, Sep 23, 2014 at 10:00:00AM -0600, Jeff Law wrote:
> On 09/23/14 08:34, Jakub Jelinek wrote:
> >On Tue, Sep 23, 2014 at 05:54:37PM +0400, Ilya Enkovich wrote:
> >>use fixed EBX at least until we make sure pseudo PIC doesn't harm debug
> >>info generation.  If we have such option then gcc.target/i386/pic-1.c and
> >
> >For debug info, it seems you are already handling this in
> >delegitimize_address target hook, I'd suggest just building some very large
> >shared library at -O2 -g -fpic on i?86 and either look at the
> >sizes of .debug_info/.debug_loc sections with/without the patch,
> >or use the locstat utility from elfutils (talk to Petr Machata if needed).
> Can't hurt, but I really don't see how changing from a fixed to an
> allocatable register is going to muck up debug info in any significant way.

What matters is if the delegitimize_address target hook is as efficient in
delegitimization as before.  E.g. if it previously matched only when seeing
%ebx + gotoff or similar, and wouldn't match anything now, some vars could
have debug locations including UNSPEC and be dropped on the floor.

Jakub


Re: Enable EBX for x86 in 32bits PIC code

2014-09-23 Thread Jakub Jelinek
On Tue, Sep 23, 2014 at 05:54:37PM +0400, Ilya Enkovich wrote:
> use fixed EBX at least until we make sure pseudo PIC doesn't harm debug
> info generation.  If we have such option then gcc.target/i386/pic-1.c and

For debug info, it seems you are already handling this in
delegitimize_address target hook, I'd suggest just building some very large
shared library at -O2 -g -fpic on i?86 and either look at the
sizes of .debug_info/.debug_loc sections with/without the patch,
or use the locstat utility from elfutils (talk to Petr Machata if needed).

Jakub


Re: x86: combined usage of "-Os -mregparm=3" leads to broken codes

2014-09-23 Thread Jakub Jelinek
On Tue, Sep 23, 2014 at 04:20:16PM +0800, Bin Meng wrote:
> Hi Richard,
> 
> On Tue, Sep 23, 2014 at 4:09 PM, Richard Biener
>  wrote:
> > Your asm constraints do not specify that they use %edx.
> >
> > Richard.
> >
> 
> Sorry, I don't understand. The %edx is not used by the inline assembly codes.
> The 'mov(%eax),%edx' corresponds to C code:
> 
> pcall = (PCALL)(r->a + r->b);
> 
> where the %edx holds the value of r->a.

You are doing a call in the inline-asm behind compiler's back, and
some registers are call clobbered in the ABI.  So, unless you call a very
special function written in assembly that doesn't clobber those registers
(basically, uses a custom calling convention), you need to tell the compiler
that your inline-asm clobbers all call clobbered registers in the inline-asm
pattern.  That is not just about general purpose registers, but e.g.
SSE registers or i387 registers are clobbered too.

Jakub


Re: [PATCH] gcc parallel make check

2014-09-22 Thread Jakub Jelinek
On Mon, Sep 22, 2014 at 12:21:08PM -0400, Jason Merrill wrote:
> On 09/22/2014 11:58 AM, Jakub Jelinek wrote:
> >LGTM (though, supposedly we want similar change in
> >libstdc++-v3/testsuite/Makefile.am).
> >Or, if people would really like to see the commands, we could print them
> >just once, using e.g.
> > -$(if $(check_p_subno),@)(rootme= ...
> >(then e.g. check-parallel-gcc goal would print the command, but
> >check-parallel-gcc-1 or check-parallel-gcc-112 would not).
> 
> So, like this?

Ok, thanks.

> commit c750897381a3f936e27cabd825cfa85ce936a6a9
> Author: Jason Merrill 
> Date:   Mon Sep 22 11:44:00 2014 -0400
> 
> gcc/
>   * Makefile.in (check-parallel-%): Add @.
> libstdc++-v3/
>   * testsuite/Makefile.am (%/site.exp): Add @.
>   (check-DEJAGNU): Likewise.
>   * testsuite/Makefile.in: Regenerate.
> 
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index 6f251a5..97b439a 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -3674,10 +3674,10 @@ $(lang_checks_parallelized): check-% : site.exp
>   fi
>  
>  check-parallel-% : site.exp
> - -test -d plugin || mkdir plugin
> - -test -d $(TESTSUITEDIR) || mkdir $(TESTSUITEDIR)
> - test -d $(TESTSUITEDIR)/$(check_p_subdir) || mkdir 
> $(TESTSUITEDIR)/$(check_p_subdir)
> - -(rootme=`${PWD_COMMAND}`; export rootme; \
> + -@test -d plugin || mkdir plugin
> + -@test -d $(TESTSUITEDIR) || mkdir $(TESTSUITEDIR)
> + @test -d $(TESTSUITEDIR)/$(check_p_subdir) || mkdir 
> $(TESTSUITEDIR)/$(check_p_subdir)
> + -$(if $(check_p_subno),@)(rootme=`${PWD_COMMAND}`; export rootme; \
>   srcdir=`cd ${srcdir}; ${PWD_COMMAND}` ; export srcdir ; \
>   if [ -n "$(check_p_subno)" ] \
>  && [ -n "$$GCC_RUNTEST_PARALLELIZE_DIR" ] \
> diff --git a/libstdc++-v3/testsuite/Makefile.am 
> b/libstdc++-v3/testsuite/Makefile.am
> index e206aba..b4c9e85 100644
> --- a/libstdc++-v3/testsuite/Makefile.am
> +++ b/libstdc++-v3/testsuite/Makefile.am
> @@ -91,9 +91,9 @@ new-abi-baseline:
> ${extract_symvers} ../src/.libs/libstdc++.so $${output})
>  
>  %/site.exp: site.exp
> - -test -d $* || mkdir $*
> + -@test -d $* || mkdir $*
>   @srcdir=`cd $(srcdir); ${PWD_COMMAND}`;
> - objdir=`${PWD_COMMAND}`/$*; \
> + @objdir=`${PWD_COMMAND}`/$*; \
>   sed -e "s|^set srcdir .*$$|set srcdir $$srcdir|" \
>   -e "s|^set objdir .*$$|set objdir $$objdir|" \
>   site.exp > $*/site.exp.tmp
> @@ -115,7 +115,7 @@ $(check_DEJAGNU_normal_targets): check-DEJAGNUnormal%: 
> normal%/site.exp
>  
>  # Run the testsuite in normal mode.
>  check-DEJAGNU $(check_DEJAGNU_normal_targets): check-DEJAGNU%: site.exp
> - AR="$(AR)"; export AR; \
> + $(if $*,@)AR="$(AR)"; export AR; \
>   RANLIB="$(RANLIB)"; export RANLIB; \
>   if [ -z "$*" ] && [ "$(filter -j, $(MFLAGS))" = "-j" ]; then \
> rm -rf normal-parallel || true; \
> diff --git a/libstdc++-v3/testsuite/Makefile.in 
> b/libstdc++-v3/testsuite/Makefile.in
> index 59060b8..0fc26f4 100644
> --- a/libstdc++-v3/testsuite/Makefile.in
> +++ b/libstdc++-v3/testsuite/Makefile.in
> @@ -553,9 +553,9 @@ new-abi-baseline:
> ${extract_symvers} ../src/.libs/libstdc++.so $${output})
>  
>  %/site.exp: site.exp
> - -test -d $* || mkdir $*
> + -@test -d $* || mkdir $*
>   @srcdir=`cd $(srcdir); ${PWD_COMMAND}`;
> - objdir=`${PWD_COMMAND}`/$*; \
> + @objdir=`${PWD_COMMAND}`/$*; \
>   sed -e "s|^set srcdir .*$$|set srcdir $$srcdir|" \
>   -e "s|^set objdir .*$$|set objdir $$objdir|" \
>   site.exp > $*/site.exp.tmp
> @@ -566,7 +566,7 @@ $(check_DEJAGNU_normal_targets): check-DEJAGNUnormal%: 
> normal%/site.exp
>  
>  # Run the testsuite in normal mode.
>  check-DEJAGNU $(check_DEJAGNU_normal_targets): check-DEJAGNU%: site.exp
> - AR="$(AR)"; export AR; \
> + $(if $*,@)AR="$(AR)"; export AR; \
>   RANLIB="$(RANLIB)"; export RANLIB; \
>   if [ -z "$*" ] && [ "$(filter -j, $(MFLAGS))" = "-j" ]; then \
> rm -rf normal-parallel || true; \


Jakub


Re: [PATCH] gcc parallel make check

2014-09-22 Thread Jakub Jelinek
On Mon, Sep 22, 2014 at 11:43:35AM -0400, Jason Merrill wrote:
> On 09/22/2014 11:26 AM, Jakub Jelinek wrote:
> >On Mon, Sep 22, 2014 at 11:21:14AM -0400, Jason Merrill wrote:
> >>If I say 'rgt dg.exp=var-templ1.C' the actual test results are lost in the
> >>explosion of shell verbosity.  Could we add some '@'s to more of the rules,
> >>perhaps?
> >
> >I've been considering that too, but not sure what info people find valuable
> >and what they don't.
> 
> I don't see much information in the ~128 repetitions of the check-parallel
> rules with different numbers; the actual runtest command is the same in all
> of them.  Adding @ to all of the commands of the check-parallel-% rule makes
> things much better for me:

LGTM (though, supposedly we want similar change in
libstdc++-v3/testsuite/Makefile.am).
Or, if people would really like to see the commands, we could print them
just once, using e.g.
-$(if $(check_p_subno),@)(rootme= ...
(then e.g. check-parallel-gcc goal would print the command, but
check-parallel-gcc-1 or check-parallel-gcc-112 would not).

> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -3674,10 +3674,10 @@ $(lang_checks_parallelized): check-% : site.exp
>   fi
>  
>  check-parallel-% : site.exp
> - -test -d plugin || mkdir plugin
> - -test -d $(TESTSUITEDIR) || mkdir $(TESTSUITEDIR)
> - test -d $(TESTSUITEDIR)/$(check_p_subdir) || mkdir 
> $(TESTSUITEDIR)/$(check_p_subdir)
> - -(rootme=`${PWD_COMMAND}`; export rootme; \
> + -@test -d plugin || mkdir plugin
> + -@test -d $(TESTSUITEDIR) || mkdir $(TESTSUITEDIR)
> + @test -d $(TESTSUITEDIR)/$(check_p_subdir) || mkdir 
> $(TESTSUITEDIR)/$(check_p_subdir)
> + -@(rootme=`${PWD_COMMAND}`; export rootme; \
>   srcdir=`cd ${srcdir}; ${PWD_COMMAND}` ; export srcdir ; \
>   if [ -n "$(check_p_subno)" ] \
>  && [ -n "$$GCC_RUNTEST_PARALLELIZE_DIR" ] \


Jakub


Re: [PATCH] gcc parallel make check

2014-09-22 Thread Jakub Jelinek
On Mon, Sep 22, 2014 at 10:44:06AM -0500, Segher Boessenkool wrote:
> On Mon, Sep 22, 2014 at 05:26:04PM +0200, Jakub Jelinek wrote:
> > I've been considering that too, but not sure what info people find valuable
> > and what they don't.
> 
> The ten million "Running blablablalba.exp ..." messages on a very parallel
> run aren't helpful in my opinion.  There might be more but that drowns out
> everything else :-)

It has some value, it shows the actual progress.  Sure, you can just watch
the *.log files as they are populated and get better picture.  I think the
Running *.exp messages go from dejagnu, not from gcc testsuite changes.

Jakub


Re: [PATCH] gcc parallel make check

2014-09-22 Thread Jakub Jelinek
On Mon, Sep 22, 2014 at 11:21:14AM -0400, Jason Merrill wrote:
> On 09/12/2014 08:04 PM, Jakub Jelinek wrote:
> >I've been worried about the quick cases where
> >parallelization is not beneficial, like make check-gcc \
> >RUNTESTFLAGS=dg.exp=pr60123.c or similar, but one doesn't usually pass -jN
> >in that case.
> 
> I have -jN in my $MAKEFLAGS, so I've been running into this with my rgt
> shell function:
> 
> rgt ()
> {
> ( cd ~/m/$CANON/gcc/gcc;
> make check-c++ ${1:+RUNTESTFLAGS="$*"} )
> }
> 
> If I say 'rgt dg.exp=var-templ1.C' the actual test results are lost in the
> explosion of shell verbosity.  Could we add some '@'s to more of the rules,
> perhaps?

I've been considering that too, but not sure what info people find valuable
and what they don't.

Jakub


Re: Is this a compiler bug?

2014-09-22 Thread Jakub Jelinek
On Mon, Sep 22, 2014 at 10:22:59AM -0400, David Malcolm wrote:
> On Sun, 2014-09-21 at 22:15 -0700, Andrew Pinski wrote:
> > On Sun, Sep 21, 2014 at 8:08 PM, Steve Kargl
> >  wrote:
> > > On Sun, Sep 21, 2014 at 07:57:45PM -0700, Andrew Pinski wrote:
> > >> On Sun, Sep 21, 2014 at 6:56 PM, Steve Kargl
> > >>  wrote:
> > > >> > + is a binary operator.  0x3ffe is a hexadecimal-constant according
> > > >> > to 6.4.4.1 in n1256.pdf.  63 is, of course, a decimal-constant.
> > >>
> > >>
> > >> This is before tokens happen and during lexing of the program.
> > >> e+64 is exponent-part see 6.4.4.2.
> > >
> > > 6.4.4.2 applies to floating point constant.
> > > 6.4.4.1 is for integer constants.
> > 
> > Nope again, this time from bug 3885:
> > Strange as it may seem, the behavior is correct, and mandated by the C
> > Standard.  0x00E-0x00A is a single preprocessor token, of type
> > pp-number, and it must become a single compiler token, but it can't.
> > The gotcha is the `E-' sequence, that makes it seem like the exponent
> > notation of floating-point constants.
> > 
> > Looks like this is a common misunderstood part of C.
> 
> If people tend to get this wrong, should we issue a warning for it?

We already diagnose it, with an error actually, not warning:
error: invalid suffix "+63" on integer constant
etc.
Just apparently people are surprised when the compiler diagnoses it.
So perhaps some extra note explaining it (== hint how to fix that).
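
For reference, a minimal sketch of the failing spelling and two ways to write
it so that the pp-number ends before the operator (the enum and function names
are made up for the example):

```c
#include <assert.h>

/* Rejected, because "0x3ffe+63" is lexed as a single pp-number:
     int bad = 0x3ffe+63;
   error: invalid suffix "+63" on integer constant  */

/* Whitespace after the 'e' ends the pp-number.  */
enum { BIAS_SPACED = 0x3ffe + 63 };

/* Parentheses work as well.  */
enum { BIAS_PAREN = (0x3ffe)+63 };

int
bias_value (void)
{
  return BIAS_SPACED;
}

void
pp_number_examples (void)
{
  assert (BIAS_SPACED == BIAS_PAREN);
  assert (bias_value () == 0x3ffe + 63);
}
```

The same trap only shows up when the hex constant ends in e or E (or p or P)
directly followed by + or -.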

Jakub


Re: Fwd: Building gcc-4.9 on OpenBSD

2014-09-20 Thread Jakub Jelinek
On Sat, Sep 20, 2014 at 08:33:01AM +0100, Andrew Haley wrote:
> On 20/09/14 02:45, Ian Grant wrote:
> 
> > You get first prize for most informative intelligent answer so far!
> > Careful, you might get second prize too :-)
> > 
> > The problem is that we need to find a way to tell people _what_ is in
> > that "dwarf" code. Open BSD's gcc ignores it, prints a warning, and
> > goes about its business. That's probably why OpenBSD gcc 4.9 binaries
> > are 17MB against the 64MB compiled by gcc 4.9. But that is a really
> > fucking big dwarf they're stuffing in there. What is the data,
> > really? We can't just say "it's dwarf" because that doesn't really
> > mean a whole lot, does it?
> 
> It's debugging information for debuggers.  What more are you asking
> for?  Do you need to know about the structure of the debuginfo, or
> something?

And if you need that, it is easily available, see http://www.dwarfstd.org/ and
https://fedorahosted.org/elfutils/wiki/DwarfExtensions .  If you care only
about what matters for execution of the generated binaries, it is only the
allocated sections that matter (with A flag in readelf -WS output), and
all the debug info is in non-allocated sections only.

Jakub


Re: [RFC] Add asm constraint modifier to mark strict memory accesses

2014-09-18 Thread Jakub Jelinek
On Thu, Sep 18, 2014 at 03:09:34PM +0400, Yury Gribov wrote:
> Current semantics of memory constraints in GCC inline asm (i.e. "m", "v",
> etc.) is somewhat loose in that it tells GCC that asm code _may_ access
> given amount of bytes but is not guaranteed to do so. This is (ab)used by
> e.g. glibc (and also some pieces of kernel):
> __STRING_INLINE void *
> __rawmemchr (const void *__s, int __c)
> {
> ...
>   __asm__ __volatile__
> ("cld\n\t"
>  "repne; scasb\n\t"
> ...
>"m" ( *(struct { char __x[0xfff]; } *)__s)
> 
> Imprecise size specification prevents code analysis tools from understanding
> semantics of inline asm (without parsing inline asm instructions which e.g.
> Asan in Clang tries to do). In particular we can't automatically instrument
> inline asm in kernel with Kasan because we can not determine exact access
> size (see e.g. discussion in
> https://gcc.gnu.org/ml/gcc-patches/2014-05/msg02530.html).
> 
> Would it make sense to add another constraint modifier (like "=", "&", etc.)
> that would tell compiler/tool that memory access in asm is _guaranteed_ to
> have the specified size?

CCing Richard/Jeff on this for thoughts.

Would that modifier mean that the inline asm is unconditionally reading
resp. writing that memory? "m"/"=m" right now is always about might read or
might write, not must.
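
To make the "might, not must" point concrete, here is a small GNU C sketch. The struct `span16`, the function `may_read_16` and the size 16 are arbitrary choices for illustration; the cast idiom is the one from the quoted glibc code. The asm template is empty, so nothing is read at run time, yet the "m" operand is perfectly valid: it only tells the compiler what the asm is allowed to touch.

```c
/* Hypothetical helper type: lets an "m" operand cover 16 bytes. */
struct span16 { char bytes[16]; };

/* The "m" input says the asm MAY read up to 16 bytes at p.
 * The empty template reads none of them, which is allowed under the
 * current semantics; a tool cannot infer the real access size from
 * the constraint alone. */
void may_read_16(const void *p)
{
    __asm__ __volatile__("" : : "m" (*(const struct span16 *)p));
}
```

A "guaranteed access" modifier as proposed would have to reject or at least mis-describe an asm like this one, which is why the question of unconditional reads and writes matters.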

In any case, as no GCC versions support that, you'd need to heavily macroize
it in the kernel, not sure the kernel people would like that very much.

Jakub


Re: Backporting KAsan patches to 4.9 branch

2014-09-18 Thread Jakub Jelinek
On Thu, Sep 18, 2014 at 01:46:21PM +0400, Yury Gribov wrote:
> Kernel Asan patches are currently being discussed in LKML. One of the points
> raised during review was that KAsan requires GCC 5.0 which is presumably
> unstable (e.g. compilation of kernel modules has been broken for two months
> due to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61848).
> 
> Would it make sense to backport Kasan-related patches to 4.9 branch to make
> this feature more accessible to kernel developers? Quick analysis showed
> that at the very least this would require
> * r211091 (BUILT_IN_ASAN_REPORT_LOAD_N and friends)
> * r211092 (instrument unaligned accesses)
> * r211713 and r211699 (New asan-instrumentation-with-call-threshold
> parameter)
> * r213367 (initial support for -fsanitize=kernel-address)
> and also maybe ~10 bugfix patches.
> 
> Is it ok to backport these to 4.9? Note that I would discard patches for
> other sanitizers (UBsan, Tsan).

I'd say so, if it doesn't need any library changes (especially not any ABI
visible ones, guess bugfixes could be acceptable).

What asan related patches are still pending review (sorry for missing some)?
Do we have any known regressions in 5 from 4.9?  Those would need to be
resolved first.

Jakub


Re: Offloading not relocatable

2014-09-17 Thread Jakub Jelinek
On Wed, Sep 17, 2014 at 08:39:38PM +0400, Ilya Verbin wrote:
> 2014-09-17 20:30 GMT+04:00 Bernd Schmidt :
> > That's also a solved problem in nvptx mkoffload - you do need to unset these
> > environment variables when invoking the target compiler. I've posted the
> > source a few times but here it is again.
> 
> I see there:
>   unsetenv ("GCC_EXEC_PREFIX");
>   unsetenv ("COMPILER_PATH");
>   unsetenv ("LIBRARY_PATH");
> 
> Or do you mean, that there is no need to set them to the new values
> before invoking the target compiler?

If you are invoking the target compiler DRIVER (rather than compiler),
you should not have those in the environment, otherwise those env vars
will override where the target compiler driver would be looking for
the target compiler, libraries, headers etc.
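
A minimal sketch of that idea, assuming a POSIX environment. The three variable names are the ones quoted above from the nvptx mkoffload; the function names are hypothetical and this is not the actual mkoffload code.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* Variables set up for the host compiler that would misdirect the
 * offload target's driver if they were inherited. */
static const char *const host_only_vars[] = {
    "GCC_EXEC_PREFIX", "COMPILER_PATH", "LIBRARY_PATH"
};

/* Remove the host compiler's search-path variables from the environment. */
void scrub_host_compiler_env(void)
{
    for (size_t i = 0; i < sizeof host_only_vars / sizeof host_only_vars[0]; i++)
        unsetenv(host_only_vars[i]);
}

/* Scrub the environment, then replace this process with the target
 * driver. Returns -1 only if execvp itself failed. */
int exec_target_driver(const char *driver, char *const argv[])
{
    scrub_host_compiler_env();
    execvp(driver, argv);
    return -1;
}
```

With the variables gone, the target driver falls back to locating its own cc1, libraries and headers relative to where it is installed.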

Jakub


Re: Offloading not relocatable

2014-09-17 Thread Jakub Jelinek
On Wed, Sep 17, 2014 at 06:11:25PM +0200, Jakub Jelinek wrote:
> On Wed, Sep 17, 2014 at 04:21:42PM +0200, Jakub Jelinek wrote:
> > From -v dump, that sounds like a bug in mkoffload, which has been found
> > properly relatively to the gcc driver or whatever.
> > 
> > /usr/src/gcc-git/objinst/usr/local/bin/../libexec/gcc/x86_64-unknown-linux-gnu/5.0.0//accel/x86_64-intelmic-linux-gnu/mkoffload
> >  @/tmp/ccS6Q83l
> > collect2: error trying to exec 
> > '/usr/local/bin/x86_64-intelmic-linux-gnu-g++': execvp: No such file or 
> > directory
> > lto-wrapper: fatal error: 
> > /usr/src/gcc-git/objinst/usr/local/bin/../libexec/gcc/x86_64-unknown-linux-gnu/5.0.0//accel/x86_64-intelmic-linux-gnu/mkoffload
> >  terminated with signal 11 [Segmentation fault], core dumped
> > compilation terminated.
> > /usr/bin/ld: lto-wrapper failed
> > collect2: error: ld returned 1 exit status
> > 
> 
> In the same setup, I've added (as root)
> #!/bin/sh
> exec /usr/src/gcc-git/objinst/usr/local/bin/x86_64-intelmic-linux-gnu-g++ "$@"
> script to /usr/local/bin/x86_64-intelmic-linux-gnu-g++ .
> But things don't work in that case either, the error is now:
> /usr/src/gcc-git/objinst/usr/local/bin/../libexec/gcc/x86_64-unknown-linux-gnu/5.0.0//accel/x86_64-intelmic-linux-gnu/mkoffload
>  @/tmp/cc0nIjXd
> x86_64-intelmic-linux-gnu-g++: error trying to exec 'cc1plus': execvp: No 
> such file or directory
> lto-wrapper: fatal error: 
> /usr/src/gcc-git/objinst/usr/local/bin/../libexec/gcc/x86_64-unknown-linux-gnu/5.0.0//accel/x86_64-intelmic-linux-gnu/mkoffload
>  terminated with signal 11 [Segmentation fault], core dumped
> compilation terminated.
> /usr/bin/ld: lto-wrapper failed
> collect2: error: ld returned 1 exit status
> 
> which presumably means that some undesirable vars from host environment
> were kept by mkoffload in the environment for the offloading compiler
> and that has the undesirable effect of affecting how the offloading
> compiler driver works.  Because, if I invoke
> /usr/local/bin/x86_64-intelmic-linux-gnu-g++ manually, it can find cc1plus
> just fine.

unset GCC_EXEC_PREFIX
in the /usr/local/bin/x86_64-intelmic-linux-gnu-g++ script somewhat helps,
presumably mkoffload shouldn't export GCC_EXEC_PREFIX env var when
calling the driver, as it should be driver's business to find everything
else on its own.

/usr/src/gcc-git/objinst/usr/local/bin/../libexec/gcc/x86_64-unknown-linux-gnu/5.0.0//accel/x86_64-intelmic-linux-gnu/mkoffload
 @/tmp/ccySFivX
x86_64-intelmic-linux-gnu-g++: fatal error: -fuse-linker-plugin, but 
liblto_plugin.so not found
compilation terminated.
lto-wrapper: fatal error: 
/usr/src/gcc-git/objinst/usr/local/bin/../libexec/gcc/x86_64-unknown-linux-gnu/5.0.0//accel/x86_64-intelmic-linux-gnu/mkoffload
 terminated with signal 11 [Segmentation fault], core dumped
compilation terminated.
/usr/bin/ld: lto-wrapper failed
collect2: error: ld returned 1 exit status

is the next error though.
find `pwd` -name \*lto\*.so\*
/usr/src/gcc-git/objinst/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/5.0.0/liblto_plugin.so
/usr/src/gcc-git/objinst/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/5.0.0/liblto_plugin.so.0.0.0
/usr/src/gcc-git/objinst/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/5.0.0/liblto_plugin.so.0
/usr/src/gcc-git/objinst/usr/local/libexec/gcc/x86_64-intelmic-linux-gnu/5.0.0/liblto_plugin.so
/usr/src/gcc-git/objinst/usr/local/libexec/gcc/x86_64-intelmic-linux-gnu/5.0.0/liblto_plugin.so.0.0.0
/usr/src/gcc-git/objinst/usr/local/libexec/gcc/x86_64-intelmic-linux-gnu/5.0.0/liblto_plugin.so.0
strace tells me it has searched (far before that the host compiler's lto
plugin has been found just fine):
[pid  9491] access("/usr/src/gcc-git/objinst/usr/local/bin/../libexec/gcc/x86_64-unknown-linux-gnu/5.0.0/accel/x86_64-intelmic-linux-gnu/liblto_plugin.so", R_OK) = -1 ENOENT (No such file or directory)
[pid  9491] access("/usr/src/gcc-git/objinst/usr/local/bin/../libexec/gcc/liblto_plugin.so", R_OK) = -1 ENOENT (No such file or directory)
[pid  9491] access("/usr/local/libexec/gcc/x86_64-intelmic-linux-gnu/5.0.0/x86_64-unknown-linux-gnu/5.0.0/accel/x86_64-intelmic-linux-gnu/liblto_plugin.so", R_OK) = -1 ENOENT (No such file or directory)
[pid  9491] access("/usr/local/libexec/gcc/x86_64-intelmic-linux-gnu/5.0.0/liblto_plugin.so", R_OK) = -1 ENOENT (No such file or directory)
[pid  9491] access("/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/5.0.0/accel/x86_64-intelmic-linux-gnu/x86_64-unknown-linux-gnu/5.0.0/accel/x86_64-intelmic-linux-gnu/liblto_plugin.so", R_OK) = -1 ENOENT (No such file or directory)
[pid  9491] access("/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/5.0.0/accel/x86_64-intelmic-linux-gnu/liblto_plugin.s

Re: Offloading not relocatable

2014-09-17 Thread Jakub Jelinek
On Wed, Sep 17, 2014 at 04:21:42PM +0200, Jakub Jelinek wrote:
> From -v dump, that sounds like a bug in mkoffload, which has been found
> properly relatively to the gcc driver or whatever.
> 
> /usr/src/gcc-git/objinst/usr/local/bin/../libexec/gcc/x86_64-unknown-linux-gnu/5.0.0//accel/x86_64-intelmic-linux-gnu/mkoffload
>  @/tmp/ccS6Q83l
> collect2: error trying to exec 
> '/usr/local/bin/x86_64-intelmic-linux-gnu-g++': execvp: No such file or 
> directory
> lto-wrapper: fatal error: 
> /usr/src/gcc-git/objinst/usr/local/bin/../libexec/gcc/x86_64-unknown-linux-gnu/5.0.0//accel/x86_64-intelmic-linux-gnu/mkoffload
>  terminated with signal 11 [Segmentation fault], core dumped
> compilation terminated.
> /usr/bin/ld: lto-wrapper failed
> collect2: error: ld returned 1 exit status
> 

In the same setup, I've added (as root)
#!/bin/sh
exec /usr/src/gcc-git/objinst/usr/local/bin/x86_64-intelmic-linux-gnu-g++ "$@"
script to /usr/local/bin/x86_64-intelmic-linux-gnu-g++ .
But things don't work in that case either, the error is now:
/usr/src/gcc-git/objinst/usr/local/bin/../libexec/gcc/x86_64-unknown-linux-gnu/5.0.0//accel/x86_64-intelmic-linux-gnu/mkoffload
 @/tmp/cc0nIjXd
x86_64-intelmic-linux-gnu-g++: error trying to exec 'cc1plus': execvp: No such 
file or directory
lto-wrapper: fatal error: 
/usr/src/gcc-git/objinst/usr/local/bin/../libexec/gcc/x86_64-unknown-linux-gnu/5.0.0//accel/x86_64-intelmic-linux-gnu/mkoffload
 terminated with signal 11 [Segmentation fault], core dumped
compilation terminated.
/usr/bin/ld: lto-wrapper failed
collect2: error: ld returned 1 exit status

which presumably means that some undesirable vars from host environment
were kept by mkoffload in the environment for the offloading compiler
and that has the undesirable effect of affecting how the offloading
compiler driver works.  Because, if I invoke
/usr/local/bin/x86_64-intelmic-linux-gnu-g++ manually, it can find cc1plus
just fine.

Jakub


Re: Offloading not relocatable

2014-09-17 Thread Jakub Jelinek
On Wed, Sep 17, 2014 at 04:21:42PM +0200, Jakub Jelinek wrote:
> So, I'd say that mkoffload should be using similar stuff like the gcc.c
> driver uses to find cc1 relative to the location of the gcc binary.

E.g. strace can tell where gcc driver is looking for cc1:
mv /usr/src/gcc-git/objinst/usr/local/bin/../libexec/gcc/x86_64-unknown-linux-gnu/5.0.0/cc1{,.xx}
strace -e access,stat /usr/src/gcc-git/objinst/usr/local/bin/gcc -fopenmp -o target-1.s target-1.c -S
access("/etc/ld.so.preload", R_OK)  = -1 ENOENT (No such file or directory)
access("/usr/src/gcc-git/objinst/usr/local/bin/", X_OK) = 0
access("/usr/src/gcc-git/objinst/usr/local/bin/", X_OK) = 0
access("target-1.c", F_OK)  = 0
access("/usr/src/gcc-git/objinst/usr/local/bin/", X_OK) = 0
access("/usr/src/gcc-git/objinst/usr/local/bin/", X_OK) = 0
access("/usr/src/gcc-git/objinst/usr/local/bin/../lib/gcc/x86_64-unknown-linux-gnu/5.0.0/specs",
 R_OK) = -1 ENOENT (No such file or directory)
access("/usr/src/gcc-git/objinst/usr/local/bin/../lib/gcc/specs", R_OK) = -1 
ENOENT (No such file or directory)
access("/usr/src/gcc-git/objinst/usr/local/bin/../lib/gcc/x86_64-unknown-linux-gnu/5.0.0/../../../../x86_64-unknown-linux-gnu/lib/x86_64-unknown-linux-gnu/5.0.0/specs",
 R_OK) = -1 ENOENT (No such file or directory)
access("/usr/src/gcc-git/objinst/usr/local/bin/../lib/gcc/x86_64-unknown-linux-gnu/5.0.0/../../../../x86_64-unknown-linux-gnu/lib/specs",
 R_OK) = -1 ENOENT (No such file or directory)
access("/usr/local/lib/gcc/x86_64-unknown-linux-gnu/specs", R_OK) = -1 ENOENT 
(No such file or directory)
access("/usr/src/gcc-git/objinst/usr/local/bin/", X_OK) = 0
stat("/usr/src/gcc-git/objinst/usr/local/bin/../libexec/gcc/x86_64-unknown-linux-gnu/5.0.0/cc1",
 0x7fff73be0aa0) = -1 ENOENT (No such file or directory)
stat("/usr/src/gcc-git/objinst/usr/local/bin/../libexec/gcc/cc1", 
0x7fff73be0aa0) = -1 ENOENT (No such file or directory)
stat("/usr/src/gcc-git/objinst/usr/local/bin/../lib/gcc/x86_64-unknown-linux-gnu/5.0.0/../../../../x86_64-unknown-linux-gnu/bin/x86_64-unknown-linux-gnu/5.0.0/cc1",
 0x7fff73be0aa0) = -1 ENOENT (No such file or directory)
stat("/usr/src/gcc-git/objinst/usr/local/bin/../lib/gcc/x86_64-unknown-linux-gnu/5.0.0/../../../../x86_64-unknown-linux-gnu/bin/cc1",
 0x7fff73be0aa0) = -1 ENOENT (No such file or directory)
gcc: error trying to exec 'cc1': execvp: No such file or directory

mkoffload should make similar attempts to find the offloading compiler
driver.  It should try (relative to mkoffload binary) probably:
../../../../../bin/x86_64-intelmic-linux-gnu-g++
(why does it try g++ and not gcc btw?), then perhaps:
./x86_64-intelmic-linux-gnu-g++
(so that one can e.g. put a symlink next to mkoffload and point it
wherever needed)
and perhaps as last fallback, try to execute x86_64-intelmic-linux-gnu-g++
from PATH env var.
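
The search order suggested above could be sketched roughly as follows. This is only an illustration, not mkoffload's code: `find_driver` is a hypothetical helper, the list of relative directories is passed in by the caller, and the exact number of `..` components depends on the install layout.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Try each directory in rel_dirs (relative to the directory holding
 * mkoffload) for an executable named driver.
 * Returns  1 and a full path in out if one was found,
 *          0 and the bare driver name in out as the last fallback
 *            (so that execvp will search PATH),
 *         -1 if out is too small. */
int find_driver(const char *mkoffload_dir, const char *const *rel_dirs,
                size_t n_rel, const char *driver, char *out, size_t outsz)
{
    for (size_t i = 0; i < n_rel; i++) {
        int len = snprintf(out, outsz, "%s/%s/%s",
                           mkoffload_dir, rel_dirs[i], driver);
        if (len < 0 || (size_t)len >= outsz)
            return -1;
        if (access(out, X_OK) == 0)
            return 1;
    }

    /* Nothing found next to the installation: leave it to PATH. */
    size_t n = strlen(driver);
    if (n >= outsz)
        return -1;
    memcpy(out, driver, n + 1);
    return 0;
}
```

A caller would pass something like `{ "../../../../../bin", "." }` for `rel_dirs`, matching the order in the message, and hand the result to execvp.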

Jakub


Offloading not relocatable

2014-09-17 Thread Jakub Jelinek
Hi!

So, with the patch (probably undesirable) I've posted, I've managed to
configure/make/make install DESTDIR=... the offload compiler and
configure/make/make install DESTDIR=... the host compiler.

Generally gcc is a relocatable package, so apart from perhaps needing to
add LD_LIBRARY_PATH it doesn't matter if the gcc installed tree as a whole
lives somewhere else than prefix (DESTDIR=/usr/src/gcc-git/objinst is
what I've been using), one can still use the compiler normally.

But apparently that is not the case with offloading:

/usr/src/gcc-git/objinst/usr/local/bin/gcc -fopenmp -o target-1 target-1.c
collect2: error trying to exec '/usr/local/bin/x86_64-intelmic-linux-gnu-g++': 
execvp: No such file or directory
lto-wrapper: fatal error: 
/usr/src/gcc-git/objinst/usr/local/bin/../libexec/gcc/x86_64-unknown-linux-gnu/5.0.0//accel/x86_64-intelmic-linux-gnu/mkoffload
 terminated with signal 11 [Segmentation fault], core dumped
compilation terminated.
/usr/bin/ld: lto-wrapper failed
collect2: error: ld returned 1 exit status

From -v dump, that sounds like a bug in mkoffload, which has been found
properly relatively to the gcc driver or whatever.

/usr/src/gcc-git/objinst/usr/local/bin/../libexec/gcc/x86_64-unknown-linux-gnu/5.0.0//accel/x86_64-intelmic-linux-gnu/mkoffload
 @/tmp/ccS6Q83l
collect2: error trying to exec '/usr/local/bin/x86_64-intelmic-linux-gnu-g++': 
execvp: No such file or directory
lto-wrapper: fatal error: 
/usr/src/gcc-git/objinst/usr/local/bin/../libexec/gcc/x86_64-unknown-linux-gnu/5.0.0//accel/x86_64-intelmic-linux-gnu/mkoffload
 terminated with signal 11 [Segmentation fault], core dumped
compilation terminated.
/usr/bin/ld: lto-wrapper failed
collect2: error: ld returned 1 exit status

So, I'd say that mkoffload should be using similar stuff like the gcc.c
driver uses to find cc1 relative to the location of the gcc binary.

Also, unrelated to that, I've mentioned to Kirill on IRC privately that
it would be nice if the libgomp offloading plugin API/ABI used some
clearly libgomp & offloading related prefix for all the:
get_type
get_num_devices
offload_register
device_init
device_get_table
device_alloc
device_free
device_dev2host
device_host2dev
device_run
symbols that are part of the plugin interface (GOMP_OFFLOAD_*, or whatever
else).
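
One way such a prefix could be applied uniformly is with a stringizing macro, sketched below. `GOMP_OFFLOAD_` is only the prefix floated in this message, and the macro and helper names are invented for the example; nothing here reflects the actual libgomp plugin interface.

```c
#include <string.h>

/* Prefix suggested in the message; an assumption, not an existing ABI. */
#define GOMP_OFFLOAD_PREFIX "GOMP_OFFLOAD_"

/* Build the prefixed symbol name as a string literal, e.g. for dlsym. */
#define GOMP_OFFLOAD_SYM(name) GOMP_OFFLOAD_PREFIX #name

/* The plugin entry points listed above, with the prefix applied. */
const char *const plugin_symbols[] = {
    GOMP_OFFLOAD_SYM(get_type),
    GOMP_OFFLOAD_SYM(get_num_devices),
    GOMP_OFFLOAD_SYM(offload_register),
    GOMP_OFFLOAD_SYM(device_init),
    GOMP_OFFLOAD_SYM(device_get_table),
    GOMP_OFFLOAD_SYM(device_alloc),
    GOMP_OFFLOAD_SYM(device_free),
    GOMP_OFFLOAD_SYM(device_dev2host),
    GOMP_OFFLOAD_SYM(device_host2dev),
    GOMP_OFFLOAD_SYM(device_run),
};

/* Check that a symbol name carries the plugin prefix. */
int has_offload_prefix(const char *symbol)
{
    return strncmp(symbol, GOMP_OFFLOAD_PREFIX,
                   strlen(GOMP_OFFLOAD_PREFIX)) == 0;
}
```

Generic names such as `get_type` are easy to collide with in a shared object; the prefixed forms make it obvious which exported symbols belong to the plugin interface.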

Jakub


Re: Designated Initializers in C++

2014-09-16 Thread Jakub Jelinek
On Tue, Sep 16, 2014 at 01:14:35PM +0200, David Brown wrote:
> After a recent discussion about designated initializers in C++, I
> noticed that they are accepted by modern gcc (when gcc extensions are
> enabled).
> 
> On , the
> documentation specifically says "This extension is not implemented in
> GNU C++".  That certainly used to be the case - as far as I have tested,
> it was implemented in gcc 4.6 or 4.7.
> 
> Could someone with commit access to the documentation fix that sentence?

???  Designated initializers definitely are not implemented in G++,
what there is is just very limited parsing, but the C++ FE requires that
the designators are just useless annotations, you can't initialize even
a POD out of order with the designators, skip some field, initialize
something twice etc.
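
For contrast, this is what real designated initializers allow in C (C99 and later). The struct and function are made up for illustration; per the message above, this kind of out-of-order, partial initialization is what G++ does not accept.

```c
/* A plain aggregate with three members, declared in the order x, y, z. */
struct point {
    int x;
    int y;
    int z;
};

/* In C the designators may appear out of declaration order (z before x)
 * and members may be skipped; skipped members (y here) are
 * zero-initialized. Naming a member twice is also permitted in C, with
 * the last initializer winning, though GCC can warn about it. */
struct point make_point(void)
{
    struct point p = { .z = 3, .x = 1 };
    return p;
}
```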

Jakub


Re: [PATCH] gcc parallel make check

2014-09-16 Thread Jakub Jelinek
On Tue, Sep 16, 2014 at 11:20:37AM +0200, Richard Biener wrote:
> > This confuses me, but, no matter.  Isn’t 8hrs time data?  :-)

It is, but not time(1) data, just wall clock computed from subtracting
mtimes of my make check output log and make -j48 bootstrap log.
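
As a sketch of that kind of measurement, assuming POSIX `stat`: the elapsed wall-clock time is simply the difference between the modification times of two log files. The function name and its arguments are hypothetical.

```c
#include <sys/stat.h>
#include <time.h>

/* Store in *seconds how much later the file `later` was last modified
 * than the file `earlier`. Returns 0 on success, -1 if either file
 * cannot be examined. */
int mtime_delta_seconds(const char *earlier, const char *later,
                        double *seconds)
{
    struct stat a, b;

    if (stat(earlier, &a) != 0 || stat(later, &b) != 0)
        return -1;
    *seconds = difftime(b.st_mtime, a.st_mtime);
    return 0;
}
```

This gives wall-clock time only; unlike time(1) it says nothing about user or system CPU time.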

> >> patch toplevel make -j48 -k check took:
> >> real    40m21.984s
> >> user    341m51.675s
> >> sys     112m46.993s
> >> and with the patch make -j48 -k check took:
> >> real    32m22.066s
> >> user    355m1.788s
> >> sys     117m5.809s
> >
> > These numbers are useful to try and ensure the overhead (scaling factor) is 
> > reasonable, thanks.
> 
> A nice improvement indeed.  The patched result is 15 times faster
> than the serial unpatched run.  So there is room for improvement

Note, the box used was an oldish AMD 16-core box, no HT; I haven't tried it
on anything more parallel, and it also had a normal hard disk, etc.  No idea
whether any of this is relevant to that though.
Some CPU time goes into the expect processes, I can retry the build tonight
and grab also time(1) info from make -k check to see the user/sys times for
serial testing.
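
The "15 times faster" figure can be checked with a little arithmetic on the numbers quoted in this thread (8hrs 3minutes serial, 40m21.984s and 32m22.066s real time for the two -j48 runs). The helper names below are illustrative.

```c
/* Convert hours, minutes and seconds to seconds. */
double to_seconds(int hours, int minutes, double seconds)
{
    return hours * 3600.0 + minutes * 60.0 + seconds;
}

/* How many times faster the improved run is than the baseline. */
double speedup(double baseline_seconds, double improved_seconds)
{
    return baseline_seconds / improved_seconds;
}
```

With these, `speedup(to_seconds(8, 3, 0), to_seconds(0, 32, 22.066))` comes out just under 15, and the patched -j48 run is roughly 1.25 times faster than the unpatched -j48 run.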

Jakub


Re: [PATCH] gcc parallel make check

2014-09-15 Thread Jakub Jelinek
On Fri, Sep 12, 2014 at 04:42:25PM -0700, Mike Stump wrote:
> On Sep 12, 2014, at 9:32 AM, Jakub Jelinek  wrote:
> > Here is my latest version of the patch.
> 
> I did a timing test:

Here is an updated version.
Changes since last version:
1) acats parallelized the same way (just, because it is in shell,
   using mkdir instead of open with O_EXCL|O_CREAT);
   also, as acats has a pretty significant initial setup time
   (up to a minute or so on a not really fast box), I'm now
   creating the /support stuff before spawning the parallel
   jobs and letting the parallel jobs use the same shared support
   directory
2) I'm now using addprefix instead of patsubst where appropriate
3) I'm using $(if ...) instead of $(or ...) to make it usable
   with make 3.80 (3.81 already supports or)
4) parallelization is performed for any kinds of RUNTESTFLAGS arguments now
5) struct-layout-1.exp apparently doesn't have to be performed serially,
   and for gnu-encoding.exp I've used similar change as for go-test.exp
6) in libstdc++, abi.exp, pretty-printers.exp and xmethods.exp are performed
   together with conformance.exp, so again parallelization for any
   RUNTESTFLAGS flags; abi.exp and xmethods.exp are serially tested
   by the first runtest instance to encounter them
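
The O_EXCL|O_CREAT trick mentioned in item 1 can be sketched as below, assuming POSIX. This is not the testsuite's code (that lives in the .exp files and, for acats, in shell using mkdir); `claim_slot` and the `slot-N` file naming are invented for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Try to claim work item n by creating a marker file in dir.
 * O_CREAT|O_EXCL makes creation atomic: exactly one of several
 * concurrent callers succeeds for a given n.
 * Returns 1 if this caller claimed the item, 0 if somebody else
 * already had, -1 on any other error. */
int claim_slot(const char *dir, long n)
{
    char path[4096];
    int len = snprintf(path, sizeof path, "%s/slot-%ld", dir, n);

    if (len < 0 || (size_t)len >= sizeof path)
        return -1;

    int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0644);
    if (fd >= 0) {
        close(fd);
        return 1;
    }
    return errno == EEXIST ? 0 : -1;
}
```

Each parallel job walks the same list of work items and runs only those it manages to claim, so no central scheduler is needed. mkdir offers the same atomic create-or-fail behaviour, which is why it can stand in for this in shell.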

Regtested on x86_64-linux.  Without the patch, toplevel make -k check
took 8hrs 3minutes (don't have time data for that run); without the
patch, toplevel make -j48 -k check took:
real    40m21.984s
user    341m51.675s
sys     112m46.993s
and with the patch make -j48 -k check took:
real    32m22.066s
user    355m1.788s
sys     117m5.809s
I saw over 45 jobs running pretty much up to the point where all the
testing was done, and test_summary run from the non-parallel testing
is the same as test_summary from the -j48 testing with the patch.
Is this version ok for trunk?

2014-09-14  Jakub Jelinek  

gcc/
* Makefile.in (dg_target_exps): Remove.
(check_gcc_parallelize): Change to just an upper bound number.
(check-%-subtargets): Always print the non-parallelized goals.
(check_p_vars, check_p_comma, check_p_subwork): Remove.
(check_p_count, check_p_numbers0, check_p_numbers1, check_p_numbers2,
check_p_numbers3, check_p_numbers4, check_p_numbers5,
check_p_numbers6): New variables.
(check_p_numbers): Set to sequence from 1 to .
(check_p_subdirs): Set to sequence from 1 to minimum of
$(check_p_count) and either GCC_TEST_PARALLEL_SLOTS env var if set,
or 128.
(check-%, check-parallel-%): Rewritten so that for parallelized
testing each job runs all the *.exp files, with
GCC_RUNTEST_PARALLELIZE_DIR set in environment.
gcc/go/
* Make-lang.in (check_go_parallelize): Change to just an upper bound
number.
gcc/fortran/
* Make-lang.in (check_gfortran_parallelize): Change to just an upper
bound number.
gcc/cp/
* Make-lang.in (check_g++_parallelize): Change to just an upper bound
number.
gcc/objc/
* Make-lang.in (check_objc_parallelize): Change to just an upper
bound number.
gcc/ada/
* gcc-interface/Make-lang.in (check_acats_numbers0,
check_acats_numbers1, check_acats_numbers2, check_acats_numbers3,
check_acats_numbers4, check_acats_numbers5, check_acats_numbers6,
check_acats_numbers, check_acats_subdirs): New variables.
(check_acats_targets): Use $(check_acats_subdirs).
(check-acats, check-acats%): Rewritten so that for parallelized
testing each job runs all the chapters files, with
GCC_RUNTEST_PARALLELIZE_DIR set in environment.  Prepare the support
directory sequentially and share it.
(check-acats-subtargets): Always print just check-acats.
gcc/testsuite/
* lib/gcc-defs.exp (gcc_parallel_test_run_p,
gcc_parallel_test_enable): New procedures.  If
GCC_RUNTEST_PARALLELIZE_DIR is set in environment, override
runtest_file_p to invoke also gcc_parallel_test_run_p.
* g++.dg/guality/guality.exp (check_guality): Save/restore
test_counts array around the body of the procedure.
* gcc.dg/guality/guality.exp (check_guality): Likewise.
* g++.dg/plugin/plugin.exp: Run all the tests serially
by the first parallel runtest encountering it.
* gcc.dg/plugin/plugin.exp: Likewise.
* gcc.misc-tests/matrix1.exp: Likewise.
* gcc.misc-tests/dhry.exp: Likewise.
* gcc.misc-tests/acker1.exp: Likewise.
* gcc.misc-tests/linkage.exp: Likewise.
* gcc.misc-tests/mg.exp: Likewise.
* gcc.misc-tests/mg-2.exp: Likewise.
* gcc.misc-tests/sort2.exp: Likewise.
* gcc.misc-tests/sieve.exp: Likewise.
* gcc.misc-tests/options.exp: Likewise.
* gcc.misc-tests/help.exp: Likewise.
* go.test/go-test.exp (go-gc-tests): Use
gcc_parallel_test_enable {0
