date:20140903

RE: selective linking of floating point support for printf / scanf

2014-09-03 Thread Thomas Preud'homme

 From: Joseph S. Myers [mailto:jos...@codesourcery.com]
 Sent: Tuesday, September 02, 2014 11:29 PM
 
 Identifiers beginning with a single underscore are reserved with file
 scope.  This means an application cannot provide an external definition of
 them, because such an external definition would have file scope.  So it's
 fine for the implementation to define such identifiers and use them in the
 implementation of standard functions.

Ah yes, I confused file scope with file scope plus internal linkage. So there
shouldn't be any problem, since _printf_float and _scanf_float are only used
with external linkage and no macro refers to them.

Best regards,

Thomas

Re: Some questions about pass web

2014-09-03 Thread Bin.Cheng

On Wed, Sep 3, 2014 at 7:35 AM, Carrot Wei car...@google.com wrote:
 Hi

 I have following questions about web (pseudo register renaming) pass:

 1. It is well known that register renaming is a big help to register
 allocation, but in gcc's backend, the web pass is far before RA, there
 are about 20 passes between them. Does it mean register renaming can
 also heavily benefit other optimizations? And the passes between them
 usually don't generate more register renaming chances?
I think one purpose is to break long dependency chains into short ones.
For example, with the code below

   use(i)
   i = i + 1;
   ...
   use(i)
   i = i + 1;
   ...
   use(i)
   i = i + 1;
   ...

Pass fweb could change it into the form below

   use(i)
   i0 = i + 1
   ...
   use(i0)
   i1 = i0 + 1
   ...
   use(i1)
   i = i0 + 2
   ...

Evidently, the latter form has shorter chains, which makes the df stuff more efficient.
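
The renaming can also be pictured at the C source level. This is only an illustrative sketch: the real pass works on RTL pseudo registers, and the function names chain_single_name and chain_renamed are hypothetical. Both functions compute the same value; the second gives each definition its own name, so every def-use chain is short.

```c
#include <assert.h>

/* One name, i, carries several independent live ranges. */
int chain_single_name(int i)
{
    int acc = 0;
    acc += i;          /* use(i) */
    i = i + 1;
    acc += i;          /* use(i) */
    i = i + 1;
    acc += i;          /* use(i) */
    i = i + 1;
    return acc + i;
}

/* After renaming, each definition has its own name (i0, i1, i2). */
int chain_renamed(int i)
{
    int acc = 0;
    acc += i;          /* use(i)  */
    int i0 = i + 1;
    acc += i0;         /* use(i0) */
    int i1 = i0 + 1;
    acc += i1;         /* use(i1) */
    int i2 = i1 + 1;
    return acc + i2;
}
```

Calling both with the same argument should give the same result, which is the property a renaming pass has to preserve.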


 2. It looks current web pass can't handle AUTOINC expressions, a reg
 operand is used as both use and def reference in an AUTOINC
 expression, so this def side should not be renamed. Pass web doesn't
 explicitly check this case, may rename the reg operand of AUTOINC
 expression. Is this expected because it is before auto_inc_dec pass?

Last time I tried, several passes after loop_done and before
auto-inc-dec couldn't handle auto-increment addressing modes, including
fweb.

 3. Are AUTOINC expressions only generated by auto_inc_dec pass? All
 passes before auto_inc_dec including expand should not generate
 AUTOINC expressions, otherwise it will break web.
Yes.  Yet other passes may generate auto-inc friendly instruction
patterns so that auto-inc-dec can capture more opportunities.  IVOPT is a
typical example.

Thanks,
bin


 Could anybody help to answer these questions?

 thanks a lot
 Guozhi Wei

Re: GCC ARM: aligned access

2014-09-03 Thread Bin.Cheng

On Mon, Sep 1, 2014 at 9:14 AM, Peng Fan van.free...@gmail.com wrote:


 On 09/01/2014 08:09 AM, Matt Thomas wrote:

 On Aug 31, 2014, at 11:32 AM, Joel Sherrill joel.sherr...@oarcorp.com 
 wrote:

 Hi,

 I am writing some code and found that the system crashed. The cause was an
 unaligned access, which raises a `data abort` exception. I wrote a small piece
 of code and ran objdump on it. I am not sure whether this is right or not.

 command:
 arm-poky-linux-gnueabi-gcc -marm -mno-thumb-interwork -mabi=aapcs-linux
 -mword-relocations -march=armv7-a -mno-unaligned-access
 -ffunction-sections -fdata-sections -fno-common -ffixed-r9 -msoft-float
 -pipe  -O2 -c 2.c -o 2.o

 arch is armv7-a and I used '-mno-unaligned-access'

 I think this is totally expected. You were passed a u8 pointer which is 
 aligned for that type (no restrictions likely). You cast it to a type with 
 stricter alignment requirements. The code is just flawed. Some CPUs handle 
 unaligned accesses but not your ARM.

 The armv7 and armv6 architectures, except armv6-m, support unaligned access. A u8 pointer is 
 cast to a u32 pointer; should gcc take the alignment problem into consideration 
 to avoid possible errors, given that -mno-unaligned-access is passed?
 While armv7 and armv6 supports unaligned access, that support has to be
 enabled by the underlying O/S.  Not knowing the underlying environment,
 I can't say whether that support is enabled.  One issue we had in NetBSD
 in moving to gcc4.8 was that the NetBSD/arm kernel didn't enable unaligned
 access for armv[67] CPUs.  We quickly changed things so unaligned access
 is supported.

 Yeah. By setting a hardware bit in the ARM coprocessor, unaligned access will not 
 cause a data abort exception.
 I just wonder whether the following is correct. Should gcc take responsibility for 
 handling a possibly unaligned pointer `u8 *data`, since 
 -mno-unaligned-access is passed to gcc?
I suppose not.  It is an explicit type conversion, so I think the compiler
assumes you take the responsibility.
Actually you can dump the final rtl using -fdump-rtl-shorten and look at
the memory alignment information.  In my experiment, it's A32 with
-mno-unaligned-access, which means it's 32-bit aligned.

Thanks,
bin

 int func(u8 *data)
 {
 return *(unsigned int *)data;
 }

  func:
0: e590  ldr r0, [r0]
4: e12fff1e  bx  lr
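
For comparison, a common way to read a 32-bit value from a byte pointer without promising the compiler any alignment is to copy it through memcpy. This is a minimal sketch, not something proposed in the thread; the name load_u32 and the u8 typedef are assumptions made for illustration. Because memcpy takes byte pointers, the compiler may only assume byte alignment for the source, and on a target built with -mno-unaligned-access it is then free to emit byte loads.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint8_t u8;   /* assumed definition of the thread's u8 */

/* Read 32 bits starting at data, whatever its alignment.  The result uses
   the host byte order, exactly as the cast version would on aligned data. */
uint32_t load_u32(const u8 *data)
{
    uint32_t value;
    memcpy(&value, data, sizeof value);
    return value;
}
```

The cast in func tells the compiler the pointer is suitably aligned for unsigned int; the memcpy form makes no such claim.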

 Regards,
 Peng.

Re: Bounded array type?

2014-09-03 Thread Florian Weimer


On 09/02/2014 11:22 PM, James Nelson wrote:


This is error-prone because even though a size parameter is given, the code
in the function has no requirement to enforce it. With a bounded array
type, the prototype looks like this:

buf *foo(char buf[sz], size_t sz);


GCC already has a syntax extension to support this: 
https://gcc.gnu.org/onlinedocs/gcc/Variable-Length.html



The compiler now knows how large `buf` is, and it can put bounds checks
into the code (which may be disabled with -O3).


We tried this, but it is hard to find information about it, see “Bounded 
Pointers”.


Nowadays, there is -fsanitize=object-size, but I don't know if it uses 
VLA lengths: https://gcc.gnu.org/ml/gcc-patches/2014-07/msg00923.html


Historically, propagation of object sizes from malloc and VLAs to 
__builtin_object_size was rather incomplete.


--
Florian Weimer / Red Hat Product Security

Re: Some questions about pass web

2014-09-03 Thread Steven Bosscher

On Wed, Sep 3, 2014 at 1:35 AM, Carrot Wei wrote:
 1. It is well known that register renaming is a big help to register
 allocation, but in gcc's backend, the web pass is far before RA, there
 are about 20 passes between them. Does it mean register renaming can
 also heavily benefit other optimizations?

Yes - sometimes anyway. Most non-SSA data flow analyses can't look
through pseudos that have multiple non-overlapping live ranges. Think
constant/copy propagation, value numbering, etc. After passes that
duplicate basic blocks, and not renaming registers, you get false
dependencies that hide RA and scheduling opportunities. This is why
pass_web is after code-duplication transformations like (RTL) loop
unrolling but before the last RTL CPROP pass. The old RA couldn't do
register renaming, so something before RA had to take care of it.
Enter pass_web.

But this is less relevant in GCC today, where RTL code transformations
basically should be limited to simple local transformations, with the
more difficult global transformations already done on GIMPLE
(including live range splitting, part of out-of-SSA). On top of that,
IRA knows how to do some forms of live range splitting (but not within
loops, AFAIU, because a loop is a single region in IRA). If someone
would give some TLC to the RTL loop-unroll.c code for
IV-splitting/accumulator-expanding and make them enabled by default, I
doubt pass_web would be doing much at all.


 And the passes between them
 usually don't generate more register renaming chances?

Usually not. Most of them create new pseudos for newly inserted expressions.

Some passes are actually harmed by pass_web. auto_inc_dec is one of those.


 2. It looks current web pass can't handle AUTOINC expressions, a reg
 operand is used as both use and def reference in an AUTOINC
 expression, so this def side should not be renamed. Pass web doesn't
 explicitly check this case, may rename the reg operand of AUTOINC
 expression. Is this expected because it is before auto_inc_dec pass?

You already found the DF_REF_READ_WRITE bits. pass_web also handles
match_dup constraints. That should be enough. If it is not, then
please file a bug report (and feel free to assign it to me).


 3. Are AUTOINC expressions only generated by auto_inc_dec pass? All
 passes before auto_inc_dec including expand should not generate
 AUTOINC expressions, otherwise it will break web.

IIRC, it used to be that only push/pop could be AUTOINC before
auto_inc_dec. I'm not sure if this is still true today.

Ciao!
Steven

Re: Some questions about pass web

2014-09-03 Thread Steven Bosscher

On Wed, Sep 3, 2014 at 9:17 AM, Bin.Cheng wrote:
 Last time I tried, there are several passes after loop_done and before
 auto-inc-dec can't handle auto-increment addressing mode, including
 fweb.

It surprises me that pass_web can't handle AUTOINC. Perhaps I'm off my
rocker, but it's always been my understanding that almost all passes
handle AUTOINC just fine (or at least conservatively: punt if you see
an AUTOINC), and that only CSE really doesn't know about AUTOINC at
all.

Ciao!
Steven

gcc parallel make check

2014-09-03 Thread VandeVondele Joost

I've noticed that

make -j -k check-fortran

results in a serialized checking, while

make -j32 -k check-fortran

goes parallel. Somehow the explicit 'N' in -jN seems to be needed for the check 
target, while the other targets seem to do just fine. Is that a feature, or 
should I file a PR for that... ?

Somewhat related is there a rule of thumb on how is the granularity of 
parallel check decided ? E.g. check-fortran seems to be limited to about ~5 
parallel targets, which is few for a typical server (but of course a welcome 
speedup already).

Thanks,

Joost

Re: gcc parallel make check

2014-09-03 Thread Marc Glisse


On Wed, 3 Sep 2014, VandeVondele  Joost wrote:


I've noticed that

make -j -k check-fortran

results in a serialized checking, while

make -j32 -k check-fortran

goes parallel. Somehow the explicit 'N' in -jN seems to be needed for the check 
target, while the other targets seem to do just fine. Is that a feature, or 
should I file a PR for that... ?


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53155

--
Marc Glisse

Re: gcc parallel make check

2014-09-03 Thread Jakub Jelinek

On Wed, Sep 03, 2014 at 09:15:51AM +, VandeVondele  Joost wrote:
 I've noticed that
 
 make -j -k check-fortran
 
 results in a serialized checking, while
 
 make -j32 -k check-fortran
 
 goes parallel. Somehow the explicit 'N' in -jN seems to be needed for the
 check target, while the other targets seem to do just fine.  Is that a
 feature, or should I file a PR for that...  ?

It is intentional.  With -j it is essentially a fork bomb, just don't use
it.

 Somewhat related is there a rule of thumb on how is the granularity of
 parallel check decided ?  E.g.  check-fortran seems to be limited to about
 ~5 parallel targets, which is few for a typical server (but of course a
 welcome speedup already).

The splitting has some cost (e.g. lots of various checks are cached, with
split jobs they need to be done in each separate goal), and the goal of the
split is toplevel make check parallelization, not individual directory or
language testing.  For the latter perhaps more fine grained split could be
useful, but how would one find out if it is a toplevel make check, or say
make -C gcc check where you test many languages, or check-gfortran?

Jakub

RE: gcc parallel make check

2014-09-03 Thread VandeVondele Joost

 It is intentional.  With -j it is essentially a fork bomb, just don't use it.

well, silently ignoring it for just this target did cost me a lot of time, 
while an eventual fork bomb would have been dealt with much more quickly.

 Somewhat related is there a rule of thumb on how is the granularity of
 parallel check decided ?  E.g.  check-fortran seems to be limited to about
 ~5 parallel targets, which is few for a typical server (but of course a
 welcome speedup already).

The splitting has some cost (e.g. lots of various checks are cached, with
split jobs they need to be done in each separate goal), and the goal of the
split is toplevel make check parallelization, not individual directory or
language testing.  For the latter perhaps more fine grained split could be
useful, but how would one find out if it is a toplevel make check, or say
make -C gcc check where you test many languages, or check-gfortran?

the cost must be small compared to the possible gain... on a 32 core server, 
testing of fortran FE changes would be 4x larger. I notice that even on a full 
check, the Fortran tests are still running when the number of processes is 
already way below 32. However, the longest running (by a few minutes) are those:

expect -- /usr/share/dejagnu/runtest.exp --tool gcc lto.exp weak.exp tls.exp 
ipa.exp tree-ssa.exp debug.exp dwarf2.exp fixed-point.exp vxworks.exp 
cilk-plus.exp vmx.exp pch.exp simulate-thread.exp x86_64-costmodel-vect.exp 
i386-costmodel-vect.exp spu-costmodel-vect.exp ppc-costmodel-vect.exp 
charset.exp noncompile.exp tsan.exp graphite.exp compat.exp
expect -- /usr/share/dejagnu/runtest.exp --tool g++ lto.exp tls.exp gcov.exp 
debug.exp dwarf2.exp cilk-plus.exp pch.exp bprob.exp simulate-thread.exp 
vect.exp charset.exp tsan.exp graphite.exp compat.exp struct-layout-1.exp 
ubsan.exp tm.exp gomp.exp dfp.exp tree-prof.exp stackalign.exp plugin.exp 
guality.exp asan.exp ecos.exp

so can those be run more independently ?

RE: gcc parallel make check

2014-09-03 Thread VandeVondele Joost


 What did you expect for -j alone? an error?

No, as is standard in gnu make, a new process for any target that can be 
processed (i.e. unlimited).

 ... check-fortran seems to be limited to about ~5 parallel targets ...

Running make with -j8 gives 7 directories gfortran[1-6]? in gcc/testsuite/.
Note that the load balancing could be improved: a few minutes with a single
thread over ~20 minutes.

I'd like to have roughly 32 directories (or as many as -jN allows for).

Re: gcc parallel make check

2014-09-03 Thread Jakub Jelinek

On Wed, Sep 03, 2014 at 09:37:19AM +, VandeVondele  Joost wrote:
  It is intentional.  With -j it is essentially a fork bomb, just don't use 
  it.
 
 well, silently ignoring it for just this target did cost me a lot of time, 
 while an eventual fork bomb would have been dealt with much more quickly.
 
  Somewhat related is there a rule of thumb on how is the granularity of
  parallel check decided ?  E.g.  check-fortran seems to be limited to about
  ~5 parallel targets, which is few for a typical server (but of course a
  welcome speedup already).
 
 The splitting has some cost (e.g. lots of various checks are cached, with
 split jobs they need to be done in each separate goal), and the goal of the
 split is toplevel make check parallelization, not individual directory or
 language testing.  For the latter perhaps more fine grained split could be
 useful, but how would one find out if it is a toplevel make check, or say
 make -C gcc check where you test many languages, or check-gfortran?
 
 the cost must be small compared to the possible gain... on a 32 core
 server, testing of fortran FE changes would be 4x larger.  I notice that

It depends.  For make -j2 if you split check-gfortran alone into 32 pieces,
check-gcc into 32 pieces, check-g++ into 32 pieces, libstdc++ into 32 pieces
etc., it might be too much.
The problem with a too fine-grained split, beyond the cost of starting the
testing in each goal and rerunning the various cached tests, is that once you
want to split a single *.exp job into smaller parts, you need to use
wildcards. That becomes a maintenance problem: you don't want any test to run
more than once, or not at all, even if new tests with weirdo names
are added later.

 even on a full check, the Fortran tests are still running when the number
 of processes is already way below 32.  However, the longest running (by a
 few minutes) are those:
 
 expect -- /usr/share/dejagnu/runtest.exp --tool gcc lto.exp weak.exp tls.exp 
 ipa.exp tree-ssa.exp debug.exp dwarf2.exp fixed-point.exp vxworks.exp 
 cilk-plus.exp vmx.exp pch.exp simulate-thread.exp x86_64-costmodel-vect.exp 
 i386-costmodel-vect.exp spu-costmodel-vect.exp ppc-costmodel-vect.exp 
 charset.exp noncompile.exp tsan.exp graphite.exp compat.exp
 expect -- /usr/share/dejagnu/runtest.exp --tool g++ lto.exp tls.exp gcov.exp 
 debug.exp dwarf2.exp cilk-plus.exp pch.exp bprob.exp simulate-thread.exp 
 vect.exp charset.exp tsan.exp graphite.exp compat.exp struct-layout-1.exp 
 ubsan.exp tm.exp gomp.exp dfp.exp tree-prof.exp stackalign.exp plugin.exp 
 guality.exp asan.exp ecos.exp
 
 so can those be run more independently ?

It is a moving target, new tests are added every day.  I'm trying to adjust
it during stage3/stage4 occasionally, but it also very much depends on
which target it is (e.g. i?86/x86_64 has many more tests in i386.exp than
other targets in their gcc.target), how fast the compiler is on the target
(e.g. on some targets -g is much slower than on others, etc.).

Jakub

Re: gcc parallel make check

2014-09-03 Thread Tobias Burnus

Hi Joost,

VandeVondele Joost wrote:
 I've noticed that 
 make -j -k check-fortran
 results in a serialized checking, while 
 make -j32 -k check-fortran
 goes parallel.

I have to admit that I don't know why that's the case. However, I can answer
the next question, which is presumably related to this one:

 Somewhat related is there a rule of thumb on how is the granularity
 of parallel check decided ?

DejaGNU is not able to run checks in parallel - thus, we have only makefile
parallelization (check-gcc, check-gfortran). As that wasn't sufficient,
Jakub (?) split the single tests into multiple ones, trying to ensure that
those subtargets all take about the same time.

See: gcc/fortran/Make-lang.in, which has:

# For description see comment above check_gcc_parallelize in gcc/Makefile.in.
check_gfortran_parallelize = dg.exp=gfortran.dg/\[adAD\]* \
 dg.exp=gfortran.dg/\[bcBC\]* \
 dg.exp=gfortran.dg/\[nopNOP\]* \
 dg.exp=gfortran.dg/\[isuvISUV\]* \
 dg.exp=gfortran.dg/\[efhkqrxzEFHKQRXZ\]* \
 dg.exp=gfortran.dg/\[0-9gjlmtwyGJLMTWY\]*

Thus, you currently get 6 parallel check-gfortran checks - followed by one
which tries to combine the results.


I think Diego has some means to run GCC's tests in a vastly parallel way, which
breaks due to a test-framework issue / gfortran.dg-dependency issues. See
PR56408. Thus, you could ask him how he does it.  Additionally, I wouldn't
mind if some lispy person could look at the PR - my attempts failed, but,
admittedly, I didn't spend much time on it.

Tobias

PS: There was/is the recurring thought of replacing DejaGNU with a different
framework or enhancing it, but not much substantial work has happened,
despite some occasional effort.
At least DejaGNU is now back under maintenance, cf.
http://git.savannah.gnu.org/gitweb/?p=dejagnu.git;a=summary

RE: gcc parallel make check

2014-09-03 Thread VandeVondele Joost

 I have to admit that I don't know why that's the case. 

Actually Marc answered that one (I had the wrong mail address for gcc@ so 
repeat here):
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53155

 See: gcc/fortran/Make-lang.in, which has:

I'll have a look and do some testing what the gains/costs of a further split 
are.

Joost

RE: gcc parallel make check

2014-09-03 Thread VandeVondele Joost

 expect -- /usr/share/dejagnu/runtest.exp --tool gcc lto.exp weak.exp tls.exp 
 ipa.exp tree-ssa.exp debug.exp dwarf2.exp fixed-point.exp vxworks.exp 
 cilk-plus.exp vmx.exp pch.exp simulate-thread.exp x86_64-costmodel-vect.exp 
 i386-costmodel-vect.exp spu-costmodel-vect.exp ppc-costmodel-vect.exp 
 charset.exp noncompile.exp tsan.exp graphite.exp compat.exp
 expect -- /usr/share/dejagnu/runtest.exp --tool g++ lto.exp tls.exp gcov.exp 
 debug.exp dwarf2.exp cilk-plus.exp pch.exp bprob.exp simulate-thread.exp 
 vect.exp charset.exp tsan.exp graphite.exp compat.exp struct-layout-1.exp 
 ubsan.exp tm.exp gomp.exp dfp.exp tree-prof.exp stackalign.exp plugin.exp 
 guality.exp asan.exp ecos.exp

 so can those be run more independently ?

It is a moving target, new tests are added every day.  I'm trying to adjust
it during stage3/stage4 occasionally, but it also very much depends on
which target it is (e.g. i?86/x86_64 has many more tests in i386.exp than
other targets in their gcc.target), how fast the compiler is on the target
(e.g. on some targets -g is much slower than on others, etc.).

could you point me to the right file (or example commit) for trying to adjust 
this ? I can try to do some testing and come back with some numbers.

Re: gcc parallel make check

2014-09-03 Thread Jakub Jelinek

On Wed, Sep 03, 2014 at 10:35:41AM +, VandeVondele  Joost wrote:
  expect -- /usr/share/dejagnu/runtest.exp --tool gcc lto.exp weak.exp 
  tls.exp ipa.exp tree-ssa.exp debug.exp dwarf2.exp fixed-point.exp 
  vxworks.exp cilk-plus.exp vmx.exp pch.exp simulate-thread.exp 
  x86_64-costmodel-vect.exp i386-costmodel-vect.exp spu-costmodel-vect.exp 
  ppc-costmodel-vect.exp charset.exp noncompile.exp tsan.exp graphite.exp 
  compat.exp
  expect -- /usr/share/dejagnu/runtest.exp --tool g++ lto.exp tls.exp 
  gcov.exp debug.exp dwarf2.exp cilk-plus.exp pch.exp bprob.exp 
  simulate-thread.exp vect.exp charset.exp tsan.exp graphite.exp compat.exp 
  struct-layout-1.exp ubsan.exp tm.exp gomp.exp dfp.exp tree-prof.exp 
  stackalign.exp plugin.exp guality.exp asan.exp ecos.exp
 
  so can those be run more independently ?
 
 It is a moving target, new tests are added every day.  I'm trying to adjust
 it during stage3/stage4 occasionally, but it also very much depends on
 which target it is (e.g. i?86/x86_64 has many more tests in i386.exp than
 other targets in their gcc.target), how fast the compiler is on the target
 (e.g. on some targets -g is much slower than on others, etc.).
 
 could you point me to the right file (or example commit) for trying to adjust 
 this ? I can try to do some testing and come back with some numbers.

The splits are in the Makefiles, see check_gcc_parallelize
var in gcc/Makefile.in (where there is a big comment with documentation),
check_g++_parallelize var in gcc/cp/Make-lang.in, check_gfortran_parallelize
var in gcc/fortran/Make-lang.in, or check-DEJAGNU goal in
libstdc++-v3/testsuite/Makefile.am.

Jakub

optimization for simd intrinsics vld2_dup_* on aarch64-none-elf

2014-09-03 Thread shanyao chen

Hi,
I found there is a performance problem with some simd intrinsics
(vld2_dup_*) on aarch64-none-elf. Now the vld2_dup_* are defined as
follows:

#define __LD2R_FUNC(rettype, structtype, ptrtype, \
                    regsuffix, funcsuffix, Q) \
  __extension__ static __inline rettype \
  __attribute__ ((__always_inline__)) \
  vld2 ## Q ## _dup_ ## funcsuffix (const ptrtype *ptr) \
  { \
    rettype result; \
    __asm__ ("ld2r {v16." #regsuffix ", v17." #regsuffix "}, %1\n\t" \
             "st1 {v16." #regsuffix ", v17." #regsuffix "}, %0\n\t" \
             : "=Q"(result) \
             : "Q"(*(const structtype *)ptr) \
             : "memory", "v16", "v17"); \
    return result; \
  }

It loads from memory into registers, and then stores the registers back
to memory as the result. Such code performs very poorly because of the
redundant memory access and the restricted register allocation.

Some intrinsics like vld2_* used to be similar to vld2_dup_*, but now they
are implemented with builtin functions.

__extension__ static __inline int16x4x2_t __attribute__ ((__always_inline__))
vld2_s16 (const int16_t * __a)
{
  int16x4x2_t ret;
  __builtin_aarch64_simd_oi __o;
  __o = __builtin_aarch64_ld2v4hi ((const __builtin_aarch64_simd_hi *) __a);
  ret.val[0] = (int16x4_t) __builtin_aarch64_get_dregoiv4hi (__o, 0);
  ret.val[1] = (int16x4_t) __builtin_aarch64_get_dregoiv4hi (__o, 1);
  return ret;
}

Could vld2_dup_* also be written as builtins?  If not, I think the
inline assembler can be optimized as follows:

#define __LD2R_FUNC(rettype, structtype, ptrtype, \
                    regsuffix, funcsuffix, Q) \
  __extension__ static __inline rettype \
  __attribute__ ((__always_inline__)) \
  vld2 ## Q ## _dup_ ## funcsuffix (const ptrtype *ptr) \
  { \
    rettype result; \
    __asm__ ("ld2r {%0.4h, %1.4h}, %2" \
             : "=V16"(result.val[0]), "=V17"(result.val[1]) \
             : "Q"(*(const structtype *)ptr) \
             : "memory", "v16", "v17"); \
    return result; \
  }

This requires adding a reg_class_name v16v17 and constraints V16 and V17
for them. For this, aarch64.h, aarch64.c and constraints.md would have to be
modified.
-- 
Shanyao Chen

Re: Bounded array type?

2014-09-03 Thread Joseph S. Myers

On Wed, 3 Sep 2014, Florian Weimer wrote:

 On 09/02/2014 11:22 PM, James Nelson wrote:
 
  This is error-prone because even though a size parameter is given, the code
  in the function has no requirement to enforce it. With a bounded array
  type, the prototype looks like this:
  
  buf *foo(char buf[sz], size_t sz);
 
 GCC already has a syntax extension to support this:
 https://gcc.gnu.org/onlinedocs/gcc/Variable-Length.html

But the size declared in a parameter declaration has no semantic 
significance; there is no requirement that the pointer passed does point 
to an array of that size.  If you declare the size as [static sz] then 
that means it points to an array of at least that size, but it could be 
larger.

Thus, any option for any sort of bounds checks based on parameter array 
sizes (constant or non-constant) would be an option that explicitly 
produces errors for valid C code.  (You could always have a function 
attribute to enable checking based on parameter array sizes - such an 
attribute would declare that the function should never access the 
parameter array outside the bounds given by the size, even if the array 
passed by the caller is larger, and maybe also that the caller must not 
pass an array smaller than the size given.)
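
A short C sketch of the two parameter forms discussed above. The function names are hypothetical, and the size parameter is placed before the array so that the code is plain C99; the prototype quoted in the thread puts it after, which needs GCC's parameter forward-declaration extension. In both functions the parameter is an ordinary pointer, and passing a larger array than the declared size is valid.

```c
#include <assert.h>
#include <stddef.h>

/* buf[sz]: the size is documentation only; buf is a plain pointer. */
unsigned sum_plain(size_t sz, const unsigned char buf[sz])
{
    unsigned total = 0;
    for (size_t i = 0; i < sz; i++)
        total += buf[i];
    return total;
}

/* buf[static sz]: the caller promises at least sz elements; more is fine. */
unsigned sum_first(size_t sz, const unsigned char buf[static sz])
{
    unsigned total = 0;
    for (size_t i = 0; i < sz; i++)
        total += buf[i];
    return total;
}
```

Neither form makes the compiler insert bounds checks; the only difference is the promise that [static sz] places on the caller.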

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: Some questions about pass web

2014-09-03 Thread Jeff Law


On 09/03/14 02:35, Steven Bosscher wrote:

On Wed, Sep 3, 2014 at 9:17 AM, Bin.Cheng wrote:

Last time I tried, there are several passes after loop_done and before
auto-inc-dec can't handle auto-increment addressing mode, including
fweb.


It surprises me that pass_web can't handle AUTOINC. Perhaps I'm off my
rocker, but it's always been my understanding that almost all passes
handle AUTOINC just fine (or at least conservatively: punt if you see
an AUTOINC), and that only CSE really doesn't know about AUTOINC at
all.
In the past autoinc instructions didn't appear until flow (just prior to 
combine) and that was documented behaviour.  So anything which was run 
strictly prior to flow/combine wasn't autoinc aware.  That may have 
changed somewhat with the autoinc rewrite.


jeff

Re: optimization for simd intrinsics vld2_dup_* on aarch64-none-elf

2014-09-03 Thread Kyrill Tkachov


Hi Shanyao,

On 03/09/14 16:02, shanyao chen wrote:

Hi,
I found there is a performance problem with some simd intrinsics
(vld2_dup_*) on aarch64-none-elf. Now the vld2_dup_* are defined as
follows:

#define __LD2R_FUNC(rettype, structtype, ptrtype, \
                    regsuffix, funcsuffix, Q) \
  __extension__ static __inline rettype \
  __attribute__ ((__always_inline__)) \
  vld2 ## Q ## _dup_ ## funcsuffix (const ptrtype *ptr) \
  { \
    rettype result; \
    __asm__ ("ld2r {v16." #regsuffix ", v17." #regsuffix "}, %1\n\t" \
             "st1 {v16." #regsuffix ", v17." #regsuffix "}, %0\n\t" \
             : "=Q"(result) \
             : "Q"(*(const structtype *)ptr) \
             : "memory", "v16", "v17"); \
    return result; \
  }

It loads from memory into registers, and then stores the registers back
to memory as the result. Such code performs very poorly because of the
redundant memory access and the restricted register allocation.

Some intrinsics like vld2_* used to be similar to vld2_dup_*, but now they
are implemented with builtin functions.

__extension__ static __inline int16x4x2_t __attribute__ ((__always_inline__))
vld2_s16 (const int16_t * __a)
{
   int16x4x2_t ret;
   __builtin_aarch64_simd_oi __o;
   __o = __builtin_aarch64_ld2v4hi ((const __builtin_aarch64_simd_hi *) __a);
   ret.val[0] = (int16x4_t) __builtin_aarch64_get_dregoiv4hi (__o, 0);
   ret.val[1] = (int16x4_t) __builtin_aarch64_get_dregoiv4hi (__o, 1);
   return ret;
}

Could vld2_dup_* also be written as builtin ?  If not, i think the
inline assembler can be optimized as follows :


The arm port implements these with builtins, so it should be possible to 
implement them that way on aarch64 as well.


Could you log an issue in bugzilla please, including some source code 
demonstrating the poor codegen if possible?

Thanks,
Kyrill


#define __LD2R_FUNC(rettype, structtype, ptrtype, \
                    regsuffix, funcsuffix, Q) \
  __extension__ static __inline rettype \
  __attribute__ ((__always_inline__)) \
  vld2 ## Q ## _dup_ ## funcsuffix (const ptrtype *ptr) \
  { \
    rettype result; \
    __asm__ ("ld2r {%0.4h, %1.4h}, %2" \
             : "=V16"(result.val[0]), "=V17"(result.val[1]) \
             : "Q"(*(const structtype *)ptr) \
             : "memory", "v16", "v17"); \
    return result; \
  }

This requires adding a reg_class_name v16v17 and constraints V16 and V17
for them. For this, aarch64.h, aarch64.c and constraints.md would have to be
modified.

stack_pointer_delta related ICE in libgcc on 4.9.1

2014-09-03 Thread Bernhard Reutner-Fischer

Trying to bootstrap m68k I hit an assert in emit_library_call_value_1
that wants to ensure that the stack is aligned properly.

PUSH_ROUNDING(GET_MODE_SIZE(QImode)) for m5206 is currently 1 so
the testcase below has stack_pointer_delta = 1 + 1 + 4
but emit_library_call_value_1() has this:

  /* Stack must be properly aligned now.  */
  gcc_assert (!(stack_pointer_delta
                & (PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT - 1)));

where 6 & 3 != 0 and ICEs
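
The arithmetic behind the failing assert can be reproduced in isolation. This is a sketch under stated assumptions: the two macro values below are inferred from the mask of 3 quoted above (a 32-bit preferred boundary with 8-bit units) and are not copied from the m68k headers, and stack_delta_is_aligned is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values, chosen so that the mask works out to 3 as in the report. */
#define PREFERRED_STACK_BOUNDARY 32
#define BITS_PER_UNIT 8

/* Same test as the gcc_assert condition: true when the delta is a
   multiple of the preferred boundary expressed in bytes. */
bool stack_delta_is_aligned(int stack_pointer_delta)
{
    return !(stack_pointer_delta
             & (PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT - 1));
}
```

With two one-byte pushes and one four-byte push the delta is 6, which fails the check; a delta of 8 would pass.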

I am not familiar with m68k so I would be glad for any help!

Should the arg be partial? Doesn't look like, no.
Does m68k use stack save area? From the looks it doesn't.
Is the alignment_pad for QImode arg wrong?
Should PUSH_ROUNDING be changed back to the !CF variant?
Or is the alignment assert too strict?
Perhaps m5206 is not TARGET_CAS and should not compile this linux-atomic
in the first place?

(is emit_library_call_value_1 missing a do_pending_stack_adjust() before
NO_DEFER_POP ? Does not seem relevant for this case though)

Slightly simplified reproducer:

$ cat x.i ; echo EOF; # see libgcc/config/m68k/linux-atomic.c
unsigned char
  __attribute__ ((visibility ("hidden")))
__sync_val_compare_and_swap_1 (unsigned char *ptr, unsigned char soldval,
                               unsigned char snewval)
{
  unsigned *wordptr = (unsigned *) ((unsigned long) ptr & ~3);
  unsigned int mask, shift, woldval, wnewval;
  unsigned oldval, newval, cmpval;
  shift = (((unsigned long) ptr & 3) << 3) ^ 24;
  mask = 0xffu << shift;
  woldval = (soldval & 0xffu) << shift;
  wnewval = (snewval & 0xffu) << shift;
  cmpval = *wordptr;
  do
    {
      oldval = cmpval;
      if ((oldval & mask) != woldval)
        break;
      newval = (oldval & ~mask) | wnewval;
      {
        register unsigned *a0 asm ("a0") = wordptr;
        register unsigned d2 asm ("d2") = oldval;
        register unsigned d1 asm ("d1") = newval;
        register unsigned d0 asm ("d0") = 335;
        asm volatile ("trap #0":"=r" (d0), "=r" (d1), "=r" (a0):"r" (d0),
                      "r" (d1), "r" (d2), "r" (a0):"memory", "a1");
        cmpval = d0;
      }
    }
  while (__builtin_expect (oldval != cmpval, 0));
  return (oldval >> shift) & 0xffu;
}

_Bool
  __attribute__ ((visibility ("hidden")))
__sync_bool_compare_and_swap_1 (unsigned char *ptr, unsigned char oldval,
                                unsigned char newval)
{
  return (__sync_val_compare_and_swap_1 (ptr, oldval, newval) == oldval);
}
EOF
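
The shift and mask computation in the reproducer can be exercised on its own, without the trap. This is a minimal sketch assuming a 32-bit unsigned and the big-endian byte numbering implied by the ^ 24; the helper names subword_shift, subword_mask and subword_insert are hypothetical and do not appear in libgcc.

```c
#include <assert.h>

/* Bit position of the byte at addr inside its aligned 32-bit word
   (big-endian numbering: byte 0 lives in bits 31..24). */
unsigned subword_shift(unsigned long addr)
{
    return (unsigned) (((addr & 3) << 3) ^ 24);
}

/* Mask selecting that byte within the word. */
unsigned subword_mask(unsigned long addr)
{
    return 0xffu << subword_shift(addr);
}

/* Value of word after replacing the byte at addr with newval; this mirrors
   the newval computation in the loop, without touching memory. */
unsigned subword_insert(unsigned word, unsigned long addr, unsigned char newval)
{
    unsigned shift = subword_shift(addr);
    unsigned mask = 0xffu << shift;
    return (word & ~mask) | ((newval & 0xffu) << shift);
}
```

The compare-and-swap loop then only has to retry the word-sized exchange until the surrounding three bytes were not changed by someone else.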


/home/me/src/oe/openembedded-core/build/tmp-glibc/sysroots/x86_64-linux/usr/bin/m68k-oe-linux.gcc-cross-initial-m68k/m68k-oe-linux-gcc
  -mcpu=5206 
--sysroot=/home/me/src/oe/openembedded-core/build/tmp-glibc/sysroots/qemum68k 
-O2 -pipe -g -feliminate-unused-debug-types -O2  -g -Os -DIN_GCC  
-DCROSS_DIRECTORY_STRUCTURE  -W -Wall -Wno-narrowing -Wwrite-strings 
-Wcast-qual -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition  
-isystem ./include   -fPIC -g -DIN_LIBGCC2 -fbuilding-libgcc 
-fno-stack-protector -Dinhibit_libc  -fPIC -I. -I. 
-I/home/me/src/oe/openembedded-core/build/tmp-glibc/work/m5206-oe-linux/libgcc-initial/4.9.1-r0/gcc-4.9.1/build.m68k-oe-linux.m68k-oe-linux/m68k-oe-linux/libgcc/../.././gcc
 
-I/home/me/src/oe/openembedded-core/build/tmp-glibc/work-shared/gcc-4.9.1-r0/gcc-4.9.1/libgcc
 
-I/home/me/src/oe/openembedded-core/build/tmp-glibc/work-shared/gcc-4.9.1-r0/gcc-4.9.1/libgcc/.
 
-I/home/me/src/oe/openembedded-core/build/tmp-glibc/work-shared/gcc-4.9.1-r0/gcc-4.9.1/libgcc/../gcc
 
-I/home/me/src/oe/openembedded-core/build/tmp-glibc/work-shared/gcc-4.9.1-r0/gcc-4.9.1/libgcc/../include
  -DHAVE_CC_TLS  -o o.o -MT linux-atomic.i -MD -MP -MF linux-atomic.dep  -c x.i 
-fvisibility=hidden -DHIDE_EXPORTS -v
Using built-in specs.
COLLECT_GCC=/home/me/src/oe/openembedded-core/build/tmp-glibc/sysroots/x86_64-linux/usr/bin/m68k-oe-linux.gcc-cross-initial-m68k/m68k-oe-linux-gcc
Target: m68k-oe-linux
Configured with: 
/home/me/src/oe/openembedded-core/build/tmp-glibc/work-shared/gcc-4.9.1-r0/gcc-4.9.1/configure
 --build=x86_64-linux --host=x86_64-linux --target=m68k-oe-linux 
--prefix=/home/me/src/oe/openembedded-core/build/tmp-glibc/sysroots/x86_64-linux/usr
 
--exec_prefix=/home/me/src/oe/openembedded-core/build/tmp-glibc/sysroots/x86_64-linux/usr
 
--bindir=/home/me/src/oe/openembedded-core/build/tmp-glibc/sysroots/x86_64-linux/usr/bin/m68k-oe-linux.gcc-cross-initial-m68k
 
--sbindir=/home/me/src/oe/openembedded-core/build/tmp-glibc/sysroots/x86_64-linux/usr/bin/m68k-oe-linux.gcc-cross-initial-m68k
 
--libexecdir=/home/me/src/oe/openembedded-core/build/tmp-glibc/sysroots/x86_64-linux/usr/libexec/m68k-oe-linux.gcc-cross-initial-m68k
 
--datadir=/home/me/src/oe/openembedded-core/build/tmp-glibc/sysroots/x86_64-linux/usr/share
 
--sysconfdir=/home/me/src/oe/openembedded-core/build/tmp-glibc/sysroots/x86_64-linux/etc
 
--sharedstatedir=/home/me/src/oe/openembedded-core/build/tmp-glibc/sysroots/x86_64-linux/com
 
--localstatedir=/home/me/src/oe/openembedded-core/build/tmp-glibc/sysroots/x86_64-linux/var

Re: Bounded array type?

2014-09-03 Thread Florian Weimer


On 09/03/2014 05:20 PM, Joseph S. Myers wrote:

On Wed, 3 Sep 2014, Florian Weimer wrote:


On 09/02/2014 11:22 PM, James Nelson wrote:


This is error-prone because even though a size parameter is given, the code
in the function has no requirement to enforce it. With a bounded array
type, the prototype looks like this:

buf *foo(char buf[sz], size_t sz);


GCC already has a syntax extension to support this:
https://gcc.gnu.org/onlinedocs/gcc/Variable-Length.html


But the size declared in a parameter declaration has no semantic
significance; there is no requirement that the pointer passed does point
to an array of that size.


I believe this was different with the bounded pointer extension.  But I 
might misremember how things worked.  I've never used it (I think), I 
only recall reading some documentation which has now vanished.



If you declare the size as [static sz] then
that means it points to an array of at least that size, but it could be
larger.


GCC does not seem to enforce that.  This compiles without errors:

int foo(char [static 5]);

int
bar(char *p)
{
  return foo(p);
}

This could be

--
Florian Weimer / Red Hat Product Security

Re: Bounded array type?

2014-09-03 Thread Joseph S. Myers

On Wed, 3 Sep 2014, Florian Weimer wrote:

  If you declare the size as [static sz] then
  that means it points to an array of at least that size, but it could be
  larger.
 
 GCC does not seem to enforce that.  This compiles without errors:

[static] is about optimization (but GCC doesn't optimize using it either).  
It's only undefined behavior if a call with a too-small array is actually 
executed.

 int foo(char [static 5]);
 
 int
 bar(char *p)
 {
   return foo(p);
 }

That's perfectly valid, as long as every call to bar is with an argument 
that does in fact point to at least 5 chars (if a call doesn't, there's 
undefined behavior when that call is executed).

-- 
Joseph S. Myers
jos...@codesourcery.com
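
A small sketch of the point made above, assuming a C99 or later compiler: `[static 5]` is a promise from the caller that at least 5 elements are accessible, not something the compiler enforces at the call site. The function names are illustrative only.

```c
#include <stddef.h>

/* [static 5] promises the callee at least 5 accessible chars; it does not
   make the compiler reject callers that cannot prove it.  */
static int
sum_first_five (const char buf[static 5])
{
  int sum = 0;
  for (size_t i = 0; i < 5; i++)
    sum += buf[i];
  return sum;
}

/* Valid: p points into an 8-element array, so the promise holds even
   though the type of p carries no size information.  */
static int
call_with_enough_room (void)
{
  char storage[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
  char *p = storage;
  return sum_first_five (p);
}
```

Passing a pointer to fewer than 5 chars would still compile; the behaviour is undefined only when such a call is actually executed.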

Re: Some questions about pass web

2014-09-03 Thread Carrot Wei

On Wed, Sep 3, 2014 at 1:29 AM, Steven Bosscher stevenb@gmail.com wrote:
 On Wed, Sep 3, 2014 at 1:35 AM, Carrot Wei wrote:
 1. It is well known that register renaming is a big help to register
 allocation, but in gcc's backend, the web pass is far before RA, there
 are about 20 passes between them. Does it mean register renaming can
 also heavily benefit other optimizations?

 Yes - sometimes anyway. Most non-SSA data flow analyses can't look
 through pseudos that have multiple non-overlapping live ranges. Think
 constant/copy propagation, value numbering, etc. After passes that
 duplicate basic blocks, and not renaming registers, you get false
 dependencies that hide RA and scheduling opportunities. This is why
 pass_web is after code-duplication transformations like (RTL) loop
 unrolling but before the last RTL CPROP pass. The old RA couldn't do
 register renaming, so something before RA had to take care of it.
 Enter pass_web.

 But this is less relevant in GCC today, where RTL code transformations
 basically should be limited to simple local transformations, with the
 more difficult global transformations already done on GIMPLE
 (including live range splitting, part of out-of-SSA). On top of that,
 IRA knows how to do some forms of live range splitting (but not within
 loops, AFAIU, because a loop is a single region in IRA). If someone
 would give some TLC to the RTL loop-unroll.c code for
 IV-splitting/accumulator-expanding and make them enabled by default, I
 doubt pass_web would be doing much at all.


 And the passes between them
 usually don't generate more register renaming chances?

 Usually not. Most of them create new pseudos for newly inserted expressions.

 Some passes are actually harmed by pass_web. auto_inc_dec is one of those.


 2. It looks current web pass can't handle AUTOINC expressions, a reg
 operand is used as both use and def reference in an AUTOINC
 expression, so this def side should not be renamed. Pass web doesn't
 explicitly check this case, may rename the reg operand of AUTOINC
 expression. Is this expected because it is before auto_inc_dec pass?

 You already found the DF_REF_READ_WRITE bits. pass_web also handles
 match_dup constraints. That should be enough. If it is not, then
 please file a bug report (and feel free to assign it to me).

Bug entry filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63156.
Debugging shows that DF_REF_READ_WRITE is not set for the operand of
post_inc, may be a dataflow problem?

thanks
Guozhi Wei
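
As a source-level analogy only (pass_web works on RTL pseudos, not on C variables), the renaming discussed above can be pictured like this. Both functions are hypothetical and compute the same result; the second gives each independent live range its own name, which is roughly what splitting a pseudo into webs achieves.

```c
/* One variable reused for two unrelated values: two independent live
   ranges share a single name.  */
static int
reused_name (int a, int b)
{
  int t;
  int first;
  t = a * 2;        /* first live range of t */
  first = t + 1;
  t = b * 3;        /* second, unrelated live range of t */
  return first + t;
}

/* After renaming, each live range ("web") has its own name, so an
   analysis that tracks one value per name sees no false dependency.  */
static int
renamed (int a, int b)
{
  int t0 = a * 2;
  int first = t0 + 1;
  int t1 = b * 3;
  return first + t1;
}
```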

Re: [RFC] Don't inline builtin memory functions when ASan is enabled.

2014-09-03 Thread Konstantin Serebryany

On Tue, Sep 2, 2014 at 7:32 AM, Maxim Ostapenko
m.ostape...@partner.samsung.com wrote:
 Hi,

 At this moment, most of GCC builtin memory functions (for example strcpy,
 stpcpy, wcpcpy, strdup, etc) are not instrumented by GCC, however some of
 them are rather dangerous. If GCC inlines these builtin functions, we will
 miss important checks for arguments, and possible overflow won't be
 detected. I know, that Clang ASan team simply disable inlining of builtin
 functions in Clang if -fsanitize=address is enabled and rely on
 libsanitizer's hooks.

Correct, that's what we do.


 The main benefit of this approach is that we won't miss overflow in
 builtins, that can significantly increase target programs safety. Also, some
 redundant checks will be removed for builtin functions, that are
 instrumented and are not inlined for some reasons.

 The potential disadvantage of this approach is performance decreasing for
 sanitized programs.

 Does disabling of builtin functions inlining look sane in this case? If yes,
 I can provide performance investigation and prepare the patch.

 What do you think?

 -Maxim
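
To illustrate what is lost when such a call is expanded inline, here is a hypothetical stand-in for the kind of check an out-of-line hook can perform. It is only a sketch: the real libsanitizer interceptors consult ASan's shadow memory rather than taking an explicit size argument, and `checked_strcpy` is an invented name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an interceptor: copy only if the source,
   including its terminating NUL, fits into dst_size bytes.
   Returns 0 on success, -1 if the copy would overflow dst.  */
static int
checked_strcpy (char *dst, size_t dst_size, const char *src)
{
  size_t needed = strlen (src) + 1;
  if (needed > dst_size)
    return -1;
  memcpy (dst, src, needed);
  return 0;
}
```

A check like this can only run if the call reaches the library; once the compiler has expanded the builtin inline, there is no call left to intercept.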

Re: stack_pointer_delta related ICE in libgcc on 4.9.1

2014-09-03 Thread Jeff Law


On 09/03/14 09:56, Bernhard Reutner-Fischer wrote:

Trying to bootstrap m68k i hit an assert in emit_library_call_value_1
that wants to assure that the stack is aligned properly.

PUSH_ROUNDING(GET_MODE_SIZE(QImode)) for m5206 is currently 1 so
the testcase below has stack_pointer_delta = 1 + 1 + 4
but emit_library_call_value_1() has this:

   /* Stack must be properly aligned now.  */
   gcc_assert (!(stack_pointer_delta
                 & (PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT - 1)));

where 6 & 3 != 0 and ICEs

I am not familiar with m68k so i would be glad for any help!

Should the arg be partial? Doesn't look like, no.

No.  m68k doesn't pass args in registers.


Does m68k use stack save area? From the looks it doesn't.
No.  m68k does not pass args in registers, I believe that's a 
requirement for needing a stack save area.



Is the alignment_pad for QImode arg wrong?
Possibly.  Clearly if the stack is to be aligned to larger than a byte 
and PUSH_ROUNDING has no adjustment for QImode, then padding is needed 
somewhere.  And both the caller and callee need to agree on the padding.



Should PUSH_ROUNDING be changed back to the !CF variant?
Possibly.  It's unfortunate the CF chips do something different than 
other m68k variants here.  The change in behaviour would seem to imply 
it's impossible to mix traditional m68k code with CF code, though 
perhaps nobody cares about that.




Or is the alignment assert too strict?

I don't think so.


Perhaps m5206 is not TARGET_CAS and should not compile this linux-atomic
in the first place?

No, I don't think so.

You might ping Jim Wilson or Richard Sandiford who have both done 
coldfire work in the past.  I really don't have any experience with the 
coldfire bits.




(is emit_library_call_value_1 missing a do_pending_stack_adjust() before
NO_DEFER_POP ? Does not seem relevant for this case though)
Unsure.  I haven't done significant work on the m68k in decades, so the 
rules around defer_pop have long since been dropped from my memory.  If 
you can describe why you think it might be missing it'd be helpful for 
evaluation.


My recommendation would be to file a bug report with the reproducer. 
m68k isn't nearly as important today as it has been in the past, so 
getting developer time to hash through how all this should work for the 
coldfire may be difficult.


jeff
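
A minimal sketch of the arithmetic behind the assert quoted above. The boundary value is an assumption inferred from the "6 & 3" in the report (a 32-bit preferred stack boundary with 8-bit units), and the even-rounding helper is an assumed model of the non-ColdFire PUSH_ROUNDING variant mentioned in the thread, not a copy of m68k.h. All names are illustrative.

```c
/* Assumed values for this target, inferred from the "6 & 3" in the report:
   a 32-bit preferred stack boundary and 8-bit units.  */
#define SKETCH_PREFERRED_STACK_BOUNDARY 32
#define SKETCH_BITS_PER_UNIT 8

/* Same test as the gcc_assert condition: nonzero when delta is a multiple
   of the boundary expressed in bytes.  */
static int
stack_delta_aligned (int delta)
{
  return !(delta
           & (SKETCH_PREFERRED_STACK_BOUNDARY / SKETCH_BITS_PER_UNIT - 1));
}

/* ColdFire behaviour described in the report: a push is not rounded.  */
static int
push_rounding_cf (int bytes)
{
  return bytes;
}

/* Assumed non-ColdFire behaviour: round each push up to an even size.  */
static int
push_rounding_even (int bytes)
{
  return (bytes + 1) & ~1;
}
```

With two QImode arguments and one pointer, the unrounded pushes add up to 1 + 1 + 4 = 6 bytes, which fails the check; with even rounding they add up to 2 + 2 + 4 = 8 bytes, which passes.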

Re: stack_pointer_delta related ICE in libgcc on 4.9.1

2014-09-03 Thread Joel Sherrill


On 9/3/2014 1:24 PM, Jeff Law wrote:
 On 09/03/14 09:56, Bernhard Reutner-Fischer wrote:
 Trying to bootstrap m68k i hit an assert in emit_library_call_value_1
 that wants to assure that the stack is aligned properly.

 PUSH_ROUNDING(GET_MODE_SIZE(QImode)) for m5206 is currently 1 so
 the testcase below has stack_pointer_delta = 1 + 1 + 4
 but emit_library_call_value_1() has this:

/* Stack must be properly aligned now.  */
gcc_assert (!(stack_pointer_delta
              & (PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT - 1)));

 where 6 & 3 != 0 and ICEs

 I am not familiar with m68k so i would be glad for any help!

 Should the arg be partial? Doesn't look like, no.
 No.  m68k doesn't pass args in registers.

 Does m68k use stack save area? From the looks it doesn't.
 No.  m68k does not pass args in registers, I believe that's a 
 requirement for needing a stack save area.

 Is the alignment_pad for QImode arg wrong?
 Possibly.  Clearly if the stack is to be aligned to larger than a byte 
 and PUSH_ROUNDING has no adjustment for QImode, then padding is needed 
 somewhere.  And both the caller and callee need to agree on the padding.
FWIW For stack alignment RTEMS does not distinguish between any m68k or
Coldfire variant. The note says that what comes from the allocator is
sufficiently aligned.  And that's on a 4-byte boundary.

My recollection is that was selected in the m68020 days to avoid the penalty
of unaligned accesses -- not to avoid faults. I don't recall if Coldfires
fault or handle the unaligned accesses but either way, there is a penalty
incurred and you want to avoid it.
 Should PUSH_ROUNDING be changed back to the !CF variant?
 Possibly.  It's unfortunate the CF chips do something different than 
 other m68k variants here.  
If that gives you 4-byte stack alignment, then yes. I think the same stack
alignment rules should apply.
 The change in behaviour would seem to imply 
 it's impossible to mix traditional m68k code with CF code, though 
 perhaps nobody cares about that.
I would bet that myself also. I know we don't care. But we provide source
and our users compile it themselves with the best options. :)

 Or is the alignment assert too strict?
 I don't think so.

 Perhaps m5206 is not TARGET_CAS and should not compile this linux-atomic
 in the first place?
 No, I don't think so.
Coldfire does not have the CAS instruction per
http://www.freescale.com/files/dsp/doc/ref_manual/CFPRM.pdf
 You might ping Jim Wilson or Richard Sandiford who have both done 
 coldfire work in the past.  I really don't have any experience with the 
 coldfire bits.

 (is emit_library_call_value_1 missing a do_pending_stack_adjust() before
 NO_DEFER_POP ? Does not seem relevant for this case though)
 Unsure.  I haven't done significant work on the m68k in decades, so the 
 rules around defer_pop have long since been dropped from my memory.  If 
 you can describe why you think it might be missing it'd be helpful for 
 evaluation.

 My recommendation would be to file a bug report with the reproducer. 
 m68k isn't nearly as important today as it has been in the past, so 
 getting developer time to hash through how all this should work for the 
 coldfire may be difficult.

 jeff


-- 
Joel Sherrill, Ph.D.             Director of Research & Development
joel.sherr...@oarcorp.com        On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
Support Available                (256) 722-9985

Re: selective linking of floating point support for printf / scanf

2014-09-03 Thread Joern Rennecke

On 2 September 2014 16:28, Joseph S. Myers jos...@codesourcery.com wrote:
 On Tue, 2 Sep 2014, Joey Ye wrote:

 Apparently newlib is not following this specification very well, as
 there are symbols like _abc_r defined every where in current newlib. I
 am not implying the spec should not be followed, but is newlib
 designed to have a loose spec for the single underscore?

 Identifiers beginning with a single underscore are reserved with file
 scope.  This means an application cannot provide an external definition of
 them, because such an external definition would have file scope.  So it's
 fine for the implementation to define such identifiers and use them in the
 implementation of standard functions.

Hmm, this throws up another question: how, in GNU C, extensions interact with
the putatively unchanged parts of the standard.
If a user program defines an assembler name for a global function which is
different from the name used in the source code, is that assembler name
used at file scope?  It would seem to me it's only used at global/link scope.
As such, is the use of _[a-z].* as assembly names then part of the
user namespace?

Re: selective linking of floating point support for printf / scanf

2014-09-03 Thread Joseph S. Myers

On Wed, 3 Sep 2014, Joern Rennecke wrote:

 On 2 September 2014 16:28, Joseph S. Myers jos...@codesourcery.com wrote:
  On Tue, 2 Sep 2014, Joey Ye wrote:
 
  Apparently newlib is not following this specification very well, as
  there are symbols like _abc_r defined every where in current newlib. I
  am not implying the spec should not be followed, but is newlib
  designed to have a loose spec for the single underscore?
 
  Identifiers beginning with a single underscore are reserved with file
  scope.  This means an application cannot provide an external definition of
  them, because such an external definition would have file scope.  So it's
  fine for the implementation to define such identifiers and use them in the
  implementation of standard functions.
 
 Hmm, this throws up another question: how, in GNU C, extensions interact with
 the putatively unchanged parts of the standard.
 If a user program defines an assembler name for a global function which is
 different from the name used in the source code, is that assembler name
 used at file scope?  It would seem to me it's only used at global/link scope.
 As such, is the use of _[a-z].* as assembly names then part of the
 user namespace?

I see no reason a standard header shouldn't be able to define _[a-z] 
static functions at file scope, so I think it should be presumed that such 
names as assembly names are part of the implementation namespace.  That's 
certainly the case for names such as _a.1 that GCC can generate for 
block-scope static variables called _a: if you generate such assembler 
names yourself, you risk conflicting with ones generated by GCC for 
block-scope statics.

-- 
Joseph S. Myers
jos...@codesourcery.com
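
For context, a sketch of the GNU C extension being discussed: an asm label gives a function an assembler name that differs from its source-level name. The names `answer` and `app_answer_impl` are hypothetical. The assembler name here deliberately avoids a leading underscore, in line with the view above that `_[a-z]` assembler names should be presumed to belong to the implementation.

```c
/* The source-level name is answer; the symbol the assembler and linker
   see is app_answer_impl (hypothetical name, deliberately without a
   leading underscore).  */
int answer (void) __asm__ ("app_answer_impl");

int
answer (void)
{
  return 42;
}
```

C code keeps calling `answer ()`; only the object file and the linker see `app_answer_impl`, which is why the question of namespaces at link level arises at all.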

Re: Enable EBX for x86 in 32bits PIC code

2014-09-03 Thread Vladimir Makarov


On 2014-08-29 2:47 AM, Ilya Enkovich wrote:

Seems your patch doesn't cover all cases.  Attached is a modified
patch (with your changes included) and a test where double constant is
wrongly rematerialized.  I also see in ira dump that there is still a
copy of PIC reg created:

Initialization of original PIC reg:
(insn 23 22 24 2 (set (reg:SI 127)
 (reg:SI 3 bx)) test.cc:42 90 {*movsi_internal}
  (expr_list:REG_DEAD (reg:SI 3 bx)
 (nil)))
...
Copy is created:
(insn 135 37 25 3 (set (reg:SI 138 [127])
 (reg:SI 127)) 90 {*movsi_internal}
  (expr_list:REG_DEAD (reg:SI 127)
 (nil)))
...
Copy is used:
(insn 119 25 122 3 (set (reg:DF 134)
 (mem/u/c:DF (plus:SI (reg:SI 138 [127])
 (const:SI (unspec:SI [
 (symbol_ref/u:SI ("*.LC0") [flags 0x2])
 ] UNSPEC_GOTOFF))) [5  S8 A64])) 128 {*movdf_internal}
  (expr_list:REG_EQUIV (const_double:DF
2.9997371893933895137251965934410691261292e-4
[0x0.9d495182a99308p-11])
 (nil)))



The copy is created by a newer IRA optimization for function prologues.

The patch in the attachment should solve the problem.  I also added the 
code to prevent spilling the pic pseudo in LRA which could happen before 
theoretically.




After reload we have new usage of r127 which is allocated to ecx which
actually does not have any definition in this function at all.

(insn 151 42 44 4 (set (reg:SI 0 ax [147])
 (plus:SI (reg:SI 2 cx [127])
 (const:SI (unspec:SI [
 (symbol_ref/u:SI ("*.LC0") [flags 0x2])
 ] UNSPEC_GOTOFF)))) test.cc:44 213 {*leasi}
  (expr_list:REG_EQUAL (symbol_ref/u:SI ("*.LC0") [flags 0x2])
 (nil)))
(insn 44 151 45 4 (set (reg:DF 21 xmm0 [orig:129 D.2450 ] [129])
 (mult:DF (reg:DF 21 xmm0 [orig:128 D.2450 ] [128])
 (mem/u/c:DF (reg:SI 0 ax [147]) [5  S8 A64]))) test.cc:44
790 {*fop_df_comm_sse}
  (expr_list:REG_EQUAL (mult:DF (reg:DF 21 xmm0 [orig:128 D.2450 ] [128])
 (const_double:DF
2.9997371893933895137251965934410691261292e-4
[0x0.9d495182a99308p-11]))
 (nil)))

Compilation string: g++ -m32 -O2 -mfpmath=sse -fPIE -S test.cc


Index: ira.c
===
--- ira.c   (revision 214576)
+++ ira.c   (working copy)
@@ -4887,7 +4887,7 @@ split_live_ranges_for_shrink_wrap (void)
   FOR_BB_INSNS (first, insn)
 {
   rtx dest = interesting_dest_for_shprep (insn, call_dom);
-  if (!dest)
+  if (!dest || dest == pic_offset_table_rtx)
continue;
 
   rtx newreg = NULL_RTX;
Index: lra-assigns.c
===
--- lra-assigns.c   (revision 214576)
+++ lra-assigns.c   (working copy)
@@ -879,11 +879,13 @@ spill_for (int regno, bitmap spilled_pse
}
   /* Spill pseudos. */
   EXECUTE_IF_SET_IN_BITMAP (&spill_pseudos_bitmap, 0, spill_regno, bi)
-   if ((int) spill_regno >= lra_constraint_new_regno_start
-       && ! bitmap_bit_p (&lra_inheritance_pseudos, spill_regno)
-       && ! bitmap_bit_p (&lra_split_regs, spill_regno)
-       && ! bitmap_bit_p (&lra_subreg_reload_pseudos, spill_regno)
-       && ! bitmap_bit_p (&lra_optional_reload_pseudos, spill_regno))
+   if ((pic_offset_table_rtx != NULL
+        && spill_regno == REGNO (pic_offset_table_rtx))
+       || ((int) spill_regno >= lra_constraint_new_regno_start
+           && ! bitmap_bit_p (&lra_inheritance_pseudos, spill_regno)
+           && ! bitmap_bit_p (&lra_split_regs, spill_regno)
+           && ! bitmap_bit_p (&lra_subreg_reload_pseudos, spill_regno)
+           && ! bitmap_bit_p (&lra_optional_reload_pseudos, spill_regno)))
  goto fail;
   insn_pseudos_num = 0;
   if (lra_dump_file != NULL)
@@ -1053,7 +1055,9 @@ setup_live_pseudos_and_spill_after_risky
   return;
 }
   for (n = 0, i = FIRST_PSEUDO_REGISTER; i < max_regno; i++)
-    if (reg_renumber[i] >= 0 && lra_reg_info[i].nrefs > 0)
+    if ((pic_offset_table_rtx == NULL_RTX
+         || i != (int) REGNO (pic_offset_table_rtx))
+        && reg_renumber[i] >= 0 && lra_reg_info[i].nrefs > 0)
       sorted_pseudos[n++] = i;
   qsort (sorted_pseudos, n, sizeof (int), pseudo_compare_func);
   for (i = n - 1; i >= 0; i--)
@@ -1360,6 +1364,8 @@ assign_by_spills (void)
}
   EXECUTE_IF_SET_IN_SPARSESET (live_range_hard_reg_pseudos, conflict_regno)
{
+ gcc_assert (pic_offset_table_rtx == NULL
+ || conflict_regno != REGNO (pic_offset_table_rtx));
  if ((int) conflict_regno >= lra_constraint_new_regno_start)
sorted_pseudos[nfails++] = conflict_regno;
  if (lra_dump_file != NULL)

Re: stack_pointer_delta related ICE in libgcc on 4.9.1

2014-09-03 Thread Andreas Schwab

Joel Sherrill joel.sherr...@oarcorp.com writes:

 On 9/3/2014 1:24 PM, Jeff Law wrote:
 On 09/03/14 09:56, Bernhard Reutner-Fischer wrote:
 Perhaps m5206 is not TARGET_CAS and should not compile this linux-atomic
 in the first place?
 No, I don't think so.
 Coldfire does not have the CAS instruction per
 http://www.freescale.com/files/dsp/doc/ref_manual/CFPRM.pdf

On Linux it uses a kernel helper for atomic operations.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.

Compare Elimination problems

2014-09-03 Thread Paul Shortis



For a 16 bit CPU the cmpelim pass is changing

(insn 33 84 85 6 (parallel [
(set (reg:HI 1 r1)
(ashift:HI (reg:HI 1 r1)
(const_int 1 [0x1])))
(clobber (reg:CC_NOOV 7 flags))
]) ../gcc/testsuite/gcc.c-torture/execute/960311-3.c:18 
33 {ashlhi3}


(insn 34 87 35 6 (set (reg:CC_NOOV 7 flags)
(compare:CC_NOOV (reg:SI 0 r0)
(const_int 0 [0]))) 
../gcc/testsuite/gcc.c-torture/execute/960311-3.c:20 39 
{*comparesi3_nov}


(jump_insn 35 34 36 6 (set (pc)
(if_then_else (ge (reg:CC_NOOV 7 flags)

to

(insn 33 84 85 6 (parallel [
(set (reg:HI 1 r1)
(ashift:HI (reg:HI 1 r1)
(const_int 1 [0x1])))
(set (reg:CC_NOOV 7 flags)
(compare:CC_NOOV (ashift:HI (reg:HI 1 r1)
(const_int 1 [0x1]))
(const_int 0 [0])))
]) ../gcc/testsuite/gcc.c-torture/execute/960311-3.c:18 
29 {ashlhi3_cc}


(jump_insn 35 87 36 6 (set (pc)
(if_then_else (ge (reg:CC_NOOV 7 flags)


(reg:HI r1) is a subreg of (reg:SI r0) however the cmpelim seems 
to be substituting the compare of (reg:HI r1 and 0) for the 
compare of (reg:SI r0 and 0) ?


While I'm here, in i386.md some of the flag setting operations 
specify a mode and some don't, e.g.


(define_expand "cmp<mode>_1"
  [(set (reg:CC FLAGS_REG)
(compare:CC (match_operand:SWI48 0 "nonimmediate_operand")


(define_insn "*add<mode>_3"
  [(set (reg FLAGS_REG)
(compare

Can anyone explain the significance of this ?

Thanks, Paul.
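
A sketch of why the substitution described above could matter, under the assumption (not confirmed in the message) that r1 holds the low 16 bits of the 32-bit value kept in the r0 register pair. A `ge` test on the full SImode value looks at bit 31, while a `ge` test on the HImode half looks at bit 15, and the two can disagree. The function names are illustrative.

```c
#include <stdint.h>

/* "ge" against zero on the full 32-bit value: the sign bit is bit 31.  */
static int
si_ge_zero (uint32_t full)
{
  return (full & 0x80000000u) == 0;
}

/* "ge" against zero on the 16-bit half only: the sign bit is bit 15.
   Assumes the half in question is the low 16 bits of the pair.  */
static int
hi_half_ge_zero (uint32_t full)
{
  uint16_t half = (uint16_t) (full & 0xffffu);
  return (half & 0x8000u) == 0;
}
```

For the value 0x00008000 the 32-bit test says "non-negative" while the 16-bit test says "negative", so a branch driven by the flags of the HImode shift would go the other way.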

gcc-4.9-20140903 is now available

2014-09-03 Thread gccadmin

Snapshot gcc-4.9-20140903 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.9-20140903/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.9 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_9-branch 
revision 214893

You'll find:

 gcc-4.9-20140903.tar.bz2 Complete GCC

  MD5=24dfd67139fda4746d2deff18182611d
  SHA1=d5bf2b1fba133bef433d459a3add44fec262ab20

Diffs from 4.9-20140827 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.9
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.

PATCH for Re: New GCC mirror

2014-09-03 Thread Gerald Pfeifer

On Fri, 29 Aug 2014, ConcertPass Mirrors Admin wrote:
 we set up a new GCC mirror for the community. 
 
 URL: http://mirrors.concertpass.com/gcc/
 Organization/Contact: ConcertPass (ad...@mirrors.concertpass.com)
 Location: United States, Michigan
 
 Please, add it to your mirror list page.

Done thusly.  

Note that your page claims this to be a Sudo mirror; you may want
to make that read GCC mirror, or better gcc.gnu.org mirror.

(Out of curiosity, this is not for the sake of search engine 
optimization, is it?)

Gerald

Index: mirrors.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/mirrors.html,v
retrieving revision 1.226
diff -u -r1.226 mirrors.html
--- mirrors.html	28 Jul 2014 23:02:56 -0000	1.226
+++ mirrors.html	31 Aug 2014 11:14:17 -0000
@@ -57,6 +57,7 @@
   <a href="http://mirrors-usa.go-parts.com/gcc/">http://mirrors-usa.go-parts.com/gcc/</a>
 | <a href="ftp://mirrors-usa.go-parts.com/gcc">ftp://mirrors-usa.go-parts.com/gcc</a>
 | <a href="rsync://mirrors-usa.go-parts.com/gcc">rsync://mirrors-usa.go-parts.com/gcc</a></li>
+<li>US, Michigan: <a href="http://mirrors.concertpass.com/gcc/">http://mirrors.concertpass.com/gcc/</a>, thanks to ad...@mirrors.concertpass.com.</li>
 </ul>
 
 <p>The archives there will be signed by one of the following GnuPG keys:</p>

[Bug target/62308] A bug with aarch64 big-endian

2014-09-03 Thread yroux at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62308

Yvan Roux yroux at gcc dot gnu.org changed:

   What|Removed |Added

 CC||vmakarov at gcc dot gnu.org,
   ||yroux at gcc dot gnu.org

--- Comment #4 from Yvan Roux yroux at gcc dot gnu.org ---
yes, and there is no issue when we use reload instead of LRA (flag -mno-lra).

[Bug rtl-optimization/59535] [4.9 regression] -Os code size regressions for Thumb1/Thumb2 with LRA

2014-09-03 Thread zhenqiang.chen at arm dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59535

Zhenqiang Chen zhenqiang.chen at arm dot com changed:

   What|Removed |Added

 CC||zhenqiang.chen at arm dot com

--- Comment #20 from Zhenqiang Chen zhenqiang.chen at arm dot com ---
Here is a small case to show lra introduces one more register copy (tested with
trunk and 4.9).

int isascii (int c)
{
  return c >= 0 && c < 128;
}
With options: -Os -mthumb -mcpu=cortex-m0, I got

isascii:
        mov     r3, #0
        mov     r2, #127
        mov     r1, r3   //???
        cmp     r2, r0
        adc     r1, r1, r3
        mov     r0, r1
        bx      lr

With options: -Os -mthumb -mcpu=cortex-m0 -mno-lra, I got

isascii:
        mov     r2, #127
        mov     r3, #0
        cmp     r2, r0
        adc     r3, r3, r3
        mov     r0, r3
        bx      lr
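
Both listings appear to implement the range check as a single unsigned comparison against 127 (the cmp/adc pair), which is equivalent to the two-sided signed test in the source. A sketch of that equivalence, with illustrative function names:

```c
/* Reference form from the report.  */
static int
isascii_signed (int c)
{
  return c >= 0 && c < 128;
}

/* Single unsigned comparison: negative values convert to large unsigned
   numbers, so they fail the <= 127 test as well.  */
static int
isascii_unsigned (int c)
{
  return (unsigned) c <= 127u;
}
```

The extra `mov r1, r3` in the LRA output does not change this logic; it only costs one more instruction.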

[Bug libstdc++/55409] std::list not properly wrapping access to custom allocator through allocator_traits

2014-09-03 Thread freddie_chopin at op dot pl

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55409

--- Comment #7 from Freddie Chopin freddie_chopin at op dot pl ---
Great (; Do you have some timeline? I'm not trying to rush you - I'm just
working on a project in which this feature would be beneficial, so I'm
wondering whether I should wait a bit (this particular requirement is not
top-priority) or maybe just implement the allocator the old way for now.

Thanks in advance!

[Bug middle-end/61848] [5 Regression] a previous declaration causes the section attribute to be lost

2014-09-03 Thread ryabinin.a.a at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61848

--- Comment #8 from Andrey Ryabinin ryabinin.a.a at gmail dot com ---
Hi, may I ask what's the status of this?

Besides of section mismatches in linux kernel it also breaks kernel's modules.
Variable __this_module doesn't get into section .gnu.linkonce.this_module,
therefore module refuses to load.

[Bug fortran/61881] ICE in gfc_conv_intrinsic_to_class with assumed-rank CLASS(*)

2014-09-03 Thread burnus at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61881

--- Comment #6 from Tobias Burnus burnus at gcc dot gnu.org ---
Author: burnus
Date: Wed Sep  3 06:41:37 2014
New Revision: 214843

URL: https://gcc.gnu.org/viewcvs?rev=214843&root=gcc&view=rev
Log:
Missed that file in r213079 of 2014-07-26

2014-09-03  Tobias Burnus  bur...@net-b.de

PR fortran/61881
PR fortran/61888
PR fortran/57305
* gfortran.dg/sizeof_4.f90: New.


Added:
trunk/gcc/testsuite/gfortran.dg/sizeof_4.f90
Modified:
trunk/gcc/testsuite/ChangeLog

[Bug fortran/61888] Wrong results with SIZEOF and assumed-rank arrays

2014-09-03 Thread burnus at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61888

--- Comment #3 from Tobias Burnus burnus at gcc dot gnu.org ---
Author: burnus
Date: Wed Sep  3 06:41:37 2014
New Revision: 214843

URL: https://gcc.gnu.org/viewcvs?rev=214843&root=gcc&view=rev
Log:
Missed that file in r213079 of 2014-07-26

2014-09-03  Tobias Burnus  bur...@net-b.de

PR fortran/61881
PR fortran/61888
PR fortran/57305
* gfortran.dg/sizeof_4.f90: New.


Added:
trunk/gcc/testsuite/gfortran.dg/sizeof_4.f90
Modified:
trunk/gcc/testsuite/ChangeLog

[Bug fortran/57305] [OOP] ICE when calling SIZEOF on an unlimited polymorphic variable

2014-09-03 Thread burnus at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57305

--- Comment #17 from Tobias Burnus burnus at gcc dot gnu.org ---
Author: burnus
Date: Wed Sep  3 06:41:37 2014
New Revision: 214843

URL: https://gcc.gnu.org/viewcvs?rev=214843&root=gcc&view=rev
Log:
Missed that file in r213079 of 2014-07-26

2014-09-03  Tobias Burnus  bur...@net-b.de

PR fortran/61881
PR fortran/61888
PR fortran/57305
* gfortran.dg/sizeof_4.f90: New.


Added:
trunk/gcc/testsuite/gfortran.dg/sizeof_4.f90
Modified:
trunk/gcc/testsuite/ChangeLog

[Bug target/62663] m68k / coldfire : compiling with -msep-data breaks the code

2014-09-03 Thread sch...@linux-m68k.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62663

--- Comment #4 from Andreas Schwab sch...@linux-m68k.org ---
Then this is most likely a linker bug, not setting up the GOT correctly.

[Bug fortran/63152] New: needless initialization of local pointer arrays.

2014-09-03 Thread Joost.VandeVondele at mat dot ethz.ch

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63152

Bug ID: 63152
   Summary: needless initialization of local pointer arrays.
   Product: gcc
   Version: 5.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch

I've noticed that for this code:

 SUBROUTINE S1()
   INTEGER, POINTER, DIMENSION(:) :: v
   INTERFACE
SUBROUTINE foo(v)
   INTEGER, POINTER, DIMENSION(:) :: v
END SUBROUTINE
   END INTERFACE
   CALL foo(v)
 END SUBROUTINE S1

gfortran initializes the pointer (to zero) even if '-fno-init-local-zero' :

s1 ()
{
  struct array1_integer(kind=4) v;

  v.data = 0B;
  foo (v);
}

I don't think this is mandated (other compilers don't)

I'm working on a patch.

[Bug fortran/63152] needless initialization of local pointer arrays.

2014-09-03 Thread Joost.VandeVondele at mat dot ethz.ch

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63152

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2014-09-03
 CC||Joost.VandeVondele at mat dot ethz.ch
   Assignee|unassigned at gcc dot gnu.org  |Joost.VandeVondele at mat dot ethz.ch
 Ever confirmed|0   |1

--- Comment #1 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch ---
working on a patch.

[Bug c++/62224] [4.9 Regression] Possible regression in gcc-4.9-20140820

2014-09-03 Thread trippels at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62224

Markus Trippelsdorf trippels at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #11 from Markus Trippelsdorf trippels at gcc dot gnu.org ---
Here's a small testcase:

markus@x4 tmp % cat cppcodemodelinspectordialog.ii
namespace CppTools
{
class A
{
public:
  virtual void headerPaths () = 0;
};
namespace Internal
{
class CppModelManager : CppTools::A
{
  void
  headerPaths ()
  {
ensureUpdated ();
  }
  void ensureUpdated ();
};
}
}
CppTools::A *a;
void
fn1 ()
{
  a->headerPaths ();
}

(before r214208)
markus@x4 tmp % g++ -Wl,--no-undefined -shared -fPIC -O2
cppcodemodelinspectordialog.ii
markus@x4 tmp %

(after r214208)
markus@x4 tmp % g++ -Wl,--no-undefined -shared -fPIC -O2
cppcodemodelinspectordialog.ii
/tmp/ccMZQE0g.o:cppcodemodelinspectordialog.ii:function fn1(): error: undefined
reference to 'CppTools::Internal::CppModelManager::ensureUpdated()'
/tmp/ccMZQE0g.o:cppcodemodelinspectordialog.ii:function
CppTools::Internal::CppModelManager::headerPaths(): error: undefined reference
to 'CppTools::Internal::CppModelManager::ensureUpdated()'
collect2: error: ld returned 1 exit status

(one can use -fno-devirtualize-speculatively as a workaround)
markus@x4 tmp % g++ -Wl,--no-undefined -fno-devirtualize-speculatively -shared
-fPIC -O2 cppcodemodelinspectordialog.ii
markus@x4 tmp %
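
For readers unfamiliar with the transformation, the sketch below is a hand-written analogy of what speculative devirtualization does to the call in fn1. It is not GCC's actual output. The compiler guesses the likely target of the virtual call and emits a guarded direct call, which it may then inline; that is how fn1 ends up referencing ensureUpdated(). Assumptions made for illustration only: the guard uses typeid for readability (the compiler's own check differs), the return values and the OtherImpl class are invented, ensureUpdated() is given a body, and the inheritance is made public so the cast compiles.

```cpp
#include <typeinfo>

namespace CppTools {
class A {
public:
  virtual ~A() = default;
  virtual int headerPaths() = 0;
};
namespace Internal {
class CppModelManager : public A {
public:
  int headerPaths() override { return ensureUpdated(); }
  // Defined here so the sketch links; in the testcase it is only declared.
  int ensureUpdated() { return 42; }
};
// Hypothetical second implementation, used to exercise the fallback path.
class OtherImpl : public A {
public:
  int headerPaths() override { return 7; }
};
}
}

// Hand-written equivalent of a speculatively devirtualized call.
int fn1_speculative(CppTools::A *a) {
  using CppTools::Internal::CppModelManager;
  if (typeid(*a) == typeid(CppModelManager)) {
    // Qualified call: resolved statically, so it can be inlined, which
    // pulls a direct reference to ensureUpdated() into this function.
    return static_cast<CppModelManager *>(a)->CppModelManager::headerPaths();
  }
  // Fallback: ordinary virtual dispatch.
  return a->headerPaths();
}
```

Once the guarded path is inlined, the caller needs a definition of ensureUpdated() at link time, which matches the undefined references reported after r214208.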

[Bug fortran/63153] New: pointers are not nullified with -finit-local-zero

2014-09-03 Thread Joost.VandeVondele at mat dot ethz.ch

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63153

Bug ID: 63153
   Summary: pointers are not nullified with  -finit-local-zero
   Product: gcc
   Version: 5.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch

Scalar pointers are not nullified with -finit-local-zero. After the fix for
PR63152, arrays with the pointer attribute might need this as well.

 cat bug.f90
 SUBROUTINE S1()
   INTEGER, POINTER :: w
   IF (ASSOCIATED(w)) CALL ABORT()
 END SUBROUTINE S1

 gfortran -fdump-tree-original -finit-local-zero -g -c bug.f90
 cat bug.f90.003t.original 
s1 ()
{
  integer(kind=4) * w;

  if (w != 0B)
    {
      _gfortran_abort ();
    }
  L.1:;
}

[Bug target/61330] [5 Regression] Thumb ICE for case 920507-1.c

2014-09-03 Thread yroux at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61330

--- Comment #9 from Yvan Roux yroux at gcc dot gnu.org ---
Author: yroux
Date: Wed Sep  3 07:23:01 2014
New Revision: 214847

URL: https://gcc.gnu.org/viewcvs?rev=214847&root=gcc&view=rev
Log:
gcc/
2014-09-03  Yvan Roux  yvan.r...@linaro.org

Backport from trunk r214526.
2014-08-26  Joseph Myers  jos...@codesourcery.com

PR target/60606
PR target/61330
* varasm.c (make_decl_rtl): Clear DECL_ASSEMBLER_NAME and
DECL_HARD_REGISTER and return for invalid register specifications.
* cfgexpand.c (expand_one_var): If expand_one_hard_reg_var clears
DECL_HARD_REGISTER, call expand_one_error_var.
* config/arm/arm.c (arm_hard_regno_mode_ok): Do not allow
CC_REGNUM with non-MODE_CC modes.
(arm_regno_class): Return NO_REGS for PC_REGNUM.

gcc/testsuite/
2014-09-03  Yvan Roux  yvan.r...@linaro.org

Backport from trunk r214526.
2014-08-26  Joseph Myers  jos...@codesourcery.com

PR target/60606
PR target/61330
* gcc.dg/torture/pr60606-1.c, gcc.target/arm/pr60606-2.c,
gcc.target/arm/pr60606-3.c, gcc.target/arm/pr60606-4.c: New tests.


Added:
branches/linaro/gcc-4_9-branch/gcc/testsuite/gcc.dg/torture/pr60606-1.c
branches/linaro/gcc-4_9-branch/gcc/testsuite/gcc.target/arm/pr60606-2.c
branches/linaro/gcc-4_9-branch/gcc/testsuite/gcc.target/arm/pr60606-3.c
branches/linaro/gcc-4_9-branch/gcc/testsuite/gcc.target/arm/pr60606-4.c
Modified:
branches/linaro/gcc-4_9-branch/gcc/ChangeLog.linaro
branches/linaro/gcc-4_9-branch/gcc/cfgexpand.c
branches/linaro/gcc-4_9-branch/gcc/config/arm/arm.c
branches/linaro/gcc-4_9-branch/gcc/testsuite/ChangeLog.linaro
branches/linaro/gcc-4_9-branch/gcc/varasm.c

[Bug target/60606] [ARM] ICE with asm (mov ..., pc)

2014-09-03 Thread yroux at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60606

--- Comment #9 from Yvan Roux yroux at gcc dot gnu.org ---
Author: yroux
Date: Wed Sep  3 07:23:01 2014
New Revision: 214847

URL: https://gcc.gnu.org/viewcvs?rev=214847&root=gcc&view=rev
Log:
gcc/
2014-09-03  Yvan Roux  yvan.r...@linaro.org

Backport from trunk r214526.
2014-08-26  Joseph Myers  jos...@codesourcery.com

PR target/60606
PR target/61330
* varasm.c (make_decl_rtl): Clear DECL_ASSEMBLER_NAME and
DECL_HARD_REGISTER and return for invalid register specifications.
* cfgexpand.c (expand_one_var): If expand_one_hard_reg_var clears
DECL_HARD_REGISTER, call expand_one_error_var.
* config/arm/arm.c (arm_hard_regno_mode_ok): Do not allow
CC_REGNUM with non-MODE_CC modes.
(arm_regno_class): Return NO_REGS for PC_REGNUM.

gcc/testsuite/
2014-09-03  Yvan Roux  yvan.r...@linaro.org

Backport from trunk r214526.
2014-08-26  Joseph Myers  jos...@codesourcery.com

PR target/60606
PR target/61330
* gcc.dg/torture/pr60606-1.c, gcc.target/arm/pr60606-2.c,
gcc.target/arm/pr60606-3.c, gcc.target/arm/pr60606-4.c: New tests.


Added:
branches/linaro/gcc-4_9-branch/gcc/testsuite/gcc.dg/torture/pr60606-1.c
branches/linaro/gcc-4_9-branch/gcc/testsuite/gcc.target/arm/pr60606-2.c
branches/linaro/gcc-4_9-branch/gcc/testsuite/gcc.target/arm/pr60606-3.c
branches/linaro/gcc-4_9-branch/gcc/testsuite/gcc.target/arm/pr60606-4.c
Modified:
branches/linaro/gcc-4_9-branch/gcc/ChangeLog.linaro
branches/linaro/gcc-4_9-branch/gcc/cfgexpand.c
branches/linaro/gcc-4_9-branch/gcc/config/arm/arm.c
branches/linaro/gcc-4_9-branch/gcc/testsuite/ChangeLog.linaro
branches/linaro/gcc-4_9-branch/gcc/varasm.c

[Bug c++/62224] [4.9 Regression] Possible regression in gcc-4.9-20140820

2014-09-03 Thread chris2553 at googlemail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62224

--- Comment #12 from Chris Clayton chris2553 at googlemail dot com ---
Sorry, you'll have to stick with me here while I figure out what that means.

I think you are saying that prior to r214208, the symbols definedMacros() and
headerPaths() were present but effectively no-ops. Post r214208 they now
contain operations including calls to ensureUpdated().

Given that the symbol for ensureUpdated() appears to be present in
libCppTools.so (along with the symbols for its two post-r214208 callers), does
that suggest a problem with the linker, which is /usr/bin/ld from the latest
version (2.24) of binutils?

Or could it have anything to do with my system being a 32bit userspace on a 64bit
kernel? I usually build packages as rpms and have the rpm binary wrapped in a
script which prefixes the call to the actual rpm binary with setarch
i386. I've been careful whilst investigating this problem to make sure that I
prefix calls to qmake and make with setarch i386. I've built loads and loads
of packages with this setup (including gcc).

I'm just trying to figure out the next port of call with this problem. I note
that the Debian folks have a bug logged but seem to be waiting on resolution
via this bug report - https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=759862.

[Bug c++/62224] [4.9 Regression] Possible regression in gcc-4.9-20140820

2014-09-03 Thread trippels at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62224

--- Comment #13 from Markus Trippelsdorf trippels at gcc dot gnu.org ---
(In reply to Chris Clayton from comment #12)
> Sorry, you'll have to stick with me here while I figure out what that means.
> 
> I think you are saying that prior to r214208, the symbols definedMacros()
> and headerPaths() were present but effectively no-ops. Post r214208 they now
> contain operations including calls to ensureUpdated().
> 
> Given that the symbol for ensureUpdated() appears to be present in
> libCppTools.so (along with the symbols for its two post-r214208 callers),
> does that suggest a problem with the linker, which is /usr/bin/ld from the
> latest version (2.24) of binutils?

No. This has nothing to do with libCppTools.so. As I wrote before, the
build system of qt-creator must be changed to provide the missing symbol
by simply adding cppmodelmanager.o to the libCppEditor.so link command.

[Bug rtl-optimization/59535] [4.9 regression] -Os code size regressions for Thumb1/Thumb2 with LRA

2014-09-03 Thread fredrik.hederstie...@securitas-direct.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59535

--- Comment #21 from Fredrik Hederstierna <fredrik.hederstie...@securitas-direct.com> ---
I filed this previously; maybe it's a duplicate:

Bug 61578 - Code size increase for ARM thumb compared to 4.8.x when compiling
with -Os

BR Fredrik

[Bug target/62662] [4.9/5 Regression] Miscompilation of Qt on s390x

2014-09-03 Thread krebbel at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62662

--- Comment #5 from Andreas Krebbel krebbel at gcc dot gnu.org ---
(In reply to Jakub Jelinek from comment #4)
I agree that this is something we need to fix in the back-end. I was just
curious about when this first surfaced and wanted to keep that info for the record.

[Bug bootstrap/61078] [5 Regression] ESA mode bootstrap failure since r209897

2014-09-03 Thread krebbel at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61078

--- Comment #8 from Andreas Krebbel krebbel at gcc dot gnu.org ---
Author: krebbel
Date: Wed Sep  3 08:06:09 2014
New Revision: 214850

URL: https://gcc.gnu.org/viewcvs?rev=214850&root=gcc&view=rev
Log:
2014-09-03  Andreas Krebbel  andreas.kreb...@de.ibm.com

PR target/61078
* config/s390/s390.md (*negdi2_31): Add s390_split_ok_p check
and add a second splitter to handle the remaining cases.

2014-09-03  Andreas Krebbel  andreas.kreb...@de.ibm.com

PR target/61078
* gcc.target/s390/pr61078.c: New testcase.



Added:
trunk/gcc/testsuite/gcc.target/s390/pr61078.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/s390/s390.md
trunk/gcc/testsuite/ChangeLog

[Bug fortran/63152] needless initialization of local pointer arrays.

2014-09-03 Thread Joost.VandeVondele at mat dot ethz.ch

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63152

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

           What    |Removed                     |Added

                URL|                            |https://gcc.gnu.org/ml/fortran/2014-09/msg00016.html

--- Comment #2 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> ---
WIP patch at URL

[Bug bootstrap/61078] [5 Regression] ESA mode bootstrap failure since r209897

2014-09-03 Thread krebbel at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61078

Andreas Krebbel krebbel at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #9 from Andreas Krebbel krebbel at gcc dot gnu.org ---
Fixed per comment 8

[Bug middle-end/61654] [4.9/5 Regression] ICE in release_function_body, at cgraph.c:1699

2014-09-03 Thread jamborm at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61654

Martin Jambor jamborm at gcc dot gnu.org changed:

   What|Removed |Added

  Known to fail|4.10.0  |5.0

--- Comment #12 from Martin Jambor jamborm at gcc dot gnu.org ---
I have proposed a fix on the mailing list:

https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00207.html

[Bug ipa/61986] ICE on valid code at -O3 on x86_64-linux-gnu in decide_about_value, at ipa-cp.c:3480

2014-09-03 Thread jamborm at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61986

--- Comment #2 from Martin Jambor jamborm at gcc dot gnu.org ---
I have proposed a fix on the mailing list:

https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00209.html

[Bug ipa/62015] [4.8/4.9/5 Regression] ipa-cp-clone uses a clone that is too specialized for the call context

2014-09-03 Thread jamborm at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62015

--- Comment #3 from Martin Jambor jamborm at gcc dot gnu.org ---
I have proposed a fix on the mailing list:

https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00210.html

[Bug regression/63150] [4.9/5 regression] FAIL: gcc.target/powerpc/pr53199.c scan-assembler-times *

2014-09-03 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63150

Richard Biener rguenth at gcc dot gnu.org changed:

   What|Removed |Added

  Known to work||4.8.3
   Target Milestone|--- |4.9.2
Summary|[4.9 regression] FAIL:  |[4.9/5 regression] FAIL:
   |gcc.target/powerpc/pr53199. |gcc.target/powerpc/pr53199.
   |c scan-assembler-times *|c scan-assembler-times *

[Bug tree-optimization/63148] r187042 causes auto-vectorization failure for X86 for -m32.

2014-09-03 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63148

Richard Biener rguenth at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #1 from Richard Biener rguenth at gcc dot gnu.org ---
This has been fixed on the 4.8 branch already, I think this is a duplicate of
PR60276.

*** This bug has been marked as a duplicate of bug 60276 ***

[Bug tree-optimization/60276] [4.7 Regression] -O3 autovectorizer breaks on a particular loop

2014-09-03 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60276

Richard Biener rguenth at gcc dot gnu.org changed:

   What|Removed |Added

 CC||doug.gilmore at imgtec dot com

--- Comment #15 from Richard Biener rguenth at gcc dot gnu.org ---
*** Bug 63148 has been marked as a duplicate of this bug. ***

[Bug c++/63140] wrong code generation probably due to optimization problem

2014-09-03 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63140

--- Comment #3 from Richard Biener rguenth at gcc dot gnu.org ---
You might want to try -fsanitize=undefined and/or -fno-strict-overflow as it
sounds like you may be invoking undefined behavior.

[Bug c++/62224] [4.9 Regression] Possible regression in gcc-4.9-20140820

2014-09-03 Thread trippels at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62224

--- Comment #14 from Markus Trippelsdorf trippels at gcc dot gnu.org ---
(In reply to Markus Trippelsdorf from comment #13)
> (In reply to Chris Clayton from comment #12)
> > Sorry, you'll have to stick with me here while I figure out what that means.
> > 
> > I think you are saying that prior to r214208, the symbols definedMacros()
> > and headerPaths() were present but effectively no-ops. Post r214208 they now
> > contain operations including calls to ensureUpdated().
> > 
> > Given that the symbol for ensureUpdated() appears to be present in
> > libCppTools.so (along with the symbols for its two post-r214208 callers),
> > does that suggest a problem with the linker, which is /usr/bin/ld from the
> > latest version (2.24) of binutils?
> 
> No. This has nothing to do with libCppTools.so. As I wrote before, the
> build system of qt-creator must be changed to provide the missing symbol
> by simply adding cppmodelmanager.o to the libCppEditor.so link command.

Out of curiosity, I have downloaded and tried to build qt-creator-3.2.0.
The build failed exactly as you described in comment 0.

The fix is simple, just add  __attribute__ ((visibility ("default"))) to
CppModelManager::ensureUpdated() in src/plugins/cpptools/cppmodelmanager.cpp:

294  __attribute__ ((visibility ("default")))
295 void CppModelManager::ensureUpdated()
296 {

This will make _ZN8CppTools8Internal15CppModelManager13ensureUpdatedEv external
for libCppTools.so and everything is fine.
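
As a minimal sketch of the attribute syntax, the fragment below marks a single member function as exported. It assumes, without confirmation from this thread, that the library is otherwise built with hidden default visibility (for example -fvisibility=hidden). The DEMO_EXPORT macro, the demo namespace and the Manager class are hypothetical names, and the attribute is placed on the in-class declaration here rather than on the out-of-line definition used in the fix above.

```cpp
// GCC/Clang-specific: give one symbol default (exported) visibility.
#define DEMO_EXPORT __attribute__((visibility("default")))

namespace demo {
class Manager {
public:
  // Inline caller: may be emitted in another shared object that includes
  // this header, so the callee must be visible outside its own library.
  int headerPaths() { return ensureUpdated(); }

  // Exported even when the rest of the library is hidden.
  DEMO_EXPORT int ensureUpdated();
};

// Out-of-line definition; it inherits the visibility of the declaration.
int Manager::ensureUpdated() { return 1; }
}
```

Whether the symbol really ends up exported can be checked on the built library with nm -D, looking for the mangled name among the defined dynamic symbols.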

[Bug testsuite/53155] Not parallel: test for -j fails with new make

2014-09-03 Thread Joost.VandeVondele at mat dot ethz.ch

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53155

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

           What    |Removed                     |Added

             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2014-09-03
                 CC|                            |Joost.VandeVondele at mat dot ethz.ch
     Ever confirmed|0                           |1

--- Comment #4 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> ---
Still fails. Honestly, this made contributing my first patches much slower, as
testing took ages to complete.

[Bug c++/63140] wrong code generation probably due to optimization problem

2014-09-03 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63140

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek jakub at gcc dot gnu.org ---
Or -fno-aggressive-loop-optimizations.  From your description it is hard to
figure out what exactly to look for in the assembly, so e.g. bisecting the
compiler to find where it stopped working is not easy.

[Bug tree-optimization/55334] [4.8/4.9/5 Regression] mgrid regression (ipa-cp disables vectorization)

2014-09-03 Thread jamborm at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55334

--- Comment #39 from Martin Jambor jamborm at gcc dot gnu.org ---
(In reply to Vidya Praveen from comment #38)
> Until we fix this issue, could we have the workaround posted by Martin Jambor
> (comment #29) applied again on 4.9 and trunk? 

No, not on trunk please.

As I said on IRC yesterday, before we even consider this for the 4.9
branch, please verify that inlining does not cause the same problems
with the benchmark (on the particular architecture you care for).  It
is certainly capable of doing that and we certainly do not want to
switch inlining off :-)

[Bug tree-optimization/49444] IV-OPTs changes an unaligned loads into aligned loads incorrectly

2014-09-03 Thread amker.cheng at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49444

bin.cheng amker.cheng at gmail dot com changed:

   What|Removed |Added

 CC||amker.cheng at gmail dot com

--- Comment #8 from bin.cheng amker.cheng at gmail dot com ---
This should be fixed on trunk now, at least as of r211210 and r214864.
For Andrew's test, the generated MIPS assembly for the kernel loop is as below.

$L3:
        lwl     $5,1($16)
        lwl     $4,5($16)
        lwl     $3,9($16)
        lwr     $5,4($16)
        lwr     $4,8($16)
        lwr     $3,12($16)
        lw      $2,%gp_rel(ss)($28)
        addiu   $16,$16,13
        sw      $5,0($2)
        sw      $4,4($2)
        jal     g
        sw      $3,8($2)

        bne     $16,$17,$L3
        move    $2,$0

For Richard's case (with an explicit conversion when calling foo), the
generated mips assembly is as below.
foo:
        .frame  $sp,0,$31               # vars= 0, regs= 0/0, args= 0, gp= 0
        .mask   0x00000000,0
        .fmask  0x00000000,0
        .set    noreorder
        .set    nomacro
        lwl     $2,0($4)
        nop
        lwr     $2,3($4)
        j       $31
        nop

        .set    macro
        .set    reorder
        .end    foo
        .size   foo, .-foo

Apparently, lwl/lwr are generated for the unaligned memory access.

Thanks,
bin
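
As a source-level counterpart to the lwl/lwr pairs above, here is a minimal sketch of a portable unaligned load. It is not taken from either testcase in this PR; the function name load_u32_unaligned is hypothetical. Copying through memcpy tells the compiler nothing about the alignment of the source, so it is free to pick whatever unaligned-access sequence the target provides.

```cpp
#include <cstdint>
#include <cstring>

// Read a 32-bit value from an address that may not be 4-byte aligned.
// The bytes are interpreted in the host's native byte order.
std::uint32_t load_u32_unaligned(const unsigned char *p) {
  std::uint32_t value;
  std::memcpy(&value, p, sizeof value);  // no alignment assumption on p
  return value;
}
```

Dereferencing a cast pointer such as *(const std::uint32_t *)p instead would promise the compiler an aligned access, which is the kind of assumption this PR is about.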

[Bug tree-optimization/55334] [4.8/4.9/5 Regression] mgrid regression (ipa-cp disables vectorization)

2014-09-03 Thread rguenther at suse dot de

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55334

--- Comment #40 from rguenther at suse dot de rguenther at suse dot de ---
On Wed, 3 Sep 2014, jamborm at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55334
> 
> --- Comment #39 from Martin Jambor <jamborm at gcc dot gnu.org> ---
> (In reply to Vidya Praveen from comment #38)
> > Until we fix this issue, could we have the workaround posted by Martin Jambor
> > (comment #29) applied again on 4.9 and trunk? 
> 
> No, not on trunk please.
> 
> As I said on IRC yesterday, before we even consider this for the 4.9
> branch, please verify that inlining does not cause the same problems
> with the benchmark (on the particular architecture you care for).  It
> is certainly capable of doing that and we certainly do not want to
> switch inlining off :-)

Inlining will certainly cause the same problem.

[Bug c++/63140] wrong code generation probably due to optimization problem

2014-09-03 Thread gcc at boomerangsworld dot de

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63140

--- Comment #5 from Ralf Hoffmann gcc at boomerangsworld dot de ---
Thanks for the feedback, I am also suspecting I have some problem in my code
regarding undefined behavior.

What I do for testing is to compile my tool Worker
(http://www.boomerangsworld.de/cms/worker/index.html, version 3.5.0) with

make clean
LDFLAGS=-fsanitize=undefined CPPFLAGS=-fsanitize=undefined ./configure
make

and then start the program (src/worker), click on the top-left "A" button for
the about dialog and click on the down arrow to scroll down the option list.
It then either works, or the process hangs in an endless loop.

I tried to use -fsanitize=undefined and it actually makes a difference. There
is no compiler output pointing out some problem and also no runtime output when
reaching the test point mentioned above. But with this option, it behaves
normally and the endless loop does not occur.

When using the options -fno-strict-overflow or
-fno-aggressive-loop-optimizations the problem still occurs.

I would like to help bisecting the compiler if you could give me
a hint where to start. As far as I see, there is no git repo which
would make it easier.

[Bug tree-optimization/49444] IV-OPTs changes an unaligned loads into aligned loads incorrectly

2014-09-03 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49444

--- Comment #9 from Richard Biener rguenth at gcc dot gnu.org ---
Thus dup of PR61320?

[Bug c/62024] __atomic_always_lock_free is not a constant expression

2014-09-03 Thread mpolacek at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62024

--- Comment #8 from Marek Polacek mpolacek at gcc dot gnu.org ---
Author: mpolacek
Date: Wed Sep  3 11:16:29 2014
New Revision: 214871

URL: https://gcc.gnu.org/viewcvs?rev=214871&root=gcc&view=rev
Log:
PR c/62024
* c-parser.c (c_parser_static_assert_declaration_no_semi): Strip no-op
conversions.

* g++.dg/cpp0x/pr62024.C: New test.
* gcc.dg/pr62024.c: New test.

Added:
trunk/gcc/testsuite/g++.dg/cpp0x/pr62024.C
trunk/gcc/testsuite/gcc.dg/pr62024.c
Modified:
trunk/gcc/c/ChangeLog
trunk/gcc/c/c-parser.c
trunk/gcc/testsuite/ChangeLog

[Bug c/62024] __atomic_always_lock_free is not a constant expression

2014-09-03 Thread mpolacek at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62024

Marek Polacek mpolacek at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #9 from Marek Polacek mpolacek at gcc dot gnu.org ---
Should be fixed.

[Bug libstdc++/55409] std::list not properly wrapping access to custom allocator through allocator_traits

2014-09-03 Thread redi at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55409

Jonathan Wakely redi at gcc dot gnu.org changed:

   What|Removed |Added

   Target Milestone|--- |5.0

--- Comment #8 from Jonathan Wakely redi at gcc dot gnu.org ---
It will be done for the GCC 5.0 release.

[Bug lto/62026] [4.9/5 Regression] Crash in lto_get_decl_name_mapping

2014-09-03 Thread jamborm at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62026

--- Comment #8 from Martin Jambor jamborm at gcc dot gnu.org ---
I'm sorry but I cannot reproduce the problem with the attached testcase.  I
will try the libxul link.

[Bug other/63155] New: [4.9/5 Regression] memory hog

2014-09-03 Thread doko at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63155

Bug ID: 63155
   Summary: [4.9/5 Regression] memory hog
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: doko at gcc dot gnu.org

Created attachment 33441
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33441&action=edit
preprocessed source

[forwarded from https://bugs.debian.org/759683]

Compiling the attached test case with the 4.9 branch r214759 and trunk r213954
takes about 90sec on x86_64 and 10GB of memory. It succeeds with the 4.8 branch
in less than a second.

$ gcc -std=c99 -c testunity_Runner.i


from the Debian issue:

  
  Notice that replacing _setjmp
  (Unity.AbortFrame[Unity.CurrentAbortFrame]) in the main function by
  _setjmp (Unity.AbortFrame[0]) makes gcc work normally.
  After a few tests it seems that gcc does not like having a variable in here.
  

I don't see the crash reported in the Debian issue.
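
For context, the pattern described in the Debian report looks roughly like the sketch below. This is not the attached preprocessed source: the UnityLike struct, run_test and the two test functions are hypothetical stand-ins, and only the setjmp call on an array element selected by a runtime index is taken from the report.

```cpp
#include <csetjmp>

// Hypothetical stand-in for the test framework's global state.
struct UnityLike {
  std::jmp_buf AbortFrame[2];
  int CurrentAbortFrame;
};

static UnityLike Unity = {};

// Runs one test; returns 0 if it finished normally, 1 if it aborted.
static int run_test(void (*test)()) {
  // The jmp_buf is chosen through a variable index, as in the report.
  if (setjmp(Unity.AbortFrame[Unity.CurrentAbortFrame]) == 0) {
    test();
    return 0;
  }
  return 1;  // reached via longjmp from inside the test
}

static void passing_test() {}

static void failing_test() {
  // Abort the test by jumping back to the frame saved in run_test.
  std::longjmp(Unity.AbortFrame[Unity.CurrentAbortFrame], 1);
}
```

A generated test runner typically repeats such a setjmp call once per test case inside a single large function, which would be consistent with the compile-time behaviour reported here.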

[Bug c++/63140] wrong code generation probably due to optimization problem

2014-09-03 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63140

--- Comment #6 from Jakub Jelinek jakub at gcc dot gnu.org ---
There is a git mirror of the svn repo.
Anyway, -fsanitize=undefined enables -fno-delete-null-pointer-checks; perhaps
you could try that option alone to see if it makes a difference.

[Bug libstdc++/62259] atomic class doesn't enforce required alignment on powerpc64

2014-09-03 Thread uweigand at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62259

Ulrich Weigand uweigand at gcc dot gnu.org changed:

   What|Removed |Added

 CC||uweigand at gcc dot gnu.org

--- Comment #1 from Ulrich Weigand uweigand at gcc dot gnu.org ---
Indeed, when running a simple test program:

#include <atomic>
#include <stdio.h>

struct twoints {
  int a;
  int b;
};

int main(void) {
   printf("%d\n", __alignof__ (twoints));
   printf("%d\n", __alignof__ (std::atomic<twoints>));
   return 0;
}

we see that GCC only requires 4 bytes of alignment for the atomic type.

However, with the equivalent C11 code using the _Atomic keyword

#include <stdatomic.h>
#include <stdio.h>

struct twoints {
  int a;
  int b;
};

int main() {
   printf("%d\n", __alignof__ (struct twoints));
   printf("%d\n", __alignof__ (_Atomic (struct twoints)));
   return 0;
}

we get an alignment requirement of 8 bytes for the atomic type.

In the C case, this is done by the compiler front-end where it implements the
_Atomic keyword.  In the C++ case, it seems the compiler doesn't really get
involved, as it's all done in plain C++ in standard library code ...

I suspect the intent was that for C++, we likewise ought to have an increased
alignment requirement for the type, but I'm not sure how to implement this in
the library.  Need some of the library experts to comment here.
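
One way a library could raise the alignment in plain C++ is sketched below. This is only an illustration of the idea, not the libstdc++ implementation or a proposed patch: aligned_box and atomic_alignment are hypothetical names, and the rule used (align to the object size when that size is a power of two of at most 16 bytes) is an assumption chosen to mirror the 8-byte result seen in the C case.

```cpp
#include <cstddef>

struct twoints {
  int a;
  int b;
};

// Pick the object size as alignment when it is a power of two no larger
// than 16 bytes and stricter than the natural alignment; otherwise keep
// the natural alignment.
constexpr std::size_t atomic_alignment(std::size_t size, std::size_t natural) {
  return (size != 0 && (size & (size - 1)) == 0 && size <= 16 && size > natural)
             ? size
             : natural;
}

// Wrapper whose member carries the increased alignment requirement.
template <typename T>
struct aligned_box {
  alignas(atomic_alignment(sizeof(T), alignof(T))) T value;
};
```

With 4-byte int, aligned_box<twoints> has an alignment of 8 while twoints itself keeps an alignment of 4, so the wrapper can be a target of 8-byte atomic instructions without changing the plain struct.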

[Bug tree-optimization/49444] IV-OPTs changes an unaligned loads into aligned loads incorrectly

2014-09-03 Thread pinskia at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49444

Andrew Pinski pinskia at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #10 from Andrew Pinski pinskia at gcc dot gnu.org ---
(In reply to Richard Biener from comment #9)
> Thus dup of PR61320?

Yes.

*** This bug has been marked as a duplicate of bug 61320 ***

[Bug bootstrap/61320] [4.10 regression] ICE in jcf-parse.c:1622 (parse_class_file

2014-09-03 Thread pinskia at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61320

Andrew Pinski pinskia at gcc dot gnu.org changed:

   What|Removed |Added

 CC||pinskia at gcc dot gnu.org

--- Comment #69 from Andrew Pinski pinskia at gcc dot gnu.org ---
*** Bug 49444 has been marked as a duplicate of this bug. ***

[Bug c/62294] [4.9 Regression] Missing passing argument [...] from incompatible pointer type warning.

2014-09-03 Thread mpolacek at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62294

--- Comment #3 from Marek Polacek mpolacek at gcc dot gnu.org ---
Author: mpolacek
Date: Wed Sep  3 12:54:06 2014
New Revision: 214874

URL: https://gcc.gnu.org/viewcvs?rev=214874&root=gcc&view=rev
Log:
PR c/62294
* c-typeck.c (convert_arguments): Get location of a parameter.  Change
error and warning calls to error_at and warning_at.  Pass location of
a parameter to it.
(convert_for_assignment): Add parameter to WARN_FOR_ASSIGNMENT and
WARN_FOR_QUALIFIERS.  Pass expr_loc to those.

* gcc.dg/pr56724-1.c: New test.
* gcc.dg/pr56724-2.c: New test.
* gcc.dg/pr62294.c: New test.
* gcc.dg/pr62294.h: New file.

Added:
branches/gcc-4_9-branch/gcc/testsuite/gcc.dg/pr56724-1.c
branches/gcc-4_9-branch/gcc/testsuite/gcc.dg/pr56724-2.c
branches/gcc-4_9-branch/gcc/testsuite/gcc.dg/pr62294.c
branches/gcc-4_9-branch/gcc/testsuite/gcc.dg/pr62294.h
Modified:
branches/gcc-4_9-branch/gcc/c/ChangeLog
branches/gcc-4_9-branch/gcc/c/c-typeck.c
branches/gcc-4_9-branch/gcc/testsuite/ChangeLog

[Bug c/62294] [4.9 Regression] Missing passing argument [...] from incompatible pointer type warning.

2014-09-03 Thread mpolacek at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62294

Marek Polacek mpolacek at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Marek Polacek mpolacek at gcc dot gnu.org ---
Fixed.  I'll add the new test to trunk as well.

[Bug c++/63140] wrong code generation probably due to optimization problem

2014-09-03 Thread gcc at boomerangsworld dot de

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63140

--- Comment #7 from Ralf Hoffmann gcc at boomerangsworld dot de ---
Created attachment 33442
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33442&action=edit
simplified example file 1

simple example containing the code piece which triggers the behavior

[Bug c++/63140] wrong code generation probably due to optimization problem

2014-09-03 Thread gcc at boomerangsworld dot de

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63140

--- Comment #8 from Ralf Hoffmann gcc at boomerangsworld dot de ---
Created attachment 33443
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33443&action=edit
aguixtest.cc

file with helper functions, not related to the problem, but required to execute

[Bug c++/63140] wrong code generation probably due to optimization problem

2014-09-03 Thread gcc at boomerangsworld dot de

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63140

--- Comment #9 from Ralf Hoffmann gcc at boomerangsworld dot de ---
Created attachment 33444
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33444&action=edit
aguixtest.hh

file with helper functions, not related to the problem, but required to execute

[Bug c++/63140] wrong code generation probably due to optimization problem

2014-09-03 Thread gcc at boomerangsworld dot de

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63140

--- Comment #10 from Ralf Hoffmann gcc at boomerangsworld dot de ---
Created attachment 33445
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33445&action=edit
build

build script used to create executable test program

[Bug c++/63140] wrong code generation probably due to optimization problem

2014-09-03 Thread gcc at boomerangsworld dot de

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63140

--- Comment #11 from Ralf Hoffmann gcc at boomerangsworld dot de ---
I managed to create a standalone test program. Attachment aguix.cc contains
the stripped down critical code segments. The two other files aguixtest.cc
and aguixtest.hh are just to make a runnable binary. The attached script
build can be used to create the binary.

The expected output is:

wait4mess2 called
waittime2: 5
Worker: msg lock element lost!
Worker: msg lock element lost!
wait4mess2 called

(this is what the binary does with gcc 4.8.1)

while with gcc 4.9.1 it will loop forever:


wait4mess2 called
waittime2: 5
waittime2: 5
waittime2: 5
waittime2: 5


Compiled with -O1 instead of -O2, the example program crashes.
Adding -fsanitize=undefined, on the other hand, makes it work again
regardless of -O1 or -O2.

[Bug libstdc++/62259] atomic class doesn't enforce required alignment on powerpc64

2014-09-03 Thread dje at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62259

David Edelsohn dje at gcc dot gnu.org changed:

   What|Removed |Added

   Keywords||wrong-code
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2014-09-03
 CC||dje at gcc dot gnu.org,
   ||redi at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #2 from David Edelsohn dje at gcc dot gnu.org ---
Confirmed.

[Bug c/62294] [4.9 Regression] Missing passing argument [...] from incompatible pointer type warning.

2014-09-03 Thread mpolacek at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62294

--- Comment #5 from Marek Polacek mpolacek at gcc dot gnu.org ---
Author: mpolacek
Date: Wed Sep  3 13:20:43 2014
New Revision: 214876

URL: https://gcc.gnu.org/viewcvs?rev=214876&root=gcc&view=rev
Log:
PR c/62294
* gcc.dg/pr62294.c: New test.
* gcc.dg/pr62294.h: New file.

Added:
trunk/gcc/testsuite/gcc.dg/pr62294.c
trunk/gcc/testsuite/gcc.dg/pr62294.h
Modified:
trunk/gcc/testsuite/ChangeLog

[Bug c/62294] [4.9 Regression] Missing passing argument [...] from incompatible pointer type warning.

2014-09-03 Thread Emmanuel.Thome at inria dot fr

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62294

--- Comment #6 from Emmanuel Thomé Emmanuel.Thome at inria dot fr ---
Thanks.

E.

[Bug other/63155] [4.9/5 Regression] memory hog

2014-09-03 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63155

Richard Biener rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2014-09-03
   Target Milestone|---                         |4.9.2
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener rguenth at gcc dot gnu.org ---
Clearly caused by the correctness fix for setjmp to wire abnormal edges.

For me it is out-of-ssa which uses too much memory while building
the conflict graph.

We have gigantic PHI nodes here:

_10263(ab) = PHI _109925(D)(ab)(2),, _10592(ab)(1489)

it's fast when optimizing.

At -O0 we have a _lot_ more anonymous SSA names.

-O1:

  <bb 4>:
  # _1(ab) = PHI <_1902(3), _2(ab)(5)>
  _1905 = _setjmp (_1(ab));
  if (_1905 == 0)
    goto <bb 6>;
  else
    goto <bb 8>;

  <bb 5>
  # _2(ab) = PHI _1895(D),  single gigantic PHI

-O0:

  <bb 4>:
  # _1(ab) = PHI <_398164(3), _2(ab)(5)>
  # _632(ab) = PHI <_397532(D)(ab)(3), _633(ab)(5)>
  # _1263(ab) = PHI <_397533(D)(ab)(3), _1264(ab)(5)>
  # _1894(ab) = PHI <_397534(D)(ab)(3), _1895(ab)(5)>
  # _2525(ab) = PHI <_397535(D)(ab)(3), _2526(ab)(5)>
...
  # _396900(ab) = PHI <_398160(D)(ab)(3), _396901(ab)(5)>
  _398165 = _setjmp (_1(ab));
  if (_398165 == 0)
    goto <bb 6>;
  else
    goto <bb 8>;

  <bb 5>
  # _2(ab) = PHI _397531(D)(ab)(2)...

  # _396901(ab) = PHI _398160(D)(ab)(2), _3...

gazillion of gigantic PHIs.  And very many PHIs in every block.

It's into-SSA that introduces the difference for the PHI nodes
but already GIMPLIFICATION that introduces very many more
temporaries which is the underlying issue (lookup_tmp_var
!optimize check).

Index: gcc/gimplify.c
===
--- gcc/gimplify.c  (revision 214810)
+++ gcc/gimplify.c  (working copy)
@@ -476,7 +476,7 @@ lookup_tmp_var (tree val, bool is_formal
  block, which means it will go into memory, causing much extra
  work in reload and final and poorer code generation, outweighing
  the extra memory allocation here.  */
-  if (!optimize || !is_formal || TREE_SIDE_EFFECTS (val))
+  if (!is_formal || TREE_SIDE_EFFECTS (val))
 ret = create_tmp_from_val (val);
   else
 {

fixes it (but it means that changing the testcase to use more distinct
user variables would produce the same issue even when optimizing).
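
As a rough illustration of the policy the one-line change above toggles,
here is a minimal sketch in C.  It is not GCC's code: lookup_tmp,
reset_temps, the string-keyed table and the reuse_enabled flag (standing
in for the optimize check) are all hypothetical names invented for this
example.  It only shows why dropping the check lets repeated lookups of
the same side-effect-free formal value share one temporary instead of
each getting a new one.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_FORMAL 64  /* arbitrary capacity for this sketch */

/* Hypothetical table mapping an expression (as text) to its temporary. */
static const char *formal_expr[MAX_FORMAL];
static int formal_tmp[MAX_FORMAL];
static int formal_count;
static int next_tmp;  /* id of the next fresh temporary */

/* Forget all temporaries, e.g. between two experiments. */
void reset_temps(void)
{
    formal_count = 0;
    next_tmp = 0;
}

/* Return the id of a temporary holding EXPR.  A fresh one is created
   unless reuse is enabled, the value is formal and it has no side
   effects; in that case an existing temporary for EXPR is reused. */
int lookup_tmp(const char *expr, bool reuse_enabled, bool is_formal,
               bool has_side_effects)
{
    if (!reuse_enabled || !is_formal || has_side_effects)
        return next_tmp++;

    for (int i = 0; i < formal_count; i++)
        if (strcmp(formal_expr[i], expr) == 0)
            return formal_tmp[i];  /* same expression: reuse */

    /* First time this expression is seen: record a new temporary. */
    assert(formal_count < MAX_FORMAL);
    formal_expr[formal_count] = expr;
    formal_tmp[formal_count] = next_tmp;
    formal_count++;
    return next_tmp++;
}
```

With reuse disabled every call hands out a new id, which corresponds to
the many anonymous SSA names seen at -O0; with reuse enabled, repeated
lookups of the same expression return the same id.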

[Bug tree-optimization/58526] Inlining looses restrict qualifier and leads to loop versioned vectorization

2014-09-03 Thread burnus at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58526

--- Comment #2 from Tobias Burnus burnus at gcc dot gnu.org ---
See also RFC patch at https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00232.html

[Bug other/63155] [4.9/5 Regression] memory hog

2014-09-03 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63155

--- Comment #2 from Richard Biener rguenth at gcc dot gnu.org ---
I wonder why we need to explicitly represent abnormal PHIs in the dispatcher.
All incoming edges are abnormal and all SSA names have to be coalesced anyway.
Thus we could instead have

  <bb 5>:
/* Not: # _2(ab) = PHI <_17(D)(ab)(2), _1(ab)(6), _1(ab)(7), _3(ab)(11),
_3(ab)(12), _4(ab)(15), _4(ab)(16), _5(ab)(20), _5(ab)(21), _5(ab)(22)> */
  ABNORMAL_DISPATCHER (0);
  _2(ab) = D.12345;

or simply rewrite all must-coalesce vars out-of-SSA?  (or not into SSA
in the first place)

The question is whether accesses to them should be loads/stores (I think so)
and if that will cause other similar issues.

We'd have to factor abnormal edges into a block to a separate forwarder
of course, with a load of all abnormal vars.

Anyway, not sure why the gimplify code is disabled for -O0 (or why we
don't re-use formal temps more aggressively as they become anonymous
SSA names later anyway).

[Bug tree-optimization/55334] [4.8/4.9/5 Regression] mgrid regression (ipa-cp disables vectorization)

2014-09-03 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55334

--- Comment #41 from Richard Biener rguenth at gcc dot gnu.org ---
New attempt: https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00232.html

[Bug ipa/61986] ICE on valid code at -O3 on x86_64-linux-gnu in decide_about_value, at ipa-cp.c:3480

2014-09-03 Thread jamborm at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61986

--- Comment #3 from Martin Jambor jamborm at gcc dot gnu.org ---
Author: jamborm
Date: Wed Sep  3 14:16:54 2014
New Revision: 214877

URL: https://gcc.gnu.org/viewcvs?rev=214877&root=gcc&view=rev
Log:
2014-09-03  Martin Jambor  mjam...@suse.cz

PR ipa/61986
* ipa-cp.c (find_aggregate_values_for_callers_subset): Chain
created replacements in ascending order of offsets.
(known_aggs_to_agg_replacement_list): Likewise.

gcc/testsuite/
* gcc.dg/ipa/pr61986.c: New test.


Added:
trunk/gcc/testsuite/gcc.dg/ipa/pr61986.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/ipa-cp.c
trunk/gcc/testsuite/ChangeLog

[Bug other/63155] [4.9/5 Regression] memory hog

2014-09-03 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63155

--- Comment #3 from Richard Biener rguenth at gcc dot gnu.org ---
So the issue is that the setjmp argument needs two temporaries:

  D.2832 = Unity.CurrentAbortFrame;
  D.2833 = Unity.AbortFrame[D.2832];

  <bb 18>:
  D.2834 = _setjmp (D.2833);

and the EH edge going into the _setjmp call has to merge those through
the abnormal dispatcher.  And that way it receives all of them.  Hmm.

Huh.  Without the abnormal dispatcher they should just get default defs
everywhere (but still many PHI nodes).  Maybe that would be more light-weight.
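
For readers without the testcase at hand, the source pattern behind the
two temporaries can be sketched roughly as follows.  This is an
assumption-laden reduction, not the actual testcase: only the field
names Unity.CurrentAbortFrame and Unity.AbortFrame come from the dump
above, while MAX_FRAMES, TEST_PROTECT, abort_test and run_tests are
hypothetical names chosen for the illustration.

```c
#include <assert.h>
#include <setjmp.h>

#define MAX_FRAMES 4  /* arbitrary for this sketch */

/* Hypothetical reduction of the state the dump refers to. */
static struct {
    int CurrentAbortFrame;
    jmp_buf AbortFrame[MAX_FRAMES];
} Unity;

/* Abort the current test by jumping back to its setjmp call. */
static void abort_test(void)
{
    assert(Unity.CurrentAbortFrame < MAX_FRAMES);
    longjmp(Unity.AbortFrame[Unity.CurrentAbortFrame], 1);
}

/* Each use loads CurrentAbortFrame, indexes AbortFrame and then calls
   setjmp: the two temporaries shown in the dump above. */
#define TEST_PROTECT() (setjmp(Unity.AbortFrame[Unity.CurrentAbortFrame]) == 0)

/* Runs three protected tests, the second of which aborts.
   Returns the number of tests that ran to completion. */
int run_tests(void)
{
    volatile int completed = 0;  /* volatile so the value survives longjmp */

    if (TEST_PROTECT()) { completed++; }
    if (TEST_PROTECT()) { abort_test(); completed++; }
    if (TEST_PROTECT()) { completed++; }
    return completed;
}
```

A function containing very many such TEST_PROTECT uses would have one
setjmp call, and hence one pair of temporaries, per use, all of which
meet at the abnormal dispatcher.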

[Bug ipa/62015] [4.8/4.9/5 Regression] ipa-cp-clone uses a clone that is too specialized for the call context

2014-09-03 Thread jamborm at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62015

--- Comment #4 from Martin Jambor jamborm at gcc dot gnu.org ---
Author: jamborm
Date: Wed Sep  3 14:26:38 2014
New Revision: 214878

URL: https://gcc.gnu.org/viewcvs?rev=214878&root=gcc&view=rev
Log:
2014-09-03  Martin Jambor  mjam...@suse.cz

PR ipa/62015
* ipa-cp.c (intersect_aggregates_with_edge): Handle impermissible
pass-through jump functions correctly.

testsuite/
* g++.dg/ipa/pr62015.C: New test.


Added:
trunk/gcc/testsuite/g++.dg/ipa/pr62015.C
Modified:
trunk/gcc/ChangeLog
trunk/gcc/ipa-cp.c
trunk/gcc/testsuite/ChangeLog

[Bug c++/57335] internal compiler error: in cxx_eval_bit_field_ref, at cp/semantics.c:6977

2014-09-03 Thread paolo.carlini at oracle dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57335

Paolo Carlini paolo.carlini at oracle dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |ice-on-valid-code

--- Comment #3 from Paolo Carlini paolo.carlini at oracle dot com ---
... but we ICE with the testcase adjusted too.

[Bug ipa/61986] ICE on valid code at -O3 on x86_64-linux-gnu in decide_about_value, at ipa-cp.c:3480

2014-09-03 Thread jamborm at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61986

--- Comment #4 from Martin Jambor jamborm at gcc dot gnu.org ---
I can reproduce the bug on the 4.9 branch too, and the code is the same
in 4.8 as well (although the bug does not manifest for me there), so
please keep this bug open until I commit the same fix to the two
branches, which will happen right after my bootstrap and testing
finish.

[Bug libstdc++/62259] atomic class doesn't enforce required alignment on powerpc64

2014-09-03 Thread redi at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62259

--- Comment #3 from Jonathan Wakely redi at gcc dot gnu.org ---
(In reply to saugustine from comment #0)
 My uneducated guess is that the template at atomic:189 should either use
 _M_i in calls to __atomic_is_lock_free (instead of nullptr) or should add
 alignment as necessary. Not sure how that is intended to be done. If I fix
 atomic to pass the pointer, then gcc chooses to call out to an atomic
 library function, which gcc doesn't provide.

GCC does provide it, in libatomic, so -latomic should work.

But I just tried your suggested change and saw no effect: I didn't need
libatomic and I still got a bus error.

I suppose what we want is the equivalent of this, but the _Atomic keyword isn't
valid in C++:

--- a/libstdc++-v3/include/std/atomic
+++ b/libstdc++-v3/include/std/atomic
@@ -161,7 +161,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct atomic
 {
 private:
-  _Tp _M_i;
+  alignas(alignof(_Atomic _Tp)) _Tp _M_i;

   // TODO: static_assert(is_trivially_copyable<_Tp>::value, "");
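
Since _Atomic is valid in C11, the intent of the diff can at least be
sketched in C.  The following is only an illustration under stated
assumptions: struct pair, struct atomic_pair and the three helper
functions are hypothetical names, and the alignment values they report
are target- and compiler-dependent, so none are quoted here.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Hypothetical payload; any struct of two longs would do. */
struct pair {
    long first;
    long second;
};

/* Wrapper whose member is given the alignment of the _Atomic-qualified
   type, mirroring the alignas(alignof(_Atomic _Tp)) idea in the diff. */
struct atomic_pair {
    alignas(alignof(_Atomic struct pair)) struct pair value;
};

/* The wrapper must be at least as strictly aligned as the plain type. */
static_assert(alignof(struct atomic_pair) >= alignof(struct pair),
              "wrapper must not weaken the alignment");

/* Alignment of the plain payload type. */
size_t plain_alignment(void)   { return alignof(struct pair); }

/* Alignment the compiler wants for the atomic version of the payload. */
size_t atomic_alignment(void)  { return alignof(_Atomic struct pair); }

/* Alignment actually given to the wrapper. */
size_t wrapper_alignment(void) { return alignof(struct atomic_pair); }
```

On a target where the atomic type needs stricter alignment than the
plain one, atomic_alignment() exceeds plain_alignment(), and the wrapper
picks up the stricter value, which is what the C++ change is after.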

[Bug fortran/62270] -Wlogical-not-parentheses warnings

2014-09-03 Thread mpolacek at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62270

--- Comment #7 from Marek Polacek mpolacek at gcc dot gnu.org ---
Author: mpolacek
Date: Wed Sep  3 16:04:27 2014
New Revision: 214881

URL: https://gcc.gnu.org/viewcvs?rev=214881&root=gcc&view=rev
Log:
PR fortran/62270
* interface.c (compare_parameter): Fix condition.
* trans-expr.c (gfc_conv_procedure_call): Likewise.

* gfortran.dg/pointer_intent_7.f90: Adjust dg-error.

Modified:
branches/gcc-4_9-branch/gcc/fortran/ChangeLog
branches/gcc-4_9-branch/gcc/fortran/interface.c
branches/gcc-4_9-branch/gcc/fortran/trans-expr.c
branches/gcc-4_9-branch/gcc/testsuite/ChangeLog
branches/gcc-4_9-branch/gcc/testsuite/gfortran.dg/pointer_intent_7.f90
