Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-12 Thread Peter Zijlstra
 I don't know the specifics of your example, but from how I understand
 it, I don't see a problem if the compiler can prove that the store will
 always happen.
 
 To be more specific, if the compiler can prove that the store will
 happen anyway, and the region of code can be assumed to always run
 atomically (e.g., there's no loop or such in there), then it is known
 that we have one atomic region of code that will always perform the
 store, so we might as well do the stuff in the region in some order.
 
 Now, if any of the memory accesses are atomic, then the whole region of
 code containing those accesses is often not atomic because other threads
 might observe intermediate results in a data-race-free way.
 
 (I know that this isn't a very precise formulation, but I hope it brings
 my line of reasoning across.)

So given something like:

  if (x)
    y = 3;

assuming both x and y are atomic (so don't gimme crap for not knowing
the C11 atomic incantations); and you can prove x is always true; you
don't see a problem with not emitting the conditional?

Avoiding the conditional changes the result; see that control dependency
email from earlier. In the above example the load of X and the store to
Y are strictly ordered, due to control dependencies. Not emitting the
condition and maybe not even emitting the load completely wrecks this.

It's therefore an invalid optimization to take out the conditional or
speculate the store, since it takes out the dependency.
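
For reference, the fragment above written with explicit C11 atomics might look like the sketch below. The function name store_if_set and the choice of memory_order_relaxed are assumptions made for illustration; the thread does not fix either. The question being argued is whether a compiler that can prove x is always nonzero may drop the load and the branch and emit only the store.

```c
#include <stdatomic.h>

static atomic_int x;
static atomic_int y;

/* "if (x) y = 3;" spelled with explicit C11 atomics.
 * Hypothetical helper: returns 1 if the store was performed, 0 otherwise. */
int store_if_set(void)
{
    if (atomic_load_explicit(&x, memory_order_relaxed)) {
        /* The store is control-dependent on the load of x. */
        atomic_store_explicit(&y, 3, memory_order_relaxed);
        return 1;
    }
    return 0;
}
```

On the hardware side the ordering comes from the branch: the store cannot be made visible before the load that decides whether it happens at all. That is the property lost if the conditional is optimized away.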


gnat.dg test: div_no_warning.adb

2014-02-12 Thread PARAT Didier
Hi,

I'm trying to use the tests in gcc/testsuite/gnat.dg and I'm having trouble 
understanding one particular test: gnat.dg/div_no_warning.adb. If I 
understand correctly, this test is expected to compile without any error or 
warning. When I compile this test with the native compiler (gnatmake), I get
neither an error nor a warning. But when I compile it with a GCC port to a
private target, I get the following output:

div_no_warning.adb:13:20: warning: division by zero
div_no_warning.adb:13:20: warning: Constraint_Error will be raised at run time

The source looks like:
.
4  :    Flag : constant Boolean := False;
.
12:   if Flag and then F then
13:  Int := Int / 0;
14:   end if;
.

I checked the assembler output produced by my compiler; it does not contain any
code corresponding to the division or to the if statement shown above. So I guess
the compiler has detected that the if False and then . is dead code, but I still
get the warnings.

I'm having a hard time finding what kind of option is activated/suppressed in 
the native compiler so it does not output those warnings.

I'm working on GCC 4.7.3 and GNAT 7.1.2.

Any help appreciated.

Thanks,
Didier


m68k optimisation for beginners?

2014-02-12 Thread Fredrik Olsson
Hi.

I would like to get started with how to improve code generation for a
backend. Any pointers, especially to good documentation, are welcome.

For this example consider this C function for a reference counted type:
void TCRelease(TCTypeRef tc) {
  if (--tc->retainCount == 0) {
    if (tc->destroy) {
      tc->destroy(tc);
    }
    free((void *)tc);
  }
}

The generated m68k asm is this:
_TCRelease:
        move.l %a2,-(%sp)
        move.l 8(%sp),%a2
        move.w (%a2),%d0        ; Question 1:
        subq.w #1,%d0
        move.w %d0,(%a2)
        jne .L7
        move.l 4(%a2),%a0       ; Question 2:
        cmp.w #0,%a0
        jeq .L9
        move.l %a2,-(%sp)       ; Question 3:
        jsr (%a0)
        addq.l #4,%sp
.L9:
        move.l %a2,8(%sp)
        move.l (%sp)+,%a2
        jra _free
.L7:
        move.l (%sp)+,%a2
        rts

Question 1:
This could be done as one instruction, subq.w #1,(%a2): the result
in d0 is never used again, and subtracting directly in memory will update
the status flags. It would save 4 bytes, and 8 cycles on a 68000.
How would I attack this problem? Peephole optimisation, or maybe
GCC is not aware that the instruction updates the flags?

Question 2:
Doing this as a move.l 4(%a2),%d0 to a temporary data register
would update the status register, allowing for the branch without the
compare-with-immediate instruction. This obviously requires an extra
move %d0,%a0 if the branch is not taken, to be able to make the jump. But
that is still 2 bytes and 8 cycles saved in the worst case (12 cycles in
the best case).
Is this a peephole optimisation? Or is it about providing accurate
instruction costs for inst?

Question 3:
Storing a2 on the stack is only ever needed if this code path is
taken. Is this even worth bothering with? And is this something that
moving from reload to LRA for the m68k target solves?

// Fredrik Olsson
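
For anyone who wants to reproduce the asm, a self-contained version of the function could look like the sketch below. The post does not show the type, so the TCType layout here is an assumption: a 16-bit count at offset 0 and the destroy pointer at offset 4, as the move.w (%a2) and move.l 4(%a2) accesses suggest. The reserved field, the destroy_calls counter and count_destroy are hypothetical and exist only to pad the layout and to exercise the code.

```c
#include <stdlib.h>

/* Hypothetical layout: not taken from the original post. */
typedef struct TCType {
    short retainCount;                  /* 16-bit reference count     */
    short reserved;                     /* assumed padding field      */
    void (*destroy)(struct TCType *);   /* optional destructor hook   */
} TCType;
typedef TCType *TCTypeRef;

void TCRelease(TCTypeRef tc)
{
    if (--tc->retainCount == 0) {
        if (tc->destroy) {
            tc->destroy(tc);
        }
        free((void *)tc);
    }
}

/* Hypothetical destructor used only to observe that the hook ran. */
int destroy_calls;
void count_destroy(TCTypeRef tc)
{
    (void)tc;
    destroy_calls++;
}
```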


Re: Fwd: LLVM collaboration?

2014-02-12 Thread Richard Biener
On Tue, Feb 11, 2014 at 10:20 PM, Jan Hubicka hubi...@ucw.cz wrote:
  Since both toolchains do the magic, binutils has no incentive to
  create any automatic detection of objects.

 It is mostly a historical decision. At the time the design was for the
 plugin to be matched to the compiler, and so the compiler could pass
 that information down to the linker.

  The trouble however is that one needs to pass explicit --plugin argument
  specifying the particular plugin to load and so GCC ships with its own 
  wrappers
  (gcc-nm/gcc-ld/gcc-ar and the gcc driver itself) while LLVM does similar 
  thing.

 These wrappers should not be necessary. While the linker currently
 requires a command line option, bfd has support for searching for a
 plugin. It will search inst/lib/bfd-plugin. See for example the
 instructions at http://llvm.org/docs/GoldPlugin.html.

 My reading of bfd/plugin.c is that it basically walks the directory and looks
 for the first plugin that returns OK for onload (that is always the case for
 GCC/LLVM plugins).  So if I install the GCC and LLVM plugins there, it will
 depend on which ends up being first, and only that plugin will be used.

 We need multiple plugin support as suggested by the directory name ;)

 Also it seems that currently the plugin is not used if the file is ELF for ar/nm/ranlib
 (as mentioned by Markus) and also GNU ld seems to choke on LLVM object files
 even if it has the plugin.

 This probably needs to be sanitized.


 This was done because ar and nm are not normally bound to any
 compiler. Had we realized this issue earlier we would probably have
 supported searching for plugins in the linker too.

 So it seems that what you want could be done by

 * having bfd-ld and gold search bfd-plugins (maybe rename the directory?)
 * support loading multiple plugins, and asking each to see if it
 supports a given file. That way we could do LTO when having a part-GCC
 and part-LLVM build.

 Yes, that is what I have in mind.

 Plus perhaps an additional configuration file to avoid loading everything.  Say
 the user installs 3 versions of LLVM, open64 and ICC. If all of them load as a
 shared library, like LLVM does, it will probably slow down the tools
 measurably.

What about instead of our current odd way of identifying LTO objects
simply add a special ELF note telling the linker the plugin to use?

.note._linker_plugin '/./libltoplugin.so'

that way the linker should try 1) loading that plugin, 2) register the
specific object with that plugin.

If a full path is undesired (depends on install setup) then specifying
the plugin SONAME might also work (we'd of course need to bump
our plugins SONAME for each release to allow parallel install
of multiple versions or make the plugin contain all the
dispatch-to-different-GCC-version-lto-wrapper code).

Richard.

 * maybe be smart about version and load new ones first? (libLLVM-3.4
 before libLLVM-3.3 for example). Probably the first one should always
 be the one given in the command line.

 Yes, I think we may want to prioritize the list, so the user can prefer
 his own version of GCC over the system one, for example.

 For OS X the situation is a bit different. There instead of a plugin
 the linker loads a library: libLTO.dylib. When doing LTO with a newer
 llvm, one needs to set DYLD_LIBRARY_PATH. I think I proposed setting
 that from clang some time ago, but I don't remember the outcome.

 In theory GCC could implement a libLTO.dylib and set
 DYLD_LIBRARY_PATH. The gold/bfd plugin that LLVM uses is basically a
 API mapping the other way, so the job would be inverting it. The LTO
 model ld64 is a bit more strict about knowing all symbol definitions
 and uses (including inline asm), so there would be work to be done to
 cover that, but the simple cases shouldn't be too hard.

 I would not care that much about symbols in asm definitions to start with.
 Even if we will force users to non-LTO those object files, it would be an
 improvement over what we have now.

 One problem is that we need a volunteer to implement the reverse glue
 (libLTO-plugin API), since I do not have an OS X box (well, have an old G5,
 but even that is quite far from me right now)

 Why are complete symbol tables required? Can't ld64 be changed to ignore
 unresolved symbols in the first stage just like gold/GNU ld do?

 Honza

 Cheers,
 Rafael
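
The "load several plugins and ask each whether it supports a given file" idea discussed in this message can be sketched as a small dispatch loop. Everything here is hypothetical: the lto_plugin record, the plugin names and find_plugin are made up for illustration and are not the ld plugin API. The magic-number checks are also a simplification, since a real GCC plugin has to look inside an ELF file for LTO sections rather than claim every ELF object.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical plugin record; the real ld plugin API is not modelled. */
typedef struct {
    const char *name;
    int (*claims)(const unsigned char *buf, size_t len);
} lto_plugin;

/* ELF files start with 0x7f 'E' 'L' 'F'. */
static int claims_elf(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };
    return len >= 4 && memcmp(buf, magic, 4) == 0;
}

/* Raw LLVM bitcode starts with 'B' 'C' 0xC0 0xDE. */
static int claims_bitcode(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 'B', 'C', 0xC0, 0xDE };
    return len >= 4 && memcmp(buf, magic, 4) == 0;
}

/* Priority order: earlier entries are asked first. */
static const lto_plugin plugins[] = {
    { "gcc-lto (hypothetical)",  claims_elf },
    { "llvm-lto (hypothetical)", claims_bitcode },
};

/* Returns the name of the first plugin that claims the file, or NULL. */
const char *find_plugin(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < sizeof plugins / sizeof plugins[0]; i++)
        if (plugins[i].claims(buf, len))
            return plugins[i].name;
    return NULL;
}
```

Ordering the table is where the prioritization mentioned above would go, for example a plugin named on the command line first and newer versions before older ones.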


Re: sparse overlapping structs for vectorization

2014-02-12 Thread Richard Biener
On Wed, Feb 12, 2014 at 7:21 AM, Albert Cahalan acaha...@gmail.com wrote:
 I had a problem that got solved in an ugly way. I think gcc ought
 to provide a few ways to make a nicer solution.

 There was an array of structs roughly like so:

 struct{int w;float x;char y[4];short z[2];}foo[512][4];

 The types within the struct are 4 bytes each; I don't actually
 remember anything else and it doesn't matter except that they
 are distinct. I think it was bitfields actually, neatly grouped
 into groups of 32 bits. In other words, like 4 4-byte values
 but with more-or-less incompatible types.

 Note that 4 of the structs neatly fill a 64-byte cache line.
 An alignment attribute was used to ensure 64-byte alignment.

 The most common operation needed on this array is to compare
 the first struct member of 4 of the structs against a given
 value, looking to see if there is a match. SSE would be good.
 This would then be followed by using the matching entry if
 there is one, else picking one of the 4 to recycle and thus use.

 First bad solution:

 One could load up 4 SSE registers, shuffle things around... NO.

 Second bad solution:

 One could simply have 4 distinct arrays. This is bad because
 there are different cache lines for w, x, y, and z.

 Third bad solution:

 The array can be viewed as int foo[512][4][4] instead, with
 the struct forming the third array index. Note that the last two
 array indexes are both 4, so you can kind of swap them around.
 This groups 4 fields of each type together, allowing SSE. The
 problem here is loss of type safety; one must use array indexes
 instead of struct field names. Like so: foo[idx][WHERE_W_IS][i]

 Fourth bad solution:

 We lay things out as in the third solution, but we cast pointers
 to effectively lay sparse structs over each other like shingles.
 {
 int w;
 int pad_wx[3];
 float x;
 int pad_xy[3];
 char y[4];
 int pad_yz[3];
 short z[2];
 }
 Performance is hurt by the need for __may_alias__ and of course
 the result is painful to look at. We went with this anyway, using
 SSE intrinsics, and performance was great. Maintainability... not
 so much.

 BTW, an array of 512 structs containing 4-entry arrays was not used
 because we wanted to have a simple normal pointer to indicate the
 item being operated on. We didn't want to need a pointer,index pair.

 Can something be done to help out here? The first thing that pops
 into mind is the ability to tell gcc that the struct-to-struct
 byte offset for array indexing is a user-specified value instead
 of simply the struct size.

 It's possible we could have safely ignored the warning about aliasing.
 I don't know. Perhaps that would give even better performance, but
 the casting would still be very ugly.

 Solutions that can be defined away for non-gcc compilers are better.

Do the overlay but use an overlay of type char[large enough] and
load from that.  Should be more maintainable than using may_alias
and also work with other compilers.

Richard.
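
One way to read this suggestion is sketched below: keep the field-major layout of the third solution, store each 64-byte line as plain bytes, and go through small typed accessors that copy in and out with memcpy. The names line_t, get_w, set_w, get_x, set_x and find_w are assumptions, as are the offsets (w at 0, x at 16, y at 32, z at 48). The scalar loop in find_w stands in for the SSE compare from the original post.

```c
#include <stdint.h>
#include <string.h>

enum { WAYS = 4, FIELD_BYTES = 4, LINE_BYTES = 64 };
enum { OFF_W = 0, OFF_X = 16, OFF_Y = 32, OFF_Z = 48 };

/* One cache line: four w values, then four x, four y, four z. */
typedef struct {
    _Alignas(64) unsigned char bytes[LINE_BYTES];
} line_t;

static inline int32_t get_w(const line_t *l, int way)
{
    int32_t v;
    memcpy(&v, l->bytes + OFF_W + way * FIELD_BYTES, sizeof v);
    return v;
}

static inline void set_w(line_t *l, int way, int32_t v)
{
    memcpy(l->bytes + OFF_W + way * FIELD_BYTES, &v, sizeof v);
}

static inline float get_x(const line_t *l, int way)
{
    float v;
    memcpy(&v, l->bytes + OFF_X + way * FIELD_BYTES, sizeof v);
    return v;
}

static inline void set_x(line_t *l, int way, float v)
{
    memcpy(l->bytes + OFF_X + way * FIELD_BYTES, &v, sizeof v);
}

/* Returns the way whose w field equals key, or -1 if none matches. */
int find_w(const line_t *l, int32_t key)
{
    for (int way = 0; way < WAYS; way++)
        if (get_w(l, way) == key)
            return way;
    return -1;
}
```

Accessing the storage as unsigned char is allowed to alias any object type, so no may_alias attribute is needed, and the field names come back as accessor names instead of array indexes.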


Re: Fwd: LLVM collaboration?

2014-02-12 Thread Rafael Espíndola
 What about instead of our current odd way of identifying LTO objects
 simply add a special ELF note telling the linker the plugin to use?

 .note._linker_plugin '/./libltoplugin.so'

 that way the linker should try 1) loading that plugin, 2) register the
 specific object with that plugin.

 If a full path is undesired (depends on install setup) then specifying
 the plugin SONAME might also work (we'd of course need to bump
 our plugins SONAME for each release to allow parallel install
 of multiple versions or make the plugin contain all the
 dispatch-to-different-GCC-version-lto-wrapper code).

Might be an interesting addition to what we have, but keep in mind
that LLVM uses thin non-ELF files. It is also able to load IR from
previous versions, so for LLVM at least, using the newest plugin is
probably the best default.

 Richard.

Cheers,
Rafael


Re: m68k optimisation for beginners?

2014-02-12 Thread Jeff Law

On 02/12/14 02:37, Fredrik Olsson wrote:

Hi.

I would like to get started with how to improve code generation for a
backend. Any pointers, especially to good documentation, are welcome.

For this example consider this C function for a reference counted type:
void TCRelease(TCTypeRef tc) {
  if (--tc->retainCount == 0) {
    if (tc->destroy) {
      tc->destroy(tc);
    }
    free((void *)tc);
  }
}

The generated m68k asm is this:
_TCRelease:
        move.l %a2,-(%sp)
        move.l 8(%sp),%a2
        move.w (%a2),%d0        ; Question 1:
        subq.w #1,%d0
        move.w %d0,(%a2)
        jne .L7
        move.l 4(%a2),%a0       ; Question 2:
        cmp.w #0,%a0
        jeq .L9
        move.l %a2,-(%sp)       ; Question 3:
        jsr (%a0)
        addq.l #4,%sp
.L9:
        move.l %a2,8(%sp)
        move.l (%sp)+,%a2
        jra _free
.L7:
        move.l (%sp)+,%a2
        rts

Question 1:
This could be done as one instruction, subq.w #1,(%a2): the result
in d0 is never used again, and subtracting directly in memory will update
the status flags. It would save 4 bytes, and 8 cycles on a 68000.
How would I attack this problem? Peephole optimisation, or maybe
GCC is not aware that the instruction updates the flags?

Most likely an issue in the combiner.  Prior to conversion to RTL the
decrement is turned into a three statement format (load from mem, 
decrement, store back to memory).  The decremented value is used in the 
comparison.  So I can reasonably guess the combiner is unable to squash 
all that back into a single insn.
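
As a rough picture of that three-statement form, written back out as C (this is only an illustration, not GCC's actual GIMPLE or RTL, and the name decrement_and_test is made up):

```c
/* Returns nonzero when the count reaches zero. */
int decrement_and_test(short *count)
{
    short tmp = *count;   /* 1: load from memory              */
    tmp = tmp - 1;        /* 2: decrement                     */
    *count = tmp;         /* 3: store back to memory          */
    return tmp == 0;      /* the comparison uses the new tmp  */
}
```

To get the single read-modify-write instruction, the combiner has to fold statements 1 to 3 and still leave the comparison something to consume.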


Also note that flags are effectively not exposed on the m68k. Instead a 
conditional branch is modeled as two insns.  One which sets a special 
register, cc0 and one that uses the cc0 register.  Those two insns are 
kept consecutive throughout the RTL optimizers and only during final 
assembly do we try to eliminate the compare by tracking the state of the 
flags register.


There are better ways to do that, but nobody has converted the m68k to 
the newer style.  It's a fair amount of work and not a high priority.




Question 2:
Doing this as a move.l 4(%a2),%d0 to a temporary data register
would update the status register, allowing for the branch without the
compare-with-immediate instruction. This obviously requires an extra
move %d0,%a0 if the branch is not taken, to be able to make the jump. But
that is still 2 bytes and 8 cycles saved in the worst case (12 cycles in
the best case).
Is this a peephole optimisation? Or is it about providing accurate
instruction costs for inst?

Can't be tackled without first fixing how we track the flags register.



Question 3:
Storing a2 on the stack is only ever needed if this code path is
taken. Is this even worth bothering with? And is this something that
moving from reload to LRA for the m68k target solves?

This is called shrink wrapping.  GCC has some limited support for
shrink-wrapping these days.  Someone would have to look into why the 
shrink-wrapping optimization did not apply here.


Jeff




Re: Fwd: LLVM collaboration?

2014-02-12 Thread Joseph S. Myers
On Wed, 12 Feb 2014, Richard Biener wrote:

 What about instead of our current odd way of identifying LTO objects
 simply add a special ELF note telling the linker the plugin to use?
 
 .note._linker_plugin '/./libltoplugin.so'
 
 that way the linker should try 1) loading that plugin, 2) register the
 specific object with that plugin.

Unless this is only allowed for a whitelist of known-good plugins in 
known-good directories, it's a clear security hole for the linker to 
execute code in arbitrary files named by linker input.  The linker should 
be safe to run on untrusted input files.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Fwd: LLVM collaboration?

2014-02-12 Thread Jan Hubicka
 On Wed, 12 Feb 2014, Richard Biener wrote:
 
  What about instead of our current odd way of identifying LTO objects
  simply add a special ELF note telling the linker the plugin to use?
  
  .note._linker_plugin '/./libltoplugin.so'
  
  that way the linker should try 1) loading that plugin, 2) register the
  specific object with that plugin.
 
 Unless this is only allowed for a whitelist of known-good plugins in 
 known-good directories, it's a clear security hole for the linker to 
 execute code in arbitrary files named by linker input.  The linker should 
 be safe to run on untrusted input files.

Also I believe the files should be independent of the particular setup (that is,
not contain a path) and probably of the host OS (that is, not have a .so
extension) at least.
We need some versioning scheme for different versions of compilers.
Finally we need a solution for non-ELF LTO objects (like LLVM).

But yes, having a compiler-independent way of declaring that a plugin is needed
and which plugin should be used seems possible.

Honza
 
 -- 
 Joseph S. Myers
 jos...@codesourcery.com


Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-12 Thread Paul E. McKenney
On Tue, Feb 11, 2014 at 10:06:34PM -0800, Torvald Riegel wrote:
 On Tue, 2014-02-11 at 07:59 -0800, Paul E. McKenney wrote:
  On Mon, Feb 10, 2014 at 11:09:24AM -0800, Linus Torvalds wrote:
   On Sun, Feb 9, 2014 at 4:27 PM, Torvald Riegel trie...@redhat.com wrote:
   
    Intuitively, this is wrong because this lets the program take a step
    the abstract machine wouldn't do.  This is different to the sequential
    code that Peter posted because it uses atomics, and thus one can't
    easily assume that the difference is not observable.
   
   Btw, what is the definition of observable for the atomics?
   
   Because I'm hoping that it's not the same as for volatiles, where
   observable is about the virtual machine itself, and as such volatile
   accesses cannot be combined or optimized at all.
   
   Now, I claim that atomic accesses cannot be done speculatively for
   writes, and not re-done for reads (because the value could change),
   but *combining* them would be possible and good.
   
   For example, we often have multiple independent atomic accesses that
   could certainly be combined: testing the individual bits of an atomic
   value with helper functions, causing things like load atomic, test
   bit, load same atomic, test another bit. The two atomic loads could
   be done as a single load without possibly changing semantics on a real
   machine, but if visibility is defined in the same way it is for
   volatile, that wouldn't be a valid transformation. Right now we use
   volatile semantics for these kinds of things, and they really can
   hurt.
   
   Same goes for multiple writes (possibly due to setting bits):
   combining multiple accesses into a single one is generally fine, it's
   *adding* write accesses speculatively that is broken by design..
   
   At the same time, you can't combine atomic loads or stores infinitely
   - visibility on a real machine definitely is about timeliness.
   Removing all but the last write when there are multiple consecutive
   writes is generally fine, even if you unroll a loop to generate those
   writes. But if what remains is a loop, it might be a busy-loop
   basically waiting for something, so it would be wrong (untimely) to
   hoist a store in a loop entirely past the end of the loop, or hoist a
   load in a loop to before the loop.
   
   Does the standard allow for that kind of behavior?
  
  You asked!  ;-)
  
  So the current standard allows merging of both loads and stores, unless of
  course ordering constraints prevent the merging.  Volatile semantics may be
  used to prevent this merging, if desired, for example, for real-time code.
 
 Agreed.
 
  Infinite merging is intended to be prohibited, but I am not certain that
  the current wording is bullet-proof (1.10p24 and 1.10p25).
 
 Yeah, maybe not.  But it at least seems to rather clearly indicate the
 intent ;)

That is my hope.  ;-)

  The only prohibition against speculative stores that I can see is in a
  non-normative note, and it can be argued to apply only to things that are
  not atomics (1.10p22).
 
 I think this one is specifically about speculative stores that would
 affect memory locations that the abstract machine would not write to,
 and that might be observable or create data races.  While a compiler
 could potentially prove that such stores aren't leading to a difference
 in the behavior of the program (e.g., by proving that there are no
 observers anywhere and this isn't overlapping with any volatile
 locations), I think that this is hard in general and most compilers will
 just not do such things.  In GCC, bugs in that category were fixed after
 researchers doing fuzz-testing found them (IIRC, speculative stores by
 loops).

And that is my fear.  ;-)

  I don't see any prohibition against reordering
  a store to precede a load preceding a conditional branch -- which would
  not be speculative if the branch was known to be taken and the load
  hit in the store buffer.  In a system where stores could be reordered,
  some other CPU might perceive the store as happening before the load
  that controlled the conditional branch.  This needs to be addressed.
 
 I don't know the specifics of your example, but from how I understand
 it, I don't see a problem if the compiler can prove that the store will
 always happen.

The current Documentation/memory-barriers.txt formulation requires
that both the load and the store have volatile semantics.  Does
that help?

 To be more specific, if the compiler can prove that the store will
 happen anyway, and the region of code can be assumed to always run
 atomically (e.g., there's no loop or such in there), then it is known
 that we have one atomic region of code that will always perform the
 store, so we might as well do the stuff in the region in some order.

And it would be very hard to write a program that proved that the
store had been reordered prior to the load in this case.

 Now, if any of the memory accesses are atomic, then the 

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-12 Thread Paul E. McKenney
On Wed, Feb 12, 2014 at 10:19:07AM +0100, Peter Zijlstra wrote:
  I don't know the specifics of your example, but from how I understand
  it, I don't see a problem if the compiler can prove that the store will
  always happen.
  
  To be more specific, if the compiler can prove that the store will
  happen anyway, and the region of code can be assumed to always run
  atomically (e.g., there's no loop or such in there), then it is known
  that we have one atomic region of code that will always perform the
  store, so we might as well do the stuff in the region in some order.
  
  Now, if any of the memory accesses are atomic, then the whole region of
  code containing those accesses is often not atomic because other threads
  might observe intermediate results in a data-race-free way.
  
  (I know that this isn't a very precise formulation, but I hope it brings
  my line of reasoning across.)
 
 So given something like:
 
   if (x)
   y = 3;
 
 assuming both x and y are atomic (so don't gimme crap for not knowing
 the C11 atomic incantations); and you can prove x is always true; you
 don't see a problem with not emitting the conditional?

You need volatile semantics to force the compiler to ignore any proofs
it might otherwise attempt to construct.  Hence all the ACCESS_ONCE()
calls in my email to Torvald.  (Hopefully I translated your example
reasonably.)

Thanx, Paul

 Avoiding the conditional changes the result; see that control dependency
 email from earlier. In the above example the load of X and the store to
 Y are strictly ordered, due to control dependencies. Not emitting the
 condition and maybe not even emitting the load completely wrecks this.
 
 It's therefore an invalid optimization to take out the conditional or
 speculate the store, since it takes out the dependency.
 



Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-12 Thread Peter Zijlstra
On Wed, Feb 12, 2014 at 09:42:09AM -0800, Paul E. McKenney wrote:
 You need volatile semantics to force the compiler to ignore any proofs
 it might otherwise attempt to construct.  Hence all the ACCESS_ONCE()
 calls in my email to Torvald.  (Hopefully I translated your example
 reasonably.)

My brain gave out for today; but it did appear to have the right
structure.

I would prefer it if C11 did not require the volatile casts. It should
simply _never_ speculate with atomic writes, volatile or not.
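
For readers who have not met it, the volatile cast in question is the one behind ACCESS_ONCE(). A sketch of the earlier if (x) y = 3 example in that style follows; the macro is written the way the kernel has traditionally defined it (using the GNU __typeof__ extension), and the function store_if_set_once is a made-up name for illustration.

```c
/* Volatile access to an lvalue; relies on the GNU C __typeof__ extension. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical helper: returns 1 if the store was performed, 0 otherwise. */
int store_if_set_once(int *x, int *y)
{
    /* The volatile load must be emitted, so the branch on it stays. */
    if (ACCESS_ONCE(*x)) {
        ACCESS_ONCE(*y) = 3;
        return 1;
    }
    return 0;
}
```

Because both accesses are volatile, the compiler has to perform them as written, whatever it may be able to prove about the value of x.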





Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-12 Thread Paul E. McKenney
On Tue, Feb 11, 2014 at 09:39:24PM -0800, Torvald Riegel wrote:
 On Mon, 2014-02-10 at 11:09 -0800, Linus Torvalds wrote:
  On Sun, Feb 9, 2014 at 4:27 PM, Torvald Riegel trie...@redhat.com wrote:
  
   Intuitively, this is wrong because this lets the program take a step
   the abstract machine wouldn't do.  This is different to the sequential
   code that Peter posted because it uses atomics, and thus one can't
   easily assume that the difference is not observable.
  
  Btw, what is the definition of observable for the atomics?
  
  Because I'm hoping that it's not the same as for volatiles, where
  observable is about the virtual machine itself, and as such volatile
  accesses cannot be combined or optimized at all.
 
 No, atomics aren't an observable behavior of the abstract machine
 (unless they are volatile).  See 1.8.p8 (citing the C++ standard).

We Linux-kernel hackers will often need to use volatile semantics in
combination with C11 atomics.  The C11 atomics do cover some of the
reasons we currently use ACCESS_ONCE(), but not all of them --
in particular, they allow load/store merging.

  Now, I claim that atomic accesses cannot be done speculatively for
  writes, and not re-done for reads (because the value could change),
 
 Agreed, unless the compiler can prove that this doesn't make a
 difference in the program at hand and it's not volatile atomics.  In
 general, that will be hard and thus won't happen often I suppose, but if
 correctly proved it would fall under the as-if rule I think.
 
  but *combining* them would be possible and good.
 
 Agreed.

In some cases, agreed.  But many uses in the Linux kernel will need
volatile semantics in combination with C11 atomics.  Which is OK, for
the foreseeable future, anyway.

  For example, we often have multiple independent atomic accesses that
  could certainly be combined: testing the individual bits of an atomic
  value with helper functions, causing things like load atomic, test
  bit, load same atomic, test another bit. The two atomic loads could
  be done as a single load without possibly changing semantics on a real
  machine, but if visibility is defined in the same way it is for
  volatile, that wouldn't be a valid transformation. Right now we use
  volatile semantics for these kinds of things, and they really can
  hurt.
 
 Agreed.  In your example, the compiler would have to prove that the
 abstract machine would always be able to run the two loads atomically
 (ie, as one load) without running into impossible/disallowed behavior of
 the program.  But if there's no loop or branch or such in-between, this
 should be straight-forward because any hardware oddity or similar could
 merge those loads and it wouldn't be disallowed by the standard
 (considering that we're talking about a finite number of loads), so the
 compiler would be allowed to do it as well.

As long as they are not marked volatile, agreed.

Thanx, Paul
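
The bit-testing pattern under discussion can be sketched as follows. The flag values, the helper names and the use of memory_order_relaxed are assumptions for illustration. The first function is what the source would say (one atomic load per helper); the second is the merged form that a compiler would be permitted to produce for non-volatile atomics, written out by hand.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define FLAG_A 0x1u
#define FLAG_B 0x2u

static inline bool test_a(atomic_uint *v)
{
    return atomic_load_explicit(v, memory_order_relaxed) & FLAG_A;
}

static inline bool test_b(atomic_uint *v)
{
    return atomic_load_explicit(v, memory_order_relaxed) & FLAG_B;
}

/* As written in the source: load, test bit, load again, test another bit. */
bool both_set_two_loads(atomic_uint *v)
{
    return test_a(v) && test_b(v);
}

/* The merged form: a single load feeds both bit tests. */
bool both_set_one_load(atomic_uint *v)
{
    unsigned s = atomic_load_explicit(v, memory_order_relaxed);
    return (s & FLAG_A) && (s & FLAG_B);
}
```

With volatile semantics on the loads, the first form would have to stay as two separate loads.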

  Same goes for multiple writes (possibly due to setting bits):
  combining multiple accesses into a single one is generally fine, it's
  *adding* write accesses speculatively that is broken by design..
 
 Agreed.  As Paul points out, this being correct assumes that there are
 no other ordering guarantees or memory accesses interfering, but if
 the stores are to the same memory location and adjacent to each other in
 the program, then I don't see a reason why they wouldn't be combinable.
 
  At the same time, you can't combine atomic loads or stores infinitely
  - visibility on a real machine definitely is about timeliness.
  Removing all but the last write when there are multiple consecutive
  writes is generally fine, even if you unroll a loop to generate those
  writes. But if what remains is a loop, it might be a busy-loop
  basically waiting for something, so it would be wrong (untimely) to
  hoist a store in a loop entirely past the end of the loop, or hoist a
  load in a loop to before the loop.
 
 Agreed.  That's what 1.10p24 and 1.10p25 are meant to specify for loads,
 although those might not be bullet-proof as Paul points out.  Forward
 progress is rather vaguely specified in the standard, but at least parts
 of the committee (and people in ISO C++ SG1, in particular) are working
 on trying to improve this.
 
  Does the standard allow for that kind of behavior?
 
 I think the standard requires (or intends to require) the behavior that
 you (and I) seem to prefer in these examples.
 
 



Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-12 Thread Paul E. McKenney
On Tue, Feb 11, 2014 at 09:13:34PM -0800, Torvald Riegel wrote:
 On Sun, 2014-02-09 at 19:51 -0800, Paul E. McKenney wrote:
  On Mon, Feb 10, 2014 at 01:06:48AM +0100, Torvald Riegel wrote:
   On Thu, 2014-02-06 at 20:20 -0800, Paul E. McKenney wrote:
On Fri, Feb 07, 2014 at 12:44:48AM +0100, Torvald Riegel wrote:
 On Thu, 2014-02-06 at 14:11 -0800, Paul E. McKenney wrote:
  On Thu, Feb 06, 2014 at 10:17:03PM +0100, Torvald Riegel wrote:
   On Thu, 2014-02-06 at 11:27 -0800, Paul E. McKenney wrote:
On Thu, Feb 06, 2014 at 06:59:10PM +, Will Deacon wrote:
 There are also so many ways to blow your head off it's untrue. For example,
 cmpxchg takes a separate memory model parameter for failure and success, but
 then there are restrictions on the sets you can use for each. It's not hard
 to find well-known memory-ordering experts shouting "Just use
 memory_model_seq_cst for everything, it's too hard otherwise". Then there's
 the fun of load-consume vs load-acquire (arm64 GCC completely ignores consume
 atm and optimises all of the data dependencies away) as well as the definition
 of data races, which seem to be used as an excuse to miscompile a program
 at the earliest opportunity.

Trust me, rcu_dereference() is not going to be defined in terms of
memory_order_consume until the compilers implement it both correctly and
efficiently.  They are not there yet, and there is currently no shortage
of compiler writers who would prefer to ignore memory_order_consume.
   
   Do you have any input on
   http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59448?  In particular, the
   language standard's definition of dependencies?
  
  Let's see...  1.10p9 says that a dependency must be carried unless:
  
  - B is an invocation of any specialization of std::kill_dependency
    (29.3), or
  - A is the left operand of a built-in logical AND (&&, see 5.14) or
    logical OR (||, see 5.15) operator, or
  - A is the left operand of a conditional (?:, see 5.16) operator, or
  - A is the left operand of the built-in comma (,) operator (5.18);
  
  So the use of "flag" before the "?" is ignored.  But the "flag - flag"
  after the "?" will carry a dependency, so the code fragment in 59448
  needs to do the ordering rather than just optimizing "flag - flag" out
  of existence.  One way to do that on both ARM and Power is to actually
  emit code for "flag - flag", but there are a number of other ways to
  make that work.
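
To make the "flag - flag" discussion concrete, here is a small C11 sketch loosely modelled on the kind of fragment being discussed. It is not the exact code from bug 59448, and the names `table`, `flag_var` and `read_through_dependency` are invented for this example. The address arithmetic cancels out, yet the pointer is still computed from the consume-loaded value, so the standard's wording says the dependency is carried.

```c
#include <assert.h>
#include <stdatomic.h>

static int table[2] = { 42, 43 };   /* hypothetical data */
static atomic_int flag_var = 1;     /* hypothetical flag, holds 0 or 1 */

static int read_through_dependency(void)
{
    int flag = atomic_load_explicit(&flag_var, memory_order_consume);
    assert(flag == 0 || flag == 1); /* keeps table + flag within bounds */
    int *p = table;
    /* Arithmetically this is just *p, but the address is derived from
       flag, so under 1.10p9 it carries a dependency from the load.  The
       use of flag as the condition of ?: does not. */
    return flag ? *(p + flag - flag) : 0;
}
```

A compiler that folds `p + flag - flag` down to `p` gets the same value but loses the address dependency that ARM and Power hardware would otherwise order on, which is the concern raised above.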
 
 And that's what would concern me, considering that these requirements
 seem to be able to creep out easily.  Also, whereas the other atomics
 just constrain compilers wrt. reordering across atomic accesses or
 changes to the atomic accesses themselves, the dependencies are new
 requirements on pieces of otherwise non-synchronizing code.  The latter
 seems far more involved to me.

Well, the wording of 1.10p9 is pretty explicit on this point.
There are only a few exceptions to the rule that dependencies from
memory_order_consume loads must be tracked.  And to your point about
requirements being placed on pieces of otherwise non-synchronizing code,
we already have that with plain old load acquire and store release --
both of these put ordering constraints that affect the surrounding
non-synchronizing code.
   
   I think there's a significant difference.  With acquire/release or more
   general memory orders, it's true that we can't order _across_ the atomic
   access.  However, we can reorder and optimize without additional
   constraints if we do not reorder.  This is not the case with consume
   memory order, as the (p + flag - flag) example shows.
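
For contrast with the consume case, here is a minimal C11 sketch of the acquire/release pattern referred to above. The names `payload`, `ready`, `publish` and `try_read` are hypothetical. The only constraint on the surrounding plain code is that it may not be moved across the atomic access; the plain accesses themselves can be optimized as usual.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;                 /* plain, non-atomic data */
static atomic_bool ready = false;   /* publication flag */

/* Writer: the plain store to payload may not sink below the release. */
static void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Reader: the plain load of payload may not hoist above the acquire. */
static bool try_read(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```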
  
  Agreed, memory_order_consume does introduce additional restrictions.
  
This issue got a lot of discussion, and the compromise is that
dependencies cannot leak into or out of functions unless the relevant
parameters or return values are annotated with [[carries_dependency]].
This means that the compiler can see all the places where dependencies
must be tracked.  This is described in 7.6.4.
   
   I wasn't aware of 7.6.4 (but it isn't referred to as an additional
   constraint--what it is--in 1.10, so I guess at least that should be
   fixed).
   Also, AFAIU, 7.6.4p3 is wrong in that the attribute does make a semantic
   difference, at least if one is assuming that normal optimization of
   sequential code is the default, and that maintaining things such as
   (flag-flag) is not; if optimizing away (flag-flag) would require the
   insertion of fences unless there is the carries_dependency attribute,
   then this would be bad I think.
  
  No, the 

Re: [LLVMdev] Zero-cost toolchain standardization process

2014-02-12 Thread Chris Lattner
On Feb 11, 2014, at 10:59 AM, Renato Golin renato.go...@linaro.org wrote:
 Hi Folks,
 
 First of all, I'd like to thank everyone for their great responses and
 heart warming encouragement for such an enterprise. This will be my
 last email about this subject on these lists, so I'd like to just let
 everyone know what (and where) I'll be heading next with this topic.
 Feel free to reply to me personally, I don't want to spawn an ugly
 two-list thread.

Renato, thank you for spearheading this, but please do not cross post to both 
lists like this.  Among other problems it is a severe pain for moderation.

I’m a fan of your goals, but I’d like to point out that we have already solved 
this problem in various ways.  For example, C++ ABI issues are dealt with quite 
well across GCC, LLVM, and many other compilers on the “itanium” ABI mailing 
list.  It’s a great example of a list hosted in a “neutral” place that many 
compiler vendors are on, including commercial ones.

Why don’t you just set up a few similar mailing lists to cover related topics 
(toolchain topics, language extensions, etc) and encourage the right people to 
join them?  I feel like you’re turning a simple problem into a complex one.

-Chris

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-12 Thread Linus Torvalds
On Wed, Feb 12, 2014 at 10:07 AM, Paul E. McKenney
paul...@linux.vnet.ibm.com wrote:

 Us Linux-kernel hackers will often need to use volatile semantics in
 combination with C11 atomics in most cases.  The C11 atomics do cover
 some of the reasons we currently use ACCESS_ONCE(), but not all of them --
 in particular, it allows load/store merging.

I really disagree with the "will need to use volatile".

We should never need to use volatile (outside of whatever MMIO we do
using C) if C11 defines atomics correctly.

Allowing load/store merging is *fine*. All sane CPU's do that anyway -
it's called a cache - and there's no actual reason to think that
ACCESS_ONCE() has to mean our current volatile.

Now, it's possible that the C standards simply get atomics _wrong_, so
that they create visible semantics that are different from what a CPU
cache already does, but that's a plain bug in the standard if so.

But merging loads and stores is fine. And I *guarantee* it is fine,
exactly because CPU's already do it, so claiming that the compiler
couldn't do it is just insanity.

Now, there are things that are *not* fine, like speculative stores
that could be visible to other threads. Those are *bugs* (either in
the compiler or in the standard), and anybody who claims otherwise is
not worth discussing with.

But I really really disagree with the "we might have to use
'volatile'". Because if we *ever* have to use 'volatile' with the
standard C atomic types, then we're just better off ignoring the
atomic types entirely, because they are obviously broken shit - and
we're better off doing it ourselves the way we have forever.

Seriously. This is not even hyperbole. It really is as simple as that.

  Linus


Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-12 Thread Paul E. McKenney
On Wed, Feb 12, 2014 at 12:22:53PM -0800, Linus Torvalds wrote:
 On Wed, Feb 12, 2014 at 10:07 AM, Paul E. McKenney
 paul...@linux.vnet.ibm.com wrote:
 
  Us Linux-kernel hackers will often need to use volatile semantics in
  combination with C11 atomics in most cases.  The C11 atomics do cover
  some of the reasons we currently use ACCESS_ONCE(), but not all of them --
  in particular, it allows load/store merging.
 
 I really disagree with the "will need to use volatile".
 
 We should never need to use volatile (outside of whatever MMIO we do
 using C) if C11 defines atomics correctly.
 
 Allowing load/store merging is *fine*. All sane CPU's do that anyway -
 it's called a cache - and there's no actual reason to think that
 ACCESS_ONCE() has to mean our current volatile.
 
 Now, it's possible that the C standards simply get atomics _wrong_, so
 that they create visible semantics that are different from what a CPU
 cache already does, but that's a plain bug in the standard if so.
 
 But merging loads and stores is fine. And I *guarantee* it is fine,
 exactly because CPU's already do it, so claiming that the compiler
 couldn't do it is just insanity.

Agreed, both CPUs and compilers can merge loads and stores.  But CPUs
normally get their stores pushed through the store buffer in reasonable
time, and CPUs also use things like invalidations to ensure that a
store is seen in reasonable time by readers.  Compilers don't always
have these two properties, so we do need to be more careful of load
and store merging by compilers.
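
A small sketch of the distinction being argued here, under stated assumptions. The macro below is the classic volatile-cast form of ACCESS_ONCE() (`__typeof__` is a GNU C extension); the variables and function names are invented for the example. With the volatile form the compiler has to emit both loads. With relaxed C11 atomics each load is atomic, but, as argued in this thread, the standard does not obviously stop a compiler from loading once and reusing the value.

```c
#include <stdatomic.h>

/* Volatile-cast form of ACCESS_ONCE(); relies on GNU C __typeof__. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

static int shared_flag;            /* hypothetical shared variable */
static atomic_int shared_atomic;   /* hypothetical C11 counterpart */

/* Two volatile reads: two loads must appear in the generated code. */
static int sum_two_volatile_reads(void)
{
    return ACCESS_ONCE(shared_flag) + ACCESS_ONCE(shared_flag);
}

/* Two relaxed atomic reads: the thread's worry is that a compiler could
   merge these into one load and double the result. */
static int sum_two_relaxed_reads(void)
{
    return atomic_load_explicit(&shared_atomic, memory_order_relaxed)
         + atomic_load_explicit(&shared_atomic, memory_order_relaxed);
}
```

Single-threaded, both functions return the same value; the difference only shows in how many loads another CPU's store could fall between.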

 Now, there are things that are *not* fine, like speculative stores
 that could be visible to other threads. Those are *bugs* (either in
 the compiler or in the standard), and anybody who claims otherwise is
 not worth discussing with.

And as near as I can tell, volatile semantics are required in C11 to
avoid speculative stores.  I might be wrong about this, and hope that
I am wrong.  But I am currently not seeing it in the current standard.
(Though I expect that most compilers would avoid speculating stores,
especially in the near term.)
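
For anyone unsure what "speculating a store" means here, the following C sketch shows the shape of the transformation in source form. It is an illustration only; the variable `y` and both function names are hypothetical, and no claim is made that any particular compiler does this.

```c
static int y;   /* hypothetical variable shared with other threads */

/* What the programmer wrote: y is written only when cond is true. */
static void store_if(int cond)
{
    if (cond)
        y = 1;
}

/* The speculated form: y is written unconditionally and then repaired.
   Single-threaded the final value is identical, but another thread can
   briefly observe y == 1 even when cond is false, and a concurrent
   update to y can be overwritten by the repair store. */
static void store_if_speculated(int cond)
{
    int saved = y;
    y = 1;
    if (!cond)
        y = saved;
}
```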

 But I really really disagree with the "we might have to use
 'volatile'". Because if we *ever* have to use 'volatile' with the
 standard C atomic types, then we're just better off ignoring the
 atomic types entirely, because they are obviously broken shit - and
 we're better off doing it ourselves the way we have forever.
 
 Seriously. This is not even hyperbole. It really is as simple as that.

Agreed, if we are talking about replacing ACCESS_ONCE() with C11
relaxed atomics any time soon.  But someone porting Linux to a
new CPU architecture might use a carefully chosen subset of C11
atomics to implement some of the Linux atomic operations, especially
non-value-returning atomics such as atomic_inc().

Thanx, Paul
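
As a sketch of the last point, a non-value-returning increment could be layered on C11 atomics roughly as follows. The type and function names are hypothetical (prefixed `my_` to make clear they are not the kernel's), and relaxed ordering is assumed because the operation returns no value and is taken here to provide no ordering guarantee.

```c
#include <stdatomic.h>

/* Hypothetical counter type for a new architecture port. */
typedef struct { atomic_int counter; } my_atomic_t;

/* Non-value-returning increment: atomic, but imposes no ordering. */
static void my_atomic_inc(my_atomic_t *v)
{
    (void)atomic_fetch_add_explicit(&v->counter, 1, memory_order_relaxed);
}

/* Plain read of the current value, also without ordering. */
static int my_atomic_read(my_atomic_t *v)
{
    return atomic_load_explicit(&v->counter, memory_order_relaxed);
}
```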



Aarch64 implementation for dwarf exception handling

2014-02-12 Thread Shiva Chen
Hi,

I have a question about the implementation of

aarch64_final_eh_return_addr

which is used to point out the return address of the frame.

According to the source code, if FP is not needed:

  return gen_frame_mem (DImode,
                        plus_constant (Pmode,
                                       stack_pointer_rtx,
                                       fp_offset
                                       + cfun->machine->frame.saved_regs_size
                                       - 2 * UNITS_PER_WORD));


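According to the frame layout

        +-------------------------------+ <-- arg_pointer_rtx
        |
        |  callee-allocated save area
        |  for register varargs
        |
        +-------------------------------+
        |
        |  local variables
        |
        +-------------------------------+ <-- frame_pointer_rtx
        |
        |  callee-saved registers
        |
        +-------------------------------+
        |  LR'
        +-------------------------------+
        |  FP'
      P +-------------------------------+ <-- hard_frame_pointer_rtx
        |  dynamic allocation
        +-------------------------------+
        |
        |  outgoing stack arguments
        |
        +-------------------------------+ <-- stack_pointer_rtx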

Shouldn't the return value be

  return gen_frame_mem (DImode,
                        plus_constant (Pmode,
                                       stack_pointer_rtx,
                                       fp_offset
                                       + 2 * UNITS_PER_WORD));

Or am I just misunderstanding something?


I hope someone can give me a tip.

It would be very helpful.

Thanks

Shiva Chen


Dead code elimination PROBLEM

2014-02-12 Thread chronicle

Hi all, I developed a plugin that produces the following GIMPLE:

test ()
{
  int selected_fnc_var_.3;
  int random_Var.2;
  int D.2363;
  int _1;

  <bb 2>:
  random_Var.2_2 = rand ();
  selected_fnc_var_.3_3 = random_Var.2_2 %[fl] 5;
  if (selected_fnc_var_.3_3 == 4) goto <L7>;
  if (selected_fnc_var_.3_3 == 3) goto <L6>;
  if (selected_fnc_var_.3_3 == 2) goto <L5>;
  if (selected_fnc_var_.3_3 == 1) goto <L4>;
  if (selected_fnc_var_.3_3 == 0) goto <L3>;
<L7>:
  _1 = f.clone.4 (t, t);
  goto <L8>;
<L6>:
  _1 = f.clone.3 (t, t);
  goto <L8>;
<L5>:
  _1 = f.clone.2 (t, t);
  goto <L8>;
<L4>:
  _1 = f.clone.1 (t, t);
  goto <L8>;

<L8>:
  if (_1 != 0)
    goto <bb 3>;
  else
    goto <bb 4>;

  <bb 3>:
  __builtin_puts (&"f success"[0]);
  goto <bb 5>;

  <bb 4>:
  __builtin_puts (&"f failed"[0]);

  <bb 5>:
  return;

}

with this final code

004005c6 <test>:
  4005c6:  55                push   %rbp
  4005c7:  48 89 e5          mov    %rsp,%rbp
  4005ca:  53                push   %rbx
  4005cb:  48 83 ec 08       sub    $0x8,%rsp
  4005cf:  e8 6c fe ff ff    callq  400440 <rand@plt>
  4005d4:  89 d9             mov    %ebx,%ecx
  4005d6:  c1 f9 1f          sar    $0x1f,%ecx
  4005d9:  89 d8             mov    %ebx,%eax
  4005db:  31 c8             xor    %ecx,%eax
  4005dd:  ba 67 66 66 66    mov    $0x66666667,%edx
  4005e2:  f7 e2             mul    %edx
  4005e4:  89 d0             mov    %edx,%eax
  4005e6:  d1 e8             shr    %eax
  4005e8:  31 c8             xor    %ecx,%eax
  4005ea:  89 c2             mov    %eax,%edx
  4005ec:  c1 e2 02          shl    $0x2,%edx
  4005ef:  01 c2             add    %eax,%edx
  4005f1:  89 d8             mov    %ebx,%eax
  4005f3:  29 d0             sub    %edx,%eax
  4005f5:  83 f8 04          cmp    $0x4,%eax
  4005f8:  75 32             jne    40062c <test+0x66>
  4005fa:  83 f8 03          cmp    $0x3,%eax
  4005fd:  74 2d             je     40062c <test+0x66>
  4005ff:  83 f8 02          cmp    $0x2,%eax
  400602:  74 28             je     40062c <test+0x66>
  400604:  83 f8 01          cmp    $0x1,%eax
  400607:  74 23             je     40062c <test+0x66>
  400609:  85 c0             test   %eax,%eax
  40060b:  74 1f             je     40062c <test+0x66>
  40060d:  be bc 09 40 00    mov    $0x4009bc,%esi
  400612:  bf c6 09 40 00    mov    $0x4009c6,%edi
  400617:  e8 7d 02 00 00    callq  400899 <f.clone.4>
  40061c:  85 c0             test   %eax,%eax
  40061e:  75 0c             jne    40062c <test+0x66>
  400620:  bf d0 09 40 00    mov    $0x4009d0,%edi
  400625:  e8 e6 fd ff ff    callq  400410 <puts@plt>
  40062a:  eb 0a             jmp    400636 <test+0x70>
  40062c:  bf e8 09 40 00    mov    $0x4009e8,%edi
  400631:  e8 da fd ff ff    callq  400410 <puts@plt>
  400636:  48 83 c4 08       add    $0x8,%rsp
  40063a:  5b                pop    %rbx
  40063b:  5d                pop    %rbp
  40063c:  c3                retq


from this gimple

test ()
{
  int D.2363;
  int _1;

  <bb 2>:
  _1 = f(t, t);
  if (_1 != 0)
    goto <bb 3>;
  else
    goto <bb 4>;

  <bb 3>:
  __builtin_puts ( f [0]);
  goto <bb 5>;

  <bb 4>:
  __builtin_puts ( f [0]);

  <bb 5>:
  return;
}

As you can see in the disassembly, the code only calls f.clone.4
(callq 400899 <f.clone.4>). I suppose the dead code elimination pass is
responsible for this; I tried to disable it using the -O0 compilation
option, but without success. My question is: how can I make the compiler
produce the final code without deleting those dead code portions? (Do I
need to create some kind of PHI nodes at the labels to achieve that, and
if so, how could I do that?)


thanks in advance
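
For reference, the control flow that the first GIMPLE dump seems intended to express can be written in C roughly as below. This is only a reading of the dump, not a diagnosis of the plugin. The clone bodies are hypothetical stand-ins, and because the dump lists no body for L3, selector 0 is folded into the default case purely as an assumption.

```c
/* Hypothetical stand-ins for the plugin-generated clones of f. */
static int f_clone_1(int t) { return t + 1; }
static int f_clone_2(int t) { return t + 2; }
static int f_clone_3(int t) { return t + 3; }
static int f_clone_4(int t) { return t + 4; }

/* Every selector value has exactly one outgoing path, and every path
   ends in a return, so none of the calls is unreachable. */
static int call_selected_clone(int selector, int t)
{
    switch (selector) {
    case 1:  return f_clone_1(t);
    case 2:  return f_clone_2(t);
    case 3:  return f_clone_3(t);
    default: return f_clone_4(t);   /* covers 4, and 0 by assumption */
    }
}
```

In the dump above, by contrast, the chain of conditional gotos has no explicit false branches, which may be worth checking when building the CFG edges by hand.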


[Bug c/60156] New: GCC doesn't warn about variadic main

2014-02-12 Thread mpolacek at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60156

Bug ID: 60156
   Summary: GCC doesn't warn about variadic main
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mpolacek at gcc dot gnu.org

E.g. on
int main (int argc, char *argv[], ...) { }
with -Wpedantic.


[Bug c/60156] GCC doesn't warn about variadic main

2014-02-12 Thread mpolacek at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60156

Marek Polacek <mpolacek at gcc dot gnu.org> changed:

           What    |Removed                        |Added

           Keywords|                               |diagnostic
             Status|UNCONFIRMED                    |ASSIGNED
   Last reconfirmed|                               |2014-02-12
           Assignee|unassigned at gcc dot gnu.org  |mpolacek at gcc dot gnu.org
     Ever confirmed|0                              |1

--- Comment #1 from Marek Polacek mpolacek at gcc dot gnu.org ---
I have a patch for 5.0.


[Bug c++/60047] [4.7/4.8/4.9 Regression] ICE with defaulted copy constructor and virtual base class

2014-02-12 Thread paolo at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60047

--- Comment #3 from paolo at gcc dot gnu.org paolo at gcc dot gnu.org ---
Author: paolo
Date: Wed Feb 12 08:45:46 2014
New Revision: 207712

URL: http://gcc.gnu.org/viewcvs?rev=207712&root=gcc&view=rev
Log:
/cp
2014-02-12  Paolo Carlini  paolo.carl...@oracle.com

PR c++/60047
* method.c (implicitly_declare_fn): A constructor of a class with
virtual base classes isn't constexpr (7.1.5p4).

/testsuite
2014-02-12  Paolo Carlini  paolo.carl...@oracle.com

PR c++/60047
* g++.dg/cpp0x/pr60047.C: New.

Added:
trunk/gcc/testsuite/g++.dg/cpp0x/pr60047.C
Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/method.c
trunk/gcc/testsuite/ChangeLog


[Bug c++/60047] [4.7/4.8 Regression] ICE with defaulted copy constructor and virtual base class

2014-02-12 Thread paolo.carlini at oracle dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60047

Paolo Carlini <paolo.carlini at oracle dot com> changed:

           What    |Removed                         |Added

             Status|ASSIGNED                        |RESOLVED
                 CC|jason at gcc dot gnu.org        |
         Resolution|---                             |FIXED
           Assignee|paolo.carlini at oracle dot com |unassigned at gcc dot gnu.org
   Target Milestone|---                             |4.9.0
            Summary|[4.7/4.8/4.9 Regression] ICE    |[4.7/4.8 Regression] ICE
                   |with defaulted copy constructor |with defaulted copy constructor
                   |and virtual base class          |and virtual base class

--- Comment #4 from Paolo Carlini paolo.carlini at oracle dot com ---
Fixed for 4.9.0. Note that there is no ICE in release mode anyway.


[Bug target/60157] New: adding -mstrict-align for i386 and x86_64 architecture

2014-02-12 Thread vinxxe at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60157

Bug ID: 60157
   Summary: adding -mstrict-align for i386 and x86_64 architecture
   Product: gcc
   Version: 4.4.6
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vinxxe at gmail dot com

Created attachment 32113
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32113&action=edit
source code to reproduce the problem

The pthread_cond_wait NPTL function enters an infinite loop, never suspending
the calling thread, if the address of the condition variable is misaligned.
Now, this is not a gcc bug, obviously, but my question is:
does it make sense to add the target option

-mstrict-align

to the i386 and x86_64 architectures, so that this kind of problem can be
detected at compilation time?

Attached you will find a source code example to reproduce the problem.
Execute the program with

strace -f exec_name

to see a never-ending series of

[pid  2922] futex(0x80499fd, FUTEX_WAIT_PRIVATE, 1, NULL) = -1 EINVAL (Invalid
argument)
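
The EINVAL in the trace is consistent with the futex requirement that the futex word be aligned on a four-byte boundary; the address 0x80499fd is odd. A tiny C sketch of that check follows. The helper name is hypothetical and is not part of glibc or the kernel API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if addr could legally hold a 32-bit futex word. */
static bool futex_word_aligned(uintptr_t addr)
{
    return addr % sizeof(uint32_t) == 0;
}
```

Applied to the address from the strace output, `futex_word_aligned(0x80499fd)` is false, while the next lower multiple of four, 0x80499fc, passes.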

Here follows some information about my Linux machine:


cat /proc/version
Linux version 2.6.32-220.7.1.el6.centos.plus.i686 (root@thalix11dev) (gcc
version 4.4.6 20110731 (Red Hat 4.4.6-3) (GCC) ) #1 SMP Mon Oct 21 07:05:28
UTC 2013

rpm -qa | grep glibc
glibc-devel-2.12-1.47.i686
glibc-common-2.12-1.47.i686
glibc-2.12-1.47.i686
glibc-debuginfo-2.12-1.47.i686
glibc-headers-2.12-1.47.i686
glibc-utils-2.12-1.47.i686
glibc-debuginfo-common-2.12-1.47.i686
glibc-static-2.12-1.47.i686

cat /proc/cpuinfo
processor: 0
vendor_id: GenuineIntel
cpu family: 15
model: 3
model name: Intel(R) Pentium(R) 4 CPU 2.80GHz
stepping: 4
cpu MHz: 2799.930
cache size: 1024 KB
fdiv_bug: no
hlt_bug: no
f00f_bug: no
coma_bug: no
fpu: yes
fpu_exception: yes
cpuid level: 5
wp: yes
flags: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat
pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe constant_tsc up pebs bts
pni dtes64 monitor ds_cpl cid xtpr
bogomips: 5586.31
clflush size: 64
cache_alignment: 128
address sizes: 36 bits physical, 32 bits virtual
power management:

rpm -qa | grep gcc
gcc-4.4.6-3.el6.i686
libgcc-4.4.6-3.el6.i686
gcc-c++-4.4.6-3.el6.i686


[Bug rtl-optimization/60116] [4.8/4.9 Regression] wrong code at -Os on x86_64-linux-gnu in 32-bit mode

2014-02-12 Thread ebotcazou at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60116

--- Comment #16 from Eric Botcazou ebotcazou at gcc dot gnu.org ---
Author: ebotcazou
Date: Wed Feb 12 08:49:55 2014
New Revision: 207713

URL: http://gcc.gnu.org/viewcvs?rev=207713&root=gcc&view=rev
Log:
PR rtl-optimization/60116
* combine.c (try_combine): Also remove dangling REG_DEAD notes on the
other_insn once the combination has been validated.

Added:
trunk/gcc/testsuite/gcc.c-torture/execute/20140212-1.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/combine.c
trunk/gcc/testsuite/ChangeLog


[Bug target/60157] adding -mstrict-align for i386 and x86_64 architecture

2014-02-12 Thread vinxxe at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60157

vinxxe at gmail dot com changed:

   What|Removed |Added

   Severity|normal  |enhancement


[Bug rtl-optimization/60116] [4.8/4.9 Regression] wrong code at -Os on x86_64-linux-gnu in 32-bit mode

2014-02-12 Thread ebotcazou at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60116

--- Comment #17 from Eric Botcazou ebotcazou at gcc dot gnu.org ---
Author: ebotcazou
Date: Wed Feb 12 08:51:57 2014
New Revision: 207714

URL: http://gcc.gnu.org/viewcvs?rev=207714&root=gcc&view=rev
Log:
PR rtl-optimization/60116
* combine.c (try_combine): Also remove dangling REG_DEAD notes on the
other_insn once the combination has been validated.

Added:
branches/gcc-4_8-branch/gcc/testsuite/gcc.c-torture/execute/20140212-1.c
  - copied unchanged from r207713,
trunk/gcc/testsuite/gcc.c-torture/execute/20140212-1.c
Modified:
branches/gcc-4_8-branch/gcc/ChangeLog
branches/gcc-4_8-branch/gcc/combine.c
branches/gcc-4_8-branch/gcc/testsuite/ChangeLog


[Bug rtl-optimization/60116] [4.8/4.9 Regression] wrong code at -Os on x86_64-linux-gnu in 32-bit mode

2014-02-12 Thread ebotcazou at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60116

Eric Botcazou ebotcazou at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #18 from Eric Botcazou ebotcazou at gcc dot gnu.org ---
Thanks for reporting the problem.


[Bug fortran/60060] [4.9 Regression] lto1: internal compiler error: in add_AT_specification, at dwarf2out.c:4096

2014-02-12 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60060

--- Comment #9 from Richard Biener rguenth at gcc dot gnu.org ---
Author: rguenth
Date: Wed Feb 12 09:01:30 2014
New Revision: 207715

URL: http://gcc.gnu.org/viewcvs?rev=207715&root=gcc&view=rev
Log:
2014-02-12  Richard Biener  rguent...@suse.de

PR lto/60060
* lto-lang.c (lto_write_globals): Do not call
wrapup_global_declarations or emit_debug_global_declarations
but emit debug info for non-function scope variables
directly.

Modified:
trunk/gcc/lto/ChangeLog
trunk/gcc/lto/lto-lang.c


[Bug fortran/49636] [F03] ASSOCIATE construct confused with slightly complicated case

2014-02-12 Thread paul.richard.thomas at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49636

--- Comment #6 from paul.richard.thomas at gmail dot com paul.richard.thomas 
at gmail dot com ---
 Dear Dominique,

Thanks for the heads-up about -m32 - I thought that the code would be
immune to word length changes ***sigh***

Cheers

Paul

On 12 February 2014 00:40, dominiq at lps dot ens.fr
gcc-bugzi...@gcc.gnu.org wrote:
 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49636

 --- Comment #5 from Dominique d'Humieres dominiq at lps dot ens.fr ---
 Created attachment 32098 [details]
 A fix for this problem

 AFAICT it fixes the problem for 64 bit mode only. In 32 bit mode the ICE is
 gone, but I get at run time

 i_good= 1 3 5
  i_bad= 1** 3

 I am sure that this trick will fix pr57019 too.  This latter is claimed
 to be a regression but I am sure that it never worked :-)  Nonetheless,
 I will take advantage of the regression label!

 I will work on it tomorrow night.

 By the way, this patch regtests OK on trunk.  I have to make sure
 that substrings of character arrays work OK with ASSOCIATE.

 Did you regtest with -m32? I see gfortran.dg/associated_target_5.f03 failing
 at execution time with -m32, as well as the first test in pr57522

0   1   2   3
0   4   1   5

 --
 You are receiving this mail because:
 You are on the CC list for the bug.
 You are the assignee for the bug.


[Bug debug/60152] [4.9 Regression] multiple AT_calling_convention attributes generated after r205679

2014-02-12 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60152

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                      |Added

             Status|UNCONFIRMED                  |NEW
   Last reconfirmed|                             |2014-02-12
                 CC|rguenth at gcc dot gnu.org   |
   Target Milestone|---                          |4.9.0
            Summary|[4.9.0 Regression] multiple  |[4.9 Regression] multiple
                   |AT_calling_convention        |AT_calling_convention
                   |attributes generated after   |attributes generated after
                   |r205679                      |r205679
     Ever confirmed|0                            |1

--- Comment #1 from Richard Biener rguenth at gcc dot gnu.org ---
Confirmed.


[Bug fortran/60060] [4.9 Regression] lto1: internal compiler error: in add_AT_specification, at dwarf2out.c:4096

2014-02-12 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60060

Richard Biener rguenth at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #10 from Richard Biener rguenth at gcc dot gnu.org ---
Fixed.


[Bug middle-end/60092] posix_memalign not recognized to derive alias and alignment info

2014-02-12 Thread burnus at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60092

Tobias Burnus burnus at gcc dot gnu.org changed:

   What|Removed |Added

 CC||burnus at gcc dot gnu.org

--- Comment #18 from Tobias Burnus burnus at gcc dot gnu.org ---
(In reply to Richard Biener from comment #1)
 We could lower
   posix_memalign (&ptr, align, size);
 to
   posix_memalign (&ptr, align, size);
   ptr = __builtin_assume_aligned (ptr, align);
 and hope for FRE to fix things up enough to make that useful.


I wonder about mm_malloc. I assume for config/i386/pmm_malloc.h, it is already
handled via posix_memalign, but shouldn't one also handle
config/i386/gmm_malloc.h? For instance via

--- a/gcc/config/i386/gmm_malloc.h
+++ b/gcc/config/i386/gmm_malloc.h
@@ -61,7 +61,11 @@ _mm_malloc (size_t size, size_t align)
   /* Store the original pointer just before p.  */
   ((void **) aligned_ptr) [-1] = malloc_ptr;

+#if defined(__GNUC__) && __GNUC__ >= 4 && __GNUC_MINOR__ >= 7
+  return __builtin_assume_aligned(aligned_ptr, align);
+#else
   return aligned_ptr;
+#endif
 }

 static __inline__ void
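
As a user-level illustration of the lowering quoted above, the same idea can be written by hand: call posix_memalign and then tell the compiler about the alignment. The wrapper name `alloc_aligned64` is hypothetical, a constant alignment of 64 is assumed to keep the builtin's argument a compile-time constant, and `__builtin_assume_aligned` is a GCC builtin that Clang also accepts.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: allocate size bytes aligned to 64 bytes and let
   the optimizer know about that alignment.  Returns NULL on failure. */
static void *alloc_aligned64(size_t size)
{
    void *ptr = NULL;
    if (posix_memalign(&ptr, 64, size) != 0)
        return NULL;
    return __builtin_assume_aligned(ptr, 64);
}
```

The caller releases the memory with free(), exactly as for any posix_memalign allocation.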


[Bug debug/60152] [4.9 Regression] multiple AT_calling_convention attributes generated after r205679

2014-02-12 Thread burnus at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60152

--- Comment #2 from Tobias Burnus burnus at gcc dot gnu.org ---
See PR 60060 comment 7 for some details and a backtrace of the two
add_calling_convention_attribute calls.


[Bug rtl-optimization/60116] [4.8/4.9 Regression] wrong code at -Os on x86_64-linux-gnu in 32-bit mode

2014-02-12 Thread ebotcazou at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60116

--- Comment #19 from Eric Botcazou ebotcazou at gcc dot gnu.org ---
Author: ebotcazou
Date: Wed Feb 12 10:16:34 2014
New Revision: 207716

URL: http://gcc.gnu.org/viewcvs?rev=207716&root=gcc&view=rev
Log:
PR rtl-optimization/60116
* combine.c (try_combine): Fix oversight in previous change.

Modified:
trunk/gcc/combine.c


[Bug rtl-optimization/60116] [4.8/4.9 Regression] wrong code at -Os on x86_64-linux-gnu in 32-bit mode

2014-02-12 Thread ebotcazou at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60116

--- Comment #20 from Eric Botcazou ebotcazou at gcc dot gnu.org ---
Author: ebotcazou
Date: Wed Feb 12 10:17:08 2014
New Revision: 207717

URL: http://gcc.gnu.org/viewcvs?rev=207717&root=gcc&view=rev
Log:
PR rtl-optimization/60116
* combine.c (try_combine): Fix oversight in previous change.

Modified:
branches/gcc-4_8-branch/gcc/combine.c


[Bug middle-end/60092] posix_memalign not recognized to derive alias and alignment info

2014-02-12 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60092

--- Comment #19 from Jakub Jelinek jakub at gcc dot gnu.org ---
(In reply to Tobias Burnus from comment #18)
 (In reply to Richard Biener from comment #1)
  We could lower
    posix_memalign (&ptr, align, size);
  to
    posix_memalign (&ptr, align, size);
    ptr = __builtin_assume_aligned (ptr, align);
  and hope for FRE to fix things up enough to make that useful.
 
 
 I wonder about mm_malloc. I assume for config/i386/pmm_malloc.h, it is
 already handled via posix_memalign, but shouldn't one also handle
 config/i386/gmm_malloc.h? For instance via
 
 --- a/gcc/config/i386/gmm_malloc.h
 +++ b/gcc/config/i386/gmm_malloc.h
 @@ -61,7 +61,11 @@ _mm_malloc (size_t size, size_t align)
/* Store the original pointer just before p.  */
((void **) aligned_ptr) [-1] = malloc_ptr;
 
 +#if defined(__GNUC__) && __GNUC__ >= 4 && __GNUC_MINOR__ >= 7
 +  return __builtin_assume_aligned(aligned_ptr, align);
 +#else
return aligned_ptr;
 +#endif
  }
 
  static __inline__ void

No, why?  ccp of course understands the dynamic realignment:
  aligned_ptr = (void *) (((size_t) malloc_ptr + align)
                          & ~((size_t) (align) - 1));
so will know that aligned_ptr is align bytes aligned.
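
The realignment expression is easy to check in isolation. The sketch below applies the same arithmetic to a plain integer address; the function name is hypothetical and `align` is assumed to be a power of two, which is what makes the mask work.

```c
#include <stddef.h>
#include <stdint.h>

/* Same arithmetic as the quoted expression: add align, then clear the
   low bits.  The result is always a multiple of align and is strictly
   greater than addr, which leaves room below it for the saved pointer. */
static uintptr_t realign_like_mm_malloc(uintptr_t addr, size_t align)
{
    return (addr + align) & ~((uintptr_t)align - 1);
}
```

Because the low bits are masked off unconditionally, the alignment of the result follows from the expression itself, which is the point being made about ccp.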


[Bug c/60158] New: powerpc: usage of the .data.rel.ro.local section

2014-02-12 Thread jal2 at gmx dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60158

Bug ID: 60158
   Summary: powerpc: usage of the .data.rel.ro.local section
   Product: gcc
   Version: 4.8.2
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jal2 at gmx dot de

This bug may concern the gcc documentation on section usage only.

Crosscompiling Das U-Boot with gcc 4.8.2 for powerpc with
-fpic -mrelocatable, some addresses are put into a section .data.rel.ro.local,
e.g. the address of qwerty from

printf("%p\n", "qwerty");

There is no corresponding entry in the .fixup section.
As Das U-Boot relocates itself to RAM using .got2/.got and .fixup sections
only, how shall the section .data.rel.ro.local be handled?

Currently it contains addresses only, but this may depend on the source code.
I put .data.rel.ro.local into the GOT which solved my problem, but I'm not sure
if this is the intention of the gcc developers.

I've tried gcc 4.7.3 which put the address of qwerty into the GOT directly,
i.e. there was no .data.rel.ro.local section and the string address was
accessed with one redirection less.

details:
- gcc version: powerpc-softfloat-linux-gnuspe-gcc (Gentoo 4.8.2 p1.3r1,
pie-0.5.8r1) 4.8.2
- gcc command line (some -I removed):
  -g -gdwarf-2  -Os   -fpic -mrelocatable \
  -meabi \
  -D__KERNEL__ -DCONFIG_SYS_TEXT_BASE=0xef77 \
  -fno-builtin  -ffreestanding \
  -isystem /usr/lib/gcc/powerpc-softfloat-linux-gnuspe/4.8.1/include \
  -nostdinc -pipe  -DCONFIG_PPC -D__powerpc__ -ffixed-r2 -Wa,-me500 \
  -msoft-float -mno-string -mspe=yes -mno-spe -Wall -Wstrict-prototypes \
  -fno-stack-protector -Wno-format-nonliteral -Wno-format-security \
  -fstack-usage


[Bug rtl-optimization/60159] New: improve code for conditional sibcall

2014-02-12 Thread jay.foad at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60159

Bug ID: 60159
   Summary: improve code for conditional sibcall
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jay.foad at gmail dot com

If I compile this code for x86-64 I get:

$ cat jcc.c
extern int f(int x);
int g(int x) { return x > 3 ? f(x) : x; }

$ cc1 -quiet -O3 jcc.c -o -
...
g:
.LFB0:
        .cfi_startproc
        cmpl    $3, %edi
        jg      .L4
        movl    %edi, %eax
        ret
        .p2align 4,,10
        .p2align 3
.L4:
        jmp     f
        .cfi_endproc

This code would be simpler and shorter if the jg-to-jmp sequence was replaced
with a single "jg f" instruction.

I'm using gcc built from svn trunk r207717.


[Bug lto/60150] [4.9 Regression] ICE in function_and_variable_visibility, at ipa.c:1000

2014-02-12 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60150

Richard Biener rguenth at gcc dot gnu.org changed:

   What|Removed |Added

   Target Milestone|--- |4.9.0


[Bug target/43546] [4.7/4.8/4.9 Regression] ICE: in assign_stack_local_1, at function.c:353 with -mpreferred-stack-boundary=2 -msseregparm

2014-02-12 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43546

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #14 from Jakub Jelinek jakub at gcc dot gnu.org ---
Created attachment 32114
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32114&action=edit
gcc49-pr43546.patch

This untested patch fixes this for me, the dynamic stack realignment code is
then aware of the DFmode that might need to be possibly spilled.
The cost patch isn't wrong either, but at that level we really can't determine
if the constant load will be zero cost (when we will attempt to load it into a
i387 stack register) or more expensive (if it is loaded into a SSE register).


[Bug target/43546] [4.7/4.8/4.9 Regression] ICE: in assign_stack_local_1, at function.c:353 with -mpreferred-stack-boundary=2 -msseregparm

2014-02-12 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43546

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 CC||uros at gcc dot gnu.org

--- Comment #15 from Jakub Jelinek jakub at gcc dot gnu.org ---
Yet another option, perhaps better, would be to add a new predicate, that would
return true for a MEM operand for which avoid_constant_pool_reference returns a
CONST_DOUBLE floating point constant (other than signalling NaN?), and add
another define_insn before *extendsfdf2_i387 that would use that predicate on
the second operand and would do what *extendsfdf2_i387 does, but have also a
=x, m alternative that would be later on split into a load of the constant
widened to DFmode in memory.  Then we should get better code when trying to
load a DFmode constant into a DFmode register and compress_float_constant
decided to compress it, while it isn't a win in the end.

Or both my patch and this change.


[Bug rtl-optimization/60159] improve code for conditional sibcall

2014-02-12 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60159

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek jakub at gcc dot gnu.org ---
Not sure if that is desirable though, it will mess up debug/unwind info.


[Bug rtl-optimization/59999] [4.9 Regression] Sign extension in loop regression blocks generation of zero overhead loop

2014-02-12 Thread pa...@matos-sorge.com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59999

--- Comment #22 from Paulo J. Matos pa...@matos-sorge.com ---
After some thought, I am concluding this cannot actually be optimized and that
GCC 4.5.4 was better because it was taking advantage of an undefined behaviour
that doesn't exist.

The thought process is as follows. The whole process has to do with this type
of loop:
void foo (int loopCount)
{
  short i;
  for (i = 0; (int)i < loopCount; i++)
...
}

GCC 4.5.4 was assuming i++ could have undefined behaviour and the increment was
done in type short. Then i was promoted to int through a sign_extend and
compared to loopCount. This undefined behaviour allows GCC 4.5.4 to generate an
int scev for the loop.

In GCC 4.8 or later (haven't tested with 4.6 or 4.7), i++ is known not to have
undefined behaviour. i++ due to C integer promotion rules is: i = (short)
((int) i + 1). GCC validly simplifies to i = (short) ((unsigned short)i + 1).
This is then sign extended to int for comparison. GCC cannot generate an int
scev because it's not simple: (int) (short) {1, +, 1}_1.

This can validly loop forever if loopCount > SHORT_MAX.
For example, if loopCount is SHORT_MAX + 1, then when i reaches SHORT_MAX and
is incremented by one the addition is fine because it is done in (unsigned short)
and then truncated modulo 2^N, N being the width of short (implementation
defined behaviour), to short, therefore never reaching loopCount and looping
forever.
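
The wrap-around described above can be checked in isolation. The following is a minimal sketch, not code from the PR: the helper name next_short is made up for illustration, and the result relies on GCC's documented implementation-defined conversion (reduction modulo 2^N) and on short being narrower than int.

```c
#include <assert.h>
#include <limits.h>

/* i++ on a short, spelled out the way the comment describes it: the
   addition is done in a wider type and the result is converted back to
   short.  Converting an out-of-range value to short is implementation
   defined; GCC documents reduction modulo 2^N. */
static short next_short(short i)
{
    /* The sketch assumes short is narrower than int, so that
       (unsigned short)i + 1 is computed in int without overflow. */
    assert(sizeof(short) < sizeof(int));
    return (short)((unsigned short)i + 1);
}
```

With GCC on a target with 16-bit short, next_short(SHRT_MAX) yields SHRT_MIN, so the counter never reaches SHORT_MAX + 1.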

In RTL we get the following sequence:
r4:SI <- [loopCount]
r0:HI <- 0

code label...

...

r2:HI <- r1:HI + 1
r3:SI <- sign_extend r2:HI

p0:BI <- r3:SI < r4:SI
loop to code label if p0:BI

I was tempted to simplify this to:
r4:SI <- [loopCount]
r0:SI <- 0

code label...

...

r2:SI <- r1:SI + 1

p0:BI <- r2:SI < r4:SI
loop to code label if p0:BI

However this will never have an infinite loop behaviour if r4:SI == SHORT_MAX,
therefore I think that at least in this case this cannot be optimized.
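
To make the difference between the two loop shapes concrete, here is a rough C sketch under stated assumptions: the function names and the iteration budget are hypothetical (the budget only exists so the non-terminating case can be observed), and the short conversion is assumed to wrap as it does with GCC.

```c
#include <assert.h>
#include <limits.h>

/* Original shape: short counter, sign-extended for the comparison.
   Returns 1 if the loop exits before the budget runs out, else 0. */
static int short_counter_exits(int loopCount, long budget)
{
    short i;
    assert(budget > 0);
    for (i = 0; (int)i < loopCount; i = (short)((unsigned short)i + 1)) {
        if (--budget == 0)
            return 0;   /* gave up: the loop is still running */
    }
    return 1;
}

/* "Simplified" shape: the counter is widened to int. */
static int int_counter_exits(int loopCount, long budget)
{
    int i;
    assert(budget > 0);
    for (i = 0; i < loopCount; i++) {
        if (--budget == 0)
            return 0;
    }
    return 1;
}
```

For loopCount = SHRT_MAX + 1 the first function runs out of budget while the second exits, which is the behavioural difference that rules out the simplification.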

I am tempted to close the bug report. Richard?


[Bug rtl-optimization/60155] ICE: in get_pressure_class_and_nregs at gcse.c:3438

2014-02-12 Thread danglin at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60155

--- Comment #1 from John David Anglin danglin at gcc dot gnu.org ---
With 4.6 and 4.7 compilers, this appears as:

gcc-4.6 -g -O2 -Wall -Wpointer-arith -Wuninitialized -Wsign-compare
-Wformat-security -Wno-pointer-sign -Wno-unused-result -fno-strict-aliasing
-D_FORTIFY_SOURCE=2 -ftrapv -fno-builtin-memset -D_FORTIFY_SOURCE=2 -g -O2
-Wformat -Werror=format-security -DLOGIN_PROGRAM=\"/bin/login\"
-DLOGIN_NO_ENDOPT -DSSH_EXTRAVERSION=\"Debian-2\"  -I. -I..
-I/usr/include/editline -DSSHDIR=\"/etc/ssh\"
-D_PATH_SSH_PROGRAM=\"/usr/bin/ssh\"
-D_PATH_SSH_ASKPASS_DEFAULT=\"/usr/bin/ssh-askpass\"
-D_PATH_SFTP_SERVER=\"/usr/lib/openssh/sftp-server\"
-D_PATH_SSH_KEY_SIGN=\"/usr/lib/openssh/ssh-keysign\"
-D_PATH_SSH_PKCS11_HELPER=\"/usr/lib/openssh/ssh-pkcs11-helper\"
-D_PATH_SSH_PIDDIR=\"/var/run\" -D_PATH_PRIVSEP_CHROOT_DIR=\"/var/run/sshd\"
-DHAVE_CONFIG_H -c ../ssh-keygen.c
../ssh-keygen.c: In function ‘do_fingerprint’:
../ssh-keygen.c:887:1: internal compiler error: in hoist_code, at gcse.c:4631

[Bug target/57202] Please make the intrinsics headers like immintrin.h be usable without compiler flags

2014-02-12 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57202

--- Comment #4 from Marc Glisse glisse at gcc dot gnu.org ---
Can this be closed?


[Bug rtl-optimization/59999] [4.9 Regression] Sign extension in loop regression blocks generation of zero overhead loop

2014-02-12 Thread rguenther at suse dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59999

--- Comment #23 from rguenther at suse dot de rguenther at suse dot de ---
On Wed, 12 Feb 2014, pa...@matos-sorge.com wrote:

 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59999
 
 --- Comment #22 from Paulo J. Matos pa...@matos-sorge.com ---
 After some thought, I am concluding this cannot actually be optimized and that
 GCC 4.5.4 was better because it was taking advantage of an undefined behaviour
 that doesn't exist.
 
 The thought process is as follows. The whole process has to do with this type
 of loop:
 void foo (int loopCount)
 {
   short i;
   for (i = 0; (int)i < loopCount; i++)
 ...
 }
 
 GCC 4.5.4 was assuming i++ could have undefined behaviour and the increment was
 done in type short. Then i was promoted to int through a sign_extend and
 compared to loopCount. This undefined behaviour allows GCC 4.5.4 to generate an
 int scev for the loop.
 
 In GCC 4.8 or later (haven't tested with 4.6 or 4.7), i++ is known not to have
 undefined behaviour. i++ due to C integer promotion rules is: i = (short)
 ((int) i + 1). GCC validly simplifies to i = (short) ((unsigned short)i + 1).
 This is then sign extended to int for comparison. GCC cannot generate an int
 scev because it's not simple: (int) (short) {1, +, 1}_1.
 
 This can validly loop forever if loopCount > SHORT_MAX.
 For example, if loopCount is SHORT_MAX + 1, then when i reaches SHORT_MAX and
 is incremented by one the addition is fine because it is done in (unsigned short)
 and then truncated modulo 2^N, N being the width of short (implementation
 defined behaviour), to short, therefore never reaching loopCount and looping
 forever.
 
 In RTL we get the following sequence:
 r4:SI <- [loopCount]
 r0:HI <- 0
 
 code label...
 
 ...
 
 r2:HI <- r1:HI + 1
 r3:SI <- sign_extend r2:HI
 
 p0:BI <- r3:SI < r4:SI
 loop to code label if p0:BI
 
 I was tempted to simplify this to:
 r4:SI <- [loopCount]
 r0:SI <- 0
 
 code label...
 
 ...
 
 r2:SI <- r1:SI + 1
 
 p0:BI <- r2:SI < r4:SI
 loop to code label if p0:BI
 
 However this will never have an infinite loop behaviour if r4:SI == SHORT_MAX,
 therefore I think that at least in this case this cannot be optimized.
 
 I am tempted to close the bug report. Richard?

Yes.  That sounds correct.


[Bug middle-end/60092] posix_memalign not recognized to derive alias and alignment info

2014-02-12 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60092

--- Comment #20 from Richard Biener rguenth at gcc dot gnu.org ---
Author: rguenth
Date: Wed Feb 12 13:36:08 2014
New Revision: 207720

URL: http://gcc.gnu.org/viewcvs?rev=207720&root=gcc&view=rev
Log:
2014-02-12  Richard Biener  rguent...@suse.de

PR middle-end/60092
* gimple-low.c (lower_builtin_posix_memalign): Lower conditional
of posix_memalign being successful.
(lower_stmt): Restrict lowering of posix_memalign to when
-ftree-bit-ccp is enabled.

* gcc.dg/torture/pr60092.c: New testcase.
* gcc.dg/tree-ssa/alias-31.c: Disable SRA.

Added:
trunk/gcc/testsuite/gcc.dg/torture/pr60092.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/gimple-low.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/tree-ssa/alias-31.c
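
For readers unfamiliar with the lowering mentioned in the log, the idea can be approximated at source level as follows. This is only a sketch: the helper name alloc_aligned64 and the fixed 64-byte alignment are assumptions for illustration, and the actual transformation happens on GIMPLE inside the compiler. The point it shows is that the alignment may only be assumed on the path where posix_memalign reported success.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns a 64-byte-aligned buffer, or NULL on failure. */
static void *alloc_aligned64(size_t size)
{
    void *tem;
    int res;

    assert(size > 0);
    res = posix_memalign(&tem, 64, size);
    if (res != 0)
        return NULL;    /* tem is not defined on failure */

    /* Only on the success path may the pointer be treated as aligned.
       __builtin_assume_aligned is a GCC builtin. */
    return __builtin_assume_aligned(tem, 64);
}
```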


[Bug rtl-optimization/59999] [4.9 Regression] Sign extension in loop regression blocks generation of zero overhead loop

2014-02-12 Thread pmatos at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59999

pmatos at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #24 from pmatos at gcc dot gnu.org ---
Closing as invalid. Thanks Richard.


[Bug sanitizer/60142] [4.9 Regression][asan] -fsanitize=address breaks debugging - stepping into functions no longer possible

2014-02-12 Thread jan.kratochvil at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60142

Jan Kratochvil jan.kratochvil at redhat dot com changed:

   What|Removed |Added

 CC||jan.kratochvil at redhat dot com

--- Comment #4 from Jan Kratochvil jan.kratochvil at redhat dot com ---
Verified GDB fails with it.
GDB puts the breakpoint on the second .loc (that is, not the first/initial .loc)
in a function, as currently neither GCC nor GDB uses DW_LNS_set_prologue_end.

g++ (GCC) 4.9.0 20140212 (experimental)
-S -g -fsanitize=address

        .type   _Z4testv, @function
_Z4testv:
.LASANPC512:
.LFB512:
        .file 2 "asantest.C"
        .loc 2 4 0
        .cfi_startproc
        .cfi_personality 0x3,__gxx_personality_v0
        .cfi_lsda 0x3,.LLSDA512
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        pushq   %r14
        pushq   %r13
        pushq   %r12
        pushq   %rbx
        subq    $112, %rsp
        .cfi_offset 14, -24
        .cfi_offset 13, -32
        .cfi_offset 12, -40
        .cfi_offset 3, -48
        leaq    -128(%rbp), %rbx
        movq    %rbx, %r14
        cmpl    $0, __asan_option_detect_stack_use_after_return(%rip)
        je      .L3
        .loc 2 4 0
--- here GDB puts the breakpoint
        movq    %rbx, %rsi
        movl    $96, %edi
        call    __asan_stack_malloc_1
        movq    %rax, %rbx
.L3:

GDB already works around a similar case from GCC PR debug/48827; this asan
prologue may look standard enough that it could possibly be worked around in
GDB as well.


[Bug target/57202] Please make the intrinsics headers like immintrin.h be usable without compiler flags

2014-02-12 Thread thiago at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57202

--- Comment #5 from Thiago Macieira thiago at kde dot org ---
(In reply to Marc Glisse from comment #4)
 Can this be closed?

Oh, yeah, this is working fine in GCC 4.9.


[Bug bootstrap/60160] New: Building with -flto in CFLAGS_FOR_TARGET / CXXFLAGS_FOR_TARGET

2014-02-12 Thread d.g.gorbachev at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60160

Bug ID: 60160
   Summary: Building with -flto in CFLAGS_FOR_TARGET /
CXXFLAGS_FOR_TARGET
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: bootstrap
  Assignee: unassigned at gcc dot gnu.org
  Reporter: d.g.gorbachev at gmail dot com

Created attachment 32115
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32115&action=edit
Tentative patch

Someone might want to build everything with LTO. Currently, I see two problems.

1. crtstuff.c: perhaps it'd be better to compile it with -fno-lto.
2. attribute used for _Unwind_* functions.


[Bug bootstrap/60160] Building with -flto in CFLAGS_FOR_TARGET / CXXFLAGS_FOR_TARGET

2014-02-12 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60160

--- Comment #1 from Marc Glisse glisse at gcc dot gnu.org ---
Note the related: http://gcc.gnu.org/ml/gcc-patches/2014-01/msg01480.html (PR
43538) and PR 59893.


[Bug bootstrap/60160] Building with -flto in CFLAGS_FOR_TARGET / CXXFLAGS_FOR_TARGET

2014-02-12 Thread trippels at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60160

--- Comment #2 from Markus Trippelsdorf trippels at gcc dot gnu.org ---
libstdc++ also causes problems:

/var/tmp/gcc_build_dir_/./prev-gcc/xg++ -B/var/tmp/gcc_build_dir_/./prev-gcc/
-B/usr/x86_64-pc-linux-gnu/bin/ -nostdinc++
-B/var/tmp/gcc_build_dir_/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs
-B/var/tmp/gcc_build_dir_/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs
-I/var/tmp/gcc_build_dir_/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu
-I/var/tmp/gcc_build_dir_/prev-x86_64-pc-linux-gnu/libstdc++-v3/include
-I/var/tmp/gcc/libstdc++-v3/libsupc++
-L/var/tmp/gcc_build_dir_/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs
-L/var/tmp/gcc_build_dir_/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs
-march=native -O3 -pipe -flto=jobserver -frandom-seed=1 -fprofile-generate
-fno-lto -DIN_GCC -fno-exceptions -fno-rtti
-fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
-Wcast-qual -Wmissing-format-attribute -pedantic
-Wno-long-long -Wno-variadic-macros -Wno-overlength-strings
-DHAVE_CONFIG_H -DGENERATOR_FILE
-Wl,-O1,--hash-style=gnu,--as-needed,--gc-sections,--icf=safe,--icf-iterations=3
-o build/genconstants \
build/genconstants.o build/read-md.o build/errors.o
.././libiberty/libiberty.a

/var/tmp/gcc_build_dir_/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs/libstdc++.so:
error: undefined reference to 'std::istream::ignore(long)'
/var/tmp/gcc_build_dir_/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs/libstdc++.so:
error: undefined reference to 'std::basic_istream<wchar_t,
std::char_traits<wchar_t> >::ignore(long)'


[Bug target/57202] Please make the intrinsics headers like immintrin.h be usable without compiler flags

2014-02-12 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57202

Marc Glisse glisse at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
  Known to work||4.9.0
 Resolution|--- |FIXED
   Target Milestone|--- |4.9.0

--- Comment #6 from Marc Glisse glisse at gcc dot gnu.org ---
Thanks.


[Bug rtl-optimization/56965] nonoverlapping_component_refs_p is bogus and slow

2014-02-12 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56965

Richard Biener rguenth at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org

--- Comment #6 from Richard Biener rguenth at gcc dot gnu.org ---
Mine.


[Bug target/60151] HAVE_AS_GOTOFF_IN_DATA is mis-detected on x86-64

2014-02-12 Thread hjl at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60151

--- Comment #1 from hjl at gcc dot gnu.org hjl at gcc dot gnu.org ---
Author: hjl
Date: Wed Feb 12 16:12:36 2014
New Revision: 207731

URL: http://gcc.gnu.org/viewcvs?rev=207731&root=gcc&view=rev
Log:
Pass --32 to GNU assembler for .long foo@GOTOFF check

PR target/60151
* configure.ac (HAVE_AS_GOTOFF_IN_DATA): Pass --32 to GNU
assembler.
* configure: Regenerated.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/configure
trunk/gcc/configure.ac


[Bug target/60151] HAVE_AS_GOTOFF_IN_DATA is mis-detected on x86-64

2014-02-12 Thread hjl at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60151

--- Comment #2 from hjl at gcc dot gnu.org hjl at gcc dot gnu.org ---
Author: hjl
Date: Wed Feb 12 16:38:50 2014
New Revision: 207733

URL: http://gcc.gnu.org/viewcvs?rev=207733&root=gcc&view=rev
Log:
Pass --32 to GNU assembler for .long foo@GOTOFF check

Backport from mainline
PR target/60151
* configure.ac (HAVE_AS_GOTOFF_IN_DATA): Pass --32 to GNU
assembler.

Modified:
branches/gcc-4_8-branch/gcc/ChangeLog
branches/gcc-4_8-branch/gcc/configure
branches/gcc-4_8-branch/gcc/configure.ac


[Bug target/60151] HAVE_AS_GOTOFF_IN_DATA is mis-detected on x86-64

2014-02-12 Thread hjl at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60151

--- Comment #3 from hjl at gcc dot gnu.org hjl at gcc dot gnu.org ---
Author: hjl
Date: Wed Feb 12 16:43:47 2014
New Revision: 207734

URL: http://gcc.gnu.org/viewcvs?rev=207734&root=gcc&view=rev
Log:
Pass --32 to GNU assembler for .long foo@GOTOFF check

Backport from mainline
PR target/60151
* configure.ac (HAVE_AS_GOTOFF_IN_DATA): Pass --32 to GNU
assembler.

Modified:
branches/gcc-4_7-branch/gcc/ChangeLog
branches/gcc-4_7-branch/gcc/configure
branches/gcc-4_7-branch/gcc/configure.ac


[Bug target/60151] HAVE_AS_GOTOFF_IN_DATA is mis-detected on x86-64

2014-02-12 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60151

H.J. Lu hjl.tools at gmail dot com changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from H.J. Lu hjl.tools at gmail dot com ---
Fixed in GCC 4.7.4/4.8.3/4.9.0.


[Bug other/59893] Use LTO for libgcc.a, libstdc++.a, etc

2014-02-12 Thread d.g.gorbachev at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59893

--- Comment #7 from Dmitry Gorbachev d.g.gorbachev at gmail dot com ---
I used to build GCC 4.8/4.9 with -flto in C(XX)FLAGS_FOR_TARGET for quite some
time (both native i686-pc-linux-gnu and a cross), and it seems to work.  I saw
two problems: PR 60160 (for which a patch exists), and PR 59472 (annoying, but
not fatal).


[Bug middle-end/59737] [4.9 Regression] ice from optimize_inline_calls

2014-02-12 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59737

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 CC||jakub at gcc dot gnu.org
 Resolution|--- |FIXED

--- Comment #5 from Jakub Jelinek jakub at gcc dot gnu.org ---
Author: hubicka
Date: Tue Feb 11 22:54:21 2014
New Revision: 207702

URL: http://gcc.gnu.org/viewcvs?rev=207702&root=gcc&view=rev
Log:

PR lto/59468
* ipa-utils.h (possible_polymorphic_call_targets): Update prototype
and wrapper.
* ipa-devirt.c: Include demangle.h
(odr_violation_reported): New static variable.
(add_type_duplicate): Update odr_violations.
(maybe_record_node): Add completep parameter; update it.
(record_target_from_binfo): Add COMPLETEP parameter;
update it as needed.
(possible_polymorphic_call_targets_1): Likewise.
(struct polymorphic_call_target_d): Add nonconstruction_targets;
rename FINAL to COMPLETE.
(record_targets_from_bases): Sanity check we found the binfo;
fix COMPLETEP updating.
(possible_polymorphic_call_targets): Add NONCONSTRUTION_TARGETSP
parameter, fix computing of COMPLETEP.
(dump_possible_polymorphic_call_targets): Imrove readability of dump; at
LTO time do demangling.
(ipa_devirt): Use nonconstruction_targets; Improve dumps.
* gimple-fold.c (gimple_get_virt_method_for_vtable): Add can_refer
parameter.
(gimple_get_virt_method_for_binfo): Likewise.
* gimple-fold.h (gimple_get_virt_method_for_binfo,
gimple_get_virt_method_for_vtable): Update prototypes.

PR lto/59468
* g++.dg/ipa/devirt-27.C: New testcase.
* g++.dg/ipa/devirt-26.C: New testcase.

Added:
trunk/gcc/testsuite/g++.dg/ipa/devirt-26.C
trunk/gcc/testsuite/g++.dg/ipa/devirt-27.C
Modified:
trunk/gcc/ChangeLog
trunk/gcc/cp/decl2.c
trunk/gcc/gimple-fold.c
trunk/gcc/gimple-fold.h
trunk/gcc/ipa-devirt.c
trunk/gcc/ipa-utils.h
trunk/gcc/testsuite/ChangeLog


[Bug libgcc/60161] New: updating collapsed because of no authentified software packets (lib32cc1)

2014-02-12 Thread dierk.zeissler at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60161

Bug ID: 60161
   Summary: updating collapsed because of no authentified software
packets (lib32cc1)
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: blocker
  Priority: P3
 Component: libgcc
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dierk.zeissler at gmail dot com

Installation of packages that cannot be trusted is required

This action would require the installation of packages from unauthenticated
software package sources.

lib32gcc1

Installed version: 4:0.8.9-0ubuntu0.12.04.1

Hardware 64-bit version

[Bug middle-end/59737] [4.9 Regression] ice from optimize_inline_calls

2014-02-12 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59737

--- Comment #4 from Jakub Jelinek jakub at gcc dot gnu.org ---
Author: jakub
Date: Wed Feb 12 16:55:51 2014
New Revision: 207735

URL: http://gcc.gnu.org/viewcvs?rev=207735&root=gcc&view=rev
Log:
PR middle-end/59737
* g++.dg/ipa/pr59737.C: New test.

Added:
trunk/gcc/testsuite/g++.dg/ipa/pr59737.C
Modified:
trunk/gcc/testsuite/ChangeLog


[Bug libgcc/60161] updating collapsed because of no authentified software packets (lib32cc1)

2014-02-12 Thread sch...@linux-m68k.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60161

Andreas Schwab sch...@linux-m68k.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #1 from Andreas Schwab sch...@linux-m68k.org ---
Please report that to Ubuntu, this has nothing to do with gcc.


[Bug target/58115] testcase gcc.target/i386/intrinsics_4.c failure

2014-02-12 Thread bernd.edlinger at hotmail dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58115

Bernd Edlinger bernd.edlinger at hotmail dot de changed:

   What|Removed |Added

 Status|REOPENED|RESOLVED
 Resolution|--- |FIXED

--- Comment #16 from Bernd Edlinger bernd.edlinger at hotmail dot de ---
fixed on trunk.

Thanks!


[Bug c++/43680] [DR 1022] G++ is too aggressive in optimizing away bounds checking with enums

2014-02-12 Thread jason at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43680

Jason Merrill jason at gcc dot gnu.org changed:

   What|Removed |Added

  Known to fail||

--- Comment #19 from Jason Merrill jason at gcc dot gnu.org ---
It looks like the committee is making this code undefined again:

http://open-std.org/jtc1/sc22/wg21/docs/cwg_toc.html#1766


[Bug target/59516] [4.9 Regression] Multiple definition of `X' / of `non-virtual thunk to X' errors with LTO

2014-02-12 Thread ktietz at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59516

Kai Tietz ktietz at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 CC||ktietz at gcc dot gnu.org
 Resolution|--- |INVALID

--- Comment #1 from Kai Tietz ktietz at gcc dot gnu.org ---
This issue is a known binutils ld bug: object-file arguments aren't handled
properly for the LTO plugin.
A workaround is to add all files to a library with ar and link against that
library (side note: be aware that you will then need to mark classes via
dllexport).


[Bug c/59193] Unused postfix operator temporaries

2014-02-12 Thread mtewoodbury at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59193

Max TenEyck Woodbury mtewoodbury at gmail dot com changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|INVALID |---

--- Comment #2 from Max TenEyck Woodbury mtewoodbury at gmail dot com ---
The practice is very common in C (and the GCC code) and is NOT peculiar to C++.

The creation of temporary values that are never used is a waste of resources
and, even when removed by the optimizer, represents an admittedly minor
defect.
This may be a minor point but it is NOT controversial.  Also, it is not really
a matter of style.  Your lack of insight on this is somewhat disturbing.  Marking
the argument as INVALID is just plain wrong.  It should be left open to provide
a reference for patches that address this problem.


[Bug c/59193] Unused postfix operator temporaries

2014-02-12 Thread pinskia at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59193

Andrew Pinski pinskia at gcc dot gnu.org changed:

   What|Removed |Added

 Status|REOPENED|RESOLVED
 Resolution|--- |INVALID

--- Comment #3 from Andrew Pinski pinskia at gcc dot gnu.org ---
For scalar types, a++ and ++a should be treated the same when the result is
never used: neither changes the semantics of the load from the variable nor
increases the number of loads.  In C++ they are different when you overload
them for classes, but we don't use that feature yet.
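
As a small illustration of the scalar case, the two functions below differ only in the form of the increment, and the value of each increment expression is discarded. The function names are made up for this sketch; it shows only that the observable result is the same, not what code any particular GCC version emits.

```c
#include <assert.h>

/* Loop using postfix increments whose values are never used. */
static int count_postfix(int n)
{
    int steps = 0;
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        steps++;
    return steps;
}

/* Same loop with prefix increments. */
static int count_prefix(int n)
{
    int steps = 0;
    assert(n >= 0);
    for (int i = 0; i < n; ++i)
        ++steps;
    return steps;
}
```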


[Bug rtl-optimization/57193] [4.7/4.8/4.9 Regression] suboptimal register allocation for SSE registers

2014-02-12 Thread rth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57193

Richard Henderson rth at gcc dot gnu.org changed:

   What|Removed |Added

   Last reconfirmed|2013-05-07 00:00:00 |2014-2-12
 CC||rth at gcc dot gnu.org

--- Comment #4 from Richard Henderson rth at gcc dot gnu.org ---
It seems like incomplete reload inheritance:

(insn 19 16 21 2 (set (reg:V8HI 107)
  (truncate:V8HI
(lshiftrt:V8SI
  (mult:V8SI (zero_extend:V8SI (subreg:V8HI (reg:V16QI 105) 0))
 (zero_extend:V8SI (subreg:V8HI (reg/v:V2DI 101 [ f ]) 0)))
  (const_int 16 [0x10]))))
  include/emmintrin.h:1362 2134 {*umulv8hi3_highpart}
  (expr_list:REG_DEAD (reg:V16QI 105) (nil)))

  Creating newreg=111 from oldreg=107, assigning class SSE_REGS to r111
   19: r111:V8HI=trunc(zero_extend(r111:V8HI)*zero_extend(r101:V2DI#0) 0>>0x10)
  REG_DEAD r105:V16QI
Inserting insn reload before:
   31: r111:V8HI=r105:V16QI#0
Inserting insn reload after:
   32: r107:V8HI=r111:V8HI

The new register r111 does wind up inheriting from r107, but not
transitively to r105.  Thus we wind up leaving the copy insn 31.


[Bug target/58158] [4.8/4.9 Regression] ICE with conditional moves on GPRs with a floating point conditional on mipsel with loongson2f

2014-02-12 Thread rth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58158

Richard Henderson rth at gcc dot gnu.org changed:

   What|Removed |Added

 CC||rth at gcc dot gnu.org

--- Comment #13 from Richard Henderson rth at gcc dot gnu.org ---
(In reply to Tom Li from comment #12)
  {
 +  if (!ISA_HAS_FP_CONDMOVE &&
 +  GET_MODE_CLASS (GET_MODE (XEXP (operands[1], 0))) != MODE_INT)
 +FAIL;

The patch is clearly wrong.  It's attempting to look through
a subreg around operands[1], but of course that subreg will
not always exist.


[Bug target/58158] [4.8/4.9 Regression] ICE with conditional moves on GPRs with a floating point conditional on mipsel with loongson2f

2014-02-12 Thread pinskia at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58158

--- Comment #14 from Andrew Pinski pinskia at gcc dot gnu.org ---
(In reply to Richard Henderson from comment #13)
 (In reply to Tom Li from comment #12)
   {
  +  if (!ISA_HAS_FP_CONDMOVE &&
  +  GET_MODE_CLASS (GET_MODE (XEXP (operands[1], 0))) != MODE_INT)
  +FAIL;
 
 The patch is clearly wrong.  It's attempting to look through
 a subreg around operands[1], but of course that subreg will
 not always exist.

Actually it is correct, as operands[1] will be a comparison_operator, which
always has two operands itself.


[Bug middle-end/60162] New: [4.9 lra regression] mlra appears to be using the FP registers as a set of spill registers for ARM.

2014-02-12 Thread ramana at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60162

Bug ID: 60162
   Summary: [4.9 lra regression] mlra appears to be using the FP
registers as a set of spill registers for ARM.
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ramana at gcc dot gnu.org

This is something that I've just noticed with spec2k gzip : longest_match. 

If the function is compiled for a cross arm-none-linux-gnueabihf toolchain with
the following parameters, 

--with-arch=armv7-a --with-fpu=neon --with-float=hard

With a cross toolchain using mlra by default I get code that loads a value into
an FP register and then moves this over to an integer register. While this is
not that big a problem on some of the newer cores, it will be an issue on older
cores where the latency of such transfers can be pretty high.

You can experiment with -mno-lra to see the difference in code generated and
this is essentially something that has shown up rather recently. 

Bisecting and will follow up in the morning with a testcase.


[Bug ada/60163] New: Ada style checks: token spacing enforces space only around the first of several multiplying operators

2014-02-12 Thread piotr.trojanek at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60163

Bug ID: 60163
   Summary: Ada style checks: token spacing enforces space only
around the first of several multiplying operators
   Product: gcc
   Version: 4.7.4
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: ada
  Assignee: unassigned at gcc dot gnu.org
  Reporter: piotr.trojanek at gmail dot com

Created attachment 32116
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32116&action=edit
Example of wrong

The -gnatyt option of the GNAT Ada compiler should check if binary operators
other than ** are surrounded by spaces. However, it works correctly only for
the first of several multiplying operators in an expression.

For example, expression x * x + x*x does not trigger any warning.

When compiling the attached file with gnatmake -gnatyt -gnatwe -gnatf style
there should be 4 warning messages, but currently there are only 2.

The problem occurs in the 4.7.4 version of the GNAT compiler; tested on Linux
x86_64, but probably is platform-independent.


[Bug libgomp/60035] [PATCH] make it possible to use OMP on both sides of a fork (without violating standard)

2014-02-12 Thread njs at pobox dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60035

--- Comment #2 from Nathaniel J. Smith njs at pobox dot com ---
Good point -- sent.

http://gcc.gnu.org/ml/gcc-patches/2014-02/msg00813.html


[Bug rtl-optimization/60162] [4.9 lra regression] mlra appears to be using the FP registers for integer values and then moving on to GPR registers.

2014-02-12 Thread pinskia at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60162

Andrew Pinski pinskia at gcc dot gnu.org changed:

   What|Removed |Added

  Component|middle-end  |rtl-optimization

--- Comment #1 from Andrew Pinski pinskia at gcc dot gnu.org ---
This sounds like the cost model of moving between register classes is not
correct for the arm backend.


[Bug ada/60163] Ada style checks: token spacing enforces space only around the first of several multiplying operators

2014-02-12 Thread piotr.trojanek at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60163

--- Comment #1 from Piotr Trojanek piotr.trojanek at gmail dot com ---
Created attachment 32117
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32117&action=edit
Patch to solve the problem

The attached patch solves the problem. I tested it with GNAT GPL 2013, but the
file is against the latest FSF sources.


[Bug ada/60164] New: Missing parenthesis in the documentation

2014-02-12 Thread piotr.trojanek at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60164

Bug ID: 60164
   Summary: Missing parenthesis in the documentation
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: minor
  Priority: P3
 Component: ada
  Assignee: unassigned at gcc dot gnu.org
  Reporter: piotr.trojanek at gmail dot com

Created attachment 32118
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32118&action=edit
Correct nested parentheses in the gnatmem documentation.

The gnatmem documentation contains nested parentheses, and the closing
parenthesis is missing. The attached patch solves the problem.


[Bug rtl-optimization/60155] ICE: in get_pressure_class_and_nregs at gcse.c:3438

2014-02-12 Thread danglin at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60155

--- Comment #2 from John David Anglin danglin at gcc dot gnu.org ---
Breakpoint 1, get_pressure_class_and_nregs (insn=0xfab51d98, nregs=0xfaf028c0)
at ../../gcc/gcc/gcse.c:3459
3459  gcc_assert (set != NULL_RTX);
(gdb) p debug_rtx (insn)
(insn 212 211 213 18 (parallel [
(set (reg/v:SI 114 [ num ])
(plus:SI (reg/v:SI 114 [ num ])
(const_int 1 [0x1])))
(trap_if (ne (plus:DI (sign_extend:DI (reg/v:SI 114 [ num ]))
(sign_extend:DI (const_int 1 [0x1])))
(sign_extend:DI (plus:SI (reg/v:SI 114 [ num ])
(const_int 1 [0x1]))))
(const_int 0 [0]))
]) ../ssh-keygen.c:830 113 {addvsi3}
 (nil))
$1 = void
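
Read as C, the trap_if in the addvsi3 pattern above compares the sum of the sign-extended operands computed in 64 bits with the sign-extended 32-bit sum, and traps when they differ. The following is a rough sketch of that condition, not GCC code: the function names are hypothetical, the comparison is written as an equivalent range check to stay portable, and the trap is modelled with assert.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero when a 32-bit signed add of a and b would overflow, i.e. when
   the DImode sum differs from the sign-extended SImode sum. */
static int addv_would_trap(int32_t a, int32_t b)
{
    int64_t wide = (int64_t)a + (int64_t)b;   /* plus:DI of sign_extend:DI operands */
    return wide < INT32_MIN || wide > INT32_MAX;
}

/* Addition in the style of -ftrapv; the trap is modelled with assert. */
static int32_t addv(int32_t a, int32_t b)
{
    assert(!addv_would_trap(a, b));
    return (int32_t)(a + b);
}
```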


[Bug rtl-optimization/60155] ICE: in get_pressure_class_and_nregs at gcse.c:3438

2014-02-12 Thread danglin at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60155

--- Comment #3 from John David Anglin danglin at gcc dot gnu.org ---
Function compiles without -ftrapv.


[Bug c/16602] Spurious warnings about pointer to array -> const pointer to array conversion

2014-02-12 Thread sebunger44 at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16602

Sebastian Unger sebunger44 at gmail dot com changed:

   What|Removed |Added

 CC||sebunger44 at gmail dot com

--- Comment #11 from Sebastian Unger sebunger44 at gmail dot com ---
(In reply to Joseph S. Myers from comment #6)
 When you apply const to array of int, the resulting type is array of
 const int not const array of int; that's how type qualifiers and arrays
 interact in C, there is no such thing as a qualified array type.  array of
 const int is not a const-qualified type in C.

Can anybody provide a reference to the standard to the effect of this claim?
Because I can't find any, and I do believe this statement is wrong. All other
comments claiming this issue to be invalid are based on this (as are all
examples claiming to show that the original issue breaks the constness
promise).

I'm inclined to reopen this issue unless someone can point me to the standard
for this.


[Bug c/16602] Spurious warnings about pointer to array - const pointer to array conversion

2014-02-12 Thread pinskia at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16602

--- Comment #12 from Andrew Pinski pinskia at gcc dot gnu.org ---
(In reply to Sebastian Unger from comment #11)
> I'm inclined to reopen this issue unless someone can point me to the
> standard for this.

From 6.7.3/9 (in the C11 draft):
"If the specification of an array type includes any type qualifiers, the
element type is so qualified, not the array type. If the specification of a
function type includes any type qualifiers, the behavior is undefined. 136)"
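
A minimal sketch of what the quoted paragraph implies, assuming a C11 compiler that supports _Generic. IntArray mirrors the typedef used elsewhere in this thread; the macro, the two objects and the two functions are hypothetical names introduced only for illustration. It shows only where the qualifier lands and says nothing about which pointer conversions GCC should accept.

```c
/* IntArray mirrors the typedef from this thread; every other name here
   is hypothetical and exists only for this illustration. */
typedef int IntArray[3];

/* Evaluates to 1 when p has type `const int *`, and to 0 otherwise. */
#define POINTS_TO_CONST_INT(p) _Generic((p), const int *: 1, default: 0)

/* const is applied to the array typedef, not written on the element. */
static const IntArray qualified = {1, 2, 3};
static IntArray plain = {1, 2, 3};

/* &qualified[0] has type `const int *`: the elements carry the const. */
int qualified_elements_are_const(void)
{
    return POINTS_TO_CONST_INT(&qualified[0]);
}

/* &plain[0] has type `int *`, so the default branch is selected. */
int plain_elements_are_const(void)
{
    return POINTS_TO_CONST_INT(&plain[0]);
}
```

Calling qualified_elements_are_const() yields 1 and plain_elements_are_const() yields 0, which matches the reading that the qualifier applies to the element type.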

[Bug c/16602] Spurious warnings about pointer to array - const pointer to array conversion

2014-02-12 Thread sebunger44 at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16602

--- Comment #13 from Sebastian Unger sebunger44 at gmail dot com ---
I believe the intent behind that is that the qualification of an array type is
identical to that of its element type.

I.e. the statement here is that an 'array of const ints' is identical to a
'const array of ints' rather than that the latter does not exist.

Thus a 'pointer to array of ints' is perfectly convertible to 'pointer to array
of const ints' which makes perfect sense. Note that this is completely
different from a 'pointer to pointer to int' or any such as has been given in
previous examples.

At the very least GCC should treat it as such in gnu99 mode, since it makes
perfect sense for the following code to compile successfully:


typedef int IntArray[3];

void foo(IntArray const* a);

void bar(IntArray* a)
{
   foo(a);
}


[Bug fortran/60148] strings in NAMELIST do not honor DELIM= in open statement

2014-02-12 Thread jvdelisle at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60148

--- Comment #7 from Jerry DeLisle jvdelisle at gcc dot gnu.org ---
The regressions are twofold:

1) Tests are specifically looking for a " or a ' when no longer generated,

and

2) We need to also revise namelist reading of character types which are no
longer delimited

namelist_18.f90, modify test
namelist_38.f90, error in read
namelist_56.f90, ...
namelist_70.f90, ...

My patch is as follows so far, a little different from Steve's. With this patch
I don't explicitly write a space delim because we write one in the first chunk
below.  This was needed for namelist_16.f90

Index: write.c
===
--- write.c(revision 206864)
+++ write.c(working copy)
@@ -1921,7 +1921,8 @@
  to column 2. Reset the repeat counter.  */

   dtp->u.p.no_leading_blank = 0;
-  write_character (dtp, semi_comma, 1, 1);
+  if (dtp->u.p.nml_delim || (obj->type != BT_CHARACTER))
+    write_character (dtp, semi_comma, 1, 1);
   if (num > 5)
     {
       num = 0;
@@ -1971,9 +1972,18 @@

   /* Set the delimiter for namelist output.  */
   tmp_delim = dtp->u.p.current_unit->delim_status;
+  switch (tmp_delim)
+    {
+    case DELIM_APOSTROPHE:
+      dtp->u.p.nml_delim = '\'';
+      break;
+    case DELIM_QUOTE:
+      dtp->u.p.nml_delim = '"';
+      break;
+    default:
+      dtp->u.p.nml_delim = '\0';
+    }
 
-  dtp->u.p.nml_delim = tmp_delim == DELIM_APOSTROPHE ? '\'' : '"';
-
   /* Temporarily disable namelist delimters.  */
   dtp->u.p.current_unit->delim_status = DELIM_NONE;

I have not looked at read yet.


[Bug tree-optimization/60165] New: may be used uninitialized warning with -O3 but not with -O2

2014-02-12 Thread vincent-gcc at vinc17 dot net
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60165

Bug ID: 60165
   Summary: may be used uninitialized warning with -O3 but not
with -O2
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vincent-gcc at vinc17 dot net

With:

gcc (Debian 20140111-1) 4.9.0 20140111 (experimental) [trunk revision 206552]

I get the following inconsistency in the warnings:

ypig% cat out.i
int a, b;
int fn2 (int, int);
int fn1 (int *p1)
{
    if (fn2 (a, 0))
        *p1 = b;
    int c;
    fn1 (&c);
    return c;
}
ypig% gcc-snapshot -c -Wall -Werror=maybe-uninitialized -O2 out.i
ypig% gcc-snapshot -c -Wall -Werror=maybe-uninitialized -O3 out.i
out.i: In function 'fn1':
out.i:9:5: error: 'c' may be used uninitialized in this function
[-Werror=maybe-uninitialized]
 return c;
 ^
cc1: some warnings being treated as errors
ypig% 

I don't know whether this is regarded as normal, but this looks strange.

Note: I got this problem when compiling round_prec.c from the GNU MPFR trunk. I
generated the preprocessed file with -E, then used creduce on the following
script:

#!/bin/sh
{
  gcc-snapshot -c -Wall -Werror=maybe-uninitialized -O2 out.i && \
  ! gcc-snapshot -c -Wall -Werror=maybe-uninitialized -O3 out.i
} >/dev/null 2>&1

to generate the simple testcase (and fixed the declarations to avoid additional
warnings -- I think I should have used -Werror in the script to avoid them in
the first place).


[Bug libgcc/60166] New: ARM default NAN encoding violates EABI

2014-02-12 Thread joey.ye at arm dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60166

Bug ID: 60166
   Summary: ARM default NAN encoding violates EABI
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libgcc
  Assignee: unassigned at gcc dot gnu.org
  Reporter: joey.ye at arm dot com

#include <stdio.h>
#include <string.h>
#include <math.h>
int g;
float i = 0.0 ,j = 0.0 ;

int main()
{
    float f = i / j;
    memcpy(&g, &f, sizeof(g));
    printf("f=%f, hex=%x\n", f, g);
    return 0;
}

When built for ARM thumb1, result is:
f=nan, hex=7fff

While according to the RTABI
(http://infocenter.arm.com/help/topic/com.arm.doc.ihi0043d/IHI0043D_rtabi.pdf)
section 4.1.1.1:

When not otherwise specified by IEEE 754, the result on an invalid operation
should be the quiet NaN bit pattern with only the most significant bit of the
significand set, and all other significand bits zero.

So current libgcc is violating ARM EABI.

I have a patch under testing.
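
The RTABI wording quoted above can be turned into a small bit-level check. This is only a sketch, assuming float is IEEE 754 binary32 (1 sign bit, 8 exponent bits, 23 significand bits); the function and macro names are hypothetical, and ignoring the sign bit is my assumption, since the quoted sentence does not mention it.

```c
#include <stdint.h>
#include <string.h>

/* Assumed layout: IEEE 754 binary32.  All names below are hypothetical. */
#define EXP_MASK   0x7f800000u  /* 8 exponent bits */
#define FRAC_MASK  0x007fffffu  /* 23 significand bits */
#define QUIET_BIT  0x00400000u  /* most significant significand bit */

/* Copy the object representation of a float into a 32-bit integer. */
uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Returns 1 when bits encode a NaN whose significand has only its most
   significant bit set and all other significand bits zero.  The sign
   bit is ignored (an assumption of this sketch). */
int is_default_quiet_nan(uint32_t bits)
{
    return (bits & EXP_MASK) == EXP_MASK
           && (bits & FRAC_MASK) == QUIET_BIT;
}
```

Under these assumptions is_default_quiet_nan(0x7fc00000u) holds, while a pattern with every significand bit set, such as 0x7fffffffu, is rejected. In the test program above the check could be applied as is_default_quiet_nan(float_bits(f)).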


[Bug c++/60167] New: [4.9 regression] Bogus error: conflicting declaration

2014-02-12 Thread ppluzhnikov at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60167

Bug ID: 60167
   Summary: [4.9 regression] Bogus error: conflicting declaration
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ppluzhnikov at google dot com

The test case below fails to compile with current trunk:
g++ (GCC) 4.9.0 20140213 (experimental)

g++ -c t.cc
t.cc:10:48: error: conflicting declaration ‘typename Foo<F>::Bar Foo<F>::cache’
 template <int F> typename Foo<F>::Bar Foo<F>::cache;
^
t.cc:5:14: note: previous declaration as ‘Foo<F>::Bar Foo<F>::cache’
   static Bar cache;
  ^
t.cc:10:48: error: declaration of ‘Foo<F>::Bar Foo<F>::cache’ outside of class
is not definition [-fpermissive]
 template <int F> typename Foo<F>::Bar Foo<F>::cache;
^

/// --- cut ---
template <int F>
struct Foo {
  typedef int Bar;

  static Bar cache;
};

// template <int F> int Foo<F>::cache;  // OK

template <int F> typename Foo<F>::Bar Foo<F>::cache;

/// --- cut ---

Removing reference (template <int F> struct Foo) also makes it compile.

Compiles fine with gcc-4.8 and Clang.

[Bug c++/60168] New: Incorrect check in ~unique_ptr() when Deleter::pointer type is not a pointer type

2014-02-12 Thread ashish.sadanandan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60168

Bug ID: 60168
   Summary: Incorrect check in ~unique_ptr() when Deleter::pointer
type is not a pointer type
   Product: gcc
   Version: 4.8.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ashish.sadanandan at gmail dot com

The following compiles on both VS2013 and ICC 13.0.1

#include <memory>

struct del
{
    using pointer = int;
    void operator()(int) {}
};

int main()
{
    std::unique_ptr<int, del> p;
}

It fails on gcc4.8.1 with this error

/usr/include/c++/4.8/bits/unique_ptr.h: In instantiation of
'std::unique_ptr<_Tp, _Dp>::~unique_ptr() [with _Tp = int; _Dp = del]':

main.cpp:13:35:   required from here

/usr/include/c++/4.8/bits/unique_ptr.h:183:12: error: invalid operands of types
'int' and 'std::nullptr_t' to binary 'operator!='

  if (__ptr != nullptr)

I believe that last if statement should be

if (__ptr != pointer())


[Bug target/60169] New: ICE ARM thumb1 handles far jump

2014-02-12 Thread joey.ye at arm dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60169

Bug ID: 60169
   Summary: ICE ARM thumb1 handles far jump
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: joey.ye at arm dot com

Created attachment 32119
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32119&action=edit
testcase

Trunk gcc 20140210:

arm-none-eabi-gcc -mthumb -fomit-frame-pointer -mthumb -fPIC -mcpu=cortex-m0
-mno-lra png.c -c

png.c: In function 'png_do_read_swap_alpha':
png.c:104:1: internal compiler error: in reload, at reload1.c:1058
 }
 ^
Please submit a full bug report,
with preprocessed source if appropriate.
See http://gcc.gnu.org/bugs.html for instructions.


[Bug c++/60168] Incorrect check in ~unique_ptr() when Deleter::pointer type is not a pointer type

2014-02-12 Thread redi at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60168

Jonathan Wakely redi at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #1 from Jonathan Wakely redi at gcc dot gnu.org ---
Not a bug, the type unique_ptr<T,D>::pointer must meet the requirements of a
NullablePointer which includes being comparable with nullptr, so int is not
allowed


[Bug target/60169] ICE ARM thumb1 handles far jump

2014-02-12 Thread joey.ye at arm dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60169

--- Comment #1 from Joey Ye joey.ye at arm dot com ---
Caused by http://gcc.gnu.org/ml/gcc-patches/2012-12/msg01229.html, reason is
that stack layout shouldn't change during and after reload.

I have a patch fixing it under testing.


[Bug c++/60168] Incorrect check in ~unique_ptr() when Deleter::pointer type is not a pointer type

2014-02-12 Thread redi at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60168

--- Comment #2 from Jonathan Wakely redi at gcc dot gnu.org ---
The standard also specifies the behaviour of ~unique_ptr<T,D> in terms of
comparing the stored pointer with nullptr.


[Bug plugins/59335] Plugin doesn't build on trunk

2014-02-12 Thread joey.ye at arm dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59335

Joey Ye joey.ye at arm dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #9 from Joey Ye joey.ye at arm dot com ---
Resolved in trunk


[Bug c++/60168] Incorrect check in ~unique_ptr() when Deleter::pointer type is not a pointer type

2014-02-12 Thread ashish.sadanandan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60168

--- Comment #3 from Ashish Sadanandan ashish.sadanandan at gmail dot com ---
You are right, of course. Not a bug, but it's disappointing that it isn't. If
that comparison were against a value initialized unique_ptr<T, D>::pointer,
instead of nullptr, it'd allow unique_ptr to be used to manage any generic
`handle` type, which may not meet the requirements of NullablePointer.


[Bug c/60170] New: No -Wtype-limits warning with -O1

2014-02-12 Thread chengniansun at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60170

Bug ID: 60170
   Summary: No -Wtype-limits warning with -O1
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: chengniansun at gmail dot com

For the expression -4L == (*g = l == 0): 
1) gcc -O0 warns that the comparison is always false, which is desired. 
2) with gcc -O1, the expression is optimized away, and the function fn1
directly returns 0. No -Wtype-limits warning for this expression. 

Since warnings are a way to notify programmers of potential bugs, IMHO it may
still be necessary to report the warning at -O1, even though GCC has no policy
to ensure warning consistency between -O0 and -O.



$: cat s.c
unsigned short *g;
int fn1() {
  unsigned char ***const l = 0;
  return -4L == (*g = l == 0);
}
$: gcc-trunk -Wtype-limits -c s.c
s.c: In function ‘fn1’:
s.c:4:14: warning: comparison is always false due to limited range of data type
[-Wtype-limits]
   return -4L == (*g = l == 0);
  ^
$: gcc-trunk -Wtype-limits -c -O1 s.c
$: gcc-trunk --version
gcc-trunk (GCC) 4.9.0 20140210 (experimental)
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
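
As a minimal sketch of why the comparison in the report is always false: the assignment expression `*g = l == 0` has type unsigned short, so the value compared against -4L is never negative after conversion to long. The function name below is hypothetical and only isolates that comparison.

```c
/* Hypothetical helper isolating the comparison from the testcase. */
int equals_minus_four(unsigned short v)
{
    /* v is converted to long for the comparison.  Its value stays in
       the range 0..USHRT_MAX, so it can never compare equal to -4. */
    return -4L == v;
}
```

Whatever argument is passed, including (unsigned short)-4, the function returns 0, which is the fact the -O0 warning points out.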

Re: Warn about virtual table mismatches

2014-02-12 Thread Jason Merrill

On 02/11/2014 10:27 PM, Jan Hubicka wrote:

On 02/11/2014 07:54 PM, Jan Hubicka wrote:

+ /* Allow combining RTTI and non-RTTI is OK.  */


You mean combining -frtti and -fno-rtti compiles?  Yes, that's fine,
though you need to prefer the -frtti version in case code from that
translation unit uses the RTTI info.


Is there some mechanism that linker will do so? At the moment we just chose 
variant
that would be selected by linker.  I can make the choice, but what happens with 
non-LTO?


Hmm, the linker might well make the wrong choice.  Might be worth 
warning about this as well.



+   a type with the same name but number of virtual methods is 


but different number

Jason



Re: [C++ Patch/RFC] PR 60047

2014-02-12 Thread Jason Merrill

On 02/06/2014 02:59 AM, Paolo Carlini wrote:

-  if (vec_safe_is_empty (vbases))
+  if (vbases == NULL)


vec_safe_is_empty is still more correct here.

The rest of the patch is OK.

Jason



Fix broken build for AVR and SPU targets

2014-02-12 Thread Senthil Kumar Selvaraj
The below patch fixes the build for AVR and SPU targets, which got broken
when the signature of build_function_call_vec changed. The patch passes
vNULL for the extra parameter added (arg_loc), which I hope is ok for
builtins?

If ok, could someone commit please? I don't have commit access.

Regards
Senthil

gcc/ChangeLog

2014-02-12  Senthil Kumar Selvaraj  senthil_kumar.selva...@atmel.com

* config/avr/avr-c.c (avr_resolve_overloaded_builtin): Pass vNULL for 
arg_loc.
* config/spu/spu-c.c (spu_resolve_overloaded_builtin): Likewise.


diff --git gcc/config/avr/avr-c.c gcc/config/avr/avr-c.c
index 98650e0..101d280 100644
--- gcc/config/avr/avr-c.c
+++ gcc/config/avr/avr-c.c
@@ -115,7 +115,7 @@ avr_resolve_overloaded_builtin (unsigned int iloc, tree fndecl, void *vargs)
   fold = targetm.builtin_decl (id, true);
 
   if (fold != error_mark_node)
-fold = build_function_call_vec (loc, fold, args, NULL);
+fold = build_function_call_vec (loc, vNULL, fold, args, NULL);
 
   break; // absfx
 
@@ -181,7 +181,7 @@ avr_resolve_overloaded_builtin (unsigned int iloc, tree fndecl, void *vargs)
   fold = targetm.builtin_decl (id, true);
 
   if (fold != error_mark_node)
-fold = build_function_call_vec (loc, fold, args, NULL);
+fold = build_function_call_vec (loc, vNULL, fold, args, NULL);
 
   break; // roundfx
 
@@ -238,7 +238,7 @@ avr_resolve_overloaded_builtin (unsigned int iloc, tree fndecl, void *vargs)
   fold = targetm.builtin_decl (id, true);
 
   if (fold != error_mark_node)
-fold = build_function_call_vec (loc, fold, args, NULL);
+fold = build_function_call_vec (loc, vNULL, fold, args, NULL);
 
   break; // countlsfx
 }
diff --git gcc/config/spu/spu-c.c gcc/config/spu/spu-c.c
index 411496d..9d7aa5a 100644
--- gcc/config/spu/spu-c.c
+++ gcc/config/spu/spu-c.c
@@ -181,7 +181,7 @@ spu_resolve_overloaded_builtin (location_t loc, tree fndecl, void *passed_args)
   return error_mark_node;
 }
 
-  return build_function_call_vec (loc, match, fnargs, NULL);
+  return build_function_call_vec (loc, vNULL, match, fnargs, NULL);
 #undef SCALAR_TYPE_P
 }
 


Re: PATCH: PR target/60151: HAVE_AS_GOTOFF_IN_DATA is mis-detected on x86-64

2014-02-12 Thread Uros Bizjak
On Tue, Feb 11, 2014 at 9:41 PM, H.J. Lu hjl.to...@gmail.com wrote:

 HAVE_AS_GOTOFF_IN_DATA defines a 32-bit assembler feature, we need to
 pass --32 to assembler. Otherwise, we get the wrong result on x86-64.
 We already pass --32 to assembler on x86.  It should be OK to do it
 in configure.  OK for trunk?

 This would break Solaris/x86 with as configurations, where this test
 currently passes, but would fail since as doesn't understand --32.


 How about passing --32 to as only for Linux?  OK to install?

 I'd rather do it for gas instead, which can be used on non-Linux
 systems, too.


 Sure.  Here is the new patch.  OK to install?

Attached is slightly changed patch that follows established
configure.ac code formatting. Please check if this version works for
you.

The patch is OK for mainline and release branches.

Thanks,
Uros.
Index: configure
===
--- configure   (revision 207710)
+++ configure   (working copy)
@@ -25028,6 +25028,10 @@
 
 # These two are used unconditionally by i386.[ch]; it is to be defined
 # to 1 if the feature is present, 0 otherwise.
+as_ix86_gotoff_in_data_opt=
+if test x$gas = xyes; then
+  as_ix86_gotoff_in_data_opt=--32
+fi
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for GOTOFF in data" >&5
 $as_echo_n "checking assembler for GOTOFF in data... " >&6; }
 if test "${gcc_cv_as_ix86_gotoff_in_data+set}" = set; then :
@@ -25044,7 +25048,7 @@
nop
.data
.long .L0@GOTOFF' > conftest.s
-if { ac_try='$gcc_cv_as $gcc_cv_as_flags  -o conftest.o conftest.s >&5'
+if { ac_try='$gcc_cv_as $gcc_cv_as_flags $as_ix86_gotoff_in_data_opt -o conftest.o conftest.s >&5'
   { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
   (eval $ac_try) 2>&5
   ac_status=$?
Index: configure.ac
===
--- configure.ac(revision 207710)
+++ configure.ac(working copy)
@@ -3867,8 +3867,13 @@
 
 # These two are used unconditionally by i386.[ch]; it is to be defined
 # to 1 if the feature is present, 0 otherwise.
+as_ix86_gotoff_in_data_opt=
+if test x$gas = xyes; then
+  as_ix86_gotoff_in_data_opt=--32
+fi
 gcc_GAS_CHECK_FEATURE([GOTOFF in data],
-gcc_cv_as_ix86_gotoff_in_data, [2,11,0],,
+  gcc_cv_as_ix86_gotoff_in_data, [2,11,0],
+  [$as_ix86_gotoff_in_data_opt],
 [  .text
 .L0:
nop

