Re: RFC: Add 32bit x86-64 support to binutils

2011-01-04 Thread Jan Beulich
>>> On 04.01.11 at 21:02, Jakub Jelinek  wrote:
> On Tue, Jan 04, 2011 at 10:35:42AM -0800, H. Peter Anvin wrote:
>> On 01/04/2011 09:56 AM, H.J. Lu wrote:
>> >>
>> >> I think it is a gross misconception to tie the ABI to the ELF class of
>> >> an object. Specifying the ABI should imo be done via e_flags or
>> >> one of the unused bytes of e_ident, and in all reality the ELF class
>> >> should *only* affect the file layout (and 64-bit should never have
>> >> forbidden to use 32-bit ELF containers; similarly 64-bit ELF objects
>> >> may have uses for 32-bit architectures/ABIs, e.g. when debug
>> >> information exceeds the 4G boundary).
>> > 
>> > I agree with you in principle. But I think it should be done via
>> > a new attribute section, similar to ARM.
>> > 
>> 
>> Oh god, please, no.
>> 
>> I have to say I highly question Jan's statement in the first
>> place.  Crossing 32- and 64-bit ELF like that sounds like a kernel
>> security hole waiting to happen.

A particular OS/kernel has the freedom to not implement support for
anything other than the default format. But having the ABI disallow it
altogether certainly isn't the right choice. And yes, we had been
allowing cross-bitness ELF in an experimental (long canceled) OS
of ours.

> Yeah, and there are other targets where the elf class determines ABI
> too (e.g. EM_S390 is used for both 31-bit and 64-bit binaries and
> the ELF class determines which).

So the usual thing is going to happen - someone made a mistake (I'm
convinced the ELF class was never meant to affect anything but the
file format), and this gets taken as an excuse to let the mistake
spread.

Jan
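
The distinction argued over in this thread, container class versus machine/ABI, can be made concrete with a small sketch. It only decodes two ELF header fields, e_ident[EI_CLASS] and e_machine, and reports them separately. The field offsets and numeric values follow the generic ELF specification; the helper names describe_elf and check_describe_elf are hypothetical and are not part of binutils. The sample header assumes the ILP32 x86-64 objects under discussion pair an ELFCLASS32 container with EM_X86_64.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Field positions and values from the generic ELF specification.
constexpr std::size_t kEiClass = 4;         // index of EI_CLASS in e_ident
constexpr std::size_t kEiData = 5;          // index of EI_DATA in e_ident
constexpr std::size_t kMachineOffset = 18;  // e_machine: same offset in ELF32 and ELF64
constexpr unsigned kElfClass32 = 1, kElfClass64 = 2;
constexpr unsigned kElfData2Msb = 2;        // big-endian data encoding
constexpr unsigned kEmS390 = 22, kEmX86_64 = 62;

// Hypothetical helper: reports the container class and the machine as two
// separate facts, without inferring an ABI from either of them.
std::string describe_elf(const unsigned char* hdr, std::size_t len) {
    if (hdr == nullptr || len < kMachineOffset + 2 ||
        hdr[0] != 0x7f || hdr[1] != 'E' || hdr[2] != 'L' || hdr[3] != 'F')
        return "not ELF";

    std::string cls = "unknown class";
    if (hdr[kEiClass] == kElfClass32) cls = "ELFCLASS32";
    else if (hdr[kEiClass] == kElfClass64) cls = "ELFCLASS64";

    // e_machine is a 16-bit field stored in the file's own byte order.
    unsigned lo = hdr[kMachineOffset], hi = hdr[kMachineOffset + 1];
    if (hdr[kEiData] == kElfData2Msb) { unsigned t = lo; lo = hi; hi = t; }
    unsigned machine = (hi << 8) | lo;

    std::string name = "machine " + std::to_string(machine);
    if (machine == kEmX86_64) name = "EM_X86_64";
    else if (machine == kEmS390) name = "EM_S390";
    return cls + " / " + name;
}

// Usage sketch: a little-endian 32-bit container carrying x86-64 code.
void check_describe_elf() {
    const unsigned char x32[20] = {0x7f, 'E', 'L', 'F', 1, 1, 1, 0, 0, 0,
                                   0, 0, 0, 0, 0, 0, 1, 0, 62, 0};
    assert(describe_elf(x32, sizeof x32) == "ELFCLASS32 / EM_X86_64");
}
```

Nothing in the header layout itself forces the class to imply the ABI; that coupling is a convention of the individual psABIs, which is the point being debated above.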



gcc-4.4-20110104 is now available

2011-01-04 Thread gccadmin
Snapshot gcc-4.4-20110104 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.4-20110104/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.4 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_4-branch 
revision 168486

You'll find:

 gcc-4.4-20110104.tar.bz2 Complete GCC (includes all of below)

  MD5=a20926b23c217d847349975fcfcebf39
  SHA1=fd2690c821e3a0ec46ca0b29262ae4672b721ad1

 gcc-core-4.4-20110104.tar.bz2 C front end and core compiler

  MD5=d474a5cdff19ffb203479ec88940a83f
  SHA1=9f91e276364365ee812326bb13461096e5fc68bb

 gcc-ada-4.4-20110104.tar.bz2 Ada front end and runtime

  MD5=6a26c6e5b934f3ea20a1d0a26cf235ef
  SHA1=bfaaa020e1fa0fd6caf01471ed6be270c1a739c4

 gcc-fortran-4.4-20110104.tar.bz2 Fortran front end and runtime

  MD5=0da0a7ebab5ff18cce43dd9501a72f36
  SHA1=f29287caa48799a221bee6ee81ee15a8c1c10a9e

 gcc-g++-4.4-20110104.tar.bz2 C++ front end and runtime

  MD5=656d80428c6a7ddb3a11bab87b08881a
  SHA1=8cf5c78dcb8242195a288df1986ef59f170c278e

 gcc-go-4.4-20110104.tar.bz2  Go front end and runtime

  MD5=b53e4806b0a05e56e7f852a08b57a68d
  SHA1=ce839adc19667f7324e01ee2586a098a7ce33d04

 gcc-java-4.4-20110104.tar.bz2 Java front end and runtime

  MD5=5b060abea2e9f52157ef3359cd02e4c9
  SHA1=249acb0e6f9f9f1ec7a93ab8e508f1abba4736c6

 gcc-objc-4.4-20110104.tar.bz2 Objective-C front end and runtime

  MD5=14b9e07c732b76f39971ad99bfd878dc
  SHA1=f0154ce2f0ec84a0a817590a02a5f3324438f48d

 gcc-testsuite-4.4-20110104.tar.bz2   The GCC testsuite

  MD5=806305e6761b186bc80896bc8abd946c
  SHA1=6d18769004fbb39b84d38e8e4290472b89ba96ea

Diffs from 4.4-20101228 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.4
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: gcc interprets C++0x initialization construct as function declaration

2011-01-04 Thread Jonathan Wakely
On 3 January 2011 05:24, Nathan Ridge wrote:
>
> Is this the desired behaviour?

Questions about whether code is valid or whether gcc has a bug should
be sent to the gcc-h...@gcc.gnu.org mailing list or entered into
bugzilla, thanks.


Re: RFC: Add 32bit x86-64 support to binutils

2011-01-04 Thread Jakub Jelinek
On Tue, Jan 04, 2011 at 10:35:42AM -0800, H. Peter Anvin wrote:
> On 01/04/2011 09:56 AM, H.J. Lu wrote:
> >>
> >> I think it is a gross misconception to tie the ABI to the ELF class of
> >> an object. Specifying the ABI should imo be done via e_flags or
> >> one of the unused bytes of e_ident, and in all reality the ELF class
> >> should *only* affect the file layout (and 64-bit should never have
> >> forbidden to use 32-bit ELF containers; similarly 64-bit ELF objects
> >> may have uses for 32-bit architectures/ABIs, e.g. when debug
> >> information exceeds the 4G boundary).
> > 
> > I agree with you in principle. But I think it should be done via
> > a new attribute section, similar to ARM.
> > 
> 
> Oh god, please, no.
> 
> I have to say I highly question Jan's statement in the first
> place.  Crossing 32- and 64-bit ELF like that sounds like a kernel
> security hole waiting to happen.

Yeah, and there are other targets where the elf class determines ABI
too (e.g. EM_S390 is used for both 31-bit and 64-bit binaries and
the ELF class determines which).

Jakub


Re: RFC: Add 32bit x86-64 support to binutils

2011-01-04 Thread H. Peter Anvin
On 01/04/2011 09:56 AM, H.J. Lu wrote:
>>
>> I think it is a gross misconception to tie the ABI to the ELF class of
>> an object. Specifying the ABI should imo be done via e_flags or
>> one of the unused bytes of e_ident, and in all reality the ELF class
>> should *only* affect the file layout (and 64-bit should never have
>> forbidden to use 32-bit ELF containers; similarly 64-bit ELF objects
>> may have uses for 32-bit architectures/ABIs, e.g. when debug
>> information exceeds the 4G boundary).
> 
> I agree with you in principle. But I think it should be done via
> a new attribute section, similar to ARM.
> 

Oh god, please, no.

I have to say I highly question Jan's statement in the first
place.  Crossing 32- and 64-bit ELF like that sounds like a kernel
security hole waiting to happen.

-hpa



The Linux binutils 2.21.51.0.5 is released

2011-01-04 Thread H.J. Lu
This release added the ILP32 support

http://www.kernel.org/pub/linux/devel/binutils/ilp32/abi.pdf

to Linux/x86-64.


H.J.
---
This is the beta release of binutils 2.21.51.0.5 for Linux, which is
based on binutils 2011 0104 in CVS on sourceware.org plus various
changes. It is purely for Linux.

All relevant patches in the patches directory have been applied to the
source tree. You can take a look at patches/README to see what has been
applied and in what order.

Starting from the 2.21.51.0.2 release, the BFD linker has working LTO
plugin support. It can be used with GCC 4.5 and above. For GCC 4.5, you
need to configure GCC with --enable-gold to enable LTO plugin support.

Starting from the 2.21.51.0.2 release, binutils fully supports compressed
debug sections.  However, compressed debug sections are not turned on by
default in the assembler. I am planning to turn them on for the x86
assembler in a future release, which may lead to Linux kernel bug messages like

WARNING: lib/ts_kmp.o (.zdebug_aranges): unexpected non-allocatable section.

But the resulting kernel works fine.

Starting from the 2.20.51.0.4 release, no diffs against the previous
release will be provided.

You can enable both gold and bfd ld with --enable-gold=both.  Gold will
be installed as ld.gold and bfd ld will be installed as ld.bfd.  By
default, ld.bfd will be installed as ld.  You can use the configure
option, --enable-gold=both/gold to choose gold as the default linker,
ld.  IA-32 binary and x86_64 binary tarballs are configured with
--enable-gold=both/ld --enable-plugins --enable-threads.

Starting from the 2.18.50.0.4 release, the x86 assembler no longer
accepts

fnstsw %eax

fnstsw stores 16 bits into %ax and leaves the upper 16 bits of %eax unchanged.
Please use

fnstsw %ax
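
The reason the %eax spelling is misleading can be sketched as a tiny C++ model of a 16-bit store into the low half of a 32-bit register. This is only an illustration of the behaviour described above, not assembler code; the names write_low16 and check_fnstsw_model are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Models "store 16 bits into %ax": the low half of the 32-bit register is
// replaced, the upper half keeps whatever value it had before.
std::uint32_t write_low16(std::uint32_t eax, std::uint16_t status) {
    return (eax & 0xFFFF0000u) | status;
}

// Usage sketch: stale upper bits survive the store.
void check_fnstsw_model() {
    assert(write_low16(0xDEADBEEFu, 0x3800u) == 0xDEAD3800u);
}
```

Code that reads the full 32-bit register afterwards would therefore see stale upper bits, which is why writing the operand as %ax states the actual effect.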

Starting from the 2.17.50.0.4 release, the default output section LMA
(load memory address) has changed for allocatable sections from being
equal to VMA (virtual memory address), to keeping the difference between
LMA and VMA the same as the previous output section in the same region.

For

.data.init_task : { *(.data.init_task) }

LMA of .data.init_task section is equal to its VMA with the old linker.
With the new linker, it depends on the previous output section. You
can use

.data.init_task : AT (ADDR(.data.init_task)) { *(.data.init_task) }

to ensure that LMA of .data.init_task section is always equal to its
VMA. The linker script in the older 2.6 x86-64 kernel depends on the
old behavior.  You can add AT (ADDR(section)) to force LMA of
.data.init_task section equal to its VMA. It will work with both old
and new linkers. The x86-64 kernel linker script in kernel 2.6.13 and
above is OK.
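
As a rough illustration of the rule change, the following sketch models the two defaults as plain arithmetic. It is a simplified model under stated assumptions: it ignores memory regions, explicit AT clauses and any other conditions the real linker checks, and the names OutputSection, old_default_lma and new_default_lma are hypothetical. The addresses in the usage function are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Addresses of an already placed output section.
struct OutputSection {
    std::uint64_t vma;  // virtual memory address
    std::uint64_t lma;  // load memory address
};

// Old default: LMA equals VMA.
std::uint64_t old_default_lma(std::uint64_t vma) { return vma; }

// New default: keep the same LMA-VMA difference as the previous output
// section. Unsigned wrap-around makes this work when lma < vma as well.
std::uint64_t new_default_lma(const OutputSection& prev, std::uint64_t vma) {
    return vma + (prev.lma - prev.vma);
}

// Usage sketch with made-up addresses: the previous section is loaded low
// but mapped high, so the next section inherits the same offset.
void check_lma_model() {
    const OutputSection prev = {0xffffffff80200000ull, 0x200000ull};
    const std::uint64_t vma = 0xffffffff80300000ull;
    assert(old_default_lma(vma) == vma);
    assert(new_default_lma(prev, vma) == 0x300000ull);
}
```

In this model, writing AT (ADDR(section)) corresponds to forcing the old_default_lma result regardless of the previous section.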

The new x86_64 assembler no longer accepts

monitor %eax,%ecx,%edx

You should use

monitor %rax,%ecx,%edx

or

monitor

which works with both old and new x86_64 assemblers. They should
generate the same opcode.

The new i386/x86_64 assemblers no longer accept instructions for moving
between a segment register and a 32bit memory location, i.e.,

movl (%eax),%ds
movl %ds,(%eax)

To generate instructions for moving between a segment register and a
16bit memory location without the 16bit operand size prefix, 0x66,

mov (%eax),%ds
mov %ds,(%eax)

should be used. It will work with both new and old assemblers. The
assembler starting from 2.16.90.0.1 will also support

movw (%eax),%ds
movw %ds,(%eax)

without the 0x66 prefix. Patches for 2.4 and 2.6 Linux kernels are
available at

http://www.kernel.org/pub/linux/devel/binutils/linux-2.4-seg-4.patch
http://www.kernel.org/pub/linux/devel/binutils/linux-2.6-seg-5.patch

The ia64 assembler now defaults to tuning for Itanium 2 processors.
To build a kernel for Itanium 1 processors, you will need to add

ifeq ($(CONFIG_ITANIUM),y)
CFLAGS += -Wa,-mtune=itanium1
AFLAGS += -Wa,-mtune=itanium1
endif

to arch/ia64/Makefile in your kernel source tree.

Please report any bugs related to binutils 2.21.51.0.5 to
hjl.to...@gmail.com

and

http://www.sourceware.org/bugzilla/

Changes from binutils 2.21.51.0.4:

1. Update from binutils 2011 0104.
2. Add ILP32 support to Linux/x86-64.
3. Prevent the Linux x86-64 kernel build failure and remove
__ld_compatibility support.  PR 12356.
4. Improve gold.
5. Improve Windows support.
6. Improve hppa support.
7. Improve mips support.

Changes from binutils 2.21.51.0.3:

1. Update from binutils 2010 1217.
2. Fix the Linux relocatable kernel build.  PR 12327.
3. Improve mips support.

Changes from binutils 2.21.51.0.2:

1. Update from binutils 2010 1215.
2. Add BFD linker support for placing input .ctors/.dtors sections in
output .init_array/.fini_array section.  Add SORT_BY_INIT_PRIORITY.  The
benefits are
   a. Avoid output .ctors/.dtors section in executables and shared
  libraries.
   b. Allow mixing input .ctors/.dtors sections with input
   .init_array/.fini_array sections.  GCC PR 46770.
3. Add BFD 

libiberty/.gitignore isn't in gcc tree

2011-01-04 Thread H.J. Lu
Hi,

libiberty/.gitignore was added to src. But it isn't in gcc tree.

-- 
H.J.


Re: RFC: Add 32bit x86-64 support to binutils

2011-01-04 Thread H.J. Lu
On Mon, Jan 3, 2011 at 2:40 AM, Jan Beulich wrote:
> On 30.12.10 at 21:02, "H.J. Lu" wrote:
>>
>> Here is the ILP32 psABI:
>>
>> http://www.kernel.org/pub/linux/devel/binutils/ilp32/
>>
>
> I think it is a gross misconception to tie the ABI to the ELF class of
> an object. Specifying the ABI should imo be done via e_flags or
> one of the unused bytes of e_ident, and in all reality the ELF class
> should *only* affect the file layout (and 64-bit should never have
> forbidden to use 32-bit ELF containers; similarly 64-bit ELF objects
> may have uses for 32-bit architectures/ABIs, e.g. when debug
> information exceeds the 4G boundary).
>

I agree with you in principle. But I think it should be done via
a new attribute section, similar to ARM.


-- 
H.J.


Re: [PATCH] -ftree-loop-linear fixes (PR tree-optimization/46970) (take 2)

2011-01-04 Thread Sebastian Pop
On Tue, Jan 4, 2011 at 10:22, Richard Guenther wrote:
> Ugh.  Sebastian - can we nuke tree-loop-linear completely and
> make -ftree-loop-linear an alias for -floop-interchange without
> regressions?  I'd like to reduce the number of broken passes from
> 2 to 1 this way ...

I wouldn't mind removing tree-loop-linear, although other people
should also give their opinion on this matter: tree-loop-linear has no
external dependences whereas -floop-interchange depends on cloog and ppl.

Also we should get all the testsuite/gcc.dg/tree-ssa/ltrans-*.c
passing with -floop-interchange.  I will add all these testcases to
the graphite testsuite and see where we stand.

Sebastian


Re: Really poor 4.5.2 results on Debian Squeeze with Intel i7

2011-01-04 Thread Andrew Pinski
On Mon, Jan 3, 2011 at 12:29 AM, Eric Botcazou wrote:
>> I was wondering about that lately. Should testsuite failures with
>> --enable-checking=all be reported? IIRC, the 4.5 branch won't even
>> bootstrap with that setting.
>
> I'd think so, but only for the trunk probably.

And don't report testcases that time out without first running them
outside of the testsuite harness, as they might just be very slow.

-- Pinski


Re: access to static data member fails with indirect ptr

2011-01-04 Thread Jonathan Wakely
On 4 January 2011 14:11, Klaus Rudolph wrote:
>
>> > Is my code wrong
>>
>> Yes.  You need to define A::x.
>
> Grrr... so stupid! :-)
>
> Yes, you are right. I was puzzled that only a few of the lines generate an error. Yes,
> the compiler optimizes the accesses away if the access is direct. With -O3
> it compiles and links without errors, even without const int A::x;

In future please send questions like this to the gcc-h...@gcc.gnu.org
mailing list, which is for help using gcc. This list is for discussing
development *of* gcc, not using gcc, as described at
http://gcc.gnu.org/lists.html

There are several invalid bug reports related to this same question
which give a bit more detail, e.g.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14404


Re: Code performance regression between gcc 4.5 and 4.6

2011-01-04 Thread Martin Reinecke



On 01/04/11 15:10, H.J. Lu wrote:
> We need a testcase to investigate.

This is now PR47167.

Cheers,
  Martin



Re: access to static data member fails with indirect ptr

2011-01-04 Thread Klaus Rudolph

> > Is my code wrong
> 
> Yes.  You need to define A::x.

Grrr... so stupid! :-)

Yes, you are right. I was puzzled that only a few of the lines generate an error. Yes,
the compiler optimizes the accesses away if the access is direct. With -O3
it compiles and links without errors, even without const int A::x;

Thanks for the hint.

Regards!
 Klaus



Re: Code performance regression between gcc 4.5 and 4.6

2011-01-04 Thread H.J. Lu
On Tue, Jan 4, 2011 at 5:57 AM, Martin Reinecke wrote:
>
>
> On 01/04/11 14:48, H.J. Lu wrote:
>>
>> On Tue, Jan 4, 2011 at 4:43 AM, Martin Reinecke
>>   wrote:
>>>
>>> Hi,
>>>
>>> while benchmarking a numerical C library making heavy use of SSE2
>>> intrinsics, I have noticed a significant (around 10 percent) slowdown
>>> in the code generated by the current gcc trunk, compared to the one
>>> produced by the 4.5.1 release.
>>> It's quite hard to reduce the code to a small test case, but I can easily
>>> point out the hot code regions where most of the CPU time is spent.
>>> Do you think I should open a PR for this, or is this kind of performance
>>> fluctuation to be expected?
>>>
>>
>> What compiler flags are you using? On which processors do you
>> run the library?
>
> The CPU is a Core2 Duo E8500; the optimization flags are
> "-O2 -ffast-math -fomit-frame-pointer".
> This is on a 64bit OS, so SSE2 is supported without additional
> flags.
>
> Using "-march=native" in addition to the flags above makes the timings
> worse for gcc 4.5.1 and slightly better for gcc 4.6,  but still the
> code generated by 4.5.1 is quite a bit faster.
> The trunk version was compiled from yesterday's sources.
>

We need a testcase to investigate.


-- 
H.J.


Re: Code performance regression between gcc 4.5 and 4.6

2011-01-04 Thread Martin Reinecke



On 01/04/11 14:48, H.J. Lu wrote:
> On Tue, Jan 4, 2011 at 4:43 AM, Martin Reinecke wrote:
>> Hi,
>>
>> while benchmarking a numerical C library making heavy use of SSE2
>> intrinsics, I have noticed a significant (around 10 percent) slowdown
>> in the code generated by the current gcc trunk, compared to the one
>> produced by the 4.5.1 release.
>> It's quite hard to reduce the code to a small test case, but I can easily
>> point out the hot code regions where most of the CPU time is spent.
>> Do you think I should open a PR for this, or is this kind of performance
>> fluctuation to be expected?
>
> What compiler flags are you using? On which processors do you
> run the library?

The CPU is a Core2 Duo E8500; the optimization flags are
"-O2 -ffast-math -fomit-frame-pointer".
This is on a 64bit OS, so SSE2 is supported without additional
flags.

Using "-march=native" in addition to the flags above makes the timings
worse for gcc 4.5.1 and slightly better for gcc 4.6,  but still the
code generated by 4.5.1 is quite a bit faster.
The trunk version was compiled from yesterday's sources.

Cheers,
  Martin


Re: access to static data member fails with indirect ptr

2011-01-04 Thread Andrew Haley

On 01/04/2011 12:49 PM, Klaus Rudolph wrote:
> Is my code wrong

Yes.  You need to define A::x.

Add this line:

const int A::x;

> If the code is wrong, I expect a compiler error not a linker message!

No, because A::x might be defined in another translation unit.

Andrew.
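
A minimal sketch of the fix suggested in this reply is shown below. It uses a real object instead of the null pointers from the original report, and the helper names read_through_pointer and check_static_member are hypothetical. Which accesses actually require the definition depends on the expression and, as the thread notes, on the optimization level; binding a reference is used here because it needs the object itself rather than just its value.

```cpp
#include <cassert>

class A {
public:
    static const int x = 10;  // declaration with an in-class initializer
};

// The out-of-class definition the reply asks for. It gives A::x storage,
// so uses that need an actual object (not just the value 10) can link.
const int A::x;

// Binding a reference needs the object itself, so this odr-uses A::x.
int read_through_pointer(const A* p) {
    const int& ref = p->x;
    return ref;
}

// Usage sketch: links only because the definition above is present.
void check_static_member() {
    A a;
    assert(read_through_pointer(&a) == 10);
    assert(&A::x == &a.x);  // one object shared by all instances
}
```

Because the definition may live in any translation unit, the compiler cannot diagnose its absence; only the linker sees the whole program, which is why the failure shows up as an undefined reference.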


Re: Code performance regression between gcc 4.5 and 4.6

2011-01-04 Thread H.J. Lu
On Tue, Jan 4, 2011 at 4:43 AM, Martin Reinecke wrote:
> Hi,
>
> while benchmarking a numerical C library making heavy use of SSE2
> intrinsics, I have noticed a significant (around 10 percent) slowdown
> in the code generated by the current gcc trunk, compared to the one
> produced by the 4.5.1 release.
> It's quite hard to reduce the code to a small test case, but I can easily
> point out the hot code regions where most of the CPU time is spent.
> Do you think I should open a PR for this, or is this kind of performance
> fluctuation to be expected?
>

What compiler flags are you using? On which processors do you
run the library?



-- 
H.J.


access to static data member fails with indirect ptr

2011-01-04 Thread Klaus Rudolph
Hi all,

the following code fails with gcc 4.4.3, 4.5.0 and a 4.6 snapshot (some weeks old):



#include <iostream>

using namespace std;


class A
{
public:
static const int x=10;
};

class Zgr_A
{
public:
A* operator->() { return (A*)0; }
};

template <class T>
class  Zgr
{
public:
T* operator->() { return (T*)0; }
};

int main()
{
A a_direct;
A* a_ptr;
Zgr_A a_indirect_ptr;
Zgr<A> a_template_ptr;

A* ptr_from_indirect= a_indirect_ptr.operator->();

cout << "0. " << A::x << endl;
cout << "1. " << a_direct.x << endl;
cout << "2. " << a_ptr->x << endl;
cout << "3. " << a_indirect_ptr->x << endl;
cout << "4. " << a_template_ptr->x << endl;
cout << "5. " << ptr_from_indirect->x << endl;
cout << "6. " << a_template_ptr.operator->()->x << endl;
cout << "7. " << ((A*)(a_template_ptr.operator->()))->x << endl;

return 0;
}



Result:
g++ -g main.cpp -o go
/tmp/ccABoZtk.o: In function `main':
main.cpp:37: undefined reference to `A::x'
main.cpp:38: undefined reference to `A::x'
main.cpp:40: undefined reference to `A::x'
main.cpp:41: undefined reference to `A::x'
collect2: ld returned 1 exit status

Is my code wrong or is it a compiler bug? If the code is wrong, I expect a 
compiler error not a linker message!

Wondering... 
 Klaus

P.S. Borland C++ compiles and links it correctly.




Code performance regression between gcc 4.5 and 4.6

2011-01-04 Thread Martin Reinecke

Hi,

while benchmarking a numerical C library making heavy use of SSE2
intrinsics, I have noticed a significant (around 10 percent) slowdown
in the code generated by the current gcc trunk, compared to the one
produced by the 4.5.1 release.
It's quite hard to reduce the code to a small test case, but I can easily
point out the hot code regions where most of the CPU time is spent.
Do you think I should open a PR for this, or is this kind of performance
fluctuation to be expected?

Cheers,
  Martin


Re: Behavior change of driver on multiple input assembly files

2011-01-04 Thread Jie Zhang

On 01/04/2011 07:33 AM, Ian Lance Taylor wrote:
> On Thu, Dec 30, 2010 at 9:07 PM, Jie Zhang wrote:
>> For a minimal fix, I propose to change combinable fields of assembly
>> languages in default_compilers[] to 0. See the attached patch
>> "gcc-not-combine-assembly-inputs.diff". I don't know why the combinable
>> fields were set to 1 when --combine option was introduced. There is no
>> explanation about that in that patch email.[2] Does anyone still remember?
>
> This patch is OK if it fixes PR 47137.  Please mention the PR in the
> ChangeLog entry.

Thanks. I have committed it now. I also posted it to gcc-patches mailing 
list with an updated ChangeLog entry:


http://gcc.gnu.org/ml/gcc-patches/2011-01/msg00122.html

--
Jie Zhang



GCC 4.6.0 Status Report (2011-01-04), Stage 3 is over

2011-01-04 Thread Richard Guenther

Status
======

Stage 3 is over and the trunk is now in regression and documentation
fixes only mode (operating as if we were on a release branch).  This
means we are now moving towards a release candidate of GCC 4.6.0
which can materialize once the list of serious regressions no longer
contains a P1 regression.

We have accumulated numerous serious regressions during Stage 1 and
also during Stage 3.  Now it is time to start fixing them.  Port
and OS maintainers may want to look at the list of all regressions
(including those rated as P4 and P5) and at least try to get a handle
on those that didn't appear in previous release series.


Quality Data
============

Priority          #   Change from Last Report
--------        ---   -----------------------
P1               31   - 19
P2              109   -  5
P3               28   +  4
--------        ---   -----------------------
Total           168   - 20


Previous Report
===============

http://gcc.gnu.org/ml/gcc/2010-10/msg00417.html


The next report will be sent by Jakub.