Re: RFC: Add 32bit x86-64 support to binutils
>>> On 04.01.11 at 21:02, Jakub Jelinek wrote:
> On Tue, Jan 04, 2011 at 10:35:42AM -0800, H. Peter Anvin wrote:
>> On 01/04/2011 09:56 AM, H.J. Lu wrote:
>> >>
>> >> I think it is a gross misconception to tie the ABI to the ELF class of
>> >> an object. Specifying the ABI should imo be done via e_flags or
>> >> one of the unused bytes of e_ident, and in all reality the ELF class
>> >> should *only* affect the file layout (and 64-bit should never have
>> >> forbidden to use 32-bit ELF containers; similarly 64-bit ELF objects
>> >> may have uses for 32-bit architectures/ABIs, e.g. when debug
>> >> information exceeds the 4G boundary).
>> >
>> > I agree with you in principle. But I think it should be done via
>> > a new attribute section, similar to ARM.
>> >
>>
>> Oh god, please, no.
>>
>> I have to say I'm highly questioning to Jan's statement in the first
>> place. Crossing 32- and 64-bit ELF like that sounds like a kernel
>> security hole waiting to happen.

A particular OS/kernel has the freedom to not implement support for
other than the default format. But having the ABI disallow it
altogether certainly isn't the right choice. And yes, we had been
allowing cross-bitness ELF in an experimental (long canceled) OS
of ours.

> Yeah, and there are other targets where the elf class determines ABI
> too (e.g. EM_S390 is used for both 31-bit and 64-bit binaries and
> the ELF class determines which).

So the usual thing is going to happen - someone made a mistake (I'm
convinced the ELF class was never meant to affect anything but the
file format), and this gets taken as an excuse to let the mistake
spread.

Jan
gcc-4.4-20110104 is now available
Snapshot gcc-4.4-20110104 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.4-20110104/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.4 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_4-branch revision 168486

You'll find:

gcc-4.4-20110104.tar.bz2              Complete GCC (includes all of below)
  MD5=a20926b23c217d847349975fcfcebf39
  SHA1=fd2690c821e3a0ec46ca0b29262ae4672b721ad1

gcc-core-4.4-20110104.tar.bz2         C front end and core compiler
  MD5=d474a5cdff19ffb203479ec88940a83f
  SHA1=9f91e276364365ee812326bb13461096e5fc68bb

gcc-ada-4.4-20110104.tar.bz2          Ada front end and runtime
  MD5=6a26c6e5b934f3ea20a1d0a26cf235ef
  SHA1=bfaaa020e1fa0fd6caf01471ed6be270c1a739c4

gcc-fortran-4.4-20110104.tar.bz2      Fortran front end and runtime
  MD5=0da0a7ebab5ff18cce43dd9501a72f36
  SHA1=f29287caa48799a221bee6ee81ee15a8c1c10a9e

gcc-g++-4.4-20110104.tar.bz2          C++ front end and runtime
  MD5=656d80428c6a7ddb3a11bab87b08881a
  SHA1=8cf5c78dcb8242195a288df1986ef59f170c278e

gcc-go-4.4-20110104.tar.bz2           Go front end and runtime
  MD5=b53e4806b0a05e56e7f852a08b57a68d
  SHA1=ce839adc19667f7324e01ee2586a098a7ce33d04

gcc-java-4.4-20110104.tar.bz2         Java front end and runtime
  MD5=5b060abea2e9f52157ef3359cd02e4c9
  SHA1=249acb0e6f9f9f1ec7a93ab8e508f1abba4736c6

gcc-objc-4.4-20110104.tar.bz2         Objective-C front end and runtime
  MD5=14b9e07c732b76f39971ad99bfd878dc
  SHA1=f0154ce2f0ec84a0a817590a02a5f3324438f48d

gcc-testsuite-4.4-20110104.tar.bz2    The GCC testsuite
  MD5=806305e6761b186bc80896bc8abd946c
  SHA1=6d18769004fbb39b84d38e8e4290472b89ba96ea

Diffs from 4.4-20101228 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.4
link is updated and a message is sent to the gcc list. Please do not use
a snapshot before it has been announced that way.
Re: gcc interprets C++0x initialization construct as function declaration
On 3 January 2011 05:24, Nathan Ridge wrote:
>
> Is this the desired behaviour?

Questions about whether code is valid or whether gcc has a bug should
be sent to the gcc-h...@gcc.gnu.org mailing list or entered into
bugzilla, thanks.
Re: RFC: Add 32bit x86-64 support to binutils
On Tue, Jan 04, 2011 at 10:35:42AM -0800, H. Peter Anvin wrote:
> On 01/04/2011 09:56 AM, H.J. Lu wrote:
> >>
> >> I think it is a gross misconception to tie the ABI to the ELF class of
> >> an object. Specifying the ABI should imo be done via e_flags or
> >> one of the unused bytes of e_ident, and in all reality the ELF class
> >> should *only* affect the file layout (and 64-bit should never have
> >> forbidden to use 32-bit ELF containers; similarly 64-bit ELF objects
> >> may have uses for 32-bit architectures/ABIs, e.g. when debug
> >> information exceeds the 4G boundary).
> >
> > I agree with you in principle. But I think it should be done via
> > a new attribute section, similar to ARM.
> >
>
> Oh god, please, no.
>
> I have to say I'm highly questioning to Jan's statement in the first
> place. Crossing 32- and 64-bit ELF like that sounds like a kernel
> security hole waiting to happen.

Yeah, and there are other targets where the elf class determines ABI
too (e.g. EM_S390 is used for both 31-bit and 64-bit binaries and
the ELF class determines which).

Jakub
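The scheme Jakub describes, where the machine type alone is ambiguous and the ELF class byte picks the ABI, can be sketched in a few lines of C++. This is only an illustration and is not taken from binutils or any kernel: the helper names `ei_class_of` and `abi_from_class` and the label strings are hypothetical, while the numeric values (EI_CLASS index 4, ELFCLASS32 = 1, ELFCLASS64 = 2, EM_S390 = 22, EM_X86_64 = 62) follow the generic ELF specification and the processor supplements. The x86-64 ILP32 row reflects the proposal under discussion in this thread, not a settled ABI.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// Constants from the ELF specification (names here are local to this sketch).
constexpr std::size_t kEiClass = 4;       // index of the class byte in e_ident
constexpr std::uint8_t kElfClass32 = 1;   // ELFCLASS32
constexpr std::uint8_t kElfClass64 = 2;   // ELFCLASS64
constexpr std::uint16_t kEmS390 = 22;     // EM_S390
constexpr std::uint16_t kEmX86_64 = 62;   // EM_X86_64

// Hypothetical helper: extract the class byte from the 16-byte e_ident array.
std::uint8_t ei_class_of(const std::array<std::uint8_t, 16>& e_ident) {
    return e_ident[kEiClass];
}

// Hypothetical helper: derive a descriptive ABI label from the machine type
// and the ELF class, i.e. the "class determines ABI" convention.
std::string abi_from_class(std::uint16_t e_machine, std::uint8_t ei_class) {
    const bool is32 = (ei_class == kElfClass32);
    const bool is64 = (ei_class == kElfClass64);
    if (!is32 && !is64) {
        return "invalid ELF class";
    }
    switch (e_machine) {
    case kEmS390:
        // One machine value, two ABIs: the class byte is the only discriminator.
        return is32 ? "s390 (31-bit)" : "s390x (64-bit)";
    case kEmX86_64:
        // The ILP32 label models the proposal discussed in this thread.
        return is32 ? "x86-64 ILP32 (proposed)" : "x86-64 LP64";
    default:
        return "unknown machine";
    }
}
```

Under Jan's alternative, the second argument would not be the class byte at all but a value read from e_flags or an unused e_ident byte, leaving the class to describe only the file layout.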
Re: RFC: Add 32bit x86-64 support to binutils
On 01/04/2011 09:56 AM, H.J. Lu wrote:
>>
>> I think it is a gross misconception to tie the ABI to the ELF class of
>> an object. Specifying the ABI should imo be done via e_flags or
>> one of the unused bytes of e_ident, and in all reality the ELF class
>> should *only* affect the file layout (and 64-bit should never have
>> forbidden to use 32-bit ELF containers; similarly 64-bit ELF objects
>> may have uses for 32-bit architectures/ABIs, e.g. when debug
>> information exceeds the 4G boundary).
>
> I agree with you in principle. But I think it should be done via
> a new attribute section, similar to ARM.
>

Oh god, please, no.

I have to say I'm highly questioning to Jan's statement in the first
place. Crossing 32- and 64-bit ELF like that sounds like a kernel
security hole waiting to happen.

    -hpa
The Linux binutils 2.21.51.0.5 is released
This release added the ILP32 support

http://www.kernel.org/pub/linux/devel/binutils/ilp32/abi.pdf

to Linux/x86-64.

H.J.
---
This is the beta release of binutils 2.21.51.0.5 for Linux, which is
based on binutils 2011 0104 in CVS on sourceware.org plus various
changes. It is purely for Linux.

All relevant patches in patches have been applied to the source tree.
You can take a look at patches/README to see what has been applied and
in what order.

Starting from the 2.21.51.0.2 release, BFD linker has the working LTO
plugin support. It can be used with GCC 4.5 and above. For GCC 4.5, you
need to configure GCC with --enable-gold to enable LTO plugin support.

Starting from the 2.21.51.0.2 release, binutils fully supports compressed
debug sections. However, compressed debug section isn't turned on by
default in assembler. I am planning to turn it on for x86 assembler in
the future release, which may lead to the Linux kernel bug messages like

WARNING: lib/ts_kmp.o (.zdebug_aranges): unexpected non-allocatable section.

But the resulting kernel works fine.

Starting from the 2.20.51.0.4 release, no diffs against the previous
release will be provided.

You can enable both gold and bfd ld with --enable-gold=both. Gold will
be installed as ld.gold and bfd ld will be installed as ld.bfd. By
default, ld.bfd will be installed as ld. You can use the configure
option, --enable-gold=both/gold to choose gold as the default linker,
ld. IA-32 binary and x86_64 binary tar balls are configured with
--enable-gold=both/ld --enable-plugins --enable-threads.

Starting from the 2.18.50.0.4 release, the x86 assembler no longer accepts

    fnstsw %eax

fnstsw stores 16bit into %ax and the upper 16bit of %eax is unchanged.
Please use

    fnstsw %ax

Starting from the 2.17.50.0.4 release, the default output section LMA
(load memory address) has changed for allocatable sections from being
equal to VMA (virtual memory address), to keeping the difference between
LMA and VMA the same as the previous output section in the same region.

For

.data.init_task : { *(.data.init_task) }

LMA of .data.init_task section is equal to its VMA with the old linker.
With the new linker, it depends on the previous output section. You
can use

.data.init_task : AT (ADDR(.data.init_task)) { *(.data.init_task) }

to ensure that LMA of .data.init_task section is always equal to its
VMA. The linker script in the older 2.6 x86-64 kernel depends on the
old behavior. You can add AT (ADDR(section)) to force LMA of
.data.init_task section equal to its VMA. It will work with both old
and new linkers. The x86-64 kernel linker script in kernel 2.6.13 and
above is OK.

The new x86_64 assembler no longer accepts

    monitor %eax,%ecx,%edx

You should use

    monitor %rax,%ecx,%edx

or

    monitor

which works with both old and new x86_64 assemblers. They should
generate the same opcode.

The new i386/x86_64 assemblers no longer accept instructions for moving
between a segment register and a 32bit memory location, i.e.,

    movl (%eax),%ds
    movl %ds,(%eax)

To generate instructions for moving between a segment register and a
16bit memory location without the 16bit operand size prefix, 0x66,

    mov (%eax),%ds
    mov %ds,(%eax)

should be used. It will work with both new and old assemblers. The
assembler starting from 2.16.90.0.1 will also support

    movw (%eax),%ds
    movw %ds,(%eax)

without the 0x66 prefix. Patches for 2.4 and 2.6 Linux kernels are
available at

http://www.kernel.org/pub/linux/devel/binutils/linux-2.4-seg-4.patch
http://www.kernel.org/pub/linux/devel/binutils/linux-2.6-seg-5.patch

The ia64 assembler is now defaulted to tune for Itanium 2 processors.
To build a kernel for Itanium 1 processors, you will need to add

ifeq ($(CONFIG_ITANIUM),y)
    CFLAGS += -Wa,-mtune=itanium1
    AFLAGS += -Wa,-mtune=itanium1
endif

to arch/ia64/Makefile in your kernel source tree.

Please report any bugs related to binutils 2.21.51.0.5 to
hjl.to...@gmail.com

and

http://www.sourceware.org/bugzilla/

Changes from binutils 2.21.51.0.4:

1. Update from binutils 2011 0104.
2. Add ILP32 support to Linux/x86-64.
3. Prevent the Linux x86-64 kernel build failure and remove
__ld_compatibility support. PR 12356.
4. Improve gold.
5. Improve Windows support.
6. Improve hppa support.
7. Improve mips support.

Changes from binutils 2.21.51.0.3:

1. Update from binutils 2010 1217.
2. Fix the Linux relocatable kernel build. PR 12327.
3. Improve mips support.

Changes from binutils 2.21.51.0.2:

1. Update from binutils 2010 1215.
2. Add BFD linker support for placing input .ctors/.dtors sections in
output .init_array/.fini_array section. Add SORT_BY_INIT_PRIORITY.
The benefits are
  a. Avoid output .ctors/.dtors section in executables and shared
     libraries.
  b. Allow mixing input .ctors/.dtors sections with input
     .init_array/.fini_array sections. GCC PR 46770.
3. Add BFD
libiberty/.gitignore isn't in gcc tree
Hi,

libiberty/.gitignore was added to src. But it isn't in gcc tree.

--
H.J.
Re: RFC: Add 32bit x86-64 support to binutils
On Mon, Jan 3, 2011 at 2:40 AM, Jan Beulich wrote:
> On 30.12.10 at 21:02, "H.J. Lu" wrote:
>>
>> Here is the ILP32 psABI:
>>
>> http://www.kernel.org/pub/linux/devel/binutils/ilp32/
>>
>
> I think it is a gross misconception to tie the ABI to the ELF class of
> an object. Specifying the ABI should imo be done via e_flags or
> one of the unused bytes of e_ident, and in all reality the ELF class
> should *only* affect the file layout (and 64-bit should never have
> forbidden to use 32-bit ELF containers; similarly 64-bit ELF objects
> may have uses for 32-bit architectures/ABIs, e.g. when debug
> information exceeds the 4G boundary).
>

I agree with you in principle. But I think it should be done via
a new attribute section, similar to ARM.

--
H.J.
Re: [PATCH] -ftree-loop-linear fixes (PR tree-optimization/46970) (take 2)
On Tue, Jan 4, 2011 at 10:22, Richard Guenther wrote:
> Ugh. Sebastian - can we nuke tree-loop-linear completely and
> make -ftree-loop-linear an alias for -floop-interchange without
> regressions? I'd like to reduce the number of broken passes from
> 2 to 1 this way ...

I wouldn't mind removing tree-loop-linear, although other people should
also give their opinion on this matter: tree-loop-linear has no external
dependences whereas -floop-interchange depends on cloog and ppl.

Also we should get all the testsuite/gcc.dg/tree-ssa/ltrans-*.c passing
with -floop-interchange. I will add all these testcases to the graphite
testsuite and see where we stand.

Sebastian
Re: Really poor 4.5.2 results on Debian Squeeze with Intel i7
On Mon, Jan 3, 2011 at 12:29 AM, Eric Botcazou wrote:
>> I was wondering about that lately. Should testsuite failures with
>> --enable-checking=all be reported? IIRC, the 4.5 branch won't even
>> bootstrap with that setting.
>
> I'd think so, but only for the trunk probably.

And don't report testcases that time out without first running them
outside of the testsuite harness, as they might just be very slow.

-- Pinski
Re: access to static data member fails with indirect ptr
On 4 January 2011 14:11, Klaus Rudolph wrote:
>
>> > Is my code wrong
>>
>> Yes. You need to define A::x.
>
> Grrr... so stupid! :-)
>
> Yes, you are right. I stumbled that only a few lines generates an error. Yes,
> the compiler optimize them out if the access is direct. With -O3
> it compiles and links without errors also without having const int A::x;

In future please send questions like this to the gcc-h...@gcc.gnu.org
mailing list, which is for help using gcc. This list is for
discussing development *of* gcc, not using gcc, as described at
http://gcc.gnu.org/lists.html

There are several invalid bug reports related to this same question
which give a bit more detail, e.g.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14404
Re: Code performance regression between gcc 4.5 and 4.6
On 01/04/11 15:10, H.J. Lu wrote:
> We need a testcase to investigate.

This is now PR47167.

Cheers,
  Martin
Re: access to static data member fails with indirect ptr
> > Is my code wrong
>
> Yes. You need to define A::x.

Grrr... so stupid! :-)

Yes, you are right. I was puzzled that only a few of the lines generate
an error. Yes, the compiler optimizes the accesses out if the access is
direct. With -O3 it also compiles and links without errors, even without
having const int A::x;

Thanks for the hint.

Regards!
 Klaus
Re: Code performance regression between gcc 4.5 and 4.6
On Tue, Jan 4, 2011 at 5:57 AM, Martin Reinecke wrote:
>
>
> On 01/04/11 14:48, H.J. Lu wrote:
>>
>> On Tue, Jan 4, 2011 at 4:43 AM, Martin Reinecke
>> wrote:
>>>
>>> Hi,
>>>
>>> while benchmarking a numerical C library making heavy use of SSE2
>>> intrinsics, I have noticed a significant (around 10 percent) slowdown
>>> in the code generated by the current gcc trunk, compared to the one
>>> produced by the 4.5.1 release.
>>> It's quite hard to reduce the code to a small test case, but I can easily
>>> point out the hot code regions where most of the CPU time is spent.
>>> Do you think I should open a PR for this, or is this kind of performance
>>> fluctuation to be expected?
>>>
>>
>> What compiler flags are you using? On which processors do you
>> run the library?
>
> The CPU is a Core2 Duo E8500; the optimization flags are
> "-O2 -ffast-math -fomit-frame-pointer".
> This is on a 64bit OS, so SSE2 is supported without additional
> flags.
>
> Using "-march=native" in addition to the flags above makes the timings
> worse for gcc 4.5.1 and slightly better for gcc 4.6, but still the
> code generated by 4.5.1 is quite a bit faster.
> The trunk version was compiled from yesterday's sources.
>

We need a testcase to investigate.

--
H.J.
Re: Code performance regression between gcc 4.5 and 4.6
On 01/04/11 14:48, H.J. Lu wrote:
> On Tue, Jan 4, 2011 at 4:43 AM, Martin Reinecke wrote:
>> Hi,
>>
>> while benchmarking a numerical C library making heavy use of SSE2
>> intrinsics, I have noticed a significant (around 10 percent) slowdown
>> in the code generated by the current gcc trunk, compared to the one
>> produced by the 4.5.1 release.
>> It's quite hard to reduce the code to a small test case, but I can easily
>> point out the hot code regions where most of the CPU time is spent.
>> Do you think I should open a PR for this, or is this kind of performance
>> fluctuation to be expected?
>
> What compiler flags are you using? On which processors do you
> run the library?

The CPU is a Core2 Duo E8500; the optimization flags are
"-O2 -ffast-math -fomit-frame-pointer".
This is on a 64bit OS, so SSE2 is supported without additional
flags.

Using "-march=native" in addition to the flags above makes the timings
worse for gcc 4.5.1 and slightly better for gcc 4.6, but still the
code generated by 4.5.1 is quite a bit faster.
The trunk version was compiled from yesterday's sources.

Cheers,
  Martin
Re: access to static data member fails with indirect ptr
On 01/04/2011 12:49 PM, Klaus Rudolph wrote:
> Is my code wrong

Yes. You need to define A::x. Add this line:

const int A::x;

> If the code is wrong, I expect a compiler error not a linker message!

No, because A::x might be defined in another translation unit.

Andrew.
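Andrew's fix can be shown in isolation with a short sketch. The class `A` mirrors the one in the original post; the helper `read_through_reference` is a hypothetical name added here only to force an odr-use, since binding `A::x` to a `const int&` needs the object's address and therefore its definition.

```cpp
#include <iostream>

class A {
public:
    static const int x = 10;  // declaration with an in-class initializer
};

// Out-of-class definition: this is what gives A::x storage.
// The initializer is not repeated here.
const int A::x;

// Hypothetical helper: taking the argument by const reference odr-uses
// whatever is passed in, so the linker must be able to find A::x.
int read_through_reference(const int& value) {
    return value;
}

// Prints the value through the reference-taking helper.
void print_x() {
    std::cout << read_through_reference(A::x) << std::endl;
}
```

Calling `read_through_reference(A::x)` returns 10. Without the `const int A::x;` line, an unoptimized build would typically fail at link time with the same `undefined reference to 'A::x'` message as in the report, while an optimized build may fold the constant and link anyway, which matches what Klaus observed with -O3.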
Re: Code performance regression between gcc 4.5 and 4.6
On Tue, Jan 4, 2011 at 4:43 AM, Martin Reinecke wrote:
> Hi,
>
> while benchmarking a numerical C library making heavy use of SSE2
> intrinsics, I have noticed a significant (around 10 percent) slowdown
> in the code generated by the current gcc trunk, compared to the one
> produced by the 4.5.1 release.
> It's quite hard to reduce the code to a small test case, but I can easily
> point out the hot code regions where most of the CPU time is spent.
> Do you think I should open a PR for this, or is this kind of performance
> fluctuation to be expected?
>

What compiler flags are you using? On which processors do you
run the library?

--
H.J.
access to static data member fails with indirect ptr
Hi all,

the following code fails with gcc 4.4.3, 4.5.0 and 4.6 snapshot (some weeks old):

#include <iostream>
using namespace std;

class A
{
    public:
        static const int x=10;
};

class Zgr_A
{
    public:
        A* operator->() { return (A*)0; }
};

template <class T>
class Zgr
{
    public:
        T* operator->() { return (T*)0; }
};

int main()
{
    A a_direct;
    A* a_ptr;
    Zgr_A a_indirect_ptr;
    Zgr<A> a_template_ptr;
    A* ptr_from_indirect= a_indirect_ptr.operator->();

    cout << "0. " << A::x << endl;
    cout << "1. " << a_direct.x << endl;
    cout << "2. " << a_ptr->x << endl;
    cout << "3. " << a_indirect_ptr->x << endl;
    cout << "4. " << a_template_ptr->x << endl;
    cout << "5. " << ptr_from_indirect->x << endl;
    cout << "6. " << a_template_ptr.operator->()->x << endl;
    cout << "7. " << ((A*)(a_template_ptr.operator->()))->x << endl;

    return 0;
}

Result:

g++ -g main.cpp -o go
/tmp/ccABoZtk.o: In function `main':
main.cpp:37: undefined reference to `A::x'
main.cpp:38: undefined reference to `A::x'
main.cpp:40: undefined reference to `A::x'
main.cpp:41: undefined reference to `A::x'
collect2: ld returned 1 exit status

Is my code wrong or is it a compiler bug?
If the code is wrong, I expect a compiler error not a linker message!

Wondering...
 Klaus

P.S. Borland C++ compiles and links correctly.
Code performance regression between gcc 4.5 and 4.6
Hi,

while benchmarking a numerical C library making heavy use of SSE2
intrinsics, I have noticed a significant (around 10 percent) slowdown
in the code generated by the current gcc trunk, compared to the one
produced by the 4.5.1 release.
It's quite hard to reduce the code to a small test case, but I can easily
point out the hot code regions where most of the CPU time is spent.
Do you think I should open a PR for this, or is this kind of performance
fluctuation to be expected?

Cheers,
  Martin
Re: Behavior change of driver on multiple input assembly files
On 01/04/2011 07:33 AM, Ian Lance Taylor wrote:
> On Thu, Dec 30, 2010 at 9:07 PM, Jie Zhang wrote:
>
>> For a minimal fix, I propose to change combinable fields of assembly
>> languages in default_compilers[] to 0. See the attached patch
>> "gcc-not-combine-assembly-inputs.diff". I don't know why the
>> combinable fields were set to 1 when --combine option was introduced.
>> There is no explanation about that in that patch email.[2] Does anyone
>> still remember?
>
> This patch is OK if it fixes PR 47137. Please mention the PR in the
> ChangeLog entry.
>
> Thanks.

I have committed it now. I also posted it to gcc-patches mailing list
with an updated ChangeLog entry:

http://gcc.gnu.org/ml/gcc-patches/2011-01/msg00122.html

--
Jie Zhang
GCC 4.6.0 Status Report (2011-01-04), Stage 3 is over
Status
======

Stage 3 is over and the trunk is now in regression and documentation
fixes only mode (operating as if we were on a release branch).

This means we are now moving towards a release candidate of GCC 4.6.0
which can materialize once the list of serious regressions no longer
contains a P1 regression.

We have accumulated numerous serious regressions during Stage 1 and
also during Stage 3. Now it is time to start fixing them. Port
and OS maintainers may want to look at the list of all regressions
(including those rated as P4 and P5) and at least try to get a hand
on those that didn't appear in previous release series.

Quality Data
============

Priority          #     Change from Last Report
--------        ---     -----------------------
P1               31     - 19
P2              109     -  5
P3               28     +  4
--------        ---     -----------------------
Total           168     - 20

Previous Report
===============

http://gcc.gnu.org/ml/gcc/2010-10/msg00417.html

The next report will be sent by Jakub.