Re: Describing multi-register values in RTL
On Thursday, 21 October 2010 at 21:11 -0700, Ian Lance Taylor wrote:
> Paul Koning writes:
>
> > To take that example, on the pdp11 an SImode is two HImodes. Could
> > the RTL template in the MD file for, say, addsi3 split that into two
> > or three insns that operate on HImode values and describe the actual
> > instructions? In this case: add high parts, then add low parts and
> > propagate carry into high. Split that way it would seem you would not
> > be constrained to adjacent registers, or for that matter to both being
> > registers at all.
This is exactly the kind of thing I'm looking at.
> The lower subreg pass will do that for you if you have the right set of
> insns.
Could you expand a bit on what the 'right set of instructions' is, or even better, point to an md file where we could find an example?
Thanks a lot!
Fred
Re: -mcmodel=large doesn't work to me
Wei Li writes:
> I am working on huge object files and I am glad to see that gcc
> supports -mcmodel=large now. However, my experiment even doesn't work
> because of relocation problem in crtbeginS.o
This message was not appropriate for the mailing list gcc@gcc.gnu.org, which is for the development of gcc itself. It would be appropriate for the mailing list gcc-h...@gcc.gnu.org. Please take any followups to gcc-help. Thanks.
Basically, you have encountered a problem which is a cross between a bug and an installation issue. In order to use -mcmodel=large reliably, you really need to compile everything with -mcmodel=large. In this case, the startup file crtbeginS.o, which is part of gcc, was not compiled with -mcmodel=large.
This can be fixed with a minor gcc source code modification, but the effect will be to build -mcmodel=large versions of all the gcc libraries. Distros may prefer to avoid that. So the best approach here may be to add yet another configure option to request that this be done. Please consider filing an enhancement request at http://gcc.gnu.org/bugzilla/ . Thanks.
Ian
Re: Describing multi-register values in RTL
Paul Koning writes:
> To take that example, on the pdp11 an SImode is two HImodes. Could
> the RTL template in the MD file for, say, addsi3 split that into two
> or three insns that operate on HImode values and describe the actual
> instructions? In this case: add high parts, then add low parts and
> propagate carry into high. Split that way it would seem you would not
> be constrained to adjacent registers, or for that matter to both being
> registers at all.
The lower subreg pass will do that for you if you have the right set of insns.
It's not ideal to do it at expand time because it will inhibit the RTL CSE and loop optimizers and some of the optimizations done by combine. Those passes are not critical, but they do have some effect.
Ian
-mcmodel=large doesn't work to me
Hi,
I am working on huge object files and I am glad to see that gcc supports -mcmodel=large now. However, my experiment doesn't even work because of a relocation problem in crtbeginS.o
My source file, t.c:
    #include <stdio.h>
    extern int foo(int argc, char **argv);
    void *pv1[1024]={(void*)foo,};
    char a[2147483658] = {1, 2 };
    char b[2147483658] = {2, 3 };
    void *pv2[1024]={(void*)foo,};
    int foo(int argc, char **argv)
    {
        printf("%d", a[2147483657]);
        printf("%d", b[2147483657]);
        return 0;
    }
Command line:
    gcc -mcmodel=large -fPIC -shared t.c -o t.so
Error:
    /gcc4.5.1/lib/gcc/x86_64-unknown-linux-gnu/4.5.1/crtbeginS.o: In function `__do_global_dtors_aux':
    crtstuff.c:(.text+0x3): relocation truncated to fit: R_X86_64_PC32 against `.bss'
    crtstuff.c:(.text+0x37): relocation truncated to fit: R_X86_64_PC32 against `.bss'
    crtstuff.c:(.text+0x57): relocation truncated to fit: R_X86_64_PC32 against `.bss'
    crtstuff.c:(.text+0x62): relocation truncated to fit: R_X86_64_PC32 against `.bss'
    crtstuff.c:(.text+0x6d): relocation truncated to fit: R_X86_64_PC32 against `.bss'
Could someone help me figure out the problem? I am using RH5 64bit.
Thanks,
Wei
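As background on the "relocation truncated to fit" message: an R_X86_64_PC32 relocation stores the value S + A - P (symbol address plus addend minus the address of the place being patched) in a signed 32-bit field, so the target has to lie within roughly 2 GiB of the referencing instruction. The sketch below only checks that range. The helper name `fits_pc32` and every address in it are made up for illustration; they are not taken from the failing link.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if S + A - P, the value an
   R_X86_64_PC32 relocation stores, fits in a signed 32-bit field. */
static int fits_pc32(int64_t sym, int64_t addend, int64_t place)
{
    int64_t value = sym + addend - place;
    return value >= INT32_MIN && value <= INT32_MAX;
}

/* Illustrative layout only: code at a low address, with .bss either
   close by or pushed out past two arrays the size of a[] and b[]. */
static void show_pc32_range(void)
{
    int64_t text = 0x1000;                     /* assumed code address */
    int64_t array_size = 2147483658LL;         /* size of a[] and b[] in t.c */
    int64_t bss_near = text + 0x200000;        /* assumed nearby .bss */
    int64_t bss_far = text + 2 * array_size;   /* assumed .bss past both arrays */

    assert(fits_pc32(bss_near, -4, text));     /* reachable with PC32 */
    assert(!fits_pc32(bss_far, -4, text));     /* would be truncated */
    printf("near: %d, far: %d\n",
           fits_pc32(bss_near, -4, text), fits_pc32(bss_far, -4, text));
}
```

Calling `show_pc32_range()` from a small driver exercises both cases. Where the linker actually places `.bss` relative to `crtbeginS.o` depends on the link, so the far case is a plausible picture of the failure rather than a reconstruction of it.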
Re: Describing multi-register values in RTL
On Oct 21, 2010, at 8:15 PM, Ian Lance Taylor wrote:
> Frederic Riss writes:
>
>> Is it possible to describe multi-register values in RTL when the
>> subparts of the value aren't stored in consecutive registers? For
>> example having a DI value constructed from 2 unrelated SI registers
>> (without losing the semantic of the original DI value) ?
>
> Not really. I suppose you could use CONCAT, but you would lose
> optimizations.
>
> I don't know what you are doing but I'll note that the lower-subreg pass
> can be used to split a DImode value into 2 SImode values after the fact
> that the value is DImode is no longer relevant.
Can "expand" do this? I've wondered about this in the past. It comes up in machines where register pairs are used as a software convention, like MIPS in 32 bit mode. Or on the PDP-11, where a few instructions require odd/even adjacent registers but a lot of wide mode sequences simply require two smaller mode values, that don't even need to be in the same sort of container.
To take that example, on the pdp11 an SImode is two HImodes. Could the RTL template in the MD file for, say, addsi3 split that into two or three insns that operate on HImode values and describe the actual instructions? In this case: add high parts, then add low parts and propagate carry into high. Split that way it would seem you would not be constrained to adjacent registers, or for that matter to both being registers at all.
paul
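The arithmetic behind the proposed split can be sketched in plain C. This is not RTL and not taken from any md file; it only shows the three steps named above (add high parts, add low parts, propagate the carry) on a 32-bit value held as two unrelated 16-bit halves. The type `si_pair` and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* A 32-bit value held as two independent 16-bit halves; nothing here
   requires the halves to sit in adjacent registers. */
struct si_pair {
    uint16_t hi;
    uint16_t lo;
};

static struct si_pair split_si(uint32_t v)
{
    struct si_pair p = { (uint16_t)(v >> 16), (uint16_t)(v & 0xFFFFu) };
    return p;
}

static uint32_t join_si(struct si_pair p)
{
    return ((uint32_t)p.hi << 16) | p.lo;
}

/* Add using only 16-bit results: high parts, low parts, then carry. */
static struct si_pair add_si(struct si_pair a, struct si_pair b)
{
    struct si_pair r;
    r.hi = (uint16_t)(a.hi + b.hi);
    r.lo = (uint16_t)(a.lo + b.lo);
    if (r.lo < a.lo)                  /* unsigned wraparound: carry out */
        r.hi = (uint16_t)(r.hi + 1);
    return r;
}

/* Self-check against ordinary 32-bit addition. */
static void check_add_si(void)
{
    uint32_t a = 0x0001FFFFu, b = 1u;
    assert(join_si(add_si(split_si(a), split_si(b))) == (uint32_t)(a + b));
}
```

Each half is an ordinary 16-bit operand, which is the property the question is after: the two parts could be in any registers, or in memory.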
Re: Describing multi-register values in RTL
Frederic Riss writes:
> Is it possible to describe multi-register values in RTL when the
> subparts of the value aren't stored in consecutive registers? For
> example having a DI value constructed from 2 unrelated SI registers
> (without losing the semantic of the original DI value) ?
Not really. I suppose you could use CONCAT, but you would lose optimizations.
I don't know what you are doing but I'll note that the lower-subreg pass can be used to split a DImode value into 2 SImode values after the fact that the value is DImode is no longer relevant.
Ian
gcc-4.5-20101021 is now available
Snapshot gcc-4.5-20101021 is now available on ftp://gcc.gnu.org/pub/gcc/snapshots/4.5-20101021/ and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.
This snapshot has been generated from the GCC 4.5 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_5-branch revision 165794
You'll find:
gcc-4.5-20101021.tar.bz2            Complete GCC (includes all of below)
    MD5=0fdf412588ffc9cd4941b75e246e989b
    SHA1=4f9d2cbf01516f2ddcd6443b1fc88af12864a2f7
gcc-core-4.5-20101021.tar.bz2       C front end and core compiler
    MD5=825a85dda0eaa62493c6b2b610a7c1c3
    SHA1=58134281ef384779726bd82a861896052fe9b85f
gcc-ada-4.5-20101021.tar.bz2        Ada front end and runtime
    MD5=4ed486d315552e1ab7db7425f6e96e48
    SHA1=20a4b9f42d0abd60aefb00a216fd08014d972193
gcc-fortran-4.5-20101021.tar.bz2    Fortran front end and runtime
    MD5=c8bdfcd57ee72968271168d526eb7115
    SHA1=4da147ab374e3a23df9dc9b21cda6df166ff3e3b
gcc-g++-4.5-20101021.tar.bz2        C++ front end and runtime
    MD5=0827c8991a0863a1155d568fd334
    SHA1=49f7562ced2be6a9b7ec33e6c366a22f551b18af
gcc-java-4.5-20101021.tar.bz2       Java front end and runtime
    MD5=7b9502f082b108b292fcdff19eeeb170
    SHA1=691a8659bde01ef00f69d917238b65c97cebb724
gcc-objc-4.5-20101021.tar.bz2       Objective-C front end and runtime
    MD5=d49c5845eae03a7c057a5184946bf65d
    SHA1=84089187caabeef9daba29a68fc714bda9b40779
gcc-testsuite-4.5-20101021.tar.bz2  The GCC testsuite
    MD5=2266d72d8613e45df534111376503823
    SHA1=0dbcc4002dfc3e55884a4171a5332f6c400cc6a2
Diffs from 4.5-20101014 are available in the diffs/ subdirectory.
When a particular snapshot is ready for public consumption the LATEST-4.5 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.
Describing multi-register values in RTL
Hi, Is it possible to describe multi-register values in RTL when the subparts of the value aren't stored in consecutive registers? For example having a DI value constructed from 2 unrelated SI registers (without losing the semantic of the original DI value) ? Thanks a lot, Fred
Re: __GXX_EXPERIMENTAL_CXX0X__
On 21 October 2010 18:52, Neal Becker wrote:
> I need a preprocessor macro to detect c++0x support. For now, that is
> __GXX_EXPERIMENTAL_CXX0X__
>
> but what happens once -std=c++0x is the default? Will this macro still
> be defined?
>
> Don't we need a
>
> __GXX_CXX0X__ ?
__cplusplus should be defined to the date the final standard gets submitted, although there's a longstanding bug report that g++ currently defines __cplusplus to 1. I'm hoping that will change once C++0x support is no longer considered experimental.
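A check that covers both situations discussed here might look like the sketch below. `HAVE_CXX0X` is a hypothetical project-local name, not a macro GCC defines. The value 201103L is the one the standard published after this thread assigned to `__cplusplus`; it is included on the assumption that the compiler reports it correctly, which the bug report mentioned above says g++ did not do at the time. The fragment is preprocessor-only, so it is also valid when compiled as C, where it evaluates to 0.

```c
#include <assert.h>

/* HAVE_CXX0X is a hypothetical, project-local feature macro. */
#if defined(__GXX_EXPERIMENTAL_CXX0X__)
   /* g++ with -std=c++0x or -std=gnu++0x while support is experimental */
#  define HAVE_CXX0X 1
#elif defined(__cplusplus) && __cplusplus >= 201103L
   /* any compiler that reports the finalized standard's date */
#  define HAVE_CXX0X 1
#else
   /* older C++ mode, or a C translation unit */
#  define HAVE_CXX0X 0
#endif

/* Runtime view of the compile-time decision, handy for logging. */
static int have_cxx0x(void)
{
    assert(HAVE_CXX0X == 0 || HAVE_CXX0X == 1);
    return HAVE_CXX0X;
}
```

Testing the experimental macro first means the answer does not depend on `__cplusplus` being correct in the releases where it is still defined to 1.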
Re: peephole2: dead regs not marked as dead
Georg Lay writes:
> Regs that are "naturally" dead because the function ends are not marked as
> dead, and therefore some optimization opportunities pass by unnoticed, e.g.
> together with recog.c::peep2_reg_dead_p() et. al.
I don't understand what you mean. All registers other than the return register, stack pointer, and frame pointer die at the end of the function, and they should be marked accordingly. Can you give an example?
Ian
Re: %pc relative addressing of string literals/const data
On Thu, Oct 21, 2010 at 08:17:51PM +0200, Gunther Nikl wrote:
> Michael Meissner wrote:
> > Note, the 64-bit ABI requires that r2 have the current function's GOT in it
> > when the function is called, while the 32-bit ABI uses r2 as a small data
> > pointer (and possibly r13 as a second small data pointer).
>
> If the 32-bit ABI is the SYSV-ABI, then you got the register usage
> wrong. r13 is the small data pointer with -msdata=sysv and r2 is a
> second small data pointer when using -msdata=eabi.
>
> Joakim, did you try using -msdata=sysv together with "-G 64000"?
Yes, I think I mentally swapped the registers. Sorry.
--
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com
Re: %pc relative addressing of string literals/const data
Michael Meissner wrote:
> Note, the 64-bit ABI requires that r2 have the current function's GOT in it
> when the function is called, while the 32-bit ABI uses r2 as a small data
> pointer (and possibly r13 as a second small data pointer).
If the 32-bit ABI is the SYSV-ABI, then you got the register usage wrong. r13 is the small data pointer with -msdata=sysv and r2 is a second small data pointer when using -msdata=eabi.
Joakim, did you try using -msdata=sysv together with "-G 64000"?
Gunther
__GXX_EXPERIMENTAL_CXX0X__
I need a preprocessor macro to detect c++0x support. For now, that is __GXX_EXPERIMENTAL_CXX0X__ but what happens once -std=c++0x is the default? Will this macro still be defined? Don't we need a __GXX_CXX0X__ ?
Re: Bug in expand_builtin_setjmp_receiver ?
On Thu, Oct 21, 2010 at 02:14:15PM +0200, Frederic Riss wrote:
> On 19 October 2010 15:31, Ian Lance Taylor wrote:
> > However, I agree that it does seem that it should be added to or
> > subtracted from hard_frame_pointer_rtx before setting
> > virtual_stack_vars_rtx, or something. I only see one existing target
> > which sets STARTING_FRAME_OFFSET to a non-zero value and does not have a
> > nonlocal_goto expander: lm32. It would be interesting to know whether
> > that target works here.
>
> Is it easy to test lm32 on some simulator?
lm32 has a gdb simulator available, so it should be fairly easy to write a board file for it if one doesn't already exist. Unfortunately, building lm32-elf is broken in several different ways right now.
-Nathan
Re: Bug in expand_builtin_setjmp_receiver ?
Hi Ian,
On 19 October 2010 15:31, Ian Lance Taylor wrote:
> It should not be necessary to use STARTING_FRAME_OFFSET when using
> virtual_stack_vars_rtx, as it should be added in by the vregs pass. See
> instantiate_new_reg, and note that var_offset is set to
> STARTING_FRAME_OFFSET.
Yes, but here we are reconstructing virtual_stack_vars_rtx after some longjmp or such, thus it needs to take that into account, doesn't it?
> However, I agree that it does seem that it should be added to or
> subtracted from hard_frame_pointer_rtx before setting
> virtual_stack_vars_rtx, or something. I only see one existing target
> which sets STARTING_FRAME_OFFSET to a non-zero value and does not have a
> nonlocal_goto expander: lm32. It would be interesting to know whether
> that target works here.
Is it easy to test lm32 on some simulator? If someone can do it, or has the test results handy, the test that fails when I set STARTING_FRAME_OFFSET is gcc.c-torture/execute/built-in-setjmp.c at optimization level O2 and higher. After the longjmp, the code tries to access some alloca'd variable and it fails.
Cheers,
Fred
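For readers who do not have the testsuite at hand, the failing pattern can be sketched as follows. This is written in the spirit of the test named above and is not a copy of gcc.c-torture/execute/built-in-setjmp.c; the function and variable names are hypothetical. It relies on the GCC builtins `__builtin_setjmp`, `__builtin_longjmp` and `__builtin_alloca`, so it assumes GCC or a compatible compiler.

```c
#include <assert.h>
#include <string.h>

static void *jump_buf[5];   /* __builtin_setjmp uses a five-word buffer */

/* Kept out of line: __builtin_longjmp must not be called from the same
   function that called __builtin_setjmp. */
static void __attribute__((noinline)) jump_back(void)
{
    __builtin_longjmp(jump_buf, 1);
}

/* Returns 1 if an alloca'd string is still readable after the longjmp. */
static int alloca_survives_longjmp(void)
{
    char *p = (char *)__builtin_alloca(20);
    int i, n;
    int *q;

    strcpy(p, "test");

    if (__builtin_setjmp(jump_buf))
        return strcmp(p, "test") == 0;   /* reached via jump_back() */

    /* A second, variable-sized alloca moves the stack pointer again
       before the jump is taken. */
    n = p[2];
    q = (int *)__builtin_alloca(n * sizeof(int));
    for (i = 0; i < n; i++)
        q[i] = 0;

    jump_back();
    return 0;                            /* not reached */
}

/* Self-check: on a working target the string must survive. */
static void check_setjmp(void)
{
    assert(alloca_survives_longjmp());
}
```

On a target where the frame base is rebuilt incorrectly after the jump, the access to `p` in the receiver is where a wrong offset would show up.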
Re: GCC RTX generation question
> "Radu Hobincu" writes:
>
>>>> 2. I have another piece of code that fails to compile with -O3.
>>>>
>>>> struct desc{
>>>>     int int1;
>>>>     int int2;
>>>>     int int3;
>>>> };
>>>>
>>>> int bugTest(struct desc *tDesc){
>>>>     return *((int*)(tDesc->int1 + 16));
>>>> }
>>>
>>> That code looks awfully strange. Is that an integer or a pointer?
>>>
>>>> This time the compiler crashes with a segmentation fault. From what I
>>>> could dig up with gdb, the compiler tries to make a LIBCALL for a
>>>> memcopy, but I'm not really sure why. At the end is the back-trace of
>>>> the crash.
>>>
>>> gcc is invoking memmove. This is happening in the return statement.
>>> For some reason gcc thinks that the function returns a struct. Your
>>> example does not return a struct. I can not explain this.
>>
>> Ok, after changing both PARM_BOUNDARY and STACK_BOUNDARY from 8 to 32,
>> now the compiler no longer crashes with segmentation fault, but it still
>> generates a memmove call.
>>
>> To explain the code, I have a structure holding some info about a serial
>> interface. One of the fields of the structure is the base address at
>> which the serial is mapped in the main memory. Offset by 16 bytes is the
>> address from which I can read the available byte count received by the
>> serial. It would probably be a better practice to define the base as
>> (int*) rather than (int) but this should work as well. I tried both
>>
>> return *((int*)tDesc->int1 + 4);
>> return *((int*)(tDesc->int1 + 16));
>>
>> The result is the same: a call to memmove. Is this in any way related to
>> the back-end definition which I might have done wrong, or is it
>> middle-end related?
>
> I don't know. There is something very odd about the fact that gcc
> thinks that you are returning a struct when you are actually returning
> an int. In particular, as far as I can see, cfun->returns_struct is
> true. I think you need to try to figure out why that is happening.
>
> Ian
Ok, thanks again for pointing me in the right direction. It seems that I declared register 12 in FUNCTION_VALUE_REGNO_P, but I didn't list it in CALL_USED_REGISTERS. So the compiler tried to return the value in memory. Since the returned value was something that was supposed to be read from memory, it probably decided to use memmove to copy the 4 bytes of the int pointer from the return statement to the stack (not sure if it's faster than a read and a write with an additional general register though).
Anyway, thank you!
Radu
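The two return expressions tried in the thread read the same word when `int` is four bytes wide: one adds 16 bytes before the cast, the other adds 4 `int` elements after it. The sketch below shows that equivalence on a host machine. It is only an illustration: the base address is held in a `uintptr_t` instead of the `int` field from the original, so the code stays valid where pointers are wider than `int`, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the descriptor in the thread. */
struct serial_desc {
    uintptr_t base;   /* address at which the device registers are mapped */
    int int2;
    int int3;
};

/* Byte offset applied before the cast: base + 16 bytes. */
static int read_by_byte_offset(const struct serial_desc *d)
{
    return *(const int *)(d->base + 16);
}

/* Element offset applied after the cast: base + (16 / sizeof(int)) ints,
   which is "+ 4" when int is four bytes wide. */
static int read_by_int_offset(const struct serial_desc *d)
{
    return *((const int *)d->base + 16 / sizeof(int));
}

/* Self-check using an ordinary array in place of device registers. */
static void check_offsets(void)
{
    static int regs[8] = { 0, 1, 2, 3, 42, 5, 6, 7 };
    struct serial_desc d = { (uintptr_t)regs, 0, 0 };

    assert(read_by_byte_offset(&d) == read_by_int_offset(&d));
}
```

Both functions return a plain `int`, which is why the struct-return path that gcc took in the thread pointed at the target description and not at the source code.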
peephole2: dead regs not marked as dead
Hi, I just came across an optimization issue in pass peephole2: Regs that are "naturally" dead because the function ends are not marked as dead, and therefore some optimization opportunities pass by unnoticed, e.g. together with recog.c::peep2_reg_dead_p() et. al. As I could not find a related PR, is this worth opening one? Regards, G.Lay Ah, I am using version 4.5.1