Re: Question concerning shared libraries in non-standard locations
* Paul Hilfinger:

> 2. Remember to include the appropriate -Wl,-R option or whatever in
> each and every compilation.

I don't think it's a good idea to compile in rpaths to non-standard (user-specific) directories by default. This can lead to trapdoor rpaths and generally makes the binaries less portable (not more).
Re: unable to detect exception model
On Sun, 25 Jun 2006, Seongbae Park wrote:

> On 6/25/06, Eric Botcazou <[EMAIL PROTECTED]> wrote:
> > > So, something obviously wrong with
> > >
> > > struct max_alignment {
> > >   char c;
> > >   union {
> > >     HOST_WIDEST_INT i;
> > >     long double d;
> > >   } u;
> > > };
> > >
> > > /* The biggest alignment required.  */
> > >
> > > #define MAX_ALIGNMENT (offsetof (struct max_alignment, u))
> > >
> > > for SPARC 32bit?
> >
> > I don't think so, the ABI says 8 in both cases.
> >
> > Note that bootstrap doesn't fail on SPARC/Solaris 2.[56] and (presumably)
> > SPARC/Linux, which have HOST_WIDE_INT == 32, whereas SPARC/Solaris 7+ have
> > HOST_WIDE_INT == 64.  All are 32-bit compilers.
> >
> > Bootstrap doesn't fail on SPARC64/Solaris 7+ either, for which the ABI says
> > 16 for the alignment in both cases.  They are 64-bit compilers.
>
> SPARC psABI3.0 (32bit version) defines long double as 8 byte aligned.
> SCD4.2, 64bit version, defines long double as 16 byte aligned with some caveat
> (which essentially says long double can be 8-byte aligned in some cases
> - fortran common block case - but the compiler should assume
> 16-byte unless it can prove otherwise).
> On 32bit ABI, there's also a possibility of "double"s being only 4-byte aligned
> when a double is passed on the stack.
>
> I don't know enough about gcc's gc to know whether the above can trip it over,
> but the memory allocation (malloc and the likes) shouldn't be a
> problem as long as
> it returns 8-byte aligned block on 32bit and 16-byte aligned on 64bit.

If the above (MAX_ALIGNMENT) is not 8 byte aligned on 32bit and 16 byte aligned on 64bit then we might allocate objects from a malloced page (4096 bytes, e.g.) of sub-objects of size 24, which would not be 16 byte aligned for a requested storage of 16 bytes. (I just made up the numbers for illustration, in this exact way it cannot happen.)

So, allocation via ggc_malloc only guarantees alignment up to MAX_ALIGNMENT, and if that is wrong, MAX_ALIGNMENT has to be fixed. But if it has to, it was wrong before...

Richard.

-- 
Richard Guenther <[EMAIL PROTECTED]> Novell / SUSE Labs
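For readers following the MAX_ALIGNMENT discussion, here is a minimal standalone sketch of the offsetof trick quoted above. It is not GCC's code: `widest_int_t` is a hypothetical stand-in for HOST_WIDEST_INT (assumed here to be `long long`), and `host_max_alignment` and `is_power_of_two` are illustrative helpers that do not exist in GCC.

```c
#include <stddef.h>

/* Hypothetical stand-in for GCC's HOST_WIDEST_INT; the real type is
   chosen at configure time, so `long long` is only an assumption.  */
typedef long long widest_int_t;

/* Same shape as the struct quoted in the thread: one char followed by
   a union of the most strictly aligned scalar types.  */
struct max_alignment {
  char c;
  union {
    widest_int_t i;
    long double d;
  } u;
};

/* The compiler has to pad after `c` so that `u` is suitably aligned,
   so the offset of `u` is the alignment the union requires.  */
#define MAX_ALIGNMENT (offsetof (struct max_alignment, u))

/* Returns nonzero if X is a power of two.  */
static int
is_power_of_two (size_t x)
{
  return x != 0 && (x & (x - 1)) == 0;
}

/* Returns the maximum alignment as computed for the host compiler.  */
size_t
host_max_alignment (void)
{
  return MAX_ALIGNMENT;
}
```

Calling `host_max_alignment ()` on the machine that builds the compiler gives the value the allocator would rely on; the thread expects 8 for 32-bit SPARC and 16 for 64-bit SPARC, but the result on any other host depends on its ABI.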
Re: Question concerning shared libraries in non-standard locations
> I don't think it's a good idea to compile in rpaths to non-standard
> (user-specific) directories by default.

Whether it should be the default is debatable and different people will have different opinions. Having an option to do that seems very useful and desirable to me. That's what we do for Ada already.

Arno
Re: unable to detect exception model
> Reverting your patch makes it go away too.  I'll try and look into it
> tomorrow.

tree
build_string (int len, const char *str)
{
  tree s;
  size_t length;

  length = len + sizeof (struct tree_string);

  s = ggc_alloc_tree (length);

Breakpoint 5, build_string (len=34,
    str=0x1048e58 "No space for profiling buffer(s)\n")
    at /home/eric/svn/gcc/gcc/tree.c:1124
1124      length = len + sizeof (struct tree_string);
(gdb) next
1131      s = ggc_alloc_tree (length);
(gdb) p length
$1 = 58
(gdb) next
1133      memset (s, 0, sizeof (struct tree_common));
(gdb) p s
$2 = 0xff3803fc

's' should be 8-byte aligned because it's a "tree".

ggc_alloc_stat (size=58) at /home/eric/svn/gcc/gcc/ggc-page.c:1089
1089      if (size < 512)
(gdb) next
1091          order = size_lookup[size];
(gdb)
1092          object_size = OBJECT_SIZE (order);
(gdb)
1103      entry = G.pages[order];
(gdb) p order
$6 = 41
(gdb) p object_size
$7 = 60

Breakpoint 6, init_ggc () at /home/eric/svn/gcc/gcc/ggc-page.c:1548
1548          mask = ~(((unsigned)-1) << ffs (OBJECT_SIZE (order)));
(gdb) p order
$19 = 41
(gdb) next
1549          mask &= 2 * MAX_ALIGNMENT - 1;
(gdb) x/i $pc
0xa931c8 :      ld  [ %fp + -24 ], %g1
(gdb) x/i
0xa931cc :      and  %g1, 0xf, %g1

-- 
Eric Botcazou
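The numbers in the trace above can be checked with a little arithmetic. The sketch below assumes, purely for illustration, that objects of one size class are packed back to back from a suitably aligned page start; that is a simplification and not a description of what ggc-page.c actually does. Both function names are hypothetical.

```c
#include <stddef.h>

/* Offset of the K-th object when objects of OBJECT_SIZE bytes are
   packed back to back from the start of a page (an assumed layout).  */
size_t
object_offset (size_t object_size, size_t k)
{
  return object_size * k;
}

/* Largest power of two, capped at CAP, that divides OBJECT_SIZE.
   With the packed layout assumed above, and a page start aligned to
   CAP, this is the alignment every object in the page is guaranteed
   to have.  */
size_t
guaranteed_alignment (size_t object_size, size_t cap)
{
  size_t align = 1;

  while (align < cap && object_size % (align * 2) == 0)
    align *= 2;
  return align;
}
```

Under that assumption a 60-byte size class only guarantees 4-byte alignment, and the 17th slot starts at offset 1020 (0x3fc), which matches the low bits of the pointer 0xff3803fc in the trace and is 4 modulo 8.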
Re: unable to detect exception model
On Mon, 26 Jun 2006, Eric Botcazou wrote:

> > Reverting your patch makes it go away too.  I'll try and look into it
> > tomorrow.
>
> tree
> build_string (int len, const char *str)
> {
>   tree s;
>   size_t length;
>
>   length = len + sizeof (struct tree_string);
>
>   s = ggc_alloc_tree (length);
>
> Breakpoint 5, build_string (len=34,
>     str=0x1048e58 "No space for profiling buffer(s)\n")
>     at /home/eric/svn/gcc/gcc/tree.c:1124
> 1124      length = len + sizeof (struct tree_string);
> (gdb) next
> 1131      s = ggc_alloc_tree (length);
> (gdb) p length
> $1 = 58
> (gdb) next
> 1133      memset (s, 0, sizeof (struct tree_common));
> (gdb) p s
> $2 = 0xff3803fc
>
> 's' should be 8-byte aligned because it's a "tree".

The way it works is that ggc_alloc_stat is asked for 58 bytes, which
if being a correct C object size, has alignof (object) == 2.  Now, with

struct tree_string GTY(())
{
  struct tree_common common;
  int length;
  char str[1];
};

it is unfortunate that we compute the allocation size by doing magic
arithmetic instead of asking for sizeof (struct
tree_string_with_length_FOO) (maybe one can do this with some VLA
type?!).

At least I know what's going on, and given stage3 and yadayada it might
be best to revert the non-bugfixing parts of the patch.  Or adjust all
ggc_alloc callers to request properly aligned storage... e.g. for this
particular case

Index: tree.c
===================================================================
--- tree.c      (revision 115006)
+++ tree.c      (working copy)
@@ -1121,7 +1121,8 @@ build_string (int len, const char *str)
   tree s;
   size_t length;
 
-  length = len + sizeof (struct tree_string);
+  length = (len + sizeof (struct tree_string)
+            + __alignof__ (struct tree_string)) & ~__alignof__ (struct tree_string);
 
 #ifdef GATHER_STATISTICS
   tree_node_counts[(int) c_kind]++;

but with this things going on, the whole reasoning why the patch is
correct falls apart (if we declare doing so correct).

Thanks for tracking this down (and I wonder why ia64 bootstrap succeeded
with trapping misaligned),

Richard.

-- 
Richard Guenther <[EMAIL PROTECTED]> Novell / SUSE Labs
Re: unable to detect exception model
On Mon, 26 Jun 2006, Richard Guenther wrote:

> On Mon, 26 Jun 2006, Eric Botcazou wrote:
>
> > > Reverting your patch makes it go away too.  I'll try and look into it
> > > tomorrow.
> >
> > tree
> > build_string (int len, const char *str)
> > {
> >   tree s;
> >   size_t length;
> >
> >   length = len + sizeof (struct tree_string);
> >
> >   s = ggc_alloc_tree (length);
> >
> > Breakpoint 5, build_string (len=34,
> >     str=0x1048e58 "No space for profiling buffer(s)\n")
> >     at /home/eric/svn/gcc/gcc/tree.c:1124
> > 1124      length = len + sizeof (struct tree_string);
> > (gdb) next
> > 1131      s = ggc_alloc_tree (length);
> > (gdb) p length
> > $1 = 58
> > (gdb) next
> > 1133      memset (s, 0, sizeof (struct tree_common));
> > (gdb) p s
> > $2 = 0xff3803fc
> >
> > 's' should be 8-byte aligned because it's a "tree".
>
> The way it works is that ggc_alloc_stat is asked for 58 bytes, which
> if being a correct C object size, has alignof (object) == 2.  Now, with
>
> struct tree_string GTY(())
> {
>   struct tree_common common;
>   int length;
>   char str[1];
> };
>
> it is unfortunate that we compute the allocation size by doing magic
> arithmetic instead of asking for sizeof (struct
> tree_string_with_length_FOO) (maybe one can do this with some VLA
> type?!).

Note that at present

  length = len + sizeof (struct tree_string);

always allocates too much, because sizeof (struct tree_string) is a
multiple of alignof (struct tree_string) and so has the trailing char[]
array padded to 8 bytes (in your case).  So even

  (len + sizeof (struct tree_string)) & ~__alignof__(struct tree_string)

might magically work in every case.

Richard.

-- 
Richard Guenther <[EMAIL PROTECTED]> Novell / SUSE Labs
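The over-allocation described above can be seen with a small stand-in struct. This is only a sketch: `struct demo_string` is hypothetical, with a `double` member taking the place of struct tree_common to force some alignment, and the minimal size assumes one extra byte is wanted for a terminating NUL.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct tree_string.  `align_me` replaces
   struct tree_common and only exists to give the struct an alignment
   larger than that of `str`.  */
struct demo_string {
  double align_me;
  int length;
  char str[1];
};

/* Size requested by the arithmetic used in build_string: the whole
   struct, including its tail padding, plus LEN.  */
size_t
size_as_requested (size_t len)
{
  return len + sizeof (struct demo_string);
}

/* Bytes actually needed: everything up to `str`, then LEN characters
   and (by assumption) one terminating NUL.  */
size_t
size_minimal (size_t len)
{
  return offsetof (struct demo_string, str) + len + 1;
}
```

The difference between the two is `sizeof (struct demo_string) - offsetof (struct demo_string, str) - 1`, that is, the tail padding after `str[1]`; how many bytes that is depends on the host ABI.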
Re: unable to detect exception model
On Jun 26, 2006, at 2:07 AM, Richard Guenther wrote:

> So even
> (len + sizeof (struct tree_string)) & ~__alignof__(struct tree_string)
> might magically work in every case.

Of course using __alignof__ is wrong in GCC sources, since that would mean you have to use GCC to bootstrap, which is not a documented requirement.

But this does not explain PPC-Darwin's problem, as PPC is not a STRICT_ALIGNMENT target. So either we are allocating too little or someone is going past an array bound somewhere.

-- Pinski
Re: unable to detect exception model
> but with this things going on, the whole reasoning why the patch is
> correct falls apart (if we declare doing so correct).

That's my understanding too. :-(

> Thanks for tracking this down (and I wonder why ia64 bootstrap succeeded
> with trapping misaligned),

Note that SPARC64 bootstrap succeeds too. The discrepancy SPARC32/SPARC64 stems from the fact that SPARC32 also uses 64-bit instructions in some cases, while there are no 128-bit integer instructions on SPARC64, so word alignment is enough in practice.

-- 
Eric Botcazou
Re: unable to detect exception model
On Mon, 26 Jun 2006, Andrew Pinski wrote:

> On Jun 26, 2006, at 2:07 AM, Richard Guenther wrote:
>
> > So even
> > (len + sizeof (struct tree_string)) & ~__alignof__(struct tree_string)
> > might magically work in every case.
>
> Of course using __alignof__ is wrong in GCC sources since that would mean
> you have to use GCC to bootstrap with which is not documented.
>
> But this does not explain PPC-Darwin's problem as PPC is not a
> STRICT_ALIGNMENT target.
> So either we are allocating too little or someone is going past an array
> bounds somewhere.

I'll currently investigate "fixing" build_string, which interestingly
fails and may hint at a problem elsewhere.  Of course alignof is wrong,
but one can use

  /* Make sure to request aligned storage and do not waste bytes
     provided by padding of struct tree_string.  */
  length = ((len + sizeof (struct tree_string))
            & ~(sizeof (struct tree_string)
                - offsetof (struct tree_string, str) - 1));

instead (not pretty, but works).

Richard.

-- 
Richard Guenther <[EMAIL PROTECTED]> Novell / SUSE Labs
Re: unable to detect exception model
On Mon, 26 Jun 2006, Richard Guenther wrote:

> On Mon, 26 Jun 2006, Andrew Pinski wrote:
>
> > On Jun 26, 2006, at 2:07 AM, Richard Guenther wrote:
> >
> > > So even
> > > (len + sizeof (struct tree_string)) & ~__alignof__(struct tree_string)
> > > might magically work in every case.
> >
> > Of course using __alignof__ is wrong in GCC sources since that would mean
> > you have to use GCC to bootstrap with which is not documented.
> >
> > But this does not explain PPC-Darwin's problem as PPC is not a
> > STRICT_ALIGNMENT target.
> > So either we are allocating too little or someone is going past an array
> > bounds somewhere.
>
> I'll currently investigate "fixing" build_string, which interestingly
> fails and may hint at a problem elsewhere.  Of course alignof is wrong,
> but one can use
>
>   /* Make sure to request aligned storage and do not waste bytes
>      provided by padding of struct tree_string.  */
>   length = ((len + sizeof (struct tree_string))
>             & ~(sizeof (struct tree_string)
>                 - offsetof (struct tree_string, str) - 1));
>
> instead (not pretty, but works).

Though it will not fix the alignment problem, only not waste the bytes
from the padding (which is why using __alignof__ didn't work, it was
wrong).  A fix that also fixes the alignment problem would look like

  /* Make sure to request aligned storage and do not waste bytes
     provided by padding of struct tree_string.  */
  length = ((len + sizeof (struct tree_string))
            & ~(sizeof (struct tree_string)
                - offsetof (struct tree_string, str) - 1));
  length = ((length + __alignof__ (struct tree_string) - 1)
            & ~(__alignof__ (struct tree_string) - 1));

that is, align length - I don't know how to avoid using __alignof__ here,
though, other than using a fake struct like

  struct foo_align_for_tree_string {
    char c;
    struct tree_string s;
  }

and offsetof (struct foo_align_for_tree_string, s).  Which would make it
as ugly as

  struct dummy_to_get_alignof_tree_string {
    char c;
    struct tree_string s;
  };

  /* Make sure to request aligned storage and do not waste bytes
     provided by padding of struct tree_string.  */
  length = ((len + sizeof (struct tree_string))
            & ~(sizeof (struct tree_string)
                - offsetof (struct tree_string, str) - 1));
  length = ((length + offsetof (struct dummy_to_get_alignof_tree_string, s) - 1)
            & ~(offsetof (struct dummy_to_get_alignof_tree_string, s) - 1));

:/

I'll go ahead and revert the ggc-page.c patch now.

Richard.

-- 
Richard Guenther <[EMAIL PROTECTED]> Novell / SUSE Labs
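The dummy-struct trick from the message above, pulled out into a standalone sketch. None of these names come from GCC: `struct demo_node` stands in for struct tree_string, `struct demo_node_align_probe` is the fake wrapper, and `round_up` assumes its alignment argument is a power of two. Only the final rounding step is shown, not the padding mask.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct tree_string.  */
struct demo_node {
  double align_me;
  int length;
  char str[1];
};

/* A char followed by the struct: the compiler must pad after `c` up to
   the struct's alignment, so offsetof yields that alignment without
   needing the GNU __alignof__ extension.  */
struct demo_node_align_probe {
  char c;
  struct demo_node s;
};

#define DEMO_NODE_ALIGN (offsetof (struct demo_node_align_probe, s))

/* Rounds N up to the next multiple of ALIGN.  ALIGN is assumed to be a
   power of two.  */
size_t
round_up (size_t n, size_t align)
{
  return (n + align - 1) & ~(align - 1);
}

/* Allocation size for a string of LEN characters, rounded up so that
   the request itself is a multiple of the struct's alignment.  */
size_t
aligned_node_size (size_t len)
{
  return round_up (len + sizeof (struct demo_node), DEMO_NODE_ALIGN);
}
```

For instance `round_up (58, 8)` gives 64, so a 58-byte request like the one in Eric's trace would become a size whose alignment the allocator can honour.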
Re: Project RABLET
On Sun, 2006-06-25 at 01:04 -0400, Vladimir N. Makarov wrote:
> Andrew MacLeod wrote:

> Having no information about the final register allocator decision,
> the partial register pressure reducing through rematerialization is
> not working in many cases.  For example, making rematerialization of
>
> a <- b + c
>
> when you reduce the pressure from 100 to 50 for x86 there is a big
> chance that b and c will be not placed in hard registers.  Instead of
> one load (of a), two loads (b and c) will be needed.  This result code
> is even worse than before reducing pressure.

Having implemented a complex rematerialization pass before, I understand it well enough to know that replacing 'a' with 'b + c' is the wrong thing to do, unless at least one of b or c is used again right next to the use of 'a'. That can actually increase register pressure, or have nothing more than a neutral effect.

It's *way* more complicated than simply moving expressions downward. It's tracking all the things used in expressions and determining that at a given location, it *is* worthwhile to do it because the correct values are already trivially available, or can alternatively be recomputed in a worthwhile way. Often, there are only a few worthwhile remats out of all the possible ones.

In general, I have never found the primary benefit of rematerialization to be register pressure reduction, but rather one of avoiding placing stores to spill in the instruction stream. When there is a lot of spilling, those stores can really bog down a pipeline.

> So rematerialization in out-of-ssa pass will work well only for full
> pressure relief (to the level equal to the number of hard registers)
> or close to the full relief.

That's a pretty broad assertion to make, and I disagree with it :-) Especially when you only talk about a single component of the whole. RABLET is a group of things which tend to enable each other. Any one of them by themselves would actually accomplish less. Some of the required components have a benefit of some sort unrelated to the others (such as faster out of ssa translation). Others require interaction with the whole to see any benefit.

Remember that one of the components is a new integrated expand... RABLET will have an understanding of the instructions that can be generated, and will try to make use of those when making decisions.

An additional benefit likely to be seen from this is that code tends to utilize more of the variable names the user will recognize, rather than something like 'SPILL_678'. And maybe, just maybe, it'll be easier to get the debug info correct (TER can muck it up pretty badly :-)

> The SSA pressure relief through rematerialization described in
> Simpson's theses is oriented for such architectures (with a big
> regular register file size of 32 as I remember).  So it can work for
> ppc but it will be less successful for major interest platforms x86 and
> x86_64.

I haven't read the thesis, but I would be surprised if it describes what I am planning to do. Without the integration with expand to do instruction selection, a few tweaks in some RTL optimizations, and a very specific gcc-oriented union of all these components, what I am planning to do would be completely worthless. Remat is really a small part of it.

What benefit will I see? Well, time will tell :-)

Andrew
Matching of non-standard instructions
Hi,

My target has some instructions that do not exactly match any predefined pattern names. What is the correct way to get gcc to use them in code generation?

For example, I have an add instruction that can add a 32-bit integer (with or without sign extension) to a 64-bit operand and store the result as 64 bits.

C code like:

  __int64_t a = 1;
  int b = 2;
  a += b;

will generate code that sign or zero extends b into a 64 bit operand and then apply the adddi3 pattern.

I've unsuccessfully tried to figure out how to do this from the gcc internals documentation and looking at some other ports. Some hints to get me going in the right direction would be much appreciated.

I've tried adding unnamed patterns for these instructions, using zero_extend or sign_extend but I guess those will not get used because the generation has already applied the adddi3 pattern when it generates the RTL in the first place.

Roland
Re: Project RABLET
On 6/26/06, Andrew MacLeod <[EMAIL PROTECTED]> wrote:
> On Sun, 2006-06-25 at 01:04 -0400, Vladimir N. Makarov wrote:
> > Andrew MacLeod wrote:
>
> > Having no information about the final register allocator decision,
> > the partial register pressure reducing through rematerialization is
> > not working in many cases.  For example, making rematerialization of
> >
> > a <- b + c
> >
> > when you reduce the pressure from 100 to 50 for x86 there is a big
> > chance that b and c will be not placed in hard registers.  Instead of
> > one load (of a), two loads (b and c) will be needed.  This result code
> > is even worse than before reducing pressure.
>
> Having implemented a complex rematerialization pass before, I understand
> it well enough to know that replacing 'a' with 'b + c' is the wrong thing
> to do, unless at least one of b or c is used again right next to the use
> of 'a'. That can actually increase register pressure, or have nothing
> more than a neutral effect.
>
> Its *way* more complicated than simply moving expressions downward. Its
> tracking all the things used in expressions and determining that at a
> given location, it *is* worthwhile to do it because the correct values
> are already trivially available, or can alternatively be recomputed in a
> worthwhile way. Often, there are only a few worthwhile remats out of all
> the possible ones.
>
> In general, I have never found the primary benefit of rematerialization
> to be register pressure reduction, but rather one of avoiding placing
> stores to spill in the instruction stream. When there is a lot of
> spilling, those stores can really bog down a pipeline.
>
> > So rematerialization in out-of-ssa pass will work well only for full
> > pressure relief (to the level equal to the number of hard registers)
> > or close to the full relief.
>
> That's a pretty broad assertion to make, and I disagree with it :-)
> Especially when you only talk about a single component of the whole.
> RABLET is a group of things which tend to enable each other. Any one of
> them by themselves would actually accomplish less. Some of the required
> components have a benefit of some sort unrelated to the others (such as
> faster out of ssa translation). Others require interaction with the whole
> to see any benefit.
>
> Remember that one of the components is a new integrated expand... RABLET
> will have an understanding of the instructions that can be generated, and
> will try to make use of those when making decisions.

I think the overall idea of RABLET is quite sound. Especially if integrating expand with out-of-ssa this might provide early instruction selection (now, that would possibly require moving all of RTL loop optimizations before this point). Of course re-doing expand is the hard part here...

Richard.
Re: unable to detect exception model
Richard Guenther wrote:

> I'll go ahead and revert the ggc-page.c patch now.

Thanks, I think that's the right call. I'm sorry I didn't spot this issue in my review. The idea you have is a good one, but it does look like some of the funny games we're playing get in the way.

-- 
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713
Re: unable to detect exception model
On Mon, 26 Jun 2006, Mark Mitchell wrote:

> Richard Guenther wrote:
>
> > I'll go ahead and revert the ggc-page.c patch now.
>
> Thanks, I think that's the right call.  I'm sorry I didn't spot this
> issue in my review.  The idea you have is a good one, but it does look
> like some of the funny games we're playing get in the way.

Yes, we can revisit this in stage1 again. It should be only a few places to fix, but this is too risky for stage3 given that breakage would occur randomly, depending on GC allocation choices.

Richard.

-- 
Richard Guenther <[EMAIL PROTECTED]> Novell / SUSE Labs
Re: Project RABLET
Andrew MacLeod wrote:

> On Sun, 2006-06-25 at 01:04 -0400, Vladimir N. Makarov wrote:
> > Andrew MacLeod wrote:
> >
> > The SSA pressure relief through rematerialization described in
> > Simpson's theses is oriented for such architectures (with a big
> > regular register file size of 32 as I remember).  So it can work for
> > ppc but it will be less successful for major interest platforms x86 and
> > x86_64.
>
> I haven't read the thesis, but I would be surprised if it describes what
> I am planning to do.  Without the integration with expand to do
> instruction selection, a few tweaks in some RTL optimizations, and a very
> specific gcc-oriented union of all these components, what I am planning
> to do would be completely worthless.  Remat is really a small part of it.

The thesis contains a small part about different heuristics for reducing register pressure on SSA through rematerialization (what and where to rematerialize). I see your plan is much bigger: not only reducing pressure through rematerialization but other things, including better RTL expansion and, more important, doing it in an integrated way. In any case, it looks like an interesting project. Good luck with that.

> What benefit I will see? Well, time will tell :-)
Predcom branch status & call for testing
Hello,

predcom branch is now ready for testing. We basically implemented second-order predictive commoning (quick overview can be found e.g. in http://www.cs.ualberta.ca/~amaral/cascon/CDP04/slides/tal.pdf, or at the beginning of tree-predcom.c file).

There are still some issues we are working on (in SPEC2000, bzip2 and applu are miscompiled, and some of the implementation details are realized in a very inefficient way), but we are now getting the expected results for mgrid, as well as other smaller testcases.

The recommended flags for testing are

  -O2 -funsafe-math-optimizations

versus

  -O2 -funsafe-math-optimizations -fno-predictive-commoning

(-funsafe-math-optimizations is necessary to take advantage of aggressive reassociation in mgrid and other testcases that use fp arithmetic).

Zdenek
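For readers unfamiliar with the term, the basic idea behind predictive commoning can be shown by hand on a small loop: a value loaded in one iteration is kept in a scalar and reused in the next instead of being loaded again. The sketch below is a hand-written illustration of that load reuse only; it is not output of the predcom branch, it does not attempt to show the second-order variant, and both function names are made up for the example.

```c
#include <stddef.h>

/* Straightforward loop: every iteration loads both a[i] and a[i + 1],
   so each element except the first and last is loaded twice.  */
void
sum_adjacent_plain (const double *a, double *b, size_t n)
{
  for (size_t i = 0; i + 1 < n; i++)
    b[i] = a[i] + a[i + 1];
}

/* Hand-transformed loop: the value loaded as a[i + 1] is carried over
   in `prev` and reused as a[i] in the following iteration, so each
   element is loaded only once.  */
void
sum_adjacent_commoned (const double *a, double *b, size_t n)
{
  if (n < 2)
    return;

  double prev = a[0];
  for (size_t i = 0; i + 1 < n; i++)
    {
      double next = a[i + 1];
      b[i] = prev + next;
      prev = next;
    }
}
```

Both functions write the same n - 1 results into `b`; the second just does it with roughly half as many loads from `a`.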
Re: ARM gcc 4.1 optimization bug.
Dave Korn wrote:

> On 06 June 2006 15:17, Richard Earnshaw wrote:
>
> > On Tue, 2006-06-06 at 15:05, Dirk Behme wrote:
> > > Fengwei Yin wrote:
> > > > Hi Daniel,
> > > > I have already reported this bug. The bug number is #27363.
> > > > I also tried the gcc snapshot 4.1.1-20060421. The bug is not
> > > > fixed in this version too.
> > > >
> > > > On 5/1/06, Daniel Jacobowitz <[EMAIL PROTECTED]> wrote:
> > > > > On Sun, Apr 30, 2006 at 11:03:05AM +0800, Fengwei Yin wrote:
> > > > > > I am using gcc4.1 for ARM to build Linux kernel. But there is
> > > > > > a bug related to the gcc optimization.
> > > > > > I assume this is correct mail list to report this bug. If not,
> > > > > > please let me know.
> > > > >
> > > > > No, if you have a bug report, please use the bug reporting system.
> > > > > Please see: http://gcc.gnu.org/bugs.html
> > >
> > > Any news regarding this? Seems that I have the same problem. However, I
> > > first selected an other list
> > >
> > > http://sourceware.org/ml/crossgcc/2006-06/msg00032.html
> > >
> > > before finding this ;)
> > >
> > > Looking at
> > >
> > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27363
> > >
> > > it looks to me that this isn't fixed at 4.1.1?
> >
> > The bug is in state 'WAITING', which means there is not enough
> > information for us to investigate the problem.  See
> > http://gcc.gnu.org/bugs.html for details of what we need.
>
> Just to elaborate: we need a simple self-contained testcase that anybody can
> compile themselves and see the bug.  There's no possible way to analyze and fix
> it without being able to recreate it!
>
> In the bug report, you wrote
>
> "I tried to make a simple test example for this bug. But If I put the code
> from ALSA subsystem of Linux kernel to a test.c file, the gcc will product
> correct assembly code."
>
> So, what you need to do is re-do the original kernel build, but add
> "--save-temps" to the compiler flags, then find the .i file with the
> preprocessed source code for the failing module.  This will be entirely
> selfcontained and will reproduce the bug, because that's what the compiler
> actually gets fed with in the case where you do see the bug; when you extracted
> it to a separate file there was probably some subtle difference related maybe
> to macro #defines or something that didn't match up.

We now put the code generated with --save-temps and the resulting assembly files for different optimization levels as attachment to

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27363

Anything else needed?

Best regards

Dirk
The default -mpic-register for -mthumb
The default -mpic-register (on ARM/Thumb) is r10. In -mthumb mode, r10 is not available to the math instructions as a direct argument. On top of that, preserving r10 complicates the function prologue. Does it make more sense to use a directly accessible register, r7 for example, as the default -mpic-register when in -mthumb mode? Can a non-preserved register, such as r3, be used instead? (assuming the compiler saves the function argument in r3 somewhere) Why is the PIC register fixed? Could the compiler register allocation logic be allowed to allocate the _GLOBAL_OFFSET_TABLE_ pointer the same as any other variable? In which case it would probably realise that cost(r3) < cost(r7) < cost(r10) at least in the case where r3 isn't being used by a function argument. Cheers, Shaun
RE: Fortran Compiler
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
> Behalf Of hector riojas roldan
> Sent: Friday, June 23, 2006 5:40 PM
> To: gcc@gcc.gnu.org
> Subject: Fortran Compiler
>
> Hello, I would like to know if there is a fortran compiler
> that runs on AMD 64 bits. I have installed suse 10.1 linux on
> my computer, I would really apreciated all your help. I heard
> yours also have C and
> C++.
> Thank you very much, I write you from Argentina, héctor Riojas Roldan

The GNU Compiler Collection includes a Fortran compiler. I believe under SUSE Linux this is in the gcc-fortran package, which you can install with YaST.
Re: The default -mpic-register for -mthumb
On Mon, 2006-06-26 at 17:48, Shaun Jackman wrote:
> The default -mpic-register (on ARM/Thumb) is r10. In -mthumb mode, r10
> is not available to the math instructions as a direct argument. On top
> of that, preserving r10 complicates the function prologue. Does it
> make more sense to use a directly accessible register, r7 for example,
> as the default -mpic-register when in -mthumb mode? Can a
> non-preserved register, such as r3, be used instead? (assuming the
> compiler saves the function argument in r3 somewhere)
>
> Why is the PIC register fixed? Could the compiler register allocation
> logic be allowed to allocate the _GLOBAL_OFFSET_TABLE_ pointer the
> same as any other variable? In which case it would probably realise
> that
>   cost(r3) < cost(r7) < cost(r10)
> at least in the case where r3 isn't being used by a function argument.

As of gcc-4.2 it isn't fixed, it's just like any other pseudo generated by the compiler.

R.
Re: Matching of non-standard instructions
"Roland Persson" <[EMAIL PROTECTED]> writes:

> For example, I have an add instruction that can add a 32-bit integer (with
> or without sign extension) to a 64-bit operand and store the result as 64
> bits.
>
> C code like:
>   __int64_t a = 1;
>   int b = 2;
>   a += b;
>
> will generate code that sign or zero extends b into a 64 bit operand and
> then apply the adddi3 pattern.
>
> I've unsuccessfully tried to figure out how to do this from the gcc
> internals documentation and looking at some other ports. Some hints to get
> me going in the right direction would be much appreciated.
>
> I've tried adding unnamed patterns for these instructions, using
> zero_extend or sign_extend but I guess those will not get used because the
> generation has already applied the adddi3 pattern when it generates the RTL
> in the first place.

Adding unnamed patterns is correct. Although they won't be used when expanding trees into RTL, they will be available for use by the combine pass.

Ian
Re: Matching of non-standard instructions
On Mon, Jun 26, 2006 at 04:16:28PM +0200, Roland Persson wrote:
> Hi,
>
> My target has some instructions that do not exactly match any predefined
> pattern names. What is the correct way to get gcc to use them in code
> generation?

Please see <http://gcc.gnu.org/onlinedocs/gccint/Patterns.html>.

> For example, I have an add instruction that can add a 32-bit integer (with
> or without sign extension) to a 64-bit operand and store the result as 64
> bits.
[cut]
> I've tried adding unnamed patterns for these instructions, using
> zero_extend or sign_extend but I guess those will not get used because the
> generation has already applied the adddi3 pattern when it generates the RTL
> in the first place.

In a similar case, only with 8 and 16 bits, respectively, I'm having success with this pattern, which you could use as a starting point:

(define_insn "*zero_extendqihi_addhi3"
  [(set (match_operand:HI 0 "nonimmediate_operand" "=q,m")
        (plus:HI (zero_extend:HI (match_operand:QI 1 "general_operand" "qmi,qi"))
                 (match_operand:HI 2 "nonimmediate_operand" "0,0")))
   (clobber (reg:CC CC_REG))]
  ""
  ...
)

(It would have been a help to us to see examples of insn patterns you have tried unsuccessfully.)

You will need to enable code optimization. The optimization pass you want in this case is named "combine". You should consult the manual so you know what sort of optimizations you can expect.

In addition to the instruction pattern, you may have to adjust the RTX cost calculation (TARGET_RTX_COSTS) so that the optimizers know if this instruction is cheaper than two individual instructions. This is not difficult as such, but can be a lot of work if you want to do this for all instruction patterns and get reasonably accurate costs.

The only really difficult part is knowing which patterns combine is looking for. AFAIK, there is no way, short of patching combine, to make combine reveal such important information. Some sort of clue is given by <http://gcc.gnu.org/onlinedocs/gccint/Insn-Canonicalizations.html> but is not the whole story.

-- 
Rask Ingemann Lambertsen
Re: Predcom branch status & call for testing
On Jun 26, 2006, at 8:56 AM, Zdenek Dvorak wrote:

> Hello,
>
> predcom branch is now ready for testing.

Very cool.

> We basically implemented second-order predictive commoning (quick
> overview can be found e.g. in
> http://www.cs.ualberta.ca/~amaral/cascon/CDP04/slides/tal.pdf, or at
> the beginning of tree-predcom.c file).

When I originally ran across this work, I emailed the author about it. He mentioned that he/IBM had filed for several patents on the work. I haven't corresponded with him since then, but it may be worth checking out.

-Chris
Re: "Free as in Freedom"
On Sun, Jun 25, 2006 at 05:02:41PM -0700, Alexander Verhaeghe wrote:
> Quote Jan-Benedict Glaw "So please shut up now."
>
> Quite friendly I must say, it's the german way I
> suppose of handling things?
>
> To Jan-Benedict Glaw I WON'T SHUT UP because of "Free
> as in Freedom"!

That's fine. Just talk to someone else, e.g. a hired babysitter.

p.s. and cross-posting to more than half a dozen lists and individuals isn't exactly considered polite either ;-)

p.p.s. have a lot of fun fighting the internet fundamentals :)
Re: The default -mpic-register for -mthumb
On 6/26/06, Richard Earnshaw <[EMAIL PROTECTED]> wrote:
> As of gcc-4.2 it isn't fixed, it's just like any other pseudo generated
> by the compiler.

Glad to hear it!

Thanks,
Shaun
Specifying a MULTILIB dependency
I'm using MULTILIB_OPTIONS and MULTILIB_DIRNAMES to compile a PIC/XIP toolchain. I'm familiar with the MULTILIB_EXCEPTIONS mechanism to specify incompatible configurations. How, though, do I indicate that msingle-pic-base depends on fPIC?

MULTILIB_OPTIONS  += fPIC
MULTILIB_DIRNAMES += pic
MULTILIB_OPTIONS  += msingle-pic-base
MULTILIB_DIRNAMES += xip

Please cc me in your reply.

Thanks,
Shaun
Why does __float80 depend on -mmmx/-msse?
There is

ix86_init_mmx_sse_builtins ()
{
  ..
  /* The __float80 type.  */
  if (TYPE_MODE (long_double_type_node) == XFmode)
    (*lang_hooks.types.register_builtin_type) (long_double_type_node,
                                               "__float80");
  else
    {
      /* The __float80 type.  */
      float80_type = make_node (REAL_TYPE);
      TYPE_PRECISION (float80_type) = 80;
      layout_type (float80_type);
      (*lang_hooks.types.register_builtin_type) (float80_type, "__float80");
    }

That means __float80 is only available when -mmmx/-msse is used for the
32-bit compiler. Why does __float80 have to depend on -mmmx/-msse?

H.J.
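As a side note on the first branch above: from user code, one rough way to tell whether long double already is an 80-bit extended type (the XFmode case) is to look at its significand width in <float.h>. This is a minimal sketch with a made-up function name; a 64-bit significand is what the x87 extended format provides, but the check is only a heuristic and does not query GCC's internal mode.

```c
#include <assert.h>
#include <float.h>

/* Returns 1 if long double is a binary format with a 64-bit
   significand, as the x87 80-bit extended format is; 0 otherwise.
   Heuristic only: it inspects <float.h>, not GCC's TYPE_MODE.  */
int
long_double_has_64bit_significand (void)
{
  /* The C standard guarantees long double is at least as precise
     as double.  */
  assert (LDBL_MANT_DIG >= DBL_MANT_DIG);
  return FLT_RADIX == 2 && LDBL_MANT_DIG == 64;
}
```

On targets where this returns 0 (for example where long double is the same as double), the else branch quoted above would be the one that creates a separate 80-bit type.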
Testing MULTILIB configurations
After I've modified the MULTILIB options in t-arm-elf, what's the fastest way to test the new configuration without rebuilding the entire toolchain? Right now the best method I have is `make clean-gcc all-gcc', which is admittedly quite slow. Please cc me in your reply. Thanks! Shaun
Re: Specifying a MULTILIB dependency
"Shaun Jackman" <[EMAIL PROTECTED]> writes:

> I'm using MULTILIB_OPTIONS and MULTILIB_DIRNAMES to compile a PIC/XIP
> toolchain. I'm familiar with the MULTILIB_EXCEPTIONS mechanism to
> specify incompatible configurations. How, though, do I indicate that
> msingle-pic-base depends on fPIC?
>
> MULTILIB_OPTIONS+= fPIC
> MULTILIB_DIRNAMES += pic
> MULTILIB_OPTIONS+= msingle-pic-base
> MULTILIB_DIRNAMES += xip

The usual hacked up way is to use MULTILIB_EXCEPTIONS to remove
-msingle-pic-base without -fPIC. Something like

MULTILIB_EXCEPTIONS = -msingle-pic-base

might do it.

Ian
Re: Why does __float80 depend on -mmmx/-msse?
"H. J. Lu" <[EMAIL PROTECTED]> writes:

> There are
>
> ix86_init_mmx_sse_builtins ()
> {
>   ..
>   /* The __float80 type.  */
>   if (TYPE_MODE (long_double_type_node) == XFmode)
>     (*lang_hooks.types.register_builtin_type) (long_double_type_node,
>                                                "__float80");
>   else
>     {
>       /* The __float80 type.  */
>       float80_type = make_node (REAL_TYPE);
>       TYPE_PRECISION (float80_type) = 80;
>       layout_type (float80_type);
>       (*lang_hooks.types.register_builtin_type) (float80_type,
>                                                  "__float80");
>     }
>
> That means __float80 is only available when -mmmx/-msse is used for
> 32bit compiler. Why does __float80 have to depend on -mmmx/-msse?

As far as I can see, it doesn't. On the other hand, as far as I can see,
__float80 is undocumented and unused for the i386. Why does it exist?

Ian
Re: Specifying a MULTILIB dependency
On 26 Jun 2006 14:04:36 -0700, Ian Lance Taylor <[EMAIL PROTECTED]> wrote:
> The usual hacked up way is to MULTILIB_EXCEPTIONS to remove
> -msingle-pic-base without -fPIC. Something like
>
> MULTILIB_EXCEPTIONS = -msingle-pic-base
>
> might do it.

I tried your suggestion, but it didn't seem to have the desired effect.
With my limited understanding of how MULTILIB_EXCEPTIONS works, I thought
the following attempt might have a chance:

MULTILIB_EXCEPTIONS += *!fPIC*/*msingle-pic-base*

No luck though.

Cheers,
Shaun
Re: Specifying a MULTILIB dependency
"Shaun Jackman" <[EMAIL PROTECTED]> writes:

> On 26 Jun 2006 14:04:36 -0700, Ian Lance Taylor <[EMAIL PROTECTED]>
> > The usual hacked up way is to MULTILIB_EXCEPTIONS to remove
> > -msingle-pic-base without -fPIC. Something like
> >
> > MULTILIB_EXCEPTIONS = -msingle-pic-base
> >
> > might do it.
>
> I tried your suggestion, but it didn't seem to have the desired
> effect. With my limited understanding of how MULTILIB_EXCEPTIONS
> works, I thought the following attempt might have a chance:
>
> MULTILIB_EXCEPTIONS += *!fPIC*/*msingle-pic-base*

No, that wouldn't work. MULTILIB_EXCEPTIONS takes a shell glob pattern.
It is invoked for each option set which is going to be generated. I would
expect that one of the option sets would be simply "-msingle-pic-base".

So it seems to me that MULTILIB_EXCEPTIONS can be made to work, if you
play with the syntax. However, if it can not be made to work for some
reason, then I don't know of anything else which will do what you want.
Sorry.

Ian
Re: Specifying a MULTILIB dependency
On 26 Jun 2006 14:42:20 -0700, Ian Lance Taylor <[EMAIL PROTECTED]> wrote:
> No, that wouldn't work. MULTILIB_EXCEPTIONS takes a shell glob pattern.
> It is invoked for each option set which is going to be generated. I
> would expect that one of the option sets would be simply
> "-msingle-pic-base".
>
> So it seems to me that MULTILIB_EXCEPTIONS can be made to work, if you
> play with the syntax. However, if it can not be made to work for some
> reason, then I don't know of anything else which will do what you want.
> Sorry.

Reading through gcc/genmultilib, it looks as though MULTILIB_EXCLUSIONS
can take a '!' parameter, but MULTILIB_EXCEPTIONS cannot.

Here's another failed experiment:

MULTILIB_OPTIONS    += fno-pic/fPIC
MULTILIB_DIRNAMES   += nopic pic
#
MULTILIB_OPTIONS    += mno-single-pic-base/msingle-pic-base
MULTILIB_DIRNAMES   += noxip xip
MULTILIB_EXCEPTIONS += *fno-pic*/*msingle-pic-base*

I wish I knew what text the shell glob pattern was globbing against. It's
clearly not using the resulting directory structure, since the EXCEPTIONS
use the OPTIONS names and not the DIRNAMES.

Cheers,
Shaun
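To make the glob behaviour concrete, the patterns tried in this thread can be checked against candidate strings with POSIX fnmatch(), which follows the same matching rules as a shell `case` pattern. This is a minimal sketch: glob_matches is a made-up helper, and the candidate strings are guesses at what genmultilib might compare against, not something confirmed here.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Returns 1 if the shell glob PATTERN matches STRING, 0 otherwise.
   With flags == 0, '*' may also match '/' characters.  */
int
glob_matches (const char *pattern, const char *string)
{
  assert (pattern != NULL && string != NULL);
  return fnmatch (pattern, string, 0) == 0;
}
```

Outside a bracket expression, '!' is an ordinary character in a glob, so glob_matches ("*!fPIC*/*msingle-pic-base*", "fPIC/msingle-pic-base") is 0: the pattern demands a literal '!' in the text. By contrast, glob_matches ("*fno-pic*/*msingle-pic-base*", "fno-pic/msingle-pic-base") is 1, so if that experiment failed, the text being matched presumably has a different shape than assumed here.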
Re: Specifying a MULTILIB dependency
"Shaun Jackman" <[EMAIL PROTECTED]> writes:

> Reading through gcc/genmultilib, it looks as though
> MULTILIB_EXCLUSIONS can take a '!' parameter, but MULTILIB_EXCEPTIONS
> cannot.

I forgot about MULTILIB_EXCLUSIONS (it might be nice if it were
documented). I don't know if it would help you, since it is evaluated at
runtime--it will still build the library, it just won't use it.

> Here's another failed experiment:
>
> MULTILIB_OPTIONS+= fno-pic/fPIC
> MULTILIB_DIRNAMES += nopic pic
> #
> MULTILIB_OPTIONS+= mno-single-pic-base/msingle-pic-base
> MULTILIB_DIRNAMES += noxip xip
> MULTILIB_EXCEPTIONS += *fno-pic*/*msingle-pic-base*
>
> I wish I knew what text the shell glob pattern was globbing against.
> It's clearly not using the resulting directory structure, since the
> EXCEPTIONS use the OPTIONS names and not the DIRNAMES.

To see the kind of thing which MULTILIB_EXCEPTIONS checks against, run
gcc --print-multi-lib. The string before the ';' is probably the one being
checked.

Ian
Re: Specifying a MULTILIB dependency
On 6/26/06, Shaun Jackman <[EMAIL PROTECTED]> wrote:
> Reading through gcc/genmultilib, it looks as though MULTILIB_EXCLUSIONS
> can take a '!' parameter, but MULTILIB_EXCEPTIONS cannot.

The solution was to use MULTILIB_EXCLUSIONS!

MULTILIB_EXCLUSIONS += !fPIC/msingle-pic-base

Yeeha!

Cheers,
Shaun
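One way to read the exclusion `!fPIC/msingle-pic-base` is "reject any option set that has msingle-pic-base but lacks fPIC". The sketch below models that reading. It is an assumption about the semantics, not code taken from GCC, and the helper names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the option NAME (of length LEN) appears in OPTS.  */
static int
has_option (const char *name, size_t len,
            const char *const *opts, size_t nopts)
{
  for (size_t i = 0; i < nopts; i++)
    if (strlen (opts[i]) == len && strncmp (opts[i], name, len) == 0)
      return 1;
  return 0;
}

/* Assumed semantics, not taken from GCC itself: EXCLUSION is a list of
   option names separated by '/'.  It applies when every plain name is
   present in OPTS and every name prefixed with '!' is absent.  */
int
exclusion_applies (const char *exclusion,
                   const char *const *opts, size_t nopts)
{
  assert (exclusion != NULL);
  const char *p = exclusion;
  while (*p != '\0')
    {
      int negated = (*p == '!');
      if (negated)
        p++;
      size_t len = strcspn (p, "/");
      if (has_option (p, len, opts, nopts) == negated)
        return 0;               /* This term is not satisfied.  */
      p += len;
      if (*p == '/')
        p++;
    }
  return 1;
}
```

Under this reading, an option set containing only msingle-pic-base is excluded, while fPIC alone and fPIC together with msingle-pic-base are both kept, which is the dependency asked for at the start of the thread.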
Re: Testing MULTILIB configurations
On 6/26/06, Shaun Jackman <[EMAIL PROTECTED]> wrote:
> After I've modified the MULTILIB options in t-arm-elf, what's the
> fastest way to test the new configuration without rebuilding the entire
> toolchain? Right now the best method I have is `make clean-gcc
> all-gcc', which is admittedly quite slow.
>
> Please cc me in your reply. Thanks!
> Shaun

The best method I found was to remove gcc/s-mlib and gcc/stmp-multilib,
rebuild xgcc (make -C gcc xgcc), and run xgcc -print-multi-lib to find the
new multilib configuration.

Cheers,
Shaun
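Each line printed by -print-multi-lib has the form `directory;@option@option...`, with `.;` for the default multilib. When checking a new configuration by script rather than by eye, a line can be split as in the sketch below. The function name and the sample directory `pic/xip` are made up for illustration, and the caller is assumed to have stripped the trailing newline.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Splits one line of -print-multi-lib output in place, for example
   "pic/xip;@fPIC@msingle-pic-base".  Stores the directory in *DIR and
   up to MAX_OPTS option names (without the leading '@') in OPTS.
   Returns the number of options stored, or -1 if LINE has no ';'.
   Options beyond MAX_OPTS are dropped.  */
int
parse_multilib_line (char *line, char **dir, char **opts, int max_opts)
{
  assert (line != NULL && dir != NULL && opts != NULL);
  char *semi = strchr (line, ';');
  if (semi == NULL)
    return -1;
  *semi = '\0';
  *dir = line;

  int n = 0;
  char *p = semi + 1;
  while (*p == '@')
    {
      *p++ = '\0';              /* Terminate the previous field.  */
      if (n < max_opts)
        opts[n++] = p;
      p += strcspn (p, "@");
    }
  return n;
}
```

For the sample line above this yields the directory "pic/xip" and the two options "fPIC" and "msingle-pic-base"; for ".;" it yields "." and no options.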
Re: Source code of CIL back-end
On Thu, Jun 22, 2006 at 11:49:45AM +0200, Roberto COSTA wrote:
> By the way, is there any news about the status of the CIL issue?
> I'm sorry to bother the list readers about this, but whom could I
> directly ask?

Sorry for the delay in answering, Roberto. I was out of town, and
apparently people thought I'd post the answer on this one, since I brought
up the topic for SC discussion.

The SC discussed it with Richard Stallman, and he agrees that it is not
"dangerous" (the FSF had raised objections to byte-code systems in the
past, so many of us assumed there would be a problem). So there is no
political/legal objection to including a CIL back end. If it passes
technical review, it can be included.
Re: Source code of CIL back-end
On Thu, Jun 22, 2006 at 09:06:28PM -0400, Daniel Berlin wrote:
> As Joe Buck, a Steering Committee member said, you need to talk to RMS
> directly and get him to accept the idea, before we can do anything
> about it.

I already asked RMS directly, and he has approved. Again, sorry for the
delay on getting back to the list, I was out of town.
Re: Why does __float80 depend on -mmmx/-msse?
On Jun 26, 2006, at 2:09 PM, Ian Lance Taylor wrote:
> As far as I can see, it doesn't.

You missed:

  if (TARGET_MMX)
    ix86_init_mmx_sse_builtins ();

Which HJL should have also quoted.

> On the other hand, as far as I can see, __float80 is undocumented and
> unused for the i386. Why does it exist?

Jan added it with __float128 also:

2003-10-30  Jan Hubicka  <[EMAIL PROTECTED]>

        (ix86_init_mmx_sse_builtins): Add __float80, __float128.

I think it was added for x86_64 ABI support which defines them

http://gcc.gnu.org/ml/gcc-patches/2003-10/msg02473.html

-- Pinski
Re: Why does __float80 depend on -mmmx/-msse?
Andrew Pinski <[EMAIL PROTECTED]> writes:

> On Jun 26, 2006, at 2:09 PM, Ian Lance Taylor wrote:
>
> > As far as I can see, it doesn't.
>
> You missed:
>   if (TARGET_MMX)
>     ix86_init_mmx_sse_builtins ();
>
> Which HJL should have also quoted.

I didn't miss it, I was just ambiguous. H.J. asked "Why does __float80
have to depend on -mmmx/-msse?" and I answered "it doesn't," meaning that
__float80 doesn't have to depend on -mmmx/-msse.

> > On the other hand, as far as I can
> > see, __float80 is undocumented and unused for the i386. Why does it
> > exist?
>
> Jan added it with __float128 also:
> 2003-10-30 Jan Hubicka <[EMAIL PROTECTED]>
>
> (ix86_init_mmx_sse_builtins): Add __float80, __float128.
>
> I think it was added for x86_64 ABI support which defines them
>
> http://gcc.gnu.org/ml/gcc-patches/2003-10/msg02473.html

Are you saying that the x86_64 ABI calls for __float80 to be defined? I
can't find any reference to __float80 which is not related to gcc or ia64.

In any case that does not give any explanation for why it should only be
defined for MMX or SSE.

I don't object to defining __float80 for i386. I agree with H.J. that if
we define it, we should define it unconditionally. And I also say that if
we define it, we should document it, or at least find some other document
which mentions it.

Ian
Re: [RFA] Boehm GC support addition for debugging
On Jun 26, 2006, at 9:01 AM, Bryce McKinlay wrote:
> Otherwise, this is fine.

No it is not, it broke bootstrap on powerpc-darwin:

/Users/regress/tbox/svn-gcc/libjava/boehm.cc: In function 'void _Jv_SuspendThread(_Jv_Thread_t*)':
/Users/regress/tbox/svn-gcc/libjava/boehm.cc:679: error: 'GC_suspend_thread' was not declared in this scope
/Users/regress/tbox/svn-gcc/libjava/boehm.cc: In function 'void _Jv_ResumeThread(_Jv_Thread_t*)':
/Users/regress/tbox/svn-gcc/libjava/boehm.cc:685: error: 'GC_resume_thread' was not declared in this scope

This is the fourth patch to libjava in the last two weeks which has caused
a bootstrap failure on powerpc-darwin (and sometimes on other targets), so
it might be time to step back for a second before approving/applying
patches.

Thanks,
Andrew Pinski
Re: [RFA] Boehm GC support addition for debugging
On Jun 26, 2006, at 10:02 PM, Andrew Pinski wrote:
> On Jun 26, 2006, at 9:01 AM, Bryce McKinlay wrote:
> > Otherwise, this is fine.
>
> No it is not, it broke bootstrap on powerpc-darwin:

Did you look at when the function would have been declared? From gc.h:

#if defined(GC_PTHREADS) && !defined(GC_SOLARIS_THREADS) \
    && !defined(GC_WIN32_THREADS) && !defined(GC_DARWIN_THREADS)
  GC_API void GC_suspend_thread GC_PROTO((pthread_t));
  GC_API void GC_resume_thread GC_PROTO((pthread_t));
#endif

So this is not just Darwin but also Windows and maybe Solaris too.

-- Pinski
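Code that calls a conditionally declared function can mirror the same preprocessor condition so that it still compiles where the declaration is absent. The following is only a sketch of that general technique, not the fix that was applied to boehm.cc: HAVE_GC_SUSPEND_RESUME and try_suspend_thread are hypothetical names, and no GC_* macro is defined here, so the fallback branch is the one compiled.

```c
#include <assert.h>

/* Mirrors the condition quoted from gc.h.  None of the GC_* macros is
   defined in this standalone sketch, so the #else branch is taken.  */
#if defined(GC_PTHREADS) && !defined(GC_SOLARIS_THREADS) \
    && !defined(GC_WIN32_THREADS) && !defined(GC_DARWIN_THREADS)
# define HAVE_GC_SUSPEND_RESUME 1   /* hypothetical macro name */
#else
# define HAVE_GC_SUSPEND_RESUME 0
#endif

/* Returns 1 if suspension is available in this configuration,
   0 if it is unsupported.  */
int
try_suspend_thread (void)
{
#if HAVE_GC_SUSPEND_RESUME
  /* A real caller would invoke GC_suspend_thread() here.  */
  return 1;
#else
  return 0;
#endif
}
```

Whether a silent fallback is acceptable, or an explicit error is needed instead, depends on what the debugging support expects from the suspend call.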
Re: [RFA] Boehm GC support addition for debugging
Andrew Pinski wrote:
> On Jun 26, 2006, at 10:02 PM, Andrew Pinski wrote:
> > On Jun 26, 2006, at 9:01 AM, Bryce McKinlay wrote:
> > > Otherwise, this is fine.
> >
> > No it is not, it broke bootstrap on powerpc-darwin:
>
> Did you look when the function would have been declared.
> From gc.h:
> #if defined(GC_PTHREADS) && !defined(GC_SOLARIS_THREADS) \
>     && !defined(GC_WIN32_THREADS) && !defined(GC_DARWIN_THREADS)
>   GC_API void GC_suspend_thread GC_PROTO((pthread_t));
>   GC_API void GC_resume_thread GC_PROTO((pthread_t));
> #endif
>
> So this is not just Darwin but also Windows and maybe solaris too.

Solaris too. :(

I'd wish for a bit more care with such patches. Everyone can ask for RFT
and I'll happily test on my farm as time permits.

Andreas

P.S. Me being a _spare_ time contributor
Re: Why does __float80 depend on -mmmx/-msse?
On Mon, Jun 26, 2006 at 09:24:57PM -0700, Ian Lance Taylor wrote:
> Andrew Pinski <[EMAIL PROTECTED]> writes:
>
> > On Jun 26, 2006, at 2:09 PM, Ian Lance Taylor wrote:
> >
> > > As far as I can see, it doesn't.
> >
> > You missed:
> >   if (TARGET_MMX)
> >     ix86_init_mmx_sse_builtins ();
> >
> > Which HJL should have also quoted.
>
> I didn't miss it, I was just ambiguous. H.J. asked "Why does
> __float80 have to depend on -mmmx/-msse?" and I answered "it doesn't,"
> meaning that __float80 doesn't have to depend on -mmmx/-msse.
>
> > > On the other hand, as far as I can
> > > see, __float80 is undocumented and unused for the i386. Why does it
> > > exist?
> >
> > Jan added it with __float128 also:
> > 2003-10-30 Jan Hubicka <[EMAIL PROTECTED]>
> >
> > (ix86_init_mmx_sse_builtins): Add __float80, __float128.
> >
> > I think it was added for x86_64 ABI support which defines them
> >
> > http://gcc.gnu.org/ml/gcc-patches/2003-10/msg02473.html
>
> Are you saying that the x86_64 ABI calls for __float80 to be defined?
> I can't find any reference to __float80 which is not related to gcc or
> ia64.
>
> In any case that does not give any explanation for why it should only
> be defined for MMX or SSE.
>
> I don't object to defining __float80 for i386. I agree with H.J. that
> if we define it, we should define it unconditionally. And I also say
> that if we define it, we should document it, or at least find some
> other document which mentions it.

I have no strong opinion on the support for __float80. But the current
behavior seems odd to me.

Also, we have incomplete support for __float128. There is no runtime
support for __float128 at all on x86-64. Libstdc++ doesn't support
__float128 nor does glibc. I did post a __float128 runtime patch for
glibc. But very few people showed any interest in it. I was wondering why
gcc bothered with __float80 and __float128 on x86-64.

H.J.