Auto-Vectorization, Polyhedral Model
Hi, can anyone explain whether GCC has implemented Auto-Vectorization based on the Polyhedral Model? Are there any related projects working toward this that are currently in progress? Steven.
Re: builtin gamma function
James Hirschorn writes:

> I have noticed that the builtin gamma function is very accurate and
> extremely fast. Can someone tell me where to find the source code for the
> implementation?

Are you calling it on a constant? Because gcc will fold gamma applied to a constant which meets certain characteristics. If you aren't calling it on a constant, then you are getting the function from your libc. gcc does not itself include an implementation of the gamma function.

Ian
builtin gamma function
I have noticed that the builtin gamma function is very accurate and extremely fast. Can someone tell me where to find the source code for the implementation? gdb skips over the call to the builtin gamma. I assume it is not implemented by the simple Lanczos algorithm in tr1/gamma.tcc because I tried this code and it was about half the speed and less accurate. Thanks, James
Re: Troubleshooting with gcc 4.6
Matthias Klose writes:

>> When asking a question of this sort, it helps a lot if you show us
>> precisely what you did and precisely what happened. Without seeing
>> that, I am going to guess that you are running into multiarch libraries.
>> Debian, and therefore Ubuntu, decided to move the system libraries from
>> the locations where all GNU/Linux distros have put them for many years.
>> They have updated their own versions of gcc, but the mainstream gcc
>> releases have not been updated.
>>
>> This is going to be an ongoing problem for many years for people who use
>> Debian or Ubuntu. I do not know how to resolve it.
>
> This is not a multiarch issue. Passing --as-needed by default to the linker
> was enabled in the Ubuntu 11.10 release, which is one month old [1].
>
> Even multiarch is only seven months old (first appeared in Ubuntu 11.04), so I
> honestly can't see any justification for your "many years" statement.
>
> Yes, I do need to re-submit the updated multiarch patch.

I assume you mean my second use of "many years." It is going to be an ongoing problem because people often want to build gcc releases which are not at tip. On gcc-help we routinely get questions about how to build gcc 3.4 and later. As far as I can see, none of those older releases are going to build on current and future Debian/Ubuntu releases, because the libraries have moved.

Therefore, I believe we are going to be dealing with this issue on gcc-help for many years to come.

Fortunately, we will gain the corresponding benefit of, well, hmmm, I can't think of any benefit, actually. But there must be one out there that is worth all this disruption for gcc users.

Ian
Re: Troubleshooting with gcc 4.6
On 11/09/2011 07:50 PM, Ian Lance Taylor wrote:
> santi writes:
>
>> I recently updated my Ubuntu 10.10 to 11.10 and since then I have been
>> having problems with my compiler. I have seen that this new Ubuntu
>> distribution uses gcc 4.6 whilst my old 10.10 used gcc 4.4.5 or
>> 4.4.6.
>>
>> The main problem I have nowadays is with the math.h library when I
>> need to use functions such as sqrt() or pow() that I used to use without
>> any problem in the old distribution (well, I had to write the -lm
>> option when I tried to compile my source files but it did run
>> perfectly). Today I'm getting an unresolved reference to 'sqrt' when I
>> compile my files even though I'm using the -lm option.

This is caused by passing --as-needed by default to the linker. Make sure to pass libraries on the command line after the objects (the symbol must be referenced before its definition is found). You'll likely find this issue on openSUSE releases too (it may be enabled for package builds only).

> This question is not appropriate for the mailing list gcc@gcc.gnu.org.
> It would be appropriate for gcc-h...@gcc.gnu.org. Please take any
> followups to gcc-help. Thanks.
>
> When asking a question of this sort, it helps a lot if you show us
> precisely what you did and precisely what happened. Without seeing
> that, I am going to guess that you are running into multiarch libraries.
> Debian, and therefore Ubuntu, decided to move the system libraries from
> the locations where all GNU/Linux distros have put them for many years.
> They have updated their own versions of gcc, but the mainstream gcc
> releases have not been updated.
>
> This is going to be an ongoing problem for many years for people who use
> Debian or Ubuntu. I do not know how to resolve it.

This is not a multiarch issue. Passing --as-needed by default to the linker was enabled in the Ubuntu 11.10 release, which is one month old [1].

Even multiarch is only seven months old (first appeared in Ubuntu 11.04), so I honestly can't see any justification for your "many years" statement.

Yes, I do need to re-submit the updated multiarch patch.

Matthias

[1] https://wiki.ubuntu.com/OneiricOcelot/ReleaseNotes?action=show&redirect=OneiricOcelot%2FTechnicalOverview#GCC_4.6_Toolchain
Re: Bugzilla components for target libraries
On 11/10/2011 06:30 PM, Joseph S. Myers wrote:
> On Thu, 10 Nov 2011, Rainer Orth wrote:
>
>> I've recently noticed that several of our target libraries are not
>> properly (if at all) represented as bugzilla components. The following
>> table shows the current situation:
>>
>> directory component
>
> You omitted boehm-gc and zlib, both used in target libraries (libgcjgc,
> libzgcj) though not intended for direct use as such by GCC users (anyone
> wanting to use them directly should use the upstream releases).

boehm-gc is used for the gc-enabled libobjc as well.
Re: When is the hardware related register allocated?
Feng LI writes:

> Thanks, it helps a lot! One more question is that during split phase,
> I'll generate 2 instructions in the following order for some reason,
> CLC;
> CMOVC reg imm32;
>
> But I need to keep the following condition:
> 1. The compiler will not optimize out the code or break the sequence
> here. I'm doing the split phase after "reload_completed".
> 2. Store the REG CC before CLC, and restore after CMOVC.
>
> Is there some way to do that?

You can write any instruction string you like manually, including two or four instructions. Offhand I don't know of any way to get the compiler to save CC for you around your instruction. That's a stiff requirement.

Ian

> On Mon, Nov 14, 2011 at 7:59 AM, Ian Lance Taylor wrote:
>> Feng LI writes:
>>
>>> I'm working on a gcc backend, we need to use the information of the
>>> allocated hardware register to generate the code from builtin
>>> functions. But at the context in ix86_expand_builtin, where I could
>>> get the operands which the registers are pseudo registers
>>> (REGNO(op)>FIRST_PSEUDO_REGISTER).
>>>
>>> Do you know where could I get the information of the hardware register
>>> and generate assemble code from there?
>>
>> At the point where ix86_expand_builtin is called, the hardware register
>> is not known.
>>
>> Typically this kind of thing would be handled via a
>> define_insn_and_split which represents the operation in some general way
>> (probably using an UNSPEC) before reload and then splits after reload
>> based on the registers it winds up seeing.
>>
>> Ian
>>
Re: When is the hardware related register allocated?
Hi Ian,

Thanks, it helps a lot! One more question: during the split phase, I'll generate 2 instructions in the following order for some reason,

CLC;
CMOVC reg imm32;

But I need to satisfy the following conditions:

1. The compiler will not optimize out the code or break the sequence here. I'm doing the split phase after "reload_completed".
2. Store the REG CC before CLC, and restore after CMOVC.

Is there some way to do that?

Thank you,
Feng

On Mon, Nov 14, 2011 at 7:59 AM, Ian Lance Taylor wrote:
> Feng LI writes:
>
>> I'm working on a gcc backend, we need to use the information of the
>> allocated hardware register to generate the code from builtin
>> functions. But at the context in ix86_expand_builtin, where I could
>> get the operands which the registers are pseudo registers
>> (REGNO(op)>FIRST_PSEUDO_REGISTER).
>>
>> Do you know where could I get the information of the hardware register
>> and generate assemble code from there?
>
> At the point where ix86_expand_builtin is called, the hardware register
> is not known.
>
> Typically this kind of thing would be handled via a
> define_insn_and_split which represents the operation in some general way
> (probably using an UNSPEC) before reload and then splits after reload
> based on the registers it winds up seeing.
>
> Ian
>
Re: bootstrap regression on sparc
David Miller writes:

> While building libstdc++ I get an assertion failure in haifa-sched.c,
> specifically the assertion on line 3437 is failing:
>
>   gcc_assert (!jump_p
>               || ((common_sched_info->sched_pass_id == SCHED_RGN_PASS)
>                   && IS_SPECULATION_BRANCHY_CHECK_P (insn))
>               || (common_sched_info->sched_pass_id
>                   == SCHED_EBB_PASS));
>
> I haven't looked more deeply at it, but the first recent suspicious change
> are the basic block handling changes Alan made two days ago:
>
> 2011-11-09 Alan Modra
>
> * function.c (bb_active_p): Delete.
> (dup_block_and_redirect, active_insn_between): New functions.
> (convert_jumps_to_returns, emit_return_for_exit): New functions,
> split out from..

Indeed: I've filed PR bootstrap/51086 for that.

Rainer

--
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: A question about redundant load elimination
From the tree dump we can see that there are two assignments from x, one to unsigned and one to signed. I guess that's the reason. Apparently there is room for improvement, though.

  int prephitmp.8;
  int * D.2027;
  unsigned int D.2026;
  unsigned int x.1;
  int x.0;

  # BLOCK 2 freq:1
  # PRED: ENTRY [100.0%] (fallthru,exec)
  x.0_1 = x;
  x.1_2 = (unsigned int) x.0_1; // unsigned move
  D.2026_3 = x.1_2 * 4;
  D.2027_5 = a_4(D) + D.2026_3;
  *D.2027_5 = 1;
  prephitmp.8_6 = x; // signed move

On Mon, Nov 14, 2011 at 4:01 PM, Jiangning Liu wrote:
> Hi,
>
> For this test case,
>
> int x;
> extern void f(void);
>
> void g(int *a)
> {
>   a[x] = 1;
>   if (x == 100)
>     f();
>   a[x] = 2;
> }
>
> For trunk, the x86 assembly code is like below,
>
>         movl    x, %eax
>         movl    16(%esp), %ebx
>         movl    $1, (%ebx,%eax,4)
>         movl    x, %eax         // Is this a redundant one?
>         cmpl    $100, %eax
>         je      .L4
>         movl    $2, (%ebx,%eax,4)
>         addl    $8, %esp
>         .cfi_remember_state
>         .cfi_def_cfa_offset 8
>         popl    %ebx
>         .cfi_restore 3
>         .cfi_def_cfa_offset 4
>         ret
>         .p2align 4,,7
>         .p2align 3
> .L4:
>         .cfi_restore_state
>         call    f
>         movl    x, %eax
>         movl    $2, (%ebx,%eax,4)
>         addl    $8, %esp
>         .cfi_def_cfa_offset 8
>         popl    %ebx
>         .cfi_restore 3
>         .cfi_def_cfa_offset 4
>         ret
>
> Is the 2nd "movl x, %eax" a redundant one for a single-threaded programming
> model? If yes, can this be optimized away?
>
> Thanks,
> -Jiangning
A question about redundant load elimination
Hi,

For this test case,

int x;
extern void f(void);

void g(int *a)
{
  a[x] = 1;
  if (x == 100)
    f();
  a[x] = 2;
}

For trunk, the x86 assembly code is like below,

        movl    x, %eax
        movl    16(%esp), %ebx
        movl    $1, (%ebx,%eax,4)
        movl    x, %eax         // Is this a redundant one?
        cmpl    $100, %eax
        je      .L4
        movl    $2, (%ebx,%eax,4)
        addl    $8, %esp
        .cfi_remember_state
        .cfi_def_cfa_offset 8
        popl    %ebx
        .cfi_restore 3
        .cfi_def_cfa_offset 4
        ret
        .p2align 4,,7
        .p2align 3
.L4:
        .cfi_restore_state
        call    f
        movl    x, %eax
        movl    $2, (%ebx,%eax,4)
        addl    $8, %esp
        .cfi_def_cfa_offset 8
        popl    %ebx
        .cfi_restore 3
        .cfi_def_cfa_offset 4
        ret

Is the 2nd "movl x, %eax" a redundant one for a single-threaded programming model? If yes, can this be optimized away?

Thanks,
-Jiangning
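A hand-written sketch of what the test case would look like with the second load removed, for comparison with the compiler's output. It is only equivalent to g() under an assumption the compiler cannot make in general: that the store a[x] = 1 never modifies x itself, i.e. that a does not point into x. The name g_single_load and the body given to f() here are hypothetical; in the original test case f() is external.

```c
int x;                        /* global, as in the test case */

/* Hypothetical stand-in for the external f(); it changes x to show
   why the value must be reloaded after the call. */
void f(void)
{
    x = 3;
}

/* Assumption: a[x] never aliases x. Under that assumption this
   behaves like g() in a single-threaded program, with the load of x
   before the comparison removed by hand. */
void g_single_load(int *a)
{
    int idx = x;              /* the only load of x before the call */
    a[idx] = 1;
    if (idx == 100) {
        f();
        idx = x;              /* f() may have written x: reload */
    }
    a[idx] = 2;
}
```

The reload after the call to f() has to stay in any case, since f() is free to change the global.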