Auto-Vectorization, Polyhedral Model

2011-11-14 Thread steven su
Hi,
Can anyone tell me whether GCC has implemented auto-vectorization
based on the polyhedral model?
Are there any related projects aiming at this that are currently in progress?

Steven.


Re: builtin gamma function

2011-11-14 Thread Ian Lance Taylor
James Hirschorn  writes:

> I have noticed that the builtin gamma function is very accurate and
> extremely fast. Can someone tell me where to find the source code for the
> implementation?

Are you calling it on a constant?  Because gcc will fold gamma applied
to a constant which meets certain characteristics.

If you aren't calling it on a constant, then you are getting the
function from your libc.  gcc does not itself include an implementation
of the gamma function.

Ian


builtin gamma function

2011-11-14 Thread James Hirschorn
I have noticed that the builtin gamma function is very accurate and
extremely fast. Can someone tell me where to find the source code for the
implementation?

gdb skips over the call to the builtin gamma. I assume it is not implemented
by the simple Lanczos algorithm in tr1/gamma.tcc because I tried this code
and it was about half the speed and less accurate.

Thanks,
James



Re: Troubleshooting with gcc 4.6

2011-11-14 Thread Ian Lance Taylor
Matthias Klose  writes:

>> When asking a question of this sort, it helps a lot if you show us
>> precisely what you did and precisely what happened.  Without seeing
>> that, I am going to guess that you are running into multiarch libraries.
>> Debian, and therefore Ubuntu, decided to move the system libraries from
>> the locations where all GNU/Linux distros have put them for many years.
>> They have updated their own versions of gcc, but the mainstream gcc
>> releases have not been updated.
>> 
>> This is going to be an ongoing problem for many years for people who use
>> Debian or Ubuntu.  I do not know how to resolve it.
>
> This is not a multiarch issue. Passing --as-needed by default to the linker was
> enabled in the Ubuntu 11.10 release, which is one month old [1].
>
> Even multiarch is only seven months old (first appeared in Ubuntu 11.04), so I
> honestly can't see any justification for your "many years" statement.
>
> Yes, I do need to re-submit the updated multiarch patch.

I assume you mean my second use of "many years."  It is going to be an
ongoing problem because people often want to build gcc releases which
are not at tip.  On gcc-help we routinely get questions about how to
build gcc 3.4 and later.  As far as I can see, none of those older
releases are going to build on current and future Debian/Ubuntu
releases, because the libraries have moved.  Therefore, I believe we are
going to be dealing with this issue on gcc-help for many years to come.
Fortunately, we will gain the corresponding benefit of, well, hmmm, I
can't think of any benefit, actually.  But there must be one out there
that is worth all this disruption for gcc users.

Ian


Re: Troubleshooting with gcc 4.6

2011-11-14 Thread Matthias Klose
On 11/09/2011 07:50 PM, Ian Lance Taylor wrote:
> santi  writes:
> 
>> I recently updated my Ubuntu 10.10 to 11.10 and since then I have been
>> having problems with my compiler. I have seen that this new Ubuntu
>> distribution uses gcc 4.6 whilst my old 10.10 used gcc 4.4.5 or
>> 4.4.6.
>>
>> The main problem I have nowadays is with the math.h library when I
>> need to use functions such as sqrt() or pow() that I used to use without
>> any problem in the old distribution (well, I had to write the -lm
>> option when I tried to compile my source files but it did run
>> perfectly). Today I'm getting an unresolved reference to 'sqrt' when I
>> compile my files even though I'm using the -lm option.

This is caused by passing --as-needed by default to the linker. Make sure to
pass libraries on the command line after the objects (the symbol must be
referenced before its definition is found). You'll likely find this issue on
openSUSE releases as well (it may be enabled for package builds only).

> This question is not appropriate for the mailing list gcc@gcc.gnu.org.
> It would be appropriate for gcc-h...@gcc.gnu.org.  Please take any
> followups to gcc-help.  Thanks.
> 
> When asking a question of this sort, it helps a lot if you show us
> precisely what you did and precisely what happened.  Without seeing
> that, I am going to guess that you are running into multiarch libraries.
> Debian, and therefore Ubuntu, decided to move the system libraries from
> the locations where all GNU/Linux distros have put them for many years.
> They have updated their own versions of gcc, but the mainstream gcc
> releases have not been updated.
> 
> This is going to be an ongoing problem for many years for people who use
> Debian or Ubuntu.  I do not know how to resolve it.

This is not a multiarch issue. Passing --as-needed by default to the linker was
enabled in the Ubuntu 11.10 release, which is one month old [1].

Even multiarch is only seven months old (first appeared in Ubuntu 11.04), so I
honestly can't see any justification for your "many years" statement.

Yes, I do need to re-submit the updated multiarch patch.

  Matthias

[1]
https://wiki.ubuntu.com/OneiricOcelot/ReleaseNotes?action=show&redirect=OneiricOcelot%2FTechnicalOverview#GCC_4.6_Toolchain


Re: Bugzilla components for target libraries

2011-11-14 Thread Matthias Klose
On 11/10/2011 06:30 PM, Joseph S. Myers wrote:
> On Thu, 10 Nov 2011, Rainer Orth wrote:
> 
>> I've recently noticed that several of our target libraries are not
>> properly (if at all) represented as bugzilla components.  The following
>> table shows the current situation:
>>
>>   directory  component
> 
> You omitted boehm-gc and zlib, both used in target libraries (libgcjgc, 
> libzgcj) though not intended for direct use as such by GCC users (anyone 
> wanting to use them directly should use the upstream releases).

boehm-gc is used for the gc enabled libobjc as well.


Re: When is the hardware related register is allocated?

2011-11-14 Thread Ian Lance Taylor
Feng LI  writes:

> Thanks, it helps a lot! One more question: during the split phase,
> I'll generate 2 instructions in the following order for some reason,
> CLC;
> CMOVC reg imm32;
>
> But I need to maintain the following conditions:
> 1. The compiler will not optimize out the code or break up the sequence
> here. I'm doing the split phase after "reload_completed".
> 2. Save the CC register before CLC, and restore it after CMOVC.
>
> Is there some way to do that?

You can write any instruction string you like manually, including two or
four instructions.

Offhand I don't know of any way to get the compiler to save CC for you
around your instruction.  That's a stiff requirement.

Ian

> On Mon, Nov 14, 2011 at 7:59 AM, Ian Lance Taylor  wrote:
>> Feng LI  writes:
>>
>>> I'm working on a gcc backend, and we need to use the information about
>>> the allocated hardware register to generate the code for builtin
>>> functions. But in the context of ix86_expand_builtin, the operands I
>>> can get are still pseudo registers
>>> (REGNO(op)>FIRST_PSEUDO_REGISTER).
>>>
>>> Do you know where I could get the information about the hardware register
>>> and generate assembly code from there?
>>
>> At the point where ix86_expand_builtin is called, the hardware register
>> is not known.
>>
>> Typically this kind of thing would be handled via a
>> define_insn_and_split which represents the operation in some general way
>> (probably using an UNSPEC) before reload and then splits after reload
>> based on the registers it winds up seeing.
>>
>> Ian
>>


Re: When is the hardware related register is allocated?

2011-11-14 Thread Feng LI
Hi Ian,

Thanks, it helps a lot! One more question: during the split phase,
I'll generate 2 instructions in the following order for some reason,
CLC;
CMOVC reg imm32;

But I need to maintain the following conditions:
1. The compiler will not optimize out the code or break up the sequence
here. I'm doing the split phase after "reload_completed".
2. Save the CC register before CLC, and restore it after CMOVC.

Is there some way to do that?

Thank you,
Feng

On Mon, Nov 14, 2011 at 7:59 AM, Ian Lance Taylor  wrote:
> Feng LI  writes:
>
>> I'm working on a gcc backend, and we need to use the information about
>> the allocated hardware register to generate the code for builtin
>> functions. But in the context of ix86_expand_builtin, the operands I
>> can get are still pseudo registers
>> (REGNO(op)>FIRST_PSEUDO_REGISTER).
>>
>> Do you know where I could get the information about the hardware register
>> and generate assembly code from there?
>
> At the point where ix86_expand_builtin is called, the hardware register
> is not known.
>
> Typically this kind of thing would be handled via a
> define_insn_and_split which represents the operation in some general way
> (probably using an UNSPEC) before reload and then splits after reload
> based on the registers it winds up seeing.
>
> Ian
>


Re: bootstrap regression on sparc

2011-11-14 Thread Rainer Orth
David Miller  writes:

> While building libstdc++ I get an assertion failure in haifa-sched.c,
> specifically the assertion on line 3437 is failing:
>
> gcc_assert (!jump_p
> || ((common_sched_info->sched_pass_id == SCHED_RGN_PASS)
> && IS_SPECULATION_BRANCHY_CHECK_P (insn))
> || (common_sched_info->sched_pass_id
> == SCHED_EBB_PASS));
>
> I haven't looked more deeply at it, but the first recent suspicious change
> are the basic block handling changes Alan made two days ago:
>
> 2011-11-09  Alan Modra  
>
>   * function.c (bb_active_p): Delete.
>   (dup_block_and_redirect, active_insn_between): New functions.
>   (convert_jumps_to_returns, emit_return_for_exit): New functions,
>   split out from..

Indeed: I've filed PR bootstrap/51086 for that.

Rainer

-- 
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: A question about redudant load elimination

2011-11-14 Thread Ye Joey
From the tree dump we can see that there are two assignments from x, one
to an unsigned temporary and one to a signed one. I guess that's the reason.
Apparently there is room for improvement, though.

  int prephitmp.8;
  int * D.2027;
  unsigned int D.2026;
  unsigned int x.1;
  int x.0;

  # BLOCK 2 freq:1
  # PRED: ENTRY [100.0%]  (fallthru,exec)
  x.0_1 = x;
  x.1_2 = (unsigned int) x.0_1;  // unsigned move
  D.2026_3 = x.1_2 * 4;
  D.2027_5 = a_4(D) + D.2026_3;
  *D.2027_5 = 1;
  prephitmp.8_6 = x; // signed move

On Mon, Nov 14, 2011 at 4:01 PM, Jiangning Liu  wrote:
> Hi,
>
> For this test case,
>
> int x;
> extern void f(void);
>
> void g(int *a)
> {
>        a[x] = 1;
>        if (x == 100)
>                f();
>        a[x] = 2;
> }
>
> For trunk, the x86 assembly code is like below,
>
>        movl    x, %eax
>        movl    16(%esp), %ebx
>        movl    $1, (%ebx,%eax,4)
>        movl    x, %eax   // Is this a redundant one?
>        cmpl    $100, %eax
>        je      .L4
>        movl    $2, (%ebx,%eax,4)
>        addl    $8, %esp
>        .cfi_remember_state
>        .cfi_def_cfa_offset 8
>        popl    %ebx
>        .cfi_restore 3
>        .cfi_def_cfa_offset 4
>        ret
>        .p2align 4,,7
>        .p2align 3
> .L4:
>        .cfi_restore_state
>        call    f
>        movl    x, %eax
>        movl    $2, (%ebx,%eax,4)
>        addl    $8, %esp
>        .cfi_def_cfa_offset 8
>        popl    %ebx
>        .cfi_restore 3
>        .cfi_def_cfa_offset 4
>        ret
>
> Is the 2nd "movl x, %eax" redundant under a single-threaded programming
> model? If so, can it be optimized away?
>
> Thanks,
> -Jiangning
>
>
>
>


A question about redudant load elimination

2011-11-14 Thread Jiangning Liu
Hi,

For this test case,

int x;
extern void f(void);

void g(int *a)
{
        a[x] = 1;
        if (x == 100)
                f();
        a[x] = 2;
}

For trunk, the x86 assembly code is like below,

        movl    x, %eax
        movl    16(%esp), %ebx
        movl    $1, (%ebx,%eax,4)
        movl    x, %eax   // Is this a redundant one?
        cmpl    $100, %eax
        je      .L4
        movl    $2, (%ebx,%eax,4)
        addl    $8, %esp
        .cfi_remember_state
        .cfi_def_cfa_offset 8
        popl    %ebx
        .cfi_restore 3
        .cfi_def_cfa_offset 4
        ret
        .p2align 4,,7
        .p2align 3
.L4:
        .cfi_restore_state
        call    f
        movl    x, %eax
        movl    $2, (%ebx,%eax,4)
        addl    $8, %esp
        .cfi_def_cfa_offset 8
        popl    %ebx
        .cfi_restore 3
        .cfi_def_cfa_offset 4
        ret

Is the 2nd "movl x, %eax" redundant under a single-threaded programming
model? If so, can it be optimized away?
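
To make the question concrete, here is a hand-written sketch of what the
source would look like with the second load removed. It is not what gcc
emits; g_hoisted is a made-up name, and f() is given a hypothetical body
that changes x so that the reload after the call is exercised. Note the
assumption that a never points at x itself: a store through an int pointer
may legally modify a global int, and in that case reusing the earlier value
of x would be wrong.

```c
int x;

/* Hypothetical body for f(): it changes x, which is why x must be
   reloaded after the call in any version of g(). */
void f(void)
{
    x = 3;
}

/* The original test case. */
void g(int *a)
{
    a[x] = 1;
    if (x == 100)
        f();
    a[x] = 2;
}

/* Hand-hoisted variant: x is read once into a local and read again
   only after the call to f().  Assumes a[...] never aliases x. */
void g_hoisted(int *a)
{
    int i = x;

    a[i] = 1;
    if (i == 100) {
        f();
        i = x;   /* f() may have modified x */
    }
    a[i] = 2;
}
```

Under the no-aliasing assumption both functions store to the same elements;
they differ only in how many times x is read from memory.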

Thanks,
-Jiangning