Hi,

On 10.11.2010 12:32, roy rosen wrote:
Hi,

I was wondering if gcc has software pipelining.
I saw options -fsel-sched-pipelining -fselective-scheduling
-fselective-scheduling2 but I don't see any pipelining happening
(tried with ia64).
Is there a gcc VLIW port in which I can see it working?
You need to try -fmodulo-sched. Selective scheduling is enabled by default on ia64 at -O3; otherwise you need -fselective-scheduling2 -fsel-sched-pipelining. Note that selective scheduling disables autoinc generation so that pipelining can work, and modulo scheduling will likely refuse to pipeline a loop with autoincs.

The modulo scheduling implementation in GCC could be improved, but that's a different topic.

Andrey


For an example function like

int nor(char* __restrict__ c, char* __restrict__ d)
{
    int i, sum = 0;
    for (i = 0; i < 256; i++)
        d[i] = c[i] << 3;
    return sum;
}

with no pipelining the code looks like

r1 = 0
r2 = c
r3 = d
_startloop
if r1 == 256 jmp _end
r4 = [r2]+
r4 <<= 3
[r3]+ = r4
r1++
jmp _startloop
_end

Here, inside the loop, there is a data dependency between all 3 insns
(only the r1++ is independent), which does not permit any parallelism.

With pipelining I expect code like

r1 = 2
r2 = c
r3 = d
// peel first iteration
r4 = [r2]+
r4 <<= 3
r5 = [r2]+
_startloop
if r1 == 256 jmp _end
[r3]+ = r4 ; r4 = r5 << 3 ; r5 = [r2]+
r1++
jmp _startloop
_end

Now the data dependency is broken and parallelism is possible.
As I said, I could not see that happening.
Can someone please tell me on which port and with which options I can
get such a result?
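
To make the transformation described above concrete, here is a minimal hand-written sketch of it in C. It is only an illustration of what a pipelined schedule computes, not what GCC emits. The names shift_plain and shift_pipelined are hypothetical, the length is passed as a parameter n, and unsigned char is used instead of char so that the left shift stays well defined for every input value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reference loop: load, shift and store of one element are serialized. */
void shift_plain(const unsigned char *restrict c,
                 unsigned char *restrict d, size_t n)
{
    for (size_t i = 0; i < n; i++)
        d[i] = (unsigned char)(c[i] << 3);
}

/* Hand-pipelined variant (hypothetical sketch): inside the kernel the
   store, the shift and the load belong to three different iterations,
   so none of them depends on a result produced in the same pass. */
void shift_pipelined(const unsigned char *restrict c,
                     unsigned char *restrict d, size_t n)
{
    assert(n >= 2);  /* the prologue below consumes two elements */

    /* Prologue: start the first two iterations. */
    unsigned char shifted = (unsigned char)(c[0] << 3); /* r4 = [r2]+ ; r4 <<= 3 */
    unsigned char loaded  = c[1];                       /* r5 = [r2]+            */

    /* Kernel: written sequentially here; each statement reads only values
       produced in an earlier pass, so a VLIW machine could issue all
       three in the same cycle. */
    for (size_t i = 2; i < n; i++) {
        d[i - 2] = shifted;                        /* store for element i-2 */
        shifted  = (unsigned char)(loaded << 3);   /* shift for element i-1 */
        loaded   = c[i];                           /* load  for element i   */
    }

    /* Epilogue: drain the two iterations still in flight. */
    d[n - 2] = shifted;
    d[n - 1] = (unsigned char)(loaded << 3);
}
```

Both functions should produce identical output for any n >= 2. The epilogue is the part the pseudo-assembly above leaves out: once the kernel exits, the last two elements still have to be shifted and stored.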

Thanks, Roy.
