Hi,

I was wondering if gcc has software pipelining.
I saw options -fsel-sched-pipelining -fselective-scheduling
-fselective-scheduling2 but I don't see any pipelining happening
(tried with ia64).
Is there a gcc VLIW port in which I can see it working?

For an example function like

int nor(char* __restrict__ c, char* __restrict__ d)
{
    int i, sum = 0;
    for (i = 0; i < 256; i++)
        d[i] = c[i] << 3;
    return sum;
}

without pipelining I would expect code like

r1 = 0
r2 = c
r3 = d
_startloop
if r1 == 256 jmp _end
r4 = [r2]+
r4 <<= 3
[r3]+ = r4
r1++
jmp _startloop
_end

Here, inside the loop, the load, the shift and the store form a dependency
chain (only the r1++ is independent), which does not permit any parallelism.

With pipelining I would expect code like

r1 = 2
r2 = c
r3 = d
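// peel first iteration (prologue)
r4 = [r2]+
r4 <<= 3
r5 = [r2]+
_startloop
if r1 == 256 jmp _end
[r3]+ = r4 ; r4 = r5 << 3 ; r5 = [r2]+
r1++
jmp _startloop
_end
// epilogue: drain the two iterations still in flight
[r3]+ = r4
r4 = r5 << 3
[r3]+ = r4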

Now the data dependency is broken and parallelism is possible.
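
To make the schedule I have in mind concrete, here is a rough hand-written
C sketch of it. This is only an illustration, not compiler output. The names
nor_simple, nor_pipelined and pipelined_matches_simple are made up for this
mail, and I use unsigned char instead of char so that the shift stays well
defined for every input value.

```c
#include <string.h>

#define N 256

/* Reference loop from the example (the unused sum is dropped). */
void nor_simple(const unsigned char *restrict c, unsigned char *restrict d)
{
    for (int i = 0; i < N; i++)
        d[i] = (unsigned char)(c[i] << 3);
}

/* Hand-pipelined sketch: the store, shift and load in the loop body
   belong to three different iterations, so they do not depend on
   each other. */
void nor_pipelined(const unsigned char *restrict c, unsigned char *restrict d)
{
    int shifted, loaded;

    /* prologue: start the first two iterations */
    shifted = c[0] << 3;
    loaded = c[1];

    for (int i = 2; i < N; i++) {
        d[i - 2] = (unsigned char)shifted; /* store for iteration i-2 */
        shifted = loaded << 3;             /* shift for iteration i-1 */
        loaded = c[i];                     /* load for iteration i    */
    }

    /* epilogue: drain the two iterations still in flight */
    d[N - 2] = (unsigned char)shifted;
    d[N - 1] = (unsigned char)(loaded << 3);
}

/* Returns 1 if both versions produce the same output for inputs 0..255. */
int pipelined_matches_simple(void)
{
    unsigned char in[N], a[N], b[N];

    for (int i = 0; i < N; i++)
        in[i] = (unsigned char)i;
    nor_simple(in, a);
    nor_pipelined(in, b);
    return memcmp(a, b, N) == 0;
}
```

The three statements in the loop body correspond to the three operations I
would like to see issued together in one bundle.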
As I said I could not see that happening.
Can someone please tell me on which port and with what options I can
get such a result?

Thanks, Roy.
