Hi,
On 10.11.2010 12:32, roy rosen wrote:
Hi,
I was wondering if gcc has software pipelining.
I saw options -fsel-sched-pipelining -fselective-scheduling
-fselective-scheduling2 but I don't see any pipelining happening
(tried with ia64).
Is there a gcc VLIW port in which I can see it working?
You need to try -fmodulo-sched. Selective scheduling is enabled by default
on ia64 at -O3; otherwise you need -fselective-scheduling2
-fsel-sched-pipelining. Note that selective scheduling disables autoinc
generation so that pipelining can work, and modulo scheduling will likely
refuse to pipeline a loop with autoincs.
The modulo scheduling implementation in GCC could be improved, but that's a
different topic.
Andrey
For an example function like
int nor(char* __restrict__ c, char* __restrict__ d)
{
    int i, sum = 0;
    for (i = 0; i < 256; i++)
        d[i] = c[i] << 3;
    return sum;
}
without pipelining, the code looks like
r1 = 0
r2 = c
r3 = d
_startloop
if r1 == 256 jmp _end
r4 = [r2]+
r4 <<= 3
[r3]+ = r4
r1++
jmp _startloop
_end
Here, inside the loop, there is a data dependency between all 3 insns
(only the r1++ is independent), which does not permit any parallelism.
With pipelining I expect code like
r1 = 2
r2 = c
r3 = d
// peel first iteration
r4 = [r2]+
r4 <<= 3
r5 = [r2]+
_startloop
if r1 == 256 jmp _end
[r3]+ = r4 ; r4 = r5 << 3 ; r5 = [r2]+
r1++
jmp _startloop
_end
// drain the last two iterations
[r3]+ = r4 ; r4 = r5 << 3
[r3]+ = r4
Now the data dependency is broken and parallelism is possible.
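To make the intended schedule concrete, here is roughly the same transformation written by hand in C. This is only a sketch under a few assumptions: the names shift_plain, shift_pipelined and pipelined_matches_plain are made up for this example, unsigned char replaces char so that the left shift is well defined, and the unused sum is dropped. It illustrates the shape of a pipelined loop (prologue, kernel, and an epilogue that drains the last two iterations), not what any GCC pass actually emits.

```c
#include <assert.h>
#include <string.h>

#define N 256

/* Straightforward loop: load, shift and store all belong to one iteration. */
void shift_plain(const unsigned char *restrict c, unsigned char *restrict d)
{
    for (int i = 0; i < N; i++)
        d[i] = (unsigned char)(c[i] << 3);
}

/* Hand-pipelined variant (hypothetical, for illustration only).
 * In the kernel, the store, the shift and the load belong to three
 * different iterations, so they do not depend on each other. */
void shift_pipelined(const unsigned char *restrict c, unsigned char *restrict d)
{
    /* Prologue: start iterations 0 and 1. */
    unsigned char shifted = (unsigned char)(c[0] << 3); /* load + shift, iter 0 */
    unsigned char loaded = c[1];                        /* load, iter 1 */

    /* Kernel: N - 2 steady-state iterations. */
    for (int i = 0; i < N - 2; i++) {
        d[i] = shifted;                           /* store for iteration i     */
        shifted = (unsigned char)(loaded << 3);   /* shift for iteration i + 1 */
        loaded = c[i + 2];                        /* load for iteration i + 2  */
    }

    /* Epilogue: drain the two iterations still in flight. */
    d[N - 2] = shifted;
    d[N - 1] = (unsigned char)(loaded << 3);
}

/* Returns 1 if both variants produce identical output for a test pattern. */
int pipelined_matches_plain(void)
{
    unsigned char in[N], a[N], b[N];
    for (int i = 0; i < N; i++)
        in[i] = (unsigned char)i;
    shift_plain(in, a);
    shift_pipelined(in, b);
    return memcmp(a, b, N) == 0;
}
```

The kernel performs N - 2 iterations because the prologue has already started two of them; without the epilogue the last two elements of d would never be written.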
As I said, I could not see that happening.
Can someone please tell me on which port and with which options I can
get such a result?
Thanks, Roy.