Hello,

> Well, the target architecture is actually quite peculiar, it's a
> parallel SPMD machine. The only similarity with MIPS is the ISA. The
> latency I'm trying to hide is somewhere around 24 cycles, but because it
> is a parallel machine, up to 1024 threads have to stall for 24 cycles in
> the absence of prefetching, which affects overall performance.
> My initial studies show that this latency can be hidden with a properly
> inserted prefetch instruction, and I think that the scheduler can help
> with that, if properly guided.
>
> So my initial question remains: is there any way to tell the scheduler
> not to place the prefetch instruction after the actual read?
>
> The prefetch instruction takes an address_operand, and it seems all I
> need to do is tell the scheduler prefetch will "write" to that address,
> so it will see a true dependence between the prefetch and the read.

This might also restrict how far the prefetch can be moved, since it
would prevent it from being moved across any other memory operation that
aliases with the prefetched one. Unfortunately, I do not really
understand the internals of the GCC scheduler, so I cannot give you more
concrete help, but hopefully someone else will.

Zdenek

> But
> I don't know how to do that, and changing the md file to say "+p" or
> "+d" for the first operand of the prefetch didn't help.