On 10/7/2022 1:30 PM, Aritra Bagchi wrote:
Hi Eliot,

Thanks for the response. The unrolled loop has the same dependency across "j", yet it can issue multiple loads simultaneously, so the limitation is probably not due to that dependency across iterations. The non-unrolled loop does have a control dependency, which goes away when the loop is unrolled. But even without any speculation, gem5 could have scheduled the loads as follows:

first schedule load A[ k ]
then, compare i with N
then schedule load A[ k + 1 ] if i < N

But what actually happens is that the load of A[ k + 1 ] is scheduled only after the load of A[ k ] has completed. Is that completion necessary? It seems it isn't, and the memory system ends up underutilised.
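
To make the two cases concrete, here is a hypothetical C sketch of the loop shapes being compared. It is not the actual benchmark, which is not shown in this thread: the function names, the element type, the unroll factor of 4, and the use of "j" as an accumulator are all assumptions made for illustration, and the index k is simply taken to be equal to i.

```c
#include <assert.h>
#include <stddef.h>

/* Rolled form (assumed shape): each iteration does one load, then the
 * loop-exit compare and branch decide whether the next load happens. */
long sum_rolled(const long *A, size_t N)
{
    long j = 0;                  /* "j" is carried across iterations */
    for (size_t i = 0; i < N; i++)
        j += A[i];               /* load A[k]; next load is behind i < N */
    return j;
}

/* Unrolled by 4 (assumed factor): the four loads in the body have no
 * branch between them, although "j" is still carried through all four. */
long sum_unrolled4(const long *A, size_t N)
{
    assert(N % 4 == 0);          /* sketch only: no remainder loop */
    long j = 0;
    for (size_t i = 0; i < N; i += 4) {
        j += A[i];
        j += A[i + 1];
        j += A[i + 2];
        j += A[i + 3];
    }
    return j;
}
```

Both functions compute the same sum; the only difference is how many loads sit between consecutive loop-exit branches, which is the control dependency in question.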

Thanks and regards,
Aritra

I understand your thinking, but it would be helpful to include
the assembly code listings for the two cases for us to apply
more careful reasoning as to what may be happening.

Best - EM
_______________________________________________
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org
