Folks,

In many places in the code we have clusters of prefetch instructions, which 
seems to be a bad idea.
I have already noticed that performance is better when prefetch instructions 
are interleaved with other code,
and there is a nice section explaining exactly that in the Intel Optimization 
Manual.

http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html

Copy/paste from the document:
---
It may seem convenient to cluster all of PREFETCH instructions at the beginning 
of a loop body or before a loop, but this can lead to severe performance 
degradation. In order to achieve the best possible performance, PREFETCH 
instructions must be interspersed with other computational instructions in the 
instruction sequence rather than clustered together. If possible, they should 
also be placed apart from loads. This improves the instruction level 
parallelism and reduces the potential instruction resource stalls. In addition, 
this mixing reduces the pressure on the memory access resources and in turn 
reduces the possibility of the prefetch retiring without fetching data. 
---
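
To make the difference concrete, here is a minimal sketch in C of the two 
patterns. It is an illustration only, not code from our tree: it assumes the 
GCC/Clang builtin __builtin_prefetch, and the names process_one, 
sum_clustered and sum_interleaved as well as the prefetch distance of 4 
buffers are hypothetical choices made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-buffer work, standing in for real packet processing. */
static inline uint64_t
process_one (const uint8_t *p, size_t len)
{
  uint64_t sum = 0;
  for (size_t i = 0; i < len; i++)
    sum += p[i];
  return sum;
}

/* Clustered: the four prefetches for the next batch are issued back to
   back, before any computation on the current batch. */
uint64_t
sum_clustered (uint8_t **bufs, size_t n, size_t len)
{
  uint64_t total = 0;
  size_t i = 0;

  assert (bufs != NULL || n == 0);

  for (; i + 8 <= n; i += 4)
    {
      __builtin_prefetch (bufs[i + 4]);
      __builtin_prefetch (bufs[i + 5]);
      __builtin_prefetch (bufs[i + 6]);
      __builtin_prefetch (bufs[i + 7]);
      total += process_one (bufs[i], len);
      total += process_one (bufs[i + 1], len);
      total += process_one (bufs[i + 2], len);
      total += process_one (bufs[i + 3], len);
    }
  /* Tail: remaining buffers, no prefetch. */
  for (; i < n; i++)
    total += process_one (bufs[i], len);
  return total;
}

/* Interleaved: same prefetches and same work, but each prefetch is
   separated from the next one by computation on the current batch. */
uint64_t
sum_interleaved (uint8_t **bufs, size_t n, size_t len)
{
  uint64_t total = 0;
  size_t i = 0;

  assert (bufs != NULL || n == 0);

  for (; i + 8 <= n; i += 4)
    {
      __builtin_prefetch (bufs[i + 4]);
      total += process_one (bufs[i], len);
      __builtin_prefetch (bufs[i + 5]);
      total += process_one (bufs[i + 1], len);
      __builtin_prefetch (bufs[i + 6]);
      total += process_one (bufs[i + 2], len);
      __builtin_prefetch (bufs[i + 7]);
      total += process_one (bufs[i + 3], len);
    }
  /* Tail: remaining buffers, no prefetch. */
  for (; i < n; i++)
    total += process_one (bufs[i], len);
  return total;
}
```

Both functions return the same result; only the instruction order differs. 
The compiler may reorder instructions, so it is worth checking the generated 
assembly, and whether the interleaved form is actually faster on a given CPU 
and workload has to be measured.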

-- 
Damjan