On 2019-11-09 at 02:24, Marģers . via fpc-devel wrote:
3) It changes the code location (code crosses page boundaries). On my particular CPU, code pages are 64 bytes. If a loop fits within one page, it runs twice as fast as when it overlaps the page boundary by even one byte. Jumping forward is fine (as the expected code flow is always forward). There is also a larger page of a few KB; calling outside of it carries a small penalty.

Most processors have a fairly large uop cache (up to 2048 for the newest generations, IIRC), so wouldn't this only matter for the first iteration? Do you have a reference (an Agner Fog page or similar), or a more detailed explanation of this effect?
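
To make the quoted 64-byte claim easier to check against a disassembly, here is a minimal sketch in Python. It only does the address arithmetic: given a loop's start address and size in bytes, it reports whether the loop spans more than one aligned 64-byte block, and how much padding would move it to the next boundary. The function names and the default block size of 64 are assumptions for illustration; nothing here measures performance or reflects what FPC actually emits.

```python
def crosses_boundary(start: int, size: int, block: int = 64) -> bool:
    """Return True if the byte range [start, start + size) spans
    more than one aligned block of `block` bytes."""
    if size <= 0:
        return False
    first_block = start // block
    last_block = (start + size - 1) // block
    return first_block != last_block


def padding_to_align(start: int, block: int = 64) -> int:
    """Number of padding bytes needed to move `start` up to the
    next multiple of `block` (0 if it is already aligned)."""
    return (-start) % block


if __name__ == "__main__":
    # Hypothetical loop addresses, not taken from a real binary.
    for addr, length in [(0x1000, 64), (0x1001, 64), (0x103F, 2)]:
        print(hex(addr), length,
              crosses_boundary(addr, length),
              padding_to_align(addr))
```

With the loop start and size read from a disassembly, this shows whether an observed slowdown lines up with a boundary crossing before digging into uop-cache behaviour.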


_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel