Tomáš Glozar via Gcc <[email protected]> writes:

> On Wed, 10 Dec 2025 at 15:24, Tomáš Glozar <[email protected]> wrote:
>>
>> That is a good point. I will look at whether selective scheduling has
>> any significant performance benefits on ia64 at the current state at
>> least. My system is built with -O2, but some tests I have been doing
>> with -O3.
>>
>
> I did some testing with GCC 15.2.0 [1] on ia64. It appears that there
> is a noticeable but small improvement of 0.136% (7.196491 tok/s vs
> 7.186699 tok/s) when running llama2.c inference with -O3 compared to
> -O3 -fno-selective-scheduling (whose core is floating-point
> matrix-vector multiplication). That is not a representative example
> though, as IIUC, selective scheduling should be mostly relevant for
> non-numerical computation [2].
> [...]
>
> [1] Built with a patch disabling late combine by default on IA-64 as
> that was shown to produce incorrect code in some cases.

What's this about?

> [2] https://dl.acm.org/doi/abs/10.1145/267959.269966
>
> Tomas
