Tomáš Glozar via Gcc <[email protected]> writes:

> On Wed, 10 Dec 2025 at 15:24, Tomáš Glozar <[email protected]> wrote:
>>
>> That is a good point. I will look at whether selective scheduling has
>> any significant performance benefits on ia64, at least in its current
>> state. My system is built with -O2, but I have been running some
>> tests with -O3.
>>
>
> I did some testing with GCC 15.2.0 [1] on ia64. There appears to be a
> noticeable but small improvement of 0.136% (7.196491 tok/s vs
> 7.186699 tok/s) when running llama2.c inference (whose core is
> floating-point matrix-vector multiplication) built with -O3 compared
> to -O3 -fno-selective-scheduling. That is not a representative example
> though, as IIUC, selective scheduling should be mostly relevant for
> non-numerical computation [2].
> [...]
>
> [1] Built with a patch disabling late combine by default on IA-64 as
> that was shown to produce incorrect code in some cases.

What's this about?

> [2] https://dl.acm.org/doi/abs/10.1145/267959.269966
>
> Tomas
