https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54429

--- Comment #6 from Oleg Endo <olegendo at gcc dot gnu.org> ---
A test case for this problem is gcc/testsuite/g++.dg/tls/thread_local-order1.C,
which is compiled without optimizations and contains the following sequence:

        stc     gbr,r1
        mov.l   .L20,r2
        add     r2,r1
        lds     r1,fpul
        fsts    fpul,fr1
        flds    fr1,fpul
        sts     fpul,r0
        mov     r14,r15
        lds.l   @r15+,pr
        mov.l   @r15+,r14
        rts
        nop

The lds/fsts/flds/sts round trip through FPUL and FR1 only copies r1 into r0,
so what the code is actually doing is:
        stc     gbr,r1
        mov.l   .L20,r2
        add     r2,r1
        mov     r1,r0
        mov     r14,r15
        lds.l   @r15+,pr
        mov.l   @r15+,r14
        rts
        nop
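
For context, a minimal C++ sketch of the kind of source that leads to the
"stc gbr / add offset" address computation seen above. This is not the
contents of thread_local-order1.C; the names tls_counter and
tls_counter_address are hypothetical, and it assumes the SH target uses GBR
as the thread pointer for local-exec TLS accesses, as the listings suggest.

```cpp
// Hypothetical thread-local object; the name is illustrative only.
thread_local int tls_counter = 0;

// Returns the address of the calling thread's copy of tls_counter.
// On a target that keeps the thread pointer in a dedicated register,
// this boils down to "thread pointer + constant offset", with the
// result moved into the return register.
int* tls_counter_address()
{
    return &tls_counter;
}
```

Compiled without optimizations, a function like this only needs the address
in the integer return register, so a detour through a floating-point register
would not be expected.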
