Re: [fpc-devel] Experimentation: "Branch stitching"

J. Gareth Moreton via fpc-devel Mon, 28 Nov 2022 05:32:19 -0800

On 28/11/2022 12:59, Martin Frb via fpc-devel wrote:

On 28/11/2022 07:22, J. Gareth Moreton via fpc-devel wrote:
...
    testb   %al,%al
    je     .Lj733
    subb    $1,%al
    je     .Lj734
    jmp    .Lj732
    .balign 16,0x90
.Lj733:
    ...
    jmp    .Lj718
    .balign 16,0x90
.Lj732:
    movl    $2019050530,%ecx
    call    VERBOSE_$$_INTERNALERROR$LONGINT
    jmp    .Lj718
The block with the internal error can be moved and 'stitched' to the "jmp .Lj732" instruction.
    ...
    testb    %al,%al
    je    .Lj733
    subb    $1,%al
    je    .Lj734
    movl    $2019050530,%ecx
    call    VERBOSE_$$_INTERNALERROR$LONGINT
    jmp    .Lj718
    .balign 16,0x90
.Lj733:
    ...
I'm still working a few things out, since it can move the function epilogue, which makes things harder to read. Currently I'm only moving blocks where the label only has a single reference, thereby causing a dead label when it's stitched alongside its corresponding jump. This avoids problems where the label is referenced in a data block that's distinct from the assembly and where moving it may cause problems.
Well first of all, you didn't move the balign in front of .Lj732

I do move the alignment hints, but if the label becomes dead (due to the zero-distance jump being 'collapsed'), the alignment hint gets removed. It's an experiment in progress.
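
To make the idea concrete, here is a rough Python sketch of the transformation as described in this thread. It is not the compiler's actual peephole code: the names stitch, is_label and jump_target are made up for illustration, instructions are plain strings, and it assumes that labels are only referenced by jump instructions (the data-block case mentioned above is not modelled). A block is only moved when its label has exactly one reference and nothing can fall through into it.

```python
from collections import Counter

def is_label(line):
    return line.endswith(":")

def jump_target(line):
    # Hypothetical helper: treats any two-token "j.. target" line as a jump.
    parts = line.split()
    if len(parts) == 2 and parts[0].startswith("j"):
        return parts[1]
    return None

def stitch(lines):
    """Repeatedly move single-reference blocks up to the jmp that targets them."""
    lines = [l.strip() for l in lines]
    changed = True
    while changed:
        changed = False
        refs = Counter(t for t in map(jump_target, lines) if t)
        for i, line in enumerate(lines):
            parts = line.split()
            if len(parts) != 2 or parts[0] != "jmp":
                continue
            target = parts[1]
            if refs[target] != 1 or target + ":" not in lines:
                continue
            label_idx = lines.index(target + ":")
            # Include alignment hints directly in front of the label.
            start = label_idx
            while start > 0 and lines[start - 1].startswith((".balign", ".p2align")):
                start -= 1
            # Nothing may fall through into the block.
            if start == 0 or not lines[start - 1].startswith("jmp "):
                continue
            # The block runs up to and including its closing unconditional jmp.
            end = label_idx + 1
            while (end < len(lines) and not is_label(lines[end])
                   and not lines[end].startswith("jmp ")):
                end += 1
            if end == len(lines) or is_label(lines[end]):
                continue
            if start <= i <= end:
                continue  # the jump lives inside its own target block
            body = lines[label_idx + 1:end + 1]
            # The label is now dead, so it and its alignment hint are dropped.
            if i < start:
                lines = lines[:i] + body + lines[i + 1:start] + lines[end + 1:]
            else:
                lines = lines[:start] + lines[end + 1:i] + body + lines[i + 1:]
            changed = True
            break
    return lines
```

Fed the first listing above (with a placeholder instruction standing in for the elided "..."), this produces the second listing: the internal-error block follows "je .Lj734" directly, and both ".Lj732:" and its ".balign" are gone.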

In the above example, that may be an improvement (most likely), because if the label really is referred to only once (and thereby is also not a loop) then it may not be beneficial to align it (except maybe if the user specified a non-default align?). If the label is referred to only once, but the whole thing is inside a loop... it may still be relevant to have the align? (Not sure, it depends on how the CPU caches stuff.)
Another thing is that moving the block can make the other part of the loop longer (needing more cache). If this branch-to-be-moved is rarely entered, it may want to be after the final "jmp-to-loop-start" of the normal branch? Of course, if the loop is bigger than the block with the branches, and we did know that the branch is some sort of exception only, then we would want to move it even further away, to get it out of the loop...

It's a good point. I'll have to work out which situations will be fine and which will increase cache usage. How is a procedure loaded into the CPU cache? Is there some good documentation on this? I always wondered if the whole thing, or at least as much as possible, was loaded sequentially, and the alignment hints are mostly to avoid partial reads.

--------------
Btw, .balign N, 0x90 => isn't there an align that uses multibyte nop(-like) instructions? (I posted some PDF to you a while back; IIRC it points that out.)

There is - it's the .p2align directive. I'm not sure why the compiler mixes and matches them though.
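
As a rough illustration of why the fill matters, the sketch below counts how many NOP instructions are needed to pad up to a 16-byte boundary. The function names are made up, and the 9-byte maximum is an assumption taken from the longest recommended multi-byte NOP sequence in Intel's manuals; an assembler may choose different lengths.

```python
def padding_bytes(offset, alignment=16):
    # Bytes needed to advance offset to the next multiple of alignment.
    return (-offset) % alignment

def nop_count(padding, max_nop_len=1):
    # Instructions needed if each NOP covers at most max_nop_len bytes.
    return -(-padding // max_nop_len)  # ceiling division

pad = padding_bytes(0x23)                 # 13 bytes up to 0x30
single = nop_count(pad)                   # fill with 0x90: one NOP per byte
multi = nop_count(pad, max_nop_len=9)     # assumed 9-byte long NOPs
print(pad, single, multi)
```

With single-byte 0x90 fill, every padding byte is a separate instruction to decode if execution falls through it, whereas multi-byte NOPs cover the same gap in far fewer instructions.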


Kit
_______________________________________________
fpc-devel maillist  -  [email protected]
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
