Hi Garry,

Hopefully I have fixed this issue now, which is also causing problems elsewhere.

https://gitlab.com/freepascal.org/fpc/source/-/merge_requests/598 - just waiting on it to be verified, approved and merged.

Gareth aka. Kit

On 20/02/2024 06:32, J. Gareth Moreton via fpc-devel wrote:

Thanks for the report and especially your investigative work. I'll take a look to see what's going on.

Gareth aka. Kit

On 20/02/2024 01:30, Garry Wood via fpc-devel wrote:

Hello,

Commit 6b2e4fa4 (main) entitled “* arm: "OpCmp2OpS" moved to Pass 2 so it doesn't conflict with AND; CMP -> TST optimisation” by Gareth from Feb 11 2024 produces incorrect assembler in certain cases.

https://gitlab.com/freepascal.org/fpc/source/-/commit/6b2e4fa4133a496c1c3f89e3c71fffbdd7c192fb

This piece of code:

function CPUMaskCount(CPUMask:LongWord):LongWord;
var
 Count:LongWord;
begin
 {}
 Result:=0;
 for Count:=CPU_ID_0 to CPU_ID_MAX do
  begin
   if (CPUMask and (1 shl Count)) <> 0 then
    begin
     Inc(Result);
    end;
  end;
end;

when compiled with FPC prior to commit 6b2e4fa4 produces the following working assembler:

00020528 <GLOBALCONFIG_$$_CPUMASKCOUNT$LONGWORD$$LONGWORD>:
   20528:  e1a01000  mov    r1, r0
   2052c:  e3a00000  mov    r0, #0
   20530:  e3a02000  mov    r2, #0
   20534:  e3a03001  mov    r3, #1
   20538:  e0113213  ands   r3, r1, r3, lsl r2
   2053c:  12800001  addne  r0, r0, #1
   20540:  e2822001  add    r2, r2, #1
   20544:  e352001f  cmp    r2, #31
   20548:  9afffff9  bls    20534 <GLOBALCONFIG_$$_CPUMASKCOUNT$LONGWORD$$LONGWORD+0xc>
   2054c:  e12fff1e  bx     lr

But when compiled with FPC after commit 6b2e4fa4 it produces this assembler which doesn’t work:

00020528 <GLOBALCONFIG_$$_CPUMASKCOUNT$LONGWORD$$LONGWORD>:
   20528:  e1a01000  mov    r1, r0
   2052c:  e3a00000  mov    r0, #0
   20530:  e3a02000  mov    r2, #0
   20534:  e3a03001  mov    r3, #1
   20538:  e1110003  tst    r1, r3
   2053c:  12800001  addne  r0, r0, #1
   20540:  e2822001  add    r2, r2, #1
   20544:  e352001f  cmp    r2, #31
   20548:  9afffff9  bls    20534 <GLOBALCONFIG_$$_CPUMASKCOUNT$LONGWORD$$LONGWORD+0xc>
   2054c:  e12fff1e  bx     lr

The difference is that the TST instruction lacks the trailing lsl r2 that the ANDS instruction had, so the shl in the original code is never performed and the test is therefore invalid.

Similar code sequences in multiple other places produce the same result with the lsl suffix missing from the TST instruction.
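
To make the effect visible at source level, here is a minimal sketch that models both instruction sequences in plain Pascal. It is an illustration only, not code from the affected project: the names CountWithShift and CountWithoutShift are hypothetical, and the values of CPU_ID_0 and CPU_ID_MAX are assumed to be 0 and 31 (the listing's cmp r2, #31 suggests this, but the constants are not shown above).

```pascal
program MaskCountDemo;

{$mode objfpc}

const
  { Assumed values; the real constants are not shown in the report. }
  CPU_ID_0   = 0;
  CPU_ID_MAX = 31;

{ Models "ands r3, r1, r3, lsl r2": bit number Count of CPUMask is tested. }
function CountWithShift(CPUMask: LongWord): LongWord;
var
  Count: LongWord;
begin
  Result := 0;
  for Count := CPU_ID_0 to CPU_ID_MAX do
    if (CPUMask and (LongWord(1) shl Count)) <> 0 then
      Inc(Result);
end;

{ Models "tst r1, r3" with r3 = 1: the shift is lost, so bit 0 of
  CPUMask is tested on every iteration regardless of Count. }
function CountWithoutShift(CPUMask: LongWord): LongWord;
var
  Count: LongWord;
begin
  Result := 0;
  for Count := CPU_ID_0 to CPU_ID_MAX do
    if (CPUMask and 1) <> 0 then
      Inc(Result);
end;

begin
  { Four low bits set: correct count is 4, unshifted test counts 32. }
  WriteLn(CountWithShift($0000000F), ' ', CountWithoutShift($0000000F));
  { Only bit 1 set: correct count is 1, unshifted test counts 0. }
  WriteLn(CountWithShift($00000002), ' ', CountWithoutShift($00000002));
end.
```

Under these assumptions the unshifted variant returns either 0 or 32 depending only on bit 0 of the mask, which matches what the missing lsl r2 would be expected to cause.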

Please let me know if you need any further information.

Garry Wood.


_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
