To hazard a guess, the movslq is sign-extending the shift count to the CPU word size as an intermediate step.  It's not something the peephole optimizer can easily eliminate.  Do the register allocation comments before that instruction give any clues?

Gareth aka. Kit

On 04/02/2020 18:50, Marģers . via fpc-devel wrote:
  P.S. I tested execution speed and there is no measurable difference.


asm code
# [109] bit:= longint(1) shl k;
     movslq    %ecx,%rdx
     # Register r8d allocated
     movl    $1,%r8d
     # Register edx,edx allocated
     shlx    %edx,%r8d,%edx
     # Register r8d released
     # Register edx allocated
     movl    %edx,%esi
# Peephole Optimization: %esi = %edx; changed to minimise pipeline stall (MovXXX2MovXXX)
# Peephole Optimization: Mov2Nop 4 done

What purpose does movslq    %ecx,%rdx serve?
movl    %edx,%esi also seems unnecessary,
when this would be enough:
movl    $1,%esi
shlx    %ecx,%esi,%esi
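
For anyone wanting to reproduce the listing, here is a minimal sketch in Free Pascal. Only the shift expression comes from the listing above; the program name, the BitMask function and the loop are hypothetical. It assumes an x86_64 target with BMI2 code generation enabled, since shlx is a BMI2 instruction.

```pascal
program shltest;
{$mode objfpc}

{ Hypothetical wrapper around the statement from the listing:
  bit := longint(1) shl k;
  The function name and parameter are illustrative only. }
function BitMask(k: longint): longint;
begin
  Result := longint(1) shl k;
end;

var
  k: longint;
begin
  { Stay within 0..30 so the result fits a positive longint. }
  for k := 0 to 30 do
    writeln(k, ': ', BitMask(k));
end.
```

Compiling with something like fpc -O3 -CpCOREAVX2 -OpCOREAVX2 -al -ar shltest.pas should keep the assembler file with source lines and register allocation comments. Those switches are an assumption about the setup, not something stated in this thread.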
_______________________________________________
fpc-devel maillist - fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
