To hazard a guess, the movslq is sign-extending the shift count to the
CPU word size as an intermediate step. It's not something the peephole
optimizer can easily eliminate. Do the register allocations before that
instruction give any clues?
Gareth aka. Kit
On 04/02/2020 18:50, Marģers . via fpc-devel wrote:
P.S. I tested the execution speed and there is no measurable difference.
Asm code:
# [109] bit:= longint(1) shl k;
movslq %ecx,%rdx
# Register r8d allocated
movl $1,%r8d
# Register edx,edx allocated
shlx %edx,%r8d,%edx
# Register r8d released
# Register edx allocated
movl %edx,%esi
# Peephole Optimization: %esi = %edx; changed to minimise pipeline stall (MovXXX2MovXXX)
# Peephole Optimization: Mov2Nop 4 done
What purpose does movslq %ecx,%rdx serve?
movl %edx,%esi also seems unnecessary,
when this would be enough:
movl $1,%esi
shlx %ecx,%esi,%esi
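
For illustration only, here is a minimal C sketch of why the two sequences should give the same result. It relies on the documented behaviour of shlx with 32-bit operands, which uses only the low 5 bits of the count register. The function names (shlx32, bit_with_movslq, bit_direct, check_sequences_agree) are hypothetical and exist only for this sketch. It models the instructions and is not compiler output, and it does not explain why the compiler emits the extension.

```c
#include <assert.h>
#include <stdint.h>

/* Model of shlx with 32-bit operands: only the low 5 bits of the
   count are used, the upper bits of the count register are ignored. */
static uint32_t shlx32(uint32_t value, uint32_t count)
{
    return value << (count & 31u);
}

/* Model of the emitted sequence: sign-extend k to 64 bits, then shift
   using the low 32 bits of the extended value as the count. */
static uint32_t bit_with_movslq(int32_t k)
{
    int64_t extended = (int64_t)k;        /* movslq %ecx,%rdx */
    uint32_t count = (uint32_t)extended;  /* shlx reads %edx  */
    return shlx32(1u, count);             /* shlx %edx,%r8d,%edx */
}

/* Model of the shorter sequence proposed above: k is used directly. */
static uint32_t bit_direct(int32_t k)
{
    return shlx32(1u, (uint32_t)k);       /* shlx %ecx,%esi,%esi */
}

/* Compares both models over a sample of counts, including negative
   and out-of-range values. */
static void check_sequences_agree(void)
{
    for (int32_t k = -64; k <= 64; ++k) {
        assert(bit_with_movslq(k) == bit_direct(k));
    }
}
```

Calling check_sequences_agree() passes for every sampled count, because truncating the sign-extended value back to 32 bits yields the original bits of k. Under this model the movslq has no effect on the result, which fits the observation that there is no measurable speed difference.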
_______________________________________________
fpc-devel maillist - fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel