...
>>The code generated for the above sample is:
>># [16] c:=a*b;
>>        movl    U_P$PROJECT1_A,%edx
>>        movl    U_P$PROJECT1_B,%eax
>>        mull    %edx
>>        movl    $0,%edx
>>        movl    %eax,U_P$PROJECT1_C
>>        movl    %edx,U_P$PROJECT1_C+4
>>
>>What I want is the above code, but without the "movl $0,%edx"
>>instruction. Is there a way to do this (without using fpc_mul_qword).
>
>
> Only assembler for now. Any suggestions how the compiler could be told
> to generate such code?
...
>function UI32x32To64(A,B: Longword): QWord;
>assembler; register; nostackframe;
>asm
>    mull %edx
>end;
>
>It is fast but certainly much less than if it were inlined.

My suggestion would be:

  FUNCTION lmul ( CONST a, b : LongInt ) : int64 ; inline ;
  BEGIN
  {$ifdef cpu86}
    { one-operand imull: EDX:EAX := EAX * b (signed) }
    ASM
      movl a,%eax
      imull b
      movl %eax,lmul
      movl %edx,lmul+4
    END ;
  {$else}
  {$ifdef cpu68k}
    lmul := int64 ( a ) * b ;
  {$else}
  {$ifdef cpusparc}
    lmul := int64 ( a ) * b ;
  {$else}
  {$ifdef cpualpha}
    lmul := int64 ( a ) * b ;
  {$else}
  {$ifdef cpupowerpc}
    lmul := int64 ( a ) * b ;
  {$else}
    lmul := int64 ( a ) * b ;
  {$endif}
  {$endif}
  {$endif}
  {$endif}
  {$endif}
  END ;

and similarly for unsigned multiplication. (Shortened here; the full code
is in ulmul.pas, a little test in tmuls.pas, and the timing routines in
wtimer/tdrsc1.)

It is portable, so the code doesn't need to be rewritten when compiled for
other processors (though it is not optimal there). Tested only on i386.
It seems to be faster than the standard multiplication (interestingly, it is
significantly faster for signed than for unsigned multiplication). The
branches for the other CPUs can be coded in assembly (if those CPUs have
opcodes for such a multiplication; I don't know), and the routine could go
into unit math.pp.

It seems that routines which contain assembler are not inlined; on the day
somebody finds a trick to inline such code, they should be really fast.

Gerhard

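The unsigned counterpart mentioned above is not reproduced in the message body. As a rough sketch only, its portable fallback branch might look like the following. The name ulmul_portable is hypothetical and is not taken from ulmul.pas; the sketch assumes Free Pascal in objfpc mode with inlining switched on, and it shows only the plain-Pascal path, not an assembler branch.

```pascal
program ulmul_sketch;

{$mode objfpc}
{$inline on}

{ Hypothetical name: portable fallback for an unsigned 32x32 -> 64 multiply. }
function ulmul_portable(const a, b: LongWord): QWord; inline;
begin
  { Widen one operand first so the multiplication is carried out in 64 bits
    and the upper half of the product is not lost. }
  Result := QWord(a) * b;
end;

var
  c: QWord;
begin
  { Largest possible operands: the product still fits in a QWord. }
  c := ulmul_portable($FFFFFFFF, $FFFFFFFF);
  WriteLn(c);
end.
```

Because this version contains no assembler block, the compiler is free to inline it. Whether the resulting code on i386 is a single mull or a call to a 64-bit multiplication helper depends on the compiler version and is not something this sketch guarantees.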
ULMUL.pas
tmuls.pas
WTIMER.pas
TDRSC1.pas
_______________________________________________
fpc-pascal maillist - fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-pascal