Hi, I was reading this article about Delphi XE2 floating point performance.
http://delphitools.info/2011/09/02/first-look-at-xe2-floating-point-performance/

Not that I understand much of the generated assembler, but what I did
notice is that 64-bit Delphi XE2 uses the MOVAPD instruction (introduced
with SSE2 [1]), whereas 64-bit FPC, even when I specify -O3 -CfSSE3, only
uses MOVSD (here the SSE2 scalar-double move, which shares its mnemonic
with the 386 string instruction listed in [2]).

So is there room for optimizing FPC a bit more, by reducing the number of
instructions and using faster or newer instructions?

What does the compiler generate for the two following lines?

  x := x0 * x0 - y0 * y0 + p;
  y := 2 * x0 * y0 + q;


Delphi XE2
-----------
FMandelTest.pas.193: x := x0 * x0 - y0 * y0 + p;
00000000005A1452 660F28C4          movapd xmm0,xmm4
00000000005A1456 F20F59C4          mulsd xmm0,xmm4
00000000005A145A 660F28CD          movapd xmm1,xmm5
00000000005A145E F20F59CD          mulsd xmm1,xmm5
00000000005A1462 F20F5CC1          subsd xmm0,xmm1
00000000005A1466 F20F58C2          addsd xmm0,xmm2
FMandelTest.pas.194: y := 2 * x0 * y0 + q;
00000000005A146A 660F28CC          movapd xmm1,xmm4
00000000005A146E F20F590DA2000000  mulsd xmm1,qword ptr [rel $000000a2]
00000000005A1476 F20F59CD          mulsd xmm1,xmm5
00000000005A147A F20F58CB          addsd xmm1,xmm3


64-bit FPC 2.5.1
-----------------
# Var x located in register xmm0
# Var x0 located in register xmm2
# Var y located in register xmm0
# Var y0 located in register xmm3
# Var p located in register xmm4
# Var q located in register xmm8
......
.Ll3:
# [17] x := x0 * x0 - y0 * y0 + p;
	movsd	%xmm0,%xmm5
	mulsd	%xmm0,%xmm5
.Ll4:
	movsd	%xmm3,%xmm1
.Ll5:
	movsd	%xmm1,%xmm0
	mulsd	%xmm1,%xmm0
	subsd	%xmm0,%xmm5
	addsd	%xmm4,%xmm5
	movsd	%xmm5,-40(%rbp)
.Ll6:
# [18] y := 2 * x0 * y0 + q;
	movsd	_$FPU_TEST$_Ld1,%xmm0
	mulsd	%xmm2,%xmm0
	mulsd	%xmm3,%xmm0
	addsd	%xmm8,%xmm0
	movsd	%xmm0,-32(%rbp)


References:
===========
[1] http://en.wikipedia.org/wiki/MOVAPD
[2] http://en.wikipedia.org/wiki/X86_instruction_listings#Added_with_80386

The full Delphi source code for the Mandelbrot test can be downloaded from:
http://delphitools.info/wp-content/uploads/2011/03/MandelTest.zip


Regards,
  - Graeme -

--
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/

_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel