> -----Original Message-----
> From: fpc-devel <fpc-devel-boun...@lists.freepascal.org> On Behalf
> Of Florian Klämpfl
>
> So something like
>
>   cmp edx, $43300000
>   jge @@zero
>   cmp edx, $3FE00000
>   .align 16
>   jbe @@skip
>
> might be much better.

That ended up making things worse in some cases. Here is a branchless
version:

function Frac1(const X: Double): Double;
asm
  .noframe
  movq      rdx, xmm0          // raw bits of X
  mov       rax, rdx           // keep a copy of the bits
  xor       rcx, rcx           // rcx = 0 (bit pattern of 0.0)
  shr       rdx, 32            // high dword: sign, exponent, top of mantissa
  and       edx, $7FF00000     // isolate the exponent field
  cmp       edx, $43300000     // exponent field of 2^52
  cmovge    rax, rcx           // too big for a fraction: use 0.0 instead
  movq      xmm0, rax
  cvttsd2si rax, xmm0          // truncate toward zero
  cvtsi2sd  xmm4, rax
  subsd     xmm0, xmm4         // X - Trunc(X)
end;

It performs slightly slower in the "in range" case and noticeably worse
in the other two cases (the same instructions run for all three, so the
cost is identical). I would guess that the "in range" case is the most
common: you aren't going to call Frac if you know ahead of time that the
result is always 0 because the number is too big, or if you know that
the value is already between -1 and 1. So the higher cost for the "out
of range" and "only fraction" cases is probably less important than it
might look.

It IS largely independent of code alignment and of predictable patterns
in the incoming value. The number in parentheses after each address is
the address modulo 128.

Code address:
  Frac1: 0000000000536430 (48)
  Frac2: 0000000000536480 (0)
  Frac3: 00000000005364D0 (80)
  Frac4: 0000000000536520 (32)
  Frac5: 0000000000536570 (112)
  Frac6: 00000000005365C0 (64)
  Frac7: 0000000000536610 (16)
  Frac8: 0000000000536660 (96)

1st run:
          In range      Out of range  Only fraction
          (1e15+0.5)    (1e16+0.5)    (0.5)
  Frac1   1431794       1476556       1470644
  Frac2   1429232       1458534       1475227
  Frac3   1463357       1444431       1447379
  Frac4   1475042       1427287       1529162
  Frac5   1446016       1427326       1509275
  Frac6   1472979       1427472       1485185
  Frac7   1443244       1428914       1500826
  Frac8   1467528       1419654       1524294

Code address:
  Frac1: 0000000000536423 (35)
  Frac2: 0000000000536458 (88)
  Frac3: 000000000053648D (13)
  Frac4: 00000000005364C2 (66)
  Frac5: 00000000005364F7 (119)
  Frac6: 000000000053652C (44)
  Frac7: 0000000000536561 (97)
  Frac8: 0000000000536596 (22)

1st run:
          In range      Out of range  Only fraction
          (1e15+0.5)    (1e16+0.5)    (0.5)
  Frac1   1349334       1349939       1371353
  Frac2   1429198       1412543       1443000
  Frac3   1447011       1462295       1437583
  Frac4   1436476       1442081       1415591
  Frac5   1477058       1512579       1474870
  Frac6   1496887       1453593       1437224
  Frac7   1431293       1457510       1452196
  Frac8   1435460       1436533       1453833

Also, it still outperforms Delphi's Frac in all cases.
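
For anyone who wants to check the assembler routine against something
easier to read, here is a rough plain-Pascal sketch of what I understand
it to compute. FracRef and the program around it are hypothetical names
made up for this illustration, not part of the RTL or of the benchmark.
It uses an ordinary if where the asm uses cmovge, so it is not
branchless and says nothing about speed. It only mirrors the logic:
mask the exponent field, treat anything with an exponent field of
$43300000 (2^52) or more as 0, then subtract the truncated value.

```pascal
program FracSketch;

{$mode objfpc}

// Hypothetical reference version for comparison only (assumption:
// it mirrors the asm routine's logic, not its performance).
function FracRef(const X: Double): Double;
var
  Bits: QWord;
  ExpBits: LongWord;
  V: Double;
begin
  // Copy the raw IEEE 754 bit pattern of X into an integer.
  Bits := 0;
  Move(X, Bits, SizeOf(Bits));

  // High dword with sign and mantissa bits masked off.
  ExpBits := LongWord(Bits shr 32) and $7FF00000;

  // At 2^52 and above (including Inf and NaN) there is no
  // fractional part left, so substitute 0.0 as the asm does.
  if ExpBits >= $43300000 then
    V := 0.0
  else
    V := X;

  // Trunc rounds toward zero, like cvttsd2si.
  Result := V - Trunc(V);
end;

begin
  WriteLn(FracRef(1e15 + 0.5):0:3);  // "in range" case
  WriteLn(FracRef(1e16 + 0.5):0:3);  // "out of range" case
  WriteLn(FracRef(0.5):0:3);         // "only fraction" case
  WriteLn(FracRef(-2.75):0:3);       // negative input keeps its sign
end.
```

If the sketch is right, the three benchmark inputs should give 0.500,
0.000 and 0.500, and the negative input -0.750, which is a quick way to
sanity-check any of the asm variants against it.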