On Sat, 11 Jun 2005, L505 wrote:
> http://dennishomepage.gugs-cats.dk/LowerCaseChallenge.htm
>
> LowerCaseShaPas2_c
> This one here is in Pascal, using GOTO and LABEL which consistently work fast
> on both -Og and -OG
> But no one wants to maintain a GOTO and a LABEL..
>
> [LowerCaseShaPas2_c] was slightly slower than [lowercase 6 ] (second fastest)
> in -OG mode
> [LowerCaseShaPas2_c] was slightly faster than [lowercase 9] (still second
> fastest) in -Og mode .. so it's more consistent across compiler options it
> seemed
>
> So maybe [lowercase 6 ] result should be submitted to fastcode to be tested?
>
> Also, if no one wants to use the assembly functions and GOTO/LABEL functions
> in the RTL due to code bloat/maintenance, we could always offer an optional
> unit where people could call the fast functions only if they needed them badly.
> Just like how fastcode does, external from the VCL.

Hmm... They managed to do the 4 bytes in parallel. I can figure out how it
works, but it is interesting and should be fast.

Replace the loop by:

  repeat
    if exitcondition then
      break;
  until false;

... this will generate exactly the same code, but is better style.

Next, make it 64-bit safe, i.e. change cardinal(p) to ptruint(p). There are
also potential speed improvements, e.g.: c2:=not(c1) and $80808080.

If this is done, it can be included in the sysutils unit. Keep in mind that
this code assumes 32-bit: it will run on 64-bit, but it won't be optimal there
(though I guess it is still faster than going byte by byte).

If you want to submit assembler routines (I think that the LowerCaseSha2 trick
is a good basis to build one), keep the following guidelines in mind:

* It should be worthwhile to use assembler, i.e. if you don't get more than
  10% speed gain it isn't worth it; it's better to wait until the compiler
  generates better code.
* Don't use CPU-specific optimizations. The code will have to run on all kinds
  of machines, so Pentium-4 or Athlon specific optimizations aren't a good idea.

Do it this way:

* At the top of the implementation section, add {$i i386.inc}.
* In i386.inc, add the assembler version and do a {$define have_lowercase}.
* Put the Pascal implementation between a {$ifndef have_lowercase} and an
  {$endif}.

This way the impact on maintainability and portability is rather low.

Daniël

_______________________________________________
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
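
To illustrate the repeat/break loop form and the have_lowercase guard described above, here is a minimal sketch. It is not the LowerCaseShaPas2_c code and not the actual sysutils source: it works byte by byte, the name SketchLowerCase is made up for this example, and the include of i386.inc is only described in a comment so that the program compiles on its own. Since nothing defines have_lowercase here, the Pascal fallback is what gets compiled.

```pascal
program LowerCaseSketch;
{$mode objfpc}{$H+}

// In the RTL, i386.inc would be included at this point. When it provides
// an assembler version it would also define have_lowercase, and the block
// below would be skipped. This sketch includes nothing, so the portable
// Pascal fallback is compiled.

{$ifndef have_lowercase}
// Hypothetical fallback: lowercases ASCII 'A'..'Z' one byte at a time.
function SketchLowerCase(const s: string): string;
var
  i: Integer;
begin
  Result := s;
  i := 1;
  // Loop written as repeat / break / until false instead of GOTO and LABEL.
  repeat
    if i > Length(Result) then
      break;
    if Result[i] in ['A'..'Z'] then
      Result[i] := Chr(Ord(Result[i]) + 32);
    Inc(i);
  until false;
end;
{$endif}

begin
  WriteLn(SketchLowerCase('Hello, FPC World!'));
end.
```

With this layout an architecture that has no assembler version needs no changes at all: it simply never defines have_lowercase and gets the Pascal routine.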