> I can't speak directly to your question on the iphone-cross target, but
> can warn you that your mileage will vary when using the ARM assembly
> modules. We observed that some algorithms actually run slower when
> using the ARM assembly modules. It's been a couple of years and I don't
> recall the details, but want to say some of the hash algorithms were
> actually faster when using no-asm.

Well, I can imagine a compiler succeeding in generating code better than sha1-armv4-large, but I can't imagine a compiler beating sha256 or sha512. Was it really some of the algorithm*s* or just one?

Anyway, why sha1-armv4-large? Two reasons: a) inner loops are not unrolled; b) over-reliance on merged rotate-n-arithmetic. "Over-reliance" means that it uses more such instructions than actually necessary, which can negatively affect performance. I realized this after having a hard time getting sha256/512 to work well on Cortex-A53 (see sha512-armv8.pl; it's a 64-bit module, but the principle of merged rotate-n-arithmetic is the same). It should also be noted that there are now additional code paths in sha1-armv4-large, namely NEON and ARMv8.

> The results are likely to vary
> depending on the actual chipset used.

Right, the ARM universe is very diverse. Assembly modules, i.e. all of them, not only ARM, are maintained to provide near-optimal performance across a range of platforms, but sometimes optimizations conflict. In either case the prerequisite is access to a wide range of platforms, and feedback. In other words, bring it up.

> You'll probably want to test the
> performance on the target hardware using the "openssl speed" command.
> You can do this on a jailbroken iOS device via SSH.

For the record, I do development on a non-jailbroken unit, so jailbreaking is not a hard requirement.
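
To illustrate what "merged rotate-n-arithmetic" refers to: 32-bit ARM data-processing instructions can rotate their second operand as part of the same instruction, so a rotate and an XOR or ADD can be expressed as one instruction. How many such instructions a routine ends up with depends on how the rotations are arranged. Below is a minimal sketch in Python, not the code from any of the modules, that checks two algebraically equivalent arrangements of the SHA-256 Sigma1 function. The names ror32, sigma1_direct and sigma1_factored are made up for this illustration.

```python
MASK32 = 0xFFFFFFFF


def ror32(x, n):
    """Rotate a 32-bit value right by n bits (0 < n < 32)."""
    x &= MASK32
    return ((x >> n) | (x << (32 - n))) & MASK32


def sigma1_direct(e):
    # SHA-256 Sigma1 as defined in FIPS 180-4: three separate rotations.
    return ror32(e, 6) ^ ror32(e, 11) ^ ror32(e, 25)


def sigma1_factored(e):
    # Same value with the smallest rotation (6) factored out:
    # ror(e ^ ror(e, 5) ^ ror(e, 19), 6). The outer rotation can then be
    # merged into whichever instruction consumes the result.
    t = e ^ ror32(e, 5) ^ ror32(e, 19)
    return ror32(t, 6)


if __name__ == "__main__":
    import random

    rng = random.Random(0)  # fixed seed, so the check is repeatable
    for _ in range(1000):
        e = rng.getrandbits(32)
        assert sigma1_direct(e) == sigma1_factored(e)
    print("Sigma1 forms agree")
```

The two forms always give the same result, because rotation distributes over XOR. Which arrangement is faster on a given core is a separate question, and the only way to answer it is to measure on the actual hardware.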