On 24 October 2014 19:17, Andy Polyakov <[email protected]> wrote:
> There is inconsistency in ARM support and I'd like to gather some
> opinions on how to resolve it. Circulate this to ARM people near you.
>
> At some point an inconsistency of following nature was introduced and
> then just grew. OpenSSL attempts to adapt to processor it's running on
> by detecting capabilities and using run-time switch between different
> code paths. The code probing capabilities is compiled and executed on
> all supported ARM architectures, ARMv4 through ARMv8. Original rationale
> was that one should be able to produce "universal" binary that can be
> executed on wide range of processors and deliver optimal performance on
> all of them. But at the same time assembly modules have #if
> __ARM_ARCH__>=X which effectively renders them not as universal as
> implied in capability probing code. This is the inconsistency. There are
> two ways to resolve it.
>
> 1. __ARM_ARCH__ is effectively controlled by compiler -march command
> line option, which naturally also controls compiler-generated outcome.
> In order to live up to original intention to produce "universal" binary,
> it would be appropriate to tell *compiler* to generate code for minimal
> architecture one wants to target, but tell *assembler* to accept
> instructions for maximum architecture one wants to target, e.g.
> -march=armv4 -Wa,-march=armv7-a.
>
> 2. Abandon the idea of producing true "universal" binary and limit
> capability detection to contemporary processor families, ARMv7/8 for the
> moment of this writing. And run pre-ARMv7 without capability detection
> (there is nothing to detect really).
>
> I suppose distro vendors would prefer 2nd option, because everything has
> to match anyway. ISV on the other hand might prefer 1st one, unless of
> course they target specific distros one by one rather than providing
> unified binaries that can be executed on multiple distros.

I think it is perfectly feasible to have one universal build. The usual
reason for /not/ having a universal build is that the compiler can emit
faster instructions that are not available on the older architectures,
but in OpenSSL's case, that is being handled by the runtime selected
alternatives anyway.

Since the minimum architecture required by a piece of assembler code is
a property of the code itself, it should be encoded into it using
.arch/.fpu declarations, and there is no need for the '-Wa' GCC option
above.

I did a quick test: I removed the '#if __ARM_ARCH__ >= 7' from the v8
AES asm file, and after adding the .arch declaration to it (and to ARM's
cpuid.S), it builds perfectly fine when targeting '-march=armv5'. There
shouldn't be any problems with incompatible ABI flags in the ELF
metadata as long as no arguments are being passed in floating point
registers.

For things like aes-armv4.S, which are built unconditionally but contain
#ifdefs to select between v5 and v6/v7 code sequences, e.g., for endian
swap, it makes perfect sense to retain them, but any code that is
runtime selected should always be written and built for the oldest
architecture version that supports the runtime selected feature (v7 for
NEON, v8 for crypto extensions).

The remaining issue is toolchain support for the newest architecture
version. Although undesirable in general, I think that is being dealt
with adequately by translating crypto instructions to byte sequences in
the perlasm. I don't see a compelling reason to keep supporting pre-v7
toolchains.

So I would vote for #1, but with the .arch/.fpu declarations moved to
the asm files. This still allows the distros to target a newer minimum
architecture version, and in places where it matters (such as
aes-armv4.S), it will result in slightly faster code that can no longer
run on older cores. But I don't see a reason to make that the default
for OpenSSL itself.

Regards,
Ard.
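
For illustration only, here is a minimal sketch of the runtime selection
discussed in this thread, written as portable C so it can be compiled
anywhere. All names in it (CAP_NEON, CAP_V8_AES, select_encrypt and the
three stand-in implementations) are hypothetical and are not OpenSSL
identifiers. It assumes a capability mask has already been filled in by
some start-up probe. In a real build, each runtime selected routine would
live in its own assembly file that carries its own declarations, for
example ".arch armv7-a" and ".fpu neon" for a NEON routine, so that the
rest of the library can still be compiled for an older baseline.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits; a start-up probe is assumed to set them. */
#define CAP_NEON   (1u << 0)   /* oldest architecture with NEON: ARMv7 */
#define CAP_V8_AES (1u << 1)   /* ARMv8 crypto extensions */

enum impl { IMPL_GENERIC, IMPL_NEON, IMPL_V8_CRYPTO };

typedef enum impl (*encrypt_fn)(void);

/* Stand-ins for the real routines; each only reports which path ran. */
static enum impl encrypt_generic(void) { return IMPL_GENERIC; }
static enum impl encrypt_neon(void)    { return IMPL_NEON; }
static enum impl encrypt_v8(void)      { return IMPL_V8_CRYPTO; }

/* Pick the most capable implementation the running processor supports. */
static encrypt_fn select_encrypt(uint32_t caps)
{
    if (caps & CAP_V8_AES)
        return encrypt_v8;
    if (caps & CAP_NEON)
        return encrypt_neon;
    return encrypt_generic;    /* baseline path, runs on every core */
}

/* Dispatch through the selected function pointer. */
static enum impl run_encrypt(uint32_t caps)
{
    encrypt_fn fn = select_encrypt(caps);
    assert(fn != NULL);
    return fn();
}
```

The point of the sketch is that the choice of code path depends only on
the probed capabilities, not on the -march value the C code was compiled
with, which is why the architecture requirement can sit in the asm files.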
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                        [email protected]
Automated List Manager                          [email protected]
