On 26 October 2014 22:43, Andy Polyakov <[email protected]> wrote:
> Hi,
>
>>> There is inconsistency in ARM support and I'd like to gather some
>>> opinions on how to resolve it. Circulate this to ARM people near you.
>>>
>>> At some point an inconsistency of following nature was introduced and
>>> then just grew. OpenSSL attempts to adapt to processor it's running on
>>> by detecting capabilities and using run-time switch between different
>>> code paths. The code probing capabilities is compiled and executed on
>>> all supported ARM architectures, ARMv4 through ARMv8. Original rationale
>>> was that one should be able to produce "universal" binary that can be
>>> executed on wide range of processors and deliver optimal performance on
>>> all of them. But at the same time assembly modules have #if
>>> __ARM_ARCH__>=X which effectively renders them not as universal as
>>> implied in capability probing code. This is the inconsistency. There are
>>> two ways to resolve it.
>>>
>>> 1. __ARM_ARCH__ is effectively controlled by compiler -march command
>>> line option, which naturally also controls compiler-generated outcome.
>>> In order to live up to original intention to produce "universal" binary,
>>> it would be appropriate to tell *compiler* to generate code for minimal
>>> architecture one wants to target, but tell *assembler* to accept
>>> instructions for maximum architecture one wants to target, e.g.
>>> -march=armv4 -Wa,-march=armv7-a.
>>>
>>> 2. Abandon the idea of producing true "universal" binary and limit
>>> capability detection to contemporary processor families, ARMv7/8 for the
>>> moment of this writing. And run pre-ARMv7 without capability detection
>>> (there is nothing to detect really).
>>>
>>> I suppose distro vendors would prefer 2nd option, because everything has
>>> to match anyway. ISV on the other hand might prefer 1st one, unless of
>>> course they target specific distros one by one rather than providing
>>> unified binaries that can be executed on multiple distros.
>>
>> I think it is perfectly feasible to have one universal build. The
>> usual reason for /not/ having a universal build is that the compiler
>> can emit faster instructions that are not available on the older
>> architectures, but in OpenSSL's case, that is being handled by the
>> runtime selected alternatives anyway.
>
> Implied question also is if it's *worth* it. My personal preference is
> universal too, but what if it creates more problems than it solves. Or
> it attempts to solve non-existing problem if ISVs are targeting platforms
> one by one anyway. Or if nobody really cares about pre-ARMv7 when it
> comes to software *distribution*, i.e. pre-ARMv7 is interesting only in
> individual cases when you have to target very specific processor and
> build everything yourself anyway.
>
Well, ARMv5 may be a dead target according to some, but the Raspberry Pi
is a non-Thumb2 ARMv6, which I don't think should end up in the 'legacy'
column. That still leaves a fair amount of variation between the targets
that I feel the 'high end' build should support. So yes, I think it is
worth it.

>> Since the minimum architecture required by a piece of assembler code
>> is a property of the code itself, it should be encoded into it using
>> .arch/.fpu declarations, and there is no need for the '-Wa' GCC option
>> above.
>
> The reason for why I suggested -Wa is to be able to support legacy
> toolchains, as we can't afford assuming that everybody has capable
> enough assembler. Basically idea is to be as flexible as possible to
> meet "worst case" local circumstances. Yes, this means that the
> suggestion was actually incomplete, because, as assembler doesn't tell
> which command line option was passed, one has to convey the missing
> information by other means, by additional -D option.
>
> I want to remind that question is *not* about removing run-time switch
> as concept, but rather about distinguishing pre-ARMv7 and ARMv>=7. I.e.
> NEON/crypto switch will stay, the only question if it's worth imposing it
> on pre-ARMv7 builds [assuming that such binaries will be executed even
> on ARMv>=7 systems].
>

Yes, that is perfectly clear. So we can target armv4 on the compiler
command line, and still get crypto instructions when executing the
resulting binary on a v8 system.

>> So I would vote for #1, but with the .arch/.fpu declarations moved to
>> the asm files.
>
> This is undesired, because we don't want to make assumption that every
> assembler actually supports these directives and required values. It's
> safe to assume for big distro vendors, but there are people struggling
> with exotic toolchains.
>

Well, in that case, will they be able to assemble the file in the first
place, even with some -Wa, switch added?
I know we deal with that using #ifdefs now, but those should be dropped
imo. Perhaps we should introduce MIN_ARCH and MAX_ARCH variables, where
the defaults are v4 and v8 respectively, and we can add Configure
targets that narrow this down for targets like these?

>> This still allows the distros to target a newer minimum
>> architecture version, and in places where it matters (such as
>> aes-armv4.S), it will result in slightly faster code that can no
>> longer run on older cores. But I don't see a reason to make that the
>> default for OpenSSL itself.
>
> Would you still vote for #1 if it has to be -Wa-driven?

Depends on how we address my concern above, but #1 has my preference
regardless.
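
To make the MIN_ARCH/MAX_ARCH suggestion a bit more concrete, here is a
rough sketch in C of how such bounds could interact with the run-time
switch. Everything in it is an assumption made for illustration:
MIN_ARCH and MAX_ARCH are only the names floated above, and CAP_NEON,
CAP_CRYPTO, usable_caps() and select_path() are hypothetical names, not
existing OpenSSL symbols. The probed capability bits are passed in as a
plain argument instead of coming from a real probe.

```c
#include <assert.h>

/* Hypothetical build-time bounds, e.g. passed as -DMIN_ARCH=4 -DMAX_ARCH=8.
 * MIN_ARCH: oldest architecture the compiler generates code for.
 * MAX_ARCH: newest architecture whose instructions the assembler accepts. */
#ifndef MIN_ARCH
#define MIN_ARCH 4
#endif
#ifndef MAX_ARCH
#define MAX_ARCH 8
#endif

#if MIN_ARCH > MAX_ARCH
#error "MIN_ARCH must not exceed MAX_ARCH"
#endif

/* Hypothetical capability bits, standing in for the result of a probe. */
#define CAP_NEON   (1u << 0)
#define CAP_CRYPTO (1u << 1)

enum code_path { PATH_INTEGER, PATH_NEON, PATH_CRYPTO };

/* Drop capabilities whose code paths were never assembled into the build. */
static unsigned usable_caps(unsigned probed)
{
    /* This sketch only knows about the two bits defined above. */
    assert((probed & ~(CAP_NEON | CAP_CRYPTO)) == 0);
#if MAX_ARCH < 7
    probed &= ~(CAP_NEON | CAP_CRYPTO);  /* pre-ARMv7 only: nothing to detect */
#elif MAX_ARCH < 8
    probed &= ~CAP_CRYPTO;               /* no ARMv8 crypto code assembled */
#endif
    return probed;
}

/* Pick the best code path that is both assembled and supported by the CPU. */
static enum code_path select_path(unsigned probed)
{
    unsigned caps = usable_caps(probed);

    if (caps & CAP_CRYPTO)
        return PATH_CRYPTO;
    if (caps & CAP_NEON)
        return PATH_NEON;
    return PATH_INTEGER;
}
```

With the default v4/v8 bounds, one binary would take the integer-only
path on a pre-ARMv7 core and the crypto path on a v8 system. A Configure
target that narrows MAX_ARCH below 7 would compile the switch away.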
