Hi,

>> There is inconsistency in ARM support and I'd like to gather some
>> opinions on how to resolve it. Circulate this to ARM people near you.
>>
>> At some point an inconsistency of following nature was introduced and
>> then just grew. OpenSSL attempts to adapt to processor it's running on
>> by detecting capabilities and using run-time switch between different
>> code paths. The code probing capabilities is compiled and executed on
>> all supported ARM architectures, ARMv4 through ARMv8. Original rationale
>> was that one should be able to produce "universal" binary that can be
>> executed on wide range of processors and deliver optimal performance on
>> all of them. But at the same time assembly modules have #if
>> __ARM_ARCH__>=X which effectively renders them not as universal as
>> implied in capability probing code. This is the inconsistency. There are
>> two ways to resolve it.
>>
>> 1. __ARM_ARCH__ is effectively controlled by compiler -march command
>> line option, which naturally also controls compiler-generated outcome.
>> In order to live up to original intention to produce "universal" binary,
>> it would be appropriate to tell *compiler* to generate code for minimal
>> architecture one wants to target, but tell *assembler* to accept
>> instructions for maximum architecture one wants to target, e.g.
>> -march=armv4 -Wa,-march=armv7-a.
>>
>> 2. Abandon the idea of producing true "universal" binary and limit
>> capability detection to contemporary processor families, ARMv7/8 for the
>> moment of this writing. And run pre-ARMv7 without capability detection
>> (there is nothing to detect really).
>>
>> I suppose distro vendors would prefer 2nd option, because everything has
>> to match anyway. ISV on the other hand might prefer 1st one, unless of
>> course they target specific distros one by one rather than providing
>> unified binaries that can be executed on multiple distros.
> 
> I think it is perfectly feasible to have one universal build. The
> usual reason for /not/ having a universal build is that the compiler
> can emit faster instructions that are not available on the older
> architectures, but in OpenSSL's case, that is being handled by the
> runtime selected alternatives anyway.

The implied question is also whether it's *worth* it. My personal
preference is universal too, but what if it creates more problems than
it solves? Or what if it attempts to solve a non-existent problem,
because ISVs are targeting platforms one by one anyway? Or what if
nobody really cares about pre-ARMv7 when it comes to software
*distribution*, i.e. pre-ARMv7 is interesting only in individual cases
where you have to target a very specific processor and build everything
yourself anyway?

> Since the minimum architecture required by a piece of assembler code
> is a property of the code itself, it should be encoded into it using
> .arch/.fpu declarations, and there is no need for the '-Wa' GCC option
> above.

The reason I suggested -Wa is to be able to support legacy toolchains,
as we can't afford to assume that everybody has a capable enough
assembler. Basically, the idea is to be as flexible as possible in order
to meet "worst case" local circumstances. Yes, this means that the
suggestion was actually incomplete: since the assembler doesn't tell the
code which command-line option was passed, one has to convey the missing
information by other means, namely an additional -D option.
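
To make the missing piece concrete, here is a minimal sketch of how the
two levels could be told apart once the extra -D is in place. It is only
an illustration under stated assumptions: MAX_ARCH_HINT is a hypothetical
macro name (nothing by that name exists today), and classify() and
armv7_path_mode() are made-up helpers. __ARM_ARCH__ is the macro the
assembly modules already test.

```c
#include <assert.h>

/* Minimum architecture: what the *compiler* was told to target.
 * This is the macro the assembly modules already test; default to 4
 * here only so that the sketch also compiles off-target. */
#ifndef __ARM_ARCH__
# define __ARM_ARCH__ 4
#endif

/* Maximum architecture: what the *assembler* was told to accept.
 * Hypothetical macro, supplied as -DMAX_ARCH_HINT=N, because the
 * assembler does not report which -march it was given. */
#ifndef MAX_ARCH_HINT
# define MAX_ARCH_HINT __ARM_ARCH__
#endif

#if MAX_ARCH_HINT < __ARM_ARCH__
# error "assembler maximum must not be below compiler minimum"
#endif

enum path_mode {
    PATH_ABSENT,   /* cannot be assembled into this build at all */
    PATH_RUNTIME,  /* assembled in, but must be guarded by a probe */
    PATH_ALWAYS    /* guaranteed by the build's baseline */
};

/* Classify a code path that needs architecture level `needed`. */
enum path_mode classify(int min_arch, int max_arch, int needed)
{
    assert(min_arch <= max_arch);
    if (needed > max_arch)
        return PATH_ABSENT;
    if (needed > min_arch)
        return PATH_RUNTIME;
    return PATH_ALWAYS;
}

/* How this particular build would treat ARMv7-only instructions. */
enum path_mode armv7_path_mode(void)
{
    return classify(__ARM_ARCH__, MAX_ARCH_HINT, 7);
}
```

With -march=armv4 -Wa,-march=armv7-a -DMAX_ARCH_HINT=7, and assuming
__ARM_ARCH__ ends up as 4, armv7_path_mode() would yield PATH_RUNTIME,
which is the "universal" case. Without the -D it would yield
PATH_ABSENT, which is what the #if __ARM_ARCH__>=X guards amount to
today.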

I want to remind you that the question is *not* about removing the
run-time switch as a concept, but rather about distinguishing pre-ARMv7
from ARMv>=7. I.e. the NEON/crypto switch will stay; the only question
is whether it's worth imposing it on pre-ARMv7 builds [assuming that
such binaries will be executed even on ARMv>=7 systems].
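
As a rough illustration of the difference between the two options, the
selection logic could be modelled as below. This is a sketch, not the
actual probing code: CAP_NEON, the policy and impl enums, and
select_impl() are all hypothetical names, and the capability word is
simply passed in rather than detected.

```c
#include <assert.h>

#define CAP_NEON (1u << 0)  /* hypothetical capability bit */

enum policy {
    POLICY_UNIVERSAL = 1,   /* option 1: every build carries the switch */
    POLICY_SPLIT     = 2    /* option 2: only ARMv7+ builds probe */
};

enum impl { IMPL_GENERIC, IMPL_NEON };

/* build_arch: architecture the binary was compiled for (4..8).
 * caps:       capabilities of the processor it is running on. */
enum impl select_impl(enum policy p, int build_arch, unsigned caps)
{
    assert(build_arch >= 4 && build_arch <= 8);

    /* Option 2: a pre-ARMv7 build contains no probing code and no
     * NEON path, so it stays generic even on a NEON-capable core. */
    if (p == POLICY_SPLIT && build_arch < 7)
        return IMPL_GENERIC;

    /* Otherwise the run-time switch decides. */
    return (caps & CAP_NEON) ? IMPL_NEON : IMPL_GENERIC;
}
```

The only case where the two options differ is a pre-ARMv7 binary
running on an ARMv>=7 system: select_impl(POLICY_UNIVERSAL, 4, CAP_NEON)
picks the NEON path, while select_impl(POLICY_SPLIT, 4, CAP_NEON) stays
generic. So the question is how often that case occurs in practice.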

> So I would vote for #1, but with the .arch/.fpu declarations moved to
> the asm files.

This is undesirable, because we don't want to assume that every
assembler actually supports these directives and the required values.
That is a safe assumption for big distro vendors, but there are people
struggling with exotic toolchains.

> This still allows the distros to target a newer minimum
> architecture version, and in places where it matters (such as
> aes-armv4.S), it will result in slightly faster code that can no
> longer run on older cores. But I don't see a reason to make that the
> default for OpenSSL itself.

Would you still vote for #1 if it has to be -Wa-driven?

Thanks for the feedback.
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       openssl-dev@openssl.org
Automated List Manager                           majord...@openssl.org
