On 26 October 2014 22:43, Andy Polyakov <[email protected]> wrote:
> Hi,
>
>>> There is inconsistency in ARM support and I'd like to gather some
>>> opinions on how to resolve it. Circulate this to ARM people near you.
>>>
>>> At some point an inconsistency of following nature was introduced and
>>> then just grew. OpenSSL attempts to adapt to processor it's running on
>>> by detecting capabilities and using run-time switch between different
>>> code paths. The code probing capabilities is compiled and executed on
>>> all supported ARM architectures, ARMv4 through ARMv8. Original rationale
>>> was that one should be able to produce "universal" binary that can be
>>> executed on wide range of processors and deliver optimal performance on
>>> all of them. But at the same time assembly modules have #if
>>> __ARM_ARCH__>=X which effectively renders them not as universal as
>>> implied in capability probing code. This is the inconsistency. There are
>>> two ways to resolve it.
>>>
>>> 1. __ARM_ARCH__ is effectively controlled by compiler -march command
>>> line option, which naturally also controls compiler-generated outcome.
>>> In order to live up to original intention to produce "universal" binary,
>>> it would be appropriate to tell *compiler* to generate code for minimal
>>> architecture one wants to target, but tell *assembler* to accept
>>> instructions for maximum architecture one wants to target, e.g.
>>> -march=armv4 -Wa,-march=armv7-a.
>>>
>>> 2. Abandon the idea of producing true "universal" binary and limit
>>> capability detection to contemporary processor families, ARMv7/8 for the
>>> moment of this writing. And run pre-ARMv7 without capability detection
>>> (there is nothing to detect really).
>>>
>>> I suppose distro vendors would prefer 2nd option, because everything has
>>> to match anyway. ISV on the other hand might prefer 1st one, unless of
>>> course they target specific distros one by one rather than providing
>>> unified binaries that can be executed on multiple distros.
>>
>> I think it is perfectly feasible to have one universal build. The
>> usual reason for /not/ having a universal build is that the compiler
>> can emit faster instructions that are not available on the older
>> architectures, but in OpenSSL's case, that is being handled by the
>> runtime selected alternatives anyway.
>
> Implied question also is if it's *worth* it. My personal preference is
> universal too, but what if it creates more problems than it solves. Or
> it attempts to solve a non-existent problem if ISVs are targeting platforms
> one by one anyway. Or if nobody really cares about pre-ARMv7 when it
> comes to software *distribution*, i.e. pre-ARMv7 is interesting only in
> individual cases when you have to target very specific processor and
> build everything yourself anyway.
>

Well, even if ARMv5 may be a dead target according to some, the
Raspberry Pi is a non-Thumb2 ARMv6, and I don't think it should end
up in the 'legacy' column. So we are still dealing with a fair amount
of variation between the targets that I feel should be supported by
the 'high end' build. So yes, I think it is worth it.

>> Since the minimum architecture required by a piece of assembler code
>> is a property of the code itself, it should be encoded into it using
>> .arch/.fpu declarations, and there is no need for the '-Wa' GCC option
>> above.
>
> The reason for why I suggested -Wa is to be able to support legacy
> toolchains, as we can't afford assuming that everybody has capable
> enough assembler. Basically idea is to be as flexible as possible to
> meet "worst case" local circumstances. Yes, this means that the
> suggestion was actually incomplete, because, as assembler doesn't tell
> which command line option was passed, one has to convey the missing
> information by other means, by additional -D option.
>
> I want to remind that question is *not* about removing run-time switch
> as concept, but rather about distinguishing pre-ARMv7 and ARMv>=7. I.e.
> NEON/crypto switch will stay, the only question is if it's worth imposing it
> on pre-ARMv7 builds [assuming that such binaries will be executed even
> on ARMv>=7 systems].
>

Yes, that is perfectly clear. So we can target armv4 on the compiler
command line, and still get crypto instructions when executing the
resulting binary on a v8 system.
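
To make sure we mean the same thing by "run-time switch", here is a
rough sketch in C. The names (CAP_NEON, CAP_CRYPTO, select_path) are
made up for illustration and are not OpenSSL's actual symbols; the
capability word is assumed to be filled in once by the probing code.

```c
/* Hypothetical capability bits, as a run-time probe might report them. */
#define CAP_NEON   (1u << 0)
#define CAP_CRYPTO (1u << 1)

/* The code paths a single "universal" binary would carry. */
enum code_path { PATH_GENERIC, PATH_NEON, PATH_CRYPTO };

/*
 * Pick the best code path for the capabilities found at run time.
 * The generic path is what the compiler produced for the minimum
 * architecture; the other two stand in for assembly modules that
 * were assembled for a newer architecture.
 */
static enum code_path select_path(unsigned int caps)
{
    if (caps & CAP_CRYPTO)
        return PATH_CRYPTO;
    if (caps & CAP_NEON)
        return PATH_NEON;
    return PATH_GENERIC;
}
```

With no bits set (a pre-ARMv7 core) this falls through to the generic
path, and with both bits set (a v8 core with the crypto extensions) it
picks the crypto path, all from the same binary.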

>> So I would vote for #1, but with the .arch/.fpu declarations moved to
>> the asm files.
>
> This is undesired, because we don't want to make assumption that every
> assembler actually supports these directives and required values. It's
> safe to assume for big distro vendors, but there are people struggling
> with exotic toolchains.
>

Well, in that case, will they be able to assemble the file in the
first place, even with some -Wa, switch added?
I know we deal with that using #ifdef's now, but those should be dropped imo.

Perhaps we should introduce MIN_ARCH and MAX_ARCH variables, where the
defaults are v4 and v8 respectively, and we can add Configure targets
that narrow this down for targets like these?
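
Something along these lines is what I have in mind, as a sketch only.
MIN_ARCH and MAX_ARCH are the proposed knobs; the macro NEED_CAP_PROBE
and the two helper functions are hypothetical names that exist only to
show how the range could be used.

```c
/* Proposed build-time knobs; the defaults span ARMv4 through ARMv8. */
#ifndef MIN_ARCH
# define MIN_ARCH 4
#endif
#ifndef MAX_ARCH
# define MAX_ARCH 8
#endif

#if MIN_ARCH > MAX_ARCH
# error "MIN_ARCH must not exceed MAX_ARCH"
#endif

/* Below ARMv7 there is nothing to detect, so probing can be left out. */
#define NEED_CAP_PROBE (MAX_ARCH >= 7)

/* Returns 1 if a code path requiring architecture level `arch` is built. */
static int path_is_built(int arch)
{
    return arch <= MAX_ARCH;
}

/*
 * Returns 1 if a built code path must sit behind a run-time check
 * because the binary may also run on an older core. This looks at the
 * architecture level only; optional extensions such as NEON would
 * still need their own check.
 */
static int path_needs_runtime_check(int arch)
{
    return arch > MIN_ARCH && arch <= MAX_ARCH;
}
```

A Configure target for an exotic toolchain could then pass something
like -DMAX_ARCH=6, which drops both the newer code paths and the
probing in one go.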

>> This still allows the distros to target a newer minimum
>> architecture version, and in places where it matters (such as
>> aes-armv4.S), it will result in slightly faster code that can no
>> longer run on older cores. But I don't see a reason to make that the
>> default for OpenSSL itself.
>
> Would you still vote for #1 if it has to be -Wa-driven?

Depends on how we address my concern above, but #1 has my preference regardless.
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [email protected]
Automated List Manager                           [email protected]
