>> Attached is the promised patch that reworks the
>> interworking logic. As mentioned earlier, the idea is to use __ARM_ARCH__>=5
>> || !defined(__thumb__). The rationale is that a load to pc does interworking
>> since ARMv5, but without __thumb__ it does what we need even on ARMv4.
>>
> 
> OK, this appears to build and run fine when built for ARMv5/arm and
> ARMv5/thumb using the Ubuntu softfloat toolchains (arm-linux-gnueabi)
> 
> The only use case that we may break is where someone links to the
> internal libcrypto.so symbols directly and calls them from Thumb code
> on an ARMv4T, but I guess you deserve the pain in that case :-)

Correct. Those who attempt to run a Thumb binary linked directly to e.g.
AES_encrypt in a "universal" non-Thumb libcrypto.so, specifically on an
ARMv4T processor, will suffer. It should be noted that [modulo the fact
that they are discouraged from making such calls] such users won't be
using any major distribution, but are likely to compile everything
themselves. This means that they are likely to share compiler flags across
the whole thing, and it should work out for them. Basically there are two
options:

- [again, modulo the fact that they are discouraged from making such calls]
leave the moveq pc,lr thing specifically in the aes-armv4 module (other
modules are not interesting to call from an application);
- document this possibility and let the user choose between the right thing
to do (not making such calls) and re-compiling the whole thing with
matching flags.

I'd vote for the second option.
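
To make the build condition from the quoted patch description concrete,
here is a minimal sketch that mirrors __ARM_ARCH__>=5 ||
!defined(__thumb__) as an ordinary C function, so the four combinations
can be checked on any host. The function name and its parameters are
hypothetical and not part of OpenSSL; the real test is done by the
preprocessor inside the assembly modules.

```c
#include <assert.h>

/*
 * Hypothetical helper (not OpenSSL code): run-time mirror of the
 * compile-time test
 *     #if __ARM_ARCH__>=5 || !defined(__thumb__)
 *
 * arm_arch    - architecture version the library is built for (4, 5, ...)
 * thumb_build - nonzero if the build defines __thumb__
 *
 * Returns 1 if a plain load to pc is an acceptable way to return,
 * 0 if the return sequence has to handle interworking explicitly.
 */
int plain_pc_load_ok(int arm_arch, int thumb_build)
{
    assert(arm_arch >= 4);  /* the thread only concerns ARMv4 and later */

    /* ARMv5 and later: a load to pc interworks by itself.
     * Non-Thumb build: callers are ARM code, so nothing to switch. */
    return arm_arch >= 5 || !thumb_build;
}
```

The only combination that yields 0 is a Thumb build for ARMv4, which is
the case that still needs explicit interworking. A Thumb application
calling into a non-Thumb ARMv4T libcrypto.so is not visible to this
test at all, which is why it is the one use case that may break.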

For public reference: pre-Thumb-2 Thumb is effectively impaired by the
limited number of addressable registers and even by instruction set
functionality. This means that any complex algorithm will take more
instructions and therefore will always run slower. ARM's reasoning is
"yes, it's more instructions, but every instruction is half the size", so
as long as it takes fewer than twice as many instructions you still win in
space. But is it possible to break this 2x barrier and get code that is
both larger and slower than non-Thumb code? Yes, it's totally possible,
especially in OpenSSL's case, which is full of complex algorithms. Which,
by the way, is why the assembly modules are actually compiled as non-Thumb
and rely on interworking even if the rest is compiled as Thumb. Note that
this is about pre-Thumb-2 Thumb.
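
The 2x barrier is plain arithmetic, sketched below under a simplifying
assumption: every ARM instruction counts as 4 bytes and every
pre-Thumb-2 Thumb instruction as 2 bytes, ignoring the two-halfword BL
and literal pools. The function names are made up for illustration, and
the instruction counts in the note after the code are invented, not
measurements of any OpenSSL routine.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified encoding sizes, in bytes per instruction. */
enum { ARM_INSN_BYTES = 4, THUMB_INSN_BYTES = 2 };

/* Code size of n ARM instructions. */
size_t arm_code_bytes(size_t n)
{
    return n * ARM_INSN_BYTES;
}

/* Code size of n (pre-Thumb-2) Thumb instructions. */
size_t thumb_code_bytes(size_t n)
{
    return n * THUMB_INSN_BYTES;
}

/*
 * Returns 1 if the Thumb version of a routine occupies more bytes than
 * the ARM version, i.e. if it needs more than twice as many instructions.
 */
int thumb_is_larger(size_t arm_insns, size_t thumb_insns)
{
    assert(arm_insns > 0 && thumb_insns > 0);
    return thumb_code_bytes(thumb_insns) > arm_code_bytes(arm_insns);
}
```

With these assumptions, a routine of 100 ARM instructions takes 400
bytes. A Thumb version of 150 instructions takes 300 bytes and still
wins in space, 200 instructions is the break-even point, and at 220
instructions (440 bytes) the Thumb version is larger as well as slower.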

______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [email protected]
Automated List Manager                           [email protected]
