Hi,

I believe I'm the "original author" of the patch attempt at tackling this 
problem, from more than a year ago, that is referenced in this series.
I understand why everyone, Khem being the first but not the only one, would 
like something "simpler" for ARM.
But the problem is that ARM-based SoCs are very diverse, and ARM does have a 
number of optional IP blocks (crypto is one, neon is another, and there are 
others), defined for each architecture. ARM then defines some "standard" 
SoCs (like cortex-A53, cortex-A57, ...) which may make some of those optional 
IPs mandatory for that SoC and leave the rest optional.
And SoC vendors then decide which optional IPs they will implement or not...

So when we're talking "cortex-A53", it's not necessarily the same cortex-A53 
for all SoC vendors.

GCC does support all that complexity. So the main question is, do we want to be 
able to generate code that could take advantage of the optional IPs present on 
a SoC? Or do we prefer to settle for the least common denominator?
As someone who is close to the SoC, I would definitely prefer to be able to 
take advantage of the optional IPs present on an ARM SoC, and I'd rather have a 
system that can at least support that, even if it's slightly more complex. That 
said, once it's done, most people won't look under the hood but will just use 
it, so the complexity would end up being hidden - much like now with armv7.

I've personally followed up on my patches from last year, and I now have a 
slightly modified/simplified version of them, which I've used to build some 
production-ready environments using cortex-a53/armv8 tunes that enable the 
optimizations for cortex-a53 + neon. And if the SoC I'm working with had the 
crypto extension, I would be very happy to build for it, just by switching the 
tune I use for my cortex-a53 to the armv8 tune supporting crypto.
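
As an illustration only (these are standard GCC options for 32-bit ARM, not 
the contents of the tune files discussed here, and the exact set a tune would 
pass is an assumption), the difference between the two tunes would come down 
to something like:

```
# cortex-a53 with neon only (no crypto extension on the SoC)
-mcpu=cortex-a53 -mfpu=neon-fp-armv8 -mfloat-abi=hard

# same core, on a SoC that implements the crypto extension
-mcpu=cortex-a53 -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard
```

Only the -mfpu value changes between the two, which is why each optional IP 
ends up needing its own tune variant.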

So I believe now may be a good time to talk this over again, because we're 
basically building for cortex-a53 with cortexa7/armv7ve, and that is not 
optimal in my opinion (for example, some instructions that were native in 
armv7ve are simulated in armv8).

One simplification I did come up with is the handling of thumb: I don't think 
it needs to be an option anymore, since its support is mandatory in armv8 (but 
I think that was also the case in armv7). That simplifies things a bit, but 
changes nothing fundamental; you still need to carry the support for the 
optional IPs around...
And in addition to what I proposed to support last year, we now also have to 
add armv8.1a, armv8.2a, armv8.3a, armv8.4a (so far...), which each have their 
own specifics and differences, making it unlikely that they can all be 
supported within a single file.

Thoughts? Can we talk this over, so we have a chance of getting good support 
for armv8-32 in OE, instead of everyone doing their own?

Cheers,
Herve

-----Original Message-----
From: openembedded-core-boun...@lists.openembedded.org 
[mailto:openembedded-core-boun...@lists.openembedded.org] On Behalf Of Koen Kooi
Sent: Tuesday, 12 June 2018 11:01
To: Randy Li <ay...@soulik.info>
Cc: OE-core <openembedded-core@lists.openembedded.org>
Subject: Re: [OE-core] [PATCH v2 0/4] Add tune for ARMv8 and some cortex 
processors



> On 9 Jun 2018, at 08:26, Randy Li <ay...@soulik.info> wrote:
> 
> I read the ARMv8 manual again, and it looks like hardware float is 
> mandatory for Linux distributions and toolchain libraries. Although some 
> cortex processors can be configured without FPU/NEON hardware, I 
> don't think they would be used with openembedded-core.
> 
> So I can assume that NEON (SIMD) will always be present, leaving only 
> the crc and crypto instructions as optional here.
> 
> 
> Randy Li (4):
>  arch-armv8a.inc: add tune include for armv8
>  tune-cortexa35: add tunes for ARM Cortex-A35
>  tune-cortexa32: add tunes for ARM Cortex-A32
>  tune-cortexa72: add tunes for ARM Cortex-A72

Having been forced to deal with the mess that’s 32-bit arm tunes: Let’s only 
add implementation-specific tunes *after* having seen conclusive, repeatable 
benchmark results. 90% of the 32-bit tune files are placebo effect and just 
explode the number of package archs in your distro feed. The goal of aarch64 
was to stop being different for the sake of being different, let’s not make a 
mess because we are used to messes.

regards,

Koen
--
_______________________________________________
Openembedded-core mailing list
Openembedded-core@lists.openembedded.org
http://lists.openembedded.org/mailman/listinfo/openembedded-core
