On 9/17/19 6:55 AM, Wilco Dijkstra wrote:
> Hi Kyrill,
>
>>> When you select a CPU the goal is that we optimize and schedule for that
>>> specific microarchitecture. That implies using atomics that work best for
>>> that core rather than outlining them.
>>
>> I think we want to go ahead with this framework to enable the portable
>> deployment of LSE atomics.
>>
>> More CPU-specific fine-tuning can come later separately.
>
> I'm not talking about CPU-specific fine-tuning, but ensuring we don't penalize
> performance when a user selects the specific CPU their application will run
> on.
> And in that case outlining is unnecessary.

From aarch64_override_options:

Given both -march=foo -mcpu=bar, then the architecture will be foo and
-mcpu will be treated as -mtune=bar, but will not use any insn not in foo.

Given only -mcpu=foo, then the architecture will be the one supported by foo.

So if foo supports LSE, then we will not outline the functions, no matter how
we arrive at foo.


r~
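
For illustration only, the rules above can be modelled in a few lines of standalone C. This is a sketch, not the code in aarch64_override_options: the names target, selection, resolve, outline_atomics and demo are made up here, and the -march-only and no-option branches are assumptions that the message does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an -march or -mcpu argument; not a GCC type. */
struct target {
    const char *name;
    bool has_lse;               /* true if the target provides LSE atomics */
};

/* Result of option resolution in this sketch. */
struct selection {
    const struct target *arch;  /* decides which insns may be used */
    const struct target *tune;  /* decides tuning and scheduling only */
};

/* march or mcpu is NULL when the corresponding option was not given. */
static struct selection resolve(const struct target *march,
                                const struct target *mcpu)
{
    struct selection s = { NULL, NULL };

    if (march && mcpu) {
        /* -march=foo -mcpu=bar: arch is foo, bar acts like -mtune=bar. */
        s.arch = march;
        s.tune = mcpu;
    } else if (mcpu) {
        /* Only -mcpu=foo: arch is the one supported by foo. */
        s.arch = mcpu;
        s.tune = mcpu;
    } else if (march) {
        /* Assumption: -march alone sets the arch; tuning is left unset. */
        s.arch = march;
    }
    return s;
}

/* Outline the atomics unless the resolved architecture has LSE.
   Assumption: with no arch selected, this sketch treats LSE as absent. */
static bool outline_atomics(const struct selection *s)
{
    return !(s->arch != NULL && s->arch->has_lse);
}

/* Usage example with two made-up targets. */
void demo(void)
{
    const struct target foo = { "foo", true };   /* has LSE */
    const struct target bar = { "bar", false };  /* no LSE */
    struct selection s;

    s = resolve(NULL, &foo);    /* -mcpu=foo */
    assert(!outline_atomics(&s));

    s = resolve(&foo, &bar);    /* -march=foo -mcpu=bar */
    assert(!outline_atomics(&s));

    s = resolve(&bar, &foo);    /* -march=bar -mcpu=foo: foo only tunes */
    assert(outline_atomics(&s));
}
```

The point of the model is that only the resolved architecture feeds the outlining decision; the tuning target never does, which matches "no matter how we arrive at foo".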