On 11/03/2021 19.02, Linus Torvalds wrote:
> On Wed, Mar 10, 2021 at 5:45 PM Rasmus Villemoes
> <li...@rasmusvillemoes.dk> wrote:
>>
>> Hm, gcc does elide the test of the return value, but jumps back to a
>> place where it always loads state from its memory location and does the
>> whole switch(). To get it to jump directly to the code implementing the
>> various do_* helpers it seems one needs to avoid that global variable
>> and instead return the next state explicitly. The below boots, but I
>> still can't see any measurable improvement on ppc.
>
> Ok. That's definitely the right way to do efficient statemachines that
> the compiler can actually generate ok code for, but if you can't
> measure the difference I guess it isn't even worth doing.

Just for good measure, I now got around to testing on x86 as well, where
I thought the speculation stuff might make a difference. However, the
indirect calls through the actions[] array don't actually hurt due to
__noinitretpoline, and even removing that from the __init definition, I
only see about a 1.5% difference with that state machine patch applied.
So it doesn't seem worth pursuing.

I'll send v3 of the async patches shortly.

Rasmus
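
For readers following along, here is a minimal sketch of the "return the next state explicitly" shape discussed above. It is not the patch from this thread: the states, struct parser, the do_* bodies and run_machine() are all made up for illustration. The point is only that the current state lives in a local variable and each helper hands back its successor, so the compiler is at least in a position to jump from the end of one helper straight to the next. Whether it actually does so depends on the compiler and options.

```c
#include <stddef.h>

/* Hypothetical states; not the ones used in the actual kernel code. */
enum state { ST_START, ST_HEADER, ST_BODY, ST_DONE };

/* Hypothetical parser context, passed explicitly instead of via globals. */
struct parser {
	const char *p;            /* current input position */
	size_t left;              /* bytes remaining */
	unsigned int headers;     /* 'H' bytes consumed */
	unsigned int body_bytes;  /* all other bytes consumed */
};

/* Each helper returns the next state rather than writing a global. */
static enum state do_start(struct parser *ps)
{
	if (!ps->left)
		return ST_DONE;
	return *ps->p == 'H' ? ST_HEADER : ST_BODY;
}

static enum state do_header(struct parser *ps)
{
	ps->p++;
	ps->left--;
	ps->headers++;
	return ST_START;
}

static enum state do_body(struct parser *ps)
{
	ps->p++;
	ps->left--;
	ps->body_bytes++;
	return ST_START;
}

/*
 * Driver: the state is a local, so the compiler does not have to reload
 * it from memory before every dispatch.
 */
static void run_machine(struct parser *ps)
{
	enum state st = ST_START;

	while (st != ST_DONE) {
		switch (st) {
		case ST_START:
			st = do_start(ps);
			break;
		case ST_HEADER:
			st = do_header(ps);
			break;
		case ST_BODY:
			st = do_body(ps);
			break;
		case ST_DONE:
			break;
		}
	}
}
```

Calling run_machine() on a struct parser initialised with the four bytes "HBBH" consumes all input and leaves headers == 2 and body_bytes == 2. As the numbers in this thread suggest, the gain over a table of function pointers indexed by a global state can be too small to measure, so this is a matter of code shape more than a guaranteed speedup.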