> And then what? Parsing the format string will still be there.
Perhaps we could have a fast path for simple cases.
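
A rough sketch of what such a fast path might look like, in user-space C
rather than kernel code. The names fmt_is_simple(), put_char(),
sketch_vsnprintf() and sketch_snprintf() are made up for illustration, and
treating "simple" as literal text, "%%" and plain "%s" is an assumption,
not something taken from a profile. Everything else falls back to the full
parser, which here is just the libc vsnprintf().

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Return 1 if fmt uses only literal text, "%%" and plain "%s". */
static int fmt_is_simple(const char *fmt)
{
	for (; *fmt; fmt++) {
		if (*fmt != '%')
			continue;
		fmt++;
		if (*fmt != 's' && *fmt != '%')
			return 0;	/* flags, width, other types, or a stray '%' */
	}
	return 1;
}

/* Store c only if it fits, but always count it, as snprintf() does. */
static void put_char(char *buf, size_t size, size_t *pos, char c)
{
	if (*pos + 1 < size)
		buf[*pos] = c;
	(*pos)++;
}

/* Hypothetical wrapper: fast path for simple formats, libc otherwise. */
static int sketch_vsnprintf(char *buf, size_t size, const char *fmt,
			    va_list args)
{
	size_t pos = 0;

	if (!fmt_is_simple(fmt))
		return vsnprintf(buf, size, fmt, args);	/* slow path */

	for (; *fmt; fmt++) {
		if (*fmt != '%') {
			put_char(buf, size, &pos, *fmt);
			continue;
		}
		fmt++;
		if (*fmt == '%') {
			put_char(buf, size, &pos, '%');
		} else {	/* fmt_is_simple() guarantees this is 's' */
			const char *s = va_arg(args, const char *);

			if (!s)
				s = "(null)";
			while (*s)
				put_char(buf, size, &pos, *s++);
		}
	}

	/* Terminate inside the buffer even when the output was truncated. */
	if (size)
		buf[pos < size ? pos : size - 1] = '\0';
	return (int)pos;
}

static int sketch_snprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int n;

	va_start(args, fmt);
	n = sketch_vsnprintf(buf, size, fmt, args);
	va_end(args);
	return n;
}
```

Note that the pre-scan in fmt_is_simple() walks the format string a second
time, so whether this wins anything would have to be measured.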
>
> This is the start of the profile of the first function (format_decode):
>
>        │ static noinline_for_stack
>        │ int format_decode(const char *fmt, struct printf_spec *spec)
>        │ {
>  10.38 │   push   %rbp          <===
>   1.07 │   mov    %rsp,%rbp
>   1.09 │   push   %r12
>   4.51 │   mov    %rsi,%r12
>   1.40 │   push   %rbx
>   1.86 │   mov    %rdi,%rbx
>        │   sub    $0x8,%rsp
>
> It is so bloated that gcc needs to be asked not to screw up the stack
> size.
What happens when you drop all the noinline_for_stack annotations here?
I assume that alone would already make it faster. And now that we have
bigger stacks we can likely tolerate the extra stack usage.
-Andi
--
[email protected] -- Speaking for myself only.