> And then what? Parsing the format string will still be there. Perhaps we
> could have a fast path for simple cases.
>
> This is first line of profile of the first function (format_decode)
>
>        │ static noinline_for_stack
>        │ int format_decode(const char *fmt, struct printf_spec *spec)
>        │ {
>  10.38 │   push   %rbp          <===
>   1.07 │   mov    %rsp,%rbp
>   1.09 │   push   %r12
>   4.51 │   mov    %rsi,%r12
>   1.40 │   push   %rbx
>   1.86 │   mov    %rdi,%rbx
>        │   sub    $0x8,%rsp
>
> It is so bloated that gcc needs to be asked to not screw up with stack
> size.

What happens when you drop all the noinlines for this? I assume this
would already make it faster. And now that we have bigger stacks we can
likely tolerate it.

-Andi

--
a...@linux.intel.com -- Speaking for myself only.