On Sat, Aug 06, 2016 at 08:16:27PM -0700, Andi Kleen wrote:
> Alexey Dobriyan <adobri...@gmail.com> writes:
> > -
> > +   seq_printf(m, "State:\t%s", get_task_state(p));
> > +
> > +   seq_puts(m, "\nTgid:\t");
> 
> The only difference should be the format string.
> 
> Scanning the format string really shouldn't be that expensive?!?

Surprise, it is (see my reply to Al).

What seq_put_decimal_ull() did is the equivalent of

        seq << "foo";
        seq << bar;
        seq << '\n';

No precision, no widths, no padding, no upper- or lowercasing.
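
To make that concrete, here is a rough userspace sketch of the "just append it" style. It is not the kernel's seq_file code: struct sketch_buf, sketch_puts() and sketch_put_decimal_ull() are made-up names for illustration, and the fixed 128-byte buffer with silent truncation is an assumption. The point is only that nothing here ever looks at a format string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct seq_file: one fixed buffer. */
struct sketch_buf {
	char data[128];
	size_t len;
};

/* Append a literal string; silently truncates when the buffer is full. */
static void sketch_puts(struct sketch_buf *b, const char *s)
{
	size_t n = strlen(s);
	size_t room;

	assert(b->len < sizeof(b->data));
	room = sizeof(b->data) - 1 - b->len;
	if (n > room)
		n = room;
	memcpy(b->data + b->len, s, n);
	b->len += n;
	b->data[b->len] = '\0';
}

/* Append an unsigned decimal: no format string, no width, no padding. */
static void sketch_put_decimal_ull(struct sketch_buf *b, unsigned long long v)
{
	char tmp[21];	/* 20 digits for 2^64 - 1, plus the NUL */
	char *p = tmp + sizeof(tmp) - 1;

	*p = '\0';
	do {
		/* Digits are produced least significant first, so fill backwards. */
		*--p = (char)('0' + v % 10);
		v /= 10;
	} while (v);
	sketch_puts(b, p);
}
```

With a zero-initialized struct sketch_buf, emitting a line is then three calls: sketch_puts(&b, "Tgid:\t"), sketch_put_decimal_ull(&b, tgid), sketch_puts(&b, "\n").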

> It would be better if you could find out why that is slow and optimize
> it. Then you would benefit every seq_printf user, not just this
> special case.
> 
> Perhaps it could benefit from some of the bit masking tricks to
> scan the string with wider tests than a word.

And then what? Parsing the format string will still be there.

This is the start of the profile of the first function (format_decode):

       │     static noinline_for_stack
       │     int format_decode(const char *fmt, struct printf_spec *spec)
       │     {
 10.38 │       push   %rbp                      <===
  1.07 │       mov    %rsp,%rbp
  1.09 │       push   %r12
  4.51 │       mov    %rsi,%r12
  1.40 │       push   %rbx
  1.86 │       mov    %rdi,%rbx
       │       sub    $0x8,%rsp

It is so bloated that gcc has to be told (noinline_for_stack) not to
inline it, to keep stack usage down.
