On 06/28/2017 06:53 PM, Milian Wolff wrote:
On Wednesday, June 28, 2017 5:18:08 AM CEST Taeung Song wrote:
Hi,

The --source-only option and new source code TUI view can show
the result of performance analysis based on full source code per
symbol(function). (Namhyung Kim told me this idea and it was also requested
by others some time ago..)

If someone wants to see the cause, he/she will need to dig into the asm.
But before that, looking at the source level can give a hint or clue
for the problem.

For example, if target symbol is 'hex2u64' of util/util.c,
the output is like below.

     $ perf annotate --source-only --stdio -s hex2u64
  Percent |      Source code of util.c for cycles:ppp (42 samples)
-----------------------------------------------------------------
     0.00 : 354   * While we find nice hex chars, build a long_val.
     0.00 : 355   * Return number of chars processed.
     0.00 : 356   */
     0.00 : 357  int hex2u64(const char *ptr, u64 *long_val)
     2.38 : 358  {
     2.38 : 359          const char *p = ptr;
     0.00 : 360          *long_val = 0;
     0.00 : 361
    30.95 : 362          while (*p) {
    23.81 : 363                  const int hex_val = hex(*p);
     0.00 : 364
    14.29 : 365                  if (hex_val < 0)
     0.00 : 366                          break;
     0.00 : 367
    26.19 : 368                  *long_val = (*long_val << 4) | hex_val;
     0.00 : 369                  p++;
     0.00 : 370          }
     0.00 : 371
     0.00 : 372          return p - ptr;
     0.00 : 373  }

And I added many perf developers into Cc: because I want to listen to your
opinions about this new feature, if you don't mind.

If you give some feedback, I'd appreciate it! :)

Thanks Taeung,

I requested this feature some time ago and it's really cool to see someone
step up and implement it - much appreciated!

Thank you so much, Milian !! :)


I just tested it out on my pet-example that leverages C++ instead of C:

~~~~~
#include <complex>
#include <cmath>
#include <random>
#include <iostream>

using namespace std;

int main()
{
     uniform_real_distribution<double> uniform(-1E5, 1E5);
     default_random_engine engine;
     double s = 0;
     for (int i = 0; i < 10000000; ++i) {
         s += norm(complex<double>(uniform(engine), uniform(engine)));
     }
     cout << s << '\n';
     return 0;
}
~~~~~

Compile it with:

g++ -O2 -g -std=c++11 test.cpp -o test

Then record it with perf:

perf record --call-graph dwarf ./test

Then analyse it with `perf report`. You'll see one entry for main with
something like:

+  100.00%    39.69%  cpp-inlining  cpp-inlining      [.] main

Select it and annotate it, then switch to your new source-only view:

main  test.cpp
        │  30
        │  31    using namespace std;
        │  32
        │  33    int main()
        │+ 34    {
        │  35        uniform_real_distribution<double> uniform(-1E5, 1E5);
        │  36        default_random_engine engine;
        │+ 37        double s = 0;
        │+ 38        for (int i = 0; i < 10000000; ++i) {
   4.88 │+ 39            s += norm(complex<double>(uniform(engine),
uniform(engine)));
        │  40        }
        │  41        cout << s << '\n';
        │  42        return 0;
        │+ 43    }

Note: the line numbers are off b/c my file contains a file-header on-top.
Ignore that.

Note2: There is no column header shown, so it's unclear what the first column
represents.

Note 3: report showed 39.69% self cost in main, 100.00% inclusive. annotate
shows 4.88... What is that?

What this shows, is that it's extremely important to visualize inclusive cost
_and_ self cost in this view. Additionally, we need to account for inlining.
Right now, we only see the self cost that is directly within main, I suspect.
For C++ this is usually very misleading, and basically makes the annotate view
completely useless for application-level profiling. If a second column would
be added with the inclusive cost with the ability to drill down, then I could
easily see myself using this view.

I would appreciate if you could take this into account.

Thanks a lot



Sure, I got it.
I'll investigate this weird case and recheck this patchset based on your comments,
and then I'll reply again. :)

Thanks,
Taeung

Reply via email to