On 2/21/15 2:44 PM, Benjamin Moody wrote:
Hi,

Try using a program with established results, e.g., this one from
PeterZ years ago. On Intel, 'perf stat -e instructions a.out' should show
~1 billion (a wee bit more, but clearly in the 1b range).

That program itself gave the expected results, but it's a good
suggestion.  When I tried changing that program to also fork a bunch
of child processes, I started getting similarly weird results to what
I had seen before.

This is with the 1-billion-instruction program sent earlier:

$ ./myperfstat ./1bi
Total instructions: 1000087491

$ perf stat -e instructions:u ./1bi

 Performance counter stats for './1bi':

     1,000,087,608      instructions:u

       0.285943713 seconds time elapsed

Using the second version of this program:

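#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>     /* fork() */
#include <sys/wait.h>   /* wait() */

int main(void)
{
        int i;

        /* Two forks in a row: four processes run the loop below. */
        fork();
        fork();

        for (i = 0; i < 100000000; i++) {
                asm("nop");
                asm("nop");
                asm("nop");
                asm("nop");
                asm("nop");
                asm("nop");
                asm("nop");
        }
        /* Reap any children; extra calls just fail with ECHILD. */
        wait(NULL);
        wait(NULL);
        wait(NULL);
        wait(NULL);
        return 0;
}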

This one spawns child processes, so you have
    perf -> child -> gchild -> ggchild
               |---> gchild

So it has the grandparent characteristics you are getting at.
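
As a quick sanity check on that tree, here is a minimal sketch that
counts how many processes exist after two back-to-back fork() calls.
The helper name count_processes_after_two_forks() is made up for this
example and is not part of either test program; each process reports
in by writing one byte to a shared pipe, and a failed fork() simply
lowers the count.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork twice, have every resulting process write
 * one byte to a shared pipe, and return the number of bytes seen by
 * the original process. Returns -1 if the pipe cannot be created.
 */
static int count_processes_after_two_forks(void)
{
    int pipefd[2];
    pid_t top = getpid();
    char c = 'x';
    int ok, n = 0;

    if (pipe(pipefd) < 0)
        return -1;

    /* Same pattern as the test program: both forks run in every process. */
    fork();
    fork();

    /* Every process reports in, then drops its write end. */
    ok = (write(pipefd[1], &c, 1) == 1);
    close(pipefd[1]);

    /* Reap direct children so nothing is left running behind us. */
    while (wait(NULL) > 0)
        ;

    /* Descendants stop here; only the original process counts. */
    if (getpid() != top)
        _exit(ok ? 0 : 1);

    /* All write ends are closed by now, so read() hits EOF after the bytes. */
    while (read(pipefd[0], &c, 1) == 1)
        n++;
    close(pipefd[0]);
    return n;
}
```

Calling count_processes_after_two_forks() should return 4 when both
forks succeed, which lines up with the roughly 4x instruction count
for 1bi4 versus 1bi.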

$ ./myperfstat ./1bi4
Total instructions: 4000094342

$ perf stat -e instructions:u ./1bi4

 Performance counter stats for './1bi4':

     4,000,094,649      instructions:u

       0.259537842 seconds time elapsed

Still seems to match up.

This is on:
$ uname -r
3.18.7-200.fc21.x86_64


Looking more closely at the behavior of perf, I think the essential
difference is that perf uses the *child* PID as argument to
perf_event_open.  When I tried changing my program to do the same, it
seems to have fixed the problem.  That is to say, where I used
something like

And I did not modify the mystat.c program to do this:


    fd = perf_event_open(..., getpid(), ...);
    child = fork();
    if (child == 0) {
        execvp(...);
    }

I needed to instead use

    pipe(pipefd);
    child = fork();
    if (child == 0) {
        read(pipefd[0], &x, 1);
        execvp(...);
    }
    fd = perf_event_open(..., child, ...);
    write(pipefd[1], &x, 1);
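
For reference, a fuller sketch of that second pattern, assuming a
user-space-only instructions counter with inherit set so forked
descendants are included. The helper names (sys_perf_event_open,
spawn_gated, open_instructions_counter, run_counted) are invented for
this example and are not taken from myperfstat or from perf itself.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* glibc provides no wrapper for perf_event_open, so use syscall(2). */
static int sys_perf_event_open(struct perf_event_attr *attr, pid_t pid,
                               int cpu, int group_fd, unsigned long flags)
{
    return (int)syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/*
 * Fork a child that blocks on a pipe before exec'ing argv. The parent
 * gets the write end in *release_fd; writing a byte (or closing it)
 * lets the child continue.
 */
static pid_t spawn_gated(char *const argv[], int *release_fd)
{
    int pipefd[2];
    char x = 0;
    pid_t child;

    if (pipe(pipefd) < 0)
        return -1;
    child = fork();
    if (child < 0) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }
    if (child == 0) {
        close(pipefd[1]);
        if (read(pipefd[0], &x, 1) < 0)   /* wait for the parent */
            _exit(126);
        close(pipefd[0]);
        execvp(argv[0], argv);
        _exit(127);                       /* exec failed */
    }
    close(pipefd[0]);
    *release_fd = pipefd[1];
    return child;
}

/* Open a user-space instructions counter on pid (assumed attr settings). */
static int open_instructions_counter(pid_t pid)
{
    struct perf_event_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = PERF_TYPE_HARDWARE;
    attr.config = PERF_COUNT_HW_INSTRUCTIONS;
    attr.inherit = 1;          /* also count processes forked later */
    attr.exclude_kernel = 1;   /* user space only, like instructions:u */
    attr.exclude_hv = 1;
    return sys_perf_event_open(&attr, pid, -1, -1, 0);
}

/*
 * Run argv with the counter attached to the child PID before exec.
 * *count is set to -1 if the counter could not be opened or read.
 */
static int run_counted(char *const argv[], long long *count, int *status)
{
    int release_fd, fd;
    char x = 0;
    ssize_t w;
    pid_t child = spawn_gated(argv, &release_fd);

    if (child < 0)
        return -1;
    fd = open_instructions_counter(child);  /* attach while child waits */
    w = write(release_fd, &x, 1);           /* let the child exec */
    (void)w;
    close(release_fd);
    waitpid(child, status, 0);

    *count = -1;
    if (fd >= 0) {
        if (read(fd, count, sizeof(*count)) != (ssize_t)sizeof(*count))
            *count = -1;
        close(fd);
    }
    return 0;
}
```

A caller would pass something like { "./1bi4", NULL } as argv and
print *count afterwards. Because the counter is enabled as soon as it
is opened, the total also includes the few user-space instructions the
child executes between being released and the exec.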
