Stephane

"stephane eranian" <[EMAIL PROTECTED]> wrote on 05/13/2008 10:43:42 PM:

> Gary,
>
> On Wed, May 14, 2008 at 2:07 AM,  <[EMAIL PROTECTED]> wrote:
> >  Columns correspond to the following events
> >  [event:period (events/sample)]
> >   CPU_OP_CYCLES_ALL:32767 - CPU Operating Cycles -- All CPU cycles
> >   counted (35 samples)
> >   CPU_OP_CYCLES_ALL:32767 - CPU Operating Cycles -- All CPU cycles
> >   counted (0 samples)
> >   CPU_OP_CYCLES_ALL:32767 - CPU Operating Cycles -- All CPU cycles
> >   counted (38 samples)
> >   CPU_OP_CYCLES_ALL (min):32767 - CPU Operating Cycles -- All CPU
> >   cycles counted (The minimum for events of this type.) (0 samples)
> >   [not shown]
> >   CPU_OP_CYCLES_ALL (max):32767 - CPU Operating Cycles -- All CPU
> >   cycles counted (The maximum for events of this type.) (63 samples)
> >   [not shown]
> >   CPU_OP_CYCLES_ALL (sum):32767 - CPU Operating Cycles -- All CPU
> >   cycles counted (Summed over all events of this type.) (73 samples)
> >   [not shown]
> >
> >  The multiplication of samples times period (73 * 32767) gives me:
> >
> >  2,391,991 Cpu cycles used
> >
>
> This is not a valid way of computing the number of cycles. It assumes
> you did monitor the entire
> execution of your test program. However, this is not necessarily true
> for non self-monitoring programs.

I am using hpcrun for this test, which is based on PAPI.  My understanding
is that this makes it a self-monitoring program, so it should not have any
blind spots.  I understand that blind spots will occur for non
self-monitoring programs, but I would hope that even in that environment
they do not account for over 99 percent of the samples (2 million vs
26,045 million cycles).
I am not concerned here with results that differ by 5 or 10 percent;
these differ by a factor of about 13,000.
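
As a rough check of the arithmetic above, here is a minimal Python sketch.
It only restates the numbers already quoted in this message (73 summed
samples, a period of 32767, and the 26,045 million cycle figure); the
function and variable names are made up for illustration and are not part
of hpcrun, PAPI, or pfmon. With the unrounded sample-based estimate the
ratio comes out closer to 11,000; rounding the estimate down to 2 million
gives the "about 13,000" figure. Either way the gap is around four orders
of magnitude.

```python
# Hypothetical helper names; the numbers are the ones quoted in this message.
SAMPLING_PERIOD = 32767  # events per sample, from CPU_OP_CYCLES_ALL:32767


def estimated_cycles(samples, period=SAMPLING_PERIOD):
    """Estimate an event count as (number of samples) * (sampling period).

    This is only meaningful if monitoring covered the whole run; any
    blind spot makes the estimate too low.
    """
    return samples * period


# Summed sample count reported by hpcrun in the quoted output.
sampled_estimate = estimated_cycles(73)

# The other cycle count quoted in this message ("26,045 million cycles").
reference_cycles = 26_045 * 1_000_000

# How far apart the two figures are.
ratio = reference_cycles / sampled_estimate
missing_fraction = 1 - sampled_estimate / reference_cycles

print(f"estimate from samples: {sampled_estimate:,} cycles")
print(f"reference / estimate:  {ratio:,.0f}x")
print(f"fraction unaccounted:  {missing_fraction:.4%}")
```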

> There are blind spots when the sampling buffer fills up. This is due
> to the fact that the default sampling
> buffer does not use a double-buffering technique (a double buffer
> format is planned). Thus when it fills up,
> monitoring stops but execution continues by default unless you've used
> --overflow-block in pfmon or the
> equivalent context flag is set by hpcrun/PAPI.
>
> Also you may want to use --follow-all for pfmon as this option also
> follows across exec which is what
> happens with system().
>

I will use --follow-all for future tests.  I reran the test that way and
it did give a slightly higher result. The work being done in the system()
call for this test is trivial and therefore not really a factor in the
final result.

Thanks for the feedback.  I think the HPCToolkit and PAPI folks have
identified why this problem is happening (fork/exec issues in PAPI). The
effort now is to find the best way to resolve the issue.

Gary

