Stephane "stephane eranian" <[EMAIL PROTECTED]> wrote on 05/13/2008 10:43:42 PM:
> Gary,
>
> On Wed, May 14, 2008 at 2:07 AM, <[EMAIL PROTECTED]> wrote:
> > Columns correspond to the following events [event:period (events/sample)]
> >   CPU_OP_CYCLES_ALL:32767 - CPU Operating Cycles -- All CPU cycles counted
> >     (35 samples)
> >   CPU_OP_CYCLES_ALL:32767 - CPU Operating Cycles -- All CPU cycles counted
> >     (0 samples)
> >   CPU_OP_CYCLES_ALL:32767 - CPU Operating Cycles -- All CPU cycles counted
> >     (38 samples)
> >   CPU_OP_CYCLES_ALL (min):32767 - CPU Operating Cycles -- All CPU cycles
> >     counted (The minimum for events of this type.) (0 samples) [not shown]
> >   CPU_OP_CYCLES_ALL (max):32767 - CPU Operating Cycles -- All CPU cycles
> >     counted (The maximum for events of this type.) (63 samples) [not shown]
> >   CPU_OP_CYCLES_ALL (sum):32767 - CPU Operating Cycles -- All CPU cycles
> >     counted (Summed over all events of this type.) (73 samples) [not shown]
> >
> > The multiplication of samples times period (73 * 32767) gives me:
> >
> >   2,391,991 CPU cycles used
> >
>
> This is not a valid way of computing the number of cycles. It assumes
> you monitored the entire execution of your test program. However, this
> is not necessarily true for non self-monitoring programs.

I am using hpcrun for this test, which is based on PAPI. My understanding
is that this makes it a self-monitoring program, so it should not have any
blind spots. I understand that blind spots will occur for non
self-monitoring programs, but I would hope that even in that environment
they do not account for over 99 percent of the samples (roughly 2 million
vs. 26,045 million cycles). I am not concerned here with results that
differ by 5 or 10 percent; these differ by a factor of about 13,000.

> There are blind spots when the sampling buffer fills up. This is due
> to the fact that the default sampling buffer does not use a
> double-buffering technique (a double buffer format is planned). Thus
> when it fills up, monitoring stops but execution continues by default
> unless you've used --overflow-block in pfmon or the equivalent context
> flag is set by hpcrun/PAPI.
>
> Also you may want to use --follow-all for pfmon as this option also
> follows across exec, which is what happens with system().
>

I will use --follow-all for future tests. I reran the test that way and it
did give a slightly higher result. The work being done in the system()
call for this test is trivial and therefore not really a factor in the
final result.

Thanks for the feedback. I think the HPCToolkit and PAPI folks have
identified why this problem is happening (fork/exec issues in PAPI). The
effort now is to find the best way to resolve the issue.

Gary
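
For anyone following the arithmetic in this thread, here is a minimal sketch of the samples-times-period estimate discussed above. It is not how hpcrun, PAPI, or pfmon compute anything internally. The variable and function names are hypothetical, and the only inputs are the figures quoted in the message (73 summed samples, a period of 32767, and the 26,045 million cycle figure used for comparison).

```python
# Figures quoted in the message; all names here are illustrative only.
SAMPLING_PERIOD = 32767             # events per sample for CPU_OP_CYCLES_ALL
SUMMED_SAMPLES = 73                 # "(sum)" column from the profile output
REFERENCE_CYCLES = 26_045_000_000   # the "26,045 million cycles" figure


def estimate_events(samples: int, period: int) -> int:
    """Estimate an event count as samples * period.

    This is only meaningful if monitoring covered the whole run. Any
    samples lost to blind spots make the result an underestimate.
    """
    return samples * period


estimated_cycles = estimate_events(SUMMED_SAMPLES, SAMPLING_PERIOD)
shortfall = REFERENCE_CYCLES / estimated_cycles

print(f"estimated cycles: {estimated_cycles:,}")   # 2,391,991 as in the message
print(f"reference / estimate: {shortfall:,.0f}x")
```

The estimate reproduces the 2,391,991 figure, and the ratio against the reference count is on the order of 10^4, which is far larger than a full sampling buffer alone would be expected to explain.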
