
Hi Gary,

>> Phil,
>>
>> On Wed, Apr 16, 2008 at 11:22 AM, Philip Mucci <[EMAIL PROTECTED]> wrote:
>>> Folks,
>>>
>>> hpcrun does its sampling inside the target process using first-person
>>> access, not a third-person ptrace() like pfmon, so the process is
>>> implicitly blocked when processing samples, i.e. there are no dropped
>>> samples unless something else has gone wrong.
>>>
>> Thanks for clarifying this. It makes more sense given how PAPI works.
>
> I think this means that I just lost the best explanation I have had so far
> as to why I see these inconsistencies.
>

Yes, you are correct. The blocking is not an issue in hpcrun.

>>
>>> Another thing, you cannot rely on the sample count of hpcrun to
>>> compute cycles. Why? Because those are only the samples that have not
>>> been dropped. If samples occur outside of the sample space (as can happen
>>> when one has floating point exceptions), the address will be in kernel
>>> space and it will be dropped. pfmon has no concept of filtering out
>>> addresses, so even if you ask for user-space samples, you'll still get
>>> samples in the output with kernel addresses. I'm not sure what the
>>> default is for your version of pfmon.
>>>
>> pfmon does not do filtering of samples. It relies on the hardware via
>> the privilege levels. By default, pfmon only measures at the user level.
>> That does not mean you won't get kernel-level samples because there
>> are boundary effects when sampling.
>>
>>> Which value is correct, according to /bin/time? 2 billion or 154 billion?
>>>
>> This is a valid point. Which value makes the most sense relative to time?
>
>
> Let me provide all the steps I use when running these tests.  Maybe I
> am just doing something wrong and you can correct my misunderstanding.
>
> When I use "time" with my hpcrun test, it provides this information:
>
> time hpcrun -e CPU_CYCLES:32767 -o hpcrun.data -- ./code.exe >hpcrun 2>hpcrun.debug
> real    1m44.921s
> user    1m39.490s
> sys     0m2.420s
>
> The output from the hpcprof run on all of the data files produced by this
> test shows the following summary information:
>
> Columns correspond to the following events [event:period (events/sample)]
>  CPU_CYCLES:32767 - CPU Cycles (29 samples) [not shown]
>  CPU_CYCLES:32767 - CPU Cycles (9 samples) [not shown]
>  CPU_CYCLES:32767 - CPU Cycles (29 samples) [not shown]
>  CPU_CYCLES:32767 - CPU Cycles (29 samples) [not shown]
>  CPU_CYCLES:32767 - CPU Cycles (41602 samples) [not shown]
>  CPU_CYCLES:32767 - CPU Cycles (24490 samples) [not shown]
>  CPU_CYCLES:32767 - CPU Cycles (7 samples) [not shown]
>  CPU_CYCLES:32767 - CPU Cycles (5 samples) [not shown]
>  CPU_CYCLES (min):32767 - CPU Cycles (The minimum for events of this type.) (1 samples)
>  CPU_CYCLES (max):32767 - CPU Cycles (The maximum for events of this type.) (66127 samples)
>  CPU_CYCLES (sum):32767 - CPU Cycles (Summed over all events of this type.) (66200 samples)
>

Gary, this executable must be multi-threaded, right? Does each of the
threads do the same amount of work? If so, the above is your clue. Most of
the threads are hitting the perfmon2 race where the signal comes in
but gets dropped, and thus monitoring is not restarted. PAPI from CVS
has fixes for this on perfmon2 platforms. Is this OS using
perfmon2 or the old 'monolithic' perfmon interface? If this is
perfmon1, then we may have an issue here. But PAPI-CVS handles this
properly for perfmon2 by using a real-time signal.
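
As a rough sanity check on the numbers quoted above, the sketch below
multiplies the sample counts by the sampling period and compares the result
with the cycle count implied by the "user" time. The 1600 MHz clock is an
assumption taken from the figure mentioned in this thread, and treating
each hpcprof column as one thread's data file is also an assumption. The
variable and function names are illustrative only.

```python
# Sample counts per data file, copied from the hpcprof summary quoted above.
# Assumption: each data file corresponds to one thread.
samples_per_file = [29, 9, 29, 29, 41602, 24490, 7, 5]

PERIOD = 32767        # events per sample, from "-e CPU_CYCLES:32767"
CLOCK_HZ = 1.6e9      # assumption: 1600 MHz nominal clock
USER_SECONDS = 99.49  # "user 1m39.490s" as reported by time


def cycles_from_samples(samples, period):
    """Estimate an event count as (number of samples) * (sampling period)."""
    return samples * period


total_samples = sum(samples_per_file)
sampled_cycles = cycles_from_samples(total_samples, PERIOD)
timed_cycles = USER_SECONDS * CLOCK_HZ

print(f"total samples:         {total_samples}")
print(f"cycles from samples:   {sampled_cycles:.3e}")
print(f"cycles from user time: {timed_cycles:.3e}")
print(f"fraction captured:     {sampled_cycles / timed_cycles:.2%}")

# Show how unevenly the samples are spread across the data files.
for i, n in enumerate(samples_per_file):
    print(f"file {i}: {n:6d} samples ({n / total_samples:.2%} of total)")
```

Under these assumptions the samples account for roughly 2.17 billion
cycles, while the user time implies roughly 159 billion, and almost all of
the samples come from just two of the eight data files.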

Judging from the other numbers you have below, I'd guess that if you
set the sample rate to something much lower (which is certainly
reasonable; a period of 32767 is awfully small for a 1600 MHz processor),
you'd get more reasonable results. Experience (from Mark and the Rice
folks) has shown that this signal dropping is much less likely to
happen when the interrupt load is low.

So, I reckon if you set the sample period to 16,000,000 (approximately
100 samples/second), you'll get answers that match up.
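
The interrupt rates behind that suggestion can be worked out with a short
sketch like the one below. It assumes a 1600 MHz clock and a thread that is
busy counting CPU_CYCLES the whole time. The function name is made up for
illustration.

```python
CLOCK_HZ = 1.6e9  # assumption: 1600 MHz nominal clock


def samples_per_second(period, clock_hz=CLOCK_HZ):
    """Overflow interrupts per second for a busy thread counting cycles."""
    return clock_hz / period


for period in (32767, 16_000_000):
    rate = samples_per_second(period)
    print(f"period {period:>10,d}: about {rate:,.0f} samples/second")
```

With these assumptions, a period of 32767 works out to nearly 49,000
interrupts per second per busy thread, while 16,000,000 gives 100 per
second.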

Phil

P.S. Please get back to me on which version of PAPI and perfmon kernel  
support you have.


_______________________________________________
perfmon2-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel