Stephane,

Well, you guessed right on both counts. /proc/perfmon shows the version is 2.0.
The pfmon command looks like this:

  pfmon --debug -v -e CPU_CYCLES ./code.exe >pfmon 2>pfmon.debug

The hpcrun command looks like this:

  hpcrun -e CPU_CYCLES:32767 -o hpcrun.data -- ./code.exe >hpcrun 2>hpcrun.debug

So in both cases I run it as a tool that monitors another process (it does not
use self-monitoring). I fail to see how this can account for the differences,
but I am very interested in the explanation.

As far as I know, this particular customer application is the only one we have
found that produces inconsistent results. All other executables that I have run
these tools against seem to produce counts for CPU_CYCLES that agree closely
between the two tools.

Please tell me more.

[EMAIL PROTECTED] wrote on 04/12/2008 12:05:36 AM:

> Gary,
>
> I suspect you are running the stock perfmon as shipped with 2.6.18,
> i.e., v2.0.
> You can find out in /proc/perfmon.
>
> I would need the cmdline options used for pfmon.
>
> As for HPCRUN, I would need to know how this is run. In particular
> whether this is a self-monitoring run or just like pfmon, a tool
> monitoring another thread.
> I suspect the latter which could explain the differences you are seeing.
>
> On Fri, Apr 11, 2008 at 8:04 PM, <[EMAIL PROTECTED]> wrote:
> > Stephane
> >
> > Our system is running:
> >
> > MODEL ia64 [type=ia64]
> > CPU 8 x Itanium 2, 64 bits 1600.000442 Mhz
> > MEM 8219456 kB real memory
> > OS Bull Linux Advanced Server release 4 (V5) - kernel 2.6.18-B64k.1.7
> >
> > This kernel is based on the 2.6.18 kernel but has Bull specific patches
> > included in it.
> >
> > Since perfmon is included in the kernel I do not know how to find its
> > version. I would expect that we are running the one that comes with the
> > 2.6.18 kernel. If you can tell me how to find a version for perfmon I
> > will get it for you. In addition if you can provide me with a list of
> > the modules that make up perfmon, I can check to see if Bull has made
> > any patches to those modules.
> > I know that we have not yet installed the perfmon2 kernel patches. This
> > is on our list to try but has not been done yet.
> >
> > The value of 154 billion CPU_CYCLES is the approximate value reported by
> > PFMON in its stdout.
> >
> > The value of 2 billion is the approximate result when I multiply the
> > total number of samples reported by HPCPROF (about 68000) by the
> > sampling period used in the HPCRUN (32767). As a point of interest,
> > /proc/interrupts also shows that about 68000 perfmon interrupts occurred
> > during the HPCRUN.
> >
> > I will send the kernel debug data for both the PFMON and HPCRUN tests to
> > your googlemail account in a separate email.
> >
> > At this point if you can just point me in the right direction and
> > suggest some things to look for I will be a happy camper.
> >
> > Thanks
> >
> > Gary
> >
> > "stephane eranian" <[EMAIL PROTECTED]> wrote on 04/10/2008 12:23:22 PM:
> >
> > > Gary,
> > >
> > > On Wed, Apr 9, 2008 at 1:18 AM, <[EMAIL PROTECTED]> wrote:
> > > >
> > > > I have a customer who has an application that when run under pfmon
> > > > reports 154 billion CPU_CYCLES used (appears to be a reasonable
> > > > value). When this same application is run under Hpcrun (from
> > > > HPCToolkit using PAPI) it only reports about 2 billion CPU_CYCLES
> > > > used. These tests are run on an Intel IA64 platform.
> > > >
> > > You need to tell me which kernel version, which perfmon version.
> > >
> > > Also how did you calculate those 2 numbers? Was this simple counting
> > > or derived from the samples you are getting?
> > >
> > > The 'losing interrupts' should not affect you because it is related to
> > > handling of signals in multi-threaded programs.
> > >
> > > As for the logs, mail them to me directly.
> > >
> > > Thanks.
> > >
> > > > This application runs as a single thread and does not set a signal
> > > > handler or mask the SIGIO signal.
> > > > Hpcrun produces 8 data output files when run on this application:
> > > > one for the application itself, 4 for bash scripts the application
> > > > runs, 2 for 'rm' commands the application executes and 1 for a gzip
> > > > command it runs.
> > > >
> > > > The customer wants to know why Hpcrun only reports a little over 1%
> > > > of the cpu cycles used. I have been trying to compare what pfmon
> > > > does to what hpcrun does and it seems that the only debug data
> > > > available for both runs is the kernel debug data written by perfmon.
> > > > This data clearly shows that Hpcrun/Papi is using the perfmon
> > > > services differently than pfmon does. I tried to attach the debug
> > > > output for these two runs to this mail but that exceeded the allowed
> > > > message size for the list.
> > > >
> > > > I tried adding code (as a test case) to the Papi signal handler to
> > > > count and print the number of signals paid during the run. The
> > > > values printed seemed to pretty much match the values reported as
> > > > number of samples when hpcprof is run on the hpcrun data files. This
> > > > was an attempt to detect if my problem was handling signals or
> > > > getting them and I think this test showed the problem is in getting
> > > > them.
> > > >
> > > > I have also browsed this mailing list and found a thread called
> > > > "papi on compute node linux" which was last updated 2008-03-10. The
> > > > discussion in this thread sounds to me like it could easily explain
> > > > what I am seeing.
> > > >
> > > > Is there a way I can determine if this discussion (i.e., losing
> > > > interrupts) is what I am seeing?
> > > >
> > > > Thanks for any help you can provide.
> > > >
> > > > Gary
> > > >
> > > > -------------------------------------------------------------------------
> > > > This SF.net email is sponsored by the 2008 JavaOne(SM) Conference
> > > > Don't miss this year's exciting event. There's still time to save $100.
> > > > Use priority code J8TL2D2.
> > > > http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
> > > > _______________________________________________
> > > > perfmon2-devel mailing list
> > > > [email protected]
> > > > https://lists.sourceforge.net/lists/listinfo/perfmon2-devel
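
For reference, the arithmetic behind the two figures discussed in this thread can be sketched as follows. This is only an illustration using the approximate values quoted in the messages (about 68000 samples, a sampling period of 32767, and about 154 billion cycles from pfmon); the function and variable names are made up for this sketch and are not part of pfmon, hpcrun, or PAPI.

```python
def estimate_event_count(samples: int, period: int) -> int:
    """Estimate an event total from a sample count and a sampling period.

    Each sample is assumed to stand for `period` occurrences of the event.
    """
    return samples * period


# Approximate values quoted in the thread.
hpcprof_samples = 68_000             # samples reported by hpcprof
sampling_period = 32_767             # period passed as CPU_CYCLES:32767
pfmon_cycles = 154_000_000_000       # CPU_CYCLES counted by pfmon

# Estimate derived from the hpcrun samples (the "2 billion" figure).
estimated_cycles = estimate_event_count(hpcprof_samples, sampling_period)

# Fraction of the pfmon count that the sampled estimate accounts for.
fraction_seen = estimated_cycles / pfmon_cycles

# Number of samples one would expect if every overflow produced a sample.
expected_samples = pfmon_cycles / sampling_period

print(f"estimated cycles from samples: {estimated_cycles:,}")
print(f"fraction of pfmon count:       {fraction_seen:.2%}")
print(f"samples expected at this rate: {expected_samples:,.0f}")
```

Under these assumptions the sampled estimate covers only a small fraction of the pfmon count, which matches the "a little over 1%" observation in the original report. The sketch says nothing about why the samples are missing.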
