Hello,

I have just pushed to git an important bug fix that affects
all architectures when doing timeout-based event set switching
on a per-thread basis.

You can easily reproduce the problem with the multiplex2
example from libpfm:

$ ./multiplex2 --us-c --freq=1000 ../../pfmon/tests/noploop 10
clock_res=1000000ns(1000.00Hz) ask period=1000000ns(1000.00Hz) get
period=1000000ns(1000.00Hz)
noploop for 10 seconds
# 1000.00Hz period = 1000000nsecs
# 1666000 cycles @ 1666 MHz
# using time-based multiplexing
# 1000000 nsecs effective switch timeout
# 2 sets
# 1173.50 average run per set
# 5000547811.50 average ns per set
# set       measured total     #runs         scaled total event name
# ------------------------------------------------------------------
  000       14,602,410,677      1174      16,602,727,292 CPU_OP_CYCLES_ALL
  001        1,498,323,507      1173       12,436,159,551 IA64_INST_RETIRED

In this test, the program runs for 10s with 2 event sets at a switch
frequency of 1000Hz, so we expect each set to have run about 5000 times.
That is not what happens, yet the scaled counts are correct. The reason is
that multiplex2 does duration-based scaling: each set ran fewer times than
expected, but when it did run, it ran for longer.

Be careful with multiplex2 on your underlying hardware: you may also be
limited by the clock resolution, as in the example above, where we are
operating right at the limit of the timer granularity.

$ ./multiplex2 --us-c --freq=1000 ../../pfmon/tests/noploop 10
clock_res=1000000ns(1000.00Hz) ask period=1000000ns(1000.00Hz) get
period=1000000ns(1000.00Hz)
noploop for 10 seconds
# 1000.00Hz period = 1000000nsecs
# 1666000 cycles @ 1666 MHz
# using time-based multiplexing
# 1000000 nsecs effective switch timeout
# 2 sets
# 5000.50 average run per set
# 5000255597.00 average ns per set
# set       measured total     #runs         scaled total event name
# ------------------------------------------------------------------
  000        8,293,212,752      5001       16,585,052,664 CPU_OP_CYCLES_ALL
  001       24,876,546,492      5000       49,757,211,675 IA64_INST_RETIRED


On recent x86 hardware with a kernel compiled with the right timer options,
you should get 5000 iterations for each set using the multiplex2 command
line above.


Here are the technical details of what happens:

   There are certain situations where the timeout expires while
   interrupts are masked. Thus the timeout is kept pending.

   The issue is that during context switch out, we save the
   remaining value of the timeout, and we reinstall that value
   during context switch in. The problem is that if the timeout
   has already expired, the remaining value is negative. The bug
   was that when passed a negative value, hrtimer_start() effectively
   sets the timeout to the largest possible value, i.e., we never
   get a new timeout and set switching stops.

   During context switch in, the fix now checks whether the remaining
   timeout is negative and, if so, triggers set switching, which
   reinstalls a new timeout value.

   Because this type of timeout checking is also needed when
   unmasking monitoring, it is implemented by pfm_restart_timer().
   Expired timeouts are accounted for in the set_switch_exp statistic
   under debugfs.

   Note that for tools using duration-based scaling, the scaled
   values looked fine, but the number of activations of each set
   was clearly wrong.

_______________________________________________
perfmon2-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel