On 01/16/2015 01:29 PM, Zoltan Kiss wrote: > > > On 16/01/15 15:38, William Cohen wrote: >> On 01/16/2015 09:01 AM, Zoltan Kiss wrote: >>> Hi, >>> >>> I'm using OProfile to check some suspicious behaviour of dpdk-pktgen, >>> and I can see something which troubles me. Either the scheduler lies >>> about core affinity or Oprofile accounts some samples wrongly. >>> This userspace app runs in threads, which are assigned explicitly to one >>> single core with pthread_setaffinity_np (I've added some debug lines >>> which checks the affinity with pthread_getaffinity_np to make sure that >>> the affinity was correctly set indeed) >>> These threads run infinite loops, calling different functions. In my >>> example one of these functions (pktgen_main_rxtx_loop) should only be >>> called by the thread on core #1, but in the oprofile results I can see >>> samples from all other cores as well. I've added a sched_getcpu() call >>> to every iteration of that infinite loop to check if it runs anywhere >>> else than core #1, but it seems to be fine. >>> So my conclusion is that either the scheduler doesn't care about thread >>> affinity and even sched_getcpu() can't see that, OR, Oprofile falsely >>> accounts samples to the wrong CPU. Or to the wrong symbol, although I've >>> added "__attribute__ ((noinline))" to every called function to make sure >>> that inlineing doesn't screw up anything. >>> I'm running my tests on Ubuntu Servers with 14.04 with 3.13.0-32 kernel, >>> the CPU is a desktop kind, i5-4570 @ 3.20GHz (no-HT!) with oprofile 0.9.9. >>> Anyone have a suggestion about what happens? >>> >>> Regards, >>> >>> Zoltan >> >> Hi Zoltan, >> >> Are you using operf or opcontrol to set up the data measurements? It would >> be good to provide the command lines that you are using to do the data >> collection with oprofile. >> >> "perf stat <command>" or "perf stat -p <list_of_pids>" can show task >> migration. Look for the non-zero "cpu-migrations" in the output like in the >> output below. >> >> $ perf stat -p 29600 >> ^C >> Performance counter stats for process id '29600': >> >> 4303.761019 task-clock (msec) # 0.305 CPUs utilized >> [100.00%] >> 17,980 context-switches # 0.004 M/sec >> [100.00%] >> 595 cpu-migrations # 0.138 K/sec >> [100.00%] >> 44,212 page-faults # 0.010 M/sec >> 10,585,868,260 cycles # 2.460 GHz >> [100.00%] >> 6,463,554,435 stalled-cycles-frontend # 61.06% frontend >> cycles idle [100.00%] >> <not supported> stalled-cycles-backend >> 10,400,529,789 instructions # 0.98 insns per cycle >> # 0.62 stalled cycles >> per insn [100.00%] >> 2,049,813,299 branches # 476.284 M/sec >> [100.00%] >> 48,441,881 branch-misses # 2.36% of all branches >> >> 14.112614437 seconds time elapsed >> > I've checked with perf, it doesn't show any cpu-migrations. I've also > checked the actual vma addresses in the samples, they are indeed > referring to that particular function. Tried to compile the application > with -O0 and -Og, no luck so far
Hi Zoltan, Could you write up the steps used to produce the problem so we have a reproducer and include the header and the portion of the opreport output that shows the problem? -Will >> >> >> As a check you might try using "perf record <command>" or "perf record -p >> <list_of_pids>". Once data collection is done use "perf report --sort=cpu" >> to see whether samples were recorded on other processors. That would >> determine how much the monitored process is beoing scheduled on other cpus. > I tried to use "perf report" in various ways, but I couldn't get any > information out related to which CPU was used. >> >> There is also a systemtap (https://sourceware.org/systemtap/) script >> cycle_thief >> (https://sourceware.org/systemtap/examples/#process/cycle_thief.stp) that >> shows how often tasks are migrated and what other tasks are competing with >> the pid for time. >> >> These suggestions should help determine whether oprofile is getting the >> attribution of samples wrong or the scheduler is migrating the tasks. >> >> -Will >> > > ------------------------------------------------------------------------------ > New Year. New Location. New Benefits. New Data Center in Ashburn, VA. > GigeNET is offering a free month of service with a new server in Ashburn. > Choose from 2 high performing configs, both with 100TB of bandwidth. > Higher redundancy.Lower latency.Increased capacity.Completely compliant. > http://p.sf.net/sfu/gigenet > _______________________________________________ > oprofile-list mailing list > oprofile-l...@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/oprofile-list > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/