Dmitry,

Ah, about the load: it's because perf sched record adds too many events to the
recording (and perf is configured with small buffers). Using a smaller set
of events works much better.

One thing I did was to record to /tmp; you have enough memory for this to
work. I'll try to come up with the optimal record set later today and
share it.
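
In the meantime, a rough sketch of the idea. The two tracepoints below are
only placeholders I am assuming for illustration, not the optimal set, and
whether 'perf sched replay' still gives a useful result with this subset
needs checking. The build_cmd helper is a made-up name; the script only
prints the command (dry run) so you can inspect it before running it:

```shell
#!/bin/sh
# Sketch: record a reduced set of scheduler tracepoints to a file in /tmp.
# EVENTS is an illustrative assumption, not a tuned or recommended set.
EVENTS="sched:sched_switch sched:sched_wakeup"
OUT="/tmp/perf.data"

# build_cmd (hypothetical helper): print the perf command line that would
# record the chosen events system-wide (-a) while running the given workload.
build_cmd() {
    cmd="perf record -a -o $OUT"
    for ev in $EVENTS; do
        cmd="$cmd -e $ev"
    done
    echo "$cmd -- $*"
}

# Dry run: print the command instead of executing it.
build_cmd sleep 10
```

If the buffers turn out to be the problem, perf record's -m option (number
of mmap pages) can be added to the same command line.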

Regards

-- Pantelis

On Apr 2, 2012, at 1:09 PM, Dmitry Antipov wrote:

> On 03/08/2012 05:20 PM, Pantelis Antoniou wrote:
> 
>> The current issue is that scheduler development is not easily shared between
>> developers. Each developer has their own 'itch', be it Android use cases,
>> server workloads, VM, etc. The risk is high of optimizing for one's own use
>> case and causing severe degradation on most other use cases.
>> 
>> One way to fix this problem would be the development of a method with which
>> one could perform a given use-case workload on a host, record the activity in
>> an interchangeable, portable trace format file, and then play it back on
>> another host via a playback application that will generate a load
>> approximately similar to the one observed during recording.
> 
> Have you tried to investigate whether the 'perf' tool with its 'sched record'
> and 'sched replay' features might be useful for such a purpose?
> 
> I tried to record and replay various types of commonly used benchmarks,
> including CPU, I/O and network intensive workloads, and have to say that the
> recording and (especially) replaying overhead is quite high, at least for the
> default Panda board configuration (where main I/O is slow due to the root file
> system being on an SD card). Simple things like 'perf sched record sleep 10'
> work in most cases (but may still cause sample loss, up to 10-20%). But
> when I tried to add some I/O, for example with 'find /', the total workload
> became too high and the system (almost) hung with a lot of messages like:
> 
> INFO: task kjournald:512 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> INFO: rcu_preempt detected stalls on CPUs/tasks: 8055ec64     0   512      2 0x00000000
> INFO: Stall ended before state dump start
> INFO: task kjournald:512 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> INFO: task flush-179:0:511 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> INFO: task kjournald:512 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> 
> Now I'm checking whether it's possible to do some partial recording (by
> skipping some kinds of unrelated samples) and offload the kernel tracing
> subsystem to get more CPU time for the user-space tasks.
> 
> Do you have any thoughts about this?
> 
> Thanks,
> Dmitry


_______________________________________________
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev
