[lng-odp] [Bug 1449] odp_timer_test core dump

bugzilla-daemon Mon, 02 Nov 2015 03:08:06 -0800

https://bugs.linaro.org/show_bug.cgi?id=1449


--- Comment #12 from Ivan Khoronzhuk <ivan.khoronz...@linaro.org> ---
As was said previously, we shouldn't set the period to less than the resolution.
It's incorrect and makes the example run on its own timing.
The actual resolution of the timer is not 1 ns; it is much larger and includes
the code path from thread creation to sending the event on the queue, and it
depends on the CPU frequency. It can differ between systems and circumstances.
On my PC it was ~2 ms (I filtered out the peaks spent on scheduling, so this is
roughly the best case, as if CPU0 were isolated).
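
A minimal sketch of the kind of argument check meant here, assuming the period
and the resolution are given in the same unit. The helpers check_period() and
period_to_ticks() are hypothetical names for illustration, not functions of
odp_timer_test or of the ODP API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 0 when the period is usable with the given
 * resolution, -1 otherwise. Both values must use the same time unit. */
static int check_period(uint64_t period, uint64_t resolution)
{
	if (resolution == 0)
		return -1;	/* a zero resolution is meaningless */
	if (period < resolution)
		return -1;	/* a timeout cannot be finer than one tick */
	return 0;
}

/* Hypothetical helper: whole timer ticks covered by one period,
 * rounding down. */
static uint64_t period_to_ticks(uint64_t period, uint64_t resolution)
{
	assert(resolution != 0);
	return period / resolution;
}
```

With such a check the example could refuse to start instead of failing later
when the timeout cannot be armed.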

If the resolution is set to more than a jiffy, say 15 ms, with a timeout of
30 ms, the error in question is still present, because the Linux scheduler can
switch out the worker task that is trying to set the absolute time. Meanwhile
CPU0 can count several ticks while the worker thread is sleeping. When the
worker is back, it sees that the time has already passed and the timeout
cannot be set.

tick = odp_timer_current_tick(gbls->tp);
tick += period;
--------------------> here we can be interrupted, for instance by the scheduler
--------------------< here, when we are back, the period has already expired
odp_timer_set_abs(ttp->tim, tick, &ttp->ev); >>>>> here the check fails with an error

The only way for this to work correctly is under a kernel-style spin_lock (which
is not needed in a real app) or with the CPUs isolated, which is also a matter
for the OS.
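
The race can be reproduced without ODP. The sketch below is only a simulation:
sim_tick, sim_preempt_ticks, sim_current_tick(), sim_set_abs() and
arm_with_retry() are hypothetical stand-ins, not ODP functions, and the bounded
retry is just one possible mitigation for illustration, not something the
example or the patches below do.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated tick counter; in the real example CPU0 advances it. */
static uint64_t sim_tick;

/* Simulated preemption: ticks that pass while the worker is descheduled
 * between reading the current tick and arming the timer. */
static uint64_t sim_preempt_ticks;

static uint64_t sim_current_tick(void)
{
	return sim_tick;
}

/* Stand-in for arming an absolute timeout: fails with -1 when the target
 * tick is not in the future any more. */
static int sim_set_abs(uint64_t abs_tick)
{
	return abs_tick > sim_tick ? 0 : -1;
}

/* Arm with a bounded retry. Returns the number of attempts used,
 * or -1 if every attempt found the target tick already expired. */
static int arm_with_retry(uint64_t period, int max_tries)
{
	assert(max_tries > 0);

	for (int i = 1; i <= max_tries; i++) {
		uint64_t tick = sim_current_tick() + period;

		/* The worker is preempted here; ticks keep counting. */
		sim_tick += sim_preempt_ticks;
		sim_preempt_ticks = 0;	/* assume preemption hits only once */

		if (sim_set_abs(tick) == 0)
			return i;
	}
	return -1;
}
```

In the simulation, a preemption longer than the period makes the first attempt
fail and the second one succeed. On a loaded system nothing guarantees that the
retry is not preempted as well, which is why isolation is the real fix.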

So this is an issue of non-real-time Linux, not of the timer.
Maybe it can be improved a little, but it can really be fixed only if each
core (including CPU0) is isolated. So, for this example test to work
correctly, the timeout should be set to include the time the scheduler spends
handling other tasks, etc. In my case, on a system with 4 CPUs (2 real,
2 virtual) that was overloaded by other tasks, it took about 4 jiffies, like:
"./odp_timer_test -p 80000 -r 20000"

The question is rather: why set a period smaller than your system can handle?
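
As a rough sketch of the reasoning above: the period has to cover at least one
tick plus the worst scheduling delay seen on the system, rounded up to whole
ticks. min_period() is a hypothetical helper, and the delay has to be measured
on the target system; it is not something the timer can know.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: smallest multiple of the resolution that covers one
 * tick plus the worst scheduling delay observed on the system. Both
 * arguments must use the same time unit. */
static uint64_t min_period(uint64_t resolution, uint64_t sched_delay)
{
	assert(resolution != 0);

	uint64_t need = resolution + sched_delay;

	/* Round up to a whole number of ticks. */
	return (need + resolution - 1) / resolution * resolution;
}
```

For instance, with a resolution of 20000 and a made-up delay of 60000 in the
same unit, min_period() gives 80000.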

Also, don't forget that the actual resolution of the timer directly depends on
how well CPU0 is isolated, as it handles the timer notifications that update
the ticks. The minimum timeout value (the period set as an argument) depends
on how well the CPUs of the worker threads are isolated.

The following patches, already under review, improve the situation a little.
Maybe I will send some more, but it's definitely not a problem of the timer,
and this bug should be closed.

[lng-odp] [PATCH] example: timer: don't set timeout less than resolution
https://lists.linaro.org/pipermail/lng-odp/2015-October/016772.html

[lng-odp] [PATCH] linux-generic: odp_timer: abort if tick is lost
https://lists.linaro.org/pipermail/lng-odp/2015-October/016773.html

[lng-odp] [PATCH] linux-generic: cpumask: exclude CPU0 from odp_cpumask_default_worker
https://lists.linaro.org/pipermail/lng-odp/2015-October/016542.html

-- 
You are receiving this mail because:
You are on the CC list for the bug.

_______________________________________________
lng-odp mailing list
lng-odp@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/lng-odp
