> From: Thomas Monjalon [mailto:tho...@monjalon.net] > Sent: Thursday, 26 October 2023 18.07 > > 26/10/2023 17:54, Bruce Richardson: > > On Thu, Oct 26, 2023 at 04:59:51PM +0200, Morten Brørup wrote: > > > > From: Morten Brørup [mailto:m...@smartsharesystems.com] > > > > Sent: Thursday, 26 October 2023 16.50 > > > > > > > > > From: Thomas Monjalon [mailto:tho...@monjalon.net] > > > > > Sent: Thursday, 26 October 2023 16.31 > > > > > > > > > > 26/10/2023 16:08, Morten Brørup: > > > > > > > From: Thomas Monjalon [mailto:tho...@monjalon.net] > > > > > > > Sent: Thursday, 26 October 2023 16.05 > > > > > > > > > > > > > > 26/10/2023 15:57, Morten Brørup: > > > > > > > > > From: Morten Brørup [mailto:m...@smartsharesystems.com] > > > > > > > > > Sent: Thursday, 26 October 2023 15.45 > > > > > > > > > > > > > > > > > > > From: Thomas Monjalon [mailto:tho...@monjalon.net] > > > > > > > > > > Sent: Thursday, 26 October 2023 15.37 > > > > > > > > > > > > > > > > > > > > 25/10/2023 18:31, Thomas Monjalon: > > > > > > > > > > > Real-time thread priority was been forbidden on Unix > > > > > > > > > > > because of problems they can cause. > > > > > > > > > > > Warnings and helpers are added to avoid deadlocks, > > > > > > > > > > > so real-time can be allowed on all systems. > > > > > > > > > > > > > > > > > > > > Unit test is failing: > > > > > > > > > > DPDK:fast-tests / threads_autotest TIMEOUT 600.01 > s > > > > > > > > > > > > > > > > > > > > It is seen in only 1 target (maybe the failure > occurence is > > > > random): > > > > > > > > > > Debian 11 (Buster) (ARM) | PASS > > > > > > > > > > Fedora 37 (ARM) | PASS > > > > > > > > > > CentOS Stream 9 (ARM) | FAIL > > > > > > > > > > Fedora 38 (ARM) | PASS > > > > > > > > > > Fedora 38 (ARM Clang) | PASS > > > > > > > > > > Ubuntu 20.04 (ARM) | PASS > > > > > > > > > > > > > > > > > > > > I need to send a v4 with new implementation and better > comments. > > > > > > > > > > The Unix sleep will be upgraded from 1 ns to 1 us in > case it makes > > > > a > > > > > > > > > > difference. > > > > > > > > > > > > > > > > > > It will not make a difference. The kernel will go > through the > > > > sleeping > > > > > > > steps, > > > > > > > > > then wake up again and see the real-time thread is ready > to run, and > > > > > then > > > > > > > > > immediately schedule it. > > > > > > > > > > > > > > > > > > For testing purposes, consider sleeping 10 milliseconds > or something > > > > > > > > > significant like that. > > > > > > > > > > > > > > > > A bit more details... > > > > > > > > > > > > > > > > In our recent tests, nanosleep() itself took around 50 us. > So you need > > > > > to > > > > > > > sleep longer than that for your thread not to be runnable > when the > > > > > nanosleep() > > > > > > > wakes up again, because 50 us has already passed in > "nanosleep > > > > overhead". > > > > > > > > 10 milliseconds provides plenty of margin, and corresponds > to 10 > > > > jiffies > > > > > on > > > > > > > a 1000 Hz kernel. (I don't know if it makes any difference > for the > > > > kernel > > > > > > > scheduler if the timer crosses a jiffy border or not.) > > > > > > > > > > > > > > 10 ms looks like an eternity. > > > > > > > > > > > > Agree. It is only for functional testing, not for production! > > > > > > > > > > Realtime thread won't make any sense if we have to insert a long > sleep. > > > > > > > > It seems David came to our rescue here! > > > > > > > > I have just tried running our test again with > prctl(PR_SET_TIMERSLACK) of 1 > > > > ns, and the nanosleep(1 ns) delay dropped from ca. 50 us to ca. > 2.5 us. > > > > > > > > The timeout parameter to epoll_wait() is in milliseconds, which is > useless for > > > > low-latency. > > > > Perhaps real-time threads can be used with epoll() combined with > timerfd for > > > > nanosecond resolution timeout. > > > > > > Or epoll_pwait2(), which has nanosecond resolution timeout. > > > > > > Unfortunately, rte_epoll_wait() is not an experimental API anymore, > so we cannot change its timeout parameter from milliseconds to micro- or > nanoseconds. We would have to introduce a new API for this. > > > > > > > Just an idea - can we change the timeout parameter to float rather > than int, > > and then use function versioning for backward compatibility for any > > binaries passing int? > > That way the actual meaning of the parameter doesn't change, but it > still > > allows sub-millisecond values (all-be-it with some loss of accuracy > due to > > float).
Too exotic for my taste. I would rather introduce rte_epoll_wait_ns() with timeout in nanoseconds than pass a float. > > Sorry I'm not following why you want to use rte_epoll_wait()? I don't have experience with it yet, but it seems to be the official DPDK API for blocking I/O system call. > > If the realtime thread has some blocking system calls, > no sleep is needed I think. Correct. > For average realtime thread, I suggest the API > rte_thread_yield_realtime() > which could wait for 1 ms or less by default. If we introduce a yield() function, it should yield according to the O/S scheduling policy, e.g. the rest of the time slice allocated to the thread by the O/S scheduler (although that might not be available for real-time prioritized threads in Linux). I don't think we can make it O/S agnostic. I don't think it should wait a fixed amount of time - we already have rte_delay_us_sleep() for that. In my experiments with power saving, I ended up with a varying sleep duration, depending on traffic load. The l3fwd-power example also uses a varying sleep duration depending on traffic load. > For smaller sleep, we can use PR_SET_TIMERSLACK and > rte_delay_us_sleep(). Agree. > If we provide an API for PR_SET_TIMERSLACK, we could adapt the duration > of rte_thread_yield_realtime() dynamically after calling prctl(). > I'm not sure exposing an API for PR_SET_TIMERSLACK is the right solution. I would rather have the EAL set the timer slack to minimum (1 ns) at EAL initialization. An EAL command line parameter could be added to change the default from 1 ns. Also, something similar needs to be done for Windows.