> -----邮件原件----- > 发件人: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com> > 发送时间: 2021年2月26日 7:33 > 收件人: Feifei Wang <feifei.wa...@arm.com>; hemant.agra...@nxp.com; > nipun.gu...@nxp.com; jer...@marvell.com; harry.van.haa...@intel.com; > bruce.richard...@intel.com; dmitry.kozl...@gmail.com; > navas...@linux.microsoft.com; dmit...@microsoft.com; > pallavi.ka...@intel.com; tho...@monjalon.net; > david.march...@redhat.com; konstantin.anan...@intel.com > 抄送: dev@dpdk.org; Ruifeng Wang <ruifeng.w...@arm.com>; nd > <n...@arm.com>; nd <n...@arm.com> > 主题: RE: [RFC 3/5] eal: lcore state FINISHED is not required > > +Thomas, David, Konstantin for input > > <snip> > > > > Subject: [RFC 3/5] eal: lcore state FINISHED is not required > > > > > > FINISHED state seems to be used to indicate that the worker's update > > > of the 'state' is not visible to other threads. There seems to be no > > > requirement to have such a state. > > > > I am not sure "FINISHED" is necessary to be removed, and I propose > > some of my profiles for discussion. > > There are three states for lcore now: > > "WAIT": indicate lcore can start working > > "RUNNING": indicate lcore is working > > "FINISHED": indicate lcore has finished its working and wait to be > > reset > If you look at the definitions of "WAIT" and "FINISHED" states, they look > similar, except for "wait to be reset" in "FINISHED" state . The code really > does not do anything to reset the lcore. It just changes the state to "WAIT". > > > > > From the description above, we can find "FINISHED" is different from > > "WAIT", it can shows that lcore has done the work and finished it. > > Thus, if we remove "FINISHED", maybe we will not know whether the > > lcore finishes its work or just doesn't start, because this two state has > > the > same tag "WAIT". > Looking at "eal_thread_loop", the worker thread sets the state to > "RUNNING" before sending the ack back to main core. After that it is > guaranteed that the worker will run the assigned function. Only case where it > will not run the assigned function is when the 'write' syscall fails, in which > case it results in a panic. >
I agree that the worker can be guaranteed to run the assigned function. But I means that we cannot know when the worker start or when the worker finishes its working if "Finished" is removed. Please refer to the following Example for further explanation. > > > > Furthermore, consider such a scenario: > > Core 1 need to monitor Core 2 state, if Core 2 finishes one task, Core > > 1 can start its working. > > However, if there is only one tag "WAIT", Core 1 maybe start its > > work at the wrong time, when Core 2 still does not start its task at state > "WAIT". > > This is just my guess, and at present, there is no similar application > > scenario in dpdk. > To be able to do this effectively, core 1 needs to observe the state change > from WAIT->RUNNING->FINISHED. This requires that core 1 should be calling > rte_eal_remote_launch and rte_eal_wait_lcore functions. It is not possible > to observe this state transition from a 3rd core (for ex: a worker might go > from RUNNING->FINISHED->WAIT->RUNNING which a 3rd core might not be > able to observe). Time Slot Core 1 Core 2 Core 3(main core) 1 eal_thread_loop rte_eal_remote_launch 2 WAIT 3 if(rte_get_lcore_state ==FINISHED) RUNNING ^ 4 | execute f 5 v FINISHDED 6 do some operations 7 rte_eal_wait_lcore I means that Core 1 is an additional thread which observes Core 2. It can use rte_get_lcore_state API to know core 2 state. However, I just find "rte_get_lcore_state" API only can be called by the main core. This is the same as you say that worker can not be observed by each other. As a result, I agree with you that "FINISHED" state is redundant and it can be removed from this current dpdk version. > > > > > On the other hand, if we decide to remove "FINISHED", please consider > > the following files: > > 1. lib/librte_eal/linux/eal_thread.c: line 31 > > lib/librte_eal/windows/eal_thread.c: line 22 > > lib/librte_eal/freebsd/eal_thread.c: line 31 > I have looked at these lines, they do not capture "why" FINISHED state is > required. I mean if we have removed "FINISHED", we should change the description here, "FINISHED" should be replaced by "WAIT". > > 2. > > lib/librte_eal/include/rte_launch.h: line 24, 44, 121, 123, 131 3. > > examples/l2fwd- > > keepalive/main.c: line 510 > > rte_eal_wait_lcore(id_core) can be removed. Because the core state has > > been checked as "WAIT", this is a redundant operation > > > > Best Regards > > Feifei > > > > > > > > Signed-off-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com> > > > Reviewed-by: Ola Liljedahl <ola.liljed...@arm.com> > > > --- > > > drivers/event/dpaa2/dpaa2_eventdev_selftest.c | 2 +- > > > drivers/event/octeontx/ssovf_evdev_selftest.c | 2 +- > > > drivers/event/sw/sw_evdev_selftest.c | 4 ++-- > > > examples/l2fwd-keepalive/main.c | 2 +- > > > lib/librte_eal/common/eal_common_launch.c | 7 ++----- > > > lib/librte_eal/freebsd/eal_thread.c | 2 +- > > > lib/librte_eal/linux/eal_thread.c | 8 +------- > > > lib/librte_eal/windows/eal_thread.c | 8 +------- > > > 8 files changed, 10 insertions(+), 25 deletions(-) > > > > > > diff --git a/drivers/event/dpaa2/dpaa2_eventdev_selftest.c > > > b/drivers/event/dpaa2/dpaa2_eventdev_selftest.c > > > index cd7311a94..bbbd20951 100644 > > > --- a/drivers/event/dpaa2/dpaa2_eventdev_selftest.c > > > +++ b/drivers/event/dpaa2/dpaa2_eventdev_selftest.c > > > @@ -468,7 +468,7 @@ wait_workers_to_join(int lcore, const > > > rte_atomic32_t > > > *count) RTE_SET_USED(count); > > > > > > print_cycles = cycles = rte_get_timer_cycles(); -while > > > (rte_eal_get_lcore_state(lcore) != FINISHED) { > > > +while (rte_eal_get_lcore_state(lcore) != WAIT) { > > > uint64_t new_cycles = rte_get_timer_cycles(); > > > > > > if (new_cycles - print_cycles > rte_get_timer_hz()) { diff --git > > > a/drivers/event/octeontx/ssovf_evdev_selftest.c > > > b/drivers/event/octeontx/ssovf_evdev_selftest.c > > > index 528f99dd8..d7b0d2211 100644 > > > --- a/drivers/event/octeontx/ssovf_evdev_selftest.c > > > +++ b/drivers/event/octeontx/ssovf_evdev_selftest.c > > > @@ -579,7 +579,7 @@ wait_workers_to_join(int lcore, const > > > rte_atomic32_t > > > *count) RTE_SET_USED(count); > > > > > > print_cycles = cycles = rte_get_timer_cycles(); -while > > > (rte_eal_get_lcore_state(lcore) != FINISHED) { > > > +while (rte_eal_get_lcore_state(lcore) != WAIT) { > > > uint64_t new_cycles = rte_get_timer_cycles(); > > > > > > if (new_cycles - print_cycles > rte_get_timer_hz()) { diff --git > > > a/drivers/event/sw/sw_evdev_selftest.c > > > b/drivers/event/sw/sw_evdev_selftest.c > > > index e4bfb3a0f..7847a8645 100644 > > > --- a/drivers/event/sw/sw_evdev_selftest.c > > > +++ b/drivers/event/sw/sw_evdev_selftest.c > > > @@ -3138,8 +3138,8 @@ worker_loopback(struct test *t, uint8_t > > > disable_implicit_release) > > > rte_eal_remote_launch(worker_loopback_worker_fn, t, w_lcore); > > > > > > print_cycles = cycles = rte_get_timer_cycles(); -while > > > (rte_eal_get_lcore_state(p_lcore) != FINISHED || > > > -rte_eal_get_lcore_state(w_lcore) != FINISHED) { > > > +while (rte_eal_get_lcore_state(p_lcore) != WAIT || > > > +rte_eal_get_lcore_state(w_lcore) != WAIT) { > > > > > > rte_service_run_iter_on_app_lcore(t->service_id, 1); > > > > > > diff --git a/examples/l2fwd-keepalive/main.c b/examples/l2fwd- > > > keepalive/main.c index e4c2b2793..dd777c46a 100644 > > > --- a/examples/l2fwd-keepalive/main.c > > > +++ b/examples/l2fwd-keepalive/main.c > > > @@ -506,7 +506,7 @@ dead_core(__rte_unused void *ptr_data, const > int > > > id_core) if (terminate_signal_received) return; printf("Dead core > > > %i - restarting..\n", id_core); -if > > > (rte_eal_get_lcore_state(id_core) == FINISHED) { > > > +if (rte_eal_get_lcore_state(id_core) == WAIT) { > > > rte_eal_wait_lcore(id_core); > > > rte_eal_remote_launch(l2fwd_launch_one_lcore, NULL, id_core); } > > > else { diff --git a/lib/librte_eal/common/eal_common_launch.c > > > b/lib/librte_eal/common/eal_common_launch.c > > > index 34f854ad8..78fd94026 100644 > > > --- a/lib/librte_eal/common/eal_common_launch.c > > > +++ b/lib/librte_eal/common/eal_common_launch.c > > > @@ -26,14 +26,11 @@ rte_eal_wait_lcore(unsigned worker_id) if > > > (lcore_config[worker_id].state == WAIT) return 0; > > > > > > -while (lcore_config[worker_id].state != WAIT && > > > - lcore_config[worker_id].state != FINISHED) > > > +while (lcore_config[worker_id].state != WAIT) > > > rte_pause(); > > > > > > rte_rmb(); > > > > > > -/* we are in finished state, go to wait state */ - > > > lcore_config[worker_id].state = WAIT; return > > > lcore_config[worker_id].ret; } > > > > > > @@ -62,7 +59,7 @@ rte_eal_mp_remote_launch(int (*f)(void *), void > > > *arg, > > > > > > if (call_main == CALL_MAIN) { > > > lcore_config[main_lcore].ret = f(arg); > > > -lcore_config[main_lcore].state = FINISHED; > > > +lcore_config[main_lcore].state = WAIT; > > > } > > > > > > return 0; > > > diff --git a/lib/librte_eal/freebsd/eal_thread.c > > > b/lib/librte_eal/freebsd/eal_thread.c > > > index 17b8f3996..6d6f1e2fd 100644 > > > --- a/lib/librte_eal/freebsd/eal_thread.c > > > +++ b/lib/librte_eal/freebsd/eal_thread.c > > > @@ -140,7 +140,7 @@ eal_thread_loop(__rte_unused void *arg) > > > lcore_config[lcore_id].f = NULL; lcore_config[lcore_id].arg = NULL; > > > rte_wmb(); -lcore_config[lcore_id].state = FINISHED; > > > +lcore_config[lcore_id].state = WAIT; > > > } > > > > > > /* never reached */ > > > diff --git a/lib/librte_eal/linux/eal_thread.c > > > b/lib/librte_eal/linux/eal_thread.c > > > index a0a009104..7b9463df3 100644 > > > --- a/lib/librte_eal/linux/eal_thread.c > > > +++ b/lib/librte_eal/linux/eal_thread.c > > > @@ -141,13 +141,7 @@ eal_thread_loop(__rte_unused void *arg) > > > lcore_config[lcore_id].arg = NULL; rte_wmb(); > > > > > > -/* when a service core returns, it should go directly to WAIT > > > - * state, because the application will not lcore_wait() for it. > > > - */ > > > -if (lcore_config[lcore_id].core_role == ROLE_SERVICE) - > > > lcore_config[lcore_id].state = WAIT; -else > > > -lcore_config[lcore_id].state = FINISHED; > > > +lcore_config[lcore_id].state = WAIT; > > > } > > > > > > /* never reached */ > > > diff --git a/lib/librte_eal/windows/eal_thread.c > > > b/lib/librte_eal/windows/eal_thread.c > > > index 7a9277c51..35d059a30 100644 > > > --- a/lib/librte_eal/windows/eal_thread.c > > > +++ b/lib/librte_eal/windows/eal_thread.c > > > @@ -125,13 +125,7 @@ eal_thread_loop(void *arg __rte_unused) > > > lcore_config[lcore_id].arg = NULL; rte_wmb(); > > > > > > -/* when a service core returns, it should go directly to WAIT > > > - * state, because the application will not lcore_wait() for it. > > > - */ > > > -if (lcore_config[lcore_id].core_role == ROLE_SERVICE) - > > > lcore_config[lcore_id].state = WAIT; -else > > > -lcore_config[lcore_id].state = FINISHED; > > > +lcore_config[lcore_id].state = WAIT; > > > } > > > } > > > > > > -- > > > 2.17.1 > > >