> -----Original Message-----
> From: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> Sent: February 26, 2021 7:33
> To: Feifei Wang <feifei.wa...@arm.com>; hemant.agra...@nxp.com;
> nipun.gu...@nxp.com; jer...@marvell.com; harry.van.haa...@intel.com;
> bruce.richard...@intel.com; dmitry.kozl...@gmail.com;
> navas...@linux.microsoft.com; dmit...@microsoft.com;
> pallavi.ka...@intel.com; tho...@monjalon.net;
> david.march...@redhat.com; konstantin.anan...@intel.com
> Cc: dev@dpdk.org; Ruifeng Wang <ruifeng.w...@arm.com>; nd
> <n...@arm.com>; nd <n...@arm.com>
> Subject: RE: [RFC 3/5] eal: lcore state FINISHED is not required
> 
> +Thomas, David, Konstantin for input
> 
> <snip>
> 
> > > Subject: [RFC 3/5] eal: lcore state FINISHED is not required
> > >
> > > FINISHED state seems to be used to indicate that the worker's update
> > > of the 'state' is not visible to other threads. There seems to be no
> > > requirement to have such a state.
> >
> > I am not sure "FINISHED" needs to be removed, and I offer some of my
> > views for discussion.
> > There are three states for an lcore now:
> > "WAIT": indicates the lcore can start working
> > "RUNNING": indicates the lcore is working
> > "FINISHED": indicates the lcore has finished its work and waits to be
> > reset
> If you look at the definitions of "WAIT" and "FINISHED" states, they look
> similar, except for "wait to be reset" in the "FINISHED" state. The code really
> does not do anything to reset the lcore. It just changes the state to "WAIT".
> 
> >
> > From the description above, we can find "FINISHED" is different from
> > "WAIT": it shows that the lcore has done the work and finished it.
> > Thus, if we remove "FINISHED", we may not know whether the lcore has
> > finished its work or has not started yet, because these two states have
> > the same tag "WAIT".
> Looking at "eal_thread_loop", the worker thread sets the state to
> "RUNNING" before sending the ack back to main core. After that it is
> guaranteed that the worker will run the assigned function. Only case where it
> will not run the assigned function is when the 'write' syscall fails, in which
> case it results in a panic.
> 

I agree that the worker is guaranteed to run the assigned function.
But I mean that we cannot know when the worker starts or when it finishes
its work if "FINISHED" is removed. Please refer to the following example
for further explanation.

> >
> > Furthermore, consider such a scenario:
> > Core 1 needs to monitor Core 2's state; once Core 2 finishes a task,
> > Core 1 can start its work.
> > However, if there is only one tag "WAIT", Core 1 may start its work at
> > the wrong time, when Core 2 has not yet started its task and is still in
> > state "WAIT".
> > This is just my guess, and at present, there is no similar application
> > scenario in dpdk.
> To be able to do this effectively, core 1 needs to observe the state change
> from WAIT->RUNNING->FINISHED. This requires that core 1 should be calling
> rte_eal_remote_launch and rte_eal_wait_lcore functions. It is not possible
> to observe this state transition from a 3rd core (for ex: a worker might go
> from RUNNING->FINISHED->WAIT->RUNNING which a 3rd core might not be
> able to observe).

Time Slot  Core 1                                    Core 2           Core 3 (main core)
    1                                                eal_thread_loop  rte_eal_remote_launch
    2                                                WAIT
    3      if (rte_eal_get_lcore_state == FINISHED)  RUNNING
                             ^
    4                        |                       execute f
    5                        v                       FINISHED
    6      do some operations
    7                                                                 rte_eal_wait_lcore

I mean that Core 1 is an additional thread which observes Core 2. It can use
the rte_eal_get_lcore_state API to learn Core 2's state. However, I just found
that the rte_eal_get_lcore_state API can only be called by the main core. This
matches your point that workers cannot observe each other.
As a result, I agree with you that the "FINISHED" state is redundant and can
be removed from the current DPDK version.
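
To make the point concrete, here is a minimal single-threaded sketch of the
time slots above with only two states. It is a simplified model, not the EAL
code: every name in it (sim_lcore, sim_launch, sim_worker_step, sim_wait,
sim_observe, sim_task_double) is hypothetical, and it assumes that the state
is already RUNNING when the launch call returns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-state model of one worker lcore; not the DPDK structures. */
enum sim_state { SIM_WAIT, SIM_RUNNING };

struct sim_lcore {
    enum sim_state state;
    int (*f)(void *);   /* function assigned by the main core, NULL when idle */
    void *arg;
    int ret;            /* return value of the last completed function */
};

void sim_init(struct sim_lcore *lc)
{
    lc->state = SIM_WAIT;
    lc->f = NULL;
    lc->arg = NULL;
    lc->ret = 0;
}

/* Main core: assign f to an idle worker. Returns 0, or -1 if it is busy. */
int sim_launch(struct sim_lcore *lc, int (*f)(void *), void *arg)
{
    if (lc->state != SIM_WAIT)
        return -1;
    lc->f = f;
    lc->arg = arg;
    /* Simplification: RUNNING is already set when the launch call returns. */
    lc->state = SIM_RUNNING;
    return 0;
}

/* Worker: one time slot; runs the assigned function to completion. */
void sim_worker_step(struct sim_lcore *lc)
{
    if (lc->state != SIM_RUNNING)
        return;
    assert(lc->f != NULL);
    lc->ret = lc->f(lc->arg);
    lc->f = NULL;
    lc->arg = NULL;
    lc->state = SIM_WAIT;   /* goes straight back to WAIT, no FINISHED */
}

/* Main core: wait for the worker to return to WAIT, then return its result.
 * Each loop iteration stands for one time slot in which the worker runs. */
int sim_wait(struct sim_lcore *lc)
{
    while (lc->state != SIM_WAIT)
        sim_worker_step(lc);
    return lc->ret;
}

/* Any other core: a single sample of the state. */
enum sim_state sim_observe(const struct sim_lcore *lc)
{
    return lc->state;
}

/* Example task (hypothetical): doubles the int that arg points to. */
int sim_task_double(void *arg)
{
    return 2 * *(int *)arg;
}
```

In this model, a core that calls sim_launch and then sim_wait always knows
that the task ran and gets its return value. A core that only calls
sim_observe sees WAIT both before the launch and after the wait, so a single
sample cannot tell "not started" from "finished". That is the same limitation
a third core would have even with a FINISHED state, since it can miss the
transitions in between.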

> 
> >
> > On the other hand, if we decide to remove "FINISHED", please consider
> > the following files:
> > 1. lib/librte_eal/linux/eal_thread.c: line 31
> >     lib/librte_eal/windows/eal_thread.c: line 22
> >     lib/librte_eal/freebsd/eal_thread.c: line 31
> I have looked at these lines, they do not capture "why" FINISHED state is
> required.

I mean that if we remove "FINISHED", we should change the description there:
"FINISHED" should be replaced by "WAIT".

> 
> > 2. lib/librte_eal/include/rte_launch.h: line 24, 44, 121, 123, 131
> > 3. examples/l2fwd-keepalive/main.c: line 510
> >     rte_eal_wait_lcore(id_core) can be removed. Because the core state has
> >     been checked as "WAIT", this is a redundant operation
> >
> > Best Regards
> > Feifei
> >
> > >
> > > Signed-off-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> > > Reviewed-by: Ola Liljedahl <ola.liljed...@arm.com>
> > > ---
> > >  drivers/event/dpaa2/dpaa2_eventdev_selftest.c | 2 +-
> > > drivers/event/octeontx/ssovf_evdev_selftest.c | 2 +-
> > >  drivers/event/sw/sw_evdev_selftest.c          | 4 ++--
> > >  examples/l2fwd-keepalive/main.c               | 2 +-
> > >  lib/librte_eal/common/eal_common_launch.c     | 7 ++-----
> > >  lib/librte_eal/freebsd/eal_thread.c           | 2 +-
> > >  lib/librte_eal/linux/eal_thread.c             | 8 +-------
> > >  lib/librte_eal/windows/eal_thread.c           | 8 +-------
> > >  8 files changed, 10 insertions(+), 25 deletions(-)
> > >
> > > diff --git a/drivers/event/dpaa2/dpaa2_eventdev_selftest.c b/drivers/event/dpaa2/dpaa2_eventdev_selftest.c
> > > index cd7311a94..bbbd20951 100644
> > > --- a/drivers/event/dpaa2/dpaa2_eventdev_selftest.c
> > > +++ b/drivers/event/dpaa2/dpaa2_eventdev_selftest.c
> > > @@ -468,7 +468,7 @@ wait_workers_to_join(int lcore, const rte_atomic32_t *count)
> > >      RTE_SET_USED(count);
> > >
> > >      print_cycles = cycles = rte_get_timer_cycles();
> > > -    while (rte_eal_get_lcore_state(lcore) != FINISHED) {
> > > +    while (rte_eal_get_lcore_state(lcore) != WAIT) {
> > >          uint64_t new_cycles = rte_get_timer_cycles();
> > >
> > >          if (new_cycles - print_cycles > rte_get_timer_hz()) {
> > > diff --git a/drivers/event/octeontx/ssovf_evdev_selftest.c b/drivers/event/octeontx/ssovf_evdev_selftest.c
> > > index 528f99dd8..d7b0d2211 100644
> > > --- a/drivers/event/octeontx/ssovf_evdev_selftest.c
> > > +++ b/drivers/event/octeontx/ssovf_evdev_selftest.c
> > > @@ -579,7 +579,7 @@ wait_workers_to_join(int lcore, const rte_atomic32_t *count)
> > >      RTE_SET_USED(count);
> > >
> > >      print_cycles = cycles = rte_get_timer_cycles();
> > > -    while (rte_eal_get_lcore_state(lcore) != FINISHED) {
> > > +    while (rte_eal_get_lcore_state(lcore) != WAIT) {
> > >          uint64_t new_cycles = rte_get_timer_cycles();
> > >
> > >          if (new_cycles - print_cycles > rte_get_timer_hz()) {
> > > diff --git a/drivers/event/sw/sw_evdev_selftest.c b/drivers/event/sw/sw_evdev_selftest.c
> > > index e4bfb3a0f..7847a8645 100644
> > > --- a/drivers/event/sw/sw_evdev_selftest.c
> > > +++ b/drivers/event/sw/sw_evdev_selftest.c
> > > @@ -3138,8 +3138,8 @@ worker_loopback(struct test *t, uint8_t disable_implicit_release)
> > >      rte_eal_remote_launch(worker_loopback_worker_fn, t, w_lcore);
> > >
> > >      print_cycles = cycles = rte_get_timer_cycles();
> > > -    while (rte_eal_get_lcore_state(p_lcore) != FINISHED ||
> > > -            rte_eal_get_lcore_state(w_lcore) != FINISHED) {
> > > +    while (rte_eal_get_lcore_state(p_lcore) != WAIT ||
> > > +            rte_eal_get_lcore_state(w_lcore) != WAIT) {
> > >
> > >          rte_service_run_iter_on_app_lcore(t->service_id, 1);
> > >
> > > diff --git a/examples/l2fwd-keepalive/main.c b/examples/l2fwd-keepalive/main.c
> > > index e4c2b2793..dd777c46a 100644
> > > --- a/examples/l2fwd-keepalive/main.c
> > > +++ b/examples/l2fwd-keepalive/main.c
> > > @@ -506,7 +506,7 @@ dead_core(__rte_unused void *ptr_data, const int id_core)
> > >      if (terminate_signal_received)
> > >          return;
> > >      printf("Dead core %i - restarting..\n", id_core);
> > > -    if (rte_eal_get_lcore_state(id_core) == FINISHED) {
> > > +    if (rte_eal_get_lcore_state(id_core) == WAIT) {
> > >          rte_eal_wait_lcore(id_core);
> > >          rte_eal_remote_launch(l2fwd_launch_one_lcore, NULL, id_core);
> > >      } else {
> > > diff --git a/lib/librte_eal/common/eal_common_launch.c b/lib/librte_eal/common/eal_common_launch.c
> > > index 34f854ad8..78fd94026 100644
> > > --- a/lib/librte_eal/common/eal_common_launch.c
> > > +++ b/lib/librte_eal/common/eal_common_launch.c
> > > @@ -26,14 +26,11 @@ rte_eal_wait_lcore(unsigned worker_id)
> > >      if (lcore_config[worker_id].state == WAIT)
> > >          return 0;
> > >
> > > -    while (lcore_config[worker_id].state != WAIT &&
> > > -           lcore_config[worker_id].state != FINISHED)
> > > +    while (lcore_config[worker_id].state != WAIT)
> > >          rte_pause();
> > >
> > >      rte_rmb();
> > >
> > > -    /* we are in finished state, go to wait state */
> > > -    lcore_config[worker_id].state = WAIT;
> > >      return lcore_config[worker_id].ret;
> > >  }
> > >
> > > @@ -62,7 +59,7 @@ rte_eal_mp_remote_launch(int (*f)(void *), void *arg,
> > >
> > >      if (call_main == CALL_MAIN) {
> > >          lcore_config[main_lcore].ret = f(arg);
> > > -        lcore_config[main_lcore].state = FINISHED;
> > > +        lcore_config[main_lcore].state = WAIT;
> > >      }
> > >
> > >      return 0;
> > > diff --git a/lib/librte_eal/freebsd/eal_thread.c b/lib/librte_eal/freebsd/eal_thread.c
> > > index 17b8f3996..6d6f1e2fd 100644
> > > --- a/lib/librte_eal/freebsd/eal_thread.c
> > > +++ b/lib/librte_eal/freebsd/eal_thread.c
> > > @@ -140,7 +140,7 @@ eal_thread_loop(__rte_unused void *arg)
> > >          lcore_config[lcore_id].f = NULL;
> > >          lcore_config[lcore_id].arg = NULL;
> > >          rte_wmb();
> > > -        lcore_config[lcore_id].state = FINISHED;
> > > +        lcore_config[lcore_id].state = WAIT;
> > >      }
> > >
> > >      /* never reached */
> > > diff --git a/lib/librte_eal/linux/eal_thread.c b/lib/librte_eal/linux/eal_thread.c
> > > index a0a009104..7b9463df3 100644
> > > --- a/lib/librte_eal/linux/eal_thread.c
> > > +++ b/lib/librte_eal/linux/eal_thread.c
> > > @@ -141,13 +141,7 @@ eal_thread_loop(__rte_unused void *arg)
> > >          lcore_config[lcore_id].arg = NULL;
> > >          rte_wmb();
> > >
> > > -        /* when a service core returns, it should go directly to WAIT
> > > -         * state, because the application will not lcore_wait() for it.
> > > -         */
> > > -        if (lcore_config[lcore_id].core_role == ROLE_SERVICE)
> > > -            lcore_config[lcore_id].state = WAIT;
> > > -        else
> > > -            lcore_config[lcore_id].state = FINISHED;
> > > +        lcore_config[lcore_id].state = WAIT;
> > >      }
> > >
> > >      /* never reached */
> > > diff --git a/lib/librte_eal/windows/eal_thread.c b/lib/librte_eal/windows/eal_thread.c
> > > index 7a9277c51..35d059a30 100644
> > > --- a/lib/librte_eal/windows/eal_thread.c
> > > +++ b/lib/librte_eal/windows/eal_thread.c
> > > @@ -125,13 +125,7 @@ eal_thread_loop(void *arg __rte_unused)
> > >          lcore_config[lcore_id].arg = NULL;
> > >          rte_wmb();
> > >
> > > -        /* when a service core returns, it should go directly to WAIT
> > > -         * state, because the application will not lcore_wait() for it.
> > > -         */
> > > -        if (lcore_config[lcore_id].core_role == ROLE_SERVICE)
> > > -            lcore_config[lcore_id].state = WAIT;
> > > -        else
> > > -            lcore_config[lcore_id].state = FINISHED;
> > > +        lcore_config[lcore_id].state = WAIT;
> > >      }
> > >  }
> > >
> > > --
> > > 2.17.1
> > >
