Hi Hans,

Thanks for your comment. 

But I has a concern that the service-monitoring function may not fully work 
if a service is crashed before the svc_monitor_thread goes to ready state?

Is it mandatory for monitoring thread to enter ready state before spawning
SAF services?

Regards, Vu

> -----Original Message-----
> From: Hans Nordebäck <hans.nordeb...@ericsson.com>
> Sent: Wednesday, February 27, 2019 8:23 PM
> To: Vu Minh Nguyen <vu.m.ngu...@dektech.com.au>; Gary Lee
> <gary....@dektech.com.au>
> Cc: opensaf-devel@lists.sourceforge.net; Vu Minh Nguyen
> <vu.m.ngu...@dektech.com.au>
> Subject: RE: [PATCH 1/1] nid: fix opensafd crashed during start-up [#3013]
> 
> Hi Vu,
> I discussed a bit with Anders, likely it should work if the socketpair is
changed
> to socketpair(AF_UNIX, SOCK_DGRAM .. from SOCK_STREAM. /BR Hans
> 
> -----Original Message-----
> From: Hans Nordebäck
> Sent: den 27 februari 2019 11:55
> To: 'Vu Minh Nguyen' <vu.m.ngu...@dektech.com.au>; Gary Lee
> <gary....@dektech.com.au>
> Cc: opensaf-devel@lists.sourceforge.net; Vu Minh Nguyen
> <vu.m.ngu...@dektech.com.au>
> Subject: RE: [PATCH 1/1] nid: fix opensafd crashed during start-up [#3013]
> 
> Hi Vu,
> ack, code review only/Thanks HansN
> 
> -----Original Message-----
> From: Vu Minh Nguyen <vu.m.ngu...@dektech.com.au>
> Sent: den 27 februari 2019 11:48
> To: Hans Nordebäck <hans.nordeb...@ericsson.com>; Gary Lee
> <gary....@dektech.com.au>
> Cc: opensaf-devel@lists.sourceforge.net; Vu Minh Nguyen
> <vu.m.ngu...@dektech.com.au>
> Subject: [PATCH 1/1] nid: fix opensafd crashed during start-up [#3013]
> 
> There is a dependency b/w svc_monitor_thread and spawn_services.
> The coredump happens when spawn_services is executed while the thread
> has not yet started. In this case, data is sent to the pipe but no one
consumed
> it. Later on, reading data from the pipe, will get unexpected data and
crash
> the program.
> 
> This patch ensures the order: svc_monitor_thread must be in ready state
> before spawn_services() is executed.
> ---
>  src/nid/nodeinit.cc | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/src/nid/nodeinit.cc b/src/nid/nodeinit.cc index
> 5f15916b4..b4945b05c 100644
> --- a/src/nid/nodeinit.cc
> +++ b/src/nid/nodeinit.cc
> @@ -134,6 +134,7 @@ static int start_monitor_svc(const char *svc);
>  /* Data declarations for service monitoring */  static int svc_mon_fd =
-1;
> static int next_svc_fds_slot = 0;
> +static bool svc_monitor_thread_running = false;
> 
>  struct SAFServices {
>    const std::string fifo_dir = PKGLOCALSTATEDIR; @@ -1517,6 +1518,7 @@
> void *svc_monitor_thread(void *fd) {
>    next_svc_fds_slot++;
> 
>    while (true) {
> +    svc_monitor_thread_running = true;
>      unsigned rc = osaf_poll(fds, next_svc_fds_slot, -1);
>      if (rc > 0) {
>        // check if any monitored service has exit @@ -1655,6 +1657,15 @@
int
> main(int argc, char *argv[]) {
>      exit(EXIT_FAILURE);
>    }
> 
> +  // Waiting until svc_monitor_thread is up and in ready state.
> +  // If spawn_services runs before the thread is in ready state,  //
> + receive side of the pipe s_pair will get unexpected data and  // may
> + crash the process.
> +  while (svc_monitor_thread_running == false) {
> +    usleep(100);
> +  }
> +
> +  LOG_NO("svc_monitor_thread is up and in ready state");
>    if (parse_nodeinit_conf(sbuf) != NCSCC_RC_SUCCESS) {
>      LOG_ER("Failed to parse file %s. Exiting", sbuf);
>      exit(EXIT_FAILURE);
> --
> 2.19.2




_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel

Reply via email to