Hi Minh,

Before IMMND receives the re-intro rsp from IMMD,
it receives IMMND_EVT_D2ND_PBE_PRTO_PURGE_MUTATIONS from an IMMD broadcast
and then crashes, because that event relates to a different partition.
Because IMMND has crashed, it cannot reboot the node as expected.

The patch keeps IMMND from handling broadcast events before it has received the re-intro rsp.
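
To make the idea concrete, here is a minimal standalone sketch of the two checks the patch adds. It is not the real IMMND code: the sketch_* names and enum values are hypothetical stand-ins for the IMMND_EVT_* and NCS_IPC_PRIORITY_* identifiers in the diff, I assume mIntroduced == 2 means "re-intro on-going" (as the new log message suggests), and SKETCH_PRIO_DEFAULT only stands for whatever the existing else branch uses.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the event types named in the patch. */
enum sketch_evt_type {
    SKETCH_D2ND_INTRO_RSP,
    SKETCH_D2ND_SYNC_START,
    SKETCH_D2ND_SYNC_ABORT,
    SKETCH_D2ND_PBE_PRTO_PURGE_MUTATIONS,
    SKETCH_D2ND_DUMP_OK,
    SKETCH_D2ND_LOADING_OK,
    SKETCH_D2ND_GLOB_FEVS_REQ,
    SKETCH_D2ND_GLOB_FEVS_REQ_2,
    SKETCH_A2ND_IMM_INIT,
    SKETCH_OTHER
};

/* Hypothetical stand-ins for the mailbox priorities. */
enum sketch_prio {
    SKETCH_PRIO_DEFAULT,
    SKETCH_PRIO_HIGH,
    SKETCH_PRIO_VERY_HIGH
};

/* Assumption: this value of mIntroduced means re-intro is on-going. */
#define SKETCH_REINTRO_ONGOING 2

/* True for the IMMD broadcast events listed in the patch. */
static bool sketch_is_immd_broadcast(enum sketch_evt_type type)
{
    switch (type) {
    case SKETCH_D2ND_SYNC_START:
    case SKETCH_D2ND_SYNC_ABORT:
    case SKETCH_D2ND_PBE_PRTO_PURGE_MUTATIONS:
    case SKETCH_D2ND_DUMP_OK:
    case SKETCH_D2ND_LOADING_OK:
    case SKETCH_D2ND_GLOB_FEVS_REQ:
    case SKETCH_D2ND_GLOB_FEVS_REQ_2:
        return true;
    default:
        return false;
    }
}

/* Drop broadcast events while the re-intro rsp has not arrived yet. */
static bool sketch_should_discard(int introduced, enum sketch_evt_type type)
{
    return introduced == SKETCH_REINTRO_ONGOING &&
           sketch_is_immd_broadcast(type);
}

/* Queue the intro rsp ahead of everything else in the mailbox. */
static enum sketch_prio sketch_mbx_priority(enum sketch_evt_type type)
{
    if (type == SKETCH_D2ND_INTRO_RSP)
        return SKETCH_PRIO_VERY_HIGH;
    if (type == SKETCH_A2ND_IMM_INIT)
        return SKETCH_PRIO_HIGH;
    return SKETCH_PRIO_DEFAULT;
}
```

With this, a purge-mutations broadcast that arrives during re-intro is dropped, while the intro rsp itself is never dropped and is queued at the highest priority.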

Best Regards,
ThuanTr

-----Original Message-----
From: Minh Hon Chau <minh.c...@dektech.com.au> 
Sent: Friday, September 18, 2020 9:11 AM
To: Thuan Tran <thuan.t...@dektech.com.au>; Thang Duc Nguyen 
<thang.d.ngu...@dektech.com.au>; Thanh Nguyen <thanh.ngu...@dektech.com.au>; 
Thien Minh Huynh <thien.m.hu...@dektech.com.au>
Cc: opensaf-devel@lists.sourceforge.net
Subject: Re: [PATCH 1/1] imm: fix immnd crash in multi partitioned clusters rejoin [#3219]

Hi Thuan,

Can you elaborate a bit more:

- how the crash happens and what causes it.

- how the patch works.

Thanks

Minh

On 17/9/20 1:35 pm, thuan.tran wrote:
> - immnd prioritizes the re-introduce rsp from immd.
> - immnd ignores broadcast events from IMMD while re-introduce is on-going.
> ---
>   src/imm/immnd/immnd_evt.c | 21 +++++++++++++++------
>   src/imm/immnd/immnd_mds.c |  5 ++++-
>   2 files changed, 19 insertions(+), 7 deletions(-)
>
> diff --git a/src/imm/immnd/immnd_evt.c b/src/imm/immnd/immnd_evt.c
> index afc2106a0..714a75ca2 100644
> --- a/src/imm/immnd/immnd_evt.c
> +++ b/src/imm/immnd/immnd_evt.c
> @@ -625,6 +625,21 @@ void immnd_process_evt(void)
>               return;
>       }
>   
> +     if ((cb->mIntroduced == 2) &&
> +         ((evt->info.immnd.type == IMMND_EVT_D2ND_SYNC_START) ||
> +          (evt->info.immnd.type == IMMND_EVT_D2ND_SYNC_ABORT) ||
> +          (evt->info.immnd.type == IMMND_EVT_D2ND_PBE_PRTO_PURGE_MUTATIONS) ||
> +          (evt->info.immnd.type == IMMND_EVT_D2ND_DUMP_OK) ||
> +          (evt->info.immnd.type == IMMND_EVT_D2ND_LOADING_OK) ||
> +          (evt->info.immnd.type == IMMND_EVT_D2ND_GLOB_FEVS_REQ) ||
> +          (evt->info.immnd.type == IMMND_EVT_D2ND_GLOB_FEVS_REQ_2))) {
> +             LOG_WA("DISCARD message %s from IMMD %x as re-intro on-going",
> +                 immsv_get_immnd_evt_name(evt->info.immnd.type),
> +                 evt->sinfo.node_id);
> +             immnd_evt_destroy(evt, true, __LINE__);
> +             return;
> +     }
> +
>       if ((evt->info.immnd.type != IMMND_EVT_D2ND_GLOB_FEVS_REQ) &&
>           (evt->info.immnd.type != IMMND_EVT_D2ND_GLOB_FEVS_REQ_2))
>               immsv_msg_trace_rec(evt->sinfo.dest, evt);
> @@ -10779,12 +10794,6 @@ static uint32_t immnd_evt_proc_fevs_rcv(IMMND_CB *cb, IMMND_EVT *evt,
>                            : false;
>       TRACE_ENTER();
>   
> -     if (cb->mIntroduced == 2) {
> -             LOG_WA("DISCARD FEVS message:%llu from %x", msgNo, sinfo->node_id);
> -             dequeue_outgoing(cb);
> -             return NCSCC_RC_FAILURE;
> -     }
> -
>       if (cb->highestProcessed >= msgNo) {
>               /*We have already received this message, discard it. */
>               LOG_WA(
> diff --git a/src/imm/immnd/immnd_mds.c b/src/imm/immnd/immnd_mds.c
> index 02cb4b552..d9cccd5d9 100644
> --- a/src/imm/immnd/immnd_mds.c
> +++ b/src/imm/immnd/immnd_mds.c
> @@ -552,7 +552,10 @@ static uint32_t immnd_mds_rcv(IMMND_CB *cb, MDS_CALLBACK_RECEIVE_INFO *rcv_info)
>       }
>   
>       /* Put it in IMMND's Event Queue */
> -     if (pEvt->info.immnd.type == IMMND_EVT_A2ND_IMM_INIT)
> +     if (pEvt->info.immnd.type == IMMND_EVT_D2ND_INTRO_RSP)
> +             rc = m_NCS_IPC_SEND(&cb->immnd_mbx, (NCSCONTEXT)pEvt,
> +                                 NCS_IPC_PRIORITY_VERY_HIGH);
> +     else if (pEvt->info.immnd.type == IMMND_EVT_A2ND_IMM_INIT)
>               rc = m_NCS_IPC_SEND(&cb->immnd_mbx, (NCSCONTEXT)pEvt,
>                                   NCS_IPC_PRIORITY_HIGH);
>       else

_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel
