I updated the solution and sent out V2.

-----Original Message-----
From: Minh Hon Chau <minh.c...@dektech.com.au>
Sent: Tuesday, April 21, 2020 2:23 PM
To: Thuan Tran <thuan.t...@dektech.com.au>; Thang Duc Nguyen <thang.d.ngu...@dektech.com.au>
Cc: opensaf-devel@lists.sourceforge.net
Subject: Re: [PATCH 1/1] ntf: restart ntfimcnd if it fails to get operation invoke name [#3178]

Agree.

On 21/4/20 12:24 pm, Thuan Tran wrote:
> Hi,
>
> If there is no way to get the admin owner or object implementer in the
> middle of a CCB with many operations, then an "unknown" invoker is better
> than restarting on each operation of that CCB.
>
> Best Regards,
> ThuanTr
>
> -----Original Message-----
> From: Thang Duc Nguyen <thang.d.ngu...@dektech.com.au>
> Sent: Tuesday, April 21, 2020 8:39 AM
> To: Thang Duc Nguyen <thang.d.ngu...@dektech.com.au>; Minh Hon Chau
> <minh.c...@dektech.com.au>; Thuan Tran <thuan.t...@dektech.com.au>
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: RE: [PATCH 1/1] ntf: restart ntfimcnd if it fails to get
> operation invoke name [#3178]
>
> Update.
>
> If we accept to avoid coredump, there is @operation_invoke_name that needs to
> be freed before exit?
> [Thang]: as above, we can fill invoke_name as "unknown" in this case to avoid
> the coredump, and free it in applyccbcb.
>
> -----Original Message-----
> From: Thang Duc Nguyen <thang.d.ngu...@dektech.com.au>
> Sent: Tuesday, April 21, 2020 8:29 AM
> To: Minh Hon Chau <minh.c...@dektech.com.au>; Thuan Tran
> <thuan.t...@dektech.com.au>
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: Re: [devel] [PATCH 1/1] ntf: restart ntfimcnd if it fails to
> get operation invoke name [#3178]
>
> Hi Minh,
> See my comments inline.
>
> -----Original Message-----
> From: Minh Hon Chau <minh.c...@dektech.com.au>
> Sent: Monday, April 20, 2020 5:24 PM
> To: Thang Duc Nguyen <thang.d.ngu...@dektech.com.au>; Thuan Tran
> <thuan.t...@dektech.com.au>
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: Re: [PATCH 1/1] ntf: restart ntfimcnd if it fails to get
> operation invoke name [#3178]
>
> Hi Thang,
>
> I understand the invoke_name is only present in the first callback, thus
> ntfimcn must memorize it in the userdata. My question is: is it OK for this
> userdata to be lost because ntfimcn restarts? I think it is, since the ccb has
> not been committed.
> [Thang]: we can accept it and fill invoke_name as "unknown" instead of doing
> nothing.
>
> If we accept the userdata being lost, then we can look at how to avoid the
> coredump; otherwise Thuan can give an idea of whether it is an imm issue that
> causes the lost userdata.
>
> If we accept to avoid coredump, there is @operation_invoke_name that needs to
> be freed before exit?
> [Thang]: as above, we can fill invoke_name as "unknown" in this case to avoid
> the coredump.
>
>
> thanks
>
> Minh
>
> On 20/4/20 6:30 pm, Thang Duc Nguyen wrote:
>> Hi Minh,
>>
>> See my comment inline.
>>
>> -----Original Message-----
>> From: Minh Hon Chau <minh.c...@dektech.com.au>
>> Sent: Monday, April 20, 2020 11:51 AM
>> To: Thuan Tran <thuan.t...@dektech.com.au>; Thang Duc Nguyen
>> <thang.d.ngu...@dektech.com.au>
>> Cc: opensaf-devel@lists.sourceforge.net
>> Subject: Re: [PATCH 1/1] ntf: restart ntfimcnd if it fails to get
>> operation invoke name [#3178]
>>
>> Hi,
>>
>> One similarity to #2859 is that the invoke_name is only present in the first
>> callback, so ntfimcn must memorize it in ccb userdata.
>>
>> But after ntfimcn calls ccbutil_ccbAddModifyOperation, this userdata is not
>> written to immnd and synced across the other immnd(s)?
>> Meaning the userdata is only stored in the imm agent? So after switchover, the
>> next ccb callback does not have the invoke_name, and ntfimcn has lost its
>> user data since the restart.
>>
>> [Thang]: with a ccb that has multiple ops, only the first op contains the
>> adminOwnername in this case. After ntfimcnd restarts, it receives the second
>> or a later modify op, and that modify callback no longer carries the
>> invoke_name.
>> Maybe we can retrieve the invoke_name from the imm db, but we cannot get all
>> info about all ops in that ccb.
>>
>> Thanks
>>
>> Minh
>>
>> On 16/4/20 3:32 pm, Thuan Tran wrote:
>>> Hi,
>>>
>>> I think this is just an enhancement, not an urgent fix.
>>> Then we should make it better if possible.
>>>
>>> About #2859, I was not a reviewer at that time.
>>> But I would not agree with that solution, as we can see the service keeps
>>> restarting if it starts in the middle of a CCB with many operations.
>>>
>>> Best Regards,
>>> ThuanTr
>>>
>>> -----Original Message-----
>>> From: Thang Duc Nguyen <thang.d.ngu...@dektech.com.au>
>>> Sent: Thursday, April 16, 2020 10:51 AM
>>> To: Thuan Tran <thuan.t...@dektech.com.au>; Minh Hon Chau
>>> <minh.c...@dektech.com.au>
>>> Cc: opensaf-devel@lists.sourceforge.net
>>> Subject: RE: [PATCH 1/1] ntf: restart ntfimcnd if it fails to get
>>> operation invoke name [#3178]
>>>
>>> Hi Thuan,
>>>
>>> Thanks for your comment.
>>> First, this issue happens only in a specific situation, and I think a
>>> restart does not cause a big issue.
>>> The ccb is internal data managed by ntf/ntfimcnd. After ntfimcnd restarts,
>>> it reinitializes CcbUtilCcbData and the operation invoke name is empty.
>>>
>>> Moreover, in the current code in ntfimcn_imm.c, there are many places that
>>> use imcn_exit(EXIT_FAILURE) when an error is detected. An example of this
>>> is #2859.
>>> We are considering opening a new ticket for your suggestion to
>>> refactor/change the current behavior of ntfimcnd.
>>>
>>> B.R/Thang
>>>
>>> -----Original Message-----
>>> From: Thuan Tran <thuan.t...@dektech.com.au>
>>> Sent: Thursday, April 16, 2020 10:16 AM
>>> To: Thang Duc Nguyen <thang.d.ngu...@dektech.com.au>; Minh Hon Chau
>>> <minh.c...@dektech.com.au>
>>> Cc: opensaf-devel@lists.sourceforge.net
>>> Subject: RE: [PATCH 1/1] ntf: restart ntfimcnd if it fails to get
>>> operation invoke name [#3178]
>>>
>>> Hi Thang,
>>>
>>> Following the reproduction method with this solution, the service exits
>>> (instead of crashing); if the user then inputs another operation, the
>>> service exits again.
>>> The point is: why can we not get the admin owner or object implementer via
>>> the 2nd imm modify callback in this scenario?
>>> Is it an IMM limitation that the admin owner or object implementer is not
>>> included from the 2nd modify callback onwards?
>>>
>>> If it is a limitation, can we use another way to get the admin owner or
>>> object implementer based on the object name?
>>> That way, we can avoid continuous exits if the user keeps issuing
>>> operations in the same CCB.
>>>
>>> Best Regards,
>>> ThuanTr
>>>
>>> -----Original Message-----
>>> From: Thang Duc Nguyen <thang.d.ngu...@dektech.com.au>
>>> Sent: Wednesday, April 15, 2020 3:43 PM
>>> To: Minh Hon Chau <minh.c...@dektech.com.au>; Thuan Tran
>>> <thuan.t...@dektech.com.au>
>>> Cc: opensaf-devel@lists.sourceforge.net; Thang Duc Nguyen
>>> <thang.d.ngu...@dektech.com.au>
>>> Subject: [PATCH 1/1] ntf: restart ntfimcnd if it fails to get
>>> operation invoke name [#3178]
>>>
>>> If ntfimcnd is restarted during a ccb modify, it initializes a
>>> ccbUtilCcbData that does not contain the operation invoke name.
>>> This causes ntfimcnd to crash because the operation invoke name does not
>>> exist.
>>>
>>> The fix is to restart ntfimcnd instead of raising the coredump.
>>> ---
>>>  src/ntf/ntfimcnd/ntfimcn_imm.c | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/ntf/ntfimcnd/ntfimcn_imm.c b/src/ntf/ntfimcnd/ntfimcn_imm.c
>>> index 3c0a8c02a..3563a2264 100644
>>> --- a/src/ntf/ntfimcnd/ntfimcn_imm.c
>>> +++ b/src/ntf/ntfimcnd/ntfimcn_imm.c
>>> @@ -376,9 +376,9 @@ get_operation_invoke_name_modify(SaImmOiCcbIdT ccbId,
>>>  			goto done;
>>>  		}
>>>  	}
>>> -	/* If we get here no name is found! */
>>> +	/* ntfimcnd was restarted during ccb modify */
>>>  	LOG_ER("%s no name was found", __FUNCTION__);
>>> -	osafassert(0);
>>> +	imcn_exit(EXIT_FAILURE);
>>>
>>>  done:
>>>  	TRACE_LEAVE();
>>> --
>>> 2.17.1
>>>

_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel
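
For reference, the fallback agreed on in this thread (fill the invoke name as "unknown" when it was not memorized, and free it when the CCB is applied) might look roughly like the sketch below. It is only an illustration under assumptions: struct ccb_userdata, invoke_name_or_unknown() and release_invoke_name() are hypothetical names, not the actual ntfimcn_imm.c code or the V2 patch, and no OpenSAF API is called.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-CCB user data that ntfimcnd keeps
 * between callbacks; the real structure in ntfimcn_imm.c may differ. */
struct ccb_userdata {
	char *operation_invoke_name; /* NULL if never memorized or lost */
};

/* Heap-allocated copy of s, or NULL if allocation fails. */
static char *copy_string(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);

	if (p != NULL)
		memcpy(p, s, n);
	return p;
}

/* Return the memorized invoke name. If none is present (for example
 * because the process restarted in the middle of a CCB), store a heap
 * copy of "unknown" instead of asserting or exiting.
 * Returns NULL only if that allocation fails. */
const char *invoke_name_or_unknown(struct ccb_userdata *ud)
{
	assert(ud != NULL);
	if (ud->operation_invoke_name == NULL)
		ud->operation_invoke_name = copy_string("unknown");
	return ud->operation_invoke_name;
}

/* Intended to be called when the CCB is applied or aborted, so the
 * name (real or "unknown") is freed exactly once. */
void release_invoke_name(struct ccb_userdata *ud)
{
	assert(ud != NULL);
	free(ud->operation_invoke_name);
	ud->operation_invoke_name = NULL;
}
```

Storing the fallback in the same field as a real name keeps a single ownership rule: whatever is in operation_invoke_name is heap-allocated and is released in one place.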