Re: [devel] [PATCH 1 of 1] amfnd: monitor immnd process using FIFO [#2158]
Hi Hans,

I agree with that.
Ticket #2204 is pushed to the default branch only, but it is marked as a defect. It should be marked as an enhancement.

Thanks,
Praveen

On 22-Dec-16 1:50 PM, Hans Nordebäck wrote:
> Hi Minh & Praveen,
>
> Do we agree to close ticket #2158 as a duplicate of #2204?
>
> /Thanks HansN
Re: [devel] [PATCH 1 of 1] amfnd: monitor immnd process using FIFO [#2158]
Hi Hans,

As there is no window in which immnd is unsupervised, I agree that #2158 is a duplicate of #2204.

Thanks,
Minh

On 22/12/16 19:20, Hans Nordebäck wrote:
> Hi Minh & Praveen,
>
> Do we agree to close ticket #2158 as a duplicate of #2204?
>
> /Thanks HansN
Re: [devel] [PATCH 1 of 1] amfnd: monitor immnd process using FIFO [#2158]
Hi Minh & Praveen,

Do we agree to close ticket #2158 as a duplicate of #2204?

/Thanks HansN

-----Original Message-----
From: Hans Nordebäck [mailto:hans.nordeb...@ericsson.com]
Sent: 20 December 2016 08:18
To: praveen malviya <praveen.malv...@oracle.com>; nagendr...@oracle.com; Gary Lee <gary@dektech.com.au>; Minh Hon Chau <minh.c...@dektech.com.au>
Cc: opensaf-devel@lists.sourceforge.net
Subject: Re: [devel] [PATCH 1 of 1] amfnd: monitor immnd process using FIFO [#2158]

Hi Praveen,

I agree with your summary below.

/Thanks HansN
Re: [devel] [PATCH 1 of 1] amfnd: monitor immnd process using FIFO [#2158]
Hi Praveen,

I agree with your summary below.

/Thanks HansN

-----Original Message-----
From: praveen malviya [mailto:praveen.malv...@oracle.com]
Sent: 20 December 2016 07:57
To: Hans Nordebäck; nagendr...@oracle.com; Gary Lee; Minh Hon Chau
Cc: opensaf-devel@lists.sourceforge.net
Subject: Re: [PATCH 1 of 1] amfnd: monitor immnd process using FIFO [#2158]

Hi Hans,

Please see inline with [Praveen].

Thanks,
Praveen
Re: [devel] [PATCH 1 of 1] amfnd: monitor immnd process using FIFO [#2158]
Hi Hans,

Please see inline with [Praveen].

Thanks,
Praveen

On 16-Dec-16 4:45 PM, Hans Nordeback wrote:
> Hi Praveen, please see inline with [HansN].
>
> /Thanks HansN
>
> On 12/16/2016 11:13 AM, praveen malviya wrote:
>> Hi Hans,
>>
>> Please see inline with [Praveen].
>>
>> Thanks,
>> Praveen
>>
>> On 16-Dec-16 3:14 PM, Hans Nordeback wrote:
>>> Hi Praveen,
>>>
>>> please see some comments/questions inline below.
>>>
>>> /Thanks HansN
>>>
>>> On 12/16/2016 07:42 AM, praveen malviya wrote:
>>>> Hi Hans,
>>>>
>>>> Currently, AMFND responds to NID when all MW components are in AMF
>>>> control. So there is window.
>>> [HansN] you mean 'no window'?
>> [Praveen] Sorry, I missed 'no'. It is indeed 'no window'.
>>>> I have explicitly mentioned in the patch below (see Note in main()
>>>> in amfnd.cc below) that this may be usable only in future.
>>>> However there is one very weak case, when AMFND is just starting
>>>> NoRed components like CPD etc. and immnd has still not become an AMF
>>>> component and also AMFND has not responded to NID. Now at this
>>>> stage, if the immnd process crashes, then the node will reboot only
>>>> after the CPD instantiation timer expires, which is about 10 seconds.
>>> [HansN] with #2204 v2 nid will detect that immnd has crashed and it is
>>> configurable in the opensafd script to reboot the node or not at
>>> nid failure. If amfnd, with this patch, also detects that immnd has
>>> crashed, there may be a race between amfnd and nid,
>>> where amfnd performs a reboot while nid may be configured to not reboot?
>> [Praveen] I agree with this. But this puts a restriction on NID
>> recovery, that its recovery can be reboot only, because if NID wants to
>> restart the IMMND process again, then all other MW processes need to be
>> examined to take this into account and be ready for re-initialization
>> with IMM in case the restarted IMM does not resurrect old handles. Not
>> only handles, there can be other resources also.
>
> [HansN] the configurable restart of e.g. IMMND will not be done after
> nid notify. There will be no restart attempts of e.g. IMMND after
> nid notify. The FIFO monitoring will not restart the service, but exit.

[Praveen] So the concern that I raised does not hold now, as NID will not
restart IMMND after it has done NID notify. So we have the following
sequences of handling:
1) If IMMND crashes before replying to NID, then NID will restart it.
2) If IMMND crashes after the NID reply, then NID will exit with an error.
   The opensafd script will then reboot the node based on the
   user-configured value of REBOOT_ON_FAIL_TIMEOUT.

Case 2 above gives the user the choice to configure node reboot in such
cases.

Coming back to this patch for #2158: AMFND is started by NID as the last
service. By this time, IMMND has already notified NID. As I had already
said, the #2158 patch will reboot the node. Here NID's action can be to
exit, and based on the NID status the opensaf script will take action. So
rebooting the node because of the #2158 patch will interfere with the
user's choice of REBOOT_ON_FAIL_TIMEOUT. Because of this, #2158 becomes a
duplicate of #2204.

>> Since #2204 is a defect, I think enhanced capabilities can be
>> ignored. In that case #2158 becomes almost a duplicate of #2204.
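As a reading aid, the two crash-handling cases in Praveen's summary can be written down as a small decision function. This is only a sketch under assumptions: the enum and function names are hypothetical and are not OpenSAF identifiers, and the user's REBOOT_ON_FAIL_TIMEOUT setting is reduced to a single boolean because the thread does not say how its value is interpreted.

```cpp
#include <cassert>

// Hypothetical names for illustration only; not OpenSAF identifiers.
enum class Phase { BeforeNidReply, AfterNidReply };
enum class Action { NidRestartsImmnd, NodeReboot, NidExitsNoReboot };

// Case 1: IMMND crashes before it has replied to NID -> NID restarts it.
// Case 2: IMMND crashes after the reply -> NID exits with an error and the
//         opensafd script decides about a reboot from user configuration.
// Assumption: the configuration is modelled as one boolean here.
Action immnd_crash_action(Phase phase, bool reboot_configured) {
  if (phase == Phase::BeforeNidReply) {
    return Action::NidRestartsImmnd;
  }
  return reboot_configured ? Action::NodeReboot : Action::NidExitsNoReboot;
}
```

Seen this way, an unconditional reboot issued by AMFND in case 2 would bypass the `reboot_configured` branch, which is the interference with the user's choice that the summary describes.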
Re: [devel] [PATCH 1 of 1] amfnd: monitor immnd process using FIFO [#2158]
Hi Praveen, please see inline with [HansN].

/Thanks HansN

On 12/16/2016 11:13 AM, praveen malviya wrote:
> Hi Hans,
>
> Please see inline with [Praveen].
>
> Thanks,
> Praveen
>
> On 16-Dec-16 3:14 PM, Hans Nordeback wrote:
>> Hi Praveen,
>>
>> please see some comments/questions inline below.
>>
>> /Thanks HansN
>>
>> On 12/16/2016 07:42 AM, praveen malviya wrote:
>>> Hi Hans,
>>>
>>> Currently, AMFND responds to NID when all MW components are in AMF
>>> control. So there is window.
>> [HansN] you mean 'no window'?
> [Praveen] Sorry, I missed 'no'. It is indeed 'no window'.
>>> I have explicitly mentioned in the patch below (see Note in main() in
>>> amfnd.cc below) that this may be usable only in future.
>>> However there is one very weak case, when AMFND is just starting NoRed
>>> components like CPD etc. and immnd has still not become an AMF component
>>> and also AMFND has not responded to NID. Now at this stage, if the immnd
>>> process crashes, then the node will reboot only after the CPD
>>> instantiation timer expires, which is about 10 seconds.
>> [HansN] with #2204 v2 nid will detect that immnd has crashed and it is
>> configurable in the opensafd script to reboot the node or not at
>> nid failure. If amfnd, with this patch, also detects that immnd has
>> crashed, there may be a race between amfnd and nid,
>> where amfnd performs a reboot while nid may be configured to not reboot?
> [Praveen] I agree with this. But this puts a restriction on NID
> recovery, that its recovery can be reboot only, because if NID wants to
> restart the IMMND process again, then all other MW processes need to be
> examined to take this into account and be ready for re-initialization
> with IMM in case the restarted IMM does not resurrect old handles. Not
> only handles, there can be other resources also.

[HansN] the configurable restart of e.g. IMMND will not be done after
nid notify. There will be no restart attempts of e.g. IMMND after nid
notify. The FIFO monitoring will not restart the service, but exit.

> Since #2204 is a defect, I think enhanced capabilities can be
> ignored. In that case #2158 becomes almost a duplicate of #2204.
Re: [devel] [PATCH 1 of 1] amfnd: monitor immnd process using FIFO [#2158]
Hi Hans,

Please see inline with [Praveen].

Thanks,
Praveen

On 16-Dec-16 3:14 PM, Hans Nordeback wrote:
> Hi Praveen,
>
> please see some comments/questions inline below.
>
> /Thanks HansN
>
> On 12/16/2016 07:42 AM, praveen malviya wrote:
>> Hi Hans,
>>
>> Currently, AMFND responds to NID when all MW components are in AMF
>> control. So there is window.
> [HansN] you mean 'no window'?

[Praveen] Sorry, I missed 'no'. It is indeed 'no window'.

>> I have explicitly mentioned in the patch below (see Note in main() in
>> amfnd.cc below) that this may be usable only in future.
>> However there is one very weak case, when AMFND is just starting NoRed
>> components like CPD etc. and immnd has still not become an AMF component
>> and also AMFND has not responded to NID. Now at this stage, if the immnd
>> process crashes, then the node will reboot only after the CPD
>> instantiation timer expires, which is about 10 seconds.
> [HansN] with #2204 v2 nid will detect that immnd has crashed and it is
> configurable in the opensafd script to reboot the node or not at
> nid failure. If amfnd, with this patch, also detects that immnd has
> crashed, there may be a race between amfnd and nid,
> where amfnd performs a reboot while nid may be configured to not reboot?

[Praveen] I agree with this. But this puts a restriction on NID recovery,
that its recovery can be reboot only, because if NID wants to restart the
IMMND process again, then all other MW processes need to be examined to
take this into account and be ready for re-initialization with IMM in
case the restarted IMM does not resurrect old handles. Not only handles,
there can be other resources also.
Since #2204 is a defect, I think enhanced capabilities can be ignored. In
that case #2158 becomes almost a duplicate of #2204.

>> But this patch will do an immediate reboot. In this case it can be
>> argued that NID can also take action, but I think since this is the
>> last service NID will also act very late. Here other NoRed
>> components' instantiation will fail as, I think, they want to use IMM,
>> which is not available.
>> I will see if something can be done to make IMM the first NoRed comp to
>> get instantiated using compinstantiationlevel. In that case this patch
>> may not be needed.
>>
>> Thanks,
>> Praveen
>>
>> On 15-Dec-16 7:45 PM, Hans Nordeback wrote:
>>> Hi Praveen,
>>>
>>> I'll review this ticket tomorrow, but one question in advance, what is
>>> the use case for this ticket?
>>>
>>> Looking at the ticket description, the use case seems to be immnd
>>> crashes during the nid phase,
>>> but that use case is handled by ticket #2204. Is there a "window" after
>>> amfnd has called "nid_notify" where
>>> immnd is not monitored by amf? /Thanks HansN
Re: [devel] [PATCH 1 of 1] amfnd: monitor immnd process using FIFO [#2158]
Hi Praveen,

I'll review this ticket tomorrow, but one question in advance, what is
the use case for this ticket?

Looking at the ticket description, the use case seems to be immnd
crashes during the nid phase, but that use case is handled by ticket
#2204. Is there a "window" after amfnd has called "nid_notify" where
immnd is not monitored by amf?

/Thanks HansN

On 12/13/2016 11:48 AM, praveen.malv...@oracle.com wrote:
>  osaf/services/saf/amf/amfnd/clc.cc              |  11 +++
>  osaf/services/saf/amf/amfnd/comp.cc             |  13
>  osaf/services/saf/amf/amfnd/evt.cc              |   4 +
>  osaf/services/saf/amf/amfnd/include/avnd_cb.h   |   6 ++
>  osaf/services/saf/amf/amfnd/include/avnd_comp.h |   1 +
>  osaf/services/saf/amf/amfnd/include/avnd_evt.h  |   1 +
>  osaf/services/saf/amf/amfnd/include/avnd_proc.h |   1 +
>  osaf/services/saf/amf/amfnd/main.cc             |  72 +++-
>  8 files changed, 105 insertions(+), 4 deletions(-)
>
>
> If IMMND dies before it becomes AMF component there is no entity to restart it.
> With ticket #2204, NID will monitor using existing FIFO each of the started
> process until it itself exits.
> This patch will also monitor IMMND using FIFO.
> If IMMND dies before becoming AMF component and after NID exit then with this
> patch AMFND will reboot the node because AMFND does not have configuration to
> restart IMMND.
>
> diff --git a/osaf/services/saf/amf/amfnd/clc.cc b/osaf/services/saf/amf/amfnd/clc.cc
> --- a/osaf/services/saf/amf/amfnd/clc.cc
> +++ b/osaf/services/saf/amf/amfnd/clc.cc
> @@ -909,6 +909,7 @@ uint32_t avnd_comp_clc_fsm_run(AVND_CB *
>      TRACE_LEAVE2("%u", rc);
>      return rc;
>  }
> +
>
>  /
>    Name : avnd_comp_clc_st_chng_prc
>
> @@ -1486,6 +1487,12 @@ uint32_t avnd_comp_clc_st_chng_prc(AVND_
>          (comp->su->pres == SA_AMF_PRESENCE_INSTANTIATED))
>          rc = avnd_compdb_rec_del(cb, comp->name);
>
> +    if ((final_st == SA_AMF_PRESENCE_INSTANTIATED) &&
> +        (comp->is_comp_immnd() == true)) {
> +        AVND_EVT *evt = avnd_evt_create(cb, AVND_EVT_IMMND_COMP_UP, 0, 0, 0, 0, 0);
> +        if (evt)
> +            rc = avnd_evt_send(cb, evt);
> +    }
>  done:
>      TRACE_LEAVE2("%u", rc);
>      return rc;
> @@ -1553,6 +1560,10 @@ uint32_t avnd_comp_clc_uninst_inst_hdler
>
>          /* transition to 'instantiating' state */
>          avnd_comp_pres_state_set(cb, comp, SA_AMF_PRESENCE_INSTANTIATING);
> +        if (comp->is_comp_immnd() == true) {
> +            TRACE("immnd is instantiating");
> +            avnd_cb->imm_comp_state = AMFND_IMM_COMP_INSTANTIATING;
> +        }
>      }
>
>  done:
> diff --git a/osaf/services/saf/amf/amfnd/comp.cc b/osaf/services/saf/amf/amfnd/comp.cc
> --- a/osaf/services/saf/amf/amfnd/comp.cc
> +++ b/osaf/services/saf/amf/amfnd/comp.cc
> @@ -3016,3 +3016,16 @@ uint32_t avnd_amfa_mds_info_evh(AVND_CB
>      return NCSCC_RC_SUCCESS;
>  }
>
> +bool AVND_COMP::is_comp_immnd() const {
> +  if ((su->is_ncs == true) &&
> +      (name.find("safSg=NoRed,safApp=OpenSAF") != std::string::npos) &&
> +      (name.find("safComp=IMMND") != std::string::npos))
> +    return true;
> +  return false;
> +}
> +
> +uint32_t avnd_immnd_comp_evh(AVND_CB *cb, AVND_EVT *evt) {
> +  avnd_cb->imm_comp_state = AMFND_IMM_COMP_UP;
> +  TRACE("IMMND is now AMF component");
> +  return NCSCC_RC_SUCCESS;
> +}
> diff --git a/osaf/services/saf/amf/amfnd/evt.cc b/osaf/services/saf/amf/amfnd/evt.cc
> --- a/osaf/services/saf/amf/amfnd/evt.cc
> +++ b/osaf/services/saf/amf/amfnd/evt.cc
> @@ -185,6 +185,8 @@ AVND_EVT *avnd_evt_create(AVND_CB *cb,
>
>      case AVND_EVT_AMFA_MDS_VER_INFO:
>          break;
> +    case AVND_EVT_IMMND_COMP_UP:
> +        break;
>      default:
>          delete evt;
>          evt = nullptr;
> @@ -316,6 +318,8 @@ void avnd_evt_destroy(AVND_EVT *evt)
>          break;
>      case AVND_EVT_AMFA_MDS_VER_INFO:
>          break;
> +    case AVND_EVT_IMMND_COMP_UP:
> +        break;
>
>      default:
>          LOG_NO("%s: unknown event type %u", __FUNCTION__, type);
> diff --git a/osaf/services/saf/amf/amfnd/include/avnd_cb.h b/osaf/services/saf/amf/amfnd/include/avnd_cb.h
> --- a/osaf/services/saf/amf/amfnd/include/avnd_cb.h
> +++ b/osaf/services/saf/amf/amfnd/include/avnd_cb.h
> @@ -34,6 +34,11 @@
>  #define AVND_CB_H
>  #include
>
> +typedef enum {
> +  AMFND_IMM_COMP_BASE = 1,
> +  AMFND_IMM_COMP_INSTANTIATING = 2,
> +  AMFND_IMM_COMP_UP = 3
> +} AMFND_IMM_COMP_STATUS;
>
>  typedef struct avnd_cb_tag {
>      SYSF_MBX mbx; /* mailbox on which AvND waits */
> @@ -121,6 +126,7 @@ typedef struct avnd_cb_tag {
>      SaTimeT
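The patch description in this thread relies on FIFO-based liveness monitoring. The following is a minimal sketch of that general idea, not the code from the patch: the monitor keeps the read end open and treats a hang-up reported by poll() as a sign that the process holding the write end has gone away. The helper name `writer_has_gone` is hypothetical, and the sketch assumes the monitored process is the only writer.

```cpp
#include <cassert>
#include <poll.h>
#include <unistd.h>

// Hypothetical helper, not part of the patch. Returns true once every writer
// on the other end of read_fd has closed it (for example because the
// monitored process exited or crashed). Returns false if the timeout expired
// first or if poll() itself failed.
bool writer_has_gone(int read_fd, int timeout_ms) {
  struct pollfd pfd;
  pfd.fd = read_fd;
  pfd.events = POLLIN;
  pfd.revents = 0;

  int n = poll(&pfd, 1, timeout_ms);
  if (n <= 0) {
    return false;  // timeout or error: no hang-up observed
  }
  // POLLHUP is reported when the last write end has been closed.
  return (pfd.revents & POLLHUP) != 0;
}
```

With an anonymous pipe standing in for the named FIFO, `writer_has_gone(fds[0], 0)` is false while `fds[1]` is still open and becomes true once `fds[1]` is closed, which is what the kernel does on behalf of a process that exits or crashes. What the monitor does next (exit, restart, or reboot) is a separate policy decision, which is the point debated in the thread.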