Hi,

>> According to the log, PL-4 joined the cluster, which means the cluster is
>> not in headless state, doesn't it?

Not exactly. To the best of my knowledge, the CPSV application was running on
PL4 (cluster up and running), then both controllers were restarted. It seems
that CPND restarted because of some other problem, and then I saw this issue.
I don't have those logs at the moment; I will try to reproduce the issue.

But in broad terms, we need to re-integrate CPSV with CLM, based on the new
behavior of CLMD.

-AVM

On 2/22/2016 9:38 AM, Nhat Pham wrote:
> RE: [devel] [PATCH 0 of 1] Review Request for cpsv: Support preserving
> and recovering checkpoint replicas during headless state V2 [#1621]
>
> Hi Mahesh,
>
> Could you please clarify in which case the error below happened?
>
> *Feb 19 11:18:28 PL-4 osafimmnd[5422]: NO SERVER STATE: IMM_SERVER_SYNC_SERVER --> IMM_SERVER_READY*
> Feb 19 11:18:28 PL-4 osafimmnd[5422]: NO Implementer connected: 45 (safClmService) <0, 2010f>
> Feb 19 11:18:28 PL-4 osafckptnd[7718]: ER cpnd clm init failed with return value:31
> Feb 19 11:18:28 PL-4 osafckptnd[7718]: ER cpnd init failed
> Feb 19 11:18:28 PL-4 osafckptnd[7718]: ER cpnd_lib_req FAILED
> Feb 19 11:18:28 PL-4 osafckptnd[7718]: __init_cpnd() failed
> *Feb 19 11:18:28 PL-4 osafclmna[5432]: NO safNode=PL-4,safCluster=myClmCluster Joined cluster, nodeid=2040f*
>
> According to the log, PL-4 joined the cluster, which means the cluster is
> not in headless state, doesn't it?
>
> Best regards,
> Nhat Pham
>
> -----Original Message-----
> From: Nhat Pham [mailto:[email protected]]
> Sent: Monday, February 22, 2016 9:19 AM
> To: 'A V Mahesh' <[email protected]>; 'Anders Widell' <[email protected]>
> Cc: 'Beatriz Brandao' <[email protected]>; 'Minh Chau H' <[email protected]>; [email protected]
> Subject: Re: [devel] [PATCH 0 of 1] Review Request for cpsv: Support
> preserving and recovering checkpoint replicas during headless state V2 [#1621]
>
> Hi Mahesh and Anders,
>
> Please see my comment below.
>
> BTW, have you finished the review and test?
>
> Best regards,
> Nhat Pham
>
> From: A V Mahesh [mailto:[email protected]]
> Sent: Friday, February 19, 2016 2:28 PM
> To: Nhat Pham <[email protected]>; 'Anders Widell' <[email protected]>; 'Minh Chau H' <[email protected]>
> Cc: [email protected]; 'Beatriz Brandao' <[email protected]>
> Subject: Re: [PATCH 0 of 1] Review Request for cpsv: Support
> preserving and recovering checkpoint replicas during headless state V2 [#1621]
>
> Hi Nhat Pham,
>
> On 2/19/2016 12:28 PM, Nhat Pham wrote:
> > Could you please give more detailed information about steps to
> > reproduce the problem below? Thanks.
>
> Don't see this as a specific bug; we need to look at the issue from the
> point of view of a CLM-integrated service. Considering Anders Widell's
> explanation of CLM application behavior during headless state, we
> need to re-integrate CPND with CLM (before this headless state
> feature there was no case of CPND existing in the absence of CLMD, but
> now there is).
>
> And this should be consistent across all the services that are integrated
> with CLM (you may need some changes in CLM also).
>
> [Nhat Pham] I think CLM should return SA_AIS_ERR_TRY_AGAIN in this case.
> @Anders: What do you think?
>
> To start with, let us consider the case where CPND restarts on a PL
> during headless state while an application is running on that PL.
>
> [Nhat Pham] Regarding the CPND as CLM application, I'm not sure what
> it can do in this case. In case it restarts, it is monitored by AMF.
> If it blocks for too long, AMF will also trigger a node reboot.
>
> In my test case, the CPND gets blocked by CLM. It doesn't get out of
> saClmInitialize. How do you get the "ER cpnd clm init failed with
> return value:31"?
>
> The cpnd trace follows.
> > Feb 22 8:56:41.188122 osafckptnd [736:cpnd_init.c:0183] >> cpnd_lib_init > > Feb 22 8:56:41.188332 osafckptnd [736:cpnd_init.c:0412] >> > cpnd_cb_db_init > > Feb 22 8:56:41.188600 osafckptnd [736:cpnd_init.c:0437] << > cpnd_cb_db_init > > Feb 22 8:56:41.188778 osafckptnd [736:clma_api.c:0503] >> saClmInitialize > > Feb 22 8:56:41.188945 osafckptnd [736:clma_api.c:0593] >> clmainitialize > > Feb 22 8:56:41.190052 osafckptnd [736:clma_util.c:0100] >> clma_startup: > > clma_use_count: 0 > > Feb 22 8:56:41.190273 osafckptnd [736:clma_mds.c:1124] >> clma_mds_init > > Feb 22 8:56:41.190825 osafckptnd [736:clma_mds.c:1170] << clma_mds_init > > -AVM > > On 2/19/2016 12:28 PM, Nhat Pham wrote: > > Hi Mahesh, > > Could you please give more detailed information about steps to > reproduce the problem below? Thanks. > > Best regards, > > Nhat Pham > > From: A V Mahesh [mailto:[email protected]] > > Sent: Friday, February 19, 2016 1:06 PM > > To: Anders Widell <mailto:[email protected]> > > <[email protected]<mailto:[email protected]>>; Nhat > Pham <mailto:[email protected]> > <[email protected]<mailto:[email protected]>>; 'Minh > Chau H' <mailto:[email protected]> > <[email protected]<mailto:[email protected]>> > > Cc:[email protected]<mailto:[email protected]> > > <mailto:[email protected]> ; 'Beatriz Brandao' > > <mailto:[email protected]> > <[email protected]<mailto:[email protected]>> > > Subject: Re: [PATCH 0 of 1] Review Request for cpsv: Support > preserving and recovering checkpoint replicas during headless state V2 > [#1621] > > Hi Anders Widell, > > Thanks for the detailed explanation about CLM during headless state. 
>
> Hi Nhat Pham,
>
> Comment 3:
>
> Please see below: the problem I was expecting, I am now seeing during
> CLMD absence (headless state). So CPND/CLMA now need to address the
> case below. Currently cpnd clm init fails with return value
> SA_AIS_ERR_UNAVAILABLE, but it should be SA_AIS_ERR_TRY_AGAIN.
>
> ==================================================
> Feb 19 11:18:28 PL-4 osafimmnd[5422]: NO NODE STATE-> IMM_NODE_FULLY_AVAILABLE 17418
> Feb 19 11:18:28 PL-4 osafimmloadd: NO Sync ending normally
> Feb 19 11:18:28 PL-4 osafimmnd[5422]: NO Epoch set to 9 in ImmModel
> Feb 19 11:18:28 PL-4 cpsv_app: IN Received PROC_STALE_CLIENTS
> Feb 19 11:18:28 PL-4 osafimmnd[5422]: NO Implementer connected: 42 (MsgQueueService132111) <108, 2040f>
> Feb 19 11:18:28 PL-4 osafimmnd[5422]: NO Implementer connected: 43 (MsgQueueService131855) <0, 2030f>
> Feb 19 11:18:28 PL-4 osafimmnd[5422]: NO Implementer connected: 44 (safLogService) <0, 2010f>
> Feb 19 11:18:28 PL-4 osafimmnd[5422]: NO SERVER STATE: IMM_SERVER_SYNC_SERVER --> IMM_SERVER_READY
> Feb 19 11:18:28 PL-4 osafimmnd[5422]: NO Implementer connected: 45 (safClmService) <0, 2010f>
> Feb 19 11:18:28 PL-4 osafckptnd[7718]: ER cpnd clm init failed with return value:31
> Feb 19 11:18:28 PL-4 osafckptnd[7718]: ER cpnd init failed
> Feb 19 11:18:28 PL-4 osafckptnd[7718]: ER cpnd_lib_req FAILED
> Feb 19 11:18:28 PL-4 osafckptnd[7718]: __init_cpnd() failed
> Feb 19 11:18:28 PL-4 osafclmna[5432]: NO safNode=PL-4,safCluster=myClmCluster Joined cluster, nodeid=2040f
> Feb 19 11:18:28 PL-4 osafamfnd[5441]: NO AVD NEW_ACTIVE, adest:1
> Feb 19 11:18:28 PL-4 osafamfnd[5441]: NO Sending node up due to NCSMDS_NEW_ACTIVE
> Feb 19 11:18:28 PL-4 osafamfnd[5441]: NO 1 SISU states sent
> Feb 19 11:18:28 PL-4 osafamfnd[5441]: NO 1 SU states sent
> Feb 19 11:18:28 PL-4 osafamfnd[5441]: NO 7 CSICOMP states synced
> Feb 19 11:18:28 PL-4 osafamfnd[5441]: NO 7 SU states sent
> Feb 19 11:18:28 PL-4 osafimmnd[5422]: NO Implementer connected: 46 (safAmfService) <0, 2010f>
> Feb 19 11:18:30 PL-4 osafamfnd[5441]: NO 'safSu=PL-4,safSg=NoRed,safApp=OpenSAF' Component or SU restart probation timer expired
> Feb 19 11:18:35 PL-4 osafamfnd[5441]: NO Instantiation of 'safComp=CPND,safSu=PL-4,safSg=NoRed,safApp=OpenSAF' failed
> Feb 19 11:18:35 PL-4 osafamfnd[5441]: NO Reason: component registration timer expired
> Feb 19 11:18:35 PL-4 osafamfnd[5441]: WA 'safComp=CPND,safSu=PL-4,safSg=NoRed,safApp=OpenSAF' Presence State RESTARTING => INSTANTIATION_FAILED
> Feb 19 11:18:35 PL-4 osafamfnd[5441]: NO Component Failover trigerred for 'safSu=PL-4,safSg=NoRed,safApp=OpenSAF': Failed component: 'safComp=CPND,safSu=PL-4,safSg=NoRed,safApp=OpenSAF'
> Feb 19 11:18:35 PL-4 osafamfnd[5441]: ER 'safComp=CPND,safSu=PL-4,safSg=NoRed,safApp=OpenSAF'got Inst failed
> Feb 19 11:18:35 PL-4 osafamfnd[5441]: Rebooting OpenSAF NodeId = 132111 EE Name = , Reason: NCS component Instantiation failed, OwnNodeId = 132111, SupervisionTime = 60
> Feb 19 11:18:36 PL-4 opensaf_reboot: Rebooting local node; timeout=60
> Feb 19 11:18:39 PL-4 kernel: [ 4877.338518] md: stopping all md devices.
> ==================================================
>
> -AVM
>
> On 2/15/2016 5:11 PM, Anders Widell wrote:
>
> Hi!
>
> Please find my answer inline, marked [AndersW].
>
> regards,
> Anders Widell
>
> On 02/15/2016 10:38 AM, Nhat Pham wrote:
>
> Hi Mahesh,
>
> It's good. Thank you. :)
>
> [AVM] Upon rejoining of the SCs, the replica should be re-created
> regardless of whether another application opens it on PL4.
> (Note: this comment is based on your explanation; I have not yet
> reviewed/tested. Currently I am struggling with SCs not rejoining
> after headless state. I can tell you more on this once I complete my
> review/testing.)
>
> [Nhat] To make cloud resilience work, you need the patches from other
> services (log, amf, clm, ntf).
>
> @Minh: I heard that you created a tar file which includes all patches.
> Could you please send it to Mahesh? Thanks.
>
> [AVM] I understand that. Before I comment more on this, please allow
> me to understand; I am still not very clear on the headless design in
> detail. For example, cluster membership of PLs during headless state:
> in the absence of SCs (CLMD), are the PLs considered cluster nodes or
> not (cluster membership)?
>
> [Nhat] I don't know much about this.
> @Anders: Could you please comment on this? Thanks.
>
> [AndersW] First of all, keep in mind that the "headless" state should
> ideally not last a very long time. Once we have the spare SC feature
> in place (ticket [#79]), a new SC should become active within a matter
> of a few seconds after we have lost both the active and the standby SC.
>
> I think you should view the state of the cluster in the headless state
> in the same way as you view the state of the cluster during a failover
> between the active and the standby SC. Imagine that the active SC
> dies. It takes the standby SC 1.5 seconds to detect the failure of the
> active SC (this is due to the TIPC timeout). If you have configured
> the PROMOTE_ACTIVE_TIMER, there is an additional delay before the
> standby takes over as active. What is the state of the cluster during
> the time after the active SC failed and before the standby takes over?
>
> The state of the cluster while it is headless is very similar. The
> difference is that this state may last a little bit longer (though not
> more than a few seconds, until one of the spare SCs becomes active).
> Another difference is that we may have lost some state. With a "perfect"
> implementation of the headless feature we should not lose any state at
> all, but with the current set of patches we do lose state.
>
> So specifically if we talk about cluster membership and ask the
> question: is a particular PL a member of the cluster or not during the
> headless state?
>
> Well, if you ask CLM about this during the headless state, then you
> will not know - because CLM doesn't provide any service during the
> headless state. If you keep retrying your query to CLM, you will
> eventually get an answer - but you will not get this answer until
> there is an active SC again and we have exited the headless state.
> When viewed in this way, the answer to the question about a node's
> membership is undefined during the headless state, since CLM will not
> provide you with any answer until there is an active SC.
>
> However, if you asked CLM about the node's cluster membership status
> before the cluster went headless, you probably saved a cached copy of
> the cluster membership state. Maybe you also installed a CLM track
> callback and intend to update this cached copy every time the cluster
> membership status changes.
>
> The question then is: can you continue using this cached copy of the
> cluster membership state during the headless state? The answer is YES:
> since CLM doesn't provide any service during the headless state, it
> also means that the cluster membership view cannot change during this
> time. Nodes can of course reboot or die, but CLM will not notice and
> hence the cluster view will not be updated. You can argue that this is
> bad because the cluster view doesn't reflect reality, but notice that
> this will always be the case. We can never propagate information
> instantaneously, and detection of node failures will take 1.5 seconds
> due to the TIPC timeout. You can never be sure that a node is alive at
> this very moment just because CLM tells you that it is a member of the
> cluster. If we are unfortunate enough to lose both system controller
> nodes simultaneously, updates to the cluster membership view will be
> delayed a few seconds longer than usual.
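
The "keep retrying until an SC is active again" behaviour discussed above could be sketched in C roughly as follows. This is only an illustration under stated assumptions, not OpenSAF code: every `ex_` name, the fake CLM query, and the optional wait callback are hypothetical. The enum is a local stand-in for the AIS error codes named in this thread (real code would include saAis.h and call the CLM API); 31 is the value the cpnd log reports for SA_AIS_ERR_UNAVAILABLE.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the AIS error codes named in this thread. Real code
 * would include <saAis.h>; the values mirror the AIS specification
 * (31 is the "return value:31" seen in the cpnd log). */
typedef enum {
    EX_SA_AIS_OK = 1,
    EX_SA_AIS_ERR_TRY_AGAIN = 6,
    EX_SA_AIS_ERR_UNAVAILABLE = 31
} ex_ais_err_t;

/* Hypothetical callback types: one AIS-style call, and an optional pause. */
typedef ex_ais_err_t (*ex_ais_call_t)(void *ctx);
typedef void (*ex_wait_t)(unsigned attempt);

/* Call `call` until it returns anything other than TRY_AGAIN, at most
 * `max_attempts` times. Returns the last result, so the caller still sees
 * TRY_AGAIN if the service never became available. */
ex_ais_err_t ex_retry_while_try_again(ex_ais_call_t call, void *ctx,
                                      unsigned max_attempts, ex_wait_t wait_fn)
{
    ex_ais_err_t rc = EX_SA_AIS_ERR_TRY_AGAIN;
    unsigned attempt;

    assert(call != NULL && max_attempts > 0);
    for (attempt = 1; attempt <= max_attempts; ++attempt) {
        rc = call(ctx);
        if (rc != EX_SA_AIS_ERR_TRY_AGAIN)
            return rc;
        if (wait_fn != NULL && attempt < max_attempts)
            wait_fn(attempt); /* e.g. sleep before the next attempt */
    }
    return rc;
}

/* Hypothetical fake of a CLM query: answers TRY_AGAIN for the first
 * `ready_after` calls (headless), then OK (an SC is active again). */
typedef struct {
    unsigned calls;
    unsigned ready_after;
} ex_fake_clm_t;

ex_ais_err_t ex_fake_clm_query(void *ctx)
{
    ex_fake_clm_t *clm = (ex_fake_clm_t *)ctx;
    clm->calls++;
    return clm->calls > clm->ready_after ? EX_SA_AIS_OK
                                         : EX_SA_AIS_ERR_TRY_AGAIN;
}
```

With `ex_fake_clm_t clm = {0, 2};`, calling `ex_retry_while_try_again(ex_fake_clm_query, &clm, 5, NULL)` returns `EX_SA_AIS_OK` on the third call. The attempt limit is deliberate: as noted earlier in the thread, a CPND that blocks too long during initialization is restarted or the node is rebooted by AMF, so an unbounded retry loop would not be a safe choice there.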
> > > > > Best regards, > > Nhat Pham > > -----Original Message----- > > From: A V Mahesh [mailto:[email protected]] > > Sent: Monday, February 15, 2016 11:19 AM > > To: Nhat Pham <mailto:[email protected]> > <[email protected]<mailto:[email protected]>>;[email protected]<mailto:[email protected]><mailto:[email protected]> > > Cc:[email protected]<mailto:[email protected]> > > <mailto:[email protected]> ; 'Beatriz Brandao' > > <mailto:[email protected]> > <[email protected]<mailto:[email protected]>> > > Subject: Re: [PATCH 0 of 1] Review Request for cpsv: Support > preserving and recovering checkpoint replicas during headless state V2 > [#1621] > > Hi Nhat Pham, > > How is your holiday went > > Please find my comments below > > On 2/15/2016 8:43 AM, Nhat Pham wrote: > > Hi Mahesh, > > For the comment 1, the patch will be updated accordingly. > > [AVM] Please hold , I will provide more comments in this week , so we > can have consolidated V3 > > For the comment 2, I think the CKPT service will not be backward > compatible if the scAbsenceAllowed is true. > > The client can't create non-collocated checkpoint on SCs. > > Furthermore, this solution only protects the CKPT service from the > case "The non-collocated checkpoint is created on a SC" > > there are still the cases where the replicas are completely lost. Ex: > > - The non-collocated checkpoint created on a PL. The PL reboots. Both > replicas now locate on SCs. Then, headless state happens. All replicas > are lost. > > - The non-collocated checkpoint has active replica locating on a PL > and this PL restarts during headless state > > - The non-collocated checkpoint is created on PL3. This checkpoint is > also opened on PL4. Then SCs and PL3 reboot. > > [AVM] Up on rejoining of the SC`s The replica should be re-created > regardless of another application opens it on PL4. 
> > ( Note : this comment is based on your explanation have > not yet reviewed/tested , > > currently i am struggling with SC`s not rejoining > > after headless state , i can provide you more on this once i complte my > > review/testing) > > In this case, all replicas are lost and the client has to create it > again. > > In case multiple nodes (which including SCs) reboot, losing replicas > is unpreventable. The patch is to recover the checkpoints in possible > cases. > > How do you think? > > [AVM] I understand that , before I comment more on this please allow > > me to understand > > I am not still not very clear of the headless design in > detail. > > For example cluster membership of PL`s during headless > > state , > > In the absence of SC`s (CLMD) dose the PLs is > considered as > > cluster nodes or not (cluster membership) ? > > - if not consider as NON cluster nodes > Checkpoint Service API should leverage the SA Forum Cluster > > Membership Service and API's can fail with > SA_AIS_ERR_UNAVAILABLE > > - if considers as cluster nodes we need to > follow all the defined rules which are defined in SAI-AIS-CKPT-B.02.02 > specification > > so give me some more time to review it completely , so > that we > > can have consolidated patch V3 > > -AVM > > Best regards, > > Nhat Pham > > -----Original Message----- > > From: A V Mahesh [mailto:[email protected]] > > Sent: Friday, February 12, 2016 11:10 AM > > To: Nhat Pham <mailto:[email protected]> > <[email protected]<mailto:[email protected]>>;[email protected]<mailto:[email protected]><mailto:[email protected]> > > Cc:[email protected]<mailto:[email protected]> > > <mailto:[email protected]> ; Beatriz Brandao > <mailto:[email protected]> > <[email protected]<mailto:[email protected]>> > > Subject: Re: [PATCH 0 of 1] Review Request for cpsv: Support > preserving and recovering checkpoint replicas during headless state V2 > [#1621] > > > Comment 2 : > > After incorporating the comment one all the Limitations should 
be > prevented based on Hydra configuration is enabled in IMM status. > > Foe example : if some application is trying to create > > non-collocated checkpoint active replica getting generated/locating on > SC then ,regardless of the heads (SC`s) status exist not exist should > return SA_AIS_ERR_NOT_SUPPORTED > > In other words, rather that allowing to created non-collocated > checkpoint when > > heads(SC`s) are exit , and non-collocated checkpoint getting > unrecoverable after heads(SC`s) rejoins. > > ====================================================================== > > ======================= > > Limitation: The CKPT service doesn't support recovering > checkpoints in > > following cases: > > . The checkpoint which is unlinked before headless. > > . The non-collocated checkpoint has active replica locating on SC. > > . The non-collocated checkpoint has active replica locating on a > PL and this PL > > restarts during headless state. In this cases, the checkpoint > replica is > > destroyed. The fault code SA_AIS_ERR_BAD_HANDLE is returned when > the client > > accesses the checkpoint in these cases. The client must re-open the > > checkpoint. > > ====================================================================== > > ======================= > > -AVM > > > On 2/11/2016 12:52 PM, A V Mahesh wrote: > > Hi, > > I jut starred reviewing patch , I will be giving comments as soon as > I crossover any , to save some time. > > Comment 1 : > > This functionality should be under checks if Hydra configuration is > enabled in IMM attrName = > > const_cast<SaImmAttrNameT>("scAbsenceAllowed") > > Please see example how LOG/AMF services implemented it. > > -AVM > > > On 1/29/2016 1:02 PM, Nhat Pham wrote: > > Hi Mahesh, > > As described in the README, the CKPT service returns > SA_AIS_ERR_TRY_AGAIN fault code in this case. > > I guess it's same for other services. > > @Anders: Could you please confirm this? 
> > Best regards, > > Nhat Pham > > -----Original Message----- > > From: A V Mahesh [mailto:[email protected]] > > Sent: Friday, January 29, 2016 2:11 PM > > To: Nhat Pham <mailto:[email protected]> > <[email protected]<mailto:[email protected]>>;[email protected]<mailto:[email protected]><mailto:[email protected]> > > Cc:[email protected]<mailto:[email protected]> > > <mailto:[email protected]> > > Subject: Re: [PATCH 0 of 1] Review Request for cpsv: Support > preserving and recovering checkpoint replicas during headless state > > V2 [#1621] > > Hi, > > On 1/29/2016 11:45 AM, Nhat Pham wrote: > > - The behavior of application will be consistent with other saf > services like imm/amf behavior during headless state. > > [Nhat] I'm not clear what you mean about "consistent"? > > In the obscene of Director (SC's) , what is expected return values of > SAF API should ( all services ) , > > which are not in aposition to provide service at that moment. > > I think all services should return same SAF ERRS., I thinks currently > we don't have it , may be Anders Widel will help us. > > -AVM > > > On 1/29/2016 11:45 AM, Nhat Pham wrote: > > Hi Mahesh, > > Please see the attachment for the README. Let me know if there is any > more information required. > > Regarding your comments: > > - during headless state applications may behave like during > CPND restart case [Nhat] Headless state and CPND restart are different > events. Thus, the behavior is different. > > Headless state is a case where both SCs go down. > > - The behavior of application will be consistent with other saf > services like imm/amf behavior during headless state. > > [Nhat] I'm not clear what you mean about "consistent"? 
> > Best regards, > > Nhat Pham > > -----Original Message----- > > From: A V Mahesh [mailto:[email protected]] > > Sent: Friday, January 29, 2016 11:12 AM > > To: Nhat Pham <mailto:[email protected]> > <[email protected]<mailto:[email protected]>>; > > [email protected]<mailto:[email protected]><mailto:[email protected]> > > Cc:[email protected]<mailto:[email protected]> > > <mailto:[email protected]> > > Subject: Re: [PATCH 0 of 1] Review Request for cpsv: Support > preserving and recovering checkpoint replicas during headless state > > V2 [#1621] > > Hi Nhat Pham, > > I stared reviewing this patch , so can please provide README file > with scope and limitations , that will help to define > testing/reviewing scope . > > Following are minimum things we can keep in mind while > reviewing/accepting patch , > > - Not effecting existing functionality > > - during headless state applications may behave like during > CPND restart case > > - The minimum functionally of application works > > - The behavior of application will be consistent with > > other saf services like imm/amf behavior during headless state. > > So please do provide any additional detailed in README if any of the > above is deviated , that allow users to know about the > limitations/deviation. 
> > -AVM > > On 1/4/2016 3:15 PM, Nhat Pham wrote: > > Summary: cpsv: Support preserving and recovering checkpoint replicas > during headless state [#1621] Review request for Trac > > Ticket(s): > > #1621 Peer > Reviewer(s):[email protected]<mailto:[email protected]><mailto:[email protected]> > > ;[email protected]<mailto:[email protected]><mailto:[email protected]> > > Pull request > > to: > > [email protected]<mailto:[email protected]><mailto:[email protected]> > > Affected > > branch(es): default Development > > branch: default > > -------------------------------- > > Impacted area Impact y/n > > -------------------------------- > > Docs n > > Build system n > > RPM/packaging n > > Configuration files n > > Startup scripts n > > SAF services y > > OpenSAF services n > > Core libraries n > > Samples n > > Tests n > > Other n > > > Comments (indicate scope for each "y" above): > > --------------------------------------------- > > changeset faec4a4445a4c23e8f630857b19aabb43b5af18d > > Author: Nhat Pham <mailto:[email protected]> > > <[email protected]<mailto:[email protected]>> > > Date: Mon, 04 Jan 2016 16:34:33 +0700 > > cpsv: Support preserving and recovering checkpoint replicas > during headless state [#1621] > > Background: > > ---------- This enhancement supports to preserve checkpoint > replicas > > in case > > both SCs down (headless state) and recover replicas in case one of > > SCs up > > again. If both SCs goes down, checkpoint replicas on surviving > nodes > > still > > remain. When a SC is available again, surviving replicas are > > automatically > > registered to the SC checkpoint database. Content in surviving > > replicas are > > intacted and synchronized to new replicas. > > When no SC is available, client API calls changing checkpoint > > configuration > > which requires SC communication, are rejected. Client API calls > > reading and > > writing existing checkpoint replicas still work. 
> > Limitation: The CKPT service does not support recovering > checkpoints > > in > > following cases: > > - The checkpoint which is unlinked before headless. > > - The non-collocated checkpoint has active replica locating on SC. > > - The non-collocated checkpoint has active replica locating on > a PL > > and this > > PL restarts during headless state. In this cases, the checkpoint > > replica is > > destroyed. The fault code SA_AIS_ERR_BAD_HANDLE is returned when > the > > client > > accesses the checkpoint in these cases. The client must re-open the > > checkpoint. > > While in headless state, accessing checkpoint replicas does not > work > > if the > > node which hosts the active replica goes down. It will back working > > when a > > SC available again. > > Solution: > > --------- The solution for this enhancement includes 2 parts: > > 1. To destroy un-recoverable checkpoint described above when both > > SCs are > > down: When both SCs are down, the CPND deletes un-recoverable > > checkpoint > > nodes and replicas on PLs. Then it requests CPA to destroy > > corresponding > > checkpoint node by using new message CPA_EVT_ND2A_CKPT_DESTROY > > 2. To update CPD with checkpoint information When an active SC > is up > > after > > headless, CPND will update CPD with checkpoint information by using > > new > > message CPD_EVT_ND2D_CKPT_INFO_UPDATE instead of using > > CPD_EVT_ND2D_CKPT_CREATE. This is because the CPND will create new > > ckpt_id > > for the checkpoint which might be different with the current > ckpt id > > if the > > CPD_EVT_ND2D_CKPT_CREATE is used. The CPD collects checkpoint > > information > > within 6s. 
During this updating time, following requests is > rejected > > with > > fault code SA_AIS_ERR_TRY_AGAIN: > > - CPD_EVT_ND2D_CKPT_CREATE > > - CPD_EVT_ND2D_CKPT_UNLINK > > - CPD_EVT_ND2D_ACTIVE_SET > > - CPD_EVT_ND2D_CKPT_RDSET > > > Complete diffstat: > > ------------------ > > osaf/libs/agents/saf/cpa/cpa_proc.c | 52 > > +++++++++++++++++++++++++++++++++++ > > osaf/libs/common/cpsv/cpsv_edu.c | 43 > > +++++++++++++++++++++++++++++ > > osaf/libs/common/cpsv/include/cpd_cb.h | 3 ++ > > osaf/libs/common/cpsv/include/cpd_imm.h | 1 + > > osaf/libs/common/cpsv/include/cpd_proc.h | 7 ++++ > > osaf/libs/common/cpsv/include/cpd_tmr.h | 3 +- > > osaf/libs/common/cpsv/include/cpnd_cb.h | 1 + > > osaf/libs/common/cpsv/include/cpnd_init.h | 2 + > > osaf/libs/common/cpsv/include/cpsv_evt.h | 20 +++++++++++++ > > osaf/services/saf/cpsv/cpd/Makefile.am | 3 +- > > osaf/services/saf/cpsv/cpd/cpd_evt.c | 229 > > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > > > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > > > ++++ > > osaf/services/saf/cpsv/cpd/cpd_imm.c | 112 > > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > > > > > osaf/services/saf/cpsv/cpd/cpd_init.c | 20 ++++++++++++- > > osaf/services/saf/cpsv/cpd/cpd_proc.c | 309 > > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > > > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > > > +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > > osaf/services/saf/cpsv/cpd/cpd_tmr.c | 7 ++++ > > osaf/services/saf/cpsv/cpnd/cpnd_db.c | 16 ++++++++++ > > osaf/services/saf/cpsv/cpnd/cpnd_evt.c | 22 +++++++++++++++ > > osaf/services/saf/cpsv/cpnd/cpnd_init.c | 23 ++++++++++++++- > > osaf/services/saf/cpsv/cpnd/cpnd_mds.c | 13 ++++++++ > > osaf/services/saf/cpsv/cpnd/cpnd_proc.c | 314 > > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > > > 
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > > > +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--- > > 20 files changed, 1189 insertions(+), 11 deletions(-) > > > Testing Commands: > > ----------------- > > - > > Testing, Expected Results: > > -------------------------- > > - > > > Conditions of Submission: > > ------------------------- > > <<HOW MANY DAYS BEFORE PUSHING, CONSENSUS ETC>> > > > Arch Built Started Linux distro > > ------------------------------------------- > > mips n n > > mips64 n n > > x86 n n > > x86_64 n n > > powerpc n n > > powerpc64 n n > > > Reviewer Checklist: > > ------------------- > > [Submitters: make sure that your review doesn't trigger any > > checkmarks!] > > > Your checkin has not passed review because (see checked entries): > > ___ Your RR template is generally incomplete; it has too many > > blank > > entries > > that need proper data filled in. > > ___ You have failed to nominate the proper persons for review and > > push. > > ___ Your patches do not have proper short+long header > > ___ You have grammar/spelling in your header that is unacceptable. > > ___ You have exceeded a sensible line length in your > > headers/comments/text. > > ___ You have failed to put in a proper Trac Ticket # into your > > commits. > > ___ You have incorrectly put/left internal data in your comments/files > > (i.e. internal bug tracking tool IDs, product names etc) > > ___ You have not given any evidence of testing beyond basic build > > tests. > > Demonstrate some level of runtime or other sanity testing. > > ___ You have ^M present in some of your files. These have to be > > removed. > > ___ You have needlessly changed whitespace or added whitespace crimes > > like trailing spaces, or spaces before tabs. > > ___ You have mixed real technical changes with whitespace and other > > cosmetic code cleanup changes. These have to be separate > > commits. 
> > ___ You need to refactor your submission into logical chunks; there is > > too much content into a single commit. > > ___ You have extraneous garbage in your review (merge commits etc) > > ___ You have giant attachments which should never have been sent; > > Instead you should place your content in a public tree to > > be pulled. > > ___ You have too many commits attached to an e-mail; resend as > > threaded > > commits, or place in a public tree for a pull. > > ___ You have resent this content multiple times without a clear > > indication > > of what has changed between each re-send. > > ___ You have failed to adequately and individually address all of the > > comments and change requests that were proposed in the > > initial > > review. > > ___ You have a misconfigured ~/.hgrc file (i.e. username, email > > etc) > > ___ Your computer have a badly configured date and time; confusing the > > the threaded patch review. > > ___ Your changes affect IPC mechanism, and you don't present any > > results > > for in-service upgradability test. > > ___ Your changes affect user manual and documentation, your patch > > series > > do not contain the patch that updates the Doxygen manual. > > ------------------------------------------------------------------------------ > > Site24x7 APM Insight: Get Deep Visibility into Application Performance > > APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month > > Monitor end-to-end web transactions and take corrective actions now > > Troubleshoot faster and improve end-user experience. Signup Now! 
> > http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140 > > _______________________________________________ > > Opensaf-devel mailing list > > [email protected]<mailto:[email protected]> > > https://lists.sourceforge.net/lists/listinfo/opensaf-devel
