Hi,

On Tue, Oct 28, 2014 at 2:43 PM, Reka Thirunavukkarasu <r...@wso2.com> wrote:
> Hi
>
> On Tue, Oct 28, 2014 at 2:22 PM, Isuru Haththotuwa <isu...@apache.org>
> wrote:
>
>> On Tue, Oct 28, 2014 at 1:48 PM, Reka Thirunavukkarasu <r...@wso2.com>
>> wrote:
>>
>>> Hi
>>>
>>> On Tue, Oct 28, 2014 at 1:16 PM, Isuru Haththotuwa <isu...@apache.org>
>>> wrote:
>>>
>>>> Thanks Reka for starting this thread.
>>>>
>>>> Found two issues related to undeploying an Application:
>>>> 1. https://issues.apache.org/jira/browse/STRATOS-918 - Fixed now.
>>>>
>>>> 2. Undeploying an Application doesn't remove it properly until the
>>>> Member is activated. Looking into this now.
>>>>
>>>
>>> We will need this fix for the member fault as well. If the cluster
>>> monitor starts a member upon member fault before the whole cluster is
>>> terminated, then that cluster monitor becomes active. Hence it does not
>>> go to the terminated state. Looking into that now.
>>>
>> What is the state transition in this case? Is it Terminating to Active?
>> If so, we might be able to handle this generically, since it's an invalid
>> state transition: mark the cluster as Invalid, and then terminate. For
>> this, we need to introduce a new error state to the cluster statuses.
>> WDYT?
>>
>
> +1 to introducing error states, so that those which are in an error state
> can be terminated by the relevant monitors.
>
> But in this case the cluster should go through active --> inActive -->
> terminating --> terminated. Due to a network delay in receiving inActive
> when the member fault is received, the cluster monitor tries to satisfy
> the min rule by bringing up one new member in place of the one that got
> terminated. Then, when the cluster monitor receives inActive, it tries to
> notify the parent, etc. But the newly spawned member gets activated, and
> then the cluster monitor becomes active. After that, the parent monitors
> send the terminating notification. But somehow this active monitor skips
> the terminating event.
>

Not sure if this is a silly suggestion, since I might not have understood the scenario fully here.
As I understand it, the problem is that the Cluster Monitor's min check is triggered before the Cluster Monitor is marked as inactive. Since a member fault is a case where we need to give control to the parent (if the dependency flag is set), can we pause the Cluster Monitor until the parent has taken its decision? The Cluster Monitor can be resumed once the parent gives control back to the child.

>
>>> Thanks,
>>> Reka
>>>
>>>>
>>>> On Tue, Oct 28, 2014 at 1:11 PM, Reka Thirunavukkarasu <r...@wso2.com>
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> This is an update on testing Developer Preview-3 for the end-to-end
>>>>> workflow. Since we have introduced the termination behaviour, we are
>>>>> executing the following steps to verify the flow.
>>>>>
>>>>> * Deploy a composite application with nested groups
>>>>> * Autoscaler will bring them up using the defined startup order
>>>>> * Application will become Active
>>>>>
>>>>> Case 1:
>>>>>
>>>>> * Terminate one cluster's VM from the IaaS (where this cluster is
>>>>> *independent* from all other siblings)
>>>>> * Nothing will happen to the parents
>>>>> * Cluster eventually becomes active.
>>>>>
>>>>> This is working fine.
>>>>>
>>>>> Case 2:
>>>>>
>>>>> * Terminate one cluster's VM from the IaaS (where this cluster is
>>>>> *dependent* on some siblings)
>>>>> * It will notify the parent about the inActive state
>>>>> * Parent will behave according to its specified termination behaviour
>>>>> and notify its parent
>>>>> * When this notification stops at a parent that has *kill-none* or is
>>>>> at application level, that parent will push all the children to be
>>>>> terminated.
>>>>> * Once all the children are terminated from the sub section, that
>>>>> parent will bring them up in parallel.
>>>>>
>>>>> Finalising this by identifying issues.
>>>>>
>>>>> Case 3:
>>>>>
>>>>> * Unsubscribing from application
>>>>> - all the clusters will be marked as terminated and they will
>>>>> gradually be terminated.
>>>>> - once all the clusters are terminated, parent will be terminated
>>>>> - Eventually application will be terminated and send the
>>>>> application terminated event
>>>>> - all others act upon application terminated event and remove the
>>>>> application related information from their side.
>>>>>
>>>>> The above is working fine now..
>>>>>
>>>>> - Metadata service will also remove app details (We are testing
>>>>> this)
>>>>>
>>>>> FYI:
>>>>> All the identified sibling to be terminated, will be terminated in
>>>>> parallel as of now. We are not maintaining any order when terminating
>>>>> as i explained in the earlier mail.
>>>>>
>>>>> Isuruh/Udara, can you also add, if i miss any testing steps?
>>>>>
>>>>> Thanks,
>>>>> Reka
>>>>>
>>>>> --
>>>>> Reka Thirunavukkarasu
>>>>> Senior Software Engineer,
>>>>> WSO2, Inc.:http://wso2.com,
>>>>> Mobile: +94776442007
>>>>>
>>>>> --
>>>>> Thanks and Regards,
>>>>>
>>>>> Isuru H.
>>>>> +94 716 358 048
>>>>>
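
To make the pause/resume suggestion and the proposed error state a bit more concrete, here is a rough sketch in Java. It is only an illustration under assumptions: the class `ClusterMonitorSketch`, its methods, the `ERROR` status and the transition table are all made up for this mail and are not taken from the Stratos code base. In particular, allowing active --> terminating directly (for the unsubscribe case) and error --> terminating are my guesses at what we would want. For simplicity it also marks the cluster inActive as soon as the member fault arrives, so it does not model the network delay itself.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these names are taken from the Stratos code base.
class ClusterMonitorSketch {

    // ERROR is the proposed new state; the other four are the ones named in this thread.
    enum Status { ACTIVE, INACTIVE, TERMINATING, TERMINATED, ERROR }

    // Assumed transition table, based on active --> inActive --> terminating --> terminated.
    private static final Map<Status, Set<Status>> ALLOWED = new EnumMap<>(Status.class);
    static {
        ALLOWED.put(Status.ACTIVE, EnumSet.of(Status.INACTIVE, Status.TERMINATING));
        ALLOWED.put(Status.INACTIVE, EnumSet.of(Status.ACTIVE, Status.TERMINATING));
        ALLOWED.put(Status.TERMINATING, EnumSet.of(Status.TERMINATED));
        ALLOWED.put(Status.TERMINATED, EnumSet.noneOf(Status.class));
        ALLOWED.put(Status.ERROR, EnumSet.of(Status.TERMINATING));
    }

    private final boolean dependencyFlag; // true if siblings depend on this cluster
    private final int minMembers;
    private int memberCount;
    private boolean paused;
    private Status status = Status.ACTIVE;

    ClusterMonitorSketch(boolean dependencyFlag, int minMembers) {
        this.dependencyFlag = dependencyFlag;
        this.minMembers = minMembers;
        this.memberCount = minMembers;
    }

    // Member fault: lose one member and, if the dependency flag is set,
    // pause so that the parent takes the decision.
    void onMemberFault() {
        memberCount = Math.max(0, memberCount - 1);
        if (status == Status.ACTIVE) {
            transitionTo(Status.INACTIVE);
        }
        if (dependencyFlag) {
            paused = true;
        }
    }

    // Min check: returns how many members were spawned. It does nothing while
    // the monitor is paused, terminating, terminated or in error.
    int runMinCheck() {
        if (paused || status == Status.TERMINATING
                || status == Status.TERMINATED || status == Status.ERROR) {
            return 0;
        }
        int toSpawn = Math.max(0, minMembers - memberCount);
        memberCount += toSpawn;
        return toSpawn;
    }

    void onMemberActivated() { transitionTo(Status.ACTIVE); }

    // Parent gives control back to the child.
    void resume() { paused = false; }

    // Parent decided to terminate this cluster.
    void onParentTerminating() {
        paused = false;
        transitionTo(Status.TERMINATING);
    }

    // Any transition that is not in the table puts the cluster into ERROR.
    void transitionTo(Status next) {
        if (next == status) {
            return;
        }
        status = ALLOWED.get(status).contains(next) ? next : Status.ERROR;
    }

    Status getStatus() { return status; }

    boolean isPaused() { return paused; }
}
```

With something like this, a dependent cluster that gets a member fault would skip the min check until the parent either calls `resume()` or `onParentTerminating()`, and a late member activation arriving while terminating would land in `ERROR` instead of silently becoming active again.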