On Sat, Oct 18, 2014 at 12:19 PM, Imesh Gunaratne <im...@apache.org> wrote:
> Hi Martin,
>
> Please find my comments inline:
>
> On Fri, Oct 17, 2014 at 11:38 PM, Martin Eppel (meppel) <mep...@cisco.com>
> wrote:
>
>> I would like to discuss what it would take to achieve 100% uptime for
>> stratos in a production environment (aiming high to reach the five nines)
>> - if it had been discussed before please point me to the email thread.
>>
>
> Unlike other software systems a quite small downtime of a PaaS might not
> affect the deployed services because it will not bring the services
> (running instances) down. However yes we need to provide 100% uptime.
>
>>
>> The goal is to identify recommended deployment scenarios and possible
>> shortcomings (or readiness ) to reach five nines.
>>
>> This includes the following scenarios:
>>
>> + maintenance cycles,
>>
>> + upgrades,
>>
>> + hardware and software failures
>>
>> + scalability
>>
>> + ... ?
>>
> +1 We need to address all of these
>
>>
>> Generally, it seems the suggested system model to reach 100% uptime (or
>> the highest possible uptime) is a n way redundancy model with multiple
>> active / standby assignments.
>>
>> I looked in the HA for 4.1, see web link
>> https://cwiki.apache.org/confluence/display/STRATOS/4.1.0+Providing+High+Availability+for+Stratos
>> :
>>
>> Stratos allows for 2 deployment models, single JVM and distributed
>> deployment model.
>>
>> Which one will be better suited to reach the stated goal of 100% uptime /
>> n way redundancy model ?
>>
>
> Deployment model is about the level of capacity Stratos can provide
> (number of instances that it can support) not the level of HA. In both
> deployment models we should be able to provide same level of HA.
>
>>
>> According the link (and please correct me if I am wrong), it seems that
>> currently the components to allow n-way redundancy are:
>>
>> + BAM (doc is not updated yet, see
>> https://docs.wso2.com/display/CLUSTER420/Clustering+BAM+2.5.0 ?
>>
>
> Yes we are still using BAM 2.4.1 I believe, please see the below link:
>
> https://docs.wso2.com/display/CLUSTER420/Fully-Distributed%2C+High-Availability+BAM+Setup
>
>>
>> + core component (Manager, Autoscaler, Cloud controller) in
>> active/passive mode through Linux HA
>> RDBMS used for registry needs to support n-way redundancy as well
>>
>
> Currently I'm doing a POC on this using Pacemaker/Heartbeat, will provide
> details soon:
> https://issues.apache.org/jira/browse/STRATOS-897
>

Imesh is working on how to achieve active/passive with Stratos 4.0.0.
However, we are also working on active/active for all Stratos core
components with clustering support, which will address both high
availability and scalability. IMO we should release it with the next
immediate release.

>
>>
>> + ActiveMq
>> multiple models suggested, Zookeeper, shared DB or shared file
>> systems. Which one would be recommended to achieve n-way redundancy ?
>>
>
> Yes we need to do more investigations here.
>
>>
>> CEP seems to allow a 2 node configuration only or is there support for
>> n-way redundancy ?
>>
>
> In distributed cache mode deployment it supports many CEP instances, will
> check on this further:
>
> https://docs.wso2.com/display/CLUSTER420/Clustering+Complex+Event+Processor#ClusteringComplexEventProcessor-Distributedcachemodedeployment
>
>>
>> Stratos Load Balancer, lists some caveat like session affinity not
>> supported in distributed environment, n-way ready ?
>>
>
> Yes still load balancer does not have features to replicate session
> information in a distributed deployment.
>
> Thanks
>
> --
> Imesh Gunaratne
>
> Technical Lead, WSO2
> Committer & PMC Member, Apache Stratos
>

--
Lakmal Warusawithana
Vice President, Apache Stratos
Director - Cloud Architecture; WSO2 Inc.
Mobile : +94714289692
Blog : http://lakmalsview.blogspot.com/