I would like to discuss what it would take to achieve 100% uptime for Stratos in a production environment (aiming high in order to reach five nines). If this has been discussed before, please point me to the email thread.
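
To put numbers on the target, here is a rough back-of-the-envelope sketch in Python. It shows the yearly downtime budget implied by an availability figure, and how many redundant nodes would be needed to hit a target. The function names are my own, and the redundancy formula assumes independent node failures with instant, perfect failover, which real deployments will not fully satisfy, so treat the results as an optimistic bound rather than a prediction.

```python
# Back-of-the-envelope availability arithmetic (illustrative helper names).
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(availability: float) -> float:
    """Yearly downtime budget, in minutes, for a given availability (0..1)."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def redundant_availability(node_availability: float, n: int) -> float:
    """Availability of n replicas when one surviving replica is enough.

    Assumption: failures are independent and failover is instant and perfect.
    """
    return 1.0 - (1.0 - node_availability) ** n


def nodes_needed(node_availability: float, target: float) -> int:
    """Smallest n for which n-way redundancy meets the target availability."""
    if not (0.0 < node_availability < 1.0) or not (0.0 < target < 1.0):
        raise ValueError("availabilities must be strictly between 0 and 1")
    n = 1
    while redundant_availability(node_availability, n) < target:
        n += 1
    return n


if __name__ == "__main__":
    print(f"five nines budget: {downtime_minutes_per_year(0.99999):.2f} min/year")
    print(f"nodes at 99% each for five nines: {nodes_needed(0.99, 0.99999)}")
```

Under these assumptions five nines leaves only a few minutes of downtime per year, which is why maintenance and upgrade windows matter as much as failures in the scenarios below.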

The goal is to identify recommended deployment scenarios and possible shortcomings (or readiness) to reach five nines. This includes the following scenarios:
+ maintenance cycles
+ upgrades
+ hardware and software failures
+ scalability
+ ... ?

Generally, it seems the suggested system model to reach 100% uptime (or the highest possible uptime) is an n-way redundancy model with multiple active / standby assignments.

I looked at the HA documentation for 4.1, see https://cwiki.apache.org/confluence/display/STRATOS/4.1.0+Providing+High+Availability+for+Stratos :
Stratos allows for 2 deployment models, single JVM and distributed. Which one is better suited to reach the stated goal of 100% uptime / n-way redundancy?

According to the link (and please correct me if I am wrong), it seems that currently the components needed to allow n-way redundancy are:
+ BAM (doc is not updated yet; see https://docs.wso2.com/display/CLUSTER420/Clustering+BAM+2.5.0 ?)
+ Core components (Manager, Autoscaler, Cloud Controller) in active/passive mode through Linux HA.
+ The RDBMS used for the registry needs to support n-way redundancy as well.
+ ActiveMQ: multiple models are suggested (ZooKeeper, shared DB or shared file system). Which one would be recommended to achieve n-way redundancy?
+ CEP seems to allow a 2-node configuration only, or is there support for n-way redundancy?
+ Stratos Load Balancer: the doc lists some caveats, such as session affinity not being supported in a distributed environment. Is it n-way ready?

Is there any other component I have missed?

What are the missing pieces to reach n-way redundancy (or 100% uptime)?
Are there any other models to reach the stated goal, and what would it take to get Stratos there?

Thanks
Martin