Hi Devs,

*Requirement 1:* Improve autoscaling to predict the number of instances required in the next time interval. Currently the autoscaler predicts the load for the next time interval and then uses a threshold to decide whether to scale up or down. In the current configuration it scales up or down one instance at a time. [1]

*Implementation:* The number of instances required at the next minute is calculated based on three factors:

- Requests in flight
- Memory consumption
- Load average

A new approach is used for the *requests in flight* (RIF) based calculation. Currently the autoscaler expects threshold values, such as an upper limit and a lower limit, from the autoscaling policy. The new implementation does not expect them; instead, a threshold is calculated from the LB stats.

How that threshold is calculated: a new stat, "request served count", is introduced in the LB. It holds the number of requests served since the last time stat events were published to CEP. A CEP execution plan uses this stat to calculate the average number of requests that an instance can handle. The value is averaged over a 10-minute time window and sent to the autoscaler. The number of instances required is then calculated from that average and the RIF value predicted by the existing prediction algorithm.

For *memory consumption* and *load average* it is not possible to calculate thresholds the way it is done for RIF, so the current stats are used to calculate the required number of instances for the next minute.

Once the required number of instances is calculated, the autoscaler scales instances up or down based on the logic written in the rule file. As suggested, scaling down is done slowly to reduce the high variation in spawning and terminating instances.

This configuration has been tested successfully. The testing considered most of the issues that can occur in an actual production environment. It may look as if instances are spawned and terminated rapidly and with little control, but that is not the case: the predicted values of the metrics considered are included in the equation, and scaling down slowly gives the implementation more stability.
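
To make the RIF-based calculation and the slow scale-down concrete, here is a minimal Python sketch. It is not the code in the pull request (the real logic lives in the autoscaler and its rule file); the function names, the `min_instances` floor, and the one-instance-per-interval scale-down step are assumptions chosen only for illustration.

```python
import math


def required_instances_for_rif(predicted_rif, avg_served_per_instance, min_instances=1):
    """Instances needed to serve the predicted requests in flight.

    predicted_rif: RIF value predicted for the next interval.
    avg_served_per_instance: average requests one instance can handle,
        as aggregated over the time window (10 minutes in the mail).
    min_instances: hypothetical floor, not taken from the original mail.
    """
    if avg_served_per_instance <= 0:
        # No capacity estimate yet (for example, no traffic in the window):
        # fall back to the minimum instead of dividing by zero.
        return min_instances
    return max(min_instances, math.ceil(predicted_rif / avg_served_per_instance))


def next_instance_count(current, required, max_scale_down_step=1):
    """Scale up straight to the requirement, but scale down gradually.

    max_scale_down_step is an assumed limit on how many instances may be
    terminated per interval; the mail only says scaling down is done slowly.
    """
    if required >= current:
        return required
    return max(required, current - max_scale_down_step)
```

For example, with a predicted RIF of 250 and an average of 60 requests served per instance, `required_instances_for_rif(250, 60)` gives 5. If 5 instances are running and the requirement later drops to 2, `next_instance_count(5, 2)` gives 4, so the cluster shrinks one step per interval rather than all at once.
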

The new configuration is more responsive than the previous one and provides high availability. A pull request [2] has already been made.

*Requirement 2:* Predict the load according to a schedule defined by the end user. Seasonal load expectations will be handled by this aspect. [1]

*Implementation Plan:* Define the deployment policy so that schedule info can be added, with attributes for time and for the maximum and minimum partition count. The partition maximums and minimums defined in the partition are kept as default values while the system is operational. Once the scheduled time starts, those values are replaced with the schedule-defined values.

I couldn't complete this aspect because setting up the Stratos development environment took an unexpectedly large amount of time. I'll continue the work and complete it.

Thank you so much, Stratos community, for the support offered so far.

Related mail threads: [3]

[1]: https://issues.apache.org/jira/browse/STRATOS-488
[2]: https://github.com/apache/stratos/pull/17
[3]: https://www.mail-archive.com/dev@stratos.apache.org/msg00077.html
     http://mail-archives.apache.org/mod_mbox/stratos-dev/201403.mbox/%3ccafwrs++z_czay7uvaaghvwrwnkuaoh_vrawehp_sndbsjqz...@mail.gmail.com%3E

Regards,
Asiri