[gsoc] Improvements to Autoscaling in Apache Stratos

Asiri Liyana Arachchi Mon, 18 Aug 2014 04:24:30 -0700

Hi Devs,

*Requirement 1:* Improve autoscaling to predict the number of instances
required in the next time interval. Currently the autoscaler predicts the
load for the next time interval and then uses a threshold to decide
whether to scale up or down. In the current configuration it scales up or
down one instance at a time. [1]


*Implementation:* The number of instances required for the next minute is
calculated based on three factors:

Requests In Flight
Memory Consumption
Load Average

For the *requests in flight* (RIF) based calculation of required
instances, a new approach has been used.
Currently the autoscaler expects threshold values such as an upper limit
and a lower limit from the autoscaling policy. The new implementation does
not expect them; instead a threshold is calculated from the LB stats.

How that threshold is calculated:
A new stat, "request served count", is introduced in the LB. It holds the
number of requests served since the last time the stat events were
published to CEP.
Using CEP with a related execution plan, the average number of requests
that an instance can handle is calculated. It is aggregated over a time
window of 10 minutes, and the average is sent to the autoscaler.
Based on that value and the RIF value predicted by the existing prediction
algorithm, the number of instances required is calculated.
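
To make the calculation above concrete, here is a rough sketch in Python
(the real logic lives in the CEP execution plan and the autoscaler, so
every name here is hypothetical). It assumes each LB event carries the
served count together with the number of active instances, and that the
required count is the predicted RIF divided by the windowed average,
rounded up; the exact formula is not spelled out in this mail.

```python
import math
from collections import deque

WINDOW_SECONDS = 600  # the 10-minute window mentioned above


class ServedPerInstanceWindow:
    """Sliding-window average of requests served per instance (hypothetical)."""

    def __init__(self, window_seconds=WINDOW_SECONDS):
        self.window_seconds = window_seconds
        self._samples = deque()  # (timestamp, served_per_instance)

    def add(self, timestamp, served_count, active_instances):
        # Assumption: one sample per published LB stat event.
        if active_instances > 0:
            self._samples.append((timestamp, served_count / active_instances))
        # Evict samples that have fallen out of the window.
        while self._samples and self._samples[0][0] <= timestamp - self.window_seconds:
            self._samples.popleft()

    def average(self):
        if not self._samples:
            return 0.0
        return sum(value for _, value in self._samples) / len(self._samples)


def required_instances_for_rif(predicted_rif, avg_served_per_instance, min_instances=1):
    """Assumed formula: ceil(predicted RIF / average served per instance)."""
    if avg_served_per_instance <= 0:
        return min_instances  # no usable stats yet, stay at the minimum
    needed = math.ceil(predicted_rif / avg_served_per_instance)
    return max(min_instances, needed)
```

For example, with a windowed average of 100 requests per instance and a
predicted RIF of 250, this sketch asks for 3 instances.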

For *Memory Consumption* and *Load Average* it's not possible to calculate
thresholds in the same way as for RIF, so the current stats are used to
calculate the required number of instances for the next minute.
Once the required number of instances is calculated, the autoscaler scales
instances up or down based on the logic written in the rule file.
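
One way the memory and load average calculation could be sketched is
shown below. This is only an illustration in Python with hypothetical
names: it assumes the threshold for these two metrics still comes from the
autoscaling policy, that the instance count is scaled in proportion to the
predicted value, and that the largest of the three per-metric results
wins. The actual rule file may combine them differently.

```python
import math


def required_by_ratio(active_instances, predicted_value, threshold):
    """Assumed proportional rule: keep the per-instance value at the threshold."""
    if threshold <= 0 or active_instances <= 0:
        # No usable input: keep what is running, but never go below one.
        return max(active_instances, 1)
    return max(1, math.ceil(active_instances * predicted_value / threshold))


def required_overall(by_rif, by_memory, by_load_average):
    """Assumption: the most demanding metric decides the instance count."""
    return max(by_rif, by_memory, by_load_average)
```

With 4 active instances, a predicted memory consumption of 90 and a policy
threshold of 60, the sketch asks for 6 instances.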

As suggested, scaling down is done slowly to reduce the high variation in
spawning and terminating instances.
This configuration has been tested successfully. The tests covered most of
the issues that can occur in an actual production environment.
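
The asymmetric behaviour (scale up straight to the required count, scale
down slowly) can be sketched as follows. This is Python with hypothetical
names, and the one-instance-per-evaluation step for scaling down is an
assumption; the real behaviour is defined in the rule file.

```python
def next_instance_count(current, required, min_instances, max_instances,
                        scale_down_step=1):
    """Return the instance count to aim for in this evaluation cycle."""
    # Keep the target inside the partition limits.
    target = max(min_instances, min(max_instances, required))
    if target > current:
        return target  # scale up in one go
    if target < current:
        # Scale down gradually (assumed step size) to avoid flapping.
        return max(target, current - scale_down_step)
    return current
```

So going from 2 to a required 6 happens in one cycle, while going from 6
to a required 2 takes four cycles with the default step.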

It may look as if instances are spawned with little control and at a rapid
rate of spawning and terminating, but that is not the case: the predicted
values of the metrics under consideration are included in the equation,
and scaling down slowly gives the implementation more stability.
It's more responsive than the previous configuration and provides high
availability.

A pull request [2] has already been submitted.


*Requirement 2:* Predict the load according to a schedule defined by the
end user.
Seasonal load expectations will be handled by this aspect. [1]

*Implementation Plan:* Extend the deployment policy definition so that
schedule info can be added, with attributes for time and for maximum and
minimum partition count.





The partition maximums and minimums defined in the partition are kept as
default values while the system is operational. Once the scheduled time
starts, those values are replaced with the schedule-defined values.
I couldn't complete this aspect because setting up the Stratos development
environment took an unexpected amount of time. I'll continue the work and
complete it.
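
Since this part is not implemented yet, the following is only a sketch of
the planned override in Python. All names are hypothetical, and the end
time per schedule entry is an assumption (the plan above only mentions
time, maximum and minimum).

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class ScheduleEntry:
    """Hypothetical schedule info attached to a deployment policy."""
    start: time
    end: time  # assumption: each entry has an explicit end time
    partition_min: int
    partition_max: int


def effective_limits(now, default_min, default_max, schedule):
    """Use the scheduled limits while an entry is active, else the defaults."""
    for entry in schedule:
        if entry.start <= now < entry.end:
            return entry.partition_min, entry.partition_max
    return default_min, default_max
```

For a schedule entry from 09:00 to 17:00 with min 3 and max 10, a check at
12:00 returns (3, 10), and a check at 20:00 falls back to the partition
defaults.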


Thank you so much, Stratos community, for the support offered so far.




Related Mail threads [3]

[1]: https://issues.apache.org/jira/browse/STRATOS-488
[2]: https://github.com/apache/stratos/pull/17
[3]: https://www.mail-archive.com/[email protected]/msg00077.html
     http://mail-archives.apache.org/mod_mbox/stratos-dev/201403.mbox/%3ccafwrs++z_czay7uvaaghvwrwnkuaoh_vrawehp_sndbsjqz...@mail.gmail.com%3E


Regards,
Asiri
