Re: Improvements to Autoscaling in Apache Stratos [gsoc]

Lahiru Sandaruwan Tue, 29 Apr 2014 17:29:29 -0700

Hi Nirmal,

I thought through the scenario a bit and explained it in thread [1]. There,
Isuru Perera sent a use case that I used to explain how things happen. With
the new approach, we still need a value from the user, but it is not a
threshold. It is "the number of concurrent requests that one instance can
handle".
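A minimal sketch of how that per-instance value could drive the decision, assuming the predicted number of requests in flight for the cluster is already available. The function and parameter names here are made up for illustration and are not Stratos APIs.

```python
import math


def required_instances(predicted_requests_in_flight, requests_per_instance,
                       min_instances=1):
    """Instances needed so that no instance exceeds its stated capacity.

    predicted_requests_in_flight: predicted cluster-wide requests in flight
    requests_per_instance: user-supplied number of concurrent requests that
        one instance can handle (hypothetical policy value)
    min_instances: assumed lower bound so the cluster never drops to zero
    """
    if requests_per_instance <= 0:
        raise ValueError("requests_per_instance must be positive")
    needed = math.ceil(predicted_requests_in_flight / requests_per_instance)
    return max(min_instances, needed)


if __name__ == "__main__":
    print(required_instances(250, 100))
```

For example, with 250 predicted requests in flight and 100 concurrent requests per instance, this asks for 3 instances.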


Anyway, we need more people to think this through :)

Everyone's ideas are highly appreciated, since this is like the "brain" of
Stratos (it can live without it, but it would be of no use ;)).

Thanks.

[1] Load Balancer Statistics Publishing Sliding Window


On Tue, Apr 29, 2014 at 11:39 PM, Nirmal Fernando <[email protected]> wrote:

> Guys,
>
> What's the plan for finding the value of T (the threshold)? To me, we need
> to get it from the user via the auto-scaling policy.
>
>
> On Mon, Mar 31, 2014 at 11:40 PM, Lahiru Sandaruwan <[email protected]> wrote:
>
>> Hi,
>>
>>
>> On Sat, Mar 29, 2014 at 5:29 AM, Asiri Liyana Arachchi <
>> [email protected]> wrote:
>>
>>>
>>> *Predicting the Number of Instances.*
>>>
>>> Let's take
>>>
>>> n - predicted number of instances
>>> m - active instances
>>> T - threshold
>>> L - predicted next-minute load / memory consumption (return value of the
>>> *org.apache.stratos.autoscaler.rule.RuleTasksDelegator#getPredictedValueForNextMinute()*
>>> method)
>>> 0.8 - scale up factor
>>> 0.2 - scale down factor
>>>
>>> *Requests in flight* is measured per cluster.
>>>
>>> Therefore, as I understand it, the threshold value for requestInFlight
>>> pretty much means how many in-flight requests one instance will handle.
>>>
>>> n = L/(T*0.8)
>>>
>>> Scale down is done only when the predicted value is lower than T*0.2
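One possible reading of the two rules above, as a rough sketch and not the linked implementation. It assumes L is the cluster-wide prediction, T is the per-instance threshold, and that the scale-down check is applied to each instance's share of the load (L/m). That last point is an interpretation, since the mail does not say which value is compared with T*0.2. All names are illustrative.

```python
import math

SCALE_UP_FACTOR = 0.8    # from the mail: target at most 80% of T per instance
SCALE_DOWN_FACTOR = 0.2  # from the mail: shrink only below 20% of T


def target_instances(predicted_load, threshold, active_instances):
    """Return the instance count to aim for in the next minute.

    predicted_load: L, predicted cluster-wide requests in flight
    threshold: T, requests in flight one instance should handle
    active_instances: m, instances currently running
    """
    if threshold <= 0 or active_instances < 1:
        raise ValueError("threshold must be > 0 and active_instances >= 1")

    # n = L / (T * 0.8), rounded up, and never below one instance (assumption)
    needed = max(1, math.ceil(predicted_load / (threshold * SCALE_UP_FACTOR)))

    if needed > active_instances:
        return needed  # scale up straight to the predicted count

    # Assumed interpretation: compare each instance's share of L with T * 0.2
    share_per_instance = predicted_load / active_instances
    if needed < active_instances and share_per_instance < threshold * SCALE_DOWN_FACTOR:
        return needed  # scale down to the predicted count

    return active_instances  # inside the dead band: leave the cluster alone


if __name__ == "__main__":
    print(target_instances(500, 100, 3))
    print(target_instances(100, 100, 4))
    print(target_instances(40, 100, 4))
```

The gap between the 0.8 and 0.2 factors acts as a dead band, so a small dip in load does not immediately terminate instances that were just started.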
>>>
>>>
>>> *Memory Consumption (mc) and Load Average (la)* are per member.
>>>
>>
>> We get these stats cluster-wise as well. Currently the cluster-wise stats
>> are used for taking the decision. Member-wise stats are used when we choose
>> nodes to terminate: the least loaded node at that moment is selected for
>> termination.
>>
>>>
>>>
>>> m * L <= n * (T*0.8)
>>>
>>> Hence n can be calculated by taking the ceiling of (m*L) / (T*0.8) as an
>>> int.
>>> Scale down is done only when the predicted value is lower than T*0.2
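For the per-member metrics, the inequality m * L <= n * (T*0.8) can be solved for n directly. The sketch below assumes, as the thread does elsewhere, that the total load m * L is spread evenly over the n instances. The function name and the lower bound of one instance are illustrative assumptions.

```python
import math


def instances_for_member_metric(active_instances, predicted_per_member,
                                threshold, scale_up_factor=0.8):
    """Smallest n satisfying m * L <= n * (T * scale_up_factor).

    active_instances: m, instances currently running
    predicted_per_member: L, predicted per-member load average or memory use
    threshold: T, the per-member threshold for that metric
    """
    if threshold <= 0 or scale_up_factor <= 0:
        raise ValueError("threshold and scale_up_factor must be positive")
    total_load = active_instances * predicted_per_member  # m * L
    return max(1, math.ceil(total_load / (threshold * scale_up_factor)))


if __name__ == "__main__":
    # 4 members predicted at 90 each against a threshold of 80
    print(instances_for_member_metric(4, 90, 80))
```

With 4 members each predicted at 90 against a threshold of 80, the total of 360 divided by 64 gives 5.625, so 6 instances are needed.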
>>>
>>>
>>> *getPredictedValueForNextMinute()* predicts the next-minute values. So
>>> rather than writing an instance prediction algorithm from scratch, the
>>> needed instances can be calculated easily from the provided next-minute
>>> values (IMO).
>>> Currently the Stratos autoscaler is only capable of scaling up or down by
>>> one instance based on predicted values. With this method it can predict
>>> exactly how many instances should be spawned to handle the next minute's
>>> load, and when scaling down it can predict how many instances should be
>>> terminated.
>>> Code : [1]
>>>
>>> I would like to know your comments on this approach.
>>>
>>>
>>> [1] :
>>> https://github.com/asiriwork/autoscaler-stratos/blob/a770787dca78ecfa3649624613fbb505280a2fb9/org.apache.stratos.autoscaler/src/main/java/org/apache/stratos/autoscaler/rule/RuleTasksDelegator.java
>>>
>>>
>>> Regards,
>>> Asiri
>>>
>>> On Sun, Mar 23, 2014 at 11:53 AM, Lahiru Sandaruwan <[email protected]> wrote:
>>>
>>>> Great to hear that.
>>>>
>>>> Thanks.
>>>>
>>>>
>>>>
>>>> On Sat, Mar 22, 2014 at 1:53 AM, Asiri Liyana Arachchi <
>>>> [email protected]> wrote:
>>>>
>>>>> I've submitted the proposal for the "Improvements to Autoscaling for
>>>>> Apache Stratos" project at google-melange.
>>>>>
>>>>> Here is the link
>>>>>
>>>>>
>>>>> https://www.google-melange.com/gsoc/proposal/review/student/google/gsoc2014/asiria/5629499534213120
>>>>>
>>>>>
>>>>> Regards
>>>>> Asiri
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Mar 18, 2014 at 4:29 AM, Asiri Liyana Arachchi <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Thanks a lot for the detailed reply.
>>>>>>
>>>>>> Running the samples you pointed to helped a lot in getting familiar
>>>>>> with Drools. I've also built the code base.
>>>>>>
>>>>>> After going through scaling.drl
>>>>>> (products/autoscaler/modules/distribution/src/main/conf/scaling.drl) it
>>>>>> was clear that Stratos currently uses the
>>>>>> RuleTasksDelegator.getPredictedValueForNextMinute() method to compare
>>>>>> stat values against the thresholds.
>>>>>>
>>>>>> *Approach for deciding the number of instances needed to handle the
>>>>>> load:*
>>>>>>
>>>>>> Use the existing method for predicting next-minute requests in flight,
>>>>>> load average and memory consumption.
>>>>>>
>>>>>>    - Assumption: the current thresholds of those metrics are the optimal
>>>>>>    values for an instance.
>>>>>>    - Based on that, implement a simple algorithm to decide how many
>>>>>>    instances might be needed for the next minute, using the predicted
>>>>>>    values of those metrics.
>>>>>>    - The algorithm will be implemented so that it always keeps the
>>>>>>    instances under (or near) the thresholds of one or more metrics,
>>>>>>    without exceeding them.
>>>>>>    - Assumption: metrics behave inversely or directly proportionally when
>>>>>>    instances are spawned (e.g. load is equally distributed among all the
>>>>>>    instances, including newly spawned ones).
>>>>>>
>>>>>> *Predict the load according to a schedule defined by the end user*
>>>>>>
>>>>>> *Does this mean providing functionality in the web UI to define a
>>>>>> schedule and make it active?* It's not clear to me.
>>>>>> *Can this be achieved by generating an auto-scale policy XML with
>>>>>> user-defined thresholds, similar to how it's done currently, and making
>>>>>> it possible to override the auto-scaling algorithm in use when needed
>>>>>> (e.g. at a specific, already defined time)?*
>>>>>>
>>>>>> Thanks
>>>>>> Asiri
>>>>>>
>>>>>> On Wed, Mar 12, 2014 at 8:05 AM, Lahiru Sandaruwan 
>>>>>> <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Asiri,
>>>>>>>
>>>>>>> It is a pleasure to see your interest. Sorry for the late reply. I
>>>>>>> missed the mail.
>>>>>>>
>>>>>>> Get the code base and build it as a starting point for Stratos.
>>>>>>>
>>>>>>> You will not find Drools hard after running some samples. [1] looks
>>>>>>> like a good sample. You can just run those in WSO2 BRS. You can use your
>>>>>>> Java knowledge, as we can write Java code in the "then" section.
>>>>>>>
>>>>>>> AMQP knowledge means you have to understand the pub/sub model with
>>>>>>> topics. Conceptually, that's it. In addition, you need to handle
>>>>>>> subscribers/publishers in Java code.
>>>>>>>
>>>>>>> Great research; find the comments inline.
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Mar 11, 2014 at 11:23 AM, Asiri Liyana Arachchi <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> 1. Improve Auto-scaling to predict the number of instances required
>>>>>>>> in the next time interval.
>>>>>>>>
>>>>>>>> As far as I understand, this project aims to introduce a new
>>>>>>>> auto-scaling strategy to Stratos, apart from the threshold-based
>>>>>>>> auto-scaling currently in use, making it more proactive.
>>>>>>>>
>>>>>>>
>>>>>>> Correct. So the system should scale by understanding the load, and hence
>>>>>>> the number of instances required to handle that load.
>>>>>>>
>>>>>>> We have 3 types of information about load, and should consider all 3
>>>>>>> for our decision.
>>>>>>>
>>>>>>>    - Requests in flight (how many requests are waiting for a
>>>>>>>    response)
>>>>>>>    - Load average of cartridge instances running
>>>>>>>    - Memory consumption of cartridge instances running
>>>>>>>
>>>>>>>
>>>>>>>> To do that, several strategies have been suggested.
>>>>>>>>
>>>>>>>> 1. Kalman Filter
>>>>>>>> 2. Control theory
>>>>>>>> 3. Time Series Analysis.
>>>>>>>> 4. FFT
>>>>>>>>
>>>>>>>> Having gone through these techniques, for now I feel that the Kalman
>>>>>>>> filter would be the most viable candidate, and that it can address
>>>>>>>> this issue effectively. There is an Apache API for the Kalman filter [1].
>>>>>>>>
>>>>>>>
>>>>>>> We should find an efficient, yet simple, way to get the job done.
>>>>>>> We currently use the S = u*t + 0.5*a*t*t prediction (motion) equation.
>>>>>>> This is one of the equations that the Kalman filter uses for prediction.
>>>>>>> But with this, we have to compare with a threshold to take the decision.
>>>>>>>
>>>>>>> We receive the second derivative, gradient and average values at a given
>>>>>>> time. Let's say the time interval we consider is a minute. So we can
>>>>>>> predict the load in the next minute using them.
>>>>>>> Also, we know the number of instances that are running at the moment.
>>>>>>> The algorithm does not need to be complex. It should be just intelligent
>>>>>>> enough to find the matching number of instances that should be there in
>>>>>>> the next minute.
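To make the motion-equation prediction concrete, here is a small sketch. It assumes the gradient plays the role of u, the second derivative plays the role of a, and the displacement S is added to the current average to get the next-minute value. The thread does not spell out that last step, so treat it as an assumption. The function name is illustrative and not the Stratos method.

```python
def predict_next_value(average, gradient, second_derivative, interval=1.0):
    """Predict a metric one interval ahead using S = u*t + 0.5*a*t*t.

    average: current average value of the metric
    gradient: first derivative (u), change per unit time
    second_derivative: second derivative (a), change of the gradient per unit time
    interval: t, in the same time unit as the derivatives (1.0 = one minute here)
    """
    displacement = gradient * interval + 0.5 * second_derivative * interval ** 2
    # Assumption: the prediction is the current average plus the displacement S
    return average + displacement


if __name__ == "__main__":
    # average 100, rising by 10 per minute, with the rise accelerating by 4
    print(predict_next_value(100, 10, 4))
```

With an average of 100, a gradient of 10 per minute and a second derivative of 4, the displacement is 10 + 2 = 12, so the predicted value is 112.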
>>>>>>>
>>>>>>> [1] https://docs.wso2.org/display/BRS200/Sample+Rule+Definition
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>>>
>>>>>>>> But I think selecting an auto-scaling algorithm would involve more
>>>>>>>> research and testing. Even selecting the metrics to predict on will
>>>>>>>> be challenging, because some metrics, for example *load average*,
>>>>>>>> depend on auto-scaling itself, causing predictions to deviate from the
>>>>>>>> actual values.
>>>>>>>>
>>>>>>>> I would appreciate it if you could comment on this.
>>>>>>>>
>>>>>>>> [1] :
>>>>>>>> http://commons.apache.org/proper/commons-math/apidocs/org/apache/commons/math3/filter/KalmanFilter.html
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Asiri
>>>>>>>>
>>>>>>>> On Thu, Mar 6, 2014 at 7:38 AM, Udara Liyanage <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi Asiri,
>>>>>>>>>
>>>>>>>>> Glad to hear of your interest in Stratos. I don't think it will take
>>>>>>>>> more than a few days to learn Drools and AMQP. You will be able to do it
>>>>>>>>> within the given time period.
>>>>>>>>> Happy to see your project proposal soon.
>>>>>>>>>
>>>>>>>>> Touched, not typed. Erroneous words are a feature, not a typo.
>>>>>>>>> On Mar 6, 2014 7:13 AM, "Asiri Liyana Arachchi" <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> I'm Asiri Liyana Arachchi, a third-year student studying Computer
>>>>>>>>>> Science and Engineering at the University of Moratuwa, Sri Lanka.
>>>>>>>>>> I would like to start contributing to the project $subject.
>>>>>>>>>> I've gone through the resources about this project, including the
>>>>>>>>>> Stratos documentation and the code base.
>>>>>>>>>>
>>>>>>>>>> As expected, I'm familiar with Java, JSON and SOA. I would like
>>>>>>>>>> to know how well and in what cases Drools and AMQP skills are
>>>>>>>>>> required. Also, would it be feasible to complete the project in the
>>>>>>>>>> project's limited time, considering that Drools and AMQP are to be
>>>>>>>>>> learnt along with the total work of the project?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks.
>>>>>>>>>>
>>>>>>>>>> Asiri
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> --
>>>>>>> Lahiru Sandaruwan
>>>>>>> Software Engineer,
>>>>>>> Platform Technologies,
>>>>>>> WSO2 Inc., http://wso2.com
>>>>>>> lean.enterprise.middleware
>>>>>>>
>>>>>>> email: [email protected] cell: (+94) 773 325 954
>>>>>>> blog: http://lahiruwrites.blogspot.com/
>>>>>>> twitter: http://twitter.com/lahirus
>>>>>>> linked-in: http://lk.linkedin.com/pub/lahiru-sandaruwan/16/153/146
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>>
>
>
> --
> Best Regards,
> Nirmal
>
> Nirmal Fernando.
> PPMC Member & Committer of Apache Stratos,
> Senior Software Engineer, WSO2 Inc.
>
> Blog: http://nirmalfdo.blogspot.com/
>



-- 
--
Lahiru Sandaruwan
Committer and PPMC member, Apache Stratos(incubating),
Senior Software Engineer,
WSO2 Inc., http://wso2.com
lean.enterprise.middleware

email: [email protected] cell: (+94) 773 325 954
blog: http://lahiruwrites.blogspot.com/
twitter: http://twitter.com/lahirus
linked-in: http://lk.linkedin.com/pub/lahiru-sandaruwan/16/153/146
