Re: Improvements to Autoscaling in Apache Stratos [gsoc]

Lahiru Sandaruwan Tue, 29 Apr 2014 20:34:14 -0700

On Wed, Apr 30, 2014 at 8:24 AM, Nirmal Fernando <[email protected]> wrote:


> Hi Lahiru,
>
> I still don't understand what the difference is here. This is the same
> concept we have had since the pre-Apache era. In the requests-in-flight case,
> the user gives the number of requests that an instance can bear, and we scale
> based on the current load.
>

Please note the difference in this number, which I described in the thread I
pointed to. The number means something slightly different now than it did then.

>
> And AFAIS what we need to improve is the prediction logic.
>

No. We do not stop after the prediction. We also calculate the number of
instances, which we did not do before.

Then we do not have to worry about an upper limit and a lower limit.
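
As a minimal sketch of that calculation (not the Stratos implementation): assuming
the user supplies "the number of concurrent requests that one instance can handle"
and we already have a predicted requests-in-flight value for the cluster, the
instance count is a ceiling division. The class and method names below are
hypothetical, and treating a negative prediction as zero load is my assumption.

```java
/** Hypothetical helper, not part of Stratos: sizes a cluster from predicted requests in flight. */
final class RequestsInFlightSizer {

    /**
     * @param predictedRequestsInFlight predicted cluster-wide requests in flight for the next interval
     * @param requestsPerInstance       user-supplied number of concurrent requests one instance can handle
     * @return number of instances needed to serve the predicted load
     */
    static int requiredInstances(double predictedRequestsInFlight, double requestsPerInstance) {
        if (requestsPerInstance <= 0) {
            throw new IllegalArgumentException("requestsPerInstance must be positive");
        }
        // Round up: a partially loaded instance is still a whole instance.
        int needed = (int) Math.ceil(predictedRequestsInFlight / requestsPerInstance);
        // Assumption: a negative prediction is treated as zero load.
        return Math.max(0, needed);
    }

    public static void main(String[] args) {
        // 250 predicted requests, 100 per instance: ceil(2.5)
        System.out.println(requiredInstances(250, 100)); // prints 3
    }
}
```

Because the result is an absolute instance count, the same call covers both scaling
up and scaling down: the autoscaler only has to compare it with the number of
active instances.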


>
> On Wed, Apr 30, 2014 at 5:58 AM, Lahiru Sandaruwan <[email protected]> wrote:
>
>> Hi Nirmal,
>>
>> I thought through the scenario a bit and explained it in thread [1]. There,
>> Isuru Perera sent a use case that I used to explain how things happen. With
>> the new approach, we need a value from the user, but it is not a threshold. It
>> is "the number of concurrent requests that one instance can handle".
>>
>> Anyway, we need more people to think this through :)
>>
>> Everyone's ideas are highly appreciated, since this is like the "brain" of
>> Stratos (it can live without it, but would be of no use ;)).
>>
>> Thanks.
>>
>> [1] Load Balancer Statistics Publishing Sliding Window
>>
>>
>> On Tue, Apr 29, 2014 at 11:39 PM, Nirmal Fernando <[email protected]> wrote:
>>
>>> Guys,
>>>
>>> What's the plan for finding the value of T (the threshold)? To me, we need
>>> to get it from the user via the auto-scaling policy.
>>>
>>>
>>> On Mon, Mar 31, 2014 at 11:40 PM, Lahiru Sandaruwan <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>>
>>>> On Sat, Mar 29, 2014 at 5:29 AM, Asiri Liyana Arachchi <
>>>> [email protected]> wrote:
>>>>
>>>>>
>>>>> *Predicting the Number of Instances.*
>>>>>
>>>>> Let's take
>>>>>
>>>>> n - predicted number of instances
>>>>> m - active instances
>>>>> T - threshold
>>>>> L - predicted next-minute load / memory consumption (return value of the
>>>>> *org.apache.stratos.autoscaler.rule.RuleTasksDelegator#getPredictedValueForNextMinute()*
>>>>> method)
>>>>> 0.8 - scale-up factor
>>>>> 0.2 - scale-down factor
>>>>>
>>>>> *Requests in flight* is per cluster.
>>>>>
>>>>> Therefore, as I understand it, the threshold value for requestInFlight
>>>>> pretty much means how many in-flight requests one instance will
>>>>> handle.
>>>>>
>>>>> n = L/(T*0.8)
>>>>>
>>>>> Scale-down is done only when the predicted value is lower than T*0.2.
>>>>>
>>>>>
>>>>> *Memory Consumption (mc ) and Load Average (la )* is per member.
>>>>>
>>>>
>>>> We get these stats cluster-wise as well. Currently the cluster-wise stats
>>>> are used for making the decision. Member-wise stats are used when we choose
>>>> nodes to terminate: the node that is least loaded at that moment is selected
>>>> for termination.
>>>>
>>>>>
>>>>>
>>>>> m * L <= n * (T*0.8)
>>>>>
>>>>> Hence n can be calculated by taking the ceiling of (m*L) / (T*0.8) as
>>>>> an int.
>>>>> Scale-down is done only when the predicted value is lower than T*0.2.
>>>>>
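A minimal sketch of the per-member calculation quoted above, solving
m * L <= n * (T*0.8) for the smallest integer n. The class and method names are
hypothetical; only the symbols m, L, T and the 0.8 factor come from the quoted
mail, and the referenced RuleTasksDelegator code may differ.

```java
/** Hypothetical helper, not part of Stratos: sizes a cluster from a per-member metric. */
final class MemberMetricSizer {

    /** Scale-up factor from the quoted mail (0.8). */
    static final double SCALE_UP_FACTOR = 0.8;

    /**
     * @param activeInstances  m, the number of currently active instances
     * @param predictedAverage L, the predicted per-member load average or memory consumption
     * @param threshold        T, the per-member threshold for that metric
     * @return the smallest n with m * L <= n * (T * 0.8)
     */
    static int requiredInstances(int activeInstances, double predictedAverage, double threshold) {
        if (threshold <= 0) {
            throw new IllegalArgumentException("threshold must be positive");
        }
        // Total predicted load, assuming it spreads evenly over all members.
        double totalLoad = activeInstances * predictedAverage;
        return (int) Math.ceil(totalLoad / (threshold * SCALE_UP_FACTOR));
    }

    public static void main(String[] args) {
        // m = 4, L = 90, T = 100: 360 / 80 = 4.5, rounded up
        System.out.println(requiredInstances(4, 90.0, 100.0)); // prints 5
    }
}
```

The even-spread assumption is the same one stated earlier in the thread: the load
is distributed equally among all instances, including newly spawned ones.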
>>>>>
>>>>> *getPredictedValueForNextMinute()* predicts the next-minute values.
>>>>> So rather than writing an instance prediction algorithm from scratch, the
>>>>> needed instances can be calculated easily from the provided next-minute
>>>>> values (IMO).
>>>>> Currently the Stratos autoscaler is only capable of scaling up or down by
>>>>> one instance based on predicted values. But with this method it can
>>>>> predict exactly how many instances should be spawned to handle
>>>>> the next minute's load, and when scaling down it can predict how many
>>>>> instances should be terminated.
>>>>> Code: [1]
>>>>>
>>>>> I would like to know your comments on this approach.
>>>>>
>>>>>
>>>>> [1] :
>>>>> https://github.com/asiriwork/autoscaler-stratos/blob/a770787dca78ecfa3649624613fbb505280a2fb9/org.apache.stratos.autoscaler/src/main/java/org/apache/stratos/autoscaler/rule/RuleTasksDelegator.java
>>>>>
>>>>>
>>>>> Regards,
>>>>> Asiri
>>>>>
>>>>> On Sun, Mar 23, 2014 at 11:53 AM, Lahiru Sandaruwan
>>>>> <[email protected]> wrote:
>>>>>
>>>>>> Great to hear that.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sat, Mar 22, 2014 at 1:53 AM, Asiri Liyana Arachchi <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> I've submitted the proposal for the "Improvements to Autoscaling for
>>>>>>> Apache Stratos" project at google-melange.
>>>>>>>
>>>>>>> Here is the link
>>>>>>>
>>>>>>>
>>>>>>> https://www.google-melange.com/gsoc/proposal/review/student/google/gsoc2014/asiria/5629499534213120
>>>>>>>
>>>>>>>
>>>>>>> Regards
>>>>>>> Asiri
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Mar 18, 2014 at 4:29 AM, Asiri Liyana Arachchi <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Thanks a lot for the detailed reply.
>>>>>>>>
>>>>>>>> Running the samples you pointed to helped a lot in getting familiar
>>>>>>>> with Drools. I've also built the code base.
>>>>>>>>
>>>>>>>> After going through scaling.drl
>>>>>>>> (products/autoscaler/modules/distribution/src/main/conf/scaling.drl),
>>>>>>>> it was clear that Stratos currently uses the
>>>>>>>> RuleTasksDelegator.getPredictedValueForNextMinute() method to compare
>>>>>>>> stat values against the thresholds.
>>>>>>>>
>>>>>>>> *Approach to deciding the number of instances that might be needed to
>>>>>>>> handle the load:*
>>>>>>>>
>>>>>>>> Use the existing method to predict next-minute requests in flight,
>>>>>>>> load average and memory consumption.
>>>>>>>>
>>>>>>>>    - Assumption: the current thresholds of those metrics are the
>>>>>>>>    optimal values for an instance.
>>>>>>>>    - Based on that, implement a simple algorithm to decide how many
>>>>>>>>    instances might be needed for the next minute, using the
>>>>>>>>    predicted values for those metrics.
>>>>>>>>    - The algorithm will be implemented in such a way that it
>>>>>>>>    always keeps the instances under (or near) the thresholds of
>>>>>>>>    one or more metrics, without exceeding them.
>>>>>>>>    - Assumption: metrics behave in inverse or direct proportion
>>>>>>>>    when instances are spawned (for example, load is distributed equally
>>>>>>>>    among all the instances, including newly spawned ones).
>>>>>>>>
>>>>>>>> *Predict the load according to a schedule defined by the end user*
>>>>>>>>
>>>>>>>> Does this mean providing functionality in the web UI to define a
>>>>>>>> schedule and make it active? It's not clear to me.
>>>>>>>> Can this be achieved by generating an auto-scale policy XML with
>>>>>>>> user-defined thresholds, similar to how it's done currently, and making
>>>>>>>> it possible to override the auto-scaling algorithm in use when
>>>>>>>> needed (for example at a specific, already defined time)?
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Asiri
>>>>>>>>
>>>>>>>> On Wed, Mar 12, 2014 at 8:05 AM, Lahiru Sandaruwan <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi Asiri,
>>>>>>>>>
>>>>>>>>> It is a pleasure to see your interest. Sorry for the late reply; I
>>>>>>>>> missed the mail.
>>>>>>>>>
>>>>>>>>> Get the code base and build it as a starting point for Stratos.
>>>>>>>>>
>>>>>>>>> You will not find Drools hard after running some samples. [1]
>>>>>>>>> looks like a good sample. You can just run those in WSO2 BRS. You can
>>>>>>>>> use your Java knowledge, as we can write Java code in the "then" section.
>>>>>>>>>
>>>>>>>>> AMQP knowledge means you have to understand the pub/sub model with
>>>>>>>>> topics. Conceptually, that's it. In addition, you need to handle
>>>>>>>>> subscribing and publishing in Java code.
>>>>>>>>>
>>>>>>>>> Great research; find my comments inline.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Mar 11, 2014 at 11:23 AM, Asiri Liyana Arachchi <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> 1. Improve Auto-scaling to predict the number of instances
>>>>>>>>>> required in the next time interval.
>>>>>>>>>>
>>>>>>>>>> As far as I understand, this project aims to introduce a new
>>>>>>>>>> auto-scaling strategy to Stratos, in addition to the threshold-based
>>>>>>>>>> auto-scaling currently in use, making it more proactive in
>>>>>>>>>> auto-scaling.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Correct. So the system should scale by understanding the load, and
>>>>>>>>> hence the number of instances required to handle that load.
>>>>>>>>>
>>>>>>>>> We have 3 types of information about load, and should consider all
>>>>>>>>> 3 for our decision.
>>>>>>>>>
>>>>>>>>>    - Requests in flight (information about how many requests are
>>>>>>>>>    waiting for a response)
>>>>>>>>>    - Load average of the running cartridge instances
>>>>>>>>>    - Memory consumption of the running cartridge instances
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> To do that, several strategies have been suggested.
>>>>>>>>>>
>>>>>>>>>> 1. Kalman Filter
>>>>>>>>>> 2. Control theory
>>>>>>>>>> 3. Time Series Analysis.
>>>>>>>>>> 4. FFT
>>>>>>>>>>
>>>>>>>>>> Having gone through these techniques, for now I feel that the
>>>>>>>>>> Kalman filter would be the most viable candidate and that it can be
>>>>>>>>>> used to address this issue effectively. There is an Apache API for the
>>>>>>>>>> Kalman filter [1].
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> We should find an efficient, yet simplest, way to get the job done.
>>>>>>>>> We currently use the S = u*t + 0.5*a*t*t prediction (motion) equation.
>>>>>>>>> This is one of the equations that the Kalman filter uses for
>>>>>>>>> prediction. But with this, we have to compare with a threshold to take
>>>>>>>>> the decision.
>>>>>>>>>
>>>>>>>>> We receive second derivative, gradient and average values at a
>>>>>>>>> given time. Let's say the time interval we consider is one minute. Then
>>>>>>>>> we can predict the load in the next minute using them.
>>>>>>>>> We also know the number of instances that are running at the
>>>>>>>>> moment. The algorithm does not need to be complex. It should be just
>>>>>>>>> intelligent enough to find the matching number of instances that
>>>>>>>>> should be there in the next minute.
>>>>>>>>>
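A minimal sketch of the motion-equation prediction described above, mapping the
gradient to u and the second derivative to a in S = u*t + 0.5*a*t*t. Adding S to
the current average to get the predicted value is my assumption about how the
three statistics combine, and the class and method names are hypothetical rather
than the actual Stratos code.

```java
/** Hypothetical helper, not part of Stratos: predicts a metric one interval ahead. */
final class LoadPredictor {

    /**
     * @param average          current average of the metric
     * @param gradient         first derivative of the metric (u in the motion equation)
     * @param secondDerivative second derivative of the metric (a in the motion equation)
     * @param intervals        how far ahead to predict, in the time unit of the derivatives (t)
     * @return predicted value of the metric after the given time
     */
    static double predictNextInterval(double average, double gradient,
                                      double secondDerivative, double intervals) {
        // S = u*t + 0.5*a*t*t: expected change over the interval.
        double change = gradient * intervals
                + 0.5 * secondDerivative * intervals * intervals;
        // Assumption: the prediction is the current average plus that change.
        return average + change;
    }

    public static void main(String[] args) {
        // average 100, gradient 10 per minute, second derivative 4, one minute ahead:
        // 100 + 10*1 + 0.5*4*1*1
        System.out.println(predictNextInterval(100.0, 10.0, 4.0, 1.0)); // prints 112.0
    }
}
```

The predicted value alone still needs a threshold comparison; dividing it by a
per-instance capacity is what turns it into an instance count.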
>>>>>>>>> [1] https://docs.wso2.org/display/BRS200/Sample+Rule+Definition
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> But I think selecting an auto-scaling algorithm would involve
>>>>>>>>>> more research and testing. Even selecting the metrics to predict on
>>>>>>>>>> will be challenging, because some of the metrics, for example *load
>>>>>>>>>> average*, depend on auto-scaling, causing predictions to
>>>>>>>>>> deviate from the actual values.
>>>>>>>>>>
>>>>>>>>>> I would appreciate it if you could comment on this.
>>>>>>>>>>
>>>>>>>>>> [1] :
>>>>>>>>>> http://commons.apache.org/proper/commons-math/apidocs/org/apache/commons/math3/filter/KalmanFilter.html
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Asiri
>>>>>>>>>>
>>>>>>>>>> On Thu, Mar 6, 2014 at 7:38 AM, Udara Liyanage <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Asiri,
>>>>>>>>>>>
>>>>>>>>>>> Glad to hear of your interest in Stratos. I don't think it will
>>>>>>>>>>> take more than a few days to learn Drools and AMQP. You will be able
>>>>>>>>>>> to do it within the given time period.
>>>>>>>>>>> Happy to see your project proposal soon.
>>>>>>>>>>>
>>>>>>>>>>> Touched, not typed. Erroneous words are a feature, not a typo.
>>>>>>>>>>> On Mar 6, 2014 7:13 AM, "Asiri Liyana Arachchi" <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> I'm Asiri Liyana Arachchi, a third-year student studying
>>>>>>>>>>>> Computer Science and Engineering at the University of Moratuwa,
>>>>>>>>>>>> Sri Lanka.
>>>>>>>>>>>> I would like to start contributing to the project $subject.
>>>>>>>>>>>> I've gone through the resources about this project, including the
>>>>>>>>>>>> Stratos documentation and the code base.
>>>>>>>>>>>>
>>>>>>>>>>>> As expected, I'm familiar with Java, JSON and SOA. I would
>>>>>>>>>>>> like to know how well and in what cases Drools and AMQP skills are
>>>>>>>>>>>> required. Also, would it be feasible to complete the project in the
>>>>>>>>>>>> project's limited time, given that Drools and AMQP are to be
>>>>>>>>>>>> learnt along with the total work of the project?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>
>>>>>>>>>>>> Asiri
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> --
>>>>>>>>> Lahiru Sandaruwan
>>>>>>>>> Software Engineer,
>>>>>>>>> Platform Technologies,
>>>>>>>>> WSO2 Inc., http://wso2.com
>>>>>>>>> lean.enterprise.middleware
>>>>>>>>>
>>>>>>>>> email: [email protected] cell: (+94) 773 325 954
>>>>>>>>> blog: http://lahiruwrites.blogspot.com/
>>>>>>>>> twitter: http://twitter.com/lahirus
>>>>>>>>> linked-in: http://lk.linkedin.com/pub/lahiru-sandaruwan/16/153/146
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Best Regards,
>>> Nirmal
>>>
>>> Nirmal Fernando.
>>> PPMC Member & Committer of Apache Stratos,
>>> Senior Software Engineer, WSO2 Inc.
>>>
>>> Blog: http://nirmalfdo.blogspot.com/
>>>
>>
>>
>>
>> --
>> --
>> Lahiru Sandaruwan
>> Committer and PPMC member, Apache Stratos(incubating),
>> Senior Software Engineer,
>> WSO2 Inc., http://wso2.com
>> lean.enterprise.middleware
>>
>> email: [email protected] cell: (+94) 773 325 954
>> blog: http://lahiruwrites.blogspot.com/
>> twitter: http://twitter.com/lahirus
>> linked-in: http://lk.linkedin.com/pub/lahiru-sandaruwan/16/153/146
>>
>>
>
>
> --
> Best Regards,
> Nirmal
>
> Nirmal Fernando.
> PPMC Member & Committer of Apache Stratos,
> Senior Software Engineer, WSO2 Inc.
>
> Blog: http://nirmalfdo.blogspot.com/
>



-- 
--
Lahiru Sandaruwan
Committer and PPMC member, Apache Stratos(incubating),
Senior Software Engineer,
WSO2 Inc., http://wso2.com
lean.enterprise.middleware

email: [email protected] cell: (+94) 773 325 954
blog: http://lahiruwrites.blogspot.com/
twitter: http://twitter.com/lahirus
linked-in: http://lk.linkedin.com/pub/lahiru-sandaruwan/16/153/146
