Hi Lahiru,
My comments below.

On Tue, Nov 11, 2014 at 11:30 PM, Lahiru Sandaruwan <lahi...@wso2.com>
wrote:

> Also, this is a good read to learn how Netflix uses these algorithms for
> scaling.
>
> http://techblog.netflix.com/search/label/prediction
>
> On Tue, Nov 11, 2014 at 11:07 PM, Lahiru Sandaruwan <lahi...@wso2.com>
> wrote:
>
>> Hi Seshika,
>>
>> Thanks for the detailed response,
>>
>> On Tue, Nov 11, 2014 at 10:08 PM, Seshika Fernando <sesh...@wso2.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> I have 2 comments.
>>>
>>> a. The timeseries extension to CEP, which supports uni-variate and
>>> multi-variate linear regression [1], can be used for this. We can use
>>> multi-variate regression to solve the curve fitting stated in Lahiru's
>>> email. Basically, what we need to do is use *t* and *t^2* as x1 and x2.
>>> Thereby, if we run linear regression, we get a, b, c such that
>>> V = a + b*t + c*t^2.
>>>
>>
>> Nice to see this exists in CEP already :)
>>
>> +1 for using multi-variate regression for the curve fitting. Is it
>> available in CEP 3.1.0? Or can we plug it into 3.1.0?
>>
>

It's not available in CEP 3.1.0, but since it's an extension it can be easily
plugged in.
>>
>>
>>> As Lasantha has mentioned we do have a forecasting facility as well, but
>>> currently it only works for uni-variate regression, which is not the case
>>> here. But if you really need it I might be able to extend it for this
>>> use-case, for the moment. You can still use the existing regression
>>> facility to determine the coefficients and do the forecasting yourself
>>> (which is just plugging those values into the above equation, with the
>>> relevant t values).
>>> Let me also just mention, that even though the function is 'linear'
>>> regression, we can use linear regression to fit polynomial curves as long
>>> as we know the degree of the polynomial function (which in this case we do).
>>>
>>>
>> In the Stratos case, I would like to have the forecasting flexibility on
>> the Autoscaler side, because then we can let the user decide the
>> prediction time duration even at runtime.
>>
>> BTW. Is it possible to do this in CEP, by redeploying execution plans or
>> so?
>>
Well, currently we have a forecast function for univariate regression,
where you can provide the y stream, the x stream, and the x value you
want to predict for. So for example,

timeseries:forecast(x+5, y, x) will use the y and x streams to compute the
regression equation and then provide the forecast y value for x+5, so you
can provide the prediction time here. However, the current implementation
only caters to the single independent variable scenario. So if you want to
do this on the CEP side, you can simply get the coefficients using the
timeseries:regress() function and then write a further Siddhi query, where
you compute v = a + b*t + c*t^2 for any incoming t value.
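
To illustrate the idea outside CEP, here is a rough sketch in plain Python
(not Siddhi). It fits v = a + b*t + c*t^2, as in point (a) above, by running
ordinary least squares with t and t^2 as the two regressors, and then
forecasts by plugging in a new t. The names fit_quadratic and forecast are
made up for this example and are not part of the timeseries extension.

```python
def fit_quadratic(ts, vs):
    """Least-squares fit of v = a + b*t + c*t^2, using t and t^2 as regressors.

    Needs at least three distinct t values, otherwise the system is singular.
    """
    # Each design-matrix row is [1, t, t^2]
    rows = [[1.0, t, t * t] for t in ts]
    n = 3
    # Normal equations (X^T X) beta = X^T v, stored as an augmented 3x4 matrix
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
           + [sum(r[i] * v for r, v in zip(rows, vs))]
           for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    # Back substitution
    beta = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (aug[i][n] - s) / aug[i][i]
    return tuple(beta)  # (a, b, c)


def forecast(coeffs, t):
    """Plug t into v = a + b*t + c*t^2."""
    a, b, c = coeffs
    return a + b * t + c * t * t
```

The regression itself stays linear in the coefficients; only the inputs
(t and t^2) are non-linear in time.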

>
>>> b. Can't we also consider using exponentially weighted moving averages
>>> for the previous approach? So instead of using average gradient and average
>>> second derivative we can use 'decaying windows' in CEP and get the
>>> exponentially weighted moving average of the gradient and second
>>> derivative. This will eliminate the spawning of new instances due to sudden
>>> 'spikes' as we can control the decaying factor such that we give a
>>> practically acceptable weightage to the most recent events compared to
>>> older events.
>>>
>>>
>> A great thought. Exponentially weighted moving average filter would be a
>> good addition for spike avoidance.
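
As a rough illustration of the smoothing effect (plain Python, not the CEP
decaying window itself; the function name ewma and the parameter alpha are
chosen just for this sketch), an exponentially weighted moving average of
the gradient could look like this:

```python
def ewma(values, alpha):
    """Exponentially weighted moving average.

    alpha in (0, 1] is the weight given to the newest value; a smaller
    alpha means older events decay more slowly and spikes are damped more.
    """
    smoothed = []
    current = None
    for v in values:
        # The first value seeds the average; later values are blended in
        current = v if current is None else alpha * v + (1 - alpha) * current
        smoothed.append(current)
    return smoothed


# Gradient samples with one sudden spike (illustrative numbers only)
gradients = [1.0, 1.0, 1.0, 10.0, 1.0, 1.0]
smoothed = ewma(gradients, alpha=0.2)
```

With alpha = 0.2 the spike of 10 shows up in the smoothed series as roughly
2.8, so a single outlier is much less likely to trigger a scale-up.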
>>
>> Thanks.
>>
>>
>>> Seshika
>>>
>>> 1. https://docs.wso2.com/display/CEP400/Regression
>>>
>>> On Tue, Nov 11, 2014 at 8:51 PM, Lasantha Fernando <
>>> lasantha....@gmail.com> wrote:
>>>
>>>> Hi Lahiru,
>>>>
>>>> Would it be possible to use the linear regression already available as
>>>> Siddhi extensions in [1], or maybe improve on those existing extensions
>>>> to fit polynomial curves? The code is available here [2].
>>>>
>>>> I think forecasting is also available, which can be useful in this
>>>> use case. WDYT? Just sharing my 2 cents.. :-)
>>>>
>>>> [1] http://mail.wso2.org/mailarchive/architecture/2014-March/015696.html
>>>> [2] https://github.com/wso2-dev/siddhi/tree/master/modules/siddhi-extensions
>>>>
>>>> Thanks,
>>>> Lasantha
>>>>
>>>> On Tue, Nov 11, 2014 at 3:58 PM, Lahiru Sandaruwan <lahi...@wso2.com>
>>>> wrote:
>>>> > Hi all,
>>>> >
>>>> > This contains the content I already sent to Stratos dev. The idea is to
>>>> > highlight and separate the new improvement.
>>>> >
>>>> > Current implementation
>>>> >
>>>> > Currently CEP calculates the average, gradient, and second derivative
>>>> > and sends those values to Autoscaler. Then Autoscaler predicts the
>>>> > values using S = u*t + 0.5*a*t*t.
>>>> >
>>>> > In this method the CEP calculation is not very accurate, as it does not
>>>> > consider all the events when calculating the gradient and second
>>>> > derivative. Therefore the equation we apply doesn't yield the best
>>>> > prediction.
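
For reference, the current prediction step amounts to something like the
following (a rough Python sketch; predict_current is a made-up name, and
adding the change S to the current average is an assumption about how
Autoscaler applies the formula, not something stated in this thread):

```python
def predict_current(average, gradient, second_derivative, t):
    """Predict the value t time units ahead from CEP's three statistics."""
    # S = u*t + 0.5*a*t*t, with u = gradient and a = second derivative
    change = gradient * t + 0.5 * second_derivative * t * t
    # Assumption: the predicted value is the current average plus the change
    return average + change
```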
>>>> >
>>>> > Proposed Implementation
>>>> >
>>>> > CEP's task
>>>> >
>>>> > I think the best approach is to do "curve fitting" [1] for the received
>>>> > event sample in a particular time window. Refer to the "Locally
>>>> > weighted linear regression" section at [2] for more details.
>>>> >
>>>> > We would need a second degree polynomial fitter for this, for which we
>>>> > can use the Apache Commons Math library. Refer to the sample at [3]; we
>>>> > can run this with any degree, e.g. 2 or 3. Just increase the degree to
>>>> > increase the accuracy.
>>>> >
>>>> > E.g.
>>>> > So if we get a degree 2 polynomial fitter, we will have an equation
>>>> > like the one below, where value (v) is our statistic value and time (t)
>>>> > is the time of the event.
>>>> >
>>>> > Equation we get from received events,
>>>> > v = a*t*t + b*t + c
>>>> >
>>>> > So the solution is,
>>>> >
>>>> > - Find memberwise curves that fit the events received in a specific
>>>> >   window (say 10 minutes) at CEP.
>>>> > - Send the parameters of the fitted curve (a, b, and c in the above
>>>> >   equation), with the timestamp of the last event (T) in the window,
>>>> >   to Autoscaler.
>>>> >
>>>> > Autoscaler's task
>>>> >
>>>> > Autoscaler uses the function v = a*t*t + b*t + c to predict the value
>>>> > at any timestamp after the last timestamp.
>>>> >
>>>> > E.g. say we need to find the value (v) after 1 minute (assuming we
>>>> > carried out all the calculations in milliseconds),
>>>> >
>>>> > v = a * (T+60000) * (T+60000) + b * (T+60000) + c
>>>> >
>>>> > So we have memberwise predictions, and we can find the clusterwise
>>>> > prediction by averaging all the memberwise values.
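
The Autoscaler side of this proposal could be sketched roughly as below
(plain Python for illustration; predict_member, predict_cluster and the
tuple layout (a, b, c, T) are hypothetical names, not existing Autoscaler
code). It uses v = a*t*t + b*t + c with t = T + horizon, then averages the
memberwise predictions:

```python
def predict_member(a, b, c, last_timestamp_ms, horizon_ms):
    """Evaluate v = a*t*t + b*t + c at t = T + horizon (all in milliseconds)."""
    t = last_timestamp_ms + horizon_ms
    return a * t * t + b * t + c


def predict_cluster(member_params, horizon_ms):
    """Average the memberwise predictions.

    member_params is a list of (a, b, c, T) tuples, one per member, where
    T is the timestamp of the last event in that member's window.
    """
    predictions = [predict_member(a, b, c, T, horizon_ms)
                   for a, b, c, T in member_params]
    return sum(predictions) / len(predictions)
```

Because the horizon is just an argument here, the prediction time can be
changed at runtime without touching the CEP side.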
>>>> >
>>>> >
>>>> > Please send your thoughts.
>>>> >
>>>> > Thanks.
>>>> >
>>>> > [1] http://en.wikipedia.org/wiki/Curve_fitting
>>>> > [2] http://cs229.stanford.edu/notes/cs229-notes1.pdf
>>>> > [3] http://commons.apache.org/proper/commons-math/userguide/fitting.html
>>>> >
>>>> >
>>>> > --
>>>> > --
>>>> > Lahiru Sandaruwan
>>>> > Committer and PMC member, Apache Stratos,
>>>> > Senior Software Engineer,
>>>> > WSO2 Inc., http://wso2.com
>>>> > lean.enterprise.middleware
>>>> >
>>>> > email: lahi...@wso2.com blog: http://lahiruwrites.blogspot.com/
>>>> > linked-in: http://lk.linkedin.com/pub/lahiru-sandaruwan/16/153/146
>>>> >
>>>>
>>>
>>>
>>
>>
>>
>>
>
>
>
>
