Hi Lahiru,

Would it be possible to use the linear regression already available as a Siddhi extension [1], or maybe improve that existing extension so that it can also fit polynomial curves? The code is available here [2].

I think forecasting is also available, which could be useful in this use case. WDYT? Just sharing my 2 cents. :-)

[1] http://mail.wso2.org/mailarchive/architecture/2014-March/015696.html
[2] https://github.com/wso2-dev/siddhi/tree/master/modules/siddhi-extensions

Thanks,
Lasantha

On Tue, Nov 11, 2014 at 3:58 PM, Lahiru Sandaruwan <lahi...@wso2.com> wrote:
> Hi all,
>
> This contains the content I already sent to Stratos dev. The idea is to
> highlight and separate the new improvement.
>
> Current implementation
>
> Currently CEP calculates the average, gradient, and second derivative and
> sends those values to the Autoscaler. The Autoscaler then predicts the
> values using S = u*t + 0.5*a*t*t.
>
> In this method the CEP calculation is not very accurate, as it does not
> consider all the events when calculating the gradient and second derivative.
> Therefore the equation we apply doesn't yield the best prediction.
>
> Proposed Implementation
>
> CEP's task
>
> I think the best approach is to do "curve fitting" [1] on the event sample
> received in a particular time window. Refer to the "Locally weighted linear
> regression" section at [2] for more details.
>
> We would need a second degree polynomial fitter for this, and we can use
> the Apache Commons Math library for it. Refer to the sample at [3]; we can
> run this with any degree, e.g. 2, 3. Just increase the degree to increase
> the accuracy.
>
> E.g.
> So if we get a degree 2 polynomial fitter, we will have an equation like
> the one below, where value (v) is our statistic value and time (t) is the
> time of the event.
>
> Equation we get from the received events,
> v = a*t*t + b*t + c
>
> So the solution is,
>
> - Find memberwise curves that fit the events received in a specific window
>   (say 10 minutes) at CEP.
> - Send the parameters of the fitted curve (a, b, and c in the above
>   equation), with the timestamp of the last event (T) in the window, to
>   the Autoscaler.
>
> Autoscaler's task
>
> The Autoscaler uses the function v = a*t*t + b*t + c to predict the value
> at any timestamp after the last timestamp.
>
> E.g. say we need to find the value (v) after 1 minute (assuming we carried
> out all the calculations in milliseconds),
>
> v = a * (T+60000) * (T+60000) + b * (T+60000) + c
>
> So we have memberwise predictions, and we can find the clusterwise
> prediction by averaging all the memberwise values.
>
>
> Please send your thoughts.
>
> Thanks.
>
> [1] http://en.wikipedia.org/wiki/Curve_fitting
> [2] http://cs229.stanford.edu/notes/cs229-notes1.pdf
> [3] http://commons.apache.org/proper/commons-math/userguide/fitting.html
>
>
> --
> Lahiru Sandaruwan
> Committer and PMC member, Apache Stratos,
> Senior Software Engineer,
> WSO2 Inc., http://wso2.com
> lean.enterprise.middleware
>
> email: lahi...@wso2.com blog: http://lahiruwrites.blogspot.com/
> linked-in: http://lk.linkedin.com/pub/lahiru-sandaruwan/16/153/146
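
For reference, the fit-and-predict flow described in the quoted proposal could look roughly like the sketch below. It is not the Stratos, Siddhi, or Commons Math code: the function names (fit_quadratic, predict, cluster_prediction) and the sample events are made up for illustration, and the fit is a hand-rolled least-squares solve of the normal equations rather than a library fitter. Exact fractions are used only as an assumption to sidestep rounding when squaring large millisecond timestamps.

```python
from fractions import Fraction


def fit_quadratic(times, values):
    """Least-squares fit of v = a*t*t + b*t + c. Returns (a, b, c)."""
    ts = [Fraction(t) for t in times]
    vs = [Fraction(v) for v in values]
    # Power sums s[k] = sum(t**k) and moment sums r[k] = sum(v * t**k).
    s = [sum(t ** k for t in ts) for k in range(5)]
    r = [sum(v * t ** k for t, v in zip(ts, vs)) for k in range(3)]
    # Normal equations as an augmented matrix, unknowns ordered (a, b, c).
    m = [
        [s[4], s[3], s[2], r[2]],
        [s[3], s[2], s[1], r[1]],
        [s[2], s[1], s[0], r[0]],
    ]
    # Gauss-Jordan elimination with exact arithmetic.
    for col in range(3):
        pivot = next((row for row in range(col, 3) if m[row][col] != 0), None)
        if pivot is None:
            raise ValueError("need at least three distinct timestamps")
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [x / m[col][col] for x in m[col]]
        for row in range(3):
            if row != col:
                factor = m[row][col]
                m[row] = [x - factor * y for x, y in zip(m[row], m[col])]
    return m[0][3], m[1][3], m[2][3]


def predict(a, b, c, last_timestamp, horizon_ms):
    """Autoscaler side: evaluate the curve at T + horizon, as in the mail."""
    t = last_timestamp + horizon_ms
    return a * t * t + b * t + c


def cluster_prediction(member_predictions):
    """Clusterwise value as the plain average of the memberwise values."""
    return sum(member_predictions) / len(member_predictions)


# Hypothetical window for one member: (timestamp in ms, statistic value).
events = [(0, 40), (60000, 46), (120000, 55), (180000, 67)]
a, b, c = fit_quadratic([t for t, _ in events], [v for _, v in events])
last_t = events[-1][0]
# Predicted value one minute after the last event in the window.
print(float(predict(a, b, c, last_t, 60000)))
```

In this split, CEP would run something like fit_quadratic per member and publish a, b, c and T, and the Autoscaler would only need predict and the averaging step.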