Charini,

What happens when you have 10 events within the last 5 mins and your batch
size is 5? Do you get the latest 5 out of the 10?
Both time and length define a ceiling here. So if we have two different
parameters defining ceilings, there needs to be an order of precedence.
IMO it would be better to have two separate implementations, one for
length and one for time.
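
To make the question concrete, here is a minimal sketch of one possible
reading, where both ceilings apply and whichever is tighter wins. It is
plain Python for illustration only, not Siddhi's implementation; the class
name TimeLengthWindow and its add() method are hypothetical.

```python
from collections import deque
from datetime import datetime, timedelta


class TimeLengthWindow:
    """Hypothetical window that enforces a time ceiling AND a length ceiling."""

    def __init__(self, duration: timedelta, batch_size: int):
        self.duration = duration
        self.batch_size = batch_size
        self.events = deque()  # (timestamp, payload) pairs, oldest first

    def add(self, timestamp: datetime, payload):
        """Add an event and return the payloads still inside the window."""
        self.events.append((timestamp, payload))
        # Time ceiling: drop events older than `duration` before this event.
        cutoff = timestamp - self.duration
        while self.events and self.events[0][0] < cutoff:
            self.events.popleft()
        # Length ceiling: keep only the latest `batch_size` events.
        while len(self.events) > self.batch_size:
            self.events.popleft()
        return [p for _, p in self.events]


def replay(duration: timedelta, batch_size: int):
    """Feed 10 events, 30 seconds apart, ending at 10:00:00."""
    window = TimeLengthWindow(duration, batch_size)
    start = datetime(2016, 6, 2, 9, 55, 30)
    kept = []
    for i in range(10):
        kept = window.add(start + timedelta(seconds=30 * i), i)
    return kept


if __name__ == "__main__":
    # 10 events within 5 mins, batch size 5: the length ceiling is tighter.
    print(replay(timedelta(minutes=5), 5))
    # Same events, 1 min duration: the time ceiling is tighter.
    print(replay(timedelta(minutes=1), 5))
```

Under this reading the answer to my question is "yes, the latest 5", but
other precedence rules are equally plausible, which is why I would rather
keep the two windows separate.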

WDYT?

Thanks
/Tishan

On Thu, Jun 2, 2016 at 10:14 AM, Charini Nanayakkara <chari...@wso2.com>
wrote:

> Hi All,
>
> I plan to extend the existing regression function by adding a time
> parameter. Regression is a function provided by the Siddhi stream
> processor extension known as timeseries. In the current implementation, the
> regression function takes two or more parameters and performs regression
> as follows.
>
> The mandatory parameters are the dependent attribute Y and the
> independent attribute(s) X1, X2,....Xn. For simple linear regression,
> only one independent attribute is given. Two or more independent
> attributes are given for multiple linear regression.
>
> timeseries:regress(Y, X1, X2......,Xn)
>
> The three optional parameters are calculation interval, batch size and
> confidence interval (ci). Where these are not specified, default values
> are assumed.
>
> timeseries:regress(calcInterval, batchSize, ci, Y, X1, X2......,Xn)
>
> Batch size works as a length window in this implementation, which allows
> one to restrict the number of events considered when executing regression
> in real time. For example, if length is 5, only the latest 5 events
> (current event and the 4 events prior to it) would be used for performing
> regression.
>
> The suggested extension would allow the user to restrict the number of
> events based on a time window as well, rather than constraining on
> length only. The regression function would therefore take duration as an
> additional parameter once this work is complete.
>
> timeseries:regress(calcInterval, duration, batchSize, ci, Y, X1,
> X2......,Xn)
>
> Here the parameter 'duration' would consist of two parts, where the first
> part specifies the number and the second part specifies the unit (e.g. 2
> sec, 5 mins, 7 days). On arrival of each event, the past events
> considered for regression would be selected based on this 'duration'
> (i.e. if a new event arrives at 10.00 a.m. and the duration is 5 mins, only
> the events which arrived between 9.55 a.m. and 10.00 a.m. are considered
> for regression).
>
> Suggestions and comments are most welcome.
>
> Thank you.
>
> --
> Charini Vimansha Nanayakkara
> Software Engineer at WSO2
> Mobile: 0714126293
>
>
> _______________________________________________
> Architecture mailing list
> Architecture@wso2.org
> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>
>


-- 
Tishan Dahanayakage
Software Engineer
WSO2, Inc.
Mobile:+94 716481328

Disclaimer: This communication may contain privileged or other confidential
information and is intended exclusively for the addressee/s. If you are not
the intended recipient/s, or believe that you may have received this
communication in error, please reply to the sender indicating that fact and
delete the copy you received and in addition, you should not print, copy,
re-transmit, disseminate, or otherwise use the information contained in
this communication. Internet communications cannot be guaranteed to be
timely, secure, error or virus-free. The sender does not accept liability
for any errors or omissions.