We are planning to integrate a frequency algorithm with Siddhi CEP as part
of a training project [1].

Basically, this algorithm calculates the number of occurrences (frequency)
of each value of a specified attribute in a given input stream in CEP.

We have chosen to implement this functionality as a Siddhi Transformer,
using stream-lib [2] as a third-party library. stream-lib is licensed under
the Apache License 2.0.

A standard Siddhi query using this algorithm would look like this:

    from inputStream#transform.custom:getFrequencies(desiredAttribute)
    select desiredAttribute, frequency
    insert into frequencyStream;

Where,

inputStream : Input Stream to CEP

custom : namespace

getFrequencies : function name

desiredAttribute : Attribute name from the input stream for which
frequencies need to be calculated

frequencyStream : Output Stream from CEP that contains frequency-related
information

The stream-lib library directly supports only Top-K and cardinality
algorithms. The Top-K algorithm takes a value ‘K’ as an argument from the
user and returns the K distinct elements with the highest frequencies,
together with those frequencies. The library provides no function for
getting the frequencies of all elements. So we are planning to pass the
maximum integer value (Integer.MAX_VALUE) as the argument to the Top-K
algorithm. This way we will get frequencies for all distinct values of the
event attribute, provided that the number of distinct values does not
exceed Integer.MAX_VALUE.
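
To illustrate why a very large K yields frequencies for every element, here
is a minimal, self-contained sketch of a Space-Saving style counter. It is
not stream-lib code and not our transformer; the class SpaceSavingSketch
and its methods are hypothetical names used only for illustration, and it
assumes the Top-K implementation behaves like a capacity-bounded counter
map. When the capacity is at least the number of distinct values, nothing
is ever evicted and the counts are exact; with a smaller capacity the
counts become estimates.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; not part of stream-lib or Siddhi.
class SpaceSavingSketch {
    private final int capacity;                                // max number of tracked values
    private final Map<String, Long> counts = new HashMap<>();  // grows only as values arrive

    SpaceSavingSketch(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be >= 1");
        }
        this.capacity = capacity;
    }

    // Record one occurrence of the given attribute value.
    void offer(String item) {
        Long current = counts.get(item);
        if (current != null) {
            counts.put(item, current + 1);       // already tracked: exact increment
        } else if (counts.size() < capacity) {
            counts.put(item, 1L);                // room left: start tracking
        } else {
            // Full: evict the value with the smallest count; the newcomer
            // takes over that count plus one (an over-estimate).
            String minKey = null;
            long min = Long.MAX_VALUE;
            for (Map.Entry<String, Long> e : counts.entrySet()) {
                if (e.getValue() < min) {
                    min = e.getValue();
                    minKey = e.getKey();
                }
            }
            counts.remove(minKey);
            counts.put(item, min + 1);
        }
    }

    // Snapshot of the currently tracked values and their counts.
    Map<String, Long> frequencies() {
        return new HashMap<>(counts);
    }

    public static void main(String[] args) {
        // Capacity far above the number of distinct values: counts are exact.
        SpaceSavingSketch exact = new SpaceSavingSketch(Integer.MAX_VALUE);
        for (String s : new String[] {"a", "b", "a", "c", "a", "b"}) {
            exact.offer(s);
        }
        System.out.println(exact.frequencies());
    }
}
```

Note that the map in this sketch is not pre-sized from the capacity, so a
huge capacity costs nothing up front; memory use follows the number of
distinct values actually seen.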

Passing Integer.MAX_VALUE to the Top-K algorithm will not in itself cause
memory issues, because the algorithm increases its bucket size dynamically
as new distinct events arrive.

We have already implemented the above design, and basic testing looks
fine.

Kindly comment on the implementation.

[1] - https://redmine.wso2.com/issues/2884

[2] - https://github.com/addthis/stream-lib

-- 
Best Regards,
V.Rajeevan
Software Engineer,
WSO2 Inc. :http://wso2.com

Mobile : +94 773090875
Email : rajeev...@wso2.com
_______________________________________________
Architecture mailing list
Architecture@wso2.org
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
