IMO, use of window batch is consistant (My argument is, it is interpreted as batch process when used with event tables). But lets think more on that.
--Srinath On Sat, Nov 22, 2014 at 7:56 AM, Sriskandarajah Suhothayan <s...@wso2.com> wrote: > +1 > I like the idea, only worried about the use of *#window.batch()* > How about using *WeatherStream#pull#window.time(24h) *instead? > > Suho > > On Fri, Nov 21, 2014 at 8:33 PM, Srinath Perera <srin...@wso2.com> wrote: > >> Hi All, >> >> >> I think I have finally figured out a way do the Paul's idea (See the >> thread "Unifying CEP and BAM Languages" >> http://mail.wso2.org/mailarchive/architecture/2014-May/016366.html for >> more details). >> >> >> Currently event tables in Siddhi are backed by a disk and can be used to >> joins >> >> However, you must use it with a stream and it is triggered from a stream. >> For example currently *from WeatherStream insert into ..* is not >> defined. *Idea is to use that for batch processing!* >> >> Proposal is to extend event tables definition >> >> 1. Add an timestamp field to event tables (Then BAM stream also become a >> event table) >> >> 2. *Define “from” operation on event table to cover batch processing*. >> For example >> >> from EvetTable#window.batch(2h) avg(pressure) as avgPressure will run as >> batch job with 2h data using data in the event table. (We will execute this >> in Spark using Siddhi engine). If #window.batch(2h) is omitted, >> #window.batch(5m) is assumed. >> >> 3. You can define an event table on top of DB, disk, Cassandra, hdfs, BAM >> stream stored etc .. >> >> Let me take an example >> >> Say you want to calculate avg and stddev of pressure once every 24h as a >> batch job, and raise an alarm if current pressure is less than 3 stddev >> from the mean calculated in batch process. (this is a extended Lambda >> architecture scenario) Then you can do all this with Siddhi, and query will >> look like following. >> >> *define eventTable WeatherTable using BAMStream … * >> >> *define eventTable WeatherStatsTable using BAMStream … * >> >> *//batch job* >> >> *from **WeatherTable**#window.batch(24h) stddev(pressure) , avg >> (pressure)* >> >> * insert into **WeatherStatsTable**; * >> >> *//use results from batch job in realtime * >> >> *from WeatherStream as p join **WeatherStatsTable**#window.last() as s * >> >> * on pressure < s.mean -2*s.stddev* >> >> * insert into WeatherAlarmStream; * >> >> First query runs once every 24 hours and calculate mean and stddev and >> write to disk. That value is joined against the live stream as they come in >> via join. >> >> Few more rules (and there are more details that we need to figure out) >> >> 1. You read from event table, then it runs as batch processes >> >> 2. If you join with event table, it works as now. However, you can define >> a window when joining on top of event tables as well. e.g. >> WeatherStats#window.batch(5m) means takes events came in last 5 mins. >> >> 3. We need to define how it behaves when timestamp field is not defined. >> Best is to only support *joins* and not support *from* in that case. >> >> 4. When processing batches, it runs in parallel if partitions are >> defined. For example, if you want to calculate mean in map reduce style, it >> will look like following. >> >> *define partition StationParition WeatherStream.stationID; * >> >> *from WeatherStream#window.batch(24h) avg(pressure) * >> >> * insert into WeatherMeanStream using StationParition; * >> >> If no partitions, it will run sequentially (We can improve on this >> later). >> >> For execution, we want to init siddhi within Spark, and run thing in >> parallel using spark, but actual evaluation will done by Siddhi. >> >> 5. Users can define other windows like sliding windows with event tables. >> However, Siddhi will read data from disk once every 5 minutes or so. So >> results will be the same, however, it might come bit later than with >> streams. >> >> >> Please comment >> >> Thanks >> >> Srinath >> >> >> >> >> -- >> ============================ >> Blog: http://srinathsview.blogspot.com twitter:@srinath_perera >> Site: http://people.apache.org/~hemapani/ >> Photos: http://www.flickr.com/photos/hemapani/ >> Phone: 0772360902 >> > > > > -- > > *S. Suhothayan* > Technical Lead & Team Lead of WSO2 Complex Event Processor > *WSO2 Inc. *http://wso2.com > * <http://wso2.com/>* > lean . enterprise . middleware > > > *cell: (+94) 779 756 757 <%28%2B94%29%20779%20756%20757> | blog: > http://suhothayan.blogspot.com/ <http://suhothayan.blogspot.com/>twitter: > http://twitter.com/suhothayan <http://twitter.com/suhothayan> | linked-in: > http://lk.linkedin.com/in/suhothayan <http://lk.linkedin.com/in/suhothayan>* > -- ============================ Blog: http://srinathsview.blogspot.com twitter:@srinath_perera Site: http://people.apache.org/~hemapani/ Photos: http://www.flickr.com/photos/hemapani/ Phone: 0772360902
_______________________________________________ Architecture mailing list Architecture@wso2.org https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture