Dulitha, there are several problems:

   1. For some use cases, we can aggregate at the client and re-aggregate at
   the server. However, for other use cases that is not possible; it depends
   on what we calculate.
   2. This also adds overhead at the client side (e.g. we need to run Siddhi
   at the client), which might be a problem.
   3. We need to deploy and manage queries at the clients, which will add
   a lot of complexity.
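
To illustrate point 1, here is a minimal sketch. The client names and latency readings are hypothetical and are not taken from DAS or Siddhi. It shows that (sum, count) pairs computed at each client can be merged at the server into the exact global mean, whereas per-client medians generally cannot be combined into the exact global median.

```python
from statistics import median

# Hypothetical latency readings (ms) observed at two clients.
client_a = [10, 12, 11, 200]
client_b = [9]


def partial(readings):
    """Client-side aggregate: a (sum, count) pair is enough to rebuild the mean."""
    return sum(readings), len(readings)


def merge_mean(partials):
    """Server-side re-aggregation of (sum, count) pairs into a global mean."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


all_readings = client_a + client_b

# Mean: re-aggregating the client partials matches the mean over raw events.
global_mean = merge_mean([partial(client_a), partial(client_b)])
exact_mean = sum(all_readings) / len(all_readings)

# Median: combining per-client medians does not give the true median here.
median_of_medians = median([median(client_a), median(client_b)])
exact_median = median(all_readings)

print("mean   merged vs exact:", global_mean, exact_mean)
print("median merged vs exact:", median_of_medians, exact_median)
```

Calculations of the second kind would still need the raw events (or a purpose-built mergeable summary) at the server.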

--Srinath

On Thu, Apr 21, 2016 at 9:27 PM, Dulitha Wijewantha <duli...@wso2.com>
wrote:

> Hi Srinath,
>
> If we follow the model where some events are aggregated and some are
> not, won't that be a problem on the processing side in DAS? IMO, handling
> this event aggregation on the client side (off the main thread) would
> significantly reduce the overhead on the DAS side. My understanding was
> that we are using an event sink (maybe it was called data-bridge) to send
> events via Thrift to DAS from APIM. In that case, before writing to
> Thrift, won't we be able to aggregate and combine events?
>
> Cheers~
>
> On Tue, Mar 29, 2016 at 11:14 PM, Srinath Perera <srin...@wso2.com> wrote:
>
>> Hi Sanjiva,
>>
>> That might work, but we need to try it out with a workload (CEP joins
>> are a bit slower than other operations, so we have to see).
>>
>> When writing, DAS treats data as name-value pairs. It only tries to
>> understand the data when processing it. So the storage model should be OK.
>>
>> My belief is that network load is not the bottleneck (again, we have to
>> verify).
>>
>> --Srinath
>>
>>
>> On Wed, Mar 30, 2016 at 8:19 AM, Sanjiva Weerawarana <sanj...@wso2.com>
>> wrote:
>>
>>> Srinath, what if we come up with a way on the event receiver side to
>>> aggregate a set of events into one based on some correlation field? We can
>>> do this in an embedded Siddhi in the receiver: basically keep something
>>> like a 5 sec window to aggregate all events that carry the same correlation
>>> field into one, combine them, and then send the result forward for storage
>>> + processing. Sometimes we will miss, but most of the time we won't. The
>>> storage model needs to be sufficiently flexible, but HBase should be fine
>>> (?). The real-time feed must not have this feature, of course.
>>>
>>> With multiple servers firing events related to one interaction, it's not
>>> possible to do this from the source ends without distributed caching, and
>>> that's not a good model.
>>>
>>> It does not address the network load issue, of course.
>>>
>>> Sanjiva.
>>>
>>> On Tue, Mar 29, 2016 at 2:49 PM, Srinath Perera <srin...@wso2.com>
>>> wrote:
>>>
>>>> Nuwan, regarding Q1, we can set it up such that the publisher
>>>> automatically publishes the events after a timeout or after N events have
>>>> accumulated.
>>>>
>>>> Nuwan, Chathura (regarding Q2),
>>>>
>>>> We already do event batching. The numbers above are after event batching.
>>>> There are two bottlenecks. One is sending events over the network and the
>>>> other is writing them to the DB. Batching helps a lot in moving events over
>>>> the network, but does not help much when writing to the DB.
>>>>
>>>> Regarding nulls, one option is to group the events generated by a single
>>>> message together, which will avoid most nulls. I think our main concern is
>>>> a single message triggering multiple events. We also need to write queries
>>>> to copy the values from the single big event to different streams, and use
>>>> those streams to write queries.
>>>>
>>>> E.g. we can copy values from the big stream to HTTPStream, against which
>>>> we will write HTTP analytics queries.
>>>>
>>>> --Srinath
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Mar 29, 2016 at 1:29 PM, Chathura Ekanayake <chath...@wso2.com>
>>>> wrote:
>>>>
>>>>> As we can reduce the number of event transfers with event batching, I
>>>>> think the advantage of using a single event stream is to reduce the number
>>>>> of disk writes on the DAS side. But as Nuwan mentioned, dealing with null
>>>>> fields can be a problem when writing analytics scripts.
>>>>>
>>>>> Regards,
>>>>> Chathura
>>>>>
>>>>> On Tue, Mar 29, 2016 at 10:40 AM, Nuwan Dias <nuw...@wso2.com> wrote:
>>>>>
>>>>>> Having to publish a single event after collecting all possible data
>>>>>> records from the server would be good in terms of the scalability of
>>>>>> the DAS/Analytics platform. However, I see that it introduces new
>>>>>> challenges for which we would need solutions.
>>>>>>
>>>>>> 1. How do we guarantee that an event is always published to DAS? In the
>>>>>> case of API Manager, a request has multiple exit points, such as auth
>>>>>> failures, throttling out, back-end failures, message processing
>>>>>> failures, etc. So we need a way to guarantee that an event is always
>>>>>> sent out, whatever the state.
>>>>>>
>>>>>> 2. With this model, I'm assuming we only have 1 stream definition. Is
>>>>>> this correct? If so, would this not make the analytics part complicated?
>>>>>> For example, say I have a Spark query to summarize the throttled-out
>>>>>> events from an App. Since I can only see a single stream, the query
>>>>>> would have to deal with null fields and with the whole bulk of data,
>>>>>> even if in reality it might only have to deal with a few events. The
>>>>>> same complexity would arise for the CEP-based throttling engine and the
>>>>>> new alerts we're building as well.
>>>>>>
>>>>>> Thanks,
>>>>>> NuwanD.
>>>>>>
>>>>>> On Sat, Mar 26, 2016 at 1:22 AM, Inosh Goonewardena <in...@wso2.com>
>>>>>> wrote:
>>>>>>
>>>>>>> +1. With the combined event approach we can avoid sending duplicate
>>>>>>> information to some extent as well. For example, in the API analytics
>>>>>>> scenario both the request and response streams have consumerKey,
>>>>>>> context, api_version, api, resourcePath, etc. properties, whose values
>>>>>>> will be the same for a request event and its corresponding response
>>>>>>> event. With the single event approach we can avoid such duplication.
>>>>>>>
>>>>>>> On Fri, Mar 25, 2016 at 1:23 AM, Gihan Anuruddha <gi...@wso2.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Janaka,
>>>>>>>>
>>>>>>>> We do have event batching at the moment as well. You can configure
>>>>>>>> that in data-agent-config.xml [1]. AFAIU, what we are trying to do
>>>>>>>> here is to combine several events into a single event. Apart from
>>>>>>>> that, wouldn't it be a good idea to compress the event after we merge
>>>>>>>> and before we send it to DAS?
>>>>>>>>
>>>>>>>> [1] -
>>>>>>>> https://github.com/wso2/carbon-analytics-common/blob/master/features/data-bridge/org.wso2.carbon.databridge.agent.server.feature/src/main/resources/conf/data-agent-config.xml
>>>>>>>>
>>>>>>>> On Fri, Mar 25, 2016 at 11:39 AM, Janaka Ranabahu <jan...@wso2.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Srinath,
>>>>>>>>>
>>>>>>>>> On Fri, Mar 25, 2016 at 11:26 AM, Srinath Perera <srin...@wso2.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> As per the meeting (Participants: Sanjiva, Shankar, Sumedha, Anjana,
>>>>>>>>>> Miyuru, Seshika, Suho, Nirmal, Nuwan)
>>>>>>>>>>
>>>>>>>>>> Currently we generate several events per message from our
>>>>>>>>>> products. For example, when a message hits APIM, the following events
>>>>>>>>>> will be generated.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>    1. One from HTTP level
>>>>>>>>>>    2. 1-2 from authentication and authorization logic
>>>>>>>>>>    3. 1 from Throttling
>>>>>>>>>>    4. 1 for ESB level stats
>>>>>>>>>>    5. 2 for request and response
>>>>>>>>>>
>>>>>>>>>> If APIM is handling 10K TPS, that means DAS is receiving events
>>>>>>>>>> at about 80K TPS. Although the data bridge that transfers events is
>>>>>>>>>> fast, writing to disk (via RDBMS or HBase) is a problem. We can scale
>>>>>>>>>> HBase. However, that leads to a scenario where an APIM deployment
>>>>>>>>>> will need a very large deployment of DAS.
>>>>>>>>>>
>>>>>>>>>> We decided to figure out a way to collect all the events and send
>>>>>>>>>> a single event to DAS. Basically, the idea is to extend the data
>>>>>>>>>> publisher library such that the user can keep adding readings to the
>>>>>>>>>> library, and it will collect the readings and send them over as a
>>>>>>>>>> single event to the server.
>>>>>>>>>>
>>>>>>>>>> However, some flows might terminate in the middle due to
>>>>>>>>>> failures. There are two solutions.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>    1. Get the product to call a flush from a finally block
>>>>>>>>>>    2. Get the library to auto-flush the collected readings every few
>>>>>>>>>>    seconds
>>>>>>>>>>
>>>>>>>>>> I feel #2 is simpler.
>>>>>>>>>>
>>>>>>>>>> Do we have any concerns about going to this model?
>>>>>>>>>>
>>>>>>>>>> Suho, Anjana, we need to think about how to do this with our stream
>>>>>>>>>> definitions, as we force you to define the streams beforehand.
>>>>>>>>>>
>>>>>>>>> Can't we write something similar to JDBC batch processing, where
>>>>>>>>> the code would only do a publisher.addBatch() or something similar?
>>>>>>>>> The data publisher can be configured to flush the batched requests to
>>>>>>>>> DAS when they hit a certain threshold.
>>>>>>>>>
>>>>>>>>> Ex: We define the batch size as 10 (using code or config XML).
>>>>>>>>> Then if we have 5 streams, the publisher would send 5 requests to DAS
>>>>>>>>> (one for each stream) instead of 50.
>>>>>>>>>
>>>>>>>>> IMO, this would allow us to keep the existing stream definitions
>>>>>>>>> and reduce the number of calls from a server to DAS.
>>>>>>>>>
>>>>>>>>> WDYT?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Janaka
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --Srinath
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> ============================
>>>>>>>>>> Blog: http://srinathsview.blogspot.com twitter:@srinath_perera
>>>>>>>>>> Site: http://home.apache.org/~hemapani/
>>>>>>>>>> Photos: http://www.flickr.com/photos/hemapani/
>>>>>>>>>> Phone: 0772360902
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Architecture mailing list
>>>>>>>>>> Architecture@wso2.org
>>>>>>>>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> *Janaka Ranabahu*
>>>>>>>>> Associate Technical Lead, WSO2 Inc.
>>>>>>>>> http://wso2.com
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> E-mail: jan...@wso2.com  M: +94 718370861
>>>>>>>>>
>>>>>>>>> Lean . Enterprise . Middleware
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> W.G. Gihan Anuruddha
>>>>>>>> Senior Software Engineer | WSO2, Inc.
>>>>>>>> M: +94772272595
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Thanks & Regards,
>>>>>>>
>>>>>>> Inosh Goonewardena
>>>>>>> Associate Technical Lead- WSO2 Inc.
>>>>>>> Mobile: +94779966317
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Nuwan Dias
>>>>>>
>>>>>> Technical Lead - WSO2, Inc. http://wso2.com
>>>>>> email : nuw...@wso2.com
>>>>>> Phone : +94 777 775 729
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Sanjiva Weerawarana, Ph.D.
>>> Founder, CEO & Chief Architect; WSO2, Inc.;  http://wso2.com/
>>> email: sanj...@wso2.com; office: (+1 650 745 4499 | +94  11 214 5345)
>>> x5700; cell: +94 77 787 6880 | +1 408 466 5099; voip: +1 650 265 8311
>>> blog: http://sanjiva.weerawarana.org/; twitter: @sanjiva
>>> Lean . Enterprise . Middleware
>>>
>>
>>
>>
>>
>>
>
>
> --
> Dulitha Wijewantha (Chan)
> Software Engineer - Mobile Development
> WSO2 Inc
> Lean.Enterprise.Middleware
> Email:   duli...@wso2.com <duli...@wso2mobile.com>
> Mobile:  +94712112165
> Website: dulitha.me <http://dulitha.me>
> Twitter: @dulitharw <https://twitter.com/dulitharw>
> Github:  @dulichan <https://github.com/dulichan>
> SO:      @chan <http://stackoverflow.com/users/813471/chan>
>


