Srinath, what if we come up with a way on the event receiver side to
aggregate a set of events into one based on some correlation field? We can
do this with an embedded Siddhi in the receiver ... basically keep
something like a 5 sec window that aggregates all events carrying the same
correlation field into one, combines them, and then sends the result
forward for storage + processing. Sometimes we will miss events, but most
of the time we won't. The storage model needs to be sufficiently flexible,
but HBase should be fine (?). The real time feed must not have this
feature of course.

With multiple servers firing events related to one interaction, it's not
possible to do this from the source ends without distributed caching, and
that's not a good model.

It does not address the network load issue, of course.

Sanjiva.

On Tue, Mar 29, 2016 at 2:49 PM, Srinath Perera <srin...@wso2.com> wrote:

> Nuwan, regarding Q1, we can set it up such that the publisher
> auto-publishes the events after a timeout or after N events are
> accumulated.
>
> Nuwan, Chathura ( regarding Q2),
>
> We already do event batching; the above numbers are after batching. There
> are two bottlenecks: one is sending events over the network, and the other
> is writing them to the DB. Batching helps a lot in moving events over the
> network, but it does not help much when writing to the DB.
>
> Regarding nulls, one option is to group the events generated by a single
> message together, which will avoid most nulls. I think our main concern is
> a single message triggering multiple events. We also need to write queries
> that copy the values from the single big event into different streams, and
> then write our analytics queries against those streams.
>
> e.g. we can copy values from the big stream into HTTPStream, against which
> we will write the HTTP analytics queries.
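>
> For example, something like this (a sketch; the stream and attribute
> names are hypothetical):
>
>     // Siddhi-style query, embedded in an execution plan as a string,
>     // that projects the HTTP-related fields of the big event into its
>     // own stream:
>     String httpCopyQuery =
>         "from BigEventStream "
>         + "select messageId, httpMethod, path, statusCode, responseTime "
>         + "insert into HTTPStream;";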
>
> --Srinath
>
>
>
>
> On Tue, Mar 29, 2016 at 1:29 PM, Chathura Ekanayake <chath...@wso2.com>
> wrote:
>
>> As we can reduce the number of event transfers with event batching, I
>> think the advantage of using a single event stream is to reduce the number
>> of disk writes on the DAS side. But as Nuwan mentioned, dealing with null
>> fields can be a problem when writing analytics scripts.
>>
>> Regards,
>> Chathura
>>
>> On Tue, Mar 29, 2016 at 10:40 AM, Nuwan Dias <nuw...@wso2.com> wrote:
>>
>>> Having to publish a single event after collecting all possible data
>>> records from the server would be good in terms of the scalability of the
>>> DAS/Analytics platform. However, I see that it introduces new challenges
>>> for which we would need solutions.
>>>
>>> 1. How do we guarantee an event is always published to DAS? In the case
>>> of API Manager, a request has multiple exit points, such as auth failures,
>>> throttling out, back-end failures, message processing failures, etc. So we
>>> need a way to guarantee that an event is always sent out regardless of the
>>> state.
>>>
>>> 2. With this model, I'm assuming we only have one stream definition. Is
>>> this correct? If so, would this not complicate the analytics part? For
>>> example, say I have a Spark query to summarize the throttled-out events
>>> from an app. Since I can only see a single stream, the query would have to
>>> deal with null fields and process the whole bulk of data even when in
>>> reality it only needs a few records. The same complexity would arise for
>>> the CEP-based throttling engine and the new alerts we're building as well.
>>>
>>> Thanks,
>>> NuwanD.
>>>
>>> On Sat, Mar 26, 2016 at 1:22 AM, Inosh Goonewardena <in...@wso2.com>
>>> wrote:
>>>
>>>> +1. With the combined-event approach we can also avoid sending duplicate
>>>> information to some extent. For example, in the API analytics scenario,
>>>> both the request and response streams carry consumerKey, context,
>>>> api_version, api, resourcePath, etc., whose values are the same for a
>>>> request event and its corresponding response event. With a single-event
>>>> approach we can avoid such duplication.
>>>>
>>>> On Fri, Mar 25, 2016 at 1:23 AM, Gihan Anuruddha <gi...@wso2.com>
>>>> wrote:
>>>>
>>>>> Hi Janaka,
>>>>>
>>>>> We do have event batching at the moment as well; you can configure that
>>>>> in data-agent-config.xml [1]. AFAIU, what we are trying to do here is to
>>>>> combine several events into a single event. Apart from that, wouldn't it
>>>>> be a good idea to compress the event after we merge it and before we
>>>>> send it to DAS?
>>>>>
>>>>> [1] -
>>>>> https://github.com/wso2/carbon-analytics-common/blob/master/features/data-bridge/org.wso2.carbon.databridge.agent.server.feature/src/main/resources/conf/data-agent-config.xml
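>>>>>
>>>>> Something like this on the agent side, perhaps (a sketch; it assumes
>>>>> the merged event has already been serialized to bytes):
>>>>>
>>>>>     import java.io.ByteArrayOutputStream;
>>>>>     import java.io.IOException;
>>>>>     import java.util.zip.GZIPOutputStream;
>>>>>
>>>>>     public final class EventCompressor {
>>>>>         // GZIP the serialized event before handing it to the publisher.
>>>>>         static byte[] compress(byte[] serializedEvent) throws IOException {
>>>>>             ByteArrayOutputStream bos = new ByteArrayOutputStream();
>>>>>             try (GZIPOutputStream gzip = new GZIPOutputStream(bos)) {
>>>>>                 gzip.write(serializedEvent);
>>>>>             }
>>>>>             return bos.toByteArray();
>>>>>         }
>>>>>     }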
>>>>>
>>>>> On Fri, Mar 25, 2016 at 11:39 AM, Janaka Ranabahu <jan...@wso2.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Srinath,
>>>>>>
>>>>>> On Fri, Mar 25, 2016 at 11:26 AM, Srinath Perera <srin...@wso2.com>
>>>>>> wrote:
>>>>>>
>>>>>>> As per the meeting (participants: Sanjiva, Shankar, Sumedha, Anjana,
>>>>>>> Miyuru, Seshika, Suho, Nirmal, Nuwan):
>>>>>>>
>>>>>>> Currently we generate several events per message from our products.
>>>>>>> For example, when a message hits APIM, the following events will be
>>>>>>> generated:
>>>>>>>
>>>>>>>
>>>>>>>    1. One from the HTTP level
>>>>>>>    2. One or two from the authentication and authorization logic
>>>>>>>    3. One from throttling
>>>>>>>    4. One for ESB-level stats
>>>>>>>    5. Two for request and response
>>>>>>>
>>>>>>> If APIM is handling 10K TPS, that means DAS is receiving events at
>>>>>>> about 80K TPS. Although the data bridge that transfers events is fast,
>>>>>>> writing to disk (via an RDBMS or HBase) is a problem. We can scale
>>>>>>> HBase; however, that leads to a scenario where an APIM deployment will
>>>>>>> need a very large DAS deployment.
>>>>>>>
>>>>>>> We decided to figure out a way to collect all the events and send a
>>>>>>> single event to DAS. Basically, the idea is to extend the data
>>>>>>> publisher library such that the user can keep adding readings to it,
>>>>>>> and it will collect the readings and send them over as a single event
>>>>>>> to the server.
>>>>>>>
>>>>>>> However, some flows might terminate in the middle due to failures.
>>>>>>> There are two solutions:
>>>>>>>
>>>>>>>
>>>>>>>    1. Get the product to call a flush from a finally block
>>>>>>>    2. Get the library to auto-flush the collected readings every few
>>>>>>>    seconds
>>>>>>>
>>>>>>> I feel #2 is simpler.
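>>>>>>>
>>>>>>> For #2, the library could look roughly like this (a sketch; the class
>>>>>>> and method names are made up, and it assumes some publish(streamId,
>>>>>>> payload) hook into the existing data publisher, left as a comment):
>>>>>>>
>>>>>>>     import java.util.Map;
>>>>>>>     import java.util.concurrent.ConcurrentHashMap;
>>>>>>>     import java.util.concurrent.Executors;
>>>>>>>     import java.util.concurrent.ScheduledExecutorService;
>>>>>>>     import java.util.concurrent.TimeUnit;
>>>>>>>
>>>>>>>     public class ReadingCollector {
>>>>>>>         private final Map<String, Map<String, Object>> pending =
>>>>>>>                 new ConcurrentHashMap<>();
>>>>>>>         private final ScheduledExecutorService flusher =
>>>>>>>                 Executors.newSingleThreadScheduledExecutor();
>>>>>>>
>>>>>>>         public ReadingCollector(long flushIntervalMs) {
>>>>>>>             // Auto-flush every few seconds so readings from flows
>>>>>>>             // that terminated mid-way still go out, without needing
>>>>>>>             // a finally block in the product code.
>>>>>>>             flusher.scheduleAtFixedRate(this::flushAll,
>>>>>>>                     flushIntervalMs, flushIntervalMs,
>>>>>>>                     TimeUnit.MILLISECONDS);
>>>>>>>         }
>>>>>>>
>>>>>>>         public void addReading(String correlationId, String field,
>>>>>>>                 Object value) {
>>>>>>>             pending.computeIfAbsent(correlationId,
>>>>>>>                     id -> new ConcurrentHashMap<>()).put(field, value);
>>>>>>>         }
>>>>>>>
>>>>>>>         private void flushAll() {
>>>>>>>             for (String id : pending.keySet()) {
>>>>>>>                 Map<String, Object> readings = pending.remove(id);
>>>>>>>                 if (readings != null) {
>>>>>>>                     // publish(combinedStreamId, toPayload(id, readings));
>>>>>>>                 }
>>>>>>>             }
>>>>>>>         }
>>>>>>>     }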
>>>>>>>
>>>>>>> Do we have any concerns about going to this model?
>>>>>>>
>>>>>>> Suho, Anjana, we need to think about how to do this with our stream
>>>>>>> definitions, as we currently force users to define the streams
>>>>>>> beforehand.
>>>>>>>
>>>>>> Can't we write something similar to JDBC batch processing, where the
>>>>>> code would only do a publisher.addBatch() or something similar? The data
>>>>>> publisher can be configured to flush the batched requests to DAS when
>>>>>> they hit a certain threshold.
>>>>>>
>>>>>> Ex: we define the batch size as 10 (via code or a config XML). Then, if
>>>>>> we have 5 streams, the publisher would send 5 requests to DAS (one per
>>>>>> stream) instead of 50.
>>>>>>
>>>>>> IMO, this would allow us to keep the existing stream definitions and
>>>>>> reduce the number of calls from a server to DAS.
>>>>>>
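>>>>>> Usage would be something like this (hypothetical API, mirroring JDBC
>>>>>> batching; the names are illustrative):
>>>>>>
>>>>>>     // Each call buffers an event against its own stream definition.
>>>>>>     publisher.addBatch(httpStreamId, httpEvent);
>>>>>>     publisher.addBatch(throttleStreamId, throttleEvent);
>>>>>>
>>>>>>     // When a stream's buffer hits the configured size (e.g. 10),
>>>>>>     // the publisher sends one request to DAS for that stream.
>>>>>>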
>>>>>> WDYT?
>>>>>>
>>>>>> Thanks,
>>>>>> Janaka
>>>>>>
>>>>>>>
>>>>>>> --Srinath
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> ============================
>>>>>>> Blog: http://srinathsview.blogspot.com twitter:@srinath_perera
>>>>>>> Site: http://home.apache.org/~hemapani/
>>>>>>> Photos: http://www.flickr.com/photos/hemapani/
>>>>>>> Phone: 0772360902
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> *Janaka Ranabahu*
>>>>>> Associate Technical Lead, WSO2 Inc.
>>>>>> http://wso2.com
>>>>>>
>>>>>>
>>>>>> E-mail: jan...@wso2.com | M: +94 718370861
>>>>>>
>>>>>> Lean . Enterprise . Middleware
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> W.G. Gihan Anuruddha
>>>>> Senior Software Engineer | WSO2, Inc.
>>>>> M: +94772272595
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Thanks & Regards,
>>>>
>>>> Inosh Goonewardena
>>>> Associate Technical Lead- WSO2 Inc.
>>>> Mobile: +94779966317
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Nuwan Dias
>>>
>>> Technical Lead - WSO2, Inc. http://wso2.com
>>> email : nuw...@wso2.com
>>> Phone : +94 777 775 729
>>>
>>>
>>>
>>
>>
>>
>
>
> --
> ============================
> Blog: http://srinathsview.blogspot.com twitter:@srinath_perera
> Site: http://home.apache.org/~hemapani/
> Photos: http://www.flickr.com/photos/hemapani/
> Phone: 0772360902
>
>
>


-- 
Sanjiva Weerawarana, Ph.D.
Founder, CEO & Chief Architect; WSO2, Inc.;  http://wso2.com/
email: sanj...@wso2.com; office: (+1 650 745 4499 | +94  11 214 5345)
x5700; cell: +94 77 787 6880 | +1 408 466 5099; voip: +1 650 265 8311
blog: http://sanjiva.weerawarana.org/; twitter: @sanjiva
Lean . Enterprise . Middleware