Hi Janaka,

We do have event batching at the moment as well. You can configure that
in data-agent-config.xml [1]. AFAIU, what we are trying to do here is to
combine several events into a single event. Apart from that, wouldn't it be
a good idea to compress the merged event before we send it to DAS?

[1] -
https://github.com/wso2/carbon-analytics-common/blob/master/features/data-bridge/org.wso2.carbon.databridge.agent.server.feature/src/main/resources/conf/data-agent-config.xml
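
For example (just a rough sketch; the class below is illustrative and not part
of the data bridge API), the merged event payload could be GZIP-compressed on
the publisher side and decompressed again on the DAS side:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

// Illustrative only: compress the serialized, merged event payload before
// handing it to the data publisher. DAS would need to decompress on receive.
public final class EventCompressor {

    public static byte[] gzip(byte[] mergedPayload) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bos)) {
            out.write(mergedPayload);
        }
        return bos.toByteArray();
    }
}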

On Fri, Mar 25, 2016 at 11:39 AM, Janaka Ranabahu <jan...@wso2.com> wrote:

> Hi Srinath,
>
> On Fri, Mar 25, 2016 at 11:26 AM, Srinath Perera <srin...@wso2.com> wrote:
>
>> As per the meeting (Participants: Sanjiva, Shankar, Sumedha, Anjana, Miyuru,
>> Seshika, Suho, Nirmal, Nuwan):
>>
>> Currently we generate several events per message from our products. For
>> example, when a message hits APIM, the following events are generated.
>>
>>
>>    1. One from HTTP level
>>    2. 1-2 from authentication and authorization logic
>>    3. 1 from Throttling
>>    4. 1 for ESB level stats
>>    5. 2 for request and response
>>
>> If APIM is handling 10K TPS, that means DAS is receiving events at around
>> 80K TPS. Although the data bridge that transfers the events is fast, writing
>> to disk (via RDBMS or HBase) is a problem. We can scale HBase; however, that
>> leads to a scenario where an APIM deployment needs a very large DAS
>> deployment.
>>
>> We decided to figure out a way to collect all the events and send a
>> single event to DAS. Basically, the idea is to extend the data publisher
>> library so that the user can keep adding readings to it, and it will collect
>> the readings and send them over as a single event to the server.
>>
>> However, some flows might terminate in the middle due to failures. There
>> are two solutions:
>>
>>
>>    1. Get the product to call a flush from a finally block
>>    2. Get the library to auto-flush collected readings every few seconds
>>
>> I feel #2 is simpler.
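
A rough sketch of what option #2 could look like (class and method names here
are hypothetical, not the current data publisher API). Readings are accumulated
in memory, and a background task flushes whatever has been collected every few
seconds, so flows that fail mid-way still get their readings published:

import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class AccumulatingPublisher {

    private final Queue<Object[]> readings = new ConcurrentLinkedQueue<>();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public AccumulatingPublisher(long flushIntervalSeconds) {
        // Auto-flush whatever has been collected every few seconds, so flows
        // that terminate mid-way still get their partial readings sent.
        scheduler.scheduleAtFixedRate(this::flush,
                flushIntervalSeconds, flushIntervalSeconds, TimeUnit.SECONDS);
    }

    public void addReading(Object[] reading) {
        readings.add(reading);
    }

    public void flush() {
        List<Object[]> batch = new ArrayList<>();
        for (Object[] r; (r = readings.poll()) != null; ) {
            batch.add(r);
        }
        if (!batch.isEmpty()) {
            publishAsSingleEvent(batch); // merge and hand off to the data bridge publisher
        }
    }

    private void publishAsSingleEvent(List<Object[]> batch) {
        // Placeholder for the actual publish call.
    }
}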
>>
>> Do we have any concerns about going to this model?
>>
>> Suho, Anjana, we need to think about how to do this with our stream
>> definitions, since streams currently have to be defined beforehand.
>>
> Can't we write something similar to JDBC batch processing, where the code
> would only do a publisher.addBatch() or something similar? The data
> publisher could be configured to flush the batched requests to DAS when they
> hit a certain threshold.
>
> Ex:- We define the batch size as 10 (using code or a config XML). Then if we
> have 5 streams, the publisher would send 5 requests to DAS (one per stream)
> instead of 50.
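
Something along these lines, perhaps (purely illustrative; addBatch() and the
wrapper class are hypothetical, modelled on JDBC batching rather than the
existing publisher API):

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical wrapper: events are buffered per stream and pushed to DAS as
// one request per stream once the configured batch size is reached, so a
// batch size of 10 with 5 streams means 5 requests instead of 50.
public class BatchingPublisher {

    private final int batchSize;
    private final Map<String, List<Object[]>> batches = new ConcurrentHashMap<>();

    public BatchingPublisher(int batchSize) {
        this.batchSize = batchSize;
    }

    public void addBatch(String streamId, Object[] event) {
        List<Object[]> batch = batches.computeIfAbsent(streamId, k -> new ArrayList<>());
        synchronized (batch) {
            batch.add(event);
            if (batch.size() >= batchSize) {
                sendBatch(streamId, new ArrayList<>(batch)); // one request to DAS
                batch.clear();
            }
        }
    }

    private void sendBatch(String streamId, List<Object[]> events) {
        // Placeholder: hand the batched events for this stream to the
        // existing data bridge publisher.
    }
}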
>
> IMO, this would allow us to keep the existing stream definitions and
> reduce the number of calls from a server to DAS.
>
> WDYT?
>
> Thanks,
> Janaka
>
>>
>> --Srinath
>>
>>
>>
>>
>>
>> --
>> ============================
>> Blog: http://srinathsview.blogspot.com twitter:@srinath_perera
>> Site: http://home.apache.org/~hemapani/
>> Photos: http://www.flickr.com/photos/hemapani/
>> Phone: 0772360902
>>
>> _______________________________________________
>> Architecture mailing list
>> Architecture@wso2.org
>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>
>>
>
>
> --
> *Janaka Ranabahu*
> Associate Technical Lead, WSO2 Inc.
> http://wso2.com
>
>
> E-mail: jan...@wso2.com  M: +94 718370861
>
> Lean . Enterprise . Middleware
>
> _______________________________________________
> Architecture mailing list
> Architecture@wso2.org
> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>
>


-- 
W.G. Gihan Anuruddha
Senior Software Engineer | WSO2, Inc.
M: +94772272595
_______________________________________________
Architecture mailing list
Architecture@wso2.org
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
