Hi Janaka,

We do have event batching at the moment as well; you can configure it in data-agent-config.xml [1]. AFAIU, what we are trying to do here is to combine several events into a single event. Apart from that, wouldn't it be a good idea to compress the event after we merge it and before we send it to DAS?

[1] https://github.com/wso2/carbon-analytics-common/blob/master/features/data-bridge/org.wso2.carbon.databridge.agent.server.feature/src/main/resources/conf/data-agent-config.xml
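Something like the following could work for the compression step. This is only a minimal sketch, assuming the merged readings are serialized to a string first; EventCompressor and compress() are illustrative names, not part of the databridge agent API:

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.util.zip.GZIPOutputStream;

    public class EventCompressor {

        // GZIP-compress the serialized, merged event payload before
        // handing it to the publisher. The payload here is assumed to be
        // a string serialization of the combined readings.
        public static byte[] compress(String mergedPayload) throws IOException {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (GZIPOutputStream gzip = new GZIPOutputStream(bos)) {
                gzip.write(mergedPayload.getBytes(StandardCharsets.UTF_8));
            }
            return bos.toByteArray();
        }
    }

Whether this pays off would need measuring, of course: it trades publisher-side CPU for network bandwidth, and Thrift binary payloads may not compress as well as text.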
On Fri, Mar 25, 2016 at 11:39 AM, Janaka Ranabahu <jan...@wso2.com> wrote:

> Hi Srinath,
>
> On Fri, Mar 25, 2016 at 11:26 AM, Srinath Perera <srin...@wso2.com> wrote:
>
>> As per the meeting (Participants: Sanjiva, Shankar, Sumedha, Anjana,
>> Miyuru, Seshika, Suho, Nirmal, Nuwan):
>>
>> Currently we generate several events per message from our products. For
>> example, when a message hits APIM, the following events will be generated:
>>
>> 1. One from the HTTP level
>> 2. 1-2 from authentication and authorization logic
>> 3. 1 from throttling
>> 4. 1 for ESB-level stats
>> 5. 2 for request and response
>>
>> If APIM is handling 10K TPS, that means DAS is receiving events at about
>> 80K TPS. Although the data bridge that transfers events is fast, writing
>> to disk (via RDBMS or HBase) is a problem. We can scale HBase; however,
>> that will run into a scenario where an APIM deployment will need a very
>> large DAS deployment.
>>
>> We decided to figure out a way to collect all the events and send a
>> single event to DAS. Basically, the idea is to extend the data publisher
>> library so that the user can keep adding readings to it, and it will
>> collect the readings and send them over as a single event to the server.
>>
>> However, some flows might terminate in the middle due to failures. There
>> are two solutions:
>>
>> 1. Get the product to call a flush from a finally block
>> 2. Get the library to auto-flush collected readings every few seconds
>>
>> I feel #2 is simpler.
>>
>> Do we have any concerns about going to this model?
>>
>> Suho, Anjana, we need to think about how to do this with our stream
>> definitions, as we force you to define streams beforehand.
>>
> Can't we write something similar to JDBC batch processing, where the code
> would only do a publisher.addBatch() or something similar? The data
> publisher can be configured to flush the batched requests to DAS when they
> hit a certain threshold (a rough sketch of this idea appears below).
>
> Ex: We define the batch size as 10 (using code or a config XML). Then, if
> we have 5 streams, the publisher would send 5 requests to DAS (one per
> stream) instead of 50.
>
> IMO, this would allow us to keep the existing stream definitions and
> reduce the number of calls from a server to DAS.
>
> WDYT?
>
> Thanks,
> Janaka
>
>> --Srinath
>>
>> --
>> ============================
>> Blog: http://srinathsview.blogspot.com twitter: @srinath_perera
>> Site: http://home.apache.org/~hemapani/
>> Photos: http://www.flickr.com/photos/hemapani/
>> Phone: 0772360902
>
> --
> Janaka Ranabahu
> Associate Technical Lead, WSO2 Inc.
> http://wso2.com
>
> E-mail: jan...@wso2.com | M: +94 718370861
>
> Lean . Enterprise . Middleware

--
W.G. Gihan Anuruddha
Senior Software Engineer | WSO2, Inc.
M: +94772272595
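For illustration, Janaka's addBatch()-style API could be combined with Srinath's option #2 (auto-flushing every few seconds) so that readings from flows that fail mid-way still get sent. A minimal sketch in Java follows; BatchingPublisher, Reading, and send() are hypothetical names for this sketch, not the actual databridge agent API:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public class BatchingPublisher {

        private final int batchSize;
        private final List<Reading> buffer = new ArrayList<>();
        private final ScheduledExecutorService flusher =
                Executors.newSingleThreadScheduledExecutor();

        public BatchingPublisher(int batchSize, long flushIntervalSeconds) {
            this.batchSize = batchSize;
            // Option #2: auto-flush periodically so readings from flows
            // that terminated mid-way are still sent.
            flusher.scheduleAtFixedRate(this::flush, flushIntervalSeconds,
                    flushIntervalSeconds, TimeUnit.SECONDS);
        }

        // Janaka's suggestion: collect readings instead of publishing each
        // one immediately; flush once the configured threshold is hit.
        public synchronized void addBatch(Reading reading) {
            buffer.add(reading);
            if (buffer.size() >= batchSize) {
                flush();
            }
        }

        public synchronized void flush() {
            if (buffer.isEmpty()) {
                return;
            }
            send(new ArrayList<>(buffer)); // one call to DAS for the batch
            buffer.clear();
        }

        // Option #1: products can call this from a finally block to force
        // a final flush before shutdown.
        public synchronized void close() {
            flusher.shutdown();
            flush();
        }

        private void send(List<Reading> batch) {
            // Placeholder: in reality this would publish one combined event
            // (or one request per stream) through the data publisher.
        }

        // Placeholder for whatever a collected reading would carry.
        public static class Reading { }
    }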
_______________________________________________
Architecture mailing list
Architecture@wso2.org
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture