+1 for the approach. Let's test it and see.

Regards
Suho

On Wed, Apr 20, 2016 at 12:30 PM, Anjana Fernando <anj...@wso2.com> wrote:

> Hi,
>
> Good progress, Supun! Do keep pushing the parameters to find the limits
> we can go to.
>
> @Suho, the idea was to eliminate the batch script altogether, just
> store/index the data for later lookup, and do the computation purely in
> Siddhi. I don't think we will hit a big scaling problem, since the data
> that needs to be held in memory gets smaller as we go to the upper layers
> of summarization, and it stops at yearly granularity. So at any given time
> we would have in memory, for a specific artifact, the last year's worth of
> data as the last 12 monthly summary records, the daily summaries as 30
> entries, etc. So the growth of data slows immensely, and it also has an
> upper limit, which I guess should be comfortably within usual memory
> capacity.
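The upper limit described above can be put into rough numbers. This is only a back-of-the-envelope sketch in Python: the 30 daily and 12 monthly entries come from the message, while the retained counts for seconds, minutes and hours, and the artifact count of 1000, are assumptions made purely for illustration.

```python
# Summary records kept in memory per artifact at each granularity.
# 30 (days) and 12 (months) are taken from the message above;
# the second/minute/hour counts are assumed for illustration.
RETAINED = {"second": 60, "minute": 60, "hour": 24, "day": 30, "month": 12}


def max_records(artifacts, retained=RETAINED):
    """Upper bound on in-memory summary records across all artifacts."""
    return artifacts * sum(retained.values())


print(max_records(1))     # 186
print(max_records(1000))  # 186000
```

The bound depends on the number of artifacts and the retention per level, not on the incoming event rate, which is the point being made about memory.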
>
> So if we can get a proper checkpoint and replay mechanism figured out for
> the processed data, we can do everything in CEP, and then we don't have
> the complexity of maintaining two processing mechanisms.
>
> Cheers,
> Anjana.
>
> On Wed, Apr 20, 2016 at 12:11 PM, Sriskandarajah Suhothayan <s...@wso2.com
> > wrote:
>
>> I think it will make more sense to run seconds and minutes from Siddhi,
>> and run Spark every hour. When there is a lot of data on the system,
>> this will be much more scalable.
>>
>> WDYT?
>>
>> Regards
>> Suho
>>
>> On Wed, Apr 20, 2016 at 11:50 AM, Supun Sethunga <sup...@wso2.com> wrote:
>>
>>> Hi,
>>>
>>> This is a follow-up to [1], to give an update on the status of the
>>> performance issue [2]. As mentioned in the previous mail, having the
>>> Spark script generate the summary stats as a batch process creates a
>>> bottleneck at higher TPS. More precisely, per our findings, it cannot
>>> handle a throughput of more than 30 TPS as a batch process (i.e., events
>>> published to DAS within 10 mins at 30 TPS take more than 10 mins to
>>> process, which means that if we schedule the script every 10 mins, the
>>> number of events to be processed grows over time).
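The backlog growth described above can be illustrated with a small simulation. This is a minimal Python sketch, not a measurement: the 30 TPS arrival rate and 10-minute schedule are from the mail, while the batch capacity of 25 events/s is a made-up figure (the mail only implies it is below 30).

```python
def backlog_after(cycles, arrival_tps, interval_s, batch_capacity_tps):
    """Return the unprocessed-event count after each scheduling cycle.

    Each cycle, arrival_tps * interval_s new events arrive, and the batch
    script clears at most batch_capacity_tps * interval_s events before
    the next run is due.
    """
    backlog = 0
    history = []
    for _ in range(cycles):
        backlog += arrival_tps * interval_s
        backlog -= min(backlog, batch_capacity_tps * interval_s)
        history.append(backlog)
    return history


# 10-minute schedule; the capacity of 25 events/s is hypothetical.
print(backlog_after(3, arrival_tps=30, interval_s=600, batch_capacity_tps=25))
# → [3000, 6000, 9000]
print(backlog_after(3, arrival_tps=20, interval_s=600, batch_capacity_tps=25))
# → [0, 0, 0]
```

Once the arrival rate exceeds what the script can clear per interval, the backlog grows linearly with every cycle. Below that rate it stays at zero.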
>>>
>>> To overcome this, we thought of doing the summarizing up to a certain
>>> extent (up to the second-wise summary) using Siddhi, and generating the
>>> remaining stats (per-minute/hour/day/month) using Spark. With this
>>> enhancement, we ran some load tests locally to evaluate the approach, and
>>> the results are as follows.
>>>
>>> Backend DB : MySQL
>>> ESB analytics nodes: 1
>>>
>>> With InnoDB
>>>
>>>    - With *80 TPS* (script scheduled every 1 min): avg time taken for
>>>    completion of the script = ~*20 sec*.
>>>    - With *500 TPS* (script scheduled every 2 min): avg time taken for
>>>    completion of the script = ~*45 sec*.
>>>
>>>
>>> With MyISAM
>>>
>>>    - With *80 TPS* (script scheduled every 1 min): avg time taken for
>>>    completion of the script = ~*24 sec*.
>>>    - With *80 TPS* (script scheduled every 2 min): avg time taken for
>>>    completion of the script = ~*20 sec*.
>>>    - With *500 TPS* (script scheduled every 2 min): avg time taken for
>>>    completion of the script = ~*35 sec*.
>>>
>>> As a further improvement, we will try summarizing up to the minute/hour
>>> level in Siddhi (and eventually do all the summarizing in Siddhi).
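The layered summarization mentioned above (second-wise summaries rolled up into minute and hour summaries) could look roughly like the following. It is written in Python rather than Siddhi purely to show the merge logic. The `Summary` fields (count, total, minimum, maximum) and the `roll_up` helper are assumptions for illustration, not the actual ESB analytics schema.

```python
from dataclasses import dataclass


@dataclass
class Summary:
    count: int
    total: float
    minimum: float
    maximum: float

    def merge(self, other):
        """Combine two summaries without revisiting the raw events."""
        return Summary(
            self.count + other.count,
            self.total + other.total,
            min(self.minimum, other.minimum),
            max(self.maximum, other.maximum),
        )


def roll_up(summaries, bucket_seconds):
    """Merge {epoch_second: Summary} into buckets of bucket_seconds."""
    out = {}
    for ts, s in summaries.items():
        key = ts - ts % bucket_seconds  # start of the coarser bucket
        out[key] = out[key].merge(s) if key in out else s
    return out


per_second = {
    0: Summary(2, 30.0, 10.0, 20.0),
    59: Summary(1, 5.0, 5.0, 5.0),
    60: Summary(3, 90.0, 20.0, 40.0),
}
per_minute = roll_up(per_second, 60)
per_hour = roll_up(per_minute, 3600)
print(per_minute[0])  # Summary(count=3, total=35.0, minimum=5.0, maximum=20.0)
print(per_hour[0])    # Summary(count=6, total=125.0, minimum=5.0, maximum=40.0)
```

Each level only reads the output of the level below it, so the work per roll-up shrinks as the granularity gets coarser.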
>>>
>>> [1] [Dev] ESB Analytics - Verifying the common production use cases
>>> [2] https://wso2.org/jira/browse/ANLYESB-15
>>>
>>> Thanks,
>>> Supun
>>>
>>> --
>>> *Supun Sethunga*
>>> Software Engineer
>>> WSO2, Inc.
>>> http://wso2.com/
>>> lean | enterprise | middleware
>>> Mobile : +94 716546324
>>>
>>> _______________________________________________
>>> Architecture mailing list
>>> Architecture@wso2.org
>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>>
>>>
>>
>>
>> --
>>
>> *S. Suhothayan*
>> Technical Lead & Team Lead of WSO2 Complex Event Processor
>> *WSO2 Inc.* http://wso2.com
>> lean . enterprise . middleware
>>
>>
>> cell: (+94) 779 756 757 | blog: http://suhothayan.blogspot.com/ |
>> twitter: http://twitter.com/suhothayan |
>> linked-in: http://lk.linkedin.com/in/suhothayan
>>
>
>
>
> --
> *Anjana Fernando*
> Senior Technical Lead
> WSO2 Inc. | http://wso2.com
> lean . enterprise . middleware
>


