Hi,

This is a follow-up to [1], to give an update on the status of the performance issue [2]. As mentioned in the previous mail, generating the summary stats with the Spark script as a batch process creates a bottleneck at higher TPS. More precisely, per our findings, it cannot handle a throughput of more than 30 TPS as a batch process (i.e., events published to DAS within 10 minutes at 30 TPS take more than 10 minutes to process; so if we schedule the script every 10 minutes, the backlog of events to be processed grows over time).
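To illustrate the backlog effect, here is a minimal sketch with assumed numbers: the 30 TPS ingest rate is from our tests, but the effective processing rate (`PROCESS_TPS`) is a hypothetical figure chosen only to show what happens whenever processing is slower than ingestion.

```python
# Sketch: backlog growth when a batch script cannot keep up with ingestion.
# INGEST_TPS matches our observed ceiling; PROCESS_TPS is an assumed
# effective drain rate, not a measured value.
INTERVAL_SEC = 10 * 60   # script scheduled every 10 minutes
INGEST_TPS = 30          # events published to DAS per second
PROCESS_TPS = 25         # assumed events the script can process per second

backlog = 0
for run in range(1, 4):
    backlog += INGEST_TPS * INTERVAL_SEC             # events arrived this interval
    processed = min(backlog, PROCESS_TPS * INTERVAL_SEC)
    backlog -= processed
    print(f"after run {run}: backlog = {backlog} events")

# Each interval the queue grows by (30 - 25) * 600 = 3000 events,
# so the unprocessed events grow without bound over time.
```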
To overcome this, we thought of doing the summarization up to a certain level (second-wise summaries) using Siddhi, and generating the remaining stats (per-minute/hour/day/month) using Spark. With this enhancement, I ran some load tests locally to evaluate the approach, and the results are as follows.

Backend DB: MySQL
ESB analytics nodes: 1

With InnoDB:
- *80 TPS* (script scheduled every 1 min): avg time for script completion = ~ *20 sec*
- *500 TPS* (script scheduled every 2 min): avg time for script completion = ~ *45 sec*

With MyISAM:
- *80 TPS* (script scheduled every 1 min): avg time for script completion = ~ *24 sec*
- *80 TPS* (script scheduled every 2 min): avg time for script completion = ~ *20 sec*
- *500 TPS* (script scheduled every 2 min): avg time for script completion = ~ *35 sec*

As a further improvement, we will try doing the summarization up to the minute/hour level with Siddhi (and eventually all of the summarization using Siddhi).

[1] [Dev] ESB Analytics - Verifying the common production use cases
[2] https://wso2.org/jira/browse/ANLYESB-15

Thanks,
Supun

--
*Supun Sethunga*
Software Engineer
WSO2, Inc.
http://wso2.com/
lean | enterprise | middleware
Mobile : +94 716546324
_______________________________________________
Architecture mailing list
Architecture@wso2.org
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture