Hi all,

I recently conducted a performance test on the Airavata Registry using the 0.13 release, with MySQL as the database back-end.

*I've tested 3 scenarios, with 10,000, 100,000, and 1,000,000 experiments in the registry.*

In each scenario the experiments are distributed equally among 10 users, and each user's experiments are distributed evenly among 4 projects (each project on a different host), all using the echo_application as the application.

*I've tested the following methods of the API:*

1. getExperiment(experimentID);
2. searchExperimentsByName(userName, experimentName);
3. searchExperimentsByApplication(username, appID);
4. searchExperimentsByDesc(username, description);

*Results*

For all the scenarios, getExperiment(experimentID) runs within 1 second.

For all the search methods, the queries that took the most time are:

1. Query: SELECT p FROM Status p WHERE p.expId =:param0
2. Query: SELECT p FROM ErrorDetail p WHERE p.expId =:param0

*100,000 Experiments*

Testing searching experiments using search Experiment Name
Number of results: 608
Time taken for the operation= 1 secs
Time taken per Experiment = 0.0016 secs

Testing searching experiments using search Experiment description
Number of results: 2500
Time taken for the operation= 5 secs
Time taken per Experiment = 0.002

Testing searching experiments using search Application
Number of results: 10000
Time taken for the operation= 23 secs
Time taken per Experiment = 0.0023

*The queries mentioned above took 57% and 41% of the overall query time.*

*1,000,000 experiments*

Testing searching experiments using search Experiment Name
Number of results: 612
Time taken for the operation= 6 secs
Time taken per Experiment = 0.0098 secs

Testing searching experiments using search Experiment description
Number of results: 100,000
Time taken for the operation= 303 secs
Time taken per Experiment = 0.00303

Testing searching experiments using search Application
Number of results: 100,000
Time taken for the operation= 293 secs
Time taken per Experiment = 0.00293

*The queries mentioned above took 53% and 43% of the overall query time.*

*Conclusion*

We need to address the following issues.

1. We need a way to paginate the results; otherwise the search methods take too long to return (as shown above, around 300 seconds for 100,000 results).
2. We need to optimize the two queries mentioned above, as together they account for nearly all of the query time (around 96%).
3. The current searches don't support full-text searching. For example, to find an experiment by its description, the user has to enter exactly the same phrase that's in the description, or the search will not return the matching result.

*WDYT?*

--
Thanks,
Sachith Withana
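
P.S. To make the pagination point concrete, here is a minimal sketch of limit/offset paging over a search. It is only an illustration under stated assumptions: it uses an in-memory SQLite database rather than the Airavata Registry, and the table, column, and function names (experiment, exp_id, search_by_desc, and so on) are hypothetical. The LIKE match is there just to have something to page over; it is not a proposal for the full-text search in point 3.

```python
import sqlite3


def make_demo_db(n_experiments):
    """Build a throwaway in-memory table (hypothetical schema, not the registry's)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE experiment ("
        "exp_id INTEGER PRIMARY KEY, user_name TEXT, description TEXT)"
    )
    # Spread experiments evenly over 10 users, as in the test setup.
    rows = [(i, f"user{i % 10}", f"echo run {i}") for i in range(n_experiments)]
    conn.executemany("INSERT INTO experiment VALUES (?, ?, ?)", rows)
    return conn


def search_by_desc(conn, user_name, text, limit, offset):
    """Return one page of matches; ORDER BY keeps page boundaries stable."""
    cur = conn.execute(
        "SELECT exp_id, description FROM experiment "
        "WHERE user_name = ? AND description LIKE ? "
        "ORDER BY exp_id LIMIT ? OFFSET ?",
        (user_name, f"%{text}%", limit, offset),
    )
    return cur.fetchall()


conn = make_demo_db(1000)
first_page = search_by_desc(conn, "user3", "echo", limit=20, offset=0)
second_page = search_by_desc(conn, "user3", "echo", limit=20, offset=20)
```

With paging like this, the caller only pays for the rows on the page it asks for, so any per-experiment follow-up work is bounded by the page size instead of the full result count.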
