Let's first clearly distinguish between execution types and backends. There are five execution types (hybrid_spark, spark, hybrid, hadoop, singlenode) and four backends (CP, SPARK, MR, GPU). For example, hybrid_spark indicates that the optimizer is allowed to leverage the CP and SPARK backends (this might change once the GPU backend graduates).
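
To make the distinction concrete, here is a minimal sketch in Python. Only the hybrid_spark entry (CP and SPARK) is taken from the description above; the other four mappings are my assumptions about which backends each execution type permits, and the names Backend, ExecType, ALLOWED_BACKENDS and allowed_backends are hypothetical, not part of SystemML.

```python
from enum import Enum


class Backend(Enum):
    """The four backends named above."""
    CP = "CP"
    SPARK = "SPARK"
    MR = "MR"
    GPU = "GPU"


class ExecType(Enum):
    """The five execution types named above."""
    HYBRID_SPARK = "hybrid_spark"
    SPARK = "spark"
    HYBRID = "hybrid"
    HADOOP = "hadoop"
    SINGLENODE = "singlenode"


# Backends the optimizer may choose from for each execution type.
ALLOWED_BACKENDS = {
    # Stated in the text: hybrid_spark may use CP and SPARK.
    ExecType.HYBRID_SPARK: frozenset({Backend.CP, Backend.SPARK}),
    # ASSUMPTION: the remaining mappings are illustrative guesses,
    # not confirmed by the text. GPU is left out until it graduates.
    ExecType.SPARK: frozenset({Backend.SPARK}),
    ExecType.HYBRID: frozenset({Backend.CP, Backend.MR}),
    ExecType.HADOOP: frozenset({Backend.MR}),
    ExecType.SINGLENODE: frozenset({Backend.CP}),
}


def allowed_backends(exec_type_name: str) -> frozenset:
    """Look up the permitted backends for an execution type given by name.

    Raises ValueError if the name is not one of the five execution types.
    """
    return ALLOWED_BACKENDS[ExecType(exec_type_name)]
```

The point of keeping the two enums separate is that a test configuration selects an execution type, while the backends are only a consequence of that choice.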

Given the large number of other parameters that we want to include in the performance tests, I would recommend making the execution type configurable but running only hybrid_spark by default. Over scaled data sizes, this should already give us good coverage of our primary backends, CP and SPARK.

Furthermore, I would recommend NOT running from the bin directory. These tests are usually run on a cluster, and copying the bin folder just creates unnecessary hassle. Please simply refer to SystemML.jar and use either sparkDML.sh or a simplified version of that script.

Regards,
Matthias

On Sun, May 21, 2017 at 10:17 PM, Krishna Kalyan <[email protected]> wrote:

> Gentle ping to Frederick Reiss, Mike Dusenberry, Arvind Surve, Niketan
> Pansare, Felix Schüler, Deron Eriksson, Nakul Jindal and Matthias Boehm.
> (Apologies, if I missed out on some one).
>
> I would really appreciate if I could have your feedback on this.
>
> Regards,
> Krishna
>
>
> On Sun, May 14, 2017 at 2:47 AM, Krishna Kalyan <[email protected]>
> wrote:
>
> > Hello All,
> >
> > Question 1:
> >
> > Currently we have 5 backend modes.
> > (standalone, hadoop, spark, hybrid and spark_hybrid)
> > I would like to know if we need to have performance tests for all 5
> > backends + noticed that earlier we had perftests only for MapReduce and
> > Spark Mode.
> > https://github.com/apache/incubator-systemml/tree/master/scripts/perftest
> >
> > Question 2:
> >
> > Would calling scripts from the bin folder to run performance test be a
> > good idea?.
> > https://github.com/apache/incubator-systemml/tree/master/bin
> > (from python files like systemml-standalone.py, systemml-spark-submit.py
> > etc)
> >
> > Will that be a good approach?. Based on some suggestions on this JIRA, I
> > understand that creating an API would be another option.
> >
> > Please share your thoughts.
> >
> > Regards,
> > Krishna
