Re: Some questions on Performance Test

Matthias Boehm Sun, 21 May 2017 23:11:15 -0700

Let's first clearly distinguish between execution types and backends. There
are five execution types (hybrid_spark, spark, hybrid, hadoop, singlenode)
and four backends (CP, SPARK, MR, GPU). For example, hybrid_spark
indicates that the optimizer is allowed to leverage the backends CP and
SPARK (once the GPU backend has graduated, this might change).
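
To make the distinction concrete, here is a rough sketch of the mapping as a
lookup table. Only the hybrid_spark row is stated above; the other rows are
assumptions based on how the execution types are commonly described, and the
names ALLOWED_BACKENDS and backends_for are hypothetical.

```python
# Execution type -> backends the optimizer may choose from.
# Only the hybrid_spark entry comes from this thread; the remaining
# entries are assumptions and should be checked against the code base.
ALLOWED_BACKENDS = {
    "hybrid_spark": {"CP", "SPARK"},
    "spark": {"SPARK"},       # assumption
    "hybrid": {"CP", "MR"},   # assumption
    "hadoop": {"MR"},         # assumption
    "singlenode": {"CP"},     # assumption
}


def backends_for(exec_type):
    """Return the set of backends allowed for an execution type."""
    try:
        return ALLOWED_BACKENDS[exec_type]
    except KeyError:
        raise ValueError("unknown execution type: %r" % (exec_type,))
```

GPU is left out on purpose, since it is not yet tied to any execution type.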

Given the large number of other parameters that we want to include in the
performance tests, I would recommend making the execution type
configurable but running only hybrid_spark by default. Over scaled data
sizes, this should already give us good coverage for our primary backends
CP and SPARK.
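
One possible way to expose this in a Python test driver is sketched below.
The option name --exec-type and the function parse_args are hypothetical,
not part of any existing perftest script.

```python
import argparse

# The five execution types listed in this thread.
EXEC_TYPES = ["hybrid_spark", "spark", "hybrid", "hadoop", "singlenode"]


def parse_args(argv=None):
    """Parse driver options; the execution type defaults to hybrid_spark."""
    parser = argparse.ArgumentParser(description="perftest driver (sketch)")
    parser.add_argument(
        "--exec-type",  # hypothetical option name
        choices=EXEC_TYPES,
        default="hybrid_spark",
        help="execution type to test (default: hybrid_spark)",
    )
    return parser.parse_args(argv)
```

Calling parse_args([]) selects hybrid_spark, while passing
["--exec-type", "spark"] overrides it for a single run.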

Furthermore, I would recommend NOT running from the bin directory. These
tests are usually run on a cluster, and copying the bin folder just creates
unnecessary hassle. Please simply refer to SystemML.jar and use either
sparkDML.sh or a simplified version of this script.
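
A simplified launcher along these lines could assemble the command directly
against the jar. This is only a sketch: build_command is a hypothetical
helper, the paths in the usage note are placeholders, and the flags (-f,
-exec, -nvargs) are assumed to match the SystemML command line, so please
verify them against sparkDML.sh before relying on this.

```python
def build_command(jar_path, dml_script, exec_type="hybrid_spark", nvargs=None):
    """Build a spark-submit argument list that refers to SystemML.jar directly.

    The flags used here are assumptions about the SystemML command line.
    """
    cmd = ["spark-submit", jar_path, "-f", dml_script, "-exec", exec_type]
    if nvargs:
        # Named arguments for the DML script, sorted for reproducible output.
        cmd.append("-nvargs")
        cmd.extend("%s=%s" % (key, nvargs[key]) for key in sorted(nvargs))
    return cmd
```

For example, build_command("SystemML.jar", "test.dml", nvargs={"X": "data/X"})
returns a list that can be passed to subprocess.call on the cluster's
driver node.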

Regards,
Matthias

On Sun, May 21, 2017 at 10:17 PM, Krishna Kalyan <[email protected]>
wrote:

> Gentle ping to Frederick Reiss, Mike Dusenberry, Arvind Surve, Niketan
> Pansare, Felix Schüler, Deron Eriksson, Nakul Jindal and Matthias Boehm.
> (Apologies if I missed anyone.)
>
> I would really appreciate your feedback on this.
>
> Regards,
> Krishna
>
>
> On Sun, May 14, 2017 at 2:47 AM, Krishna Kalyan <[email protected]>
> wrote:
>
> > Hello All,
> >
> > Question 1:
> >
> > Currently we have 5 backend modes.
> > (standalone, hadoop, spark, hybrid and spark_hybrid)
> > I would like to know whether we need performance tests for all 5
> > backends. I noticed that earlier we had perftests only for MapReduce
> > and Spark mode.
> > https://github.com/apache/incubator-systemml/tree/master/scripts/perftest
> >
> > Question 2:
> >
> > Would calling scripts from the bin folder to run performance tests be a
> > good idea?
> > https://github.com/apache/incubator-systemml/tree/master/bin
> > (from python files like systemml-standalone.py, systemml-spark-submit.py
> > etc)
> >
> > Will that be a good approach? Based on some suggestions on this JIRA, I
> > understand that creating an API would be another option.
> >
> > Please share your thoughts.
> >
> > Regards,
> > Krishna
> >
> >
> >
>
