[ 
https://issues.apache.org/jira/browse/SYSTEMML-1451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15985365#comment-15985365
 ] 

Nakul Jindal commented on SYSTEMML-1451:
----------------------------------------

Hi [~KrishnaKalyan3], which test is this, and which data size (800MB, 8GB, ...)?
Also, try to stay within the constraints of your hardware (2 cores, 8GB RAM).
{code}
jvm_args: -Xmx20G -Xms20g -Xmn2g 
java_command: org.apache.spark.deploy.SparkSubmit --master yarn-client --conf 
spark.executor.memory="-Xms50g" --conf spark.driver.memory=20G --conf 
spark.akka.frameSize=128 --conf spark.driver.maxResultSize=0 --conf 
spark.memory.useLegacyMode=true --conf spark.rpc.askTimeout=6000s --conf 
spark.network.timeout=6000s --conf spark.executor.extraJavaOptions="-Xmn5500m" 
--conf spark.yarn.executor.memoryOverhead=8250 --conf 
spark.files.useFetchCache=false --conf spark.driver.extraJavaOptions=-Xms20g 
-Xmn2g --num-executors 5 --executor-memory 60G --executor-cores 24 
./SystemML.jar -f extractTestData.dml -exec hybrid_spark -args 
my_test_data/binomial/X10k_1k_sparse my_test_data/binomial/y10k_1k_sparse 
my_test_data/binomial/X10k_1k_sparse_test 
my_test_data/binomial/y10k_1k_sparse_test binary

{code}

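The command above asks for 5 executors with 24 cores and 60G each, plus a 20G 
driver, which is far more than 2 cores and 8GB of RAM can provide. Keep the 
number of executors, the number of executor cores, the memory settings, etc. 
small enough to fit on your machine.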
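
As a rough way to sanity-check a configuration before submitting it, here is a minimal sketch in Python. The function name `check_fits` and its parameters are hypothetical (they are not part of SystemML or Spark), and it assumes that everything, driver included, runs on one machine and ignores memory needed by the OS. The small configuration in the usage lines is only an illustrative assumption, not a recommended setting.

```python
def check_fits(total_cores, total_mem_gb, num_executors, executor_cores,
               executor_mem_gb, driver_mem_gb, overhead_mb=0):
    """Return a list of problems; an empty list means the request fits.

    Hypothetical helper: assumes the driver and all executors share one
    machine with `total_cores` cores and `total_mem_gb` GB of RAM.
    """
    problems = []

    # Each executor claims its own cores.
    cores_needed = num_executors * executor_cores

    # Each executor claims its heap plus the YARN memory overhead (given in MB);
    # the driver heap is added once.
    mem_needed_gb = (num_executors * (executor_mem_gb + overhead_mb / 1024.0)
                     + driver_mem_gb)

    if cores_needed > total_cores:
        problems.append("cores: requested %d, available %d"
                        % (cores_needed, total_cores))
    if mem_needed_gb > total_mem_gb:
        problems.append("memory: requested %.1f GB, available %.1f GB"
                        % (mem_needed_gb, total_mem_gb))
    return problems


# Values taken from the command above, checked against 2 cores / 8GB RAM.
for problem in check_fits(2, 8, num_executors=5, executor_cores=24,
                          executor_mem_gb=60, driver_mem_gb=20,
                          overhead_mb=8250):
    print(problem)

# An illustrative (assumed) smaller request that fits on the same machine.
print(check_fits(2, 8, num_executors=1, executor_cores=1,
                 executor_mem_gb=2, driver_mem_gb=2, overhead_mb=384))
```

With the values from the command above, both the core check and the memory check report a problem; the smaller request returns an empty list.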

> Automate performance testing and reporting
> ------------------------------------------
>
>                 Key: SYSTEMML-1451
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1451
>             Project: SystemML
>          Issue Type: Improvement
>          Components: Infrastructure, Test
>            Reporter: Nakul Jindal
>              Labels: gsoc2017, mentor, performance, reporting, testing
>
> As part of a release (and in general), performance tests are run for SystemML.
> Currently, running and reporting on these performance tests is a manual 
> process. There are helper scripts, but largely the process is manual.
> The aim of this GSoC 2017 project is to automate performance testing and its 
> reporting.
> These are the tasks that this entails:
> 1. Automate running of the performance tests, including generation of test 
> data
> 2. Detect errors and report if any
> 3. Record performance benchmarking information
> 4. Automatically compare this performance to the previous version to check for 
> performance regressions
> 5. Automatically compare to Spark MLlib, R?, Julia?
> 6. Prepare report with all the information about failed jobs, performance 
> information, perf info against other comparable projects/algorithms 
> (plotted/in plain text in CSV, PDF or other common format)
> 7. Create scripts to automatically run this process on a cloud provider that 
> spins up machines, runs the test, saves the reports and spins down the 
> machines.
> 8. Create a web application to do this interactively without dropping down 
> into a shell.
> As part of this project, the student will need to know scripting (in Bash, 
> Python, etc.). It may also involve changing error reporting and performance 
> reporting code in SystemML. 
> Rating - Medium (for the amount of work)
> Mentor - [~nakul02] (Other co-mentors will join in)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
