[3/4] systemml git commit: [SYSTEMML-1451][Phase3] phase 3 work

deron Mon, 28 Aug 2017 10:37:34 -0700

[SYSTEMML-1451][Phase3] phase 3 work

- Offline CSV support
- Family bug fix
- Plots
- Doc Update
- Stats update
- Bug train, predict append family name


Closes #604


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/fdc2be22
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/fdc2be22
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/fdc2be22

Branch: refs/heads/gh-pages
Commit: fdc2be22b66151f798a6e2b5be439bd616d24494
Parents: 07bd40a
Author: krishnakalyan3 <krishnakaly...@gmail.com>
Authored: Sat Aug 26 11:52:59 2017 -0700
Committer: Nakul Jindal <naku...@gmail.com>
Committed: Sat Aug 26 11:52:59 2017 -0700

----------------------------------------------------------------------
 python-performance-test.md | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/systemml/blob/fdc2be22/python-performance-test.md
----------------------------------------------------------------------
diff --git a/python-performance-test.md b/python-performance-test.md
index ce36c2d..25e1f35 100644
--- a/python-performance-test.md
+++ b/python-performance-test.md
@@ -148,6 +148,17 @@ Run performance test for all algorithms under the family 
`regression2` and log w
 Run performance test for all algorithms using HDFS.
 
 
+## Result Consolidation and Plotting
+Two scripts are provided: `stats.py` for pulling results from Google Docs, and `update.py` for updating results in Google Docs or on the local file system.
+
+An example invocation of `update.py`:
+`./scripts/perftest/python/google_docs/update.py --file ../../temp/perf_test_singlenode.out --exec-type singlenode --tag 2 --append test.csv`
+The arguments are: `--file`, the path to the perf-test output; `--exec-type`, the execution mode used to generate the perf-test output; `--tag`, the release version or a unique name; and `--append`, an optional argument that appends the results to a local CSV file. If `--auth` is used instead of `--append`, it needs the location of the Google API key file.
+
+An example invocation of `stats.py`:
+`./stats.py --auth ../key/client_json.json --exec-type singlenode --plot stats1_data-gen_none_dense_10k_100`
+The `--plot` argument takes the name of the composite key whose results you would like to compare. If this argument is not specified, the results are grouped by key.
+
 ## Operational Notes
 
 All performance tests depend mainly on two scripts for execution: `systemml-standalone.py` and `systemml-spark-submit.py`. If standalone or Spark parameters need to change, they must be edited manually in the respective scripts.
@@ -158,7 +169,7 @@ The logs contain the following information below comma 
separated.
 
 algorithm | run_type | intercept | matrix_type | data_shape | time_sec
 --- | --- | --- | --- | --- | --- |
-multinomial|data-gen|0|dense|10k_100| 0.33
+multinomial|data-gen|0|10k_100|dense| 0.33
 MultiLogReg|train|0|10k_100|dense|6.956
 MultiLogReg|predict|0|10k_100|dense|4.780
 
@@ -187,9 +198,12 @@ Matrix Shape | Approximate Data Size
 10M_1k|80GB
 100M_1k|800GB
 
+
 For example, the command below runs the performance test for all data sizes described above:
 `run_perftest.py --family binomial clustering multinomial regression1 
regression2 stats1 stats2 --mat-shape 10k_1k 100k_1k 1M_1k 10M_1k 100M_1k 
--master yarn-client  --temp-dir hdfs://localhost:9000/user/systemml`
 
+By default, data generated in `hybrid_spark` execution mode is stored in the current user's `hdfs` home directory.
+
 Note: Run `pip3 install -r requirements.txt` before using the perftest scripts.
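
The approximate data sizes listed for the matrix shapes in the diff above (`10M_1k` at 80GB, `100M_1k` at 800GB) are consistent with dense matrices of 8-byte double-precision cells: 10^7 rows x 10^3 columns x 8 bytes = 80 GB. The following is a minimal sketch of that arithmetic. The helper names `parse_dim`, `parse_shape`, and `approx_dense_size_gb` are hypothetical and are not part of the perftest scripts, and the 8-byte cell size is an assumption rather than something the commit states.

```python
# Hypothetical helpers for illustration only; not part of the SystemML perftest scripts.

# Multipliers for the suffixes used in shape strings such as "10k_100" or "10M_1k".
SUFFIXES = {"k": 10**3, "M": 10**6}


def parse_dim(token):
    """Convert a token such as '10k', '100M', or '100' to an integer count."""
    if token[-1] in SUFFIXES:
        return int(token[:-1]) * SUFFIXES[token[-1]]
    return int(token)


def parse_shape(shape):
    """Split a 'rows_cols' shape string into a (rows, cols) tuple of integers."""
    rows, cols = shape.split("_")
    return parse_dim(rows), parse_dim(cols)


def approx_dense_size_gb(shape, bytes_per_cell=8):
    """Estimate dense matrix size in GB (10^9 bytes), assuming 8-byte cells."""
    rows, cols = parse_shape(shape)
    return rows * cols * bytes_per_cell / 10**9


if __name__ == "__main__":
    for shape in ("10M_1k", "100M_1k"):
        print(shape, approx_dense_size_gb(shape), "GB")
```

Such an estimate can help when deciding which `--mat-shape` values fit on a given cluster before launching `run_perftest.py`; sparse data would occupy less space than this dense upper estimate.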
