Re: Performance Test: Phase-2 Release

2017-08-02 Thread Deron Eriksson
Hi Krishna,

First off, congratulations on your great GSoC contributions to SystemML!

WRT Phase 3 (c), Mike has created an epic
(https://issues.apache.org/jira/browse/SYSTEMML-493) to address
functionalizing the algorithms. Feel free to choose a DML algorithm (or
algorithms) and add it to the epic. Gus is currently working on this for
Kmeans (it's listed as an issue in the epic).

Also, please note that Imran has some good recommendations in
https://issues.apache.org/jira/browse/SYSTEMML-1667. We may want to create
a new version of the scripts using the folder structure recommended by
Imran.

Thanks!
Deron


On Wed, Aug 2, 2017 at 7:27 AM, Krishna Kalyan wrote:

> Dear SystemML Community,
> Thank you so much for your support and guidance so far.
>
> Our Phase-2 release includes:
> - HDFS support
> - Support for uploading perf-test results automatically to Google Docs and
> automatically summarising data from Google Docs between two versions.
> - Support for additional arguments in our perf test suite, such as debug,
> stats, config, master, etc.
>
> I would really appreciate it if the community could use / test our
> performance test suite and share their valuable feedback. (Please refer to
> the documentation below to get started with the performance test suite.
> Also, if something is unclear, please let me know.)
>
> Documentation
> https://github.com/apache/systemml/blob/master/docs/python-performance-test.md
>
> About Google Docs:
> We need an API client key to upload our performance test results to Google
> Docs. (Currently I am using my personal email account and personal API
> client key.) Should a separate Gmail account be created for the SystemML
> community, with access to Google Docs and the API client key shared with
> committers? (Results from my machine can be found at the URL at the end of
> this mail.)
>
> Phase 3:
> We have another 4 weeks before GSoC ends. Some tasks I will be working on:
> a) Offline CSV support in addition to Google Docs
> b) Plots comparing algorithm performance across different releases
> c) I would also be interested in refactoring / improving existing DML
> scripts. (@Deron, could you please let me know how I could begin
> contributing here?)
> d) I would also like to know about other features the community would like
> us to build / add in our performance test suite. (We will definitely do our
> best to work on them.)
>
> Again, thank you so much for your time and effort.
>
> Cheers,
> Krishna
>
>
> Results obtained on my local system can be found below:
> https://docs.google.com/spreadsheets/d/1sIAPKGFph_X64ZiKMNc16IS5XiDcM23evelzBF3uwgM/edit#gid=0
>
> My System Config
> 16 GB RAM
> OSX
> 2.5 GHz, Intel i5
>
> [Phase1 PR]
> https://github.com/apache/systemml/pull/537
>
> [Phase2 PR]
> https://github.com/apache/systemml/pull/575
>


Performance Test: Phase-2 Release

2017-08-02 Thread Krishna Kalyan
Dear SystemML Community,
Thank you so much for your support and guidance so far.

Our Phase-2 release includes:
- HDFS support
- Support for uploading perf-test results automatically to Google Docs and
automatically summarising data from Google Docs between two versions.
- Support for additional arguments in our perf test suite, such as debug,
stats, config, master, etc.

I would really appreciate it if the community could use / test our
performance test suite and share their valuable feedback. (Please refer to
the documentation below to get started with the performance test suite.
Also, if something is unclear, please let me know.)

Documentation
https://github.com/apache/systemml/blob/master/docs/python-performance-test.md

About Google Docs:
We need an API client key to upload our performance test results to Google
Docs. (Currently I am using my personal email account and personal API
client key.) Should a separate Gmail account be created for the SystemML
community, with access to Google Docs and the API client key shared with
committers? (Results from my machine can be found at the URL at the end of
this mail.)

Phase 3:
We have another 4 weeks before GSoC ends. Some tasks I will be working on:
a) Offline CSV support in addition to Google Docs
b) Plots comparing algorithm performance across different releases
c) I would also be interested in refactoring / improving existing DML
scripts. (@Deron, could you please let me know how I could begin
contributing here?)
d) I would also like to know about other features the community would like
us to build / add in our performance test suite. (We will definitely do our
best to work on them.)

Again, thank you so much for your time and effort.

Cheers,
Krishna


Results obtained on my local system can be found below:
https://docs.google.com/spreadsheets/d/1sIAPKGFph_X64ZiKMNc16IS5XiDcM23evelzBF3uwgM/edit#gid=0

My System Config
16 GB RAM
OSX
2.5 GHz, Intel i5

[Phase1 PR]
https://github.com/apache/systemml/pull/537

[Phase2 PR]
https://github.com/apache/systemml/pull/575