+1 for testing our actual scripts (rather than the simplified test versions) against some metric of choice. This will allow us to ensure (1) that each script does not have a showstopper bug (engine bug), and (2) that each script still produces a reasonable mathematical result (no math bug).
-Mike

--
Mike Dusenberry
GitHub: github.com/dusenberrymw
LinkedIn: linkedin.com/in/mikedusenberry

Sent from my iPhone.

> On Feb 17, 2017, at 4:17 PM, Niketan Pansare <npan...@us.ibm.com> wrote:
>
> For now, I have updated our Python mllearn tests to compare the predictions
> of our algorithms to those of scikit-learn:
> https://github.com/apache/incubator-systemml/blob/master/src/main/python/tests/test_mllearn_numpy.py#L81
>
> The test now uses scikit-learn predictions as the baseline and computes the
> scores (accuracy score for classifiers and r2 score for regressors). If the
> score is greater than 95%, the test passes. Although this approach does not
> measure the generalization capability of our algorithms, it at least
> ensures that our algorithms perform no worse than scikit-learn under
> default settings. We can make the testing even more rigorous later. The
> next step would be to enable these Python tests through Jenkins.
>
> Thanks,
>
> Niketan Pansare
> IBM Almaden Research Center
> E-mail: npansar At us.ibm.com
> http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
>
> From: Matthias Boehm <mboe...@googlemail.com>
> To: dev@systemml.incubator.apache.org
> Date: 02/17/2017 11:54 AM
> Subject: Re: Proposal to add 'accuracy test suite' before 1.0 release
>
> Yes, this has been discussed a couple of times now, most recently in
> SYSTEMML-546. It takes quite some effort though to create a
> sophisticated algorithm-level test suite as done for GLM. So by all
> means, please, go ahead and add these tests.
>
> However, I would not impose any constraints on the contribution of new
> algorithms in that regard, or similarly on tests with simplified
> algorithms, because it would raise the bar too high.
>
> Regards,
> Matthias
>
> On 2/17/2017 10:48 AM, Niketan Pansare wrote:
> >
> > Hi all,
> >
> > We currently test the correctness of individual runtime operators using
> > our integration tests, but not the "released" algorithms. To be fair, we
> > do test a subset of "simplified" algorithms on synthetic datasets and
> > compare the accuracy with R. Also, we are testing a subset of released
> > algorithms using our Python tests, but their intended purpose is only to
> > test the integration of the APIs:
> >
> > Simplified algorithms:
> > https://github.com/apache/incubator-systemml/tree/master/src/test/scripts/applications
> > Released algorithms:
> > https://github.com/apache/incubator-systemml/tree/master/scripts/algorithms
> > Python tests:
> > https://github.com/apache/incubator-systemml/tree/master/src/main/python/tests
> >
> > Though a released algorithm is tested when it is initially introduced,
> > other changes (Spark versions, API changes, engine improvements, etc.)
> > could cause it to return incorrect results over time. Therefore, similar
> > to our performance test suite
> > (https://github.com/apache/incubator-systemml/tree/master/scripts/perftest),
> > I propose we create another test suite ("accuracy test suite" for lack of
> > a better term) that compares the accuracy (or some other metric) of our
> > released algorithms on standard datasets. Making it a requirement to add
> > tests to the accuracy test suite when adding a new algorithm will greatly
> > improve the production-readiness of SystemML and also serve as a usage
> > guide. This implies we run both the performance and the accuracy test
> > suites before our release. The alternative is to replace the simplified
> > algorithms with our released algorithms.
> >
> > Advantages of the accuracy test suite approach:
> > 1. No increase in the running time of integration tests on Jenkins.
> > 2. Accuracy test suite could use much larger datasets.
> > 3. Accuracy test suite could include algorithms that take longer to
> >    converge (for example: Deep Learning algorithms).
> >
> > Advantage of replacing simplified algorithms:
> > 1. No commit breaks any of the existing algorithms.
> >
> > Thanks,
> >
> > Niketan Pansare
> > IBM Almaden Research Center
> > E-mail: npansar At us.ibm.com
> > http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
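
For readers of the archive, the baseline comparison described in the 4:17 PM message above (scikit-learn predictions as the reference, accuracy score for classifiers, r2 score for regressors, pass if the score is greater than 95%) might look roughly like the sketch below. This is only an illustration under stated assumptions and not the code in test_mllearn_numpy.py: the helpers `accuracy`, `r2`, and `agrees_with_baseline` are hypothetical names, written in plain Python as stand-ins for scikit-learn's `accuracy_score` and `r2_score` so the sketch runs without extra dependencies, and the prediction lists are placeholders rather than real model output.

```python
# Hypothetical sketch: treat one set of predictions as the baseline and
# check that a second set of predictions agrees with it closely enough.

THRESHOLD = 0.95  # "greater than 95%" pass criterion from the thread


def accuracy(baseline, predicted):
    """Fraction of predicted labels that exactly match the baseline labels."""
    if len(baseline) != len(predicted) or not baseline:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(1 for b, p in zip(baseline, predicted) if b == p)
    return matches / len(baseline)


def r2(baseline, predicted):
    """Coefficient of determination, with the baseline as the reference."""
    if len(baseline) != len(predicted) or not baseline:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(baseline) / len(baseline)
    ss_res = sum((b - p) ** 2 for b, p in zip(baseline, predicted))
    ss_tot = sum((b - mean) ** 2 for b in baseline)
    if ss_tot == 0:
        # Constant baseline: only an exact match counts as agreement.
        return 1.0 if ss_res == 0 else 0.0
    return 1.0 - ss_res / ss_tot


def agrees_with_baseline(baseline, predicted, is_classifier):
    """Return True if the score against the baseline exceeds THRESHOLD."""
    score = accuracy(baseline, predicted) if is_classifier else r2(baseline, predicted)
    return score > THRESHOLD


if __name__ == "__main__":
    # Placeholder predictions standing in for the two libraries' outputs.
    baseline_labels = [0, 1, 1, 0] * 5
    our_labels = list(baseline_labels)
    print(agrees_with_baseline(baseline_labels, our_labels, is_classifier=True))

    baseline_values = [1.0, 2.0, 3.0, 4.0]
    our_values = [1.1, 1.9, 3.1, 3.9]
    print(agrees_with_baseline(baseline_values, our_values, is_classifier=False))
```

As noted in the thread, a check like this only measures agreement with the baseline's predictions under default settings; it does not measure how well either model generalizes.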