+1 for testing our actual scripts (vs. the simplified test versions) against some 
metric of choice.  This will allow us to (1) ensure that each script does not 
have a showstopper bug (engine bug), and (2) ensure that each script still 
produces a reasonable mathematical result (math bug).
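
In sketch form, with any sklearn-style estimator wrapping one of the released 
scripts (the dataset, split, helper name, and 0.9 threshold below are only 
illustrative assumptions), such a check could look like:

    from sklearn import datasets, metrics

    def check_released_script(estimator, threshold=0.9):
        # Any small standard dataset would do for a sanity check.
        digits = datasets.load_digits()
        X, y = digits.data, digits.target
        X_train, X_test = X[:1200], X[1200:]
        y_train, y_test = y[:1200], y[1200:]

        # (1) Engine check: the actual released script must run end-to-end
        #     without raising (no compilation/runtime showstoppers).
        model = estimator.fit(X_train, y_train)
        y_pred = model.predict(X_test)

        # (2) Math check: the result must clear a sanity threshold on the
        #     chosen metric.
        score = metrics.accuracy_score(y_test, y_pred)
        assert score >= threshold, "suspicious accuracy: %f" % score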

-Mike

--

Mike Dusenberry
GitHub: github.com/dusenberrymw
LinkedIn: linkedin.com/in/mikedusenberry

Sent from my iPhone.


> On Feb 17, 2017, at 4:17 PM, Niketan Pansare <npan...@us.ibm.com> wrote:
> 
> For now, I have updated our python mllearn tests to compare the predictions of 
> our algorithms to those of scikit-learn: 
> https://github.com/apache/incubator-systemml/blob/master/src/main/python/tests/test_mllearn_numpy.py#L81
> 
> The test now uses the scikit-learn predictions as the baseline and computes the 
> scores (accuracy score for classifiers and r2 score for regressors). If the 
> score is greater than 95%, the test passes. Though this approach does not 
> measure the generalization capability of our algorithms, it at least ensures 
> that they perform no worse than scikit-learn under the default settings. We can 
> make the testing even more rigorous later. The next step would be to enable 
> these python tests through Jenkins.
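> 
> As a rough sketch of that comparison (the classifier passed in stands for an 
> mllearn estimator such as LogisticRegression; the dataset, split, and function 
> name here are only illustrative assumptions, not the code of the linked test):
> 
>     from sklearn import datasets, linear_model, metrics
> 
>     def compare_to_sklearn_baseline(sml_classifier, threshold=0.95):
>         # A small standard dataset and a fixed train/test split.
>         digits = datasets.load_digits()
>         X, y = digits.data, digits.target
>         X_train, X_test = X[:1200], X[1200:]
>         y_train = y[:1200]
> 
>         # Baseline predictions from scikit-learn under default settings.
>         baseline = linear_model.LogisticRegression().fit(X_train, y_train)
>         y_base = baseline.predict(X_test)
> 
>         # Predictions from the sklearn-style mllearn wrapper under test.
>         y_ours = sml_classifier.fit(X_train, y_train).predict(X_test)
> 
>         # The scikit-learn predictions serve as labels; a regressor test
>         # would use metrics.r2_score instead of accuracy_score.
>         score = metrics.accuracy_score(y_base, y_ours)
>         assert score > threshold, "score %f below threshold" % score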
> 
> Thanks,
> 
> Niketan Pansare
> IBM Almaden Research Center
> E-mail: npansar At us.ibm.com
> http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
> 
> From: Matthias Boehm <mboe...@googlemail.com>
> To: dev@systemml.incubator.apache.org
> Date: 02/17/2017 11:54 AM
> Subject: Re: Proposal to add 'accuracy test suite' before 1.0 release
> 
> 
> 
> 
> Yes, this has been discussed a couple of times now, most recently in 
> SYSTEMML-546. It takes quite some effort though to create a 
> sophisticated algorithm-level test suite as done for GLM. So by all 
> means, please, go ahead and add these tests.
> 
> However, I would not impose any such constraints on the contribution of new 
> algorithms, or similarly on tests with simplified algorithms, because it 
> would raise the bar too high.
> 
> Regards,
> Matthias
> 
> 
> On 2/17/2017 10:48 AM, Niketan Pansare wrote:
> >
> >
> > Hi all,
> >
> > We currently test the correctness of individual runtime operators using our
> > integration tests, but not the "released" algorithms. To be fair, we do test
> > a subset of "simplified" algorithms on synthetic datasets and compare the
> > accuracy with R. We also test a subset of the released algorithms using
> > our Python tests, but their intended purpose is only to test the integration
> > of the APIs:
> > Simplified algorithms:
> > https://github.com/apache/incubator-systemml/tree/master/src/test/scripts/applications
> > Released algorithms:
> > https://github.com/apache/incubator-systemml/tree/master/scripts/algorithms
> > Python tests:
> > https://github.com/apache/incubator-systemml/tree/master/src/main/python/tests
> >
> > Though a released algorithm is tested when it is initially introduced,
> > other factors (Spark versions, API changes, engine improvements, etc.)
> > could cause it to return incorrect results over time.
> > Therefore, similar to our performance test suite (
> > https://github.com/apache/incubator-systemml/tree/master/scripts/perftest),
> > I propose we create another test suite ("accuracy test suite", for lack of a
> > better term) that compares the accuracy (or some other metric) of our
> > released algorithms on standard datasets. Making it a requirement to add
> > tests to the accuracy test suite when adding a new algorithm will greatly
> > improve the production-readiness of SystemML and also serve as a usage
> > guide. This implies we run both the performance and the accuracy test
> > suites before each release. An alternative is to replace the simplified
> > algorithms with our released algorithms.
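> >
> > As a very rough sketch, each entry in such a suite could pair a released
> > script with a standard dataset, a metric, and a minimum acceptable score
> > (the helper names, datasets, and numbers below are placeholder assumptions,
> > not measured results):
> >
> >     # Hypothetical suite definition:
> >     # (released script, standard dataset, metric, minimum acceptable score).
> >     ACCURACY_SUITE = [
> >         ("scripts/algorithms/MultiLogReg.dml", "mnist",   "accuracy", 0.90),
> >         ("scripts/algorithms/LinearRegCG.dml", "airline", "r2",       0.70),
> >     ]
> >
> >     def run_accuracy_suite(run_script, load_dataset):
> >         # `run_script` executes a released script on a dataset and returns
> >         # the requested metric; `load_dataset` fetches a standard dataset.
> >         # Both are assumed helpers that would still have to be written.
> >         failures = []
> >         for script, dataset, metric, minimum in ACCURACY_SUITE:
> >             score = run_script(script, load_dataset(dataset), metric)
> >             if score < minimum:
> >                 failures.append((script, dataset, metric, score, minimum))
> >         return failures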
> >
> > Advantages of accuracy test suite approach:
> > 1. No increase in the running time of integration tests on Jenkins.
> > 2. Accuracy test suite could use much larger datasets.
> > 3. Accuracy test suite could include algorithms that take longer to
> > converge (for example: Deep Learning algorithms).
> >
> > Advantage of replacing simplified algorithms:
> > 1. Ensures that no commit breaks any of the existing algorithms.
> >
> > Thanks,
> >
> > Niketan Pansare
> > IBM Almaden Research Center
> > E-mail: npansar At us.ibm.com
> > http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
> >
> 