[ 
https://issues.apache.org/jira/browse/SYSTEMML-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15962036#comment-15962036
 ] 

Glenn Weidner commented on SYSTEMML-1474:
-----------------------------------------

Thank you Niketan for the fix/merge and Matthias for the corresponding PR review.  
I built a 0.15 snapshot distribution in my local development environment from 
the latest master, which includes the above commit.  I updated my target test 
environment with the new version and reran test_mllearn_numpy.py.  The index 
out of bounds error no longer occurred and test_naive_bayes1 succeeded.  I did 
see a failure in test_svm, but my understanding/recollection is that this is 
more of a test case issue where the expected accuracy may be too high.

{code}
17/04/08 21:24:14 INFO DAGScheduler: Job 37 finished: collect at SparkExecutionContext.java:796, took 0.014786 s
F
======================================================================
FAIL: test_svm (__main__.TestMLLearn)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/ambari-qa/test_mllearn_numpy.py", line 157, in test_svm
    self.failUnless(accuracy_score(sklearn_predicted, mllearn_predicted) > 0.95 )
AssertionError: False is not true

----------------------------------------------------------------------
Ran 7 tests in 46.610s

FAILED (failures=1)
{code}

Marking this JIRA as done/verified, but let me know if I'm mistaken regarding 
SVM accuracy.
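
For what it's worth, the "AssertionError: False is not true" message above 
hides the accuracy that was actually measured, which makes it hard to judge 
whether 0.95 is too strict. Below is a minimal sketch of a check that reports 
the measured value on failure. The accuracy and check_accuracy helpers and the 
sample labels are hypothetical stand-ins, not taken from 
test_mllearn_numpy.py; inside a unittest test method, 
self.assertGreater(acc, 0.95) should have a similar effect.

```python
def accuracy(expected, predicted):
    """Fraction of positions where the two label sequences agree."""
    matches = sum(1 for e, p in zip(expected, predicted) if e == p)
    return matches / float(len(expected))


def check_accuracy(expected, predicted, threshold=0.95):
    """Fail with a message that names the measured accuracy."""
    acc = accuracy(expected, predicted)
    # Include the measured value so a failure is self-explanatory.
    assert acc > threshold, "accuracy %.4f is not above %.2f" % (acc, threshold)
    return acc


# Hypothetical labels: 9 of 10 agree, so the accuracy is 0.9.
expected = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0]
predicted = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1]

try:
    check_accuracy(expected, predicted)
except AssertionError as err:
    print(err)
```

With output like this in the test log, it would be easier to decide whether 
the threshold or the model needs adjusting.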

> Index out of bounds error in test_naive_bayes1 of test_mllearn_numpy.py
> -----------------------------------------------------------------------
>
>                 Key: SYSTEMML-1474
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1474
>             Project: SystemML
>          Issue Type: Bug
>            Reporter: Glenn Weidner
>            Assignee: Niketan Pansare
>            Priority: Minor
>             Fix For: SystemML 0.14
>
>
> The following error was observed running the python tests from command line 
> with spark-submit:
> {code}
> ======================================================================
> ERROR: test_naive_bayes1 (__main__.TestMLLearn)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/home/spark/test_mllearn_numpy.py", line 184, in test_naive_bayes1
>     mllearn_predicted = nb.fit(vectors, newsgroups_train.target).predict(vectors_test)
>   File "/usr/lib/python2.7/site-packages/systemml/mllearn/estimators.py", line 142, in fit
>     self.fit_numpy(X, y)
>   File "/usr/lib/python2.7/site-packages/systemml/mllearn/estimators.py", line 95, in fit_numpy
>     self._fit_numpy()
>   File "/usr/lib/python2.7/site-packages/systemml/mllearn/estimators.py", line 88, in _fit_numpy
>     self.model = self.estimator.fit(convertToMatrixBlock(self.sc, self.X), y_mb)
>   File "/usr/lib/python2.7/site-packages/systemml/converters.py", line 106, in convertToMatrixBlock
>     [ _copyRowBlock(i, sc, ret, src, numRowsPerBlock,  rlen, clen) for i in range(0, src.shape[0], numRowsPerBlock) ]
>   File "/usr/lib/python2.7/site-packages/systemml/converters.py", line 83, in _copyRowBlock
>     mb = _convertSPMatrixToMB(sc, src[i:i+numRowsPerBlock,]) if isinstance(src, spmatrix) else _convertDenseMatrixToMB(sc, src[i:i+numRowsPerBlock,])
>   File "/usr/lib64/python2.7/site-packages/scipy/sparse/csr.py", line 304, in __getitem__
>     return self._get_submatrix(row, col)
>   File "/usr/lib64/python2.7/site-packages/scipy/sparse/csr.py", line 447, in _get_submatrix
>     check_bounds(i0, i1, M)
>   File "/usr/lib64/python2.7/site-packages/scipy/sparse/csr.py", line 443, in check_bounds
>     " %d <= %d" % (i0, num, i1, num, i0, i1))
> IndexError: index out of bounds: 0 <= 2030 <= 2034, 0 <= 2059 <= 2034, 2030 <= 2059
> {code}
> The IndexError was first observed when running the test under a Notebook 
> cloud environment with Spark 2.0.2, then reproduced at command line on local 
> system with Spark 2.1.
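
The numbers in the IndexError are consistent with the last row block running 
past the end of the matrix: the slice starts at row 2030 and ends at 2059, 
which implies a block size of 29, while the matrix has only 2034 rows. A 
minimal sketch of clamping the end of each block follows, in plain Python with 
no SciPy dependency. row_block_bounds is a hypothetical helper used for 
illustration, and this is not necessarily how the committed fix works.

```python
def row_block_bounds(num_rows, rows_per_block):
    """Yield (start, stop) row ranges that never exceed num_rows."""
    for start in range(0, num_rows, rows_per_block):
        # Clamp the end so the final, shorter block stays in bounds.
        yield start, min(start + rows_per_block, num_rows)


# Values read off the traceback: 2034 rows, blocks of 29 rows.
blocks = list(row_block_bounds(2034, 29))
print(blocks[-1])  # the final block stops at the last row
```

Each (start, stop) pair could then be used for a slice such as 
src[start:stop,] without asking for rows beyond the matrix.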



