Imran Younus created SYSTEMML-1238:
--------------------------------------

             Summary: Python test failing for LinearRegCG
                 Key: SYSTEMML-1238
                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1238
             Project: SystemML
          Issue Type: Bug
          Components: Algorithms, APIs
    Affects Versions: SystemML 0.13
            Reporter: Imran Younus
         Attachments: python_LinearReg_test_spark.1.6.log, 
python_LinearReg_test_spark.2.1.log

[~deron] discovered that one of the Python tests ({{test_mllearn_df.py}}) 
was failing with Spark 2.1.0 because the test score from linear regression was 
very low ({{~ 0.24}}). I did some investigation, and it turns out that the 
model parameters computed by the DML script are incorrect. In SystemML 0.12, the 
values of the betas from the linear regression model are {{\[152.919, 938.237\]}}. 
This is what we expect from the normal equation (I also verified this with 
sklearn). But the values of the betas from SystemML 0.13 (with Spark 2.1.0) come 
out to be {{\[153.146, 458.489\]}}. These are not correct, and therefore the test 
score is much lower than expected. The data going into the DML script is correct: 
I printed out the values of {{X}} and {{Y}} in DML and didn't see any issue there.
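
For reference, here is a minimal sketch of the kind of normal-equation check described above. It is not the code from {{test_mllearn_df.py}}: the data is synthetic, and the helper names ({{normal_equation_betas}}, {{r2_score}}) are made up for illustration. It only shows how betas can be computed independently and how a wrong slope lowers the score.

```python
import numpy as np

def normal_equation_betas(X, y):
    """Solve (A^T A) beta = A^T y, where A is X with a leading column of ones.

    Returns [intercept, slope_1, ..., slope_k]. This ordering is a choice made
    for this sketch and is not necessarily the ordering SystemML uses.
    """
    A = np.column_stack([np.ones(X.shape[0]), X])
    return np.linalg.solve(A.T @ A, A.T @ y)

def r2_score(X, y, betas):
    """Coefficient of determination for the linear model given by betas."""
    A = np.column_stack([np.ones(X.shape[0]), X])
    residual = y - A @ betas
    return 1.0 - (residual @ residual) / np.sum((y - y.mean()) ** 2)

# Synthetic single-feature data (an assumption; not the test's dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
y = 150.0 + 900.0 * X[:, 0] + rng.normal(scale=5.0, size=200)

betas = normal_equation_betas(X, y)

# Cross-check against NumPy's least-squares solver.
A = np.column_stack([np.ones(X.shape[0]), X])
reference = np.linalg.lstsq(A, y, rcond=None)[0]

# Halve the slope to mimic an incorrect model and compare scores.
bad_betas = betas * np.array([1.0, 0.5])
print("betas:", betas)
print("R^2 with fitted betas:", r2_score(X, y, betas))
print("R^2 with halved slope:", r2_score(X, y, bad_betas))
```

If the betas from the DML script disagree with an independent solve like this one on the same {{X}} and {{Y}}, that points at the script or the runtime rather than at the input data.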

Attached are the log files for the two tests (SystemML 0.12 and 0.13), run with 
the explain flag.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
