[ https://issues.apache.org/jira/browse/SYSTEMML-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15868468#comment-15868468 ]
Imran Younus commented on SYSTEMML-1238:
----------------------------------------

I tested the LinearRegCG.dml script with the same data set that is used in this test, and the dml script gives the correct results. Here is how I ran it:
{code}
$SPARK_HOME/bin/spark-submit --master=local --driver-memory=6g $SYSTEMML_HOME/target/SystemML.jar -f $SYSTEMML_HOME/scripts/algorithms/LinearRegCG.dml -nvargs X=/user/iyounus/data/diabetes_X_train.txt Y=/user/iyounus/data/diabetes_y_train.txt B="beta.txt" icpt=1
{code}
But if I run the python test, I get incorrect results. For completeness, here is how I'm running the test:
{code}
$SPARK_HOME/bin/spark-submit --master=local --driver-memory=6g --driver-class-path $SYSTEMML_HOME/target/SystemML.jar test_mllearn_df.py
{code}
I hope this helps.

> Python test failing for LinearRegCG
> -----------------------------------
>
>                 Key: SYSTEMML-1238
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1238
>             Project: SystemML
>          Issue Type: Bug
>          Components: Algorithms, APIs
>    Affects Versions: SystemML 0.13
>            Reporter: Imran Younus
>            Assignee: Niketan Pansare
>         Attachments: python_LinearReg_test_spark.1.6.log,
> python_LinearReg_test_spark.2.1.log
>
>
> [~deron] discovered that one of the python tests ({{test_mllearn_df.py}})
> was failing with spark 2.1.0 because the test score from linear regression
> was very low ({{~ 0.24}}). I did some investigation, and it turns out that
> the model parameters computed by the dml script are incorrect. In
> SystemML 0.12, the values of the betas from the linear regression model are
> {{\[152.919, 938.237\]}}. This is what we expect from the normal equation. (I
> also tested this with sklearn.) But the values of the betas from SystemML 0.13
> (with spark 2.1.0) come out to be {{\[153.146, 458.489\]}}. These are not
> correct, and therefore the test score is much lower than expected. The data
> going into the DML script is correct. I printed out the values of {{X}} and {{Y}}
> in dml and I didn't see any issue there.
> Attached are the log files for two different tests (SystemML 0.12 and 0.13)
> with the explain flag.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
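
The issue description compares the betas against what the normal equation gives. A cross-check of that kind might look like the sketch below. It assumes NumPy and uses synthetic, noise-free data rather than the diabetes files from the issue. The function name {{normal_equation_betas}} is hypothetical, and this is not the code the reporter or SystemML actually ran.

```python
import numpy as np


def normal_equation_betas(X, y):
    """Return [intercept, coefficients...] from the least-squares normal equation.

    Hypothetical helper for illustration only; not part of SystemML or sklearn.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    if X.ndim == 1:
        # Treat a flat vector as a single feature column
        X = X.reshape(-1, 1)
    # Prepend a column of ones so the first beta is the intercept
    # (the issue runs the DML script with an intercept, icpt=1)
    A = np.hstack([np.ones((X.shape[0], 1)), X])
    # Solve (A^T A) beta = A^T y
    return np.linalg.solve(A.T @ A, A.T @ y)


# Synthetic data with a known intercept (3.0) and slope (2.0); assumption for the demo
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 1))
y_demo = 3.0 + 2.0 * X_demo[:, 0]
betas_demo = normal_equation_betas(X_demo, y_demo)
```

Running the same function on the training data from the failing test and comparing the result with the betas reported by each SystemML version would show which of the two is off.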