[jira] [Commented] (SYSTEMML-1202) Deploy versioned documentation to main project website
[ https://issues.apache.org/jira/browse/SYSTEMML-1202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858823#comment-15858823 ]

Felix Schüler commented on SYSTEMML-1202:
-----------------------------------------

Woohoo! Thanks Deron! Now we just have to make sure that the documentation filed under the version is up-to-date with the release!

> Deploy versioned documentation to main project website
> ------------------------------------------------------
>
> Key: SYSTEMML-1202
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1202
> Project: SystemML
> Issue Type: Task
> Components: Documentation, Website
> Reporter: Deron Eriksson
> Assignee: Deron Eriksson
> Fix For: SystemML 0.13
>
>
> The main website should feature versioned documentation that appears when a
> SystemML release (at least major releases) is performed.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (SYSTEMML-894) Integrate documentation contents into main SystemML website
[ https://issues.apache.org/jira/browse/SYSTEMML-894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deron Eriksson resolved SYSTEMML-894.
------------------------------------
Resolution: Duplicate
Fix Version/s: Not Applicable

This has been resolved by SYSTEMML-1202.
Documentation page now exists at http://systemml.apache.org/documentation on main site which links to 0.12.0 docs and the latest docs.
0.12.0 docs deployed to http://systemml.apache.org/docs/0.12.0/index.html.
0.12.0 javadocs deployed to http://systemml.apache.org/docs/0.12.0/api/java/index.html.

> Integrate documentation contents into main SystemML website
> -----------------------------------------------------------
>
> Key: SYSTEMML-894
> URL: https://issues.apache.org/jira/browse/SYSTEMML-894
> Project: SystemML
> Issue Type: New Feature
> Components: Website
> Reporter: Luciano Resende
> Assignee: Deron Eriksson
> Fix For: Not Applicable
>
> Attachments: 0001-SYSTEMML-1106-WIP-Sample-new-layout.patch
>
>
> Currently the main SystemML website is hosted at Apache, while the
> documentation one is hosted at GitHub Pages.
> Integrating them will provide a more consistent look and feel and navigational
> links.
[jira] [Resolved] (SYSTEMML-1202) Deploy versioned documentation to main project website
[ https://issues.apache.org/jira/browse/SYSTEMML-1202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deron Eriksson resolved SYSTEMML-1202.
-------------------------------------
Resolution: Fixed
Fix Version/s: SystemML 0.13

Documentation page now exists on main site which links to 0.12.0 docs and the latest docs.
0.12.0 docs deployed to http://systemml.apache.org/docs/0.12.0/index.html.
0.12.0 javadocs deployed to http://systemml.apache.org/docs/0.12.0/api/java/index.html.

> Deploy versioned documentation to main project website
> ------------------------------------------------------
>
> Key: SYSTEMML-1202
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1202
> Project: SystemML
> Issue Type: Task
> Components: Documentation, Website
> Reporter: Deron Eriksson
> Assignee: Deron Eriksson
> Fix For: SystemML 0.13
>
>
> The main website should feature versioned documentation that appears when a
> SystemML release (at least major releases) is performed.
[jira] [Closed] (SYSTEMML-1202) Deploy versioned documentation to main project website
[ https://issues.apache.org/jira/browse/SYSTEMML-1202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deron Eriksson closed SYSTEMML-1202.
-----------------------------------

> Deploy versioned documentation to main project website
> ------------------------------------------------------
>
> Key: SYSTEMML-1202
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1202
> Project: SystemML
> Issue Type: Task
> Components: Documentation, Website
> Reporter: Deron Eriksson
> Assignee: Deron Eriksson
> Fix For: SystemML 0.13
>
>
> The main website should feature versioned documentation that appears when a
> SystemML release (at least major releases) is performed.
[jira] [Closed] (SYSTEMML-1205) Deploy 0.12.0 javadocs to main website
[ https://issues.apache.org/jira/browse/SYSTEMML-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deron Eriksson closed SYSTEMML-1205.
-----------------------------------

> Deploy 0.12.0 javadocs to main website
> --------------------------------------
>
> Key: SYSTEMML-1205
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1205
> Project: SystemML
> Issue Type: Sub-task
> Components: Documentation, Website
> Reporter: Deron Eriksson
> Assignee: Deron Eriksson
> Fix For: SystemML 0.13
>
>
> Generate javadocs for the previous 0.12.0 release. Deploy the javadocs to the
> main website following a folder structure similar to Spark. This should be
> committed to svn rather than first to git and then to svn.
> Sample URL: http://spark.apache.org/docs/2.1.0/api/java/index.html
[jira] [Resolved] (SYSTEMML-1205) Deploy 0.12.0 javadocs to main website
[ https://issues.apache.org/jira/browse/SYSTEMML-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deron Eriksson resolved SYSTEMML-1205.
-------------------------------------
Resolution: Fixed
Fix Version/s: SystemML 0.13

Fixed by svn commit 1782251.
Version 0.12.0 javadocs now available online at http://systemml.apache.org/docs/0.12.0/api/java/index.html

> Deploy 0.12.0 javadocs to main website
> --------------------------------------
>
> Key: SYSTEMML-1205
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1205
> Project: SystemML
> Issue Type: Sub-task
> Components: Documentation, Website
> Reporter: Deron Eriksson
> Assignee: Deron Eriksson
> Fix For: SystemML 0.13
>
>
> Generate javadocs for the previous 0.12.0 release. Deploy the javadocs to the
> main website following a folder structure similar to Spark. This should be
> committed to svn rather than first to git and then to svn.
> Sample URL: http://spark.apache.org/docs/2.1.0/api/java/index.html
[jira] [Resolved] (SYSTEMML-1204) Deploy 0.12.0 project documentation to main website
[ https://issues.apache.org/jira/browse/SYSTEMML-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deron Eriksson resolved SYSTEMML-1204.
-------------------------------------
Resolution: Fixed
Fix Version/s: SystemML 0.13

Fixed by svn commit 1782251.
Version 0.12.0 docs now available online at http://systemml.apache.org/docs/0.12.0/index.html

> Deploy 0.12.0 project documentation to main website
> ---------------------------------------------------
>
> Key: SYSTEMML-1204
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1204
> Project: SystemML
> Issue Type: Sub-task
> Components: Documentation, Website
> Reporter: Deron Eriksson
> Assignee: Deron Eriksson
> Fix For: SystemML 0.13
>
>
> Generate documentation site for the previous 0.12.0 (make sure _config.yml
> version is correct). Deploy to main website following a folder structure
> similar to Spark. This should just be committed to svn rather than first to
> git and then to svn.
> Example Spark URLs:
> http://spark.apache.org/docs/2.1.0/
> http://spark.apache.org/docs/2.0.2/
[jira] [Updated] (SYSTEMML-1205) Deploy 0.12.0 javadocs to main website
[ https://issues.apache.org/jira/browse/SYSTEMML-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deron Eriksson updated SYSTEMML-1205:
-------------------------------------
Description:
Generate javadocs for the previous 0.12.0 release. Deploy the javadocs to the main website following a folder structure similar to Spark. This should be committed to svn rather than first to git and then to svn.

Sample URL: http://spark.apache.org/docs/2.1.0/api/java/index.html

was:
Generate javadocs for the previous 0.11.0 release. Deploy the javadocs to the main website following a folder structure similar to Spark. This should be committed to svn rather than first to git and then to svn.

Sample URL: http://spark.apache.org/docs/2.1.0/api/java/index.html

> Deploy 0.12.0 javadocs to main website
> --------------------------------------
>
> Key: SYSTEMML-1205
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1205
> Project: SystemML
> Issue Type: Sub-task
> Components: Documentation, Website
> Reporter: Deron Eriksson
> Assignee: Deron Eriksson
>
> Generate javadocs for the previous 0.12.0 release. Deploy the javadocs to the
> main website following a folder structure similar to Spark. This should be
> committed to svn rather than first to git and then to svn.
> Sample URL: http://spark.apache.org/docs/2.1.0/api/java/index.html
[jira] [Closed] (SYSTEMML-1204) Deploy 0.12.0 project documentation to main website
[ https://issues.apache.org/jira/browse/SYSTEMML-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deron Eriksson closed SYSTEMML-1204.
-----------------------------------

> Deploy 0.12.0 project documentation to main website
> ---------------------------------------------------
>
> Key: SYSTEMML-1204
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1204
> Project: SystemML
> Issue Type: Sub-task
> Components: Documentation, Website
> Reporter: Deron Eriksson
> Assignee: Deron Eriksson
> Fix For: SystemML 0.13
>
>
> Generate documentation site for the previous 0.12.0 (make sure _config.yml
> version is correct). Deploy to main website following a folder structure
> similar to Spark. This should just be committed to svn rather than first to
> git and then to svn.
> Example Spark URLs:
> http://spark.apache.org/docs/2.1.0/
> http://spark.apache.org/docs/2.0.2/
[jira] [Updated] (SYSTEMML-1204) Deploy 0.12.0 project documentation to main website
[ https://issues.apache.org/jira/browse/SYSTEMML-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deron Eriksson updated SYSTEMML-1204:
-------------------------------------
Description:
Generate documentation site for the previous 0.12.0 (make sure _config.yml version is correct). Deploy to main website following a folder structure similar to Spark. This should just be committed to svn rather than first to git and then to svn.

Example Spark URLs:
http://spark.apache.org/docs/2.1.0/
http://spark.apache.org/docs/2.0.2/

was:
Generate documentation site for the previous 0.11.0 (make sure _config.yml version is correct). Deploy to main website following a folder structure similar to Spark. This should just be committed to svn rather than first to git and then to svn.

Example Spark URLs:
http://spark.apache.org/docs/2.1.0/
http://spark.apache.org/docs/2.0.2/

> Deploy 0.12.0 project documentation to main website
> ---------------------------------------------------
>
> Key: SYSTEMML-1204
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1204
> Project: SystemML
> Issue Type: Sub-task
> Components: Documentation, Website
> Reporter: Deron Eriksson
> Assignee: Deron Eriksson
>
> Generate documentation site for the previous 0.12.0 (make sure _config.yml
> version is correct). Deploy to main website following a folder structure
> similar to Spark. This should just be committed to svn rather than first to
> git and then to svn.
> Example Spark URLs:
> http://spark.apache.org/docs/2.1.0/
> http://spark.apache.org/docs/2.0.2/
[jira] [Commented] (SYSTEMML-1203) Create main Documentation page on project website
[ https://issues.apache.org/jira/browse/SYSTEMML-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858689#comment-15858689 ]

Deron Eriksson commented on SYSTEMML-1203:
------------------------------------------

Svn revision 1782274

> Create main Documentation page on project website
> -------------------------------------------------
>
> Key: SYSTEMML-1203
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1203
> Project: SystemML
> Issue Type: Sub-task
> Components: Documentation, Website
> Reporter: Deron Eriksson
> Assignee: Deron Eriksson
> Fix For: SystemML 0.13
>
>
> The main website should feature a main Documentation page that can be used to
> easily access different versions of the project documentation.
[jira] [Closed] (SYSTEMML-1203) Create main Documentation page on project website
[ https://issues.apache.org/jira/browse/SYSTEMML-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deron Eriksson closed SYSTEMML-1203.
-----------------------------------

> Create main Documentation page on project website
> -------------------------------------------------
>
> Key: SYSTEMML-1203
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1203
> Project: SystemML
> Issue Type: Sub-task
> Components: Documentation, Website
> Reporter: Deron Eriksson
> Assignee: Deron Eriksson
> Fix For: SystemML 0.13
>
>
> The main website should feature a main Documentation page that can be used to
> easily access different versions of the project documentation.
[jira] [Resolved] (SYSTEMML-1203) Create main Documentation page on project website
[ https://issues.apache.org/jira/browse/SYSTEMML-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deron Eriksson resolved SYSTEMML-1203.
-------------------------------------
Resolution: Fixed
Fix Version/s: SystemML 0.13

Fixed by commit https://github.com/apache/incubator-systemml-website/commit/2b75d93a481e439ead339df5b5b2621ed9d149bd

> Create main Documentation page on project website
> -------------------------------------------------
>
> Key: SYSTEMML-1203
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1203
> Project: SystemML
> Issue Type: Sub-task
> Components: Documentation, Website
> Reporter: Deron Eriksson
> Assignee: Deron Eriksson
> Fix For: SystemML 0.13
>
>
> The main website should feature a main Documentation page that can be used to
> easily access different versions of the project documentation.
[jira] [Comment Edited] (SYSTEMML-1238) Python test failing for LinearRegCG
[ https://issues.apache.org/jira/browse/SYSTEMML-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858442#comment-15858442 ]

Niketan Pansare edited comment on SYSTEMML-1238 at 2/8/17 7:36 PM:
-------------------------------------------------------------------

Looks like both scripts have the same plan. This looks like an algorithm-related or repeatability issue, as the statistics after training are as follows:

python_LinearReg_test_spark.1.6.log:
{code}
||r|| initial value = 64725.64237405237, target value = 0.06472564237405237
Iteration 1: ||r|| / ||r init|| = 0.013822097249150999
Iteration 2: ||r|| / ||r init|| = 7.063617429825055E-14
The CG algorithm is done.
Computing the statistics...
938.237 152.919
AVG_TOT_Y,153.36255924170615
STDEV_TOT_Y,77.21853383600028
AVG_RES_Y,-1.081722178918495E-11
STDEV_RES_Y,63.03850633761024
DISPERSION,3973.8532812769263
PLAIN_R2,0.3351312506863876
ADJUSTED_R2,0.33354822985468857
PLAIN_R2_NOBIAS,0.3351312506863876
ADJUSTED_R2_NOBIAS,0.33354822985468857
{code}

python_LinearReg_test_spark.2.1.log:
{code}
||r|| initial value = 64725.64237405237, target value = 0.06472564237405237
Iteration 1: ||r|| / ||r init|| = 0.0137881395137
Iteration 2: ||r|| / ||r init|| = 4.3730800595678527E-14
The CG algorithm is done.
Computing the statistics...
458.489 153.146
AVG_TOT_Y,153.36255924170615
STDEV_TOT_Y,77.21853383600028
AVG_RES_Y,-6.688193969161777E-12
STDEV_RES_Y,67.06389890324985
DISPERSION,4497.566536105316
PLAIN_R2,0.24750834362605834
ADJUSTED_R2,0.24571669682516795
PLAIN_R2_NOBIAS,0.24750834362605834
ADJUSTED_R2_NOBIAS,0.24571669682516795
{code}

was (Author: niketanpansare):
Looks like both scripts have the same plan. This looks like an algorithm-related or repeatability issue, as the statistics after training are as follows:

python_LinearReg_test_spark.1.6.log:
{code}
||r|| initial value = 64725.64237405237, target value = 0.06472564237405237
Iteration 1: ||r|| / ||r init|| = 0.013822097249150999
Iteration 2: ||r|| / ||r init|| = 7.063617429825055E-14
The CG algorithm is done.
Computing the statistics...
938.237 152.919
AVG_TOT_Y,153.36255924170615
STDEV_TOT_Y,77.21853383600028
AVG_RES_Y,-1.081722178918495E-11
STDEV_RES_Y,63.03850633761024
DISPERSION,3973.8532812769263
PLAIN_R2,0.3351312506863876
ADJUSTED_R2,0.33354822985468857
PLAIN_R2_NOBIAS,0.3351312506863876
ADJUSTED_R2_NOBIAS,0.33354822985468857
{code}

python_LinearReg_test_spark.2.1.log:
||r|| initial value = 64725.64237405237, target value = 0.06472564237405237
Iteration 1: ||r|| / ||r init|| = 0.0137881395137
Iteration 2: ||r|| / ||r init|| = 4.3730800595678527E-14
The CG algorithm is done.
Computing the statistics...
458.489 153.146
AVG_TOT_Y,153.36255924170615
STDEV_TOT_Y,77.21853383600028
AVG_RES_Y,-6.688193969161777E-12
STDEV_RES_Y,67.06389890324985
DISPERSION,4497.566536105316
PLAIN_R2,0.24750834362605834
ADJUSTED_R2,0.24571669682516795
PLAIN_R2_NOBIAS,0.24750834362605834
ADJUSTED_R2_NOBIAS,0.24571669682516795

> Python test failing for LinearRegCG
> -----------------------------------
>
> Key: SYSTEMML-1238
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1238
> Project: SystemML
> Issue Type: Bug
> Components: Algorithms, APIs
> Affects Versions: SystemML 0.13
> Reporter: Imran Younus
> Attachments: python_LinearReg_test_spark.1.6.log, python_LinearReg_test_spark.2.1.log
>
>
> [~deron] discovered that one of the Python tests ({{test_mllearn_df.py}})
> with Spark 2.1.0 was failing because the test score from linear regression
> was very low ({{~ 0.24}}). I did some investigation and it turns out that
> the model parameters computed by the DML script are incorrect. In
> SystemML 0.12, the values of betas from the linear regression model are
> {{\[152.919, 938.237\]}}. This is what we expect from the normal equation. (I
> also tested this with sklearn.) But the values of betas from SystemML 0.13
> (with Spark 2.1.0) come out to be {{\[153.146, 458.489\]}}. These are not
> correct and therefore the test score is much lower than expected. The data
> going into the DML script is correct. I printed out the values of {{X}} and {{Y}}
> in DML and I didn't see any issue there.
> Attached are the log files for two different tests (SystemML 0.12 and 0.13)
> with the explain flag.
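The issue description expects the betas to match the normal-equation solution, and both logs show the CG solver finishing after two iterations for a two-parameter model. The sketch below illustrates that relationship only. It assumes a tiny synthetic dataset (not the test data), a relative tolerance of 1e-6 (the ratio of target to initial value in the logs), and hypothetical helper names; it is not the LinearRegCG DML script.

```python
# Hypothetical illustration: solve the 2x2 normal equations (X^T X) beta = X^T y
# for the model y = slope * x + intercept, once in closed form and once with
# conjugate gradient (CG). All names and data here are made up for the sketch.

def normal_equations(xs, ys):
    """Return A = X^T X and b = X^T y, where X has columns [x, 1]."""
    sxx = sum(x * x for x in xs)
    sx = sum(xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sy = sum(ys)
    return [[sxx, sx], [sx, float(len(xs))]], [sxy, sy]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def conjugate_gradient(A, b, tol=1e-6, max_iter=None):
    """Plain CG for a symmetric positive definite A, starting from x = 0."""
    n = len(b)
    max_iter = max_iter or n
    x = [0.0] * n
    r = list(b)            # residual b - A x for x = 0
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    rs_init = rs_old
    iterations = 0
    # Stop when ||r|| / ||r init|| <= tol or after max_iter steps.
    while iterations < max_iter and rs_old > (tol ** 2) * rs_init:
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
        iterations += 1
    return x, iterations

def closed_form(A, b):
    """Direct solution of the symmetric 2x2 system."""
    (a, c), (_, d) = A
    det = a * d - c * c
    return [(d * b[0] - c * b[1]) / det, (a * b[1] - c * b[0]) / det]

# Synthetic data on an exact line: slope 3, intercept 2.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [3.0 * x + 2.0 for x in xs]

A, b = normal_equations(xs, ys)
beta_direct = closed_form(A, b)
beta_cg, iterations = conjugate_gradient(A, b)
print("direct:", beta_direct)
print("cg:", beta_cg, "iterations:", iterations)
```

With two unknowns, CG needs at most two steps in exact arithmetic, so a converged two-iteration run that still disagrees with the normal-equation betas suggests that the system being solved differs, rather than that the solver stopped early.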
[jira] [Commented] (SYSTEMML-1238) Python test failing for LinearRegCG
[ https://issues.apache.org/jira/browse/SYSTEMML-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858442#comment-15858442 ]

Niketan Pansare commented on SYSTEMML-1238:
-------------------------------------------

Looks like both scripts have the same plan. This looks like an algorithm-related or repeatability issue, as the statistics after training are as follows:

python_LinearReg_test_spark.1.6.log:
||r|| initial value = 64725.64237405237, target value = 0.06472564237405237
Iteration 1: ||r|| / ||r init|| = 0.013822097249150999
Iteration 2: ||r|| / ||r init|| = 7.063617429825055E-14
The CG algorithm is done.
Computing the statistics...
938.237 152.919
AVG_TOT_Y,153.36255924170615
STDEV_TOT_Y,77.21853383600028
AVG_RES_Y,-1.081722178918495E-11
STDEV_RES_Y,63.03850633761024
DISPERSION,3973.8532812769263
PLAIN_R2,0.3351312506863876
ADJUSTED_R2,0.33354822985468857
PLAIN_R2_NOBIAS,0.3351312506863876
ADJUSTED_R2_NOBIAS,0.33354822985468857

python_LinearReg_test_spark.2.1.log:
||r|| initial value = 64725.64237405237, target value = 0.06472564237405237
Iteration 1: ||r|| / ||r init|| = 0.0137881395137
Iteration 2: ||r|| / ||r init|| = 4.3730800595678527E-14
The CG algorithm is done.
Computing the statistics...
458.489 153.146
AVG_TOT_Y,153.36255924170615
STDEV_TOT_Y,77.21853383600028
AVG_RES_Y,-6.688193969161777E-12
STDEV_RES_Y,67.06389890324985
DISPERSION,4497.566536105316
PLAIN_R2,0.24750834362605834
ADJUSTED_R2,0.24571669682516795
PLAIN_R2_NOBIAS,0.24750834362605834
ADJUSTED_R2_NOBIAS,0.24571669682516795

> Python test failing for LinearRegCG
> -----------------------------------
>
> Key: SYSTEMML-1238
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1238
> Project: SystemML
> Issue Type: Bug
> Components: Algorithms, APIs
> Affects Versions: SystemML 0.13
> Reporter: Imran Younus
> Attachments: python_LinearReg_test_spark.1.6.log, python_LinearReg_test_spark.2.1.log
>
>
> [~deron] discovered that one of the Python tests ({{test_mllearn_df.py}})
> with Spark 2.1.0 was failing because the test score from linear regression
> was very low ({{~ 0.24}}). I did some investigation and it turns out that
> the model parameters computed by the DML script are incorrect. In
> SystemML 0.12, the values of betas from the linear regression model are
> {{\[152.919, 938.237\]}}. This is what we expect from the normal equation. (I
> also tested this with sklearn.) But the values of betas from SystemML 0.13
> (with Spark 2.1.0) come out to be {{\[153.146, 458.489\]}}. These are not
> correct and therefore the test score is much lower than expected. The data
> going into the DML script is correct. I printed out the values of {{X}} and {{Y}}
> in DML and I didn't see any issue there.
> Attached are the log files for two different tests (SystemML 0.12 and 0.13)
> with the explain flag.
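The statistics in the two logs can be cross-checked against each other. The sketch below assumes that ADJUSTED_R2 equals 1 - DISPERSION / STDEV_TOT_Y^2; that relationship is inferred from the logged numbers, not taken from the DML script, and the dictionary keys and function name are hypothetical.

```python
# Values copied from the two logs quoted in the comment above.
runs = {
    "spark.1.6": {
        "stdev_tot_y": 77.21853383600028,
        "dispersion": 3973.8532812769263,
        "adjusted_r2": 0.33354822985468857,
    },
    "spark.2.1": {
        "stdev_tot_y": 77.21853383600028,
        "dispersion": 4497.566536105316,
        "adjusted_r2": 0.24571669682516795,
    },
}

def r2_from_dispersion(dispersion, stdev_tot_y):
    """Assumed relationship: 1 - residual variance / total variance."""
    return 1.0 - dispersion / (stdev_tot_y ** 2)

recomputed = {
    name: r2_from_dispersion(stats["dispersion"], stats["stdev_tot_y"])
    for name, stats in runs.items()
}

for name, stats in runs.items():
    diff = abs(recomputed[name] - stats["adjusted_r2"])
    print(name, "logged:", stats["adjusted_r2"], "recomputed:", recomputed[name], "diff:", diff)
```

Under that assumption, each log is internally consistent: the lower score in the Spark 2.1 run follows from its larger residual variance, while the total variance of Y is identical in both runs.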