[jira] [Commented] (SPARK-13073) creating R like summary for logistic Regression in Spark - Scala
[ https://issues.apache.org/jira/browse/SPARK-13073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15268400#comment-15268400 ]

Mohamed Baddar commented on SPARK-13073:

[~samsudhin] I will work on it soon

> creating R like summary for logistic Regression in Spark - Scala
>
> Key: SPARK-13073
> URL: https://issues.apache.org/jira/browse/SPARK-13073
> Project: Spark
> Issue Type: New Feature
> Components: ML, MLlib
> Reporter: Samsudhin
> Priority: Minor
>
> Currently Spark ML provides Coefficients for logistic regression. To evaluate
> the trained model tests like wald test, chi square tests and their results to
> be summarized and display like GLM summary of R

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-13073) creating R like summary for logistic Regression in Spark - Scala
[ https://issues.apache.org/jira/browse/SPARK-13073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15268322#comment-15268322 ]

Samsudhin commented on SPARK-13073:

[~mbaddar1] Any update on this?
[jira] [Commented] (SPARK-13073) creating R like summary for logistic Regression in Spark - Scala
[ https://issues.apache.org/jira/browse/SPARK-13073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15224982#comment-15224982 ]

Joseph K. Bradley commented on SPARK-13073:

Sorry for the slow response. I think this approach is fine, where we match the summary format of R and provide whatever info is available. I'll comment on the PR.
[jira] [Commented] (SPARK-13073) creating R like summary for logistic Regression in Spark - Scala
[ https://issues.apache.org/jira/browse/SPARK-13073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15210259#comment-15210259 ]

Mohamed Baddar commented on SPARK-13073:

Thanks [~samsudhin], I noticed the difference in params. Do you have any other comments on my notes?
[jira] [Commented] (SPARK-13073) creating R like summary for logistic Regression in Spark - Scala
[ https://issues.apache.org/jira/browse/SPARK-13073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15210237#comment-15210237 ]

Samsudhin commented on SPARK-13073:

@Mohammed Baddar I checked your comment of 10/Mar/16 13:28. You have executed a linear regression summary. For logistic regression the summary would look like this:

> summary(glm(formula = vs ~ wt + hp + gear, family = binomial(), data = mtcars))

Call:
glm(formula = vs ~ wt + hp + gear, family = binomial(), data = mtcars)

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-1.79167  -0.19535  -0.00689   0.43289   1.54872

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) 11.17572    9.26728   1.206   0.2278
wt          0.31.58811            0.350   0.7265
hp          -0.08514    0.03618  -2.353   0.0186 *
gear        -0.64723    1.42248  -0.455   0.6491
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 43.86  on 31  degrees of freedom
Residual deviance: 15.89  on 28  degrees of freedom
AIC: 23.89

Number of Fisher Scoring iterations: 7
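The z value and Pr(>|z|) columns in the R output above come from a Wald test on each coefficient: the estimate divided by its standard error, compared against a standard normal distribution. The following is a minimal sketch in plain Python, not Spark or R API; the function names `wald_test` and `aic_binary` are hypothetical, and the AIC helper assumes an ungrouped 0/1 response, where the residual deviance equals -2 times the log-likelihood. The inputs are the hp row and the deviance figures quoted above.

```python
import math


def wald_test(estimate, std_error):
    """Return (z, two-sided p-value) for the null hypothesis coefficient == 0."""
    z = estimate / std_error
    # Two-sided normal tail: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


def aic_binary(residual_deviance, num_params):
    """AIC under the assumption of an ungrouped 0/1 response (hypothetical helper)."""
    return residual_deviance + 2 * num_params


# hp row from the quoted summary: estimate -0.08514, std. error 0.03618.
z, p = wald_test(-0.08514, 0.03618)
print(f"hp: z = {z:.3f}, p = {p:.4f}")

# Four fitted parameters: intercept, wt, hp, gear.
print(f"AIC = {aic_binary(15.89, 4):.2f}")
```

Running it on the other rows gives a quick way to compare a Spark-side implementation against the R reference values.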
[jira] [Commented] (SPARK-13073) creating R like summary for logistic Regression in Spark - Scala
[ https://issues.apache.org/jira/browse/SPARK-13073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15210054#comment-15210054 ]

Mohamed Baddar commented on SPARK-13073:

[~josephkb] Can one of the admins verify this PR?
[jira] [Commented] (SPARK-13073) creating R like summary for logistic Regression in Spark - Scala
[ https://issues.apache.org/jira/browse/SPARK-13073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195184#comment-15195184 ]

Apache Spark commented on SPARK-13073:

User 'mbaddar1' has created a pull request for this issue: https://github.com/apache/spark/pull/11729
[jira] [Commented] (SPARK-13073) creating R like summary for logistic Regression in Spark - Scala
[ https://issues.apache.org/jira/browse/SPARK-13073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192290#comment-15192290 ]

Mohamed Baddar commented on SPARK-13073:

[~josephkb] After more investigation of the code, and to keep the code changes minimal, my previous suggestion may not be suitable. I think we can implement a toString for BinaryLogisticRegressionSummary that gives different information than the R summary. It will create a string representation of the following members: precision, recall, fmeasure.

[~josephkb] Is there any comment before I start the PR?
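As a rough illustration of the string representation proposed above, here is a sketch in Python, where `__str__` plays the role of toString. `BinarySummarySketch` is a hypothetical stand-in, not Spark's BinaryLogisticRegressionSummary; the thread does not show that class's members or their types. The sketch assumes a single decision threshold and made-up confusion counts.

```python
from dataclasses import dataclass


@dataclass
class BinarySummarySketch:
    """Hypothetical single-threshold summary; not Spark's class."""
    true_pos: int
    false_pos: int
    false_neg: int

    @property
    def precision(self):
        denom = self.true_pos + self.false_pos
        return self.true_pos / denom if denom else 0.0

    @property
    def recall(self):
        denom = self.true_pos + self.false_neg
        return self.true_pos / denom if denom else 0.0

    @property
    def f_measure(self):
        # Harmonic mean of precision and recall (F1).
        p, r = self.precision, self.recall
        return 2 * p * r / (p + r) if (p + r) else 0.0

    def __str__(self):
        # One metric per line, analogous to an overridden toString.
        return (f"precision: {self.precision:.4f}\n"
                f"recall:    {self.recall:.4f}\n"
                f"fMeasure:  {self.f_measure:.4f}")


# Made-up counts, for illustration only.
summary = BinarySummarySketch(true_pos=8, false_pos=2, false_neg=4)
print(summary)
```

Keeping the metrics as computed properties means the printed text cannot drift from the underlying values, which is one reason to build the string from existing members rather than store it.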
[jira] [Commented] (SPARK-13073) creating R like summary for logistic Regression in Spark - Scala
[ https://issues.apache.org/jira/browse/SPARK-13073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15189283#comment-15189283 ]

Mohamed Baddar commented on SPARK-13073:

[~josephkb] I looked at the source code of org.apache.spark.ml.classification.LogisticRegressionSummary and org.apache.spark.ml.classification.LogisticRegressionTrainingSummary, and ran a sample GLM in R, which has the following output:

Call:
glm(formula = mpg ~ wt + hp + gear, family = gaussian(), data = mtcars)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-3.3712  -1.9017  -0.3444   0.9883   6.0655

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 32.013657   4.632264   6.911 1.64e-07 ***
wt          -3.197811   0.846546  -3.777 0.000761 ***
hp          -0.036786   0.009891  -3.719 0.000888 ***
gear         1.019981   0.851408   1.198 0.240963
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for gaussian family taken to be 6.626347)

    Null deviance: 1126.05  on 31  degrees of freedom
Residual deviance:  185.54  on 28  degrees of freedom
AIC: 157.05

Number of Fisher Scoring iterations: 2

I have the following comments:

1. I think we should add the following members to LogisticRegressionSummary: coefficients and residuals.
2. The toString method should be overridden in the following classes: org.apache.spark.ml.classification.BinaryLogisticRegressionSummary and org.apache.spark.ml.classification.BinaryLogisticRegressionTrainingSummary.

Any other suggestions? Please correct me if I have missed something.
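One possible reading of the proposed residuals member is the deviance residuals that R prints for a GLM. The sketch below assumes a 0/1 label and a fitted probability strictly between 0 and 1 for each observation; `deviance_residuals` is a hypothetical helper in plain Python, not an existing Spark member, and the labels and probabilities are made up. The sum of squared deviance residuals gives the residual deviance.

```python
import math


def deviance_residuals(labels, probs):
    """Deviance residual per observation for 0/1 labels and fitted probabilities."""
    residuals = []
    for y, p in zip(labels, probs):
        # Contribution of this observation to -2 * log-likelihood.
        d = -2.0 * (math.log(p) if y == 1 else math.log(1.0 - p))
        # The sign follows the raw residual y - p.
        residuals.append(math.copysign(math.sqrt(d), y - p))
    return residuals


# Made-up labels and fitted probabilities, for illustration only.
labels = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.4, 0.7]

res = deviance_residuals(labels, probs)
residual_deviance = sum(r * r for r in res)
print([round(r, 4) for r in res])
print(round(residual_deviance, 4))
```

The Min, 1Q, Median, 3Q and Max line in the R output is then a five-number summary over these per-observation values.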
[jira] [Commented] (SPARK-13073) creating R like summary for logistic Regression in Spark - Scala
[ https://issues.apache.org/jira/browse/SPARK-13073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15187935#comment-15187935 ]

Mohamed Baddar commented on SPARK-13073:

[~josephkb] Can you assign this to me as a starter task?
[jira] [Commented] (SPARK-13073) creating R like summary for logistic Regression in Spark - Scala
[ https://issues.apache.org/jira/browse/SPARK-13073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15183575#comment-15183575 ]

Mohamed Baddar commented on SPARK-13073:

[~josephkb] If nobody is working on it, can I start working on this issue as a starter task?
[jira] [Commented] (SPARK-13073) creating R like summary for logistic Regression in Spark - Scala
[ https://issues.apache.org/jira/browse/SPARK-13073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174943#comment-15174943 ]

Gayathri Murali commented on SPARK-13073:

I can work on this, can you please assign it to me?
[jira] [Commented] (SPARK-13073) creating R like summary for logistic Regression in Spark - Scala
[ https://issues.apache.org/jira/browse/SPARK-13073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174691#comment-15174691 ]

Joseph K. Bradley commented on SPARK-13073:

It sounds reasonable to provide the same printed summary in Scala, Java, and Python as in R. Perhaps it can be provided as a toString method for the LogisticRegressionModel.summary member?
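To make the idea of an R-style printed summary concrete, the sketch below formats a coefficient table as a string, with significance markers taken from the "Signif. codes" legend in R's output. It is plain Python with hypothetical names (`signif_code`, `format_coefficients`), not Spark code; the column widths and the handling of p-values that fall exactly on a cutoff are assumptions. The sample rows are copied from the logistic regression summary quoted earlier in this thread.

```python
def signif_code(p_value):
    """Map a p-value to a marker following R's 'Signif. codes' legend."""
    for cutoff, mark in ((0.001, "***"), (0.01, "**"), (0.05, "*"), (0.1, ".")):
        if p_value < cutoff:
            return mark
    return ""


def format_coefficients(rows):
    """rows: iterable of (name, estimate, std_error, z_value, p_value) tuples."""
    header = f"{'':<12}{'Estimate':>10}{'Std. Error':>12}{'z value':>9}{'Pr(>|z|)':>10}"
    lines = [header]
    for name, est, se, z, p in rows:
        line = f"{name:<12}{est:>10.5f}{se:>12.5f}{z:>9.3f}{p:>10.4f} {signif_code(p)}"
        # Drop the trailing space on rows without a marker.
        lines.append(line.rstrip())
    return "\n".join(lines)


# Rows taken from the R output quoted in this thread.
rows = [
    ("(Intercept)", 11.17572, 9.26728, 1.206, 0.2278),
    ("hp", -0.08514, 0.03618, -2.353, 0.0186),
    ("gear", -0.64723, 1.42248, -0.455, 0.6491),
]
print(format_coefficients(rows))
```

A toString built this way would only need the coefficient names, estimates, standard errors, and p-values to be available on the summary object.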