[ 
https://issues.apache.org/jira/browse/SPARK-13073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15189283#comment-15189283
 ] 

Mohamed Baddar commented on SPARK-13073:
----------------------------------------

[~josephkb] I looked at the source code of 
org.apache.spark.ml.classification.LogisticRegressionSummary and 
org.apache.spark.ml.classification.LogisticRegressionTrainingSummary,
and ran a sample GLM in R, which produces the following output:

Call:
glm(formula = mpg ~ wt + hp + gear, family = gaussian(), data = mtcars)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-3.3712  -1.9017  -0.3444   0.9883   6.0655  

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 32.013657   4.632264   6.911 1.64e-07 ***
wt          -3.197811   0.846546  -3.777 0.000761 ***
hp          -0.036786   0.009891  -3.719 0.000888 ***
gear         1.019981   0.851408   1.198 0.240963    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for gaussian family taken to be 6.626347)

    Null deviance: 1126.05  on 31  degrees of freedom
Residual deviance:  185.54  on 28  degrees of freedom
AIC: 157.05

Number of Fisher Scoring iterations: 2
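
For reference, two of the derived quantities in this output can be reproduced from the printed numbers: each t value is the estimate divided by its standard error, and for the gaussian family the dispersion is the residual deviance divided by the residual degrees of freedom. A minimal Scala sketch of that arithmetic follows. The object and method names are hypothetical, and the inputs are copied from the output above.

```scala
// Illustrative only: names are hypothetical, values are copied from the R output.
object GlmSummaryCheck {
  // (term, estimate, standard error, t value as printed by R)
  val rows: Seq[(String, Double, Double, Double)] = Seq(
    ("(Intercept)", 32.013657, 4.632264, 6.911),
    ("wt", -3.197811, 0.846546, -3.777),
    ("hp", -0.036786, 0.009891, -3.719),
    ("gear", 1.019981, 0.851408, 1.198)
  )

  // The "t value" column is the estimate divided by its standard error.
  def tValue(estimate: Double, stdError: Double): Double = estimate / stdError

  // Gaussian family: dispersion = residual deviance / residual degrees of freedom.
  def dispersion(residualDeviance: Double, residualDf: Int): Double =
    residualDeviance / residualDf

  def main(args: Array[String]): Unit = {
    rows.foreach { case (term, est, se, printedT) =>
      println(f"$term%-12s t = ${tValue(est, se)}%.3f (R printed $printedT%.3f)")
    }
    println(f"dispersion = ${dispersion(185.54, 28)}%.4f")
  }
}
```

A Spark-side summary would need the standard errors as well as the coefficients to produce these columns.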

I have the following comments:
1- I think we should add the following members to LogisticRegressionSummary: 
coefficients and residuals.

2- The toString method should be overridden in the following classes:
org.apache.spark.ml.classification.BinaryLogisticRegressionSummary and 
org.apache.spark.ml.classification.BinaryLogisticRegressionTrainingSummary

Any other suggestions? Please correct me if I have missed something.
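
To make the two points above concrete, here is a rough sketch of a summary that carries coefficients and residuals and overrides toString to print an R-like report. CoefficientRow and RLikeSummary are hypothetical stand-ins written in plain Scala. They are not Spark's LogisticRegressionSummary classes, and the standard errors are assumed to be supplied by the caller.

```scala
// Hypothetical stand-ins for illustration; NOT part of Spark's API.
final case class CoefficientRow(name: String, estimate: Double, stdError: Double) {
  // Wald-style statistic: estimate divided by its standard error.
  def statistic: Double = estimate / stdError
}

class RLikeSummary(val coefficients: Seq[CoefficientRow], val residuals: Seq[Double]) {
  require(residuals.nonEmpty, "residuals must not be empty")

  // Min, 1Q, median, 3Q, max via linear interpolation between order statistics.
  def residualQuantiles: Seq[Double] = {
    val sorted = residuals.sorted
    Seq(0.0, 0.25, 0.5, 0.75, 1.0).map { p =>
      val pos = p * (sorted.length - 1)
      val lo = math.floor(pos).toInt
      val hi = math.ceil(pos).toInt
      sorted(lo) + (pos - lo) * (sorted(hi) - sorted(lo))
    }
  }

  // R-like text report: residual quantiles first, then the coefficient table.
  override def toString: String = {
    val header = "%-12s %10s %10s %10s".format("", "Estimate", "Std. Error", "Statistic")
    val rows = coefficients.map { c =>
      "%-12s %10.4f %10.4f %10.3f".format(c.name, c.estimate, c.stdError, c.statistic)
    }
    val quantiles = residualQuantiles.map(q => "%8.4f".format(q)).mkString(" ")
    (Seq("Residuals (Min, 1Q, Median, 3Q, Max):", quantiles, "", "Coefficients:", header) ++ rows)
      .mkString("\n")
  }
}

object RLikeSummaryDemo {
  def main(args: Array[String]): Unit = {
    val summary = new RLikeSummary(
      Seq(CoefficientRow("wt", -3.197811, 0.846546), CoefficientRow("hp", -0.036786, 0.009891)),
      Seq(3.0, 1.0, 2.0, 5.0, 4.0) // made-up residuals, for layout only
    )
    println(summary)
  }
}
```

Keeping the formatting in toString means println(summary) gives the report directly, while coefficients and residuals stay available as plain members for programmatic use.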

> creating R like summary for logistic Regression in Spark - Scala
> ----------------------------------------------------------------
>
>                 Key: SPARK-13073
>                 URL: https://issues.apache.org/jira/browse/SPARK-13073
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML, MLlib
>            Reporter: Samsudhin
>            Priority: Minor
>
> Currently Spark ML provides coefficients for logistic regression. To evaluate 
> the trained model, tests such as the Wald test and chi-square tests should be 
> run, and their results summarized and displayed like the GLM summary in R.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
