[ https://issues.apache.org/jira/browse/SPARK-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiangrui Meng updated SPARK-9836:
---------------------------------
    Assignee: Yanbo Liang

> Provide R-like summary statistics for ordinary least squares via normal equation solver
> ---------------------------------------------------------------------------------------
>
>                 Key: SPARK-9836
>                 URL: https://issues.apache.org/jira/browse/SPARK-9836
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML
>            Reporter: Xiangrui Meng
>            Assignee: Yanbo Liang
>
> In R, model fitting comes with summary statistics. We can provide most of
> those via the normal equation solver (SPARK-9834). If some statistics require
> additional passes over the dataset, we can expose an option to let users
> select the desired statistics before model fitting.
> {code}
> > summary(model)
>
> Call:
> glm(formula = Sepal.Length ~ Sepal.Width + Species, data = iris)
>
> Deviance Residuals:
>      Min        1Q    Median        3Q       Max
> -1.30711  -0.25713  -0.05325   0.19542   1.41253
>
> Coefficients:
>                   Estimate Std. Error t value Pr(>|t|)
> (Intercept)         2.2514     0.3698   6.089 9.57e-09 ***
> Sepal.Width         0.8036     0.1063   7.557 4.19e-12 ***
> Speciesversicolor   1.4587     0.1121  13.012  < 2e-16 ***
> Speciesvirginica    1.9468     0.1000  19.465  < 2e-16 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> (Dispersion parameter for gaussian family taken to be 0.1918059)
>
>     Null deviance: 102.168  on 149  degrees of freedom
> Residual deviance:  28.004  on 146  degrees of freedom
> AIC: 183.94
>
> Number of Fisher Scoring iterations: 2
> {code}
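For illustration only, here is a rough sketch of how several of the statistics in the R output above could be derived from the quantities a normal equation solver already accumulates in one pass (X^T X, X^T y, y^T y and the row count). It is plain Python, not Spark code; the names `invert` and `ols_summary` are hypothetical and do not come from SPARK-9834 or any Spark API. It assumes an unweighted least squares fit in which each feature vector already carries a leading 1.0 for the intercept. The p-values are left out because they additionally need the Student's t CDF.

```python
import math


def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    # Augment the matrix with the identity: [A | I].
    m = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    # The right half now holds the inverse.
    return [row[n:] for row in m]


def ols_summary(rows, ys):
    """Summary statistics for OLS, computed from one pass over (rows, ys).

    rows: feature vectors that already include a leading 1.0 (assumption).
    """
    n, k = len(rows), len(rows[0])
    # Single pass: accumulate X^T X, X^T y and y^T y.
    xtx = [[0.0] * k for _ in range(k)]
    xty = [0.0] * k
    yty = 0.0
    for x, y in zip(rows, ys):
        yty += y * y
        for i in range(k):
            xty[i] += x[i] * y
            for j in range(k):
                xtx[i][j] += x[i] * x[j]

    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]

    # Residual sum of squares without a second pass: y^T y - beta^T X^T y.
    # For a gaussian model this equals the residual deviance.
    rss = yty - sum(b * v for b, v in zip(beta, xty))
    dof = n - k
    dispersion = rss / dof
    std_errors = [math.sqrt(dispersion * inv[i][i]) for i in range(k)]
    t_values = [b / s for b, s in zip(beta, std_errors)]
    return {
        "coefficients": beta,
        "std_errors": std_errors,
        "t_values": t_values,
        "residual_deviance": rss,
        "residual_dof": dof,
        "dispersion": dispersion,
    }


if __name__ == "__main__":
    # Tiny made-up dataset: intercept column plus one feature.
    xs = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0], [1.0, 5.0]]
    ys = [2.1, 3.9, 6.2, 7.8, 10.1]
    for name, value in ols_summary(xs, ys).items():
        print(name, value)
```

The residual quantiles in the R output are a different matter: they depend on the individual residuals, so they would need another pass over the data, which is the kind of statistic the optional selection mentioned in the issue would cover. Note also that forming y^T y - beta^T X^T y can lose precision when the fit is nearly exact, so a real implementation may prefer a more careful formulation.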