Github user thunterdb commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13139#discussion_r64086020

    --- Diff: docs/ml-classification-regression.md ---
    @@ -374,6 +374,197 @@ regression model and extracting model summary statistics.
     </div>
    +## Generalized linear regression
    +
    +When working with data that has a relatively small number of features (< 4096), Spark's GeneralizedLinearRegression interface
    +allows for flexible specification of [generalized linear models](https://en.wikipedia.org/wiki/Generalized_linear_model) (GLMs) which can be used for various types of
    +prediction problems including linear regression, Poisson regression, logistic regression, and others.
    +
    +Contrasted with linear regression where the output is assumed to follow a Gaussian
    +distribution, GLMs are specifications of linear models where the response variable $Y_i$ may take on _any_
    +distribution from the [exponential family of distributions](https://en.wikipedia.org/wiki/Exponential_family).
    --- End diff --

    You should put this paragraph first (it explains the purpose of GLMs), and mention that all the supported families are listed below.
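To make concrete what the quoted paragraph means by letting the response follow a distribution other than Gaussian, here is a minimal sketch in plain Python. It is not Spark's GeneralizedLinearRegression API; the names `INVERSE_LINKS` and `predict_mean` are hypothetical and exist only for illustration. It assumes the canonical link for each family (identity for Gaussian, log for Poisson, logit for binomial).

```python
import math

# Inverse of the canonical link for three exponential-family distributions.
# Each maps the linear predictor eta to the mean of the response.
INVERSE_LINKS = {
    "gaussian": lambda eta: eta,                            # identity link
    "poisson": lambda eta: math.exp(eta),                   # log link
    "binomial": lambda eta: 1.0 / (1.0 + math.exp(-eta)),   # logit link
}


def predict_mean(family, weights, intercept, features):
    """Return the predicted mean response for one feature vector.

    The linear predictor is the same for every family; only the
    inverse link applied to it changes.
    """
    eta = intercept + sum(w * x for w, x in zip(weights, features))
    return INVERSE_LINKS[family](eta)
```

The linear part is identical across families, which is why one interface can cover linear, Poisson, and logistic regression by changing only the family and link.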