GitHub user actuaryzhang opened a pull request: https://github.com/apache/spark/pull/16699
[SPARK-18710] Add offset in GLM

## What changes were proposed in this pull request?
Add support for offset in GLM. This is useful for at least two reasons:

1. Account for exposure: e.g., when modeling the number of accidents, we may need to use miles driven as an offset to assess the effect of factors on frequency.
2. Test incremental effects of new variables: we can use predictions from the existing model as the offset and fit a much smaller model on only the new variables. This avoids re-estimating the large model with all variables (old + new) and can be very important for efficient large-scale analysis.

## How was this patch tested?
New test.

@yanboliang @srowen @felixcheung @sethah

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/actuaryzhang/spark offset

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/16699.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #16699

----
commit 3bf2718c1a1e68273508e63499bb5d1cc8230155
Author: actuaryzhang <actuaryzhan...@gmail.com>
Date:   2017-01-24T23:46:16Z

    add trait offset

commit 0e240eb313aa91cb645fb3ab8d70e51b6c65b3c7
Author: actuaryzhang <actuaryzhan...@gmail.com>
Date:   2017-01-24T23:48:03Z

    add offset setter

commit 9c41453a19c0f9c31403fafaf1995c642c37c70d
Author: actuaryzhang <actuaryzhan...@gmail.com>
Date:   2017-01-25T05:15:50Z

    implement offset in GLM

commit 7823f8af8b0926790816c9e79e9425e503e494ad
Author: actuaryzhang <actuaryzhan...@gmail.com>
Date:   2017-01-25T06:55:56Z

    add test for glm with offset

----
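For readers less familiar with offsets, here is a minimal sketch of the exposure use case in plain Python. It is not the code in this patch and does not use the Spark API. It assumes a Poisson GLM with a log link, `log(mu_i) = offset_i + b0 + b1 * x_i`, where `offset_i = log(exposure_i)` (for example, log of miles driven). The function name `fit_poisson_with_offset` and the toy data are hypothetical, chosen only for illustration.

```python
import math


def fit_poisson_with_offset(x, y, offset, iterations=25):
    """Fit log(mu_i) = offset_i + b0 + b1 * x_i by IRLS (illustrative only).

    The offset is a known term in the linear predictor: its coefficient is
    fixed at 1 and is not estimated.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Accumulate the weighted least-squares sums for this iteration.
        s_w = s_wx = s_wxx = s_wz = s_wxz = 0.0
        for xi, yi, oi in zip(x, y, offset):
            eta = oi + b0 + b1 * xi          # linear predictor including offset
            mu = math.exp(eta)               # inverse of the log link
            # Working response, with the offset subtracted so that only
            # b0 + b1 * x is regressed.
            z = (eta - oi) + (yi - mu) / mu
            w = mu                           # IRLS weight for Poisson, log link
            s_w += w
            s_wx += w * xi
            s_wxx += w * xi * xi
            s_wz += w * z
            s_wxz += w * xi * z
        # Solve the 2x2 weighted normal equations by Cramer's rule.
        det = s_w * s_wxx - s_wx * s_wx
        b0 = (s_wxx * s_wz - s_wx * s_wxz) / det
        b1 = (s_w * s_wxz - s_wx * s_wz) / det
    return b0, b1


# Hypothetical toy data: expected counts proportional to exposure.
exposure = [10.0, 20.0, 5.0, 8.0]            # e.g. miles driven, in thousands
x = [0.0, 1.0, 2.0, 3.0]                     # a single rating factor
y = [e * math.exp(-1.0 + 0.5 * xi) for e, xi in zip(exposure, x)]
offset = [math.log(e) for e in exposure]

b0, b1 = fit_poisson_with_offset(x, y, offset)
print(b0, b1)  # should be close to the generating values -1.0 and 0.5
```

The same mechanism covers the second use case: passing the existing model's linear predictor as `offset` means only the coefficients of the new variables are estimated.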