GitHub user actuaryzhang opened a pull request:

    https://github.com/apache/spark/pull/16699

    [SPARK-18710] Add offset in GLM

    ## What changes were proposed in this pull request?
    Add support for offset in GLM. This is useful for at least two reasons:
    
    1. Account for exposure: e.g., when modeling the number of accidents, we
       may need to use miles driven as an offset to assess the effect of
       factors on accident frequency.
    2. Test incremental effects of new variables: we can use predictions from
       the existing model as an offset and fit a much smaller model on only
       the new variables. This avoids re-estimating the large model with all
       variables (old + new), which can be very important for efficient
       large-scale analysis.
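
To illustrate what an offset does, here is a minimal pure-Python sketch. It is not the code in this patch and does not use the Spark API. It assumes a Poisson GLM with log link, log(mu_i) = offset_i + beta * x_i, in which the offset enters the linear predictor with its coefficient fixed at 1 and only beta is estimated. The function name `fit_single_coefficient` and the accident and mileage figures are made up for illustration.

```python
import math


def fit_single_coefficient(y, x, offset, max_iter=100, tol=1e-12):
    """Fit beta in log(mu_i) = offset_i + beta * x_i by Newton's method.

    The offset is a known term whose coefficient is fixed at 1; it is
    not estimated. Hypothetical helper, for illustration only.
    """
    beta = 0.0
    for _ in range(max_iter):
        # Mean under the current beta; the offset shifts the linear predictor.
        mu = [math.exp(o + beta * xi) for o, xi in zip(offset, x)]
        # Score (gradient) and Fisher information of the Poisson log-likelihood.
        score = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        info = sum(xi * xi * mi for xi, mi in zip(x, mu))
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta


# Made-up data: accident counts and exposure in thousands of miles driven.
accidents = [2, 5, 9]
miles = [10.0, 24.0, 46.0]

# Exposure case: with a log link, the offset is log(miles driven).
offset = [math.log(m) for m in miles]

# Intercept-only model (x_i = 1), so exp(beta) is accidents per thousand miles.
beta = fit_single_coefficient(accidents, [1.0] * len(accidents), offset)
rate = math.exp(beta)
```

With an intercept only, the fitted rate agrees with total accidents divided by total miles (16 / 80 = 0.2), which is what accounting for exposure means here. The second use case has the same shape: pass the existing model's linear predictor as `offset` and the new variable as `x`.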
    
    ## How was this patch tested?
    New test.
    
    @yanboliang @srowen @felixcheung @sethah 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/actuaryzhang/spark offset

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/16699.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #16699
    
----
commit 3bf2718c1a1e68273508e63499bb5d1cc8230155
Author: actuaryzhang <actuaryzhan...@gmail.com>
Date:   2017-01-24T23:46:16Z

    add trait offset

commit 0e240eb313aa91cb645fb3ab8d70e51b6c65b3c7
Author: actuaryzhang <actuaryzhan...@gmail.com>
Date:   2017-01-24T23:48:03Z

    add offset setter

commit 9c41453a19c0f9c31403fafaf1995c642c37c70d
Author: actuaryzhang <actuaryzhan...@gmail.com>
Date:   2017-01-25T05:15:50Z

    implement offset in GLM

commit 7823f8af8b0926790816c9e79e9425e503e494ad
Author: actuaryzhang <actuaryzhan...@gmail.com>
Date:   2017-01-25T06:55:56Z

    add test for glm with offset

----



