GitHub user MrBago opened a pull request: https://github.com/apache/spark/pull/20229

    Update RFormula to use VectorSizeHint & OneHotEncoderEstimator.

    ## What changes were proposed in this pull request?

    RFormula should use VectorSizeHint & OneHotEncoderEstimator in its pipeline to avoid using the deprecated OneHotEncoder & to ensure the model produced can be used in streaming.

    ## How was this patch tested?

    Unit tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/MrBago/spark rFormula

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20229.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20229

----
commit 7bad275cd995d22d70a42a5b9932073bc3a5d1e5
Author: Bago Amirbekian <bago@...>
Date:   2018-01-11T05:45:27Z

    Update RFormula to use VectorSizeHint & OneHotEncoderEstimator.

----
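
For context on the streaming motivation in the description, here is a minimal sketch in plain Python, not Spark code. It assumes that the role of a size hint is to declare a vector column's width up front, so that the width does not have to be inferred by scanning the data, which a streaming job cannot do. All names here (`one_hot`, `infer_size`, `encode_batch`, `FITTED_SIZE`) are hypothetical and are not part of the Spark API or of this patch.

```python
from typing import List, Optional


def one_hot(index: int, size: int) -> List[float]:
    """Encode a category index as a one-hot vector of the given width."""
    if not 0 <= index < size:
        raise ValueError(f"index {index} out of range for size {size}")
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def infer_size(indices: List[int]) -> int:
    """Guess the vector width from the data actually seen (batch-style scan)."""
    return max(indices) + 1


def encode_batch(indices: List[int], size: Optional[int] = None) -> List[List[float]]:
    """Encode a batch, using the declared size if given, else inferring it."""
    width = size if size is not None else infer_size(indices)
    return [one_hot(i, width) for i in indices]


# Hypothetical: the fitted model saw three categories.
FITTED_SIZE = 3

# Two micro-batches, as a stream might deliver them.
batches = [[0], [2, 1]]

# Inferring the width per micro-batch gives inconsistent vector sizes.
inferred_widths = [len(encode_batch(b)[0]) for b in batches]

# Declaring the width up front keeps every micro-batch consistent.
hinted_widths = [len(encode_batch(b, FITTED_SIZE)[0]) for b in batches]

print(inferred_widths, hinted_widths)
```

With the width inferred per micro-batch, the first batch yields vectors of length 1 and the second of length 3; with the declared width, both yield length 3, so downstream stages see a stable schema.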