Github user MLnick commented on a diff in the pull request:

https://github.com/apache/spark/pull/20257#discussion_r161472191

--- Diff: docs/ml-features.md ---
@@ -775,7 +775,9 @@ for more details on the API.
 </div>
 </div>
-## OneHotEncoder
+## OneHotEncoder (Deprecated since 2.3.0)
--- End diff --

I think we should add a little more detail about why it's deprecated. Because the existing `OneHotEncoder` is a stateless transformer, it is not usable on new data where the number of categories may differ from the training data. To fix this, a new `OneHotEncoderEstimator` was created that produces a `OneHotEncoderModel` when fit. Please also add a link to the JIRA ticket for more detail (https://issues.apache.org/jira/browse/SPARK-13030).
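
To make the stateless-versus-fitted distinction concrete, here is a rough sketch in plain Python. It is not Spark code and does not mirror Spark's implementation. `stateless_one_hot` and `FittedOneHot` are hypothetical names invented for this illustration, and the sketch assumes a stateless encoder sizes its output from whatever batch it is given.

```python
# Conceptual sketch only: plain Python, NOT Spark's API or implementation.
# `stateless_one_hot` and `FittedOneHot` are hypothetical names.


def stateless_one_hot(indices):
    """Encode category indices, inferring the vector size from this batch."""
    size = max(indices) + 1
    return [[1.0 if i == idx else 0.0 for i in range(size)] for idx in indices]


class FittedOneHot:
    """Stand-in for a fitted model: the vector size is fixed at fit time."""

    def __init__(self, size):
        self.size = size

    @classmethod
    def fit(cls, indices):
        # Learn the number of categories once, from the training data.
        return cls(max(indices) + 1)

    def transform(self, indices):
        for idx in indices:
            if not 0 <= idx < self.size:
                raise ValueError(f"category index {idx} was not seen during fit")
        return [
            [1.0 if i == idx else 0.0 for i in range(self.size)]
            for idx in indices
        ]


train = [0, 1, 2]
new = [0, 1]  # category 2 happens to be absent from the new data

# Stateless: the vector width depends on whichever batch is transformed.
print(len(stateless_one_hot(train)[0]), len(stateless_one_hot(new)[0]))  # 3 2

# Fitted: the width is learned once and reused for every later batch.
model = FittedOneHot.fit(train)
print(len(model.transform(train)[0]), len(model.transform(new)[0]))  # 3 3
```

In this sketch the stateless function yields vectors of different widths for the two batches, so a downstream model trained on the first could not consume the second. The fitted object keeps the width it learned during `fit`, which is the behaviour the estimator/model split is meant to provide.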