Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20257#discussion_r162043736

    --- Diff: docs/ml-features.md ---
    @@ -783,11 +783,11 @@ Because this existing `OneHotEncoder` is a stateless transformer, it is not usab

     ## OneHotEncoderEstimator

    -[One-hot encoding](http://en.wikipedia.org/wiki/One-hot) maps a column of label indices to a column of binary vectors, and each output binary vector includes at most a single one-value. This encoding allows algorithms which expect continuous features, such as Logistic Regression, to use categorical features. For string type input data, it is common to encode categorical features using [StringIndexer](ml-features.html#stringindexer) first.
    +[One-hot encoding](http://en.wikipedia.org/wiki/One-hot) maps a categorical feature, represented as a label index, to a binary vector with at most a single one-value indicating the presence of a specific feature value from among the set of all feature values.
    --- End diff --

    No problem. Added it back.
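
For readers following the wording under review, here is a minimal pure-Python sketch of what "a binary vector with at most a single one-value" means. It is an illustration only and not Spark's implementation: the function `one_hot` and its `drop_last` parameter are hypothetical names. `drop_last` is modelled on the encoder's `dropLast` option, under which the last category is assumed to map to the all-zeros vector; that is why the text says "at most" rather than "exactly" one.

```python
def one_hot(index, num_categories, drop_last=True):
    """Encode a label index as a dense binary vector (illustrative only).

    `one_hot` and `drop_last` are hypothetical names for this sketch;
    they are not part of any Spark API.
    """
    if not 0 <= index < num_categories:
        raise ValueError("index must be in [0, num_categories)")
    # With drop_last, the vector has one slot fewer than there are categories.
    size = num_categories - 1 if drop_last else num_categories
    vec = [0.0] * size
    # The dropped (last) category has no slot, so it stays all zeros.
    if index < size:
        vec[index] = 1.0
    return vec
```

With three categories and `drop_last=True`, index 0 gets a one in the first slot while index 2 yields a vector of zeros, so each output holds at most one one-value.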