Github user MLnick commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20257#discussion_r161740612

    --- Diff: examples/src/main/java/org/apache/spark/examples/ml/JavaOneHotEncoderEstimatorExample.java ---
    @@ -35,41 +34,37 @@ import org.apache.spark.sql.types.StructType;
     // $example off$

    -public class JavaOneHotEncoderExample {
    +public class JavaOneHotEncoderEstimatorExample {
       public static void main(String[] args) {
         SparkSession spark = SparkSession
           .builder()
    -      .appName("JavaOneHotEncoderExample")
    +      .appName("JavaOneHotEncoderEstimatorExample")
           .getOrCreate();

         // $example on$
    +    // Notice: this categorical features are usually encoded with `StringIndexer`.
    --- End diff --

    Perhaps we can move the note above the `$example on$` - I don't think it is necessary for it to appear in the user guide as we've mentioned it above.

    Also perhaps rather: `Note: categorical features are usually first encoded with StringIndexer`