Github user jkbradley commented on the issue:

    https://github.com/apache/spark/pull/19527

Question about this PR description comment:
> Note that keep can't be used at the same time with dropLast as true. Because they will conflict in encoded vector by producing a vector of zeros.

Why is this necessary? With ```n``` categories found in fitting, shouldn't the behavior be the following?
* ```keep=true, dropLast=true``` ==> vector size n
* ```keep=true, dropLast=false``` ==> vector size n+1
* ```keep=false, dropLast=true``` ==> vector size n-1
* ```keep=false, dropLast=false``` ==> vector size n
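
The four cases in the question follow one rule: start from ```n```, add one slot when ```keep``` is true, and remove one slot when ```dropLast``` is true. A minimal sketch of that rule is below. It only illustrates the sizing proposed in the question, not Spark's actual encoder, and ```encoded_vector_size``` is a hypothetical name.

```python
def encoded_vector_size(n: int, keep: bool, drop_last: bool) -> int:
    """Vector size under the sizing proposed in the question.

    Hypothetical helper, not part of Spark. Assumes `keep` reserves one
    extra slot for values not seen during fitting and `drop_last`
    removes one slot.
    """
    size = n
    if keep:
        size += 1  # extra slot for unseen/invalid values
    if drop_last:
        size -= 1  # last slot is dropped
    return size


if __name__ == "__main__":
    n = 3  # example number of categories found in fitting
    for keep in (True, False):
        for drop_last in (True, False):
            size = encoded_vector_size(n, keep, drop_last)
            print(f"keep={keep}, dropLast={drop_last} ==> vector size {size}")
```

Under this rule the ```keep=true, dropLast=true``` case still has ```n``` slots, which is why the question asks whether the two options really conflict.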