Github user dbtsai commented on the issue:

    https://github.com/apache/spark/pull/23171

@cloud-fan as @aokolnychyi said, `switch` will still be faster than an optimized `Set` without autoboxing when the number of elements is small. As a result, this PR is still very useful.

@mgaido91 `InSet` can be the better choice for large numbers of elements, as controlled by `spark.sql.optimizer.inSetConversionThreshold`, once we implement it properly without autoboxing. Also, as you pointed out, generating `In` with huge lists can cause a compile exception due to the method size limit. As a result, we should convert it into `InSet` for large sets.
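To make the trade-off concrete, here is a minimal Java sketch of the two membership strategies discussed above. It is not the code Spark generates: the class name, the method names, the literal values 1, 3, 5, and the threshold value of 10 are all assumptions chosen for illustration, and the strict "greater than" comparison in `chooseStrategy` is likewise assumed rather than taken from the PR.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; none of these names come from Spark.
class InMembershipSketch {
    // Assumed stand-in for spark.sql.optimizer.inSetConversionThreshold.
    static final int IN_SET_CONVERSION_THRESHOLD = 10;

    // Generic set of boxed Integers, similar in spirit to a non-specialized InSet.
    static final Set<Integer> BOXED_SET = new HashSet<>(Arrays.asList(1, 3, 5));

    // Small literal list: a switch works directly on the primitive int,
    // so no Integer object is involved in the lookup.
    static boolean inViaSwitch(int value) {
        switch (value) {
            case 1:
            case 3:
            case 5:
                return true;
            default:
                return false;
        }
    }

    // Generic set lookup: the int argument is autoboxed to Integer
    // (via Integer.valueOf) before hashing and comparison.
    static boolean inViaBoxedSet(int value) {
        return BOXED_SET.contains(value);
    }

    // Sketch of the size-based choice: keep In (switch) for short lists,
    // convert to InSet once the list exceeds the threshold. A very long
    // switch would also risk hitting the JVM method size limit.
    static String chooseStrategy(int listSize) {
        return listSize > IN_SET_CONVERSION_THRESHOLD ? "InSet" : "In";
    }
}
```

Both lookups return the same answer for any input; the difference is only in cost. For example, `inViaSwitch(3)` and `inViaBoxedSet(3)` are both true, while `chooseStrategy(1000)` selects `InSet` under the assumed threshold.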