[GitHub] spark issue #23171: [SPARK-26205][SQL] Optimize In for bytes, shorts, ints

aokolnychyi Thu, 29 Nov 2018 12:13:22 -0800

Github user aokolnychyi commented on the issue:

    https://github.com/apache/spark/pull/23171
  
    To sum up, I would set the goal of this PR to be making `In` expressions as 
efficient as possible for bytes/shorts/ints. Then we can benchmark `In` vs 
`InSet` in [SPARK-26203](https://issues.apache.org/jira/browse/SPARK-26203) 
and try to come up with a solution for `InSet` in 
[SPARK-26204](https://issues.apache.org/jira/browse/SPARK-26204). By the time 
we solve [SPARK-26204](https://issues.apache.org/jira/browse/SPARK-26204), we 
will have a clear understanding of the pros and cons of `In` and `InSet` and 
will be able to determine the right thresholds for switching between them.



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
