Reynold Xin created SPARK-9258:
----------------------------------

             Summary: Remove BroadcastLeftSemiJoinHash
                 Key: SPARK-9258
                 URL: https://issues.apache.org/jira/browse/SPARK-9258
             Project: Spark
          Issue Type: Improvement
          Components: SQL
            Reporter: Reynold Xin

We have more join operators than we have resources to optimize. In this case, BroadcastLeftSemiJoinHash isn't really necessary. We can still use an equi-join operator to do the join, and just not include any values from the other side of the join. We waste a little bit of space by building a hash map rather than a hash set, but at the end of the day, unless we are going to spend a lot of time optimizing a hash set, our Tungsten hash map will be a lot more efficient than the hash set anyway ...
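
To illustrate the trade-off described above, here is a minimal sketch in plain Python. It is not Spark's implementation: the function names, the row format (tuples), and the key extractors are all hypothetical, chosen only to show that a left semi join can probe an equi-join style hash map and ignore the stored values, producing the same result as a dedicated hash set.

```python
from collections import defaultdict


def build_hash_map(rows, key):
    """Build-side table as an equi-join would build it: key -> matching rows."""
    table = defaultdict(list)
    for row in rows:
        table[key(row)].append(row)
    return dict(table)


def left_semi_join_via_hash_map(left, right, left_key, right_key):
    """Left semi join that reuses the equi-join hash map.

    Only key membership is checked; the values stored for the build
    side are never read, which is the "little bit of space" wasted.
    """
    table = build_hash_map(right, right_key)
    return [row for row in left if left_key(row) in table]


def left_semi_join_via_hash_set(left, right, left_key, right_key):
    """Dedicated variant: keeps only the build-side keys in a set."""
    keys = {right_key(row) for row in right}
    return [row for row in left if left_key(row) in keys]


# Hypothetical sample data: (id, payload) tuples.
left_rows = [(1, "a"), (2, "b"), (3, "c"), (1, "d")]
right_rows = [(1, "x"), (1, "y"), (3, "z")]


def first(row):
    """Join key extractor: the first column of a row."""
    return row[0]


via_map = left_semi_join_via_hash_map(left_rows, right_rows, first, first)
via_set = left_semi_join_via_hash_set(left_rows, right_rows, first, first)
print(via_map)
print(via_map == via_set)
```

Both variants emit each matching left row exactly once, even though key 1 appears twice on the right side. The only difference is that the hash map also holds the right-side rows, which the semi join never reads.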