[ https://issues.apache.org/jira/browse/HIVE-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14252142#comment-14252142 ]
Xuefu Zhang commented on HIVE-8639:
-----------------------------------

+1

> Convert SMBJoin to MapJoin [Spark Branch]
> -----------------------------------------
>
>                 Key: HIVE-8639
>                 URL: https://issues.apache.org/jira/browse/HIVE-8639
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: spark-branch
>            Reporter: Szehon Ho
>            Assignee: Szehon Ho
>         Attachments: HIVE-8639.1-spark.patch, HIVE-8639.2-spark.patch,
> HIVE-8639.3-spark.patch, HIVE-8639.3-spark.patch, HIVE-8639.4-spark.patch
>
>
> HIVE-8202 supports auto-conversion of SMB Join. However, if the tables are
> partitioned, there could be a slow down as each mapper would need to get a
> very small chunk of a partition which has a single key. Thus, in some
> scenarios it's beneficial to convert SMB join to map join.
> The task is to research and support the conversion from SMB join to map join
> for Spark execution engine. See the equivalent of MapReduce in
> SortMergeJoinResolver.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)