[ https://issues.apache.org/jira/browse/HIVE-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14252076#comment-14252076 ]
Szehon Ho commented on HIVE-8639:
---------------------------------

[~brocknoland] Yes, there are tests that still do. The triggering factor is whether the tests have "hive.auto.convert.sortmerge.join.to.mapjoin" turned on. For example, all the auto_sortmerge_.* tests have at least one part that runs SMB join before that flag is turned on.

[~xuefuz] Can you review when you get a chance? Test failures seem unrelated. I looked at join32_lessSize; it seems to be caused by a TimeoutException in the Spark client's RPC layer.

{noformat}
Caused by: java.util.concurrent.TimeoutException: Timed out waiting for client connection.
	at org.apache.hive.spark.client.rpc.RpcServer$2.run(RpcServer.java:125)
{noformat}

> Convert SMBJoin to MapJoin [Spark Branch]
> -----------------------------------------
>
>                 Key: HIVE-8639
>                 URL: https://issues.apache.org/jira/browse/HIVE-8639
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: spark-branch
>            Reporter: Szehon Ho
>            Assignee: Szehon Ho
>         Attachments: HIVE-8639.1-spark.patch, HIVE-8639.2-spark.patch, HIVE-8639.3-spark.patch, HIVE-8639.3-spark.patch, HIVE-8639.4-spark.patch
>
>
> HIVE-8202 supports auto-conversion of SMB Join. However, if the tables are partitioned, there could be a slowdown, as each mapper would need to get a very small chunk of a partition which has a single key. Thus, in some scenarios it's beneficial to convert SMB join to map join.
>
> The task is to research and support the conversion from SMB join to map join for the Spark execution engine. See the MapReduce equivalent in SortMergeJoinResolver.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)