[
https://issues.apache.org/jira/browse/HIVE-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138567#comment-14138567
]
Suhas Satish commented on HIVE-7613:
------------------------------------
Hi Xuefu,
that's a good idea. I was thinking along the lines of calling SparkContext's
addFile method in each of the N-1 Spark jobs in HashTableSinkOperator.java to
write the hash tables as files and then read them in the map-only join job in
MapJoinOperator. But that doesn't involve RDDs.
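To make the idea concrete, here is a minimal, self-contained sketch of the addFile/SparkFiles pattern described above. It is not Hive code: the class name, the plain serialized HashMap (standing in for the hash table HashTableSinkOperator would produce), and the map step (standing in for MapJoinOperator's load of the small table) are all illustrative assumptions. Only SparkContext.addFile and SparkFiles.get are taken from the actual Spark API.

import org.apache.spark.SparkConf;
import org.apache.spark.SparkFiles;
import org.apache.spark.api.java.JavaSparkContext;

import java.io.*;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class AddFileHashTableSketch {
  public static void main(String[] args) throws Exception {
    JavaSparkContext sc = new JavaSparkContext(
        new SparkConf().setAppName("addFile-hashtable-sketch").setMaster("local[2]"));

    // Small-table side: build a hash table and dump it to a local file,
    // standing in for the output of HashTableSinkOperator.
    Map<String, String> hashTable = new HashMap<>();
    hashTable.put("k1", "v1");
    hashTable.put("k2", "v2");
    File dump = File.createTempFile("smalltable", ".hashtable");
    try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(dump))) {
      out.writeObject(hashTable);
    }

    // Ship the file to every executor; no RDD is involved for the small table.
    sc.addFile(dump.getAbsolutePath());
    final String fileName = dump.getName();

    // Big-table side: each map task locates the shipped file via SparkFiles.get()
    // and loads the hash table, standing in for MapJoinOperator's load step.
    sc.parallelize(Arrays.asList("k1", "k2", "k3"), 2)
      .map(key -> {
        File local = new File(SparkFiles.get(fileName));
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(local))) {
          @SuppressWarnings("unchecked")
          Map<String, String> small = (Map<String, String>) in.readObject();
          String match = small.get(key);
          return key + " -> " + (match == null ? "<no match>" : match);
        }
      })
      .collect()
      .forEach(System.out::println);

    sc.stop();
  }
}

The point of the sketch is only the distribution mechanism: addFile pushes the serialized hash table to every executor once, and each map task reads it locally through SparkFiles.get(), without the small table ever becoming an RDD.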
> Research optimization of auto convert join to map join [Spark branch]
> ---------------------------------------------------------------------
>
> Key: HIVE-7613
> URL: https://issues.apache.org/jira/browse/HIVE-7613
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Reporter: Chengxiang Li
> Assignee: Suhas Satish
> Priority: Minor
> Attachments: HIve on Spark Map join background.docx
>
>
> ConvertJoinMapJoin is an optimization that replaces a common join (aka shuffle
> join) with a map join (aka broadcast or fragment replicate join) when
> possible. We need to research how to make it workable with Hive on Spark.