[
https://issues.apache.org/jira/browse/HIVE-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138567#comment-14138567
]
Suhas Satish commented on HIVE-7613:
------------------------------------
Hi Xuefu,
that's a good idea. I was thinking along the lines of calling SparkContext's
addFile method in each of the N-1 Spark jobs in HashTableSinkOperator.java to
write the hash tables as files and then read them in the map-only join job in
MapJoinOperator. But that doesn't involve RDDs.
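To make the idea concrete, here is a minimal, self-contained sketch of the addFile/SparkFiles pattern described above. It is not Hive code: the class name, the plain serialized HashMap (standing in for the hash table HashTableSinkOperator would produce), and the map step (standing in for MapJoinOperator's load of the small table) are all illustrative assumptions. Only SparkContext.addFile and SparkFiles.get are taken from the actual Spark API.

import org.apache.spark.SparkConf;
import org.apache.spark.SparkFiles;
import org.apache.spark.api.java.JavaSparkContext;

import java.io.*;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class AddFileHashTableSketch {
  public static void main(String[] args) throws Exception {
    JavaSparkContext sc = new JavaSparkContext(
        new SparkConf().setAppName("addFile-hashtable-sketch").setMaster("local[2]"));

    // Small-table side: build a hash table and dump it to a local file,
    // standing in for the output of HashTableSinkOperator.
    Map<String, String> hashTable = new HashMap<>();
    hashTable.put("k1", "v1");
    hashTable.put("k2", "v2");
    File dump = File.createTempFile("smalltable", ".hashtable");
    try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(dump))) {
      out.writeObject(hashTable);
    }

    // Ship the file to every executor; no RDD is involved for the small table.
    sc.addFile(dump.getAbsolutePath());
    final String fileName = dump.getName();

    // Big-table side: each map task locates the shipped file via SparkFiles.get()
    // and loads the hash table, standing in for MapJoinOperator's load step.
    sc.parallelize(Arrays.asList("k1", "k2", "k3"), 2)
      .map(key -> {
        File local = new File(SparkFiles.get(fileName));
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(local))) {
          @SuppressWarnings("unchecked")
          Map<String, String> small = (Map<String, String>) in.readObject();
          String match = small.get(key);
          return key + " -> " + (match == null ? "<no match>" : match);
        }
      })
      .collect()
      .forEach(System.out::println);

    sc.stop();
  }
}

The point of the sketch is only the distribution mechanism: addFile pushes the serialized hash table to every executor once, and each map task reads it locally through SparkFiles.get(), without the small table ever becoming an RDD.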
> Research optimization of auto convert join to map join [Spark branch]
> ---------------------------------------------------------------------
>
> Key: HIVE-7613
> URL: https://issues.apache.org/jira/browse/HIVE-7613
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Reporter: Chengxiang Li
> Assignee: Suhas Satish
> Priority: Minor
> Attachments: HIve on Spark Map join background.docx
>
>
> ConvertJoinMapJoin is an optimization that replaces a common join (aka shuffle
> join) with a map join (aka broadcast or fragment replicate join) when
> possible. We need to research how to make it workable with Hive on Spark.