[ 
https://issues.apache.org/jira/browse/HIVE-12616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15052100#comment-15052100
 ] 

Xuefu Zhang commented on HIVE-12616:
------------------------------------

Thanks for the explanation. I guess the problem is that when the user doesn't 
set spark.master explicitly, Hive's default, yarn-cluster, is set only in the 
HiveConf of the first operation, so later operations in the session don't see it.

I think we should set "spark.master" in the session-level HiveConf. It seems we 
just need to add one line (the hiveConf.set call) in the if block below:
{code}
    // load properties from hive configurations, including both spark.* properties,
    // properties for remote driver RPC, and yarn properties for Spark on YARN mode.
    String sparkMaster = hiveConf.get("spark.master");
    if (sparkMaster == null) {
      sparkMaster = sparkConf.get("spark.master");
      hiveConf.set("spark.master", sparkMaster);
    }
{code}
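
To illustrate the idea outside of Hive, here is a minimal sketch that uses plain 
maps in place of the session-level HiveConf and the SparkConf. The class name 
SparkMasterDefault and the method resolve are hypothetical and are not part of 
Hive. It only shows the write-back of the default, and it additionally assumes 
the default may be missing, in which case nothing is written.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: plain maps play the role of the session-level
// HiveConf and the SparkConf. None of these names exist in Hive.
class SparkMasterDefault {
    static final String KEY = "spark.master";

    // Returns the effective master. When the session has no explicit value,
    // the default is copied into the session settings so that later
    // operations reusing the same session see it as well.
    static String resolve(Map<String, String> sessionConf,
                          Map<String, String> sparkConf) {
        String master = sessionConf.get(KEY);
        if (master == null) {
            master = sparkConf.get(KEY);
            if (master != null) {
                sessionConf.put(KEY, master);
            }
        }
        return master;
    }

    public static void main(String[] args) {
        Map<String, String> sessionConf = new HashMap<>();
        Map<String, String> sparkConf = new HashMap<>();
        sparkConf.put(KEY, "yarn-cluster");

        // First operation: the default is resolved and written back.
        System.out.println(resolve(sessionConf, sparkConf));
        // A later operation on the same session now finds the value directly.
        System.out.println(sessionConf.get(KEY));
    }
}
```

An explicit user setting still wins, since the write-back only happens when the 
session has no value for the key.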

> NullPointerException when spark session is reused to run a mapjoin
> ------------------------------------------------------------------
>
>                 Key: HIVE-12616
>                 URL: https://issues.apache.org/jira/browse/HIVE-12616
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>    Affects Versions: 1.3.0
>            Reporter: Nemon Lou
>            Assignee: Nemon Lou
>         Attachments: HIVE-12616.patch
>
>
> The way to reproduce:
> {noformat}
> set hive.execution.engine=spark;
> create table if not exists test(id int);
> create table if not exists test1(id int);
> insert into test values(1);
> insert into test1 values(1);
> select max(a.id) from test a ,test1 b
> where a.id = b.id;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
