[ 
https://issues.apache.org/jira/browse/HIVE-12538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15030405#comment-15030405
 ] 

Nemon Lou commented on HIVE-12538:
----------------------------------

After debugging, I found the problem: the conf object that SparkUtilities uses to detect configuration changes is not the same object as the session conf.
The session conf object's getSparkConfigUpdated method keeps returning true once any Spark-related config has been set.
The code path where SQLOperation copies a new conf object from the session conf:
https://github.com/apache/hive/blob/spark/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java#L467
{code}
  /**
   * If there are query specific settings to overlay, then create a copy of the config.
   * There are two cases where we need to clone the session config that's being passed to the Hive driver:
   * 1. Async query -
   *    If the client changes a config setting, that shouldn't reflect in the execution already underway
   * 2. confOverlay -
   *    The query specific settings should only be applied to the query config and not the session
   * @return new configuration
   * @throws HiveSQLException
   */
  private HiveConf getConfigForOperation() throws HiveSQLException {
    HiveConf sqlOperationConf = getParentSession().getHiveConf();
    if (!getConfOverlay().isEmpty() || shouldRunAsync()) {
      // clone the parent session config for this query
      sqlOperationConf = new HiveConf(sqlOperationConf);

      // apply overlay query specific settings, if any
      for (Map.Entry<String, String> confEntry : getConfOverlay().entrySet()) {
        try {
          sqlOperationConf.verifyAndSet(confEntry.getKey(), confEntry.getValue());
        } catch (IllegalArgumentException e) {
          throw new HiveSQLException("Error applying statement specific settings", e);
        }
      }
    }
    return sqlOperationConf;
  }
{code}
The code path where SparkUtilities detects the change and closes the Spark session:
https://github.com/apache/hive/blob/spark/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java#L122
{code}
  public static SparkSession getSparkSession(HiveConf conf,
      SparkSessionManager sparkSessionManager) throws HiveException {
    SparkSession sparkSession = SessionState.get().getSparkSession();

    // Spark configurations have been updated, close the existing session
    if (conf.getSparkConfigUpdated()) {
      sparkSessionManager.closeSession(sparkSession);
      sparkSession = null;
      conf.setSparkConfigUpdated(false);
    }
    sparkSession = sparkSessionManager.getSession(sparkSession, conf, true);
    SessionState.get().setSparkSession(sparkSession);
    return sparkSession;
  }
{code}
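The interaction between the two code paths above can be sketched in plain Java. This is a hypothetical MiniConf class standing in for HiveConf, not Hive code: the copy constructor duplicates the updated flag, so when SparkUtilities clears the flag it only clears it on the per-operation clone, while the session conf's flag stays true and every later query closes the Spark session again.

```java
// Minimal illustration of the bug (hypothetical MiniConf, not the real HiveConf).
public class MiniConf {
    private boolean sparkConfigUpdated;

    public MiniConf() {}

    // Copy constructor, like new HiveConf(sqlOperationConf): the flag is copied too.
    public MiniConf(MiniConf other) {
        this.sparkConfigUpdated = other.sparkConfigUpdated;
    }

    // Setting any spark.* key marks this config object as updated.
    public void set(String key, String value) {
        if (key.startsWith("spark.")) {
            sparkConfigUpdated = true;
        }
    }

    public boolean getSparkConfigUpdated() { return sparkConfigUpdated; }
    public void setSparkConfigUpdated(boolean b) { sparkConfigUpdated = b; }

    public static void main(String[] args) {
        MiniConf session = new MiniConf();
        session.set("spark.yarn.queue", "QueueA"); // client runs "set spark.yarn.queue=QueueA;"

        // Each query clones the session conf (as getConfigForOperation does) ...
        MiniConf op1 = new MiniConf(session);
        // ... and SparkUtilities resets the flag on the clone, not on the session conf.
        op1.setSparkConfigUpdated(false);

        // The next query's clone still sees "updated", so the Spark session is closed again.
        MiniConf op2 = new MiniConf(session);
        System.out.println(op2.getSparkConfigUpdated()); // prints: true
    }
}
```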

It should be easy to reproduce; I will dig more.



> After set spark related config, SparkSession never get reused
> -------------------------------------------------------------
>
>                 Key: HIVE-12538
>                 URL: https://issues.apache.org/jira/browse/HIVE-12538
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>    Affects Versions: 1.3.0
>            Reporter: Nemon Lou
>
> Hive on Spark yarn-cluster mode.
> After setting "set spark.yarn.queue=QueueA;",
> run the query "select count(*) from test" 3 times and you will find 3
> different yarn applications.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)