[
https://issues.apache.org/jira/browse/SPARK-11398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhenhua Wang updated SPARK-11398:
---------------------------------
Description:
1. def dialectClassName in HiveContext is unnecessary.
In HiveContext, if conf.dialect == "hiveql", getSQLDialect() returns new
HiveQLDialect(this); otherwise it falls back to super.getSQLDialect(). Inside
super.getSQLDialect(), the call to dialectClassName dispatches to the override
in HiveContext, but since conf.dialect is not "hiveql" at that point, the
override simply returns super.dialectClassName.
So the "classOf[HiveQLDialect].getCanonicalName" branch of def dialectClassName
in HiveContext is never reached.
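To make the dead branch concrete, here is a minimal, self-contained Scala model
of the dispatch described above (a sketch with stand-in class names and string
results; not the actual Spark source):

class BaseContext(var dialect: String) {
  def dialectClassName: String =
    if (dialect == "sql") "DefaultParserDialect" else dialect
  def getSQLDialect(): String = s"instantiate(${dialectClassName})"
}

class HiveLikeContext(d: String) extends BaseContext(d) {
  override def getSQLDialect(): String =
    if (dialect == "hiveql") "new HiveQLDialect(this)" // "hiveql" fully handled here
    else super.getSQLDialect()                         // only reached when dialect != "hiveql"
  override def dialectClassName: String =
    if (dialect == "hiveql") "HiveQLDialect"           // dead branch: guarded out above
    else super.dialectClassName
}

object DeadBranchDemo extends App {
  // The "hiveql" case never consults dialectClassName at all:
  assert(new HiveLikeContext("hiveql").getSQLDialect() == "new HiveQLDialect(this)")
  // Any other dialect reaches the override via dynamic dispatch, but falls through:
  assert(new HiveLikeContext("sql").getSQLDialect() == "instantiate(DefaultParserDialect)")
}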
2. When we start bin/spark-sql, the default context is HiveContext, and the
corresponding dialect is hiveql.
However, if we type "set spark.sql.dialect;", the result is "sql", which is
inconsistent with the actual dialect and is misleading. For example, we can run
SQL such as "create table", which is only allowed in hiveql, yet the dialect
conf reports "sql".
Although this problem does not cause any execution error, it is misleading to
Spark SQL users, so we should fix it.
In this PR, instead of overriding def dialect in the conf of HiveContext, I set
SQLConf.DIALECT directly to "hiveql", so that "set spark.sql.dialect;" returns
"hiveql" rather than "sql". After the change, we can still switch HiveContext
to the "sql" dialect via "set spark.sql.dialect=sql": conf.dialect then becomes
"sql", because in SQLConf, def dialect reads its value from the stored settings
via getConf, and the stored value is now "sql".
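The behavior difference can be illustrated with a minimal model of a settings
map (a sketch with made-up names such as MiniConf; not the actual patch):

import scala.collection.mutable

class MiniConf {
  protected val settings = mutable.Map[String, String]()
  def setConf(key: String, value: String): Unit = settings(key) = value
  def getConf(key: String, default: String): String = settings.getOrElse(key, default)
  def dialect: String = getConf("spark.sql.dialect", "sql")
}

object DialectDemo extends App {
  // Old approach: override only the getter. The settings map never holds the key,
  // so reading the stored setting (what "set spark.sql.dialect;" does) reports "sql".
  val oldConf = new MiniConf {
    override def dialect: String = getConf("spark.sql.dialect", "hiveql")
  }
  assert(oldConf.dialect == "hiveql")                          // actual dialect
  assert(oldConf.getConf("spark.sql.dialect", "sql") == "sql") // misleading readback

  // New approach: write "hiveql" into the settings map once; getter and readback
  // agree, and "set spark.sql.dialect=sql" later overwrites the same entry.
  val newConf = new MiniConf
  newConf.setConf("spark.sql.dialect", "hiveql")
  assert(newConf.dialect == "hiveql")
  assert(newConf.getConf("spark.sql.dialect", "sql") == "hiveql")
  newConf.setConf("spark.sql.dialect", "sql")
  assert(newConf.dialect == "sql")
}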
> unnecessary def dialectClassName in HiveContext, and misleading dialect conf at the start of spark-sql
> ------------------------------------------------------------------------------------------------------
>
> Key: SPARK-11398
> URL: https://issues.apache.org/jira/browse/SPARK-11398
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Reporter: Zhenhua Wang
> Priority: Minor
>