[ https://issues.apache.org/jira/browse/SPARK-11398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zhenhua Wang updated SPARK-11398:
---------------------------------
Description:

1. def dialectClassName in HiveContext is unnecessary. In HiveContext, if conf.dialect == "hiveql", getSQLDialect() returns new HiveQLDialect(this); otherwise it falls back to super.getSQLDialect(). super.getSQLDialect() calls dialectClassName, which is overridden in HiveContext but, in that case, still returns super.dialectClassName. So the branch "classOf[HiveQLDialect].getCanonicalName" in HiveContext's def dialectClassName is never reached.

2. When we start bin/spark-sql, the default context is HiveContext and the corresponding dialect is hiveql. However, if we type "set spark.sql.dialect;", the result is "sql", which is inconsistent with the actual dialect and is misleading. For example, we can run SQL such as "create table", which is only allowed in hiveql, while the dialect conf still claims "sql". Although this causes no execution error, it is misleading to Spark SQL users, so I think we should fix it.

In this PR, instead of overriding def dialect in HiveContext's conf, I set SQLConf.DIALECT directly to "hiveql", so that "set spark.sql.dialect;" returns "hiveql", not "sql". After the change, we can still switch to the "sql" dialect in HiveContext through "set spark.sql.dialect=sql"; conf.dialect in HiveContext then becomes "sql", because in SQLConf, def dialect = getConf(), and the dialect stored in "settings" is now "sql".
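Point 1 can be illustrated with a minimal, self-contained sketch of the override chain. These are hypothetical stand-ins, not the real Spark classes (the real getSQLDialect() instantiates the class named by dialectClassName via reflection; here that is simplified to a name comparison), but they show why the overridden dialectClassName in HiveContext is dead code:

```scala
object DialectSketch {
  class SQLDialect
  class DefaultParserDialect extends SQLDialect
  class HiveQLDialect extends SQLDialect

  class SQLContext(val dialect: String) {
    protected def dialectClassName: String =
      classOf[DefaultParserDialect].getCanonicalName

    def getSQLDialect(): SQLDialect =
      // Simplified: the real code reflectively instantiates dialectClassName.
      if (dialectClassName == classOf[HiveQLDialect].getCanonicalName)
        new HiveQLDialect
      else
        new DefaultParserDialect
  }

  class HiveContext(dialect: String) extends SQLContext(dialect) {
    override def getSQLDialect(): SQLDialect =
      if (dialect == "hiveql") new HiveQLDialect // handled before super is asked
      else super.getSQLDialect()

    // Dead branch: when dialect == "hiveql", getSQLDialect() above never calls
    // super.getSQLDialect(), so this method is only consulted when
    // dialect != "hiveql" -- where it just returns super.dialectClassName.
    override protected def dialectClassName: String =
      if (dialect == "hiveql") classOf[HiveQLDialect].getCanonicalName
      else super.dialectClassName
  }
}
```

With these stand-ins, HiveContext("hiveql") yields a HiveQLDialect without ever evaluating dialectClassName, and HiveContext("sql") yields a DefaultParserDialect through the superclass path, so the "hiveql" branch of the override can be deleted without changing behavior.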
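The fix for point 2 can be sketched with a simplified stand-in for SQLConf (again, not the real class): instead of overriding def dialect in HiveContext's conf, write "hiveql" into the underlying settings map once at construction time, so that "set spark.sql.dialect;" reads back the value actually in effect:

```scala
object ConfSketch {
  class SQLConf {
    private val settings = scala.collection.mutable.Map.empty[String, String]

    def setConf(key: String, value: String): Unit = settings(key) = value
    def getConf(key: String, default: String): String =
      settings.getOrElse(key, default)

    // As in SQLConf: dialect is just a lookup in settings.
    def dialect: String = getConf("spark.sql.dialect", "sql")
  }

  // What HiveContext would do at construction time after the change:
  // set the dialect key directly rather than overriding `def dialect`.
  def newHiveConf(): SQLConf = {
    val conf = new SQLConf
    conf.setConf("spark.sql.dialect", "hiveql")
    conf
  }
}
```

After this, newHiveConf().dialect reports "hiveql", and a later setConf("spark.sql.dialect", "sql") switches conf.dialect back to "sql", matching the behavior described above.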
> unnecessary def dialectClassName in HiveContext, and misleading dialect conf
> at the start of spark-sql
> ------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-11398
>                 URL: https://issues.apache.org/jira/browse/SPARK-11398
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Zhenhua Wang
>            Priority: Minor
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)