[ https://issues.apache.org/jira/browse/SPARK-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Teng Qiu updated SPARK-13983:
-----------------------------
    Description: 
HiveThriftServer2 should be able to pick up {{--hiveconf}} and {{--hivevar}} variables from the JDBC client, either from beeline's command-line parameters, such as
{{beeline --hiveconf spark.sql.shuffle.partitions=3 --hivevar db_name=default}}
or from the JDBC connection string, like
{{jdbc:hive2://localhost:10000?spark.sql.shuffle.partitions=3#db_name=default}}

This worked in Spark 1.5.x, but after upgrading to 1.6 it no longer works.

To reproduce the issue, connect to HiveThriftServer2 with beeline:
{code}
bin/beeline -u jdbc:hive2://localhost:10000 \
            --hiveconf spark.sql.shuffle.partitions=3 \
            --hivevar db_name=default
{code}
or
{code}
bin/beeline -u jdbc:hive2://localhost:10000?spark.sql.shuffle.partitions=3#db_name=default
{code}
Both give the following results:
{code}
0: jdbc:hive2://localhost:10000> set spark.sql.shuffle.partitions;
+-------------------------------+--------+--+
|              key              | value  |
+-------------------------------+--------+--+
| spark.sql.shuffle.partitions  | 200    |
+-------------------------------+--------+--+
1 row selected (0.192 seconds)

0: jdbc:hive2://localhost:10000> use ${db_name};
Error: org.apache.spark.sql.AnalysisException: cannot recognize input near '$' '{' 'db_name' in switch database statement; line 1 pos 4 (state=,code=0)
{code}

The bug does not affect current versions of the spark-sql CLI; the following commands work:
{code}
bin/spark-sql --master local[2] \
              --hiveconf spark.sql.shuffle.partitions=3 \
              --hivevar db_name=default

spark-sql> set spark.sql.shuffle.partitions;
spark.sql.shuffle.partitions	3
Time taken: 1.037 seconds, Fetched 1 row(s)
spark-sql> use ${db_name};
OK
Time taken: 1.697 seconds
{code}

So I think it may be caused by this change: https://github.com/apache/spark/pull/8909 ([SPARK-10810] [SPARK-10902] [SQL] Improve session management in SQL). Perhaps when {{hiveContext.newSession}} is called, the variables from {{sessionConf}} are not loaded into the new session? (https://github.com/apache/spark/pull/8909/files#diff-8f8b7f4172e8a07ff20a4dbbbcc57b1dR69)


was:
HiveThriftServer2 should be able to pick up {{--hiveconf}} and {{--hivevar}} variables from the JDBC client, either from beeline's command-line parameters, such as
{{beeline --hiveconf hive.stats.autogather=false --hivevar db_name=default}}
or from the JDBC connection string, like
{{jdbc:hive2://localhost:10000?hive.stats.autogather=false#db_name=default}}

This worked in Spark 1.5.x, but after upgrading to 1.6 it no longer works.

To reproduce the issue, connect to HiveThriftServer2 with beeline:
{code}
bin/beeline -u jdbc:hive2://localhost:10000 \
            --hiveconf hive.stats.autogather=false \
            --hivevar db_name=default
{code}
or
{code}
bin/beeline -u jdbc:hive2://localhost:10000?hive.stats.autogather=false#db_name=default
{code}
Both give the following results:
{code}
0: jdbc:hive2://localhost:10000> set hive.stats.autogather;
+-----------------------+--------------+--+
|          key          |    value     |
+-----------------------+--------------+--+
| hive.stats.autogather | <undefined>  |
+-----------------------+--------------+--+
1 row selected (0.01 seconds)

0: jdbc:hive2://localhost:10000> use ${db_name};
Error: org.apache.spark.sql.AnalysisException: cannot recognize input near '$' '{' 'db_name' in switch database statement; line 1 pos 4 (state=,code=0)
{code}

The bug does not affect current versions of the spark-sql CLI; the following commands work:
{code}
bin/spark-sql --master local[2] \
              --hiveconf hive.stats.autogather=false \
              --hivevar db_name=default

spark-sql> set hive.stats.autogather;
hive.stats.autogather	false
Time taken: 1.037 seconds, Fetched 1 row(s)
spark-sql> use ${db_name};
OK
Time taken: 1.697 seconds
{code}

So I think it may be caused by this change: https://github.com/apache/spark/pull/8909 ([SPARK-10810] [SPARK-10902] [SQL] Improve session management in SQL). Perhaps when {{hiveContext.newSession}} is called, the variables from {{sessionConf}} are not loaded into the
new session? (https://github.com/apache/spark/pull/8909/files#diff-8f8b7f4172e8a07ff20a4dbbbcc57b1dR69)


> HiveThriftServer2 can not get "--hiveconf" or "--hivevar" variables since
> 1.6 version (both multi-session and single session)
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-13983
>                 URL: https://issues.apache.org/jira/browse/SPARK-13983
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.0, 1.6.1
>        Environment: ubuntu, spark 1.6.0 standalone, spark 1.6.1 standalone
>                     (tried spark branch-1.6 snapshot as well)
>            Reporter: Teng Qiu
>

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
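The suspected cause above — {{newSession()}} creating a fresh session without seeding it from the per-connection overlay — can be sketched with a minimal toy model. This is plain Python for illustration only; the names {{Session}}, {{open_session_broken}} and {{open_session_fixed}} are made up and are not Spark's actual classes or API.

```python
# Toy model of the suspected regression: in 1.6, each JDBC connection gets a
# fresh session, and the --hiveconf/--hivevar overlay sent by the client is
# never copied into that session's conf, so queries see only the defaults.

DEFAULTS = {"spark.sql.shuffle.partitions": "200"}

class Session:
    """Holds a per-session conf map (hypothetical stand-in, not Spark's)."""
    def __init__(self, conf):
        self.conf = conf

    def get(self, key):
        # Beeline shows "<undefined>" for keys that were never set.
        return self.conf.get(key, "<undefined>")

def open_session_broken(session_conf):
    # Suspected 1.6 behaviour: a brand-new session is created from the
    # defaults only; the client's overlay is silently dropped.
    return Session(dict(DEFAULTS))

def open_session_fixed(session_conf):
    # Expected behaviour (as in 1.5.x): start from the defaults, then
    # apply the per-connection --hiveconf/--hivevar overlay on top.
    conf = dict(DEFAULTS)
    conf.update(session_conf)
    return Session(conf)

# Overlay the client sends, e.g. via "--hiveconf ... --hivevar db_name=default".
overlay = {"spark.sql.shuffle.partitions": "3", "db_name": "default"}

broken = open_session_broken(overlay)
fixed = open_session_fixed(overlay)
print(broken.get("spark.sql.shuffle.partitions"))  # 200 (the reported bug)
print(fixed.get("spark.sql.shuffle.partitions"))   # 3
print(fixed.get("db_name"))                        # default
```

Under this reading, the fix would be for the session-open path in the Thrift server to copy the client-supplied {{sessionConf}} entries into the session returned by {{hiveContext.newSession}} instead of discarding them.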