[jira] [Commented] (SPARK-32560) improve exception message
[ https://issues.apache.org/jira/browse/SPARK-32560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172791#comment-17172791 ] philipse commented on SPARK-32560: -- Thanks [~maropu] for your notice. Will improve it in the future. ;) > improve exception message > - > > Key: SPARK-32560 > URL: https://issues.apache.org/jira/browse/SPARK-32560 > Project: Spark > Issue Type: Improvement > Components: SQL > Affects Versions: 3.0.0 > Reporter: philipse > Priority: Minor > Attachments: exception.png > > > Exception messages lack single quotes; we can improve them to keep > consistent -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-32560) improve exception message
[ https://issues.apache.org/jira/browse/SPARK-32560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] philipse updated SPARK-32560: - Description: Exception messages lack single quotes; we can improve them to keep consistent (was: Exception messages lack single quotes; we can improve them to keep consistent !image-2020-08-07-08-32-35-808.png!) > improve exception message > - > > Key: SPARK-32560 > URL: https://issues.apache.org/jira/browse/SPARK-32560 > Project: Spark > Issue Type: Improvement > Components: SQL > Affects Versions: 3.0.0 > Reporter: philipse > Priority: Minor > Attachments: exception.png > > > Exception messages lack single quotes; we can improve them to keep > consistent -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-32560) improve exception message
[ https://issues.apache.org/jira/browse/SPARK-32560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] philipse updated SPARK-32560: - Attachment: exception.png > improve exception message > - > > Key: SPARK-32560 > URL: https://issues.apache.org/jira/browse/SPARK-32560 > Project: Spark > Issue Type: Improvement > Components: SQL > Affects Versions: 3.0.0 > Reporter: philipse > Priority: Minor > Attachments: exception.png > > > Exception messages lack single quotes; we can improve them to keep > consistent > !image-2020-08-07-08-32-35-808.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-32560) improve exception message
[ https://issues.apache.org/jira/browse/SPARK-32560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] philipse updated SPARK-32560: - Description: Exception messages lack single quotes; we can improve them to keep consistent !image-2020-08-07-08-32-35-808.png! was: Exception messages have extra single quotes; we can improve them. > improve exception message > - > > Key: SPARK-32560 > URL: https://issues.apache.org/jira/browse/SPARK-32560 > Project: Spark > Issue Type: Improvement > Components: SQL > Affects Versions: 3.0.0 > Reporter: philipse > Priority: Minor > > Exception messages lack single quotes; we can improve them to keep > consistent > !image-2020-08-07-08-32-35-808.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-32560) improve exception message
philipse created SPARK-32560: Summary: improve exception message Key: SPARK-32560 URL: https://issues.apache.org/jira/browse/SPARK-32560 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.0.0 Reporter: philipse Exception messages have extra single quotes; we can improve them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
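The fix this ticket asks for (and its history reverses: first "extra quotes", then "missing quotes") is about quoting user-supplied identifiers consistently in error messages. A minimal sketch of the convention, with a hypothetical `format_error` helper rather than Spark's actual message-building code:

```python
def format_error(entity: str, name: str) -> str:
    # Quote the user-supplied identifier in single quotes so it stands
    # out from the fixed message text, which is the consistency the
    # issue proposes. `entity` and `name` are illustrative parameters.
    return f"{entity} '{name}' not found"

print(format_error("Table", "person"))  # Table 'person' not found
```

Keeping one formatter like this in a single place is what makes the quoting consistent across messages, instead of each call site quoting (or not) by hand.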
[jira] [Issue Comment Deleted] (SPARK-24194) HadoopFsRelation cannot overwrite a path that is also being read from
[ https://issues.apache.org/jira/browse/SPARK-24194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] philipse updated SPARK-24194: - Comment: was deleted (was: Hi, is the issue closed? Can I try it in a production env? Thanks) > HadoopFsRelation cannot overwrite a path that is also being read from > - > > Key: SPARK-24194 > URL: https://issues.apache.org/jira/browse/SPARK-24194 > Project: Spark > Issue Type: Improvement > Components: SQL > Affects Versions: 3.1.0 > Environment: spark master > Reporter: yangz > Priority: Minor > Original Estimate: 24h > Remaining Estimate: 24h > > When > {code:java} > INSERT OVERWRITE TABLE territory_count_compare select * from > territory_count_compare where shop_count!=real_shop_count > {code} > and territory_count_compare is a Parquet table, there will be an error: > Cannot overwrite a path that is also being read from > > And in the file MetastoreDataSourceSuite.scala, there is a test case > > > {code:java} > table(tableName).write.mode(SaveMode.Overwrite).insertInto(tableName) > {code} > > But when the table territory_count_compare is a common Hive table, there is > no error. > So I think the reason is that when inserting overwrite into a HadoopFs relation > with a static partition, it first deletes the partition in the output, but that > should happen when the job is committed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
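The report's point is that deleting the output up front makes a self-overwrite unsafe, while deferring the replacement to commit time (as Hive tables do) makes it safe. That "write elsewhere first, swap at the end" idea can be sketched at the filesystem level in plain Python; `overwrite_safely` and the paths here are hypothetical illustrations, not Spark's committer code:

```python
import os
import shutil
import tempfile

def overwrite_safely(target_dir, rows):
    # Write the new output to a temporary directory first, so the
    # original data stays readable while the "job" runs...
    tmp = tempfile.mkdtemp()
    with open(os.path.join(tmp, "part-00000"), "w") as f:
        f.write("\n".join(rows))
    # ...and only replace the target once the write has fully
    # succeeded, mimicking a commit-time swap instead of an
    # up-front delete.
    shutil.rmtree(target_dir, ignore_errors=True)
    shutil.move(tmp, target_dir)
```

If the write fails, the swap never happens and the original directory is untouched, which is exactly the property the up-front delete loses.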
[jira] [Resolved] (SPARK-32324) Fix error messages during using PIVOT and lateral view
[ https://issues.apache.org/jira/browse/SPARK-32324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] philipse resolved SPARK-32324. -- Resolution: Not A Problem > Fix error messages during using PIVOT and lateral view > -- > > Key: SPARK-32324 > URL: https://issues.apache.org/jira/browse/SPARK-32324 > Project: Spark > Issue Type: Improvement > Components: SQL > Affects Versions: 3.0.0 > Reporter: philipse > Priority: Minor > > Currently, when we use `lateral view` and `pivot` together in the FROM clause, if > `lateral view` comes before `pivot`, the error message is "LATERAL cannot be > used together with PIVOT in FROM clause". If `lateral view` comes after > `pivot`, the query works normally, so the error message "LATERAL cannot be > used together with PIVOT in FROM clause" is not accurate; we may improve it. > > Steps to reproduce: > {code:java} > CREATE TABLE person (id INT, name STRING, age INT, class int, address STRING); > INSERT INTO person VALUES > (100, 'John', 30, 1, 'Street 1'), > (200, 'Mary', NULL, 1, 'Street 2'), > (300, 'Mike', 80, 3, 'Street 3'), > (400, 'Dan', 50, 4, 'Street 4'); > {code} > > Query1: > > {code:java} > SELECT * FROM person > lateral view outer explode(array(30,60)) tabelName as c_age > lateral view explode(array(40,80)) as d_age > PIVOT ( > count(distinct age) as a > for name in ('Mary','John') > ) > {code} > Result1: > > {code:java} > Error: org.apache.spark.sql.catalyst.parser.ParseException: > LATERAL cannot be used together with PIVOT in FROM clause(line 1, pos 9) > == SQL == > SELECT * FROM person > -^^^ > lateral view outer explode(array(30,60)) tabelName as c_age > lateral view explode(array(40,80)) as d_age > PIVOT ( > count(distinct age) as a > for name in ('Mary','John') > ) (state=,code=0) > {code} > > > Query2: > > {code:java} > SELECT * FROM person > PIVOT ( > count(distinct age) as a > for name in ('Mary','John') > ) > lateral view outer explode(array(30,60)) tabelName as c_age > lateral view explode(array(40,80)) as d_age > {code} > > Result2: > +-----+------+------+-------+-------+ > | id  | Mary | John | c_age | d_age | > +-----+------+------+-------+-------+ > | 300 | NULL | NULL | 30    | 40    | > | 300 | NULL | NULL | 30    | 80    | > | 300 | NULL | NULL | 60    | 40    | > | 300 | NULL | NULL | 60    | 80    | > | 100 | 0    | NULL | 30    | 40    | > | 100 | 0    | NULL | 30    | 80    | > | 100 | 0    | NULL | 60    | 40    | > | 100 | 0    | NULL | 60    | 80    | > | 400 | NULL | NULL | 30    | 40    | > | 400 | NULL | NULL | 30    | 80    | > | 400 | NULL | NULL | 60    | 40    | > | 400 | NULL | NULL | 60    | 80    | > | 200 | NULL | 1    | 30    | 40    | > | 200 | NULL | 1    | 30    | 80    | > | 200 | NULL | 1    | 60    | 40    | > | 200 | NULL | 1    | 60    | 80    | > +-----+------+------+-------+-------+ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-32358) temp view not working after upgrading from 2.3.3 to 2.4.5
[ https://issues.apache.org/jira/browse/SPARK-32358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] philipse resolved SPARK-32358. -- Resolution: Won't Fix > temp view not working after upgrading from 2.3.3 to 2.4.5 > - > > Key: SPARK-32358 > URL: https://issues.apache.org/jira/browse/SPARK-32358 > Project: Spark > Issue Type: Bug > Components: PySpark > Affects Versions: 2.4.5 > Reporter: philipse > Priority: Major > > After upgrading from 2.3.3 to Spark 2.4.5, the temp view seems not to be working. > Please correct me if I missed something. Thanks! > Steps to reproduce: > ``` > from pyspark.sql import SparkSession > from pyspark.sql import Row > spark=SparkSession\ > .builder \ > .appName('scenary_address_1') \ > .enableHiveSupport() \ > .getOrCreate() > > address_tok_result_df=spark.createDataFrame([Row(a=1,b='难',c=80),Row(a=2,b='v',c=81)]) > print("create dataframe finished") > address_tok_result_df.createOrReplaceTempView("scenery_address_test1") > print(spark.read.table('scenery_address_test1').dtypes) > spark.sql("select * from scenery_address_test1").show() > ``` > > In Spark 2.3.3 I can easily get the following result: > ``` > create dataframe finished > [('a', 'bigint'), ('b', 'string'), ('c', 'bigint')] > +---+---+---+ > | a| b| c| > +---+---+---+ > | 1| 难| 80| > | 2| v| 81| > +---+---+---+ > ``` > > But in 2.4.5 I only get the following, without the result showing: > create dataframe finished > [('a', 'bigint'), ('b', 'string'), ('c', 'bigint')] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-32358) temp view not working after upgrading from 2.3.3 to 2.4.5
[ https://issues.apache.org/jira/browse/SPARK-32358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] philipse resolved SPARK-32358. -- Resolution: Fixed > temp view not working after upgrading from 2.3.3 to 2.4.5 > - > > Key: SPARK-32358 > URL: https://issues.apache.org/jira/browse/SPARK-32358 > Project: Spark > Issue Type: Bug > Components: PySpark > Affects Versions: 2.4.5 > Reporter: philipse > Priority: Major > > After upgrading from 2.3.3 to Spark 2.4.5, the temp view seems not to be working. > Please correct me if I missed something. Thanks! > Steps to reproduce: > ``` > from pyspark.sql import SparkSession > from pyspark.sql import Row > spark=SparkSession\ > .builder \ > .appName('scenary_address_1') \ > .enableHiveSupport() \ > .getOrCreate() > > address_tok_result_df=spark.createDataFrame([Row(a=1,b='难',c=80),Row(a=2,b='v',c=81)]) > print("create dataframe finished") > address_tok_result_df.createOrReplaceTempView("scenery_address_test1") > print(spark.read.table('scenery_address_test1').dtypes) > spark.sql("select * from scenery_address_test1").show() > ``` > > In Spark 2.3.3 I can easily get the following result: > ``` > create dataframe finished > [('a', 'bigint'), ('b', 'string'), ('c', 'bigint')] > +---+---+---+ > | a| b| c| > +---+---+---+ > | 1| 难| 80| > | 2| v| 81| > +---+---+---+ > ``` > > But in 2.4.5 I only get the following, without the result showing: > create dataframe finished > [('a', 'bigint'), ('b', 'string'), ('c', 'bigint')] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Reopened] (SPARK-32358) temp view not working after upgrading from 2.3.3 to 2.4.5
[ https://issues.apache.org/jira/browse/SPARK-32358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] philipse reopened SPARK-32358: -- > temp view not working after upgrading from 2.3.3 to 2.4.5 > - > > Key: SPARK-32358 > URL: https://issues.apache.org/jira/browse/SPARK-32358 > Project: Spark > Issue Type: Bug > Components: PySpark > Affects Versions: 2.4.5 > Reporter: philipse > Priority: Major > > After upgrading from 2.3.3 to Spark 2.4.5, the temp view seems not to be working. > Please correct me if I missed something. Thanks! > Steps to reproduce: > ``` > from pyspark.sql import SparkSession > from pyspark.sql import Row > spark=SparkSession\ > .builder \ > .appName('scenary_address_1') \ > .enableHiveSupport() \ > .getOrCreate() > > address_tok_result_df=spark.createDataFrame([Row(a=1,b='难',c=80),Row(a=2,b='v',c=81)]) > print("create dataframe finished") > address_tok_result_df.createOrReplaceTempView("scenery_address_test1") > print(spark.read.table('scenery_address_test1').dtypes) > spark.sql("select * from scenery_address_test1").show() > ``` > > In Spark 2.3.3 I can easily get the following result: > ``` > create dataframe finished > [('a', 'bigint'), ('b', 'string'), ('c', 'bigint')] > +---+---+---+ > | a| b| c| > +---+---+---+ > | 1| 难| 80| > | 2| v| 81| > +---+---+---+ > ``` > > But in 2.4.5 I only get the following, without the result showing: > create dataframe finished > [('a', 'bigint'), ('b', 'string'), ('c', 'bigint')] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-32358) temp view not working after upgrading from 2.3.3 to 2.4.5
[ https://issues.apache.org/jira/browse/SPARK-32358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] philipse updated SPARK-32358: - Description: After upgrading from 2.3.3 to Spark 2.4.5, the temp view seems not to be working. Please correct me if I missed something. Thanks! Steps to reproduce: ``` from pyspark.sql import SparkSession from pyspark.sql import Row spark=SparkSession\ .builder \ .appName('scenary_address_1') \ .enableHiveSupport() \ .getOrCreate() address_tok_result_df=spark.createDataFrame([Row(a=1,b='难',c=80),Row(a=2,b='v',c=81)]) print("create dataframe finished") address_tok_result_df.createOrReplaceTempView("scenery_address_test1") print(spark.read.table('scenery_address_test1').dtypes) spark.sql("select * from scenery_address_test1").show() ``` In Spark 2.3.3 I can easily get the following result: ``` create dataframe finished [('a', 'bigint'), ('b', 'string'), ('c', 'bigint')] +---+---+---+ | a| b| c| +---+---+---+ | 1| 难| 80| | 2| v| 81| +---+---+---+ ``` But in 2.4.5 I only get the following, without the result showing: create dataframe finished [('a', 'bigint'), ('b', 'string'), ('c', 'bigint')] was: After upgrading from 2.3.3 to Spark 2.4.5, the temp view seems not to be working. Please correct me if I missed something. Thanks! Steps to reproduce: ``` from pyspark.sql import SparkSession from pyspark.sql import Row spark=SparkSession\ .builder \ .appName('scenary_address_1') \ .enableHiveSupport() \ .getOrCreate() address_tok_result_df=spark.createDataFrame([Row(a=1,b='难',c=80),Row(a=2,b='v',c=81)]) print("create dataframe finished") address_tok_result_df.createOrReplaceTempView("scenery_address_test1") print(spark.read.table('scenery_address_test1').dtypes) spark.sql("select * from scenery_address_test1").show() ``` In Spark 2.3.3 I can easily get the following result: ``` create dataframe finished [('a', 'bigint'), ('b', 'string'), ('c', 'bigint')] +---+---+---+ | a| b| c| +---+---+---+ | 1| 难| 80| | 2| v| 81| +---+---+---+ ``` But in 2.4.5 I only get: create dataframe finished [('a', 'bigint'), ('b', 'string'), ('c', 'bigint')] > temp view not working after upgrading from 2.3.3 to 2.4.5 > - > > Key: SPARK-32358 > URL: https://issues.apache.org/jira/browse/SPARK-32358 > Project: Spark > Issue Type: Bug > Components: PySpark > Affects Versions: 2.4.5 > Reporter: philipse > Priority: Major > > After upgrading from 2.3.3 to Spark 2.4.5, the temp view seems not to be working. > Please correct me if I missed something. Thanks! > Steps to reproduce: > ``` > from pyspark.sql import SparkSession > from pyspark.sql import Row > spark=SparkSession\ > .builder \ > .appName('scenary_address_1') \ > .enableHiveSupport() \ > .getOrCreate() > > address_tok_result_df=spark.createDataFrame([Row(a=1,b='难',c=80),Row(a=2,b='v',c=81)]) > print("create dataframe finished") > address_tok_result_df.createOrReplaceTempView("scenery_address_test1") > print(spark.read.table('scenery_address_test1').dtypes) > spark.sql("select * from scenery_address_test1").show() > ``` > > In Spark 2.3.3 I can easily get the following result: > ``` > create dataframe finished > [('a', 'bigint'), ('b', 'string'), ('c', 'bigint')] > +---+---+---+ > | a| b| c| > +---+---+---+ > | 1| 难| 80| > | 2| v| 81| > +---+---+---+ > ``` > > But in 2.4.5 I only get the following, without the result showing: > create dataframe finished > [('a', 'bigint'), ('b', 'string'), ('c', 'bigint')] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-32358) temp view not working after upgrading from 2.3.3 to 2.4.5
[ https://issues.apache.org/jira/browse/SPARK-32358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] philipse updated SPARK-32358: - Description: After upgrading from 2.3.3 to Spark 2.4.5, the temp view seems not to be working. Please correct me if I missed something. Thanks! Steps to reproduce: ``` from pyspark.sql import SparkSession from pyspark.sql import Row spark=SparkSession\ .builder \ .appName('scenary_address_1') \ .enableHiveSupport() \ .getOrCreate() address_tok_result_df=spark.createDataFrame([Row(a=1,b='难',c=80),Row(a=2,b='v',c=81)]) print("create dataframe finished") address_tok_result_df.createOrReplaceTempView("scenery_address_test1") print(spark.read.table('scenery_address_test1').dtypes) spark.sql("select * from scenery_address_test1").show() ``` In Spark 2.3.3 I can easily get the following result: ``` create dataframe finished [('a', 'bigint'), ('b', 'string'), ('c', 'bigint')] +---+---+---+ | a| b| c| +---+---+---+ | 1| 难| 80| | 2| v| 81| +---+---+---+ ``` But in 2.4.5 I only get: create dataframe finished [('a', 'bigint'), ('b', 'string'), ('c', 'bigint')] was: After upgrading from 2.3.3 to Spark 2.4.5, the temp view seems not to be working. I am not sure if I am missing something. Steps to reproduce: ``` from pyspark.sql import SparkSession from pyspark.sql import Row spark=SparkSession\ .builder \ .appName('scenary_address_1') \ .enableHiveSupport() \ .getOrCreate() address_tok_result_df=spark.createDataFrame([Row(a=1,b='难',c=80),Row(a=2,b='v',c=81)]) print("create dataframe finished") address_tok_result_df.createOrReplaceTempView("scenery_address_test1") print(spark.read.table('scenery_address_test1').dtypes) spark.sql("select * from scenery_address_test1").show() ``` In Spark 2.3.3 I can easily get the following result: ``` create dataframe finished [('a', 'bigint'), ('b', 'string'), ('c', 'bigint')] +---+---+---+ | a| b| c| +---+---+---+ | 1| 难| 80| | 2| v| 81| +---+---+---+ ``` But in 2.4.5 I only get: create dataframe finished [('a', 'bigint'), ('b', 'string'), ('c', 'bigint')] > temp view not working after upgrading from 2.3.3 to 2.4.5 > - > > Key: SPARK-32358 > URL: https://issues.apache.org/jira/browse/SPARK-32358 > Project: Spark > Issue Type: Bug > Components: PySpark > Affects Versions: 2.4.5 > Reporter: philipse > Priority: Major > > After upgrading from 2.3.3 to Spark 2.4.5, the temp view seems not to be working. > Please correct me if I missed something. Thanks! > Steps to reproduce: > ``` > from pyspark.sql import SparkSession > from pyspark.sql import Row > spark=SparkSession\ > .builder \ > .appName('scenary_address_1') \ > .enableHiveSupport() \ > .getOrCreate() > > address_tok_result_df=spark.createDataFrame([Row(a=1,b='难',c=80),Row(a=2,b='v',c=81)]) > print("create dataframe finished") > address_tok_result_df.createOrReplaceTempView("scenery_address_test1") > print(spark.read.table('scenery_address_test1').dtypes) > spark.sql("select * from scenery_address_test1").show() > ``` > > In Spark 2.3.3 I can easily get the following result: > ``` > create dataframe finished > [('a', 'bigint'), ('b', 'string'), ('c', 'bigint')] > +---+---+---+ > | a| b| c| > +---+---+---+ > | 1| 难| 80| > | 2| v| 81| > +---+---+---+ > ``` > > But in 2.4.5 I only get: > create dataframe finished > [('a', 'bigint'), ('b', 'string'), ('c', 'bigint')] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-32358) temp view not working after upgrading from 2.3.3 to 2.4.5
philipse created SPARK-32358: Summary: temp view not working after upgrading from 2.3.3 to 2.4.5 Key: SPARK-32358 URL: https://issues.apache.org/jira/browse/SPARK-32358 Project: Spark Issue Type: Bug Components: PySpark Affects Versions: 2.4.5 Reporter: philipse

After upgrading from 2.3.3 to Spark 2.4.5, the temp view seems not to be working. I am not sure if I am missing something. Steps to reproduce:
```
from pyspark.sql import SparkSession
from pyspark.sql import Row
spark=SparkSession\
    .builder \
    .appName('scenary_address_1') \
    .enableHiveSupport() \
    .getOrCreate()
address_tok_result_df=spark.createDataFrame([Row(a=1,b='难',c=80),Row(a=2,b='v',c=81)])
print("create dataframe finished")
address_tok_result_df.createOrReplaceTempView("scenery_address_test1")
print(spark.read.table('scenery_address_test1').dtypes)
spark.sql("select * from scenery_address_test1").show()
```
In Spark 2.3.3 I can easily get the following result:
```
create dataframe finished
[('a', 'bigint'), ('b', 'string'), ('c', 'bigint')]
+---+---+---+
| a | b | c |
+---+---+---+
| 1 | 难| 80|
| 2 | v | 81|
+---+---+---+
```
But in 2.4.5 I only get: create dataframe finished [('a', 'bigint'), ('b', 'string'), ('c', 'bigint')] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-32324) Fix error messages during using PIVOT and lateral view
[ https://issues.apache.org/jira/browse/SPARK-32324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] philipse updated SPARK-32324: - Description: Currently, when we use `lateral view` and `pivot` together in the FROM clause, if `lateral view` comes before `pivot`, the error message is "LATERAL cannot be used together with PIVOT in FROM clause". If `lateral view` comes after `pivot`, the query works normally, so the error message "LATERAL cannot be used together with PIVOT in FROM clause" is not accurate; we may improve it. Steps to reproduce: {code:java} CREATE TABLE person (id INT, name STRING, age INT, class int, address STRING); INSERT INTO person VALUES (100, 'John', 30, 1, 'Street 1'), (200, 'Mary', NULL, 1, 'Street 2'), (300, 'Mike', 80, 3, 'Street 3'), (400, 'Dan', 50, 4, 'Street 4'); {code} Query1: {code:java} SELECT * FROM person lateral view outer explode(array(30,60)) tabelName as c_age lateral view explode(array(40,80)) as d_age PIVOT ( count(distinct age) as a for name in ('Mary','John') ) {code} Result1: {code:java} Error: org.apache.spark.sql.catalyst.parser.ParseException: LATERAL cannot be used together with PIVOT in FROM clause(line 1, pos 9) == SQL == SELECT * FROM person -^^^ lateral view outer explode(array(30,60)) tabelName as c_age lateral view explode(array(40,80)) as d_age PIVOT ( count(distinct age) as a for name in ('Mary','John') ) (state=,code=0) {code} Query2: {code:java} SELECT * FROM person PIVOT ( count(distinct age) as a for name in ('Mary','John') ) lateral view outer explode(array(30,60)) tabelName as c_age lateral view explode(array(40,80)) as d_age {code} Result2: +-----+------+------+-------+-------+ | id | Mary | John | c_age | d_age | +-----+------+------+-------+-------+ | 300 | NULL | NULL | 30 | 40 | | 300 | NULL | NULL | 30 | 80 | | 300 | NULL | NULL | 60 | 40 | | 300 | NULL | NULL | 60 | 80 | | 100 | 0 | NULL | 30 | 40 | | 100 | 0 | NULL | 30 | 80 | | 100 | 0 | NULL | 60 | 40 | | 100 | 0 | NULL | 60 | 80 | | 400 | NULL | NULL | 30 | 40 | | 400 | NULL | NULL | 30 | 80 | | 400 | NULL | NULL | 60 | 40 | | 400 | NULL | NULL | 60 | 80 | | 200 | NULL | 1 | 30 | 40 | | 200 | NULL | 1 | 30 | 80 | | 200 | NULL | 1 | 60 | 40 | | 200 | NULL | 1 | 60 | 80 | +-----+------+------+-------+-------+ was: Currently, when we use `lateral view` and `pivot` together in the FROM clause, if `lateral view` comes before `pivot`, the error message is "LATERAL cannot be used together with PIVOT in FROM clause". If `lateral view` comes after `pivot`, the query works normally, so the error message "LATERAL cannot be used together with PIVOT in FROM clause" is not accurate; we may improve it. Steps to reproduce: ``` CREATE TABLE person (id INT, name STRING, age INT, class int, address STRING); INSERT INTO person VALUES (100, 'John', 30, 1, 'Street 1'), (200, 'Mary', NULL, 1, 'Street 2'), (300, 'Mike', 80, 3, 'Street 3'), (400, 'Dan', 50, 4, 'Street 4'); ``` Query1: ``` SELECT * FROM person lateral view outer explode(array(30,60)) tabelName as c_age lateral view explode(array(40,80)) as d_age PIVOT ( count(distinct age) as a for name in ('Mary','John') ) ``` Result1: ``` Error: org.apache.spark.sql.catalyst.parser.ParseException: LATERAL cannot be used together with PIVOT in FROM clause(line 1, pos 9) == SQL == SELECT * FROM person -^^^ lateral view outer explode(array(30,60)) tabelName as c_age lateral view explode(array(40,80)) as d_age PIVOT ( count(distinct age) as a for name in ('Mary','John') ) (state=,code=0) ``` Query2: ``` SELECT * FROM person PIVOT ( count(distinct age) as a for name in ('Mary','John') ) lateral view outer explode(array(30,60)) tabelName as c_age lateral view explode(array(40,80)) as d_age ``` Result2: ``` +-----+------+------+-------+-------+ | id | Mary | John | c_age | d_age | +-----+------+------+-------+-------+ | 300 | NULL | NULL | 30 | 40 | | 300 | NULL | NULL | 30 | 80 | | 300 | NULL | NULL | 60 | 40 | | 300 | NULL | NULL | 60 | 80 | | 100 | 0 | NULL | 30 | 40 | | 100 | 0 | NULL | 30 | 80 | | 100 | 0 | NULL | 60 | 40 | | 100 | 0 | NULL | 60 | 80 | | 400 | NULL | NULL | 30 | 40 | | 400 | NULL | NULL | 30 | 80 | | 400 | NULL | NULL | 60 | 40 | | 400 | NULL | NULL | 60 | 80 | | 200 | NULL | 1 | 30 | 40 | | 200 | NULL | 1 | 30 | 80 | | 200 | NULL | 1 | 60 | 40 | | 200 | NULL | 1 | 60 | 80 | +-----+------+------+-------+-------+ ``` > Fix error messages during using PIVOT and lateral view > -- > > Key: SPARK-32324 > URL: https://issues.apache.org/jira/browse/SPARK-32324 > Project: Spark > Issue Type: Improvement > Components: SQL > Affects Versions: 3.0.0 > Reporter: philipse > Priority: Minor > > Currently, when we use `lateral view` and `pivot` together in the FROM clause, if > `lateral view` comes before `pivot`, the error message is "LATERAL cannot be > used together with PIVOT in FROM clau
[jira] [Created] (SPARK-32324) Fix error messages during using PIVOT and lateral view
philipse created SPARK-32324: Summary: Fix error messages during using PIVOT and lateral view Key: SPARK-32324 URL: https://issues.apache.org/jira/browse/SPARK-32324 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.0.0 Reporter: philipse

Currently, when we use `lateral view` and `pivot` together in the FROM clause, if `lateral view` comes before `pivot`, the error message is "LATERAL cannot be used together with PIVOT in FROM clause". If `lateral view` comes after `pivot`, the query works normally, so the error message "LATERAL cannot be used together with PIVOT in FROM clause" is not accurate; we may improve it. Steps to reproduce:
```
CREATE TABLE person (id INT, name STRING, age INT, class int, address STRING);
INSERT INTO person VALUES
    (100, 'John', 30, 1, 'Street 1'),
    (200, 'Mary', NULL, 1, 'Street 2'),
    (300, 'Mike', 80, 3, 'Street 3'),
    (400, 'Dan', 50, 4, 'Street 4');
```
Query1:
```
SELECT * FROM person
lateral view outer explode(array(30,60)) tabelName as c_age
lateral view explode(array(40,80)) as d_age
PIVOT (
    count(distinct age) as a
    for name in ('Mary','John')
)
```
Result1:
```
Error: org.apache.spark.sql.catalyst.parser.ParseException:
LATERAL cannot be used together with PIVOT in FROM clause(line 1, pos 9)
== SQL ==
SELECT * FROM person
-^^^
lateral view outer explode(array(30,60)) tabelName as c_age
lateral view explode(array(40,80)) as d_age
PIVOT (
    count(distinct age) as a
    for name in ('Mary','John')
) (state=,code=0)
```
Query2:
```
SELECT * FROM person
PIVOT (
    count(distinct age) as a
    for name in ('Mary','John')
)
lateral view outer explode(array(30,60)) tabelName as c_age
lateral view explode(array(40,80)) as d_age
```
Result2:
```
+-----+------+------+-------+-------+
| id  | Mary | John | c_age | d_age |
+-----+------+------+-------+-------+
| 300 | NULL | NULL | 30    | 40    |
| 300 | NULL | NULL | 30    | 80    |
| 300 | NULL | NULL | 60    | 40    |
| 300 | NULL | NULL | 60    | 80    |
| 100 | 0    | NULL | 30    | 40    |
| 100 | 0    | NULL | 30    | 80    |
| 100 | 0    | NULL | 60    | 40    |
| 100 | 0    | NULL | 60    | 80    |
| 400 | NULL | NULL | 30    | 40    |
| 400 | NULL | NULL | 30    | 80    |
| 400 | NULL | NULL | 60    | 40    |
| 400 | NULL | NULL | 60    | 80    |
| 200 | NULL | 1    | 30    | 40    |
| 200 | NULL | 1    | 30    | 80    |
| 200 | NULL | 1    | 60    | 40    |
| 200 | NULL | 1    | 60    | 80    |
+-----+------+------+-------+-------+
```
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-32239) remove duplicate datatype
[ https://issues.apache.org/jira/browse/SPARK-32239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] philipse resolved SPARK-32239. -- Resolution: Won't Fix > remove duplicate datatype > - > > Key: SPARK-32239 > URL: https://issues.apache.org/jira/browse/SPARK-32239 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.0.0 >Reporter: philipse >Priority: Minor > > remove duplicate datatype to improve code quality -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-32239) remove duplicate datatype
philipse created SPARK-32239: Summary: remove duplicate datatype Key: SPARK-32239 URL: https://issues.apache.org/jira/browse/SPARK-32239 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.0.0 Reporter: philipse remove duplicate datatype to improve code quality -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-32193) update docs on regexp function
[ https://issues.apache.org/jira/browse/SPARK-32193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] philipse updated SPARK-32193: - Description: Spark SQL supports the following usage; we may update the docs to make it known to more users {code:java} select 'abc' REGEXP '([a-z]+)';{code} was: Hive supports the regexp function; Spark SQL uses `rlike` instead of `regexp`. We can update the docs to make it known to more users. > update docs on regexp function > --- > > Key: SPARK-32193 > URL: https://issues.apache.org/jira/browse/SPARK-32193 > Project: Spark > Issue Type: Improvement > Components: SQL > Affects Versions: 3.0.0 > Reporter: philipse > Priority: Minor > > Spark SQL supports the following usage; we may update the docs to make it known > to more users > {code:java} > select 'abc' REGEXP '([a-z]+)';{code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-32193) update docs on regexp function
[ https://issues.apache.org/jira/browse/SPARK-32193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] philipse updated SPARK-32193: - Summary: update docs on regexp function (was: update migrate guide docs on regexp function) > update docs on regexp function > --- > > Key: SPARK-32193 > URL: https://issues.apache.org/jira/browse/SPARK-32193 > Project: Spark > Issue Type: Improvement > Components: SQL > Affects Versions: 3.0.0 > Reporter: philipse > Priority: Minor > > Hive supports the regexp function; Spark SQL uses `rlike` instead of `regexp`. We > can update the docs to make it known to more users. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-32193) update migrate guide docs on regexp function
philipse created SPARK-32193: Summary: update migrate guide docs on regexp function Key: SPARK-32193 URL: https://issues.apache.org/jira/browse/SPARK-32193 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.0.0 Reporter: philipse Hive supports the regexp function; Spark SQL uses `rlike` instead of `regexp`. We can update the docs to make it known to more users.
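The semantics being documented can be illustrated outside Spark. A minimal Python sketch (illustrative only; Spark implements RLIKE/REGEXP with Java regexes, and `regexp_like` is a hypothetical helper name): like RLIKE, REGEXP performs an unanchored match, returning true when the pattern matches anywhere in the value.

```python
import re

def regexp_like(value: str, pattern: str) -> bool:
    """Illustrative stand-in for Spark SQL's RLIKE/REGEXP operator:
    true when the pattern matches anywhere in the value (unanchored)."""
    return re.search(pattern, value) is not None

# Mirrors: select 'abc' REGEXP '([a-z]+)';  -- true
print(regexp_like("abc", "([a-z]+)"))  # True
print(regexp_like("123", "([a-z]+)"))  # False
```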
[jira] [Updated] (SPARK-32131) union and set operations have wrong exception information
[ https://issues.apache.org/jira/browse/SPARK-32131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] philipse updated SPARK-32131: - Description: Union and set operations can only be performed on tables with the compatible column types,while when we have more than two column, the warning messages will have wrong column index.Steps to reproduce. Step1:prepare test data {code:java} drop table if exists test1; drop table if exists test2; drop table if exists test3; create table if not exists test1(id int, age int, name timestamp); create table if not exists test2(id int, age timestamp, name timestamp); create table if not exists test3(id int, age int, name int); insert into test1 select 1,2,'2020-01-01 01:01:01'; insert into test2 select 1,'2020-01-01 01:01:01','2020-01-01 01:01:01'; insert into test3 select 1,3,4; {code} Step2:do query: {code:java} Query1: select * from test1 except select * from test2; Result1: Error: org.apache.spark.sql.AnalysisException: Except can only be performed on tables with the compatible column types. timestamp <> int at the second column of the second table;; 'Except false :- Project [id#620, age#621, name#622] : +- SubqueryAlias `default`.`test1` : +- HiveTableRelation `default`.`test1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [id#620, age#621, name#622] +- Project [id#623, age#624, name#625] +- SubqueryAlias `default`.`test2` +- HiveTableRelation `default`.`test2`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [id#623, age#624, name#625] (state=,code=0) Query2: select * from test1 except select * from test3; Result2: Error: org.apache.spark.sql.AnalysisException: Except can only be performed on tables with the compatible column types. 
int <> timestamp at the 2th column of the second table;; 'Except false :- Project [id#632, age#633, name#634] : +- SubqueryAlias `default`.`test1` : +- HiveTableRelation `default`.`test1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [id#632, age#633, name#634] +- Project [id#635, age#636, name#637] +- SubqueryAlias `default`.`test3` +- HiveTableRelation `default`.`test3`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [id#635, age#636, name#637] (state=,code=0) {code} the result of query1 is correct, while query2 have the wrong errors,it should be the third column Here has the wrong column index. +Error: org.apache.spark.sql.AnalysisException: Except can only be performed on tables with the compatible column types. int <> timestamp at the *2th* column of the second table+ We may need to change to the following +Error: org.apache.spark.sql.AnalysisException: Except can only be performed on tables with the compatible column types. int <> timestamp at the *third* column of the second table+ was: Union and set operations can only be performed on tables with the compatible column types,while when we have more than two column, the warning messages will have wrong column index.Steps to reproduce. Step1:prepare test data {code:java} drop table if exists test1; drop table if exists test2; drop table if exists test3; create table if not exists test1(id int, age int, name timestamp); create table if not exists test2(id int, age timestamp, name timestamp); create table if not exists test3(id int, age int, name int); insert into test1 select 1,2,'2020-01-01 01:01:01'; insert into test2 select 1,'2020-01-01 01:01:01','2020-01-01 01:01:01'; insert into test3 select 1,3,4; {code} Step2:do query: {code:java} Query1: select * from test1 except select * from test2; Result1: Error: org.apache.spark.sql.AnalysisException: Except can only be performed on tables with the compatible column types. 
timestamp <> int at the second column of the second table;; 'Except false :- Project [id#620, age#621, name#622] : +- SubqueryAlias `default`.`test1` : +- HiveTableRelation `default`.`test1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [id#620, age#621, name#622] +- Project [id#623, age#624, name#625] +- SubqueryAlias `default`.`test2` +- HiveTableRelation `default`.`test2`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [id#623, age#624, name#625] (state=,code=0) Query2: select * from test1 except select * from test3; Result2: Error: org.apache.spark.sql.AnalysisException: Except can only be performed on tables with the compatible column types. int <> timestamp at the 2th column of the second table;; 'Except false :- Project [id#632, age#633, name#634] : +- SubqueryAlias `default`.`test1` : +- HiveTableRelation `default`.`test1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [id#632, age#633, name#634] +- Project [id#635, age#636, name#637] +- SubqueryAlias `default`.`test3` +- HiveTableRelation `default`.`test3`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [id#635, age#636, name#637] (state=,code=0) {code} the result of query1 is correct, while query2 have the wrong errors,it should be the third column
[jira] [Created] (SPARK-32131) union and set operations have wrong exception information
philipse created SPARK-32131: Summary: union and set operations have wrong exception information Key: SPARK-32131 URL: https://issues.apache.org/jira/browse/SPARK-32131 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.0.0 Reporter: philipse Union and set operations can only be performed on tables with compatible column types, but when there are more than two columns, the warning message can carry the wrong column index. Steps to reproduce. Step 1: prepare test data {code:java} drop table if exists test1; drop table if exists test2; drop table if exists test3; create table if not exists test1(id int, age int, name timestamp); create table if not exists test2(id int, age timestamp, name timestamp); create table if not exists test3(id int, age int, name int); insert into test1 select 1,2,'2020-01-01 01:01:01'; insert into test2 select 1,'2020-01-01 01:01:01','2020-01-01 01:01:01'; insert into test3 select 1,3,4; {code} Step 2: run the queries: {code:java} Query1: select * from test1 except select * from test2; Result1: Error: org.apache.spark.sql.AnalysisException: Except can only be performed on tables with the compatible column types. timestamp <> int at the second column of the second table;; 'Except false :- Project [id#620, age#621, name#622] : +- SubqueryAlias `default`.`test1` : +- HiveTableRelation `default`.`test1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [id#620, age#621, name#622] +- Project [id#623, age#624, name#625] +- SubqueryAlias `default`.`test2` +- HiveTableRelation `default`.`test2`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [id#623, age#624, name#625] (state=,code=0) Query2: select * from test1 except select * from test3; Result2: Error: org.apache.spark.sql.AnalysisException: Except can only be performed on tables with the compatible column types. int <> timestamp at the 2th column of the second table;; 'Except false :- Project [id#632, age#633, name#634] : +- SubqueryAlias `default`.`test1` : +- HiveTableRelation `default`.`test1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [id#632, age#633, name#634] +- Project [id#635, age#636, name#637] +- SubqueryAlias `default`.`test3` +- HiveTableRelation `default`.`test3`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [id#635, age#636, name#637] (state=,code=0) {code} The result of query1 is correct, while query2 gives the wrong error; it should report the third column.
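The "2th" in the second message suggests a zero-based column index formatted with a blanket "th" suffix. A hedged sketch of the intended formatting (hypothetical helper, not Spark's actual code, which spells out word ordinals such as "third"):

```python
def ordinal(n: int) -> str:
    """English ordinal for a 1-based position: 1st, 2nd, 3rd, 4th, 11th, ..."""
    if 11 <= n % 100 <= 13:          # 11th, 12th, 13th are irregular
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

# The mismatched column in the report sits at zero-based index 2,
# i.e. the 3rd (third) column, not the "2th":
bad_index = 2                        # zero-based index leaked into the message
print(ordinal(bad_index + 1))        # 3rd
```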
[jira] [Created] (SPARK-31954) delete duplicate test cases in hivequerysuite
philipse created SPARK-31954: Summary: delete duplicate test cases in hivequerysuite Key: SPARK-31954 URL: https://issues.apache.org/jira/browse/SPARK-31954 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 2.4.6 Reporter: philipse Remove duplicated test cases and result files in HiveQuerySuite.
[jira] [Created] (SPARK-31839) delete duplicate code
philipse created SPARK-31839: Summary: delete duplicate code Key: SPARK-31839 URL: https://issues.apache.org/jira/browse/SPARK-31839 Project: Spark Issue Type: Improvement Components: Tests Affects Versions: 2.4.5 Reporter: philipse There is duplicate code; we can remove it to improve test quality.
[jira] [Updated] (SPARK-31790) cast scenarios may generate different results between Hive and Spark
[ https://issues.apache.org/jira/browse/SPARK-31790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] philipse updated SPARK-31790: - Description: `CAST(n, TIMESTAMPTYPE)`: if n is of Byte/Short/Int/Long data type, Hive treats n as milliseconds, while Spark SQL treats it as seconds, so the cast results differ; please take care when using it. For example: {code:java} In spark spark-sql> select cast(1586318188000 as timestamp); 52238-06-04 13:06:400.0 spark-sql> select cast(1586318188 as timestamp); 2020-04-08 11:56:28 In Hive hive> select cast(1586318188000 as timestamp); 2020-04-08 11:56:28 hive> select cast(1586318188 as timestamp); 1970-01-19 16:38:38.188{code} was: `CAST(n, TIMESTAMPTYPE)`: if n is of Byte/Short/Int/Long data type, Hive treats n as milliseconds, while Spark SQL treats it as seconds, so the cast results differ; please take care when using it > cast scenarios may generate different results between Hive and Spark > - > > Key: SPARK-31790 > URL: https://issues.apache.org/jira/browse/SPARK-31790 > Project: Spark > Issue Type: Documentation > Components: Documentation >Affects Versions: 2.4.5 >Reporter: philipse >Priority: Minor > > `CAST(n, TIMESTAMPTYPE)`: if n is of Byte/Short/Int/Long data type, Hive treats n as milliseconds, while Spark SQL treats it as seconds, so the cast results differ; please take care when using it. > For example: > {code:java} > In spark > spark-sql> select cast(1586318188000 as timestamp); > 52238-06-04 13:06:400.0 > spark-sql> select cast(1586318188 as timestamp); > 2020-04-08 11:56:28 > In Hive > hive> select cast(1586318188000 as timestamp); > 2020-04-08 11:56:28 > hive> select cast(1586318188 as timestamp); > 1970-01-19 16:38:38.188{code} >
[jira] [Created] (SPARK-31790) cast scenarios may generate different results between Hive and Spark
philipse created SPARK-31790: Summary: cast scenarios may generate different results between Hive and Spark Key: SPARK-31790 URL: https://issues.apache.org/jira/browse/SPARK-31790 Project: Spark Issue Type: Documentation Components: Documentation Affects Versions: 2.4.5 Reporter: philipse `CAST(n, TIMESTAMPTYPE)`: if n is of Byte/Short/Int/Long data type, Hive treats n as milliseconds, while Spark SQL treats it as seconds, so the cast results differ; please take care when using it.
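The unit mismatch is easy to demonstrate outside both engines. A Python sketch (illustrative only; times below are rendered in UTC, whereas the 11:56:28 in the example is the same instant rendered in the session's local time zone, presumably UTC+8):

```python
from datetime import datetime, timezone

n = 1586318188000  # the value from the example; it looks like epoch milliseconds

# Hive-style: treat n as milliseconds since the epoch
print(datetime.fromtimestamp(n / 1000, tz=timezone.utc))  # 2020-04-08 03:56:28+00:00

# Spark-SQL-style: treat n as seconds since the epoch, landing around year 52238,
# which Python's datetime cannot even represent (its maximum year is 9999)
try:
    print(datetime.fromtimestamp(n, tz=timezone.utc))
except (OverflowError, OSError, ValueError) as exc:
    print("out of datetime range:", type(exc).__name__)
```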
[jira] [Updated] (SPARK-31710) result is not the same when querying and executing jobs
[ https://issues.apache.org/jira/browse/SPARK-31710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] philipse updated SPARK-31710: - Description: Hi Team Steps to reproduce. {code:java} create table test(id bigint); insert into test select 1586318188000; create table test1(id bigint) partitioned by (year string); insert overwrite table test1 partition(year) select 234,cast(id as TIMESTAMP) from test; {code} let's check the result. Case 1: *select * from test1;* 234 | 52238-06-04 13:06:400.0 --the result is wrong Case 2: *select 234,cast(id as TIMESTAMP) from test;* java.lang.IllegalArgumentException: Timestamp format must be -mm-dd hh:mm:ss[.f] at java.sql.Timestamp.valueOf(Timestamp.java:237) at org.apache.hive.jdbc.HiveBaseResultSet.evaluate(HiveBaseResultSet.java:441) at org.apache.hive.jdbc.HiveBaseResultSet.getColumnValue(HiveBaseResultSet.java:421) at org.apache.hive.jdbc.HiveBaseResultSet.getString(HiveBaseResultSet.java:530) at org.apache.hive.beeline.Rows$Row.(Rows.java:166) at org.apache.hive.beeline.BufferedRows.(BufferedRows.java:43) at org.apache.hive.beeline.BeeLine.print(BeeLine.java:1756) at org.apache.hive.beeline.Commands.execute(Commands.java:826) at org.apache.hive.beeline.Commands.sql(Commands.java:670) at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:974) at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:810) at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:767) at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:480) at org.apache.hive.beeline.BeeLine.main(BeeLine.java:463) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:226) at org.apache.hadoop.util.RunJar.main(RunJar.java:141) Error: Unrecognized 
column type:TIMESTAMP_TYPE (state=,code=0) I try hive,it works well,and the convert is fine and correct {code:java} select 234,cast(id as TIMESTAMP) from test; 234 2020-04-08 11:56:28 {code} Two questions: q1: if we forbid this convert,should we keep all cases the same? q2: if we allow the convert in some cases, should we decide the long length, for the code seems to force to convert to ns with times*100 nomatter how long the data is,if it convert to timestamp with incorrect length, we can raise the error. {code:java} // // converting seconds to us private[this] def longToTimestamp(t: Long): Long = t * 100L{code} Thanks! was: Hi Team Steps to reproduce. {code:java} create table test(id bigint); insert into test select 1586318188000; create table test1(id bigint) partitioned by (year string); insert overwrite table test1 partition(year) select 234,cast(id as TIMESTAMP) from test; {code} let's check the result. Case 1: *select * from test1;* 234 | 52238-06-04 13:06:400.0 Case 2: *select 234,cast(id as TIMESTAMP) from test;* java.lang.IllegalArgumentException: Timestamp format must be -mm-dd hh:mm:ss[.f] at java.sql.Timestamp.valueOf(Timestamp.java:237) at org.apache.hive.jdbc.HiveBaseResultSet.evaluate(HiveBaseResultSet.java:441) at org.apache.hive.jdbc.HiveBaseResultSet.getColumnValue(HiveBaseResultSet.java:421) at org.apache.hive.jdbc.HiveBaseResultSet.getString(HiveBaseResultSet.java:530) at org.apache.hive.beeline.Rows$Row.(Rows.java:166) at org.apache.hive.beeline.BufferedRows.(BufferedRows.java:43) at org.apache.hive.beeline.BeeLine.print(BeeLine.java:1756) at org.apache.hive.beeline.Commands.execute(Commands.java:826) at org.apache.hive.beeline.Commands.sql(Commands.java:670) at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:974) at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:810) at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:767) at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:480) at 
org.apache.hive.beeline.BeeLine.main(BeeLine.java:463) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:226) at org.apache.hadoop.util.RunJar.main(RunJar.java:141) Error: Unrecognized column type:TIMESTAMP_TYPE (state=,code=0) I try hive,it works well,and the convert is correct Two questions: q1: if we forbid this convert,should we keep all cases the same? q2: if we allow the convert in some cases, should we decide the long length, for the code seems to force to convert to ns with times*100 nomatter how long the data is,if it convert to timestamp with inc
[jira] [Created] (SPARK-31710) result is not the same when querying and executing jobs
philipse created SPARK-31710: Summary: result is not the same when querying and executing jobs Key: SPARK-31710 URL: https://issues.apache.org/jira/browse/SPARK-31710 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 2.4.5 Environment: hdp:2.7.7 spark:2.4.5 Reporter: philipse Hi Team, Steps to reproduce. {code:java} create table test(id bigint); insert into test select 1586318188000; create table test1(id bigint) partitioned by (year string); insert overwrite table test1 partition(year) select 234,cast(id as TIMESTAMP) from test; {code} Let's check the result. Case 1: *select * from test1;* 234 | 52238-06-04 13:06:400.0 Case 2: *select 234,cast(id as TIMESTAMP) from test;* java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff] at java.sql.Timestamp.valueOf(Timestamp.java:237) at org.apache.hive.jdbc.HiveBaseResultSet.evaluate(HiveBaseResultSet.java:441) at org.apache.hive.jdbc.HiveBaseResultSet.getColumnValue(HiveBaseResultSet.java:421) at org.apache.hive.jdbc.HiveBaseResultSet.getString(HiveBaseResultSet.java:530) at org.apache.hive.beeline.Rows$Row.<init>(Rows.java:166) at org.apache.hive.beeline.BufferedRows.<init>(BufferedRows.java:43) at org.apache.hive.beeline.BeeLine.print(BeeLine.java:1756) at org.apache.hive.beeline.Commands.execute(Commands.java:826) at org.apache.hive.beeline.Commands.sql(Commands.java:670) at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:974) at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:810) at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:767) at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:480) at org.apache.hive.beeline.BeeLine.main(BeeLine.java:463) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:226) at org.apache.hadoop.util.RunJar.main(RunJar.java:141) Error: Unrecognized column type:TIMESTAMP_TYPE (state=,code=0) I tried Hive; it works well, and the conversion is correct. Two questions: q1: if we forbid this conversion, should we keep all cases the same? q2: if we allow the conversion in some cases, should we validate the magnitude of the long? The code seems to force the conversion with a fixed multiplier no matter how large the value is; if it would convert to an out-of-range timestamp, we can raise an error. {code:java} // converting seconds to us private[this] def longToTimestamp(t: Long): Long = t * 100L{code} Thanks!
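The quoted snippet's comment says it converts seconds to "us" (microseconds), so in Spark's source the multiplier is correspondingly much larger than the 100 shown here. Independent of the exact factor, the check that q2 asks for can be sketched as a range test before the multiply (hedged Python sketch; the names and the bound are hypothetical, not Spark's actual code):

```python
# Hedged sketch (not Spark's actual implementation): convert a long holding
# SECONDS into internal MICROSECONDS, rejecting values that would land far
# out of range instead of silently producing a year-52238 timestamp.
MICROS_PER_SECOND = 1_000_000
MAX_SECONDS = 253_402_300_799  # 9999-12-31 23:59:59 UTC, an assumed cap

def long_to_timestamp_micros(t: int) -> int:
    if abs(t) > MAX_SECONDS:
        raise ValueError(
            f"{t} is out of range when read as seconds; "
            "it may be a milliseconds value -- divide by 1000 first"
        )
    return t * MICROS_PER_SECOND

print(long_to_timestamp_micros(1586318188))   # 1586318188000000
# long_to_timestamp_micros(1586318188000) would raise ValueError
```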
[jira] [Commented] (SPARK-31588) merge small files may need more common setting
[ https://issues.apache.org/jira/browse/SPARK-31588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105530#comment-17105530 ] philipse commented on SPARK-31588: -- Thanks Hyukjin for your advice , i will reconsider it. > merge small files may need more common setting > -- > > Key: SPARK-31588 > URL: https://issues.apache.org/jira/browse/SPARK-31588 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 2.4.5 > Environment: spark:2.4.5 > hdp:2.7 >Reporter: philipse >Priority: Major > > Hi , > SparkSql now allow us to use repartition or coalesce to manually control the > small files like the following > /*+ REPARTITION(1) */ > /*+ COALESCE(1) */ > But it can only be tuning case by case ,we need to decide whether we need to > use COALESCE or REPARTITION,can we try a more common way to reduce the > decision by set the target size as hive did > *Good points:* > 1)we will also the new partitions number > 2)with an ON-OFF parameter provided , user can close it if needed > 3)the parmeter can be set at cluster level instand of user side,it will be > more easier to controll samll files. > 4)greatly reduce the pressue of namenode > > *Not good points:* > 1)It will add a new task to calculate the target numbers by stastics the out > files. > > I don't know whether we have planned this in future. > > Thanks -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-31588) merge small files may need more common setting
[ https://issues.apache.org/jira/browse/SPARK-31588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17102414#comment-17102414 ] philipse commented on SPARK-31588: -- Yes, the block size can be controlled in HDFS. I mean we just take the block size as one of the conditions: if we can control the target size in Spark, we can control the real data size in HDFS, instead of using repartition to control a hard limit. > merge small files may need more common setting > -- > > Key: SPARK-31588 > URL: https://issues.apache.org/jira/browse/SPARK-31588 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 2.4.5 > Environment: spark:2.4.5 > hdp:2.7 >Reporter: philipse >Priority: Major > > Hi, > Spark SQL now allows us to use repartition or coalesce to manually control the small files, like the following: > /*+ REPARTITION(1) */ > /*+ COALESCE(1) */ > But it can only be tuned case by case; we need to decide whether to use COALESCE or REPARTITION. Can we try a more common way to reduce that decision by setting the target size, as Hive did? > *Good points:* > 1) we will also know the new partition number > 2) with an ON-OFF parameter provided, users can disable it if needed > 3) the parameter can be set at cluster level instead of on the user side, making it easier to control small files > 4) greatly reduces the pressure on the NameNode > > *Not good points:* > 1) It adds a new task to calculate the target numbers from statistics on the output files. > > I don't know whether we have planned this in the future. > > Thanks
[jira] [Commented] (SPARK-31588) merge small files may need more common setting
[ https://issues.apache.org/jira/browse/SPARK-31588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17101775#comment-17101775 ] philipse commented on SPARK-31588: -- For example: if we have 3 output files, sized 10M, 50M, and 200M, with a block size of 128M, we may keep the file sizes close to the average, but we should also keep them at least as big as the block, in case someone sets a wrong parameter. Case 1: we set the target size to 60M. The expected average file size is Max(blocksize, 60M), and the output file count used as the repartition number is [total_file_size / average_file_size] + 1, so the final result will be 3 files, sized 128M, 128M, and 4M. If we set the target size to 5120M, then it will repartition to 1 file, sized 260M. Thus, we can set the target size as a global parameter and it will benefit all tasks. > merge small files may need more common setting > -- > > Key: SPARK-31588 > URL: https://issues.apache.org/jira/browse/SPARK-31588 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 2.4.5 > Environment: spark:2.4.5 > hdp:2.7 >Reporter: philipse >Priority: Major > > Hi, > Spark SQL now allows us to use repartition or coalesce to manually control the small files, like the following: > /*+ REPARTITION(1) */ > /*+ COALESCE(1) */ > But it can only be tuned case by case; we need to decide whether to use COALESCE or REPARTITION. Can we try a more common way to reduce that decision by setting the target size, as Hive did? > *Good points:* > 1) we will also know the new partition number > 2) with an ON-OFF parameter provided, users can disable it if needed > 3) the parameter can be set at cluster level instead of on the user side, making it easier to control small files > 4) greatly reduces the pressure on the NameNode > > *Not good points:* > 1) It adds a new task to calculate the target numbers from statistics on the output files. > > I don't know whether we have planned this in the future. > > Thanks
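The sizing arithmetic in the example above can be sketched directly (hedged; the function and parameter names are hypothetical, and the count is simply the total size divided by the expected file size, rounded up to a whole file):

```python
# Hedged sketch of the proposal's sizing rule: the expected file size is
# max(block_size, target_size), and the repartition number is total_size
# divided by that expected size, rounded up.
def partitions_for_merge(total_size: int, block_size: int, target_size: int) -> int:
    expected = max(block_size, target_size)
    return total_size // expected + (1 if total_size % expected else 0)

MB = 1024 * 1024
total = (10 + 50 + 200) * MB          # the three output files from the example
print(partitions_for_merge(total, 128 * MB, 60 * MB))    # 3 -> ~128M, 128M, 4M
print(partitions_for_merge(total, 128 * MB, 5120 * MB))  # 1 -> one 260M file
```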
[jira] [Created] (SPARK-31588) merge small files may need more common setting
philipse created SPARK-31588: Summary: merge small files may need more common setting Key: SPARK-31588 URL: https://issues.apache.org/jira/browse/SPARK-31588 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 2.4.5 Environment: spark:2.4.5 hdp:2.7 Reporter: philipse Hi, Spark SQL now allows us to use repartition or coalesce to manually control the small files, like the following: /*+ REPARTITION(1) */ /*+ COALESCE(1) */ But it can only be tuned case by case; we need to decide whether to use COALESCE or REPARTITION. Can we try a more common way to reduce that decision by setting the target size, as Hive did? *Good points:* 1) we will also know the new partition number 2) with an ON-OFF parameter provided, users can disable it if needed 3) the parameter can be set at cluster level instead of on the user side, making it easier to control small files 4) greatly reduces the pressure on the NameNode *Not good points:* 1) It adds a new task to calculate the target numbers from statistics on the output files. I don't know whether we have planned this in the future. Thanks
[jira] [Commented] (SPARK-24194) HadoopFsRelation cannot overwrite a path that is also being read from
[ https://issues.apache.org/jira/browse/SPARK-24194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17093055#comment-17093055 ] philipse commented on SPARK-24194: -- Hi, is this issue closed? Can I try it in a production environment? Thanks > HadoopFsRelation cannot overwrite a path that is also being read from > - > > Key: SPARK-24194 > URL: https://issues.apache.org/jira/browse/SPARK-24194 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.1.0 > Environment: spark master >Reporter: yangz >Priority: Minor > Original Estimate: 24h > Remaining Estimate: 24h > > When > {code:java} > INSERT OVERWRITE TABLE territory_count_compare select * from > territory_count_compare where shop_count!=real_shop_count > {code} > and territory_count_compare is a table stored as parquet, there will be an error: Cannot overwrite a path that is also being read from > > And in file MetastoreDataSourceSuite.scala, there is a test case > > > {code:java} > table(tableName).write.mode(SaveMode.Overwrite).insertInto(tableName) > {code} > > But when territory_count_compare is a common Hive table, there is no error. > So I think the reason is that when inserting overwrite into a HadoopFs relation with a static partition, it first deletes the partition in the output, but that should happen only when the job is committed.
[jira] [Commented] (SPARK-31508) string type compare with numeric cause data inaccurate
[ https://issues.apache.org/jira/browse/SPARK-31508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17089663#comment-17089663 ] philipse commented on SPARK-31508: -- Haha, but normally it will be a little more complex: we will migrate many HQLs to Spark SQL, so I suggest it is better dealt with in the code. Can you help review the PR? > string type compare with numeric cause data inaccurate > --- > > Key: SPARK-31508 > URL: https://issues.apache.org/jira/browse/SPARK-31508 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.4.5 > Environment: hadoop2.7 > spark2.4.5 >Reporter: philipse >Priority: Major > Attachments: image-2020-04-22-20-00-09-821.png > > > Hi all, > > Spark SQL should perhaps convert values to double when a string type is compared with a number type. The case is shown below: > 1. create a table: > create table test1(id string); > > 2. insert data into the table: > insert into test1 select 'avc'; > insert into test1 select '2'; > insert into test1 select '0a'; > insert into test1 select ''; > insert into test1 select '22'; > 3. Let's check what happens: > select * from test_gf13871.test1 where id > 0 > The results show below: > *2* > ** > Really amazing: the big number 222... cannot be selected, > while when I check in Hive, the 222... shows up normally. > 4. Trying to explain the command, we can see what happened: if the data is bigger > than max_int_value, it will not be selected; we may need to convert to > double instead. > !image-2020-04-21-18-49-58-850.png! > I want to know whether this is fixed or planned in 3.0 or a later version. Please feel > free to give any advice. > > Many Thanks
[jira] [Created] (SPARK-31512) make window function using order by optional
philipse created SPARK-31512: Summary: make window function using order by optional Key: SPARK-31512 URL: https://issues.apache.org/jira/browse/SPARK-31512 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 2.4.5 Environment: spark2.4.5 hadoop2.7.7 Reporter: philipse Hi all, In other SQL dialects, ORDER BY is not a must when using a window function; we may make it optional via a parameter. Below is the case: *select row_number()over() from test1* Error: org.apache.spark.sql.AnalysisException: Window function row_number() requires window to be ordered, please add ORDER BY clause. For example SELECT row_number()(value_expr) OVER (PARTITION BY window_partition ORDER BY window_ordering) from table; (state=,code=0) So I suggest making it a choice; otherwise we will meet this error when migrating SQL from other dialects, such as Hive
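For comparison, dialects that already treat the ORDER BY as optional accept an empty window specification. A sketch using SQLite (3.25 or later), standing in here for Hive; row numbers are then assigned in an unspecified order:

```python
import sqlite3

# row_number() with an empty OVER () is accepted by SQLite (and by Hive),
# which is the behavior this issue asks Spark SQL to allow as well.
conn = sqlite3.connect(":memory:")
conn.execute("create table test1(id int)")
conn.executemany("insert into test1 values (?)", [(10,), (20,), (30,)])
rows = conn.execute("select row_number() over () from test1").fetchall()
print(rows)  # [(1,), (2,), (3,)]
```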
[jira] [Updated] (SPARK-31508) string type compare with numeric cause data inaccurate
[ https://issues.apache.org/jira/browse/SPARK-31508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] philipse updated SPARK-31508: - Summary: string type compare with numeric cause data inaccurate (was: string type compare with numeric case data inaccurate) > string type compare with numeric cause data inaccurate > --- > > Key: SPARK-31508 > URL: https://issues.apache.org/jira/browse/SPARK-31508 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.4.5 > Environment: hadoop2.7 > spark2.4.5 >Reporter: philipse >Priority: Major > > Hi all, > > Spark SQL should perhaps convert values to double when a string type is compared with a number type. The case is shown below: > 1. create a table: > create table test1(id string); > > 2. insert data into the table: > insert into test1 select 'avc'; > insert into test1 select '2'; > insert into test1 select '0a'; > insert into test1 select ''; > insert into test1 select '22'; > 3. Let's check what happens: > select * from test_gf13871.test1 where id > 0 > The results show below: > *2* > ** > Really amazing: the big number 222... cannot be selected, > while when I check in Hive, the 222... shows up normally. > 4. Trying to explain the command, we can see what happened: if the data is bigger > than max_int_value, it will not be selected; we may need to convert to > double instead. > !image-2020-04-21-18-49-58-850.png! > I want to know whether this is fixed or planned in 3.0 or a later version. Please feel > free to give any advice. > > Many Thanks
[jira] [Created] (SPARK-31508) string type compare with numberic case data inaccurate
philipse created SPARK-31508: Summary: string type compare with numberic case data inaccurate Key: SPARK-31508 URL: https://issues.apache.org/jira/browse/SPARK-31508 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 2.4.5 Environment: hadoop2.7 spark2.4.5 Reporter: philipse Hi all, Spark SQL should probably convert both sides to double when a string type is compared with a numeric type. The case is shown below. 1. Create a table: create table test1(id string); 2. Insert data into the table: insert into test1 select 'avc'; insert into test1 select '2'; insert into test1 select '0a'; insert into test1 select ''; insert into test1 select '22'; 3. Check what happens: select * from test_gf13871.test1 where id > 0 The result is only: *2* ** Surprisingly, the big number 222... cannot be selected, while the same check in Hive shows the 222... value normally. 4. Explaining the command reveals what happened: if the value is bigger than max_int_value, it is not selected, so we may need to convert to double instead. !image-2020-04-21-18-49-58-850.png! I want to know whether this has been fixed or is planned for 3.0 or a later version. Please feel free to give any advice. Many thanks
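The filtering behavior the report describes can be illustrated with a small Python sketch (hypothetical; it mimics the cast semantics the reporter observed, not Spark's actual code, and uses a made-up big value in place of the truncated 222... number): casting the string column to a 32-bit int drops values that overflow, while casting both sides to double keeps them.

```python
# Hypothetical simulation of "WHERE id > 0" on a string column.
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def cast_to_int32(s):
    """Int cast: None on malformed input or 32-bit overflow."""
    try:
        v = int(s)
    except ValueError:
        return None
    return v if INT32_MIN <= v <= INT32_MAX else None

def cast_to_double(s):
    """Double cast instead: big integers survive (with rounding)."""
    try:
        return float(s)
    except ValueError:
        return None

ids = ["avc", "2", "0a", "", "2222222222222222222"]  # last value made up

# With an int cast, the huge value overflows and the row is dropped.
kept_int = [s for s in ids if (v := cast_to_int32(s)) is not None and v > 0]
print(kept_int)  # ['2']  -- matches the surprising result in the report

# The suggested fix: compare as double, so the big number is kept.
kept_dbl = [s for s in ids if (v := cast_to_double(s)) is not None and v > 0]
print(kept_dbl)  # ['2', '2222222222222222222']
```

The design question the report raises is exactly this choice of common type for a string-vs-numeric comparison: double loses no rows to overflow, at the cost of possible precision loss for very large integers.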
[jira] [Commented] (SPARK-18681) Throw Filtering is supported only on partition keys of type string exception
[ https://issues.apache.org/jira/browse/SPARK-18681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074361#comment-17074361 ] philipse commented on SPARK-18681: -- [~michael] Any news on this issue? I hit the same problem on spark2.4.5. > Throw Filtering is supported only on partition keys of type string exception > > > Key: SPARK-18681 > URL: https://issues.apache.org/jira/browse/SPARK-18681 > Project: Spark > Issue Type: Bug > Components: SQL > Affects Versions: 2.1.0 > Reporter: Yuming Wang > Assignee: Yuming Wang > Priority: Major > Fix For: 2.1.0 > > > Cloudera puts > {{/var/run/cloudera-scm-agent/process/15000-hive-HIVEMETASTORE/hive-site.xml}} > as the configuration file for the Hive Metastore Server, where > {{hive.metastore.try.direct.sql=false}}. But Spark reads the gateway > configuration file and gets the default value > {{hive.metastore.try.direct.sql=true}}. We should use the {{getMetaConf}} or > {{getMSC.getConfigValue}} method to obtain the original configuration from the > Hive Metastore Server. > {noformat} > spark-sql> CREATE TABLE test (value INT) PARTITIONED BY (part INT); > Time taken: 0.221 seconds > spark-sql> select * from test where part=1 limit 10; > 16/12/02 08:33:45 ERROR thriftserver.SparkSQLDriver: Failed in [select * from > test where part=1 limit 10] > java.lang.RuntimeException: Caught Hive MetaException attempting to get > partition metadata by filter from Hive. You can set the Spark configuration > setting spark.sql.hive.manageFilesourcePartitions to false to work around > this problem, however this will result in degraded performance. 
Please report > a bug: https://issues.apache.org/jira/browse/SPARK > at > org.apache.spark.sql.hive.client.Shim_v0_13.getPartitionsByFilter(HiveShim.scala:610) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionsByFilter$1.apply(HiveClientImpl.scala:549) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionsByFilter$1.apply(HiveClientImpl.scala:547) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:282) > at > org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:229) > at > org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:228) > at > org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:271) > at > org.apache.spark.sql.hive.client.HiveClientImpl.getPartitionsByFilter(HiveClientImpl.scala:547) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$listPartitionsByFilter$1.apply(HiveExternalCatalog.scala:954) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$listPartitionsByFilter$1.apply(HiveExternalCatalog.scala:938) > at > org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:91) > at > org.apache.spark.sql.hive.HiveExternalCatalog.listPartitionsByFilter(HiveExternalCatalog.scala:938) > at > org.apache.spark.sql.hive.MetastoreRelation.getHiveQlPartitions(MetastoreRelation.scala:156) > at > org.apache.spark.sql.hive.execution.HiveTableScanExec$$anonfun$10.apply(HiveTableScanExec.scala:151) > at > org.apache.spark.sql.hive.execution.HiveTableScanExec$$anonfun$10.apply(HiveTableScanExec.scala:150) > at org.apache.spark.util.Utils$.withDummyCallSite(Utils.scala:2435) > at > org.apache.spark.sql.hive.execution.HiveTableScanExec.doExecute(HiveTableScanExec.scala:149) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114) > at > 
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > at > org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132) > at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113) > at > org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:225) > at > org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:308) > at > org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38) > at > org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:295) > at > org.apache.spark.sql.execution.QueryExecution$$anonfun$hiveResultString$4.apply(QueryExecution.scala:134) > at > org.apache.spark.sql.execution.QueryExecution$$anonfun$hiveResultString$4.apply(QueryExecution.scala:13
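The failure and workaround visible in the stack trace above can be sketched in Python (a hypothetical simulation of the pattern, not Spark's Scala shim code): server-side partition pruning by filter may fail depending on metastore configuration, and a client can fall back to listing every partition and filtering locally, which returns the same rows at degraded performance.

```python
# Hypothetical sketch: push a partition filter to the metastore when
# possible; on failure, list all partitions and filter client-side.

class MetastoreError(Exception):
    """Stands in for the MetaException seen in the stack trace."""

def get_partitions_by_filter_server(partitions, pred, direct_sql_enabled):
    """Server-side pruning; fails when the metastore cannot push filters."""
    if not direct_sql_enabled:
        raise MetastoreError("partition filter pushdown unavailable")
    return [p for p in partitions if pred(p)]

def get_partitions(partitions, pred, direct_sql_enabled):
    """Try server-side pruning, fall back to client-side filtering."""
    try:
        return get_partitions_by_filter_server(
            partitions, pred, direct_sql_enabled)
    except MetastoreError:
        # Fallback: fetch everything, filter locally (slower, same result).
        return [p for p in partitions if pred(p)]

parts = [{"part": 1}, {"part": 2}, {"part": 3}]
pred = lambda p: p["part"] == 1  # models: WHERE part=1

print(get_partitions(parts, pred, direct_sql_enabled=True))   # [{'part': 1}]
print(get_partitions(parts, pred, direct_sql_enabled=False))  # [{'part': 1}]
```

Either path yields the partitions for `part=1`; the difference is only whether the metastore or the client does the filtering, which is why the error message frames disabling managed partitions as a performance trade-off rather than a correctness one.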