[ 
https://issues.apache.org/jira/browse/SPARK-32131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-32131:
----------------------------------
    Affects Version/s: 2.1.3

> union and set operations have wrong exception information
> ---------------------------------------------------------
>
>                 Key: SPARK-32131
>                 URL: https://issues.apache.org/jira/browse/SPARK-32131
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.1.3, 2.2.3, 2.3.4, 2.4.6, 3.0.0
>            Reporter: philipse
>            Priority: Minor
>
> Union and set operations can only be performed on tables with compatible 
> column types. However, when the tables have more than two columns, the error 
> message reports the wrong column index. Steps to reproduce:
> Step 1: prepare test data
> {code:java}
> drop table if exists test1; 
> drop table if exists test2; 
> drop table if exists test3;
> create table if not exists test1(id int, age int, name timestamp);
> create table if not exists test2(id int, age timestamp, name timestamp);
> create table if not exists test3(id int, age int, name int);
> insert into test1 select 1,2,'2020-01-01 01:01:01';
> insert into test2 select 1,'2020-01-01 01:01:01','2020-01-01 01:01:01'; 
> insert into test3 select 1,3,4;
> {code}
> Step 2: run the queries:
> {code:java}
> Query1:
> select * from test1 except select * from test2;
> Result1:
> Error: org.apache.spark.sql.AnalysisException: Except can only be performed 
> on tables with the compatible column types. timestamp <> int at the second 
> column of the second table;; 'Except false :- Project [id#620, age#621, 
> name#622] : +- SubqueryAlias `default`.`test1` : +- HiveTableRelation 
> `default`.`test1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
> [id#620, age#621, name#622] +- Project [id#623, age#624, name#625] +- 
> SubqueryAlias `default`.`test2` +- HiveTableRelation `default`.`test2`, 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [id#623, age#624, 
> name#625] (state=,code=0)
> Query2:
> select * from test1 except select * from test3;
> Result2:
> Error: org.apache.spark.sql.AnalysisException: Except can only be performed 
> on tables with the compatible column types. int <> timestamp at the 2th 
> column of the second table;; 'Except false :- Project [id#632, age#633, 
> name#634] : +- SubqueryAlias `default`.`test1` : +- HiveTableRelation 
> `default`.`test1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
> [id#632, age#633, name#634] +- Project [id#635, age#636, name#637] +- 
> SubqueryAlias `default`.`test3` +- HiveTableRelation `default`.`test3`, 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [id#635, age#636, 
> name#637] (state=,code=0)
> {code}
> The result of Query1 is correct, but the error for Query2 names the wrong 
> column: the mismatch is in the third column, not the second.
> The message currently reads:
> +Error: org.apache.spark.sql.AnalysisException: Except can only be performed 
> on tables with the compatible column types. int <> timestamp at the *2th* 
> column of the second table+
> We may need to change it to the following:
> +Error: org.apache.spark.sql.AnalysisException: Except can only be performed 
> on tables with the compatible column types. int <> timestamp at the *third* 
> column of the second table+
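
The "2th" wording in Result2 is consistent with a zero-based column index being formatted by a helper that only has words for the first two positions, although the report itself does not confirm the cause. The sketch below shows one way such a helper could produce the expected wording. It is written in Python for illustration only and is not Spark's code; `ordinal_number` and `mismatch_message` are hypothetical names.

```python
def ordinal_number(zero_based_index: int) -> str:
    """Return an English ordinal for a zero-based index (hypothetical helper)."""
    if zero_based_index < 0:
        raise ValueError("index must be non-negative")
    words = ["first", "second", "third"]
    if zero_based_index < len(words):
        return words[zero_based_index]
    # Convert to a one-based position before attaching a suffix.
    position = zero_based_index + 1
    # Positions ending in 10..20 (11th, 12th, 13th, ...) always take "th".
    if 10 <= position % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(position % 10, "th")
    return f"{position}{suffix}"


def mismatch_message(operator: str, left_type: str, right_type: str,
                     column_index: int, table_index: int) -> str:
    """Build a message in the style quoted in this issue (assumed format)."""
    return (
        f"{operator} can only be performed on tables with the compatible "
        f"column types. {left_type} <> {right_type} at the "
        f"{ordinal_number(column_index)} column of the "
        f"{ordinal_number(table_index)} table"
    )


# Query2 from the report: the mismatch is at zero-based column 2 of table 1.
print(mismatch_message("Except", "int", "timestamp", 2, 1))
```

Converting the index to a one-based position before formatting keeps the numeric fallback aligned with the spelled-out words, so a mismatch at zero-based index 3 would read "4th" rather than "3th".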



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

