[ https://issues.apache.org/jira/browse/SPARK-42594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
zzzzming95 updated SPARK-42594:
-------------------------------
    Description: 
1. Spark saves the view schema as a table parameter.
2. Spark uses that table parameter as the output schema when selecting from the view.
3. Hive does not update the table parameter when running `create or replace view` to update the view.

!image-2023-02-27-13-31-20-420.png!

So when Hive and Spark are mixed and Hive updates the view, Spark may ignore some columns.

To reproduce this issue:

1. Run in Spark:
```
create table test_spark (id string);
create view test_spark_view as select id from test_spark;
```
2. Run in Hive:
```
create or replace view test_spark_view as select id , "test" as new_id from test_spark;
```
3. Spark ignores `test_spark_view#new_id` when selecting from test_spark_view, but Hive can read it.

I'm not sure if this is a feature of Spark.

  was:
1. Spark saves the view schema as a table parameter.
2. Spark uses that table parameter as the output schema when selecting from the view.
3. Hive does not update the table parameter when running `create or replace view` to update the view.

So when Hive and Spark are mixed and Hive updates the view, Spark may ignore some strings.

To reproduce this issue:

1. Run in Spark:
```
create table test_spark (id string);
create view test_spark_view as select id from test_spark;
```
2. Run in Hive:
```
create or replace view test_spark_view as select id , "test" as new_id from test_spark;
```
3. Spark ignores `test_spark_view#new_id` when selecting from test_spark_view, but Hive can read it.


> spark can not read lastest view sql when run `create or replace view` by hive
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-42594
>                 URL: https://issues.apache.org/jira/browse/SPARK-42594
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.3.2
>            Reporter: zzzzming95
>            Priority: Major
>         Attachments: image-2023-02-27-13-31-20-420.png
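
The three-step mechanism described in this issue can be illustrated with a small toy model. This is only a sketch of the behaviour as reported, not Spark or Hive code: the `ViewEntry` class, the `SCHEMA_PARAM` key, and all function names are hypothetical and chosen purely for illustration.

```python
# Toy model of the behaviour described in this report.
# NOT Spark or Hive code: every name below is hypothetical.
from dataclasses import dataclass, field

SCHEMA_PARAM = "schema_saved_by_spark"  # hypothetical table-parameter key


@dataclass
class ViewEntry:
    view_columns: list                          # columns of the current view SQL
    params: dict = field(default_factory=dict)  # table parameters


def spark_create_view(columns):
    # Point 1: Spark saves the view schema as a table parameter.
    return ViewEntry(list(columns), {SCHEMA_PARAM: list(columns)})


def hive_replace_view(entry, columns):
    # Point 3: Hive updates the view definition but leaves the
    # table parameter untouched.
    entry.view_columns = list(columns)


def spark_select_columns(entry):
    # Point 2: Spark uses the saved table parameter as the output schema.
    return entry.params.get(SCHEMA_PARAM, entry.view_columns)


def hive_select_columns(entry):
    # Hive reads the columns from the current view definition.
    return entry.view_columns


view = spark_create_view(["id"])           # "create view ... select id"
hive_replace_view(view, ["id", "new_id"])  # "create or replace view" in Hive
print(spark_select_columns(view))  # ['id']
print(hive_select_columns(view))   # ['id', 'new_id']
```

In this model the stale parameter is the only reason the two readers disagree, which matches the symptom in the reproduction steps: `new_id` is visible from Hive but not from Spark.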