[ https://issues.apache.org/jira/browse/SPARK-25367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
yy updated SPARK-25367:
-----------------------
Description:

We saved a DataFrame as a Hive table in ORC/Parquet format from the spark-shell. After we modified a column's type (bigint to double in the example below) through Hive JDBC, the column type queried in the spark-shell did not change, although it had changed in Hive JDBC. Even after restarting the spark-shell, the table's schema was still inconsistent with the one shown in Hive JDBC.

The steps to reproduce are as follows:

spark-shell:
{code:java}
val df = spark.read.json("examples/src/main/resources/people.json")
df.write.format("orc").saveAsTable("people_test")
spark.catalog.refreshTable("people_test")
spark.sql("desc people_test").show()
+--------+---------+-------+
|col_name|data_type|comment|
+--------+---------+-------+
|     age|   bigint|   null|
|    name|   string|   null|
+--------+---------+-------+
{code}

hive:
{code:java}
hive> desc people_test;
OK
age                     bigint
name                    string
Time taken: 0.454 seconds, Fetched: 2 row(s)
hive> alter table people_test change column age age1 double;
OK
Time taken: 0.68 seconds
hive> desc people_test;
OK
age1                    double
name                    string
Time taken: 0.358 seconds, Fetched: 2 row(s)
{code}

spark-shell:
{code:java}
spark.sql("desc people_test").show()
+--------+---------+-------+
|col_name|data_type|comment|
+--------+---------+-------+
|     age|   bigint|   null|
|    name|   string|   null|
+--------+---------+-------+
{code}

We also tested a table created in the spark-shell via spark.sql("create table XXX()"); for such tables the modified columns stay consistent between Spark and Hive.
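A likely explanation (our assumption, not confirmed in this ticket): tables written through DataFrameWriter.saveAsTable are datasource tables, and Spark keeps its own copy of the schema in the Hive table properties (the spark.sql.sources.schema.* keys). Hive's ALTER TABLE rewrites only the Hive column metadata, not those properties, so Spark keeps reading the stale schema. A minimal sketch to check this from the spark-shell (the table name people_test2 below is hypothetical):

```scala
// 1) Inspect the properties Spark stored when saveAsTable created the table.
//    If a key like spark.sql.sources.schema.part.0 still holds the original
//    schema, that would explain why DESC in Spark ignores Hive's ALTER TABLE.
spark.sql("SHOW TBLPROPERTIES people_test").show(false)

// 2) Workaround in the spirit of the reporter's last test: create the table
//    with explicit Hive DDL and insert into it, so Spark stores no separate
//    schema copy and later Hive ALTER TABLE changes stay visible to Spark.
spark.sql("CREATE TABLE people_test2 (age BIGINT, name STRING) STORED AS ORC")
spark.sql("INSERT INTO people_test2 SELECT age, name FROM people_test")
spark.sql("desc people_test2").show()
```

If the property inspection confirms the stale schema copy, recreating the table with Hive DDL as above (rather than saveAsTable) is a reasonable mitigation until the catalogs are reconciled.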
> Hive table created by Spark DataFrame has incompatible schema in Spark and
> Hive
> --------------------------------------------------------------------------
>
>                 Key: SPARK-25367
>                 URL: https://issues.apache.org/jira/browse/SPARK-25367
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Shell, SQL
>    Affects Versions: 2.2.0, 2.2.1, 2.2.2, 2.3.0, 2.3.1
>         Environment: spark-2.2.1, hadoop-2.6.0-cdh-5.4.2
>                      hive-1.2.1
>            Reporter: yy
>            Priority: Major
>              Labels: sparksql
>             Fix For: 2.3.2
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org