[ https://issues.apache.org/jira/browse/HIVE-16559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Barna Zsombor Klara updated HIVE-16559: --------------------------------------- Status: Patch Available (was: Open) > Parquet schema evolution for partitioned tables may break if table and > partition serdes differ > ---------------------------------------------------------------------------------------------- > > Key: HIVE-16559 > URL: https://issues.apache.org/jira/browse/HIVE-16559 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers > Reporter: Barna Zsombor Klara > Assignee: Barna Zsombor Klara > Fix For: 3.0.0 > > Attachments: HIVE-16559.01.patch, HIVE-16559.02.patch, > HIVE-16559.03.patch, HIVE-16559.04.patch > > > Parquet schema evolution should make it possible to have partitions/tables > backed by files with different schemas. Hive should match the table columns > with file columns based on the column name if possible. > However if the serde for a table is missing columns from the serde of a > partition Hive fails to match the columns together. > Steps to reproduce: > {code} > CREATE TABLE myparquettable_parted > ( > name string, > favnumber int, > favcolor string, > age int, > favpet string > ) > PARTITIONED BY (day string) > STORED AS PARQUET; > INSERT OVERWRITE TABLE myparquettable_parted > PARTITION(day='2017-04-04') > SELECT > 'mary' as name, > 5 AS favnumber, > 'blue' AS favcolor, > 35 AS age, > 'dog' AS favpet; > alter table myparquettable_parted > REPLACE COLUMNS > ( > favnumber int, > age int > ); <!--- No cascade option, so the partition will not be altered. > {code} > {{SELECT * FROM myparquettable_parted where day='2017-04-04';}} > will fail with: > {{java.lang.UnsupportedOperationException: Cannot inspect > org.apache.hadoop.io.IntWritable}} > Hive should either match the columns together or prevent the user from > dropping columns from the table. -- This message was sent by Atlassian JIRA (v6.4.14#64029)