[ https://issues.apache.org/jira/browse/HIVE-22062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gabor Kaszab updated HIVE-22062:
--------------------------------
    Description:
Changing the schema (e.g. adding a new column) of a non-partitioned ACID table results in the table-level writeId being incremented. This is as expected.

However, if you do the same on a partitioned ACID table, neither the table-level nor the partition-level writeIds are updated. I would expect the table-level writeId to be incremented in this case to reflect that the table has changed.

Note that get_valid_write_ids() shows that the high watermark is incremented even though the writeId isn't.

Update: I'd extend the scope of this Jira a bit further. There are a number of use cases in Hive that don't result in a writeId change on ACID tables, and as a result other systems (like Impala) have no way to judge whether a refresh should be run on a table. The only option is to reload all the data for the table every time, which is expensive. E.g., in addition to the use case above, compaction is something that is not noticeable from outside Hive.

  was:
Changing the schema (e.g. adding a new column) of a non-partitioned ACID table results in the table-level writeId being incremented. This is as expected.

However, if you do the same on a partitioned ACID table, neither the table-level nor the partition-level writeIds are updated. I would expect the table-level writeId to be incremented in this case to reflect that the table has changed.

Note that get_valid_write_ids() shows that the high watermark is incremented even though the writeId isn't.

> WriteId is not updated for a partitioned ACID table when schema changes
> -----------------------------------------------------------------------
>
>                 Key: HIVE-22062
>                 URL: https://issues.apache.org/jira/browse/HIVE-22062
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Gabor Kaszab
>            Assignee: Laszlo Kovari
>            Priority: Major
>              Labels: ACID
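
To make the refresh problem from the update concrete, here is a minimal sketch of the kind of check an external system might run. It is only an illustration under stated assumptions: `TableSnapshot` and `needs_refresh` are hypothetical names, and this is not how Impala or any other system actually decides. The sketch assumes the external cache remembers the last table-level writeId it saw and compares it with the current one.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class TableSnapshot:
    """Hypothetical record of what an external cache last saw for a table."""
    write_id: int
    columns: Tuple[str, ...]


def needs_refresh(cached: Optional[TableSnapshot], current_write_id: int) -> bool:
    """Refresh only if nothing is cached or the table-level writeId has moved."""
    return cached is None or current_write_id != cached.write_id


cached = TableSnapshot(write_id=5, columns=("id", "name"))

# Non-partitioned ACID table: adding a column bumps the table-level writeId,
# so the change is detected.
print(needs_refresh(cached, current_write_id=6))  # True

# Partitioned ACID table as described in this report: the schema changed but
# the writeId did not, so the check reports no change and the cache goes stale.
print(needs_refresh(cached, current_write_id=5))  # False
```

With a check like this, any change that leaves the writeId untouched (the schema change above, or compaction) is invisible, which leaves a full reload every time as the only safe option.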