[ https://issues.apache.org/jira/browse/HIVE-20968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sankar Hariappan updated HIVE-20968:
------------------------------------
    Status: Open  (was: Patch Available)

> Support conversion of managed to external where location set was not owned by hive
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-20968
>                 URL: https://issues.apache.org/jira/browse/HIVE-20968
>             Project: Hive
>          Issue Type: Sub-task
>          Components: repl
>    Affects Versions: 4.0.0
>            Reporter: mahesh kumar behera
>            Assignee: Sankar Hariappan
>            Priority: Major
>              Labels: DR, pull-request-available
>         Attachments: HIVE-20968.01.patch, HIVE-20968.02.patch
>
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> As per the migration rule, if a table location is outside the default managed
> table directory and is not owned by the "hive" user, the table should be
> converted to an external table after upgrade.
> The same rule applies to Hive replication when the data of a source managed
> table resides outside the default warehouse directory and is not owned by the
> "hive" user.
> During this conversion, the path should be preserved on the target as well so
> that failover works seamlessly.
> # If the table location is outside the Hive warehouse and is not owned by hive,
> the table at the target will be converted to an external table. The location
> cannot be retained as-is; it will be retained relative to the Hive external
> warehouse directory.
> # As the table is not an external table at the source, only data added
> through events will be replicated.
> # The ownership of the location will be stored in the create table event and
> compared with strict.managed.tables.migration.owner to decide whether the flag
> in the replication scope can be set. This flag is used to convert the managed
> table to an external table at the target.
> Some scenarios need to be blocked if the database is set for replication from
> a cluster with the non-strict managed table setting to one with the strict
> managed table setting.
> 1. Block alter table / partition set location for managed tables in a database
> that is set as a source of replication.
> 2. If a user manually changes the ownership of the location, Hive replication
> may go into a non-recoverable state.
> 3. Block add partition for managed tables if the ownership of the partition
> location differs from that of the table location.
> 4. The user needs to set strict.managed.tables.migration.owner along with the
> dump command (defaults to the hive user). This value will be used during dump
> to decide the ownership, which will be used during load to decide the table
> type. The location owner information can be stored in the events during create
> table. The flag can be stored in the replication spec. Check other such configs
> used in the upgrade tool.
> 5. Block conversion from managed to external and vice versa. Pass some flag in
> the upgrade flow to allow this conversion during upgrade.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
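
The conversion rule described in this issue can be sketched as a small decision function. This is only an illustration under stated assumptions, not the code in the attached patches. The function names, the table type strings, the example warehouse paths, and the way the target path is derived (appending the source path under the external warehouse root) are all hypothetical readings of the description. Only the config name strict.managed.tables.migration.owner and its default of "hive" come from the text above.

```python
from pathlib import PurePosixPath


def is_under(path, root):
    """Return True if `path` is `root` itself or lies below it."""
    root_parts = PurePosixPath(root).parts
    return PurePosixPath(path).parts[:len(root_parts)] == root_parts


def target_table_type(table_type, location, location_owner,
                      managed_warehouse_root, migration_owner="hive"):
    """Decide the table type at the replication target (hypothetical helper).

    `migration_owner` stands in for strict.managed.tables.migration.owner.
    A managed table is converted only when its location is outside the
    managed warehouse root AND is not owned by the migration owner.
    """
    if table_type != "MANAGED_TABLE":
        return table_type
    outside = not is_under(location, managed_warehouse_root)
    if outside and location_owner != migration_owner:
        return "EXTERNAL_TABLE"
    return "MANAGED_TABLE"


def target_location(source_location, external_warehouse_root):
    """One possible reading of 'retained relative to the external warehouse':
    re-root the source path under the external warehouse directory."""
    return str(PurePosixPath(external_warehouse_root)
               / source_location.lstrip("/"))


if __name__ == "__main__":
    # Paths and owner names below are made up for illustration.
    print(target_table_type("MANAGED_TABLE", "/data/sales", "etl",
                            "/warehouse/managed"))
    print(target_location("/data/sales", "/warehouse/external"))
```

Keeping the ownership check separate from the path rewrite mirrors the description, where ownership is recorded at dump time (in the create table event) and the table type and location are decided at load time.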