[jira] [Commented] (HIVE-24920) TRANSLATED_TO_EXTERNAL tables may write to the same location
[ https://issues.apache.org/jira/browse/HIVE-24920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17524983#comment-17524983 ]

tanghui commented on HIVE-24920:

After applying the updated patch, the location of the partitioned table and the HDFS data directory are displayed correctly after a rename, but the partition locations stored in SDS in the Hive metastore database still point to the old table's location. As a result, querying a partition returns no data.

CREATE TABLE part_test(c1 string, c2 string) PARTITIONED BY (dat string);
insert into part_test values ("11","th","20220101");
insert into part_test values ("22","th","20220102");
alter table part_test rename to part_test11;
-- this returns no data for the partition
select * from part_test11 where dat="20220101";

SDS in the Hive metastore database:

select SDS.LOCATION from TBLS,SDS where TBLS.TBL_NAME="part_test11" AND TBLS.TBL_ID=SDS.CD_ID;

|LOCATION|
|hdfs://nameservice1/warehouse/tablespace/external/hive/part_test11|
|hdfs://nameservice1/warehouse/tablespace/external/hive/part_test/dat=20220101|
|hdfs://nameservice1/warehouse/tablespace/external/hive/part_test/dat=20220102|

The partition locations in SDS also need to be updated on rename so that queries return the correct results.

> TRANSLATED_TO_EXTERNAL tables may write to the same location
>
> Key: HIVE-24920
> URL: https://issues.apache.org/jira/browse/HIVE-24920
> Project: Hive
> Issue Type: Bug
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Priority: Major
> Labels: metastore_translator, pull-request-available
> Fix For: 4.0.0, 4.0.0-alpha-1
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> {code}
> create table t (a integer);
> insert into t values(1);
> alter table t rename to t2;
> create table t (a integer); -- I expected an exception from this command (location already exists) but because it's an external table no exception
> insert into t values(2);
> select * from t; -- shows 1 and 2
> drop table t2; -- wipes out data location
> select * from t; -- empty resultset
> {code}

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
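The stale partition locations reported by tanghui can be illustrated with a small Python sketch. It is not Hive's implementation, and the function name `relocate_partitions` is hypothetical. It only shows the prefix rewrite that would have to happen to the partition locations when the table directory moves from part_test to part_test11, using the paths from the comment.

```python
def relocate_partitions(old_table_location, new_table_location, partition_locations):
    """Swap the old table prefix for the new one in each partition location.

    Hypothetical helper for illustration only. Partitions stored outside
    the old table directory are returned unchanged.
    """
    # A trailing slash keeps "part_test" from matching "part_test11".
    old_prefix = old_table_location.rstrip("/") + "/"
    new_prefix = new_table_location.rstrip("/") + "/"
    relocated = []
    for location in partition_locations:
        if location.startswith(old_prefix):
            relocated.append(new_prefix + location[len(old_prefix):])
        else:
            relocated.append(location)
    return relocated


WAREHOUSE = "hdfs://nameservice1/warehouse/tablespace/external/hive"
stale = [
    WAREHOUSE + "/part_test/dat=20220101",
    WAREHOUSE + "/part_test/dat=20220102",
]
fixed = relocate_partitions(WAREHOUSE + "/part_test", WAREHOUSE + "/part_test11", stale)
```

With these inputs, `fixed` holds the two partition paths under part_test11, which is what the query against the renamed table would need to find.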
[jira] [Commented] (HIVE-24920) TRANSLATED_TO_EXTERNAL tables may write to the same location
[ https://issues.apache.org/jira/browse/HIVE-24920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17348993#comment-17348993 ]

Thejas Nair commented on HIVE-24920:

{quote}create table t(i integer);{quote}
I agree that this should behave the same way as old managed tables did (irrespective of config). If old managed tables would have thrown an error when the directory exists, it should do so now as well.
[jira] [Commented] (HIVE-24920) TRANSLATED_TO_EXTERNAL tables may write to the same location
[ https://issues.apache.org/jira/browse/HIVE-24920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17347763#comment-17347763 ]

Zoltan Haindrich commented on HIVE-24920:

[~ngangam], [~thejas]: I've updated the PR so that TRANSLATED_TO_EXTERNAL tables may follow renames.
[jira] [Commented] (HIVE-24920) TRANSLATED_TO_EXTERNAL tables may write to the same location
[ https://issues.apache.org/jira/browse/HIVE-24920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17347702#comment-17347702 ]

Zoltan Haindrich commented on HIVE-24920:

If we are going to do that, then an existing external table directory might cause some trouble:
{code}
create external table t (i integer); -- this will create dir WH/t
insert into t values (1);
drop table t; -- this will leave WH/t as is because it's a full external table without the purge option
create table t(i integer); -- this will create a table at the same external location, which is now occupied... your current proposal doesn't handle this case
select * from t;
1 -- shows the inserted record from the previous table instance...
{code}
I don't think we should just accept the above behaviour: the user has used a statement which should have created a normal managed table (create table t), so the table should be empty under any circumstances.

If we want to do the same kind of renames for translated tables, we should still retain the "existing location dir" avoidance mechanisms of the existing patch, and make the one which throws an exception if the directory exists the default. This could probably enable our users to choose the behaviour they would like to see.

[~thejas]: what do you think?
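The idea of a selectable policy with "throw if the directory exists" as the default can be sketched in Python as follows. This is only an illustration under stated assumptions: the function `choose_location` and the mode names "prohibit", "suffix" and "reuse" are hypothetical and are not taken from the patch, and a plain set of paths stands in for the file system.

```python
def choose_location(warehouse, table_name, existing_dirs, mode="prohibit"):
    """Pick a directory for a new table, given the directories that already exist.

    Hypothetical modes, for illustration only:
      prohibit: raise if the default directory is occupied (the default)
      suffix:   fall back to the first free name with a numeric suffix
      reuse:    use the occupied directory anyway
    """
    candidate = f"{warehouse}/{table_name}"
    if candidate not in existing_dirs:
        return candidate
    if mode == "prohibit":
        raise FileExistsError(f"location already exists: {candidate}")
    if mode == "suffix":
        n = 1
        while f"{candidate}_{n}" in existing_dirs:
            n += 1
        return f"{candidate}_{n}"
    if mode == "reuse":
        return candidate
    raise ValueError(f"unknown mode: {mode}")


# WH/t was left behind by a dropped external table without the purge option.
leftover = {"WH/t"}
alternative = choose_location("WH", "t", leftover, mode="suffix")
```

With the default mode, the same call raises instead of silently exposing the record left behind by the previous table instance.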
[jira] [Commented] (HIVE-24920) TRANSLATED_TO_EXTERNAL tables may write to the same location
[ https://issues.apache.org/jira/browse/HIVE-24920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17346279#comment-17346279 ]

Thejas Nair commented on HIVE-24920:

For resiliency against issues caused by existing directories with the same name, users should specify custom locations for the external tables (or otherwise ensure that the default directory doesn't exist). The other option is to use ACID managed tables (where you don't have cases of random directories).

External tables by definition don't have their data under the management of Hive. An HDFS-backed table is just one example; other examples include HBase tables, Kudu tables, etc.
[jira] [Commented] (HIVE-24920) TRANSLATED_TO_EXTERNAL tables may write to the same location
[ https://issues.apache.org/jira/browse/HIVE-24920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17342449#comment-17342449 ]

Zoltan Haindrich commented on HIVE-24920:

This could also be implemented as an option, but I think it will not solve the problem fully: what should happen if you already have a directory with that name? There is no guarantee that test_renamed doesn't exist.
[jira] [Commented] (HIVE-24920) TRANSLATED_TO_EXTERNAL tables may write to the same location
[ https://issues.apache.org/jira/browse/HIVE-24920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17342267#comment-17342267 ]

Naveen Gangam commented on HIVE-24920:

[~kgyrtkirk] An alternative approach, which I think might be cleaner, is to have AlterTableHandler treat such tables ("TRANSLATED_TO_EXTERNAL" or (type="EXTERNAL" + "external.table.purge=true")) the same as Managed/ACID tables, i.e. we relocate these tables on a rename just like we do for managed tables today.

create table test; --> table located in "test" directory
alter table test rename to test_renamed; --> because this table has property "TRANSLATED_TO_EXTERNAL" and/or "external.table.purge=true", we relocate this table data from dir "test" to dir "test_renamed"
create table test; --> creates a new table with dir "test"
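The rule proposed above can be written down as a small predicate. The following Python sketch is not the AlterTableHandler code. The function name `follows_rename` is hypothetical, and the table type strings and the case-insensitive "true" comparison are assumptions made for the example.

```python
def follows_rename(table_type, parameters):
    """Return True if, under the proposed rule, a rename also moves the data directory.

    table_type: assumed to be "MANAGED_TABLE" or "EXTERNAL_TABLE".
    parameters: the table properties as a dict of strings.
    """
    def flag(name):
        # Treat a property as set only when its value is "true" in any letter case.
        return str(parameters.get(name, "")).strip().lower() == "true"

    if table_type == "MANAGED_TABLE":
        # Managed tables are already relocated on rename today.
        return True
    if table_type == "EXTERNAL_TABLE":
        # Proposed: translated or purge-enabled external tables behave the same way.
        return flag("TRANSLATED_TO_EXTERNAL") or flag("external.table.purge")
    return False


translated = follows_rename("EXTERNAL_TABLE", {"TRANSLATED_TO_EXTERNAL": "TRUE"})
plain_external = follows_rename("EXTERNAL_TABLE", {})
```

Under this rule a plain external table keeps its directory on rename, while the table created by "create table test" in the example above would move from "test" to "test_renamed".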