[jira] [Commented] (HIVE-24920) TRANSLATED_TO_EXTERNAL tables may write to the same location

2022-04-20 Thread tanghui (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17524983#comment-17524983
 ] 

tanghui commented on HIVE-24920:


After the patch is updated, the partition table location and hdfs data 
directory are displayed normally, but the partition location of the table in 
the SDS in the Hive metabase is still displayed as the location of the old 
table, resulting in no data in the query partition.


CREATE TABLE part_test(
c1 string
,c2 string
)PARTITIONED BY (dat string)

insert into part_test values ("11","th","20220101")
insert into part_test values ("22","th","20220102")

alter table part_test rename to part_test11;

--this resulting in no data in the query partition.
select * from part_test11 where dat="20220101";
-

SDS in the Hive metabase:
select SDS.LOCATION from TBLS,SDS where TBLS.TBL_NAME="part_test11" AND 
TBLS.TBL_ID=SDS.CD_ID;

---
|LOCATION|
|hdfs://nameservice1/warehouse/tablespace/external/hive/part_test11|
|hdfs://nameservice1/warehouse/tablespace/external/hive/part_test/dat=20220101|
|hdfs://nameservice1/warehouse/tablespace/external/hive/part_test/dat=20220102|

---

 

We need to modify the partition location of the table in SDS to ensure that the 
query results are normal

 

> TRANSLATED_TO_EXTERNAL tables may write to the same location
> 
>
> Key: HIVE-24920
> URL: https://issues.apache.org/jira/browse/HIVE-24920
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: metastore_translator, pull-request-available
> Fix For: 4.0.0, 4.0.0-alpha-1
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> {code}
> create table t (a integer);
> insert into t values(1);
> alter table t rename to t2;
> create table t (a integer); -- I expected an exception from this command 
> (location already exists) but because its an external table no exception
> insert into t values(2);
> select * from t;  -- shows 1 and 2
> drop table t2;-- wipes out data location
> select * from t;  -- empty resultset
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HIVE-24920) TRANSLATED_TO_EXTERNAL tables may write to the same location

2021-05-21 Thread Thejas Nair (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17348993#comment-17348993
 ] 

Thejas Nair commented on HIVE-24920:


{quote}create table t(i integer);
{quote}
I agree that this should behave like the old managed table behavior 
(irrespective of config). If old managed tables would have thrown error if the 
dir exists, it should do so now as well.

> TRANSLATED_TO_EXTERNAL tables may write to the same location
> 
>
> Key: HIVE-24920
> URL: https://issues.apache.org/jira/browse/HIVE-24920
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {code}
> create table t (a integer);
> insert into t values(1);
> alter table t rename to t2;
> create table t (a integer); -- I expected an exception from this command 
> (location already exists) but because its an external table no exception
> insert into t values(2);
> select * from t;  -- shows 1 and 2
> drop table t2;-- wipes out data location
> select * from t;  -- empty resultset
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-24920) TRANSLATED_TO_EXTERNAL tables may write to the same location

2021-05-19 Thread Zoltan Haindrich (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17347763#comment-17347763
 ] 

Zoltan Haindrich commented on HIVE-24920:
-

[~ngangam], [~thejas]: I've updated the PR - and implemented that 
TRANSLATED_TO_EXTERNAL tables may follow renames

> TRANSLATED_TO_EXTERNAL tables may write to the same location
> 
>
> Key: HIVE-24920
> URL: https://issues.apache.org/jira/browse/HIVE-24920
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {code}
> create table t (a integer);
> insert into t values(1);
> alter table t rename to t2;
> create table t (a integer); -- I expected an exception from this command 
> (location already exists) but because its an external table no exception
> insert into t values(2);
> select * from t;  -- shows 1 and 2
> drop table t2;-- wipes out data location
> select * from t;  -- empty resultset
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-24920) TRANSLATED_TO_EXTERNAL tables may write to the same location

2021-05-19 Thread Zoltan Haindrich (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17347702#comment-17347702
 ] 

Zoltan Haindrich commented on HIVE-24920:
-

if we are about to do that then an existing external table dir might cause some 
trouble:
{code}
create external table t (i integer);
-- this will create dir WH/t
insert into t values (1);
drop table t;
-- this will leave WH/t as is beacuse its a full external table without the 
purge option
create table t(i integer);
-- this will create a table at the same external location; which is now 
occupied...your current proposal doesn't handle this case
select * from t
1
-- shows the inserted record from the previous table instance...
{code}

I don't think we should just accept the above behaviour the user have used a 
statement which should have created a normal managed table (create table t) - 
so it should be empty in any circumstancesif we ant to do the same kind of 
renames for translated table we should still retain the "existing location dir" 
avoidance mechanisms of the existing patch - and set the one which throws an 
exception if it exists the default.

This could probably enable our users to choose the behaviour they would like to 
see.

[~thejas]: what do you think?

> TRANSLATED_TO_EXTERNAL tables may write to the same location
> 
>
> Key: HIVE-24920
> URL: https://issues.apache.org/jira/browse/HIVE-24920
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {code}
> create table t (a integer);
> insert into t values(1);
> alter table t rename to t2;
> create table t (a integer); -- I expected an exception from this command 
> (location already exists) but because its an external table no exception
> insert into t values(2);
> select * from t;  -- shows 1 and 2
> drop table t2;-- wipes out data location
> select * from t;  -- empty resultset
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-24920) TRANSLATED_TO_EXTERNAL tables may write to the same location

2021-05-17 Thread Thejas Nair (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17346279#comment-17346279
 ] 

Thejas Nair commented on HIVE-24920:


For resiliency around issues caused by existing dirs with same name, users 
should specify custom locations for the external tables (or otherwise ensure 
that that default dir doesn’t exist). Other option is to use ACID-Managed 
tables (where you don’t have cases of random dirs)

External tables by definition don’t have data under management of hive. HDFS 
backed one is just one example, other examples include hbase, kudu tables etc.  

 

> TRANSLATED_TO_EXTERNAL tables may write to the same location
> 
>
> Key: HIVE-24920
> URL: https://issues.apache.org/jira/browse/HIVE-24920
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {code}
> create table t (a integer);
> insert into t values(1);
> alter table t rename to t2;
> create table t (a integer); -- I expected an exception from this command 
> (location already exists) but because its an external table no exception
> insert into t values(2);
> select * from t;  -- shows 1 and 2
> drop table t2;-- wipes out data location
> select * from t;  -- empty resultset
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-24920) TRANSLATED_TO_EXTERNAL tables may write to the same location

2021-05-11 Thread Zoltan Haindrich (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17342449#comment-17342449
 ] 

Zoltan Haindrich commented on HIVE-24920:
-

this could also be implemented as an option; but i think this will not solve 
the problem fully: what should happen if you already have a directory with that 
name?
because there is no guarantee that test_renamed doesn't exists

> TRANSLATED_TO_EXTERNAL tables may write to the same location
> 
>
> Key: HIVE-24920
> URL: https://issues.apache.org/jira/browse/HIVE-24920
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {code}
> create table t (a integer);
> insert into t values(1);
> alter table t rename to t2;
> create table t (a integer); -- I expected an exception from this command 
> (location already exists) but because its an external table no exception
> insert into t values(2);
> select * from t;  -- shows 1 and 2
> drop table t2;-- wipes out data location
> select * from t;  -- empty resultset
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-24920) TRANSLATED_TO_EXTERNAL tables may write to the same location

2021-05-10 Thread Naveen Gangam (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17342267#comment-17342267
 ] 

Naveen Gangam commented on HIVE-24920:
--

[~kgyrtkirk] an alternate approach which I think might be cleaner is to have 
AlterTableHandler treat such tables with (“TRANSLATED_TO_EXTERNAL” or 
(type=“EXTERNAL” + “external.table.purge=true”)) same as Managed/ACID tables? 
i.e. we relocate these tables on a rename just like we do for managed tables 
today.
create table test; --> table located in "test" directory
alter table test rename to test_renamed; --> because this table has property 
"TRANSLATED_TO_EXTERNAL" and/or "external.table.purge=true", we relocate this 
table data from dir "test" to dir "test_renamed"
create table test; --> creates a new table with dir "test"  

> TRANSLATED_TO_EXTERNAL tables may write to the same location
> 
>
> Key: HIVE-24920
> URL: https://issues.apache.org/jira/browse/HIVE-24920
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {code}
> create table t (a integer);
> insert into t values(1);
> alter table t rename to t2;
> create table t (a integer); -- I expected an exception from this command 
> (location already exists) but because its an external table no exception
> insert into t values(2);
> select * from t;  -- shows 1 and 2
> drop table t2;-- wipes out data location
> select * from t;  -- empty resultset
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)