[ 
https://issues.apache.org/jira/browse/HIVE-17856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213453#comment-16213453
 ] 

Sergey Shelukhin commented on HIVE-17856:
-----------------------------------------

Ok I noticed this while looking at the union bug. I think the reason IOW 
"worked for me" in some of the tests and works for [~steveyeom2017] when he 
runs the above test is that it's still using old MM logic that is not ACID 
compliant.
This call in Hive.java:
{noformat}
deleteOldPathForReplace(newPartPath, oldPartPath, getConf(), isAutoPurge,
              new JavaUtils.IdPathFilter(txnId, stmtId, false, true), true,
              tbl.isStoredAsSubDirectories() ? tbl.getSkewedColNames().size() : 0);
{noformat}
This deletes old delta directories by means of IdPathFilter with the 3rd arg 
(isMatch) being false, which means: return ALL delta directories that don't 
match txnId.
So, all the other data in the table gets nuked.
This implementation is incorrect for ACID integration.
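
To illustrate the filter semantics, here is a minimal sketch. It is not the real JavaUtils.IdPathFilter: the class name, the accept/select helpers, and the 7-digit delta_<txnId>_<txnId> naming are assumptions made for the example, and the stmtId and 4th constructor argument are ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IdPathFilterSketch {
    // Hypothetical stand-in for the filter: with isMatch == true it accepts only
    // directories written by txnId; with isMatch == false it accepts everything else.
    static boolean accept(String dirName, long txnId, boolean isMatch) {
        // Assumed naming convention, for illustration only.
        String prefix = String.format("delta_%07d_%07d", txnId, txnId);
        boolean matches = dirName.startsWith(prefix);
        return isMatch ? matches : !matches;
    }

    // Returns the directories the filter would hand to the delete call.
    static List<String> select(List<String> dirs, long txnId, boolean isMatch) {
        List<String> out = new ArrayList<>();
        for (String d : dirs) {
            if (accept(d, txnId, isMatch)) {
                out.add(d);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList(
            "delta_0000003_0000003_0000",
            "delta_0000004_0000004_0000",
            "delta_0000005_0000005_0000");
        // The current txn is 5; isMatch == false selects the deltas of txns 3 and 4.
        System.out.println(select(dirs, 5L, false));
    }
}
```

In this sketch, every delta written by another transaction is selected for deletion, regardless of whether the current statement is an overwrite, which is the behavior described above.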

So, the delete codepath for MM tables needs to be removed (easy to find by 
looking at where IdPathFilter is used with isMatch == false, meaning "find 
every txn except this one").
Then, IOW will probably stop working w.r.t. "overwrite" because old deltas will 
stick around.
After that Eugene can comment on where and how ACID uses base directories to 
implement IOW.
I suspect that in the IOW case, it will be as simple as creating a base_.... 
directory for output instead of a delta_.... dir; and that would be enough 
for all the logic shared with ACID, e.g. compactor, to handle it correctly. 
The code that finds what to read in HiveInputFormat would also need to be 
updated to take committed bases into account. But I am not familiar with 
ACID IOW, so it may not be as simple.
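
As a rough sketch of what "taking bases into account" could look like on the read side: pick the newest base_ directory and read only it plus the deltas written after it. This is an assumption-laden illustration, not the HiveInputFormat or ACID code; the class and method names are hypothetical, directory names are assumed to look like base_<id> and delta_<id>_<id>_<stmt>, and commit/validity checks are left out entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BaseAwareReadSketch {
    // Parses the first numeric id from names such as "base_0000005"
    // or "delta_0000006_0000006_0000" (assumed naming, for illustration).
    static long firstId(String dirName) {
        return Long.parseLong(dirName.split("_")[1]);
    }

    // Returns the newest base (if any) followed by the deltas newer than it.
    static List<String> dirsToRead(List<String> dirs) {
        String latestBase = null;
        long baseId = -1;
        for (String d : dirs) {
            if (d.startsWith("base_") && firstId(d) > baseId) {
                baseId = firstId(d);
                latestBase = d;
            }
        }
        List<String> out = new ArrayList<>();
        if (latestBase != null) {
            out.add(latestBase);
        }
        for (String d : dirs) {
            // Deltas at or below the base id are superseded by the overwrite.
            if (d.startsWith("delta_") && firstId(d) > baseId) {
                out.add(d);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList(
            "delta_0000003_0000003_0000",  // insert into, before the overwrite
            "base_0000005",                // hypothetical IOW output
            "delta_0000006_0000006_0000"); // insert into, after the overwrite
        System.out.println(dirsToRead(dirs));
    }
}
```

With this approach, old deltas can stay on disk after an overwrite and simply stop being read, so nothing has to be deleted at write time.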

cc [~hagleitn] [~steveyeom2017] [~ekoifman]



> MM tables - IOW is not ACID compliant
> -------------------------------------
>
>                 Key: HIVE-17856
>                 URL: https://issues.apache.org/jira/browse/HIVE-17856
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Transactions
>            Reporter: Sergey Shelukhin
>            Assignee: Steve Yeom
>              Labels: mm-gap-1
>
> The following tests were removed from mm_all during "integration"... I should 
> never have allowed this manner of integration.
> MM logic should have been kept intact until ACID logic could catch up. Alas, 
> here we are.
> {noformat}
> drop table iow0_mm;
> create table iow0_mm(key int) tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow0_mm select key from intermediate;
> insert into table iow0_mm select key + 1 from intermediate;
> select * from iow0_mm order by key;
> insert overwrite table iow0_mm select key + 2 from intermediate;
> select * from iow0_mm order by key;
> drop table iow0_mm;
> drop table iow1_mm; 
> create table iow1_mm(key int) partitioned by (key2 int)  
> tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow1_mm partition (key2)
> select key as k1, key from intermediate union all select key as k1, key from 
> intermediate;
> insert into table iow1_mm partition (key2)
> select key + 1 as k1, key from intermediate union all select key as k1, key 
> from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key from intermediate union all select key + 4 as k1, 
> key from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key + 3 from intermediate union all select key + 2 as 
> k1, key + 2 from intermediate;
> select * from iow1_mm order by key, key2;
> drop table iow1_mm;
> {noformat}
> {noformat}
> drop table simple_mm;
> create table simple_mm(key int) stored as orc tblproperties 
> ("transactional"="true", "transactional_properties"="insert_only");
> insert into table simple_mm select key from intermediate;
> -insert overwrite table simple_mm select key from intermediate;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
