zhuwei created HIVE-20725:
-----------------------------

             Summary: Simultaneous dynamic inserts can result in partition 
files lost 
                 Key: HIVE-20725
                 URL: https://issues.apache.org/jira/browse/HIVE-20725
             Project: Hive
          Issue Type: Bug
            Reporter: zhuwei
            Assignee: zhuwei


If two users attempt a dynamic insert into the same new partition at the same 
time, a race condition can leave the partition in an inconsistent state: the 
partition info has been inserted into the metastore, but its data files have 
been removed.

The current logic of the function "add_partition_core" in class 
HiveMetaStore.HMSHandler is as follows:
 # check whether the partition already exists
 # create the partition directory if it does not exist
 # try to add the partition
 # if adding the partition failed and this call created the directory in 
step 2, delete that directory

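Assume that two users are inserting into the same partition at the same time, 
and that two threads, A and B, are handling their requests. Suppose steps 1~4 
of thread B all run between step 2 and step 3 of thread A, so the sequence is: 
A1 A2 B1 B2 B3 B4 A3 A4. A created the directory in step 2, so B finds it 
already present and B's step 3 succeeds. A's step 3 then fails because the 
partition now exists, and A's step 4 deletes the directory that A created. The 
partition files written by B are removed by A.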
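
The interleaving above can be replayed deterministically with a small 
in-memory model. This is only a sketch of the four steps as described in this 
report, not the actual HMSHandler code: the names FakeStore, AddPartitionCall, 
replay_race, the partition name and the data file name are all made up for 
illustration, and the point at which B's data file lands in the directory is 
an assumption.

```python
PART = "ds=2018-10-11"  # hypothetical partition name


class FakeStore:
    """In-memory stand-in for the metastore plus the file system (hypothetical)."""

    def __init__(self):
        self.partitions = set()  # partitions registered in the metastore
        self.dirs = {}           # partition directory -> set of data file names


class AddPartitionCall:
    """One caller walking through the four steps listed in the report."""

    def __init__(self, store, part):
        self.store = store
        self.part = part
        self.made_dir = False  # did this call create the directory in step 2?
        self.added = False     # did step 3 succeed for this call?

    def step1(self):
        # Step 1: check whether the partition already exists.
        if self.part in self.store.partitions:
            raise RuntimeError("partition already exists")

    def step2(self):
        # Step 2: create the partition directory if it does not exist.
        if self.part not in self.store.dirs:
            self.store.dirs[self.part] = set()
            self.made_dir = True

    def step3(self):
        # Step 3: try to add the partition; fails if someone else added it first.
        if self.part not in self.store.partitions:
            self.store.partitions.add(self.part)
            self.added = True

    def step4(self):
        # Step 4: on failure, delete the directory if this call created it.
        if not self.added and self.made_dir:
            self.store.dirs.pop(self.part, None)


def replay_race():
    """Replay the sequence A1 A2 B1 B2 B3 B4 A3 A4 on a fresh store."""
    store = FakeStore()
    a = AddPartitionCall(store, PART)
    b = AddPartitionCall(store, PART)
    a.step1()
    a.step2()
    b.step1()
    b.step2()
    b.step3()
    b.step4()
    # Assumption: B's data file is in the partition directory by this point.
    store.dirs[PART].add("000000_0")
    a.step3()
    a.step4()
    return store


if __name__ == "__main__":
    result = replay_race()
    print("partition in metastore:", PART in result.partitions)
    print("partition directory exists:", PART in result.dirs)
```

In this model the partition stays registered while its directory, including 
the file written for B, is gone, which matches the error state described above.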

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
