-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65716/
-----------------------------------------------------------

(Updated March 26, 2018, 10:10 a.m.)


Review request for hive, Alexander Kolbasov, Peter Vary, and Adam Szita.


Changes
-------

Rebase patch


Bugs: HIVE-18696
    https://issues.apache.org/jira/browse/HIVE-18696


Repository: hive-git


Description
-------

The idea behind the patch is

1) Separate the partition validation from starting the tasks which create the 
partition folders. 
Instead of doing the checks on the partitions and submit the tasks in one loop, 
separated the validation into a different loop. So first iterate through the 
partitions, validate the table/db names, and check for duplicates. Then if all 
partitions were correct, in the second loop submit the tasks to create the 
partition folders. This way if one of the partitions is incorrect, the 
exception will be thrown in the first loop, before the tasks are submitted. So 
we can be sure that no partition folder will be created if the list contains an 
invalid partition.

2) Handle the exceptions which occur during the execution of the tasks 
differently.
Previously if an exception occured in one task, the remaining tasks were 
canceled, and the newly created partition folders were cleaned up in the 
finally part. The problem was that it could happen that some tasks were still 
not finished with the folder creation when cleaning up the others, so there 
could have been leftover folders. After doing some testing it turned out that 
this use case cannot be avoided completely when canceling the tasks.
The idea of this patch is to set a flag if an exception is thrown in one of the 
tasks. This flag is visible in the tasks and if its value is true, the 
partition folders won't be created. Then iterate through the remaining tasks 
and wait for them to finish. The tasks which are started before the flag got 
set will then finish creating the partition folders. The tasks which are 
started after the flag got set, won't create the partition folders, to avoid 
unnecessary work. This way it is sure that all tasks are finished, when 
entering the finally part where the partition folders are cleaned up.


Diffs (updated)
-----

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 519e8fe 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
 4d9cb1b 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
 1122057 


Diff: https://reviews.apache.org/r/65716/diff/4/

Changes: https://reviews.apache.org/r/65716/diff/3-4/


Testing
-------

Added some new tests cases to the TestAddPartitions and 
TestAddPartitionsFromPartSpec tests.


Thanks,

Marta Kuczora

Reply via email to