Sankar Hariappan created HIVE-21269:
---------------------------------------

             Summary: Hive replication should mandate -update and -delete as 
DistCp options to avoid data inconsistency.
                 Key: HIVE-21269
                 URL: https://issues.apache.org/jira/browse/HIVE-21269
             Project: Hive
          Issue Type: Bug
          Components: repl
    Affects Versions: 4.0.0
            Reporter: Sankar Hariappan
            Assignee: Sankar Hariappan


Currently, external table replication copies data at the directory level. So, 
if the target directory already exists, DistCp should compare the source and 
target and update or skip data files in that directory instead of creating a 
new directory inside the pre-existing target directory.
This can be achieved using -update.
Also, the -delete option is needed to delete files that are present in the 
target directory but missing from the source directory.
Hive should mandate these DistCp options even if the user passes other options.
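
As a rough illustration of the proposed behaviour (not the actual Hive 
change), the sketch below merges user-supplied DistCp options with the two 
mandatory ones. The function names, the sample paths, and the use of Python 
rather than Hive's Java code are all assumptions made for the example. Only 
-update and -delete come from this issue.

```python
from typing import Iterable, List

# Options this issue proposes Hive should always pass to DistCp.
MANDATORY_DISTCP_OPTIONS = ("-update", "-delete")


def with_mandatory_options(user_options: Iterable[str]) -> List[str]:
    """Return the user's options plus any mandatory option that is missing.

    Hypothetical helper: user options keep their order, and each mandatory
    option is appended only if the user did not already supply it.
    """
    merged = list(user_options)
    for option in MANDATORY_DISTCP_OPTIONS:
        if option not in merged:
            merged.append(option)
    return merged


def build_distcp_command(user_options: Iterable[str],
                         source: str, target: str) -> List[str]:
    """Assemble an argument list for a 'hadoop distcp' invocation."""
    return ["hadoop", "distcp", *with_mandatory_options(user_options),
            source, target]


if __name__ == "__main__":
    # Sample (made-up) paths; the user passed only a mapper count.
    print(" ".join(build_distcp_command(
        ["-m", "20"], "hdfs://src/warehouse/ext_t1",
        "hdfs://dst/warehouse/ext_t1")))
```

Appending the missing options rather than rejecting the user's list keeps 
other settings, such as the mapper count, intact while still guaranteeing 
that the target directory is synchronised in place.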



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
