[ 
https://issues.apache.org/jira/browse/HIVE-21773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-21773:
---------------------------------------
    Fix Version/s:     (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and, if the fix is already committed to a specific version, please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Supporting external table replication with partition filter.
> ------------------------------------------------------------
>
>                 Key: HIVE-21773
>                 URL: https://issues.apache.org/jira/browse/HIVE-21773
>             Project: Hive
>          Issue Type: Sub-task
>          Components: HiveServer2, repl
>    Affects Versions: 4.0.0
>            Reporter: mahesh kumar behera
>            Assignee: mahesh kumar behera
>            Priority: Major
>
> Hive replicates external tables differently from managed tables. For an 
> external table, a list is created of the table and partition locations to be 
> replicated. If a partition location is within the table location, it is not 
> added to the list; a partition location outside the table location is added. 
> During an incremental dump, data-related events are ignored and only 
> metadata-related events are dumped. The list of locations is prepared and 
> used for replication. During load, the events are replayed and then distcp 
> tasks are created, one for each location in the list.
> For partition-level replication, not all partitions will be present in the 
> dump. So even if the partition locations are within the table location, each 
> partition location will be added to the list.
>  * If a where condition is present in the REPL DUMP command, add the location 
> of each satisfying partition even if the partition location is within the 
> table location.
>  * If the table is not mentioned in the where clause, follow the older 
> behavior.
>  * If the table is mentioned with a key but the key does not match any of the 
> partition columns, fail the REPL DUMP.
>  * If the table is mentioned with a key, add the location of each partition 
> even if all the partitions satisfy the filter condition. This avoids 
> copying partitions that are added using ALTER after the dump.
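
The rules in the description above can be sketched roughly as follows. This is 
a minimal illustration in Python, not Hive's actual implementation: the names 
Table, Partition, is_within and locations_to_copy are hypothetical, the filter 
is modelled as a (key, predicate) pair rather than a parsed where clause, and 
leaving the table location out of the list in the filtered case is an 
assumption, since the description only says that partition locations are added.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class Partition:
    spec: dict      # partition column -> value, e.g. {"year": "2019"}
    location: str


@dataclass
class Table:
    name: str
    location: str
    partition_keys: list
    partitions: list = field(default_factory=list)


def is_within(child: str, parent: str) -> bool:
    """Return True if `child` is `parent` itself or lies underneath it."""
    c, p = PurePosixPath(child), PurePosixPath(parent)
    return c == p or p in c.parents


def locations_to_copy(table: Table, part_filter=None) -> list:
    """Build the list of locations to copy for one external table.

    part_filter is None when the table is not mentioned in the where
    clause, otherwise a hypothetical (key, predicate) pair.
    """
    if part_filter is None:
        # Older behavior: the table location covers every partition inside
        # it, so only partitions stored elsewhere are listed separately.
        locations = [table.location]
        for part in table.partitions:
            if not is_within(part.location, table.location):
                locations.append(part.location)
        return locations

    key, predicate = part_filter
    if key not in table.partition_keys:
        # The key matches no partition column: fail the dump.
        raise ValueError(f"{key!r} is not a partition column of {table.name}")

    # Filtered case: list each satisfying partition explicitly, even when it
    # lives inside the table location and even when every partition matches.
    # Assumption: the table location itself is not listed here.
    return [p.location for p in table.partitions if predicate(p.spec[key])]


sales = Table(
    name="sales",
    location="/ext/sales",
    partition_keys=["year"],
    partitions=[
        Partition({"year": "2018"}, "/ext/sales/year=2018"),
        Partition({"year": "2019"}, "/ext/sales/year=2019"),
        Partition({"year": "2020"}, "/archive/sales_2020"),
    ],
)

print(locations_to_copy(sales))
print(locations_to_copy(sales, ("year", lambda v: v >= "2019")))
```

Listing partitions one by one in the filtered case means a partition added to 
the table directory after the dump is not copied as a side effect of copying 
the whole table location, which is the concern raised in the last bullet.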



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
