[ 
https://issues.apache.org/jira/browse/HIVE-28772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harshal Patel updated HIVE-28772:
---------------------------------
    Status: Patch Available  (was: In Progress)

> Clear REPL_TXN_MAP table on DR when deleting replication policy
> ---------------------------------------------------------------
>
>                 Key: HIVE-28772
>                 URL: https://issues.apache.org/jira/browse/HIVE-28772
>             Project: Hive
>          Issue Type: Bug
>          Components: repl
>            Reporter: Harshal Patel
>            Assignee: Harshal Patel
>            Priority: Major
>              Labels: Replication, pull-request-available
>
> Currently, when scheduled queries are created for repl dump and repl load, 
> a transaction can, by design, span multiple incremental replication runs.
> When an OPEN_TXN event is replicated, the DR side adds an entry to the 
> REPL_TXN_MAP table to keep track of open transactions for that database.
> If the user deletes the policy and drops the database on the DR to 
> re-bootstrap before the matching COMMIT_TXN/ABORT_TXN is replayed, the 
> REPL_TXN_MAP table is not cleaned up.
> This can lead to two major issues:
>  # It can bring down the AcidHouseKeeper cleaner services, which clean up 
> transaction-related tables in the HMS. The service reads the database name 
> from REPL_TXN_MAP, tries to fetch that database's information, and fails 
> with a "database doesn't exist" error. With cleanup stalled, the 
> transaction-related tables in the HMS can bloat.
>  # If the user creates a new policy before AcidHouseKeeper kicks in, the 
> newly created database will be marked as repl incompatible by 
> AcidHouseKeeper when it cleans up the TXNS and REPL_TXN_MAP tables after 11 
> days.
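
The failure mode described above can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions: it is not Hive code, and every name in it (ToyMetastore, replay_open_txn, drop_database, orphaned_entries, the "sales" database, the transaction IDs) is hypothetical. A plain dictionary stands in for REPL_TXN_MAP. It shows that dropping the database without clearing the map leaves an entry pointing at a database that no longer exists, which is the state a housekeeping task would then trip over.

```python
# Toy in-memory model of the scenario; NOT Hive code.
# All class, method, and database names here are hypothetical.
from dataclasses import dataclass, field


@dataclass
class ToyMetastore:
    databases: set = field(default_factory=set)
    # (db_name, source_txn_id) -> target_txn_id; stands in for REPL_TXN_MAP
    repl_txn_map: dict = field(default_factory=dict)

    def replay_open_txn(self, db, src_txn, target_txn):
        # Replaying OPEN_TXN on the DR side records the open transaction.
        self.repl_txn_map[(db, src_txn)] = target_txn

    def drop_database(self, db, clear_repl_txn_map):
        # Dropping the database optionally removes its mapping entries.
        self.databases.discard(db)
        if clear_repl_txn_map:
            self.repl_txn_map = {
                key: value
                for key, value in self.repl_txn_map.items()
                if key[0] != db
            }

    def orphaned_entries(self):
        # Entries whose database no longer exists.
        return sorted(k for k in self.repl_txn_map if k[0] not in self.databases)


def demo(clear):
    ms = ToyMetastore(databases={"sales"})
    ms.replay_open_txn("sales", 101, 7)  # OPEN_TXN replicated, no COMMIT yet
    ms.drop_database("sales", clear_repl_txn_map=clear)
    return ms.orphaned_entries()
```

In this model, demo(False) leaves one orphaned entry for the dropped database, while demo(True) leaves none. The second case corresponds to the cleanup this issue asks for.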



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
