Viraj Jasani created PHOENIX-7513:
-------------------------------------
Summary: Clean-up CDC partition metadata for closed partitions
Key: PHOENIX-7513
URL: https://issues.apache.org/jira/browse/PHOENIX-7513
Project: Phoenix
Issue Type: Sub-task
Reporter: Viraj Jasani
Phoenix CDC partitions fall into two categories:
# Open partitions: A partition whose corresponding data table region is
currently active is considered an open partition. The data table region can
continue to serve read/write requests until it is split into two daughter
regions or is merged with other parent regions into one region.
# Closed partitions: A partition whose corresponding data table region is no
longer alive, and is either ready to be archived or already archived after
being split or merged into new region(s), is considered a closed partition. The
data table region is no longer live and hence cannot serve any more
read/write requests.
Once parent region(s) are split or merged into child region(s), metadata for the
closed partitions should stay in SYSTEM.CDC_STREAM for at least a predetermined
stream metadata TTL (say, 24 hours by default). After this duration, the
records should be cleaned up.
The cleanup can be performed in either of two ways:
Either use a background Task that cleans up partitions that have been closed,
i.e. the rows with a non-null PARTITION_END_TIME and a PHOENIX_ROW_TIMESTAMP()
value less than current time - TTL (24 hr).
Or use Conditional TTL with a condition like:
{code:java}
TTL_EXPRESSION = CASE WHEN PHOENIX_ROW_TIMESTAMP() < (CURRENT_TIME() - 24 hr)
AND PARTITION_END_TIME IS NOT NULL THEN 0 ELSE <FOREVER> END{code}
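Both options apply the same eligibility rule to a SYSTEM.CDC_STREAM row. A minimal sketch of that rule in plain Java follows, independent of any Phoenix API. The class name, method name, and STREAM_METADATA_TTL constant are hypothetical and used for illustration only. The 24 hour value is the example default mentioned above, and treating a null PARTITION_END_TIME as "partition still open" is an assumption drawn from the not-null check in the description.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper for illustration; not part of Phoenix.
public class ClosedPartitionCleanup {

    // Assumed default, taken from the "24 hr" figure in the description.
    public static final Duration STREAM_METADATA_TTL = Duration.ofHours(24);

    /**
     * Returns true when a partition metadata row may be cleaned up.
     *
     * @param partitionEndTime value of PARTITION_END_TIME, or null if not set
     *                         (assumed to mean the partition is still open)
     * @param rowTimestamp     value of PHOENIX_ROW_TIMESTAMP() for the row
     * @param now              current time
     * @param ttl              stream metadata TTL
     */
    public static boolean isEligibleForCleanup(Instant partitionEndTime,
                                               Instant rowTimestamp,
                                               Instant now,
                                               Duration ttl) {
        if (partitionEndTime == null) {
            // No end time recorded: never clean up.
            return false;
        }
        // Closed partition: eligible only once the row is strictly older than the TTL,
        // mirroring PHOENIX_ROW_TIMESTAMP() < (current time - TTL).
        return rowTimestamp.isBefore(now.minus(ttl));
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2025-01-10T00:00:00Z");
        Instant oldRow = now.minus(Duration.ofHours(30));
        Instant recentRow = now.minus(Duration.ofHours(2));

        // Open partition with an old row
        System.out.println(isEligibleForCleanup(null, oldRow, now, STREAM_METADATA_TTL));
        // Closed partition, row still within the TTL window
        System.out.println(isEligibleForCleanup(recentRow, recentRow, now, STREAM_METADATA_TTL));
        // Closed partition, row older than the TTL
        System.out.println(isEligibleForCleanup(oldRow, oldRow, now, STREAM_METADATA_TTL));
    }
}
```

With the background Task option, a check of this shape would run periodically against the table. With Conditional TTL, the same condition would instead be evaluated by the server as part of the table's TTL expression.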