[ 
https://issues.apache.org/jira/browse/PHOENIX-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866725#comment-15866725
 ] 

Poorna Chandra commented on PHOENIX-2993:
-----------------------------------------

[~jamestaylor] In Tephra 0.11.0-incubating, pruning of the invalid list is 
disabled by default because the feature has not been tested extensively yet. 
Once we have done more testing, we can enable it by default starting with the 
0.12.0-incubating release.

It would be great for Phoenix to start using it now and give us early 
feedback. To enable pruning in 0.11.0, set the configuration 
{{data.tx.prune.enable}} to {{true}}. After you restart the transaction service 
and the HBase region servers with the new configuration, the invalid list will 
be pruned automatically as major compactions run.

Also note that Tephra will create an HBase table called {{tephra.state}} in the 
default namespace when pruning is enabled. The name of this table can be 
changed with the configuration parameter {{data.tx.prune.state.table}}.
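
As a rough illustration of the two settings above, here is a minimal Python sketch that renders them as Hadoop-style {{<property>}} entries. It assumes the settings are supplied through a site XML file (for example {{hbase-site.xml}}) read by both the transaction service and the region servers. The helper name {{site_properties}} is hypothetical and not part of Tephra. The second property is optional and is shown here with its default value.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Settings named in the comment above. The table-name entry is optional and
# is shown with its default value; change it only to rename the state table.
PRUNING_SETTINGS = {
    "data.tx.prune.enable": "true",
    "data.tx.prune.state.table": "tephra.state",
}


def site_properties(settings):
    """Render settings as a Hadoop-style <configuration> XML document.

    Hypothetical helper for illustration only; it is not part of Tephra.
    """
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # Pretty-print so the result can be pasted into a site file.
    return minidom.parseString(ET.tostring(root)).toprettyxml(indent="  ")


if __name__ == "__main__":
    print(site_properties(PRUNING_SETTINGS))
```

The generated {{<property>}} elements would be merged into the existing site file rather than replacing it, and the restart described above is still required for them to take effect.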

> Tephra: Prune invalid transaction set once all data for a given invalid 
> transaction has been dropped
> ----------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2993
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2993
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: Poorna Chandra
>            Assignee: Poorna Chandra
>         Attachments: ApacheTephraAutomaticInvalidListPruning.pdf
>
>
> From TEPHRA-35 -
> In addition to dropping the data from invalid transactions we need to be able 
> to prune the invalid set of any transactions where data cleanup has been 
> completely performed. Without this, the invalid set will grow indefinitely 
> and become a greater and greater cost to in-progress transactions over time.
> To do this correctly, the TransactionDataJanitor coprocessor will need to 
> maintain some bookkeeping for the transaction data that it removes, so that 
> the transaction manager can reason about when all of a given transaction's 
> data has been removed. Only at this point can the transaction manager safely 
> drop the transaction ID from the invalid set.
> One approach would be for the TransactionDataJanitor to update a table 
> marking when a major compaction was performed on a region and what 
> transaction IDs were filtered out. Once all regions in a table containing the 
> transaction data have been compacted, we can remove the filtered out 
> transaction IDs from the invalid set. However, this will need to cope with 
> changing region names due to splits, etc.
> Note: This will be moved to Tephra JIRA once the setup of Tephra JIRA is 
> complete (INFRA-11445)
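
The bookkeeping approach in the description above can be sketched roughly as follows. This is a simplified model, not the actual Tephra implementation. The function name, the in-memory {{compaction_log}} dictionary standing in for the state table, and the assumption that each region records the set of invalid transaction IDs its latest major compaction filtered against are all hypothetical. Region splits, which the description flags as a complication, are not handled.

```python
def prunable_invalid_ids(invalid_set, regions, compaction_log):
    """Return the invalid transaction IDs that look safe to drop.

    invalid_set    -- current invalid transaction IDs
    regions        -- names of all regions that may hold transactional data
    compaction_log -- hypothetical stand-in for the state table: maps a region
                      name to the invalid IDs filtered by its latest major
                      compaction
    """
    safe = set(invalid_set)
    for region in regions:
        if region not in compaction_log:
            # This region has not recorded a major compaction yet, so it may
            # still hold data from any invalid transaction.
            return set()
        # An ID stays safe only if every region has filtered it out.
        safe &= compaction_log[region]
    return safe


if __name__ == "__main__":
    invalid = {10, 20, 30}
    log = {"region-a": {10, 20}, "region-b": {10, 20, 30}}
    print(sorted(prunable_invalid_ids(invalid, ["region-a", "region-b"], log)))
```

In this model an ID is dropped only when every region has recorded a compaction that covered it, which matches the requirement that the transaction manager prune an ID only after all of that transaction's data has been removed.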



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
