[ https://issues.apache.org/jira/browse/PHOENIX-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342786#comment-15342786 ]
Poorna Chandra edited comment on PHOENIX-2993 at 6/22/16 12:44 AM:
-------------------------------------------------------------------

Thanks for the review [~anew] and [~jamestaylor]

[~anew] Regarding your questions -

We can have a plugin architecture where there is a plugin for every datastore that is transactional. Each plugin computes the prune upper bound for its own datastore. A service in the Transaction Manager can then get the prune upper bounds from all the plugins and do the pruning.

Then we can let the plugin handle things like -
* Figure out which tables are transactional. For HBase tables this can be a check to see whether the Transaction co-processor is attached to the table.
* Store intermediate data - like {{(regionid, prune-upper-bound-region)}}. Most likely the data will be stored in the datastore that the plugin is responsible for.

I'll add details on this to the design doc.
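To make the plugin idea above concrete, here is a minimal sketch in Java. It is not taken from Tephra or from the design doc: the names {{PrunePlugin}}, {{RegionBasedPlugin}}, {{computePruneUpperBound}} and {{prune}} are hypothetical. It also assumes that a datastore's prune upper bound is the minimum over its per-region bounds, that the service takes the minimum across all plugins, and that invalid transaction IDs at or below that value are safe to drop.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

public class PruneSketch {

  /** Hypothetical plugin contract: one implementation per transactional datastore. */
  interface PrunePlugin {
    /** Assumed meaning: data of all invalid transactions with id <= this value is gone. */
    long computePruneUpperBound();
  }

  /** Hypothetical HBase-style plugin that keeps (regionId, pruneUpperBoundRegion) pairs. */
  static class RegionBasedPlugin implements PrunePlugin {
    private final Map<String, Long> pruneUpperBoundByRegion = new HashMap<>();

    void recordRegion(String regionId, long pruneUpperBoundRegion) {
      pruneUpperBoundByRegion.put(regionId, pruneUpperBoundRegion);
    }

    @Override
    public long computePruneUpperBound() {
      // Assumption: the slowest region limits the whole datastore.
      // With no region data recorded, return -1 so that nothing is pruned.
      return pruneUpperBoundByRegion.isEmpty()
          ? -1L
          : Collections.min(pruneUpperBoundByRegion.values());
    }
  }

  /** Sketch of the Transaction Manager service: returns the invalid set after pruning. */
  static TreeSet<Long> prune(TreeSet<Long> invalid, List<PrunePlugin> plugins) {
    if (plugins.isEmpty()) {
      return new TreeSet<>(invalid); // no information, so prune nothing
    }
    long bound = Long.MAX_VALUE;
    for (PrunePlugin plugin : plugins) {
      // Assumption: an id is prunable only if every datastore has dropped its data.
      bound = Math.min(bound, plugin.computePruneUpperBound());
    }
    // Keep only the ids strictly greater than the combined bound.
    return new TreeSet<>(invalid.tailSet(bound, false));
  }

  public static void main(String[] args) {
    RegionBasedPlugin hbase = new RegionBasedPlugin();
    hbase.recordRegion("region-1", 120L);
    hbase.recordRegion("region-2", 90L);

    List<PrunePlugin> plugins = new ArrayList<>();
    plugins.add(hbase);
    plugins.add(() -> 100L); // stand-in for a second transactional datastore

    TreeSet<Long> invalid = new TreeSet<>(List.of(50L, 90L, 95L, 130L));
    System.out.println(prune(invalid, plugins)); // prints [95, 130]
  }
}
```

Taking the minimum at both levels is the conservative choice: a transaction ID leaves the invalid set only when no region in any datastore can still hold its data.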
> Tephra: Prune invalid transaction set once all data for a given invalid
> transaction has been dropped
> ----------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2993
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2993
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: Poorna Chandra
>            Assignee: Poorna Chandra
>         Attachments: ApacheTephraAutomaticInvalidListPruning.pdf
>
>
> From TEPHRA-35 -
> In addition to dropping the data from invalid transactions we need to be able
> to prune the invalid set of any transactions where data cleanup has been
> completely performed. Without this, the invalid set will grow indefinitely
> and become a greater and greater cost to in-progress transactions over time.
> To do this correctly, the TransactionDataJanitor coprocessor will need to
> maintain some bookkeeping for the transaction data that it removes, so that
> the transaction manager can reason about when all of a given transaction's
> data has been removed. Only at this point can the transaction manager safely
> drop the transaction ID from the invalid set.
> One approach would be for the TransactionDataJanitor to update a table
> marking when a major compaction was performed on a region and what
> transaction IDs were filtered out. Once all regions in a table containing the
> transaction data have been compacted, we can remove the filtered out
> transaction IDs from the invalid set. However, this will need to cope with
> changing region names due to splits, etc.
> Note: This will be moved to Tephra JIRA once the setup of Tephra JIRA is
> complete (INFRA-11445)