[ 
https://issues.apache.org/jira/browse/PHOENIX-2582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15113226#comment-15113226
 ] 

Thomas D'Silva commented on PHOENIX-2582:
-----------------------------------------

Attaching a possible solution from an email conversation with [~apurtell]:

>In lieu of an (external) transaction manager, maybe you could run a Procedure 
>that must complete before the index create is declared successful? Procedure 
>is HBase's internal coordination framework. HBase 0.98 and 1.0 have 
>ProcedureV1. HBase 1.1+ has ProcedureV2. 
>
>Your procedure workers would set the writestate on each region to readonly, 
>wait for in flight writes to finish, and then join the barrier. Once inside 
>the barrier your workers could make the index related state changes, or just 
>return if no further work needed. Your procedure workers would reset 
>writestate in the cleanup callback. Your coordinator (in the master) can wait 
>on a monitor for global completion or poll on a completion status check. Note 
>Procedures will complete in either successful or failed state. Failure may be 
>explicit (worker posted failure notice) or a timeout. If failed, you'll need 
>to retry. Once one of these has completed successfully, you would be good. 
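
To make the flow above concrete, here is a minimal sketch that simulates it with plain java.util.concurrent primitives. It does not use the HBase Procedure API: the Region class, its fields, and the runOnce/runWithRetry methods are all hypothetical stand-ins, and a CyclicBarrier plays the role of the procedure barrier.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class IndexBarrierSketch {

    /** Hypothetical stand-in for a region; not an HBase class. */
    static final class Region {
        final AtomicBoolean readOnly = new AtomicBoolean(false);
        final AtomicInteger inFlightWrites = new AtomicInteger(0);
        volatile boolean indexStateChanged = false;
    }

    /** One attempt: returns true only if every worker completed before the timeout. */
    static boolean runOnce(List<Region> regions, long timeoutMillis) throws InterruptedException {
        if (regions.isEmpty()) {
            return true; // nothing to coordinate
        }
        CyclicBarrier barrier = new CyclicBarrier(regions.size());
        CountDownLatch done = new CountDownLatch(regions.size());
        ExecutorService pool = Executors.newFixedThreadPool(regions.size());
        for (Region region : regions) {
            pool.execute(() -> {
                try {
                    region.readOnly.set(true);                 // stop accepting new writes
                    while (region.inFlightWrites.get() > 0) {  // wait for in-flight writes to drain
                        Thread.sleep(1);
                    }
                    barrier.await(timeoutMillis, TimeUnit.MILLISECONDS); // join the barrier
                    region.indexStateChanged = true;           // index related state change
                    done.countDown();
                } catch (Exception e) {
                    // Timeout, broken barrier, or interrupt: this worker never counts down,
                    // so the coordinator observes the attempt as failed.
                } finally {
                    region.readOnly.set(false);                // cleanup: reset the write state
                }
            });
        }
        // Coordinator: wait for global completion, treating a timeout as failure.
        boolean ok = done.await(timeoutMillis, TimeUnit.MILLISECONDS);
        pool.shutdownNow();                                    // interrupt any stuck workers
        pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        return ok;
    }

    /** Retries failed attempts, since a procedure ends either successful or failed. */
    static boolean runWithRetry(List<Region> regions, int maxAttempts, long timeoutMillis)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (runOnce(regions, timeoutMillis)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        List<Region> regions = Arrays.asList(new Region(), new Region(), new Region());
        System.out.println("procedure succeeded: " + runWithRetry(regions, 3, 1000));
    }
}
```

In this sketch a region whose in-flight writes never drain causes the whole attempt to time out, and every region still gets its write state reset in the finally block. A real implementation would presumably hook into the region's actual write path and the Procedure coordinator instead of these counters.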

> Creating an index while a batch of rows is being written leads to missing 
> rows in the index table
> -------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2582
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2582
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Thomas D'Silva
>
> If we create an index while we are upserting rows to the table, it's possible 
> that we miss writing the corresponding rows to the index table. 
> If a region server is writing a batch of rows and we create an index just 
> before the batch is written, we will miss writing that batch to the index 
> table. This is because we run the initial UPSERT SELECT to populate the index 
> with an SCN that we get from the server, which will be before the timestamp 
> at which the batch of rows is written. 
> We need to figure out if there is a way to determine that all pending batches 
> have been written before running the UPSERT SELECT to do the initial index 
> population.
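
The race described above can be illustrated with a small in-memory model. This is only a sketch: the Row class, the timestamps, and the populateIndex helper are hypothetical, and the "timestamp strictly earlier than the SCN" visibility rule is an assumption made for the example. The batch rows b1 and b2 stand for rows that began writing before the index existed (so no index rows were written alongside them) but landed with a timestamp after the SCN.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MissedIndexRowsSketch {

    /** Hypothetical data-table row: a key plus the timestamp it was written with. */
    static final class Row {
        final String key;
        final long timestamp;
        Row(String key, long timestamp) {
            this.key = key;
            this.timestamp = timestamp;
        }
    }

    /** Models the initial UPSERT SELECT: copies only rows visible at the given SCN. */
    static List<String> populateIndex(List<Row> dataTable, long scn) {
        List<String> index = new ArrayList<>();
        for (Row row : dataTable) {
            if (row.timestamp < scn) { // assumed visibility rule for this sketch
                index.add(row.key);
            }
        }
        return index;
    }

    /** Lists data-table keys that have no corresponding index entry. */
    static List<String> missingFromIndex(List<Row> dataTable, List<String> index) {
        List<String> missing = new ArrayList<>();
        for (Row row : dataTable) {
            if (!index.contains(row.key)) {
                missing.add(row.key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        long scn = 200; // SCN handed out by the server at index creation
        List<Row> dataTable = Arrays.asList(
                new Row("r1", 100),  // written well before the index was created
                new Row("b1", 250),  // pending batch, lands after the SCN
                new Row("b2", 250));
        List<String> index = populateIndex(dataTable, scn);
        System.out.println("index rows:   " + index);
        System.out.println("missing rows: " + missingFromIndex(dataTable, index));
    }
}
```

With these made-up timestamps only r1 is copied into the index, and b1 and b2 are reported missing, which is the symptom in the issue title.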



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
