James Taylor created PHOENIX-3847:
-------------------------------------

             Summary: Handle out of order rows during index maintenance
                 Key: PHOENIX-3847
                 URL: https://issues.apache.org/jira/browse/PHOENIX-3847
             Project: Phoenix
          Issue Type: Bug
            Reporter: James Taylor


Based on the investigation and work done in PHOENIX-3825, plus the existence of 
the ignoreNewerMutations flag, it seems that out-of-order rows are not handled 
correctly during index maintenance. Regardless of the order in which the server 
processes data table mutations, the resulting index rows should be the same and 
should be based purely on the cell timestamps of the data rows. Ideally, we 
shouldn't need the ignoreNewerMutations flag at all. Perhaps that was the 
intent with IndexUpdateManager.fixUpCurrentUpdates(), but it doesn't seem to be 
working.

Would it work to simply generate all the index rows for the mutating data rows 
for all versions? We should walk through a series of examples to see if this 
would work.  For example, with the following data table:

|Type|RowKey|Value|Timestamp
| Put | 1 | A | 1000
| Put | 1 | C | 3000

the index table would look like this:

|Type|RowKey|Timestamp
| Put | A,1 | 1000
| Del | A,1 | 3000
| Put | C,1 | 3000

Then if a Put comes in out of order at 2000, the data table would look like 
this:

|Type|RowKey|Value|Timestamp
| Put | 1 | A | 1000
| Put | 1 | B | 2000
| Put | 1 | C | 3000

and the index table should look like this:

|Type|RowKey|Timestamp
| Put | A,1 | 1000
| Del | A,1 | 2000
| Put | B,1 | 2000
| Del | B,1 | 3000
| Put | C,1 | 3000
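
To make the "same index rows regardless of processing order" goal concrete, here is a minimal sketch in Python (not Phoenix code) that derives the index cells purely from the timestamps of the data row versions. The function name index_cells and the tuple layout are hypothetical, chosen only for illustration, and it assumes a single indexed column with no row deletes.

```python
from typing import List, Tuple

# (cell type, index row key, timestamp); a hypothetical layout for illustration
Cell = Tuple[str, str, int]


def index_cells(row_key: str, versions: List[Tuple[str, int]]) -> List[Cell]:
    """Derive index cells from all versions of one data row.

    versions holds (value, timestamp) pairs in any arrival order.
    Sorting by timestamp first makes the result independent of that order.
    """
    ordered = sorted(versions, key=lambda v: v[1])
    cells: List[Cell] = []
    for i, (value, ts) in enumerate(ordered):
        if i > 0:
            prev_value = ordered[i - 1][0]
            if prev_value != value:
                # The previous index row stops being valid at this timestamp.
                cells.append(("Del", f"{prev_value},{row_key}", ts))
        cells.append(("Put", f"{value},{row_key}", ts))
    return cells


in_order = index_cells("1", [("A", 1000), ("B", 2000), ("C", 3000)])
out_of_order = index_cells("1", [("A", 1000), ("C", 3000), ("B", 2000)])
for cell in out_of_order:
    print(cell)
```

Both arrival orders yield the five cells in the table above. What this does not model is the incremental case, where the index already holds cells written before the late Put arrived.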

Given that we can't reverse Delete markers, I'm not sure we can get there 
completely. We'd still have the Delete of A,1 @ 3000 that was written before 
the out-of-order Put arrived. But perhaps this is not a problem? We'd need to 
play this out further and include scenarios with row deletes as well.




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
