[
https://issues.apache.org/jira/browse/PHOENIX-3847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008938#comment-16008938
]
Vincent Poon commented on PHOENIX-3847:
---------------------------------------
I guess then point queries wouldn't work. Hmm, yeah, not sure we can get around
the extra Delete of A,1 at 3000.
> Handle out of order rows during index maintenance
> -------------------------------------------------
>
> Key: PHOENIX-3847
> URL: https://issues.apache.org/jira/browse/PHOENIX-3847
> Project: Phoenix
> Issue Type: Bug
> Reporter: James Taylor
>
> Based on the investigation and work done in PHOENIX-3825 plus the existence
> of the ignoreNewerMutations flag, it seems that out of order rows are not
> handled correctly during index maintenance. When the user handles replaying
> failed batches, we force them to submit them in timestamp order. As long as
> the user provides the original timestamp, the order shouldn't matter.
> Regardless of the order the server processes data table mutations, the
> resulting index rows should be the same and should purely be based on the
> cell timestamp of the data rows. Ideally, we shouldn't need the
> ignoreNewerMutations flag at all. Perhaps that was the intent with
> IndexUpdateManager.fixUpCurrentUpdates(), but it doesn't seem to be working.
> Would it work to simply generate all the index rows for the mutating data
> rows for all versions? We should walk through a series of examples to see if
> this would work. For example, with the following data table:
> |Type|RowKey|Value|Timestamp
> | Put | 1 | A | 1000
> | Put | 1 | C | 3000
> the index table would look like this:
> |Type|RowKey|Timestamp
> | Put | A,1 | 1000
> | Del | A,1 | 3000
> | Put | C,1 | 3000
> Then if a Put comes in out of order at 2000, the data table would look like
> this:
> |Type|RowKey|Value|Timestamp
> | Put | 1 | A | 1000
> | Put | 1 | B | 2000
> | Put | 1 | C | 3000
> and the index table should look like this:
> |Type|RowKey|Timestamp
> | Put | A,1 | 1000
> | Del | A,1 | 2000
> | Put | B,1 | 2000
> | Del | B,1 | 3000
> | Put | C,1 | 3000
> Given that we can't reverse Delete markers, I'm not sure we can get there
> completely. We'd still have a Delete of A,1 @ 3000. But perhaps this is not a
> problem? We'd need to play this out further and include scenarios with row
> delete as well.
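
To make the example above easier to play out, here is a minimal Python sketch
of the idea. It is a toy model, not Phoenix code: the function names, the tuple
layout, and the "value,rowkey" index key format are assumptions made for
illustration, and it covers only Puts on a single indexed column (no row
deletes). The first function derives the index cells from all versions of a
data row, based purely on cell timestamps. The second replays Puts in arrival
order against a store where cells that were already written, including Delete
markers, can never be taken back.

```python
from typing import List, Tuple

# (type, index_row_key, timestamp). Illustrative layout, not a Phoenix type.
IndexCell = Tuple[str, str, int]
# (value, timestamp) for one version of a single data row.
Version = Tuple[str, int]


def index_rows_for_all_versions(row_key: str, versions: List[Version]) -> List[IndexCell]:
    """Derive index cells from every version of a data row, ordered by timestamp.

    The result depends only on the cell timestamps, not on arrival order.
    """
    cells: List[IndexCell] = []
    prev_value = None
    for value, ts in sorted(versions, key=lambda v: v[1]):
        if prev_value is not None and prev_value != value:
            # The older index row stops being current once a newer value exists.
            cells.append(("Del", f"{prev_value},{row_key}", ts))
        cells.append(("Put", f"{value},{row_key}", ts))
        prev_value = value
    return cells


def apply_append_only(row_key: str, arrivals: List[Version]) -> List[IndexCell]:
    """Replay Puts in arrival order; cells already written are never removed."""
    written: List[IndexCell] = []
    seen: List[Version] = []
    for version in arrivals:
        seen.append(version)
        # Regenerate the index cells for all versions seen so far and
        # write whichever ones are not in the index yet.
        for cell in index_rows_for_all_versions(row_key, seen):
            if cell not in written:
                written.append(cell)
    return written


if __name__ == "__main__":
    # B @ 2000 arrives last, after A @ 1000 and C @ 3000.
    arrivals = [("A", 1000), ("C", 3000), ("B", 2000)]
    ideal = index_rows_for_all_versions("1", arrivals)
    written = apply_append_only("1", arrivals)
    extra = [cell for cell in written if cell not in ideal]
    print(extra)  # [('Del', 'A,1', 3000)]
```

Under these assumptions the only difference from the desired index table is the
leftover Delete of A,1 @ 3000. Whether that marker is harmless for every kind
of read is the part that still needs to be played out, along with row deletes.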
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)