[
https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16579365#comment-16579365
]
Yonatan Gottesman commented on OMID-102:
----------------------------------------
Hi [~jamestaylor],
I cannot get testCheckpointAndRollback in Phoenix to pass.
When I debug the cells passed to filterKeyValue(Cell v) in our filter,
I can see that I'm getting the same cell with different versions even if I return
INCLUDE_AND_NEXT_COL. Then I noticed that Tephra wraps their filter with another
filter, CellSkipFilter, that seems to take care of this problem. From the code comments:
/**
 * {@link Filter} that encapsulates another {@link Filter}. It remembers the last {@link KeyValue}
 * for which the underlying filter returned the {@link ReturnCode#NEXT_COL} or {@link ReturnCode#INCLUDE_AND_NEXT_COL},
 * so that when {@link #filterKeyValue} is called again for the same {@link KeyValue} with different
 * version, it returns {@link ReturnCode#NEXT_COL} directly without consulting the underlying {@link Filter}.
 * Please see TEPHRA-169 for more details.
 */
Is this a known issue in HBase, that cells with lower versions are still passed to
filterKeyValue even if I return NEXT_COL?
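
To make the question concrete, here is a minimal sketch of the wrapping behavior that the comment above describes. It deliberately does not use the real HBase classes: ReturnCode, Cell, CellFilter and SkippingFilter are simplified, hypothetical stand-ins so the example compiles on its own, and this is not Tephra's actual CellSkipFilter code.

```java
public class CellSkipSketch {
    // Simplified stand-ins for the HBase types; NOT the real org.apache.hadoop.hbase API.
    enum ReturnCode { INCLUDE, INCLUDE_AND_NEXT_COL, SKIP, NEXT_COL }

    static final class Cell {
        final String row;
        final String family;
        final String qualifier;
        final long timestamp;

        Cell(String row, String family, String qualifier, long timestamp) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
        }

        // True when both cells address the same row/family/qualifier, whatever the version.
        boolean sameColumn(Cell other) {
            return other != null && row.equals(other.row)
                && family.equals(other.family) && qualifier.equals(other.qualifier);
        }
    }

    interface CellFilter {
        ReturnCode filterKeyValue(Cell cell);
    }

    // Wraps another filter and answers NEXT_COL for further versions of a column
    // that the wrapped filter has already finished with.
    static final class SkippingFilter implements CellFilter {
        private final CellFilter delegate;
        private Cell skipColumn; // last cell for which the delegate asked for the next column

        SkippingFilter(CellFilter delegate) {
            this.delegate = delegate;
        }

        @Override
        public ReturnCode filterKeyValue(Cell cell) {
            if (cell.sameColumn(skipColumn)) {
                // Same column, different version: do not consult the delegate again.
                return ReturnCode.NEXT_COL;
            }
            ReturnCode code = delegate.filterKeyValue(cell);
            boolean columnDone =
                code == ReturnCode.NEXT_COL || code == ReturnCode.INCLUDE_AND_NEXT_COL;
            skipColumn = columnDone ? cell : null;
            return code;
        }
    }

    public static void main(String[] args) {
        int[] delegateCalls = {0};
        CellFilter inner = cell -> {
            delegateCalls[0]++;
            return ReturnCode.INCLUDE_AND_NEXT_COL;
        };
        CellFilter filter = new SkippingFilter(inner);
        System.out.println(filter.filterKeyValue(new Cell("r1", "f", "q", 30)));
        System.out.println(filter.filterKeyValue(new Cell("r1", "f", "q", 20)));
        System.out.println(filter.filterKeyValue(new Cell("r1", "f", "q2", 30)));
        System.out.println("delegate calls: " + delegateCalls[0]);
    }
}
```

In this model the second call (same column, older version) is answered by the wrapper itself, so the inner filter is consulted only twice for the three cells.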
> Implement visibility filter as pure HBase Filter
> ------------------------------------------------
>
> Key: OMID-102
> URL: https://issues.apache.org/jira/browse/OMID-102
> Project: Apache Omid
> Issue Type: Sub-task
> Reporter: James Taylor
> Assignee: Yonatan Gottesman
> Priority: Major
>
> The way Omid currently filters through its own RegionScanner won't work the
> way it's implemented (i.e. the way the filtering is done *after* the next
> call). The reason is that the state of HBase filters gets messed up since
> these filters will start to see cells that they shouldn't (i.e. cells that
> would be filtered based on snapshot isolation). It cannot be worked around by
> manually running filters afterwards because filters may issue seek calls
> which are handled during the running of scans by HBase.
>
> Instead, the filtering needs to be implemented as a pure HBase filter and
> that filter needs to delegate to the other, delegate filter once it's
> determined that the cell is visible. See Tephra's TransactionVisibilityFilter
> and the way it calls the delegate filter (cellFilters) only after it's
> determined that the cell is visible. You may run into TEPHRA-169 without
> including the CellSkipFilter too.
> Because it'll be easier if you see shadow cells *before* their corresponding
> real cells, you can prefix instead of suffix the column qualifiers to
> guarantee that you'd see the shadow cells prior to the actual cells. Or you
> could buffer cells in your filter prior to omitting them. Another issue would
> be if the shadow cells aren't found and you need to consult the commit table.
> I suppose if the shadow cells come first, it would be easier for this logic
> to know when the commit table needs to be consulted.
>
> To reproduce, see the Phoenix unit tests
> FlappingTransactionIT.testInflightUpdateNotSeen() and
> testInflightDeleteNotSeen().
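
The delegation order described in the issue can be sketched as follows. This is a dependency-free model, not Omid's or Tephra's implementation: ReturnCode, Cell and CellFilter are simplified stand-ins for the HBase types, and the committedInSnapshot predicate is a hypothetical placeholder for the real shadow-cell / commit-table lookup. Mapping the delegate's INCLUDE to INCLUDE_AND_NEXT_COL is an assumption of this sketch (only the newest visible version is returned).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongPredicate;

public class VisibilityFilterSketch {
    // Simplified stand-ins for the HBase types; NOT the real org.apache.hadoop.hbase API.
    enum ReturnCode { INCLUDE, INCLUDE_AND_NEXT_COL, SKIP, NEXT_COL }

    static final class Cell {
        final String row;
        final String qualifier;
        final long timestamp;

        Cell(String row, String qualifier, long timestamp) {
            this.row = row;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
        }
    }

    interface CellFilter {
        ReturnCode filterKeyValue(Cell cell);
    }

    // Decides visibility first and consults the user's filter only for visible cells.
    static final class VisibilityFilter implements CellFilter {
        // Hypothetical: true when the write at this timestamp is committed within the snapshot.
        private final LongPredicate committedInSnapshot;
        private final CellFilter delegate;

        VisibilityFilter(LongPredicate committedInSnapshot, CellFilter delegate) {
            this.committedInSnapshot = committedInSnapshot;
            this.delegate = delegate;
        }

        @Override
        public ReturnCode filterKeyValue(Cell cell) {
            if (!committedInSnapshot.test(cell.timestamp)) {
                // Invisible version: the delegate never sees it; move on to the next version.
                return ReturnCode.SKIP;
            }
            ReturnCode code = delegate.filterKeyValue(cell);
            // Sketch assumption: after the newest visible version, older ones are irrelevant.
            switch (code) {
                case INCLUDE:
                    return ReturnCode.INCLUDE_AND_NEXT_COL;
                case SKIP:
                    return ReturnCode.NEXT_COL;
                default:
                    return code;
            }
        }
    }

    public static void main(String[] args) {
        List<Long> seenByDelegate = new ArrayList<>();
        CellFilter userFilter = cell -> {
            seenByDelegate.add(cell.timestamp);
            return ReturnCode.INCLUDE;
        };
        // Snapshot sees writes with timestamp <= 20; timestamp 30 is an in-flight write.
        CellFilter filter = new VisibilityFilter(ts -> ts <= 20, userFilter);
        System.out.println(filter.filterKeyValue(new Cell("r1", "q", 30)));
        System.out.println(filter.filterKeyValue(new Cell("r1", "q", 20)));
        System.out.println("delegate saw timestamps: " + seenByDelegate);
    }
}
```

Because the in-flight version is rejected before the delegate is called, the delegate's state is only ever built from cells that are visible under the snapshot.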