[
https://issues.apache.org/jira/browse/PHOENIX-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15039763#comment-15039763
]
James Taylor commented on PHOENIX-2221:
---------------------------------------
Thanks, [~ayingshu]. There's an issue with this approach that needs to be fixed
IMO to make it viable. The edits to the table have already been written by the
time the index updates fail, so the table and index are already out of sync.
Since you're preventing writes to the data table until the index is back up to
date, you should remember the timestamp of the writes whose index updates
failed and use it as an upper bound on the scan time range for any future scan
of the data table (until the index is back in sync). You could record this in
the same column you're using for the index. That way, the table and index would
remain in sync from a reader's point of view.
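
To make the upper-bound idea concrete, here is a minimal, self-contained
sketch. It does not use the Phoenix or HBase APIs: the Cell class and the
scanUpperBound and scan helpers are hypothetical names invented for
illustration, and the half-open range [minTs, maxTs) is assumed to mirror how
an HBase time range treats its bounds.

```java
import java.util.ArrayList;
import java.util.List;

class ScanUpperBoundSketch {

    /** Hypothetical stand-in for one timestamped cell in the data table. */
    static final class Cell {
        final String rowKey;
        final long timestamp;

        Cell(String rowKey, long timestamp) {
            this.rowKey = rowKey;
            this.timestamp = timestamp;
        }
    }

    /**
     * Picks the exclusive upper bound for data table scans. While the index
     * is out of sync, scans stop just short of the recorded failure timestamp,
     * so the writes the index never received stay invisible.
     */
    static long scanUpperBound(boolean indexInSync, long indexFailureTimestamp) {
        return indexInSync ? Long.MAX_VALUE : indexFailureTimestamp;
    }

    /** Returns the cells whose timestamp falls in [minTs, maxTs). */
    static List<Cell> scan(List<Cell> table, long minTs, long maxTs) {
        List<Cell> visible = new ArrayList<>();
        for (Cell cell : table) {
            if (cell.timestamp >= minTs && cell.timestamp < maxTs) {
                visible.add(cell);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        List<Cell> table = new ArrayList<>();
        table.add(new Cell("row1", 10L));
        table.add(new Cell("row2", 20L));
        table.add(new Cell("row3", 30L)); // index update for this write failed

        long bound = scanUpperBound(false, 30L);
        System.out.println("visible cells: " + scan(table, 0L, bound).size());
    }
}
```

In a real patch the bound would presumably be passed as the max argument of
HBase's Scan.setTimeRange(min, max), which also treats the upper bound as
exclusive.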
Here's some more detailed feedback:
- Mimic what MutableIndexFailureIT is doing for setup instead of doing the
setup yourself. You can also make your test parameterized to run for both local
and global mutable indexes. Your current version would stop and start a mini
cluster for each test, which would be excruciatingly slow.
- Instead of blocking writes in DeleteCompiler and UpsertCompiler, block them
in MutationState.validate(), after the call to updateCache is done. This will
guarantee that the client has the latest schema for the table, including any
indexes that may have been added by a different client that aren't yet known
about. It'd also handle the case where the index is again back online, as the
updateCache call would pull over the new index metadata.
- You're not implementing a delegate correctly for IndexFailurePolicy. Create a
separate DelegateIndexFailurePolicy class that takes an IndexFailurePolicy in
the constructor, has a delegate member variable, and calls delegate.<method>
in each method implementation. Eclipse will generate one of these for you.
- Can you rename PIndexState.READABLE to PIndexState.READ_ONLY?
- The change in QueryOptimizer can be simpler. You don't need to check
indexReadable. The index would never be put in that state if the option wasn't
configured (and checking it would force you to set the config on both the
client and the server). Just check if the index is ACTIVE or READABLE.
- Instead of creating a MetaDataClient.recoverPartialIndexFromTimeStamp method,
just add an argument to the buildPartialIndexFromTimeStamp call that says what
state to set the index back to in the event of failure.
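
For the delegate point above, a minimal sketch of the pattern might look like
the following. The FailurePolicy interface here is a simplified, hypothetical
stand-in (the real IndexFailurePolicy has different method signatures), and
BlockWritesFailurePolicy is an invented example of a subclass that adds
behaviour before forwarding.

```java
import java.util.List;

/** Simplified, hypothetical stand-in for Phoenix's IndexFailurePolicy. */
interface FailurePolicy {
    void setup(String tableName);

    void handleFailure(List<String> attemptedMutations, Exception cause);
}

/** Forwards every call to the wrapped policy; subclasses override selectively. */
class DelegateFailurePolicy implements FailurePolicy {
    private final FailurePolicy delegate;

    DelegateFailurePolicy(FailurePolicy delegate) {
        this.delegate = delegate;
    }

    @Override
    public void setup(String tableName) {
        delegate.setup(tableName);
    }

    @Override
    public void handleFailure(List<String> attemptedMutations, Exception cause) {
        delegate.handleFailure(attemptedMutations, cause);
    }
}

/** Invented example: record that writes should be blocked, then forward. */
class BlockWritesFailurePolicy extends DelegateFailurePolicy {
    private boolean writesBlocked;

    BlockWritesFailurePolicy(FailurePolicy delegate) {
        super(delegate);
    }

    @Override
    public void handleFailure(List<String> attemptedMutations, Exception cause) {
        writesBlocked = true; // extra behaviour added by this policy
        super.handleFailure(attemptedMutations, cause); // default handling still runs
    }

    boolean isWritesBlocked() {
        return writesBlocked;
    }
}
```

Keeping the forwarding in its own class means the new policy only overrides
the methods it changes, and the wrapped policy's behaviour is preserved for
everything else.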
> Option to make data regions not writable when index regions are not available
> -----------------------------------------------------------------------------
>
> Key: PHOENIX-2221
> URL: https://issues.apache.org/jira/browse/PHOENIX-2221
> Project: Phoenix
> Issue Type: Improvement
> Reporter: Devaraj Das
> Assignee: Alicia Ying Shu
> Fix For: 4.7.0
>
> Attachments: PHOENIX-2221-v1.patch, PHOENIX-2221-v2.patch,
> PHOENIX-2221-v3.patch, PHOENIX-2221.patch
>
>
> In one use case, it was deemed better to not accept writes when the index
> regions are unavailable for any reason (as opposed to disabling the index and
> the queries doing bigger data-table scans).
> The idea is that the index regions are kept consistent with the data regions,
> and when a query runs against the index regions, one can be reasonably sure
> that the query ran with the most recent data in the data regions. When the
> index regions are unavailable, the writes to the data table are rejected.
> Read queries off of the index regions would have deterministic performance
> (and on the other hand if the index is disabled, then the read queries would
> have to go to the data regions until the indexes are rebuilt, and the queries
> would suffer).