[ 
https://issues.apache.org/jira/browse/PHOENIX-2446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15090746#comment-15090746
 ] 

James Taylor commented on PHOENIX-2446:
---------------------------------------

Was thinking on my above idea a bit, and I don't think it'll help for all 
cases. The trickiest one is when a big UPSERT SELECT is running when a CREATE 
INDEX occurs. All of the rows being upserted will be timestamped back in time 
(at the timestamp at which the data was read). We need to do that so that the 
scans reading the data won't see the data being written (and get into an 
infinite loop when reading/writing to the same table). 
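
To make the visibility argument concrete, here is a toy Java model (not Phoenix
or HBase code). It assumes a scan only sees cells whose timestamp is strictly
below the scan's upper bound, and that the UPSERT SELECT stamps every row it
writes with that same bound. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a multi-versioned table. Hypothetical names; not Phoenix or HBase code. */
final class BackdatedUpsertSketch {
    /** One stored cell: a row key and the timestamp it was written with. */
    static final class Cell {
        final String rowKey;
        final long timestamp;

        Cell(String rowKey, long timestamp) {
            this.rowKey = rowKey;
            this.timestamp = timestamp;
        }
    }

    private final List<Cell> cells = new ArrayList<>();

    public void put(String rowKey, long timestamp) {
        cells.add(new Cell(rowKey, timestamp));
    }

    /** Assumption: a scan sees only cells with a timestamp strictly below its upper bound. */
    public int countVisible(long scanUpperBound) {
        int visible = 0;
        for (Cell cell : cells) {
            if (cell.timestamp < scanUpperBound) {
                visible++;
            }
        }
        return visible;
    }

    /**
     * Copies every visible row back into the same table. The loop walks the live list by
     * index, so it does encounter the cells it has just written. Because each copy is
     * stamped with readTimestamp, the visibility check skips it and the loop terminates.
     */
    public int selfUpsertSelect(long readTimestamp) {
        int written = 0;
        for (int i = 0; i < cells.size(); i++) {
            Cell cell = cells.get(i);
            if (cell.timestamp < readTimestamp) {
                put(cell.rowKey + "_copy", readTimestamp);
                written++;
            }
        }
        return written;
    }
}
```

In this model, stamping the copies with any timestamp below the scan bound would
make the loop pick up its own writes and never finish.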

The sleep we're doing may help some simple cases, like an UPSERT VALUES 
occurring at the same time as the CREATE INDEX, but it's unlikely to help for 
the above case. One way to deal with this would be for the client to detect 
that a CREATE INDEX occurred while the UPSERT SELECT is running and start to do 
incremental maintenance on the rows being upserted. We'd need to update the 
metadata cache using LATEST_TIMESTAMP at which point the index would be found 
and incremental index maintenance would start.
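
As a rough sketch of the detection step, the client could compare the indexes it
knew about when the statement started with those found after refreshing the
metadata cache. The class and method below are hypothetical illustrations in
plain Java, not Phoenix APIs, and index names stand in for real index metadata.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper; not part of Phoenix. */
final class NewIndexDetectionSketch {
    /**
     * Returns the indexes that are present after the metadata cache refresh but were
     * unknown when the statement began. These are the ones that would need
     * incremental maintenance for the remaining rows.
     */
    public static Set<String> newlyVisibleIndexes(Set<String> atStatementStart,
                                                  Set<String> afterRefresh) {
        Set<String> added = new LinkedHashSet<>(afterRefresh);
        added.removeAll(atStatementStart);
        return added;
    }
}
```

An empty result means no CREATE INDEX was observed, so the statement would carry
on unchanged.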

One consideration is if a DELETE is being executed at the same time as the 
UPSERT SELECT (I think this would be an issue even outside of this overlapping 
create index/mutation case). In theory, mutable indexes should handle this, as 
they handle out-of-order updates. For the immutable case, we could:
1. Ignore it and document it, since deletes over immutable data are more of a 
test-environment type of feature. The one case where that's not true is DROP 
of a view, but this case could be handled by detecting it when we update the 
metadata cache (as we'd no longer find the view).
2. Tell the server that these indexes are mutable and let the out-of-order 
logic handle this case. Handling a DELETE at the same time as an UPSERT 100% 
correctly would require always treating these indexes as mutable, so we'd 
lose the perf benefit of indexes over immutable data.

Based on this, I think (1) is a better option with clear docs on the interplay 
between DELETE and UPSERT. To handle it this way, we'd need to:
- Issue the updateCache call we do in MutationState.validate() at the 
LATEST_TIMESTAMP if the table is not transactional. This will handle all the 
mutation cases: UPSERT VALUES, UPSERT SELECT, and DELETE, essentially 
initiating incremental index maintenance *before* the index has been created to 
ensure that we don't miss any rows. Worst case, we'll be issuing duplicate 
mutations (but that's better than not issuing enough).
- If upon updating the metadata cache we do not find the table any longer *and* 
the table is immutable, then the mutation should not be performed (the logic 
being that a DROP was performed after the mutation started). 
- None of this logic will be necessary for transactional tables as we have a 
different mechanism that relies on the transaction manager to detect these 
cases for us (see PHOENIX-2478).
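
The three rules above can be sketched as a small decision function. This is a
hypothetical illustration, not the actual MutationState.validate() code: the
class, the enum, and both method names are assumptions, and the boolean flags
stand in for real table metadata. The only value taken from HBase is
LATEST_TIMESTAMP, which equals Long.MAX_VALUE there. Cases the proposal does not
address are mapped to EXISTING_BEHAVIOR rather than guessed at.

```java
/** Hypothetical sketch of the proposed rule; not Phoenix's actual MutationState code. */
final class MutationValidationSketch {
    /** Same value as HBase's HConstants.LATEST_TIMESTAMP. */
    static final long LATEST_TIMESTAMP = Long.MAX_VALUE;

    enum Action {
        PROCEED,           // table still exists: send the mutation (plus index updates)
        SKIP_MUTATION,     // table gone and immutable: assume a DROP raced with us
        EXISTING_BEHAVIOR  // not covered by the proposal: leave current handling alone
    }

    /** Timestamp at which to refresh the metadata cache before sending mutations. */
    public static long updateCacheTimestamp(boolean transactional, long statementTimestamp) {
        // Assumption: transactional tables keep using the statement's own timestamp.
        return transactional ? statementTimestamp : LATEST_TIMESTAMP;
    }

    /** Decides what to do with a pending mutation after the cache refresh. */
    public static Action decide(boolean transactional, boolean tableFound, boolean immutable) {
        if (transactional) {
            return Action.EXISTING_BEHAVIOR; // handled by the transaction manager instead
        }
        if (tableFound) {
            return Action.PROCEED;
        }
        // The proposal only specifies the immutable case for a missing table.
        return immutable ? Action.SKIP_MUTATION : Action.EXISTING_BEHAVIOR;
    }
}
```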

Make sense, [~tdsilva]?




> Immutable index - Index vs base table row count does not match when index is 
> created during data load
> -----------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2446
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2446
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.6.0
>            Reporter: Mujtaba Chohan
>            Assignee: Thomas D'Silva
>             Fix For: 4.7.0
>
>         Attachments: PHOENIX-2446.patch
>
>
> I'll add more details later but here's the scenario that consistently 
> produces wrong row count for index table vs base table for immutable async 
> index.
> 1. Start data upsert
> 2. Create async index
> 3. Trigger M/R index build
> 4. Keep data upsert going in the background during steps 2 and 3, and for a 
> while after the M/R index build finishes.
> 5. End data upsert. 
> Now count with index enabled vs count with hint to not use index is off by a 
> large factor. Will get a cleaner repro for this issue soon.



