[
https://issues.apache.org/jira/browse/PHOENIX-2446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102990#comment-15102990
]
Lars Hofhansl commented on PHOENIX-2446:
----------------------------------------
Chatted with James a bit. So to recap, the problem is the HBase MVCC
transactions that are in flight before the index is created; those would be
missed since they did not exist when the index was created and are also not yet
seen by a more or less parallel upsert select statement, right?
A flush won't help.
The only thing that I can see would help is to await at least one MVCC
transaction on all region servers. That is annoying, but doable.
I.e., calling {{mvcc.completeMemstoreInsert(mvcc.beginMemstoreInsert());}}, which
will force all prior MVCC transactions - if any - to return.
{{mvcc}} can be retrieved from the region interface with the {{getMVCC()}} method.
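To illustrate why beginning and immediately completing a new MVCC transaction acts as a barrier, here is a minimal, self-contained sketch. {{ToyMvcc}} and all of its methods are hypothetical stand-ins written for this example, not the HBase class; the sketch assumes only the behavior described above, namely that completing a write entry blocks until every earlier entry has completed.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of an MVCC write queue. Hypothetical; NOT the HBase implementation. */
public class ToyMvcc {
    /** One in-flight write transaction. */
    public static final class WriteEntry {
        final long writeNumber;
        boolean completed;
        WriteEntry(long writeNumber) { this.writeNumber = writeNumber; }
    }

    private final Deque<WriteEntry> queue = new ArrayDeque<>();
    private long nextWriteNumber = 1;
    private long readPoint = 0; // highest write number visible to readers

    /** Starts a write transaction and queues it behind all earlier ones. */
    public synchronized WriteEntry begin() {
        WriteEntry e = new WriteEntry(nextWriteNumber++);
        queue.addLast(e);
        return e;
    }

    /** Marks e as done and advances the read point over completed head entries. */
    public synchronized void markDone(WriteEntry e) {
        e.completed = true;
        while (!queue.isEmpty() && queue.peekFirst().completed) {
            readPoint = queue.removeFirst().writeNumber;
        }
        notifyAll();
    }

    /** Marks e as done, then blocks until the read point has reached e. */
    public synchronized void complete(WriteEntry e) {
        markDone(e);
        try {
            while (readPoint < e.writeNumber) {
                wait(); // releases the monitor until another writer finishes
            }
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", ex);
        }
    }

    public synchronized long readPoint() { return readPoint; }

    /** The barrier: returns only after all previously begun writes are done. */
    public void awaitPriorWrites() {
        complete(begin());
    }

    /** Returns true if a write begun before the barrier is visible after it. */
    public static boolean demo() {
        ToyMvcc mvcc = new ToyMvcc();
        WriteEntry inFlight = mvcc.begin(); // e.g. an upsert started before index creation
        Thread writer = new Thread(() -> {
            try { Thread.sleep(100); } catch (InterruptedException ignored) { }
            mvcc.markDone(inFlight);
        });
        writer.start();
        mvcc.awaitPriorWrites(); // blocks until inFlight has completed
        boolean visible = mvcc.readPoint() >= inFlight.writeNumber;
        try {
            writer.join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
        return visible;
    }

    public static void main(String[] args) {
        System.out.println("prior write visible after barrier: " + demo());
    }
}
```

In this model the barrier entry is queued behind the in-flight write, so the read point cannot reach it until the earlier write has finished. With no writes in flight, the barrier returns immediately.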
(As an aside, I'd add that upsert select won't scale to the kind of data size
where HBase/Phoenix would actually be interesting. For less than maybe a few
hundred million rows, one should use Postgres or equivalent.)
> Immutable index - Index vs base table row count does not match when index is
> created during data load
> -----------------------------------------------------------------------------------------------------
>
> Key: PHOENIX-2446
> URL: https://issues.apache.org/jira/browse/PHOENIX-2446
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.6.0
> Reporter: Mujtaba Chohan
> Assignee: Thomas D'Silva
> Fix For: 4.7.0
>
> Attachments: PHOENIX-2446-wip.patch, PHOENIX-2446.patch, server.log
>
>
> I'll add more details later but here's the scenario that consistently
> produces a wrong row count for the index table vs. the base table for an
> immutable async index.
> 1. Start data upsert
> 2. Create async index
> 3. Trigger M/R index build
> 4. Keep data upsert going in background during steps 2 and 3 and for a while
> after the M/R index build finishes.
> 5. End data upsert.
> Now the count with the index enabled vs. the count with a hint to not use the
> index is off by a large factor. Will get a cleaner repro for this issue soon.