[ https://issues.apache.org/jira/browse/PHOENIX-2446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15091222#comment-15091222 ]
James Taylor commented on PHOENIX-2446:
---------------------------------------
Yes, the SCN of the index population would be after the timestamp of the
UPSERT SELECT, but the UPSERT SELECT may still be running when the CREATE INDEX
is issued. That's the root of the problem: when the initial index population
runs, we may not yet have all the table rows whose timestamps are earlier than
the population timestamp. The fix I propose will essentially turn on
incremental index population for in-flight statements that may be in the
process of adding data table rows when the CREATE INDEX statement is issued.
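To make the interleaving concrete, here is a toy model of the race. It is only a sketch under stated assumptions: the class, the Row type, and the timestamps are all made up for illustration, none of it is Phoenix code, and it assumes the in-flight statement stamps its rows with its own (earlier) timestamp.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the race described above. None of these names are Phoenix APIs. */
final class IndexRaceSketch {

    /** Hypothetical row: a key plus the timestamp of the statement that wrote it. */
    static final class Row {
        final String key;
        final long ts;
        Row(String key, long ts) { this.key = key; this.ts = ts; }
    }

    /**
     * Replays one interleaving and returns {dataRowCount, indexRowCount}.
     * maintainInFlight = true models the proposed fix.
     */
    static int[] simulate(boolean maintainInFlight) {
        List<Row> dataTable = new ArrayList<>();
        List<Row> indexTable = new ArrayList<>();

        long upsertTs = 100L;                  // UPSERT SELECT starts (assumed value)
        dataTable.add(new Row("a", upsertTs)); // row written before CREATE INDEX

        long populationScn = 200L;             // CREATE INDEX: populate as of this SCN
        for (Row r : dataTable) {
            if (r.ts < populationScn) {
                indexTable.add(r);
            }
        }

        // The UPSERT SELECT is still running and writes a row stamped with its
        // earlier timestamp, after the initial population has already scanned.
        Row late = new Row("b", upsertTs);
        dataTable.add(late);
        if (maintainInFlight) {
            indexTable.add(late);              // incremental index update for the in-flight write
        }
        return new int[] { dataTable.size(), indexTable.size() };
    }

    public static void main(String[] args) {
        int[] before = simulate(false);
        int[] after = simulate(true);
        System.out.println("without fix: data=" + before[0] + " index=" + before[1]);
        System.out.println("with fix:    data=" + after[0] + " index=" + after[1]);
    }
}
```

Without the incremental step the late row has a timestamp below the population SCN but was never seen by the population scan, so the index ends up one row short of the data table.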
One more small change that I didn't mention before will also be necessary. When
auto commit is on, we call updateCache for UPSERT VALUES from FromCompiler here:
{code}
protected TableRef createTableRef(NamedTableNode tableNode, boolean updateCacheImmediately) throws SQLException {
    String tableName = tableNode.getName().getTableName();
    String schemaName = tableNode.getName().getSchemaName();
    long timeStamp = QueryConstants.UNSET_TIMESTAMP;
    String fullTableName = SchemaUtil.getTableName(schemaName, tableName);
    PName tenantId = connection.getTenantId();
    PTable theTable = null;
    if (updateCacheImmediately || connection.getAutoCommit()) {
        MetaDataMutationResult result = client.updateCache(schemaName, tableName);
{code}
Instead, we should always let the updateCache call be done in
MutationState.validate(), so the following if statement should be removed:
{code}
private long validate(TableRef tableRef, Map<ImmutableBytesPtr, RowMutationState> rowKeyToColumnMap) throws SQLException {
    Long scn = connection.getSCN();
    MetaDataClient client = new MetaDataClient(connection);
    long serverTimeStamp = tableRef.getTimeStamp();
    // If we're auto committing, we've already validated the schema when we got the ColumnResolver,
    // so no need to do it again here.
    if (!connection.getAutoCommit()) {
{code}
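As a rough illustration of why the refresh has to happen at commit time, here is a toy model of a client-side metadata cache. It is a sketch of my reading of the change, not Phoenix code: Server, Client, compile(), and commitBatch() are hypothetical stand-ins, and the only point is the ordering of the cache refresh relative to the CREATE INDEX.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of compile-time vs. commit-time metadata refresh. Not Phoenix code. */
final class ValidateAtCommitSketch {

    /** Hypothetical stand-in for the server-side catalog. */
    static final class Server {
        final List<String> indexes = new ArrayList<>();
    }

    /** Hypothetical stand-in for a client connection with a metadata cache. */
    static final class Client {
        private final Server server;
        private final boolean alwaysRefreshOnCommit;
        private List<String> cachedIndexes = new ArrayList<>();

        Client(Server server, boolean alwaysRefreshOnCommit) {
            this.server = server;
            this.alwaysRefreshOnCommit = alwaysRefreshOnCommit;
        }

        private void refreshCache() {
            cachedIndexes = new ArrayList<>(server.indexes);
        }

        /** Compile step: models refreshing the cache once, when the statement starts. */
        void compile() {
            refreshCache();
        }

        /** Commit step: returns the indexes this batch would also be written to. */
        List<String> commitBatch() {
            if (alwaysRefreshOnCommit) {
                refreshCache(); // models always refreshing at validation time
            }
            return new ArrayList<>(cachedIndexes);
        }
    }

    /** Runs: statement starts, index is created elsewhere, statement commits a batch. */
    static List<String> run(boolean alwaysRefreshOnCommit) {
        Server server = new Server();
        Client client = new Client(server, alwaysRefreshOnCommit);
        client.compile();            // long-running statement starts
        server.indexes.add("IDX");   // CREATE INDEX issued by another client
        return client.commitBatch(); // next batch of the in-flight statement
    }

    public static void main(String[] args) {
        System.out.println("refresh at compile only: " + run(false));
        System.out.println("refresh at every commit: " + run(true));
    }
}
```

With the refresh done only at compile time, the in-flight statement never learns about the index created after it started; refreshing on every commit lets the next batch pick it up.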
> Immutable index - Index vs base table row count does not match when index is
> created during data load
> -----------------------------------------------------------------------------------------------------
>
> Key: PHOENIX-2446
> URL: https://issues.apache.org/jira/browse/PHOENIX-2446
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.6.0
> Reporter: Mujtaba Chohan
> Assignee: Thomas D'Silva
> Fix For: 4.7.0
>
> Attachments: PHOENIX-2446.patch
>
>
> I'll add more details later but here's the scenario that consistently
> produces wrong row count for index table vs base table for immutable async
> index.
> 1. Start data upsert
> 2. Create async index
> 3. Trigger M/R index build
> 4. Keep data upsert going in the background during steps 2 and 3 and for a
> while after the M/R index build finishes.
> 5. End data upsert.
> Now count with index enabled vs count with hint to not use index is off by a
> large factor. Will get a cleaner repro for this issue soon.