[
https://issues.apache.org/jira/browse/PHOENIX-3970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16058796#comment-16058796
]
Samarth Jain commented on PHOENIX-3970:
---------------------------------------
The way index rebuild works is by kicking off a select count(*) query on the
data table with the index rebuild flag on. See this in
UngroupedAggregateRegionObserver:
{code}
if (ScanUtil.isIndexRebuild(scan)) {
    return rebuildIndices(s, region, scan, env.getConfiguration());
}
{code}
This kicks off a raw scan of the data table, which replays the scanned
mutations on the *data table* itself. For every such batch of
"replay mutations", the indexer co-processor does the rebuild work.
See LocalTableState#getCurrentRowState()
{code}
// need to use a scan here so we can get raw state, which Get doesn't provide.
Scan s =
    IndexManagementUtil.newLocalStateScan(Collections.singletonList(columns));
s.setStartRow(row);
s.setStopRow(row);
if (ignoreNewerMutations) {
    // Provides a means of client indicating that newer cells should not be
    // considered, enabling mutations to be replayed to partially rebuild the
    // index when a write fails.
    // When replaying mutations we want the oldest timestamp (as anything
    // newer will be replayed)
    long ts = getOldestTimestamp(m.getFamilyCellMap().values());
    s.setTimeRange(0, ts);
}
{code}
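To illustrate the time-range logic above, here is a minimal plain-Java sketch
with no HBase dependency. It assumes the half-open [minStamp, maxStamp)
semantics of Scan#setTimeRange; the class and method names
(ReplayTimeRangeSketch, oldestTimestamp, visibleVersions) are hypothetical
stand-ins and not Phoenix code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in for the replay time-range logic; not Phoenix code.
class ReplayTimeRangeSketch {

    // Smallest timestamp across all cells of the mutation being replayed,
    // grouped by column family (a simplified analogue of getOldestTimestamp).
    static long oldestTimestamp(Collection<List<Long>> cellTimestampsByFamily) {
        long oldest = Long.MAX_VALUE;
        for (List<Long> family : cellTimestampsByFamily) {
            for (long ts : family) {
                oldest = Math.min(oldest, ts);
            }
        }
        return oldest;
    }

    // Versions a scan would see with a half-open time range [minStamp, maxStamp).
    static List<Long> visibleVersions(List<Long> stored, long minStamp, long maxStamp) {
        List<Long> visible = new ArrayList<>();
        for (long ts : stored) {
            if (ts >= minStamp && ts < maxStamp) {
                visible.add(ts);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        // Versions already stored for the row in the data table.
        List<Long> stored = Arrays.asList(100L, 200L, 300L);
        // Cell timestamps of the mutation being replayed, per column family.
        List<List<Long>> replayed =
            Arrays.asList(Arrays.asList(250L, 200L), Arrays.asList(220L));

        long ts = oldestTimestamp(replayed);
        // Only state strictly older than the replayed mutation is considered.
        System.out.println(visibleVersions(stored, 0L, ts)); // prints [100]
    }
}
```

The upper bound is exclusive, so the current row state excludes the replayed
cells themselves as well as anything newer.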
Now, the replay of mutations on the data table needs to run on a handler pool
that is different from the handler pool doing the indexer work. It looks like
this patch makes the count(*) query use the index handler pool as well, which
can cause deadlocks.
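The deadlock concern above can be reproduced in miniature with plain
java.util.concurrent executors. This is a toy analogue under stated
assumptions, not HBase RPC scheduling: each "handler pool" is a one-thread
executor, the outer task stands in for the rebuild scan replaying mutations,
and the inner task stands in for the index work it waits on. The names
(HandlerPoolSketch, runNested) are hypothetical, and a timeout is used so the
demo reports starvation instead of hanging.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Toy model of nested work on handler pools; not Phoenix or HBase code.
class HandlerPoolSketch {

    // Runs an outer task on outerPool that submits an inner task to innerPool
    // and waits for it. Returns true if the inner task completed, false if
    // the outer task timed out waiting (pool starvation).
    static boolean runNested(ExecutorService outerPool, ExecutorService innerPool)
            throws Exception {
        Future<Boolean> outer = outerPool.submit(() -> {
            Future<String> inner = innerPool.submit(() -> "index update done");
            try {
                inner.get(500, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                inner.cancel(true);
                return false;
            }
        });
        return outer.get();
    }

    public static void main(String[] args) throws Exception {
        // Same pool for both: the outer task holds the only handler, so the
        // inner task can never be scheduled.
        ExecutorService shared = Executors.newFixedThreadPool(1);
        System.out.println("shared pool completed: " + runNested(shared, shared));
        shared.shutdownNow();

        // Separate pools: the inner task gets its own handler.
        ExecutorService dataPool = Executors.newFixedThreadPool(1);
        ExecutorService indexPool = Executors.newFixedThreadPool(1);
        System.out.println("separate pools completed: " + runNested(dataPool, indexPool));
        dataPool.shutdownNow();
        indexPool.shutdownNow();
    }
}
```

With a larger pool the same starvation appears once every handler is occupied
by an outer task waiting on inner work queued behind it.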
> Ensure that automatic partial index rebuilds are served from the index
> handler pool
> -----------------------------------------------------------------------------------
>
> Key: PHOENIX-3970
> URL: https://issues.apache.org/jira/browse/PHOENIX-3970
> Project: Phoenix
> Issue Type: Bug
> Reporter: Lars Hofhansl
> Assignee: Lars Hofhansl
> Attachments: 3970.txt, 3970-v2.txt
>
>
> This (and other issues) have rendered multiple larger clusters inoperable.