[
https://issues.apache.org/jira/browse/PHOENIX-6476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352669#comment-17352669
]
ASF GitHub Bot commented on PHOENIX-6476:
-----------------------------------------
gokceni commented on a change in pull request #1240:
URL: https://github.com/apache/phoenix/pull/1240#discussion_r640841573
##########
File path: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/IndexRepairRegionScanner.java
##########
@@ -303,20 +306,47 @@ public Boolean call() throws Exception {
return dataRowKeys;
}
+    /**
+     * @param indexMutationMap actual index mutations for a page
+     * @param dataRowKeysSetList List of per-task data row keys
+     * @return For each set of data row keys, split the actual index mutation map into
+     * a per-task index mutation map and return the list of all index mutation maps.
+     */
+    private List<Map<byte[], List<Mutation>>> getPerTaskIndexMutationMap(
+            Map<byte[], List<Mutation>> indexMutationMap, List<Set<byte[]>> dataRowKeysSetList) {
+        List<Map<byte[], List<Mutation>>> mapList = Lists.newArrayListWithExpectedSize(dataRowKeysSetList.size());
+        for (int i = 0; i < dataRowKeysSetList.size(); ++i) {
+            Map<byte[], List<Mutation>> perTaskIndexMutationMap = new TreeMap<>(Bytes.BYTES_COMPARATOR);
+            mapList.add(perTaskIndexMutationMap);
+        }
+        for (Map.Entry<byte[], List<Mutation>> entry : indexMutationMap.entrySet()) {
+            byte[] indexRowKey = entry.getKey();
+            List<Mutation> actualMutationList = entry.getValue();
+            byte[] dataRowKey = indexMaintainer.buildDataRowKey(new ImmutableBytesWritable(indexRowKey), viewConstants);
+            for (int i = 0; i < dataRowKeysSetList.size(); ++i) {
+                if (dataRowKeysSetList.get(i).contains(dataRowKey)) {
+                    mapList.get(i).put(indexRowKey, actualMutationList);
Review comment:
Don't you need to break out of the inner loop here once the matching task is found?
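To make the question concrete, here is a minimal, self-contained sketch of the same split with an early exit. It is not the patch itself: the class name `PerTaskSplitSketch`, the generic `M` standing in for `Mutation`, the `toDataRowKey` function standing in for `indexMaintainer.buildDataRowKey(...)`, and the use of `Arrays::compareUnsigned` in place of `Bytes.BYTES_COMPARATOR` are all assumptions made so the example runs without HBase on the classpath. It also assumes the per-task key sets are disjoint, which is what would make the `break` safe.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.function.Function;

class PerTaskSplitSketch {
    // Assumption: unsigned lexicographic order, used here as a stand-in for
    // HBase's Bytes.BYTES_COMPARATOR. byte[] keys need a content-based
    // comparator because arrays use identity-based equals/hashCode.
    static final Comparator<byte[]> BYTES = Arrays::compareUnsigned;

    // Splits a per-page map (index row key -> mutations) into one map per task.
    // toDataRowKey is a hypothetical stand-in for deriving the data row key
    // from an index row key.
    static <M> List<Map<byte[], List<M>>> split(
            Map<byte[], List<M>> pageMap,
            List<Set<byte[]>> dataRowKeysSetList,
            Function<byte[], byte[]> toDataRowKey) {
        List<Map<byte[], List<M>>> mapList = new ArrayList<>(dataRowKeysSetList.size());
        for (int i = 0; i < dataRowKeysSetList.size(); ++i) {
            mapList.add(new TreeMap<>(BYTES));
        }
        for (Map.Entry<byte[], List<M>> entry : pageMap.entrySet()) {
            byte[] dataRowKey = toDataRowKey.apply(entry.getKey());
            for (int i = 0; i < dataRowKeysSetList.size(); ++i) {
                if (dataRowKeysSetList.get(i).contains(dataRowKey)) {
                    mapList.get(i).put(entry.getKey(), entry.getValue());
                    // Assuming each data row key belongs to exactly one task,
                    // the remaining sets cannot match, so stop scanning.
                    break;
                }
            }
        }
        return mapList;
    }
}
```

Without the `break` the result is the same under the disjointness assumption; the loop just keeps probing the remaining sets for a key that has already been placed.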
> Index tool when verifying from index to data doesn't correctly split page
> into tasks
> ------------------------------------------------------------------------------------
>
> Key: PHOENIX-6476
> URL: https://issues.apache.org/jira/browse/PHOENIX-6476
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.14.3, 4.16.0, 4.16.1
> Reporter: Tanuj Khurana
> Assignee: Tanuj Khurana
> Priority: Major
>
> When running index tool with index table as source, it splits a page into
> tasks when the page size is greater than the configured task size (default
> 2048) and runs each task in parallel. Each task is assigned a set of data row
> keys but the index mutation map is not split according to the data row keys
> assigned to a particular task. As a result, the tool reports wrong results
> because the index mutation map is per page but the set of data row keys is
> per task.