Davis-Zhang-Onehouse commented on code in PR #13622:
URL: https://github.com/apache/hudi/pull/13622#discussion_r2234118854
##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadata.java:
##########
@@ -831,49 +828,23 @@ public HoodiePairData<String, Set<String>> getSecondaryIndexRecords(HoodieData<S
}
}
- private HoodiePairData<String, Set<String>> getSecondaryIndexRecordsV1(HoodieData<String> keys, String partitionName) {
+ private HoodiePairData<String, String> getSecondaryIndexRecordsV1(HoodieData<String> keys, String partitionName) {
if (keys.isEmpty()) {
return HoodieListPairData.eager(Collections.emptyList());
}
- Map<String, Set<String>> res = getRecordsByKeyPrefixes(keys, partitionName, false, SecondaryIndexKeyUtils::escapeSpecialChars)
- .map(record -> {
- if (!record.getData().isDeleted()) {
- return SecondaryIndexKeyUtils.getSecondaryKeyRecordKeyPair(record.getRecordKey());
- }
- return null;
- })
- .filter(Objects::nonNull)
- .collectAsList()
- .stream()
- .collect(HashMap::new,
- (map, pair) -> map.computeIfAbsent(pair.getKey(), k -> new HashSet<>()).add(pair.getValue()),
- (map1, map2) -> map2.forEach((k, v) -> map1.computeIfAbsent(k, key -> new HashSet<>()).addAll(v)));
-
-
- return HoodieListPairData.eager(
- res.entrySet()
- .stream()
- .collect(Collectors.toMap(
- Map.Entry::getKey,
- entry -> Collections.singletonList(entry.getValue())
- ))
- );
- }
-
- private HoodiePairData<String, Set<String>> getSecondaryIndexRecordsV2(HoodieData<String> secondaryKeys, String partitionName) {
+ return getRecordsByKeyPrefixes(keys, partitionName, false, SecondaryIndexKeyUtils::escapeSpecialChars)
+ .mapToPair(hoodieRecord -> SecondaryIndexKeyUtils.getSecondaryKeyRecordKeyPair(hoodieRecord.getRecordKey()));
+ }
+
+ private HoodiePairData<String, String> getSecondaryIndexRecordsV2(HoodieData<String> secondaryKeys, String partitionName) {
Review Comment:
It is called by
org.apache.hudi.metadata.HoodieBackedTableMetadata#getSecondaryIndexRecords,
which checks the secondary index version and routes to either the v1- or
v2-specific API.

> It's called only from the v1 API and could be redundant?

Not true, as the caller handles both v1 and v2.
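
To illustrate the routing described above, here is a minimal, self-contained sketch. It is not the code in this PR: `SecondaryIndexDispatchSketch`, `IndexVersion`, `lookupV1`, `lookupV2`, and the placeholder record keys are all hypothetical stand-ins, and plain Java collections replace `HoodieData`/`HoodiePairData`. It only shows a single entry point inspecting the index version and delegating to a version-specific helper, which is why both helpers stay reachable.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for the dispatch pattern; none of these names are Hudi APIs.
final class SecondaryIndexDispatchSketch {

  // Assumed two-valued version marker for the secondary index layout.
  enum IndexVersion { V1, V2 }

  // Single entry point: checks the index version and routes to the matching helper.
  static List<Map.Entry<String, String>> getSecondaryIndexRecords(
      IndexVersion version, List<String> secondaryKeys) {
    switch (version) {
      case V1:
        return lookupV1(secondaryKeys);
      case V2:
        return lookupV2(secondaryKeys);
      default:
        throw new IllegalArgumentException("Unsupported index version: " + version);
    }
  }

  // Placeholder for the v1-specific lookup; returns flat (secondaryKey, recordKey) pairs.
  private static List<Map.Entry<String, String>> lookupV1(List<String> secondaryKeys) {
    return toPairs(secondaryKeys, "v1-record-");
  }

  // Placeholder for the v2-specific lookup; same flat pair shape as v1.
  private static List<Map.Entry<String, String>> lookupV2(List<String> secondaryKeys) {
    return toPairs(secondaryKeys, "v2-record-");
  }

  // Builds one made-up record key per secondary key so the chosen branch is observable.
  private static List<Map.Entry<String, String>> toPairs(List<String> keys, String prefix) {
    List<Map.Entry<String, String>> pairs = new ArrayList<>();
    for (String key : keys) {
      pairs.add(new SimpleImmutableEntry<>(key, prefix + key));
    }
    return pairs;
  }
}
```

Because the caller owns the version check, neither helper is redundant on its own; each is reached only through the shared entry point.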