pvary commented on code in PR #3053:
URL: https://github.com/apache/hive/pull/3053#discussion_r848648030
##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreChecker.java:
##########
@@ -422,21 +415,46 @@ void findUnknownPartitions(Table table, Set<Path> partPaths, byte[] filterExp,
}
allPartDirs = partDirs;
}
- // don't want the table dir
- allPartDirs.remove(tablePath);
-
- // remove the partition paths we know about
- allPartDirs.removeAll(partPaths);
-
Set<String> partColNames = Sets.newHashSet();
for(FieldSchema fSchema : getPartCols(table)) {
partColNames.add(fSchema.getName());
}
Map<String, String> partitionColToTypeMap =
getPartitionColtoTypeMap(table.getPartitionKeys());
+
+ FileSystem fs = tablePath.getFileSystem(conf);
+ Set<Path> correctPartPathsInMS = new HashSet<>(partPathsInMS);
Review Comment:
At this point we hold four more-or-less similar copies of the file listing
in memory:
1. `partPaths` - Path objects from the HMS, plus every parent directory of
the partitions
2. `partPathsInMS` - Path objects from the HMS
3. `correctPartPathsInMS` - this will be the final result, but at this point
it is just a duplicate of `partPathsInMS`
4. `allPartDirs` - recursive listing of the table root dir(?)
Do we need all of these? Would it be better to store only the difference
between the current `partPaths` and `partPathsInMS` in a list, instead of
storing the full list again?
Could we build up `correctPartPathsInMS` while iterating through
`partPathsInMS`? Would that be comparable in time complexity and better in
space complexity?
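A minimal sketch of the two ideas raised above, for illustration only. It assumes plain `String`s stand in for Hadoop `Path` objects, and that `isCorrect` is a hypothetical placeholder for whatever per-path check the PR performs. The class and both helper names are made up here and are not part of `HiveMetaStoreChecker`.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

// Illustration only: String stands in for org.apache.hadoop.fs.Path, and none
// of these helpers exist in HiveMetaStoreChecker.
class PartPathSketch {

    // Build the result set while iterating, instead of copying partPathsInMS
    // up front and removing entries later. isCorrect is a hypothetical check.
    static Set<String> buildCorrectPaths(Set<String> partPathsInMS,
                                         Predicate<String> isCorrect) {
        Set<String> correctPartPathsInMS = new HashSet<>();
        for (String path : partPathsInMS) {
            if (isCorrect.test(path)) {
                correctPartPathsInMS.add(path);
            }
        }
        return correctPartPathsInMS;
    }

    // Keep only the entries of partPaths that are not already in
    // partPathsInMS (the parent directories), rather than a second full copy.
    static Set<String> parentsOnly(Set<String> partPaths,
                                   Set<String> partPathsInMS) {
        Set<String> diff = new HashSet<>();
        for (String path : partPaths) {
            if (!partPathsInMS.contains(path)) {
                diff.add(path);
            }
        }
        return diff;
    }

    public static void main(String[] args) {
        Set<String> inMS = Set.of("/t/a=1/b=1", "/t/a=1/b=2");
        Set<String> withParents = Set.of("/t/a=1", "/t/a=1/b=1", "/t/a=1/b=2");
        System.out.println(buildCorrectPaths(inMS, p -> p.endsWith("b=1")));
        System.out.println(parentsOnly(withParents, inMS));
    }
}
```

Both helpers make a single pass with constant-time hash lookups, so the time cost should be comparable to copy-then-remove. The memory benefit of building the set incrementally depends on how many paths fail the check: if nearly all pass, the result is about as large as the copy would have been.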
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]