teeyog commented on a change in pull request #2475:
URL: https://github.com/apache/hudi/pull/2475#discussion_r569889546
##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/java/org/apache/hudi/DataSourceUtils.java
##########
@@ -84,6 +86,39 @@ public static String getTablePath(FileSystem fs, Path[] userProvidedPaths) throw
     throw new TableNotFoundException("Unable to find a hudi table for the user provided paths.");
   }
+  public static Option<String> getOnePartitionPath(FileSystem fs, Path tablePath) throws IOException {
+    // When the table is not partitioned
+    if (HoodiePartitionMetadata.hasPartitionMetadata(fs, tablePath)) {
+      return Option.of(tablePath.toString());
+    }
+    FileStatus[] statuses = fs.listStatus(tablePath);
+    for (FileStatus status : statuses) {
+      if (status.isDirectory()) {
+        if (HoodiePartitionMetadata.hasPartitionMetadata(fs, status.getPath())) {
+          return Option.of(status.getPath().toString());
+        } else {
+          Option<String> partitionPath = getOnePartitionPath(fs, status.getPath());
+          if (partitionPath.isPresent()) {
+            return partitionPath;

Review comment:
Thank you for your review. This way of obtaining a partition is very fast: it returns as soon as it finds a single partition path. FSUtils.getAllPartitionPaths, by contrast, obtains all partition paths, which is very time-consuming.
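To illustrate the difference between the two approaches, here is a minimal sketch, not the PR's code. It assumes plain `java.nio.file` in place of Hadoop's `FileSystem`, and a marker-file check in place of `HoodiePartitionMetadata.hasPartitionMetadata`. The class name, method names, and marker file name are hypothetical choices for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.stream.Stream;

// Illustrative sketch only: local directories stand in for the table's file system.
final class PartitionSearchSketch {

  // Assumed name of the marker file that identifies a partition directory.
  static final String MARKER = ".hoodie_partition_metadata";

  static boolean hasMarker(Path dir) {
    return Files.isRegularFile(dir.resolve(MARKER));
  }

  // Early-exit search: stops descending as soon as one partition directory is found.
  static Optional<Path> findOnePartition(Path dir) throws IOException {
    if (hasMarker(dir)) {
      return Optional.of(dir);
    }
    try (Stream<Path> children = Files.list(dir)) {
      for (Path child : (Iterable<Path>) children::iterator) {
        if (Files.isDirectory(child)) {
          Optional<Path> found = findOnePartition(child);
          if (found.isPresent()) {
            return found;
          }
        }
      }
    }
    return Optional.empty();
  }

  // Exhaustive search: visits every directory and collects all partition directories.
  static List<Path> findAllPartitions(Path dir) throws IOException {
    List<Path> result = new ArrayList<>();
    collect(dir, result);
    return result;
  }

  private static void collect(Path dir, List<Path> out) throws IOException {
    if (hasMarker(dir)) {
      out.add(dir);
      return; // assumption for this sketch: partitions are not nested
    }
    try (Stream<Path> children = Files.list(dir)) {
      for (Path child : (Iterable<Path>) children::iterator) {
        if (Files.isDirectory(child)) {
          collect(child, out);
        }
      }
    }
  }

  public static void main(String[] args) throws IOException {
    // Build a small throwaway table layout with three partitions.
    Path table = Files.createTempDirectory("table");
    for (String p : new String[] {"2021/01/01", "2021/01/02", "2021/02/01"}) {
      Path dir = Files.createDirectories(table.resolve(p));
      Files.createFile(dir.resolve(MARKER));
    }
    System.out.println("found one: " + findOnePartition(table).isPresent());
    System.out.println("found all: " + findAllPartitions(table).size());
  }
}
```

In this sketch the early-exit version lists only the directories on the path to the first partition it reaches, while the exhaustive version must list every directory under the table path, so its cost grows with the number of partitions.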