n3nash commented on a change in pull request #2496:
URL: https://github.com/apache/hudi/pull/2496#discussion_r565102473



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java
##########
@@ -415,17 +420,18 @@ public static boolean isLogFile(Path logPath) {
     return matcher.find() && logPath.getName().contains(".log");
   }
 
+  public static boolean isDataFile(Path path) {
+    String extension = FSUtils.getFileExtension(path.getName());
+    return DATA_FILE_EXTENSIONS.contains(extension);
+  }
+
   /**
    * Get the names of all the base and log files in the given partition path.
    */
   public static FileStatus[] getAllDataFilesInPartition(FileSystem fs, Path partitionPath) throws IOException {
-    final Set<String> validFileExtensions = Arrays.stream(HoodieFileFormat.values())
-        .map(HoodieFileFormat::getFileExtension).collect(Collectors.toCollection(HashSet::new));
-    final String logFileExtension = HoodieFileFormat.HOODIE_LOG.getFileExtension();
-
     return Arrays.stream(fs.listStatus(partitionPath, path -> {
       String extension = FSUtils.getFileExtension(path.getName());
-      return validFileExtensions.contains(extension) || path.getName().contains(logFileExtension);
+      return DATA_FILE_EXTENSIONS.contains(extension);

Review comment:
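       Isn't this changing the behavior? Previously, both base files and log
files were accepted: the filter matched either a known file extension or any
name containing the log file extension. With your change, only files whose
extension is in DATA_FILE_EXTENSIONS are accepted. What is the reason for this?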
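
To make the concern concrete, here is a minimal standalone sketch of the two predicates. It rests on several assumptions: the contents of DATA_FILE_EXTENSIONS are not visible in this hunk, so the set below is a guess; getFileExtension is assumed to return everything from the last dot onward; and the file names are hypothetical examples shaped like a base file and a versioned log file. None of this is the actual Hudi code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class FilterSketch {
  // Hypothetical contents; the real DATA_FILE_EXTENSIONS is defined elsewhere in the PR.
  static final Set<String> DATA_FILE_EXTENSIONS =
      new HashSet<>(Arrays.asList(".parquet", ".log"));
  static final String LOG_FILE_EXTENSION = ".log";

  // Assumed behavior of FSUtils.getFileExtension: substring from the last dot.
  static String getFileExtension(String name) {
    int dot = name.lastIndexOf('.');
    return dot == -1 ? "" : name.substring(dot);
  }

  // Old predicate: known extension OR name contains the log extension.
  static boolean oldFilter(String name) {
    return DATA_FILE_EXTENSIONS.contains(getFileExtension(name))
        || name.contains(LOG_FILE_EXTENSION);
  }

  // New predicate: known extension only.
  static boolean newFilter(String name) {
    return DATA_FILE_EXTENSIONS.contains(getFileExtension(name));
  }

  public static void main(String[] args) {
    String baseFile = "fileid_1-0-1_20210101.parquet"; // hypothetical name
    String logFile = ".fileid_20210101.log.1_1-0-1";   // hypothetical name
    System.out.println(oldFilter(baseFile) + " " + newFilter(baseFile)); // true true
    System.out.println(oldFilter(logFile) + " " + newFilter(logFile));   // true false
  }
}
```

Under these assumptions, a log file name with a version suffix after ".log" passes the old predicate but fails the new one, because its last-dot extension is the version suffix rather than ".log".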




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org