difin commented on code in PR #5389:
URL: https://github.com/apache/hive/pull/5389#discussion_r1712199632


##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/IcebergTableUtil.java:
##########
@@ -396,45 +396,49 @@ public static PartitionData toPartitionData(StructLike sourceKey, Types.StructTy
 
   /**
    * Returns list of data files filtered by specId and partitionPath as following:
-   *  1. If matchBySpecId is true, then filters files by specId == file's specId, else by specId != file's specId
-   *  2. If partitionPath is not null, then also filters files where partitionPath == file's partition path
+   *  1. If table is unpartitioned, returns all data files without filtering.
+   *  2. If matchBySpecId is true, then filters files by specId == file's specId, else by specId != file's specId
+   *  3. If partitionPath is not null, then also filters files where partitionPath == file's partition path
    * @param table the iceberg table
    * @param specId partition spec id
    * @param partitionPath partition path
-   * @param matchBySpecId filter that's applied on data files' spec ids
    */
-  public static List<DataFile> getDataFiles(Table table, int specId, String partitionPath,
-      Predicate<Object> matchBySpecId) {
+  public static List<DataFile> getDataFiles(Table table, Integer specId, String partitionPath) {
+    PartitionSpec spec = table.spec().isPartitioned() ? table.specs().get(specId) : table.spec();
+    Predicate<Object> matchByEquality = Predicate.isEqual(specId);
+    Predicate<Object> matchBySpecId = partitionPath != null ? matchByEquality : matchByEquality.negate();
     CloseableIterable<FileScanTask> fileScanTasks =
         table.newScan().useSnapshot(table.currentSnapshot().snapshotId()).ignoreResiduals().planFiles();
     CloseableIterable<FileScanTask> filteredFileScanTasks =
         CloseableIterable.filter(fileScanTasks, t -> {
           DataFile file = t.asFileScanTask().file();
-          return matchBySpecId.test(file.specId()) && (partitionPath == null || (partitionPath != null &&
-                  table.specs().get(specId).partitionToPath(file.partition()).equals(partitionPath)));
+          return !spec.isPartitioned() || matchBySpecId.test(file.specId()) &&

Review Comment:
   I refactored this function by removing the specId parameter, because the logic can be implemented without it.
   I updated the javadoc as follows:
   
   ```
      *  1. If the table is unpartitioned, returns all data files.
      *  2. If partitionPath is not provided, returns all data files that belong to the non-latest partition spec.
      *  3. If partitionPath is provided, returns all data files that belong to the corresponding partition.
   ```
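
For readers following the thread, the three javadoc cases above can be sketched in isolation as below. This is only an illustration, not the code in the PR: `FileInfo`, `DataFileFilterSketch`, the `partitioned` flag and `latestSpecId` are hypothetical stand-ins for Iceberg's `DataFile`, `table.spec().isPartitioned()` and the table's current spec id, and case 3 is assumed to be a plain partition-path comparison.

```java
import java.util.ArrayList;
import java.util.List;

class DataFileFilterSketch {

  // Hypothetical stand-in for an Iceberg DataFile; only the fields the filter needs.
  static final class FileInfo {
    final String name;
    final int specId;
    final String partitionPath;

    FileInfo(String name, int specId, String partitionPath) {
      this.name = name;
      this.specId = specId;
      this.partitionPath = partitionPath;
    }
  }

  // Applies the three javadoc cases to an in-memory list of files.
  // latestSpecId is an assumed stand-in for the table's current partition spec id.
  static List<FileInfo> getDataFiles(List<FileInfo> files, boolean partitioned,
      int latestSpecId, String partitionPath) {
    List<FileInfo> result = new ArrayList<>();
    for (FileInfo file : files) {
      boolean keep;
      if (!partitioned) {
        // Case 1: unpartitioned table, keep every file.
        keep = true;
      } else if (partitionPath == null) {
        // Case 2: no partition path, keep files written with a non-latest spec.
        keep = file.specId != latestSpecId;
      } else {
        // Case 3 (assumption): keep files whose partition path matches.
        keep = partitionPath.equals(file.partitionPath);
      }
      if (keep) {
        result.add(file);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<FileInfo> files = new ArrayList<>();
    files.add(new FileInfo("a.parquet", 0, "dept=1"));
    files.add(new FileInfo("b.parquet", 1, "dept=1/team=x"));
    files.add(new FileInfo("c.parquet", 1, "dept=2/team=y"));

    // Print the names selected when no partition path is given and spec 1 is the latest.
    for (FileInfo file : getDataFiles(files, true, 1, null)) {
      System.out.println(file.name);
    }
  }
}
```

The sketch keeps the branches mutually exclusive, which avoids relying on the precedence of `||` and `&&` in a single boolean expression.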


