Re: [PR] HIVE-28342: Iceberg: Major QB Compaction support filter in compaction… [hive]

via GitHub Wed, 04 Sep 2024 13:10:41 -0700


difin commented on code in PR #5393:
URL: https://github.com/apache/hive/pull/5393#discussion_r1736926789



##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java:
##########
@@ -2095,6 +2097,55 @@ public List<Partition> getPartitionsByExpr(org.apache.hadoop.hive.ql.metadata.Ta
     }
   }
 
+  @Override
+  public List<Partition> getPartitionsWithFilter(org.apache.hadoop.hive.ql.metadata.Table hmsTable,
+      ExprNodeDesc filter, boolean currentSpec) {
+    SearchArgument sarg = ConvertAstToSearchArg.create(conf, (ExprNodeGenericFuncDesc) filter);
+    Expression exp = HiveIcebergFilterFactory.generateFilterExpression(sarg);
+    Table table = IcebergTableUtil.getTable(conf, hmsTable.getTTable());
+    int tableSpecId = table.spec().specId();
+    List<Partition> partitions = Lists.newArrayList();
+
+    TableScan scan = table.newScan().filter(exp).caseSensitive(false).includeColumnStats().ignoreResiduals();
+
+    try (CloseableIterable<FileScanTask> tasks = scan.planFiles()) {
+      tasks.forEach(task -> {
+        DataFile file = task.file();
+        PartitionSpec spec = task.spec();
+        if ((currentSpec && file.specId() == tableSpecId) || (!currentSpec && file.specId() != tableSpecId)) {
+          PartitionData partitionData = IcebergTableUtil.toPartitionData(task.partition(), spec.partitionType());
+          String partName = spec.partitionToPath(partitionData);
+          Map<String, String> partSpecMap = Maps.newLinkedHashMap();
+          Warehouse.makeSpecFromName(partSpecMap, new Path(partName), null);
+          DummyPartition partition = new DummyPartition(hmsTable, partName, partSpecMap);
+          if (!partitions.contains(partition)) {
+            partitions.add(partition);
+          }
+        }
+      });
+    } catch (IOException ioe) {
+      LOG.warn("Failed to close task iterable", ioe);
+    }
+    partitions.sort(Comparator.comparing(Partition::getName));

Review Comment:
   In the Iceberg compaction q-tests I use the `show compactions` command to display the status of compactions.
   When a partitioned table is compacted, the output looks like the example below.
   If the partitions are not sorted, their order in the output is nondeterministic, and the q-tests cannot be verified with the `show compactions` command.
   
   ```
   CompactionId Database        Table   Partition       Type    State   Worker host     Worker  Enqueue Time    Start Time      Duration(ms)    HadoopJobId     Error message   Initiator host  Initiator       Pool name       TxnId   Next TxnId      Commit Time     Highest WriteId
   #Masked#     default ice_orc event_src_trunc=AAA/event_time_month=2024-08    MAJOR   succeeded       #Masked#        manual  default 0       0       0        ---
   #Masked#     default ice_orc event_src_trunc=AAA/event_time_month=2024-09    MAJOR   succeeded       #Masked#        manual  default 0       0       0        ---
   #Masked#     default ice_orc event_src_trunc=BBB/event_time_month=2024-07    MAJOR   succeeded       #Masked#        manual  default 0       0       0        ---
   #Masked#     default ice_orc event_src_trunc=BBB/event_time_month=2024-08    MAJOR   succeeded       #Masked#        manual  default 0       0       0        ---
   #Masked#     default ice_orc  ---    MAJOR   succeeded       #Masked#        manual  default 0       0       0        ---
   ```
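
To illustrate the point about ordering, here is a minimal, self-contained sketch. It is not the code from the PR: plain partition-name strings stand in for Hive `Partition` objects, and the class `SortedPartitionNames` and method `distinctSorted` are hypothetical names chosen for the example. It only shows that de-duplicating and then sorting by name produces the same list regardless of the order in which the file scan returns tasks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class SortedPartitionNames {

  // Hypothetical helper (not part of Hive or Iceberg): removes duplicate
  // partition names and sorts the remainder lexicographically, so the
  // result does not depend on the order of the input.
  static List<String> distinctSorted(List<String> namesInScanOrder) {
    Set<String> distinct = new LinkedHashSet<>(namesInScanOrder);
    List<String> result = new ArrayList<>(distinct);
    result.sort(Comparator.naturalOrder());
    return result;
  }

  public static void main(String[] args) {
    // Partition names as they might come back from a scan, in arbitrary
    // order and with one duplicate (several files in the same partition).
    List<String> scanA = Arrays.asList(
        "event_src_trunc=BBB/event_time_month=2024-08",
        "event_src_trunc=AAA/event_time_month=2024-09",
        "event_src_trunc=AAA/event_time_month=2024-08",
        "event_src_trunc=BBB/event_time_month=2024-07",
        "event_src_trunc=AAA/event_time_month=2024-09");

    // The same names in a different (reversed) order.
    List<String> scanB = new ArrayList<>(scanA);
    Collections.reverse(scanB);

    // Both calls yield the same ordered list of four distinct names.
    System.out.println(distinctSorted(scanA));
    System.out.println(distinctSorted(scanA).equals(distinctSorted(scanB)));
  }
}
```

With the sort in place, the per-partition rows can be expected in the same name order on every run, which is what a q-test with a fixed expected output needs.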



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


