SourabhBadhya commented on code in PR #4748:
URL: https://github.com/apache/hive/pull/4748#discussion_r1350422534
##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java:
##########
@@ -1831,4 +1833,47 @@ public ColumnInfo getColumnInfo(org.apache.hadoop.hive.ql.metadata.Table hmsTabl
       throw new SemanticException(String.format("Unable to find a column with the name: %s", colName));
}
}
+
+ @Override
+ public boolean supportsMetadataDelete() {
+ return true;
+ }
+
+ @Override
+  public boolean canPerformMetadataDelete(org.apache.hadoop.hive.ql.metadata.Table hmsTable, SearchArgument sarg) {
+ if (!supportsMetadataDelete()) {
+ return false;
+ }
+
+ Expression exp;
+ try {
+ exp = HiveIcebergFilterFactory.generateFilterExpression(sarg);
+ } catch (UnsupportedOperationException e) {
+ LOG.warn("Unable to create Iceberg filter," +
+ " continuing without metadata delete: ", e);
+ return false;
+ }
+ Table table = IcebergTableUtil.getTable(conf, hmsTable.getTTable());
+    FindFiles.Builder builder = new FindFiles.Builder(table).withRecordsMatching(exp).includeColumnStats();
Review Comment:
Right now, I have changed the function to use the same utils that are used in
Spark. It's best to implement what already works well in an engine like Spark.
##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java:
##########
@@ -1831,4 +1833,47 @@ public ColumnInfo getColumnInfo(org.apache.hadoop.hive.ql.metadata.Table hmsTabl
       throw new SemanticException(String.format("Unable to find a column with the name: %s", colName));
}
}
+
+ @Override
+ public boolean supportsMetadataDelete() {
+ return true;
+ }
+
+ @Override
+  public boolean canPerformMetadataDelete(org.apache.hadoop.hive.ql.metadata.Table hmsTable, SearchArgument sarg) {
+ if (!supportsMetadataDelete()) {
+ return false;
+ }
+
+ Expression exp;
+ try {
+ exp = HiveIcebergFilterFactory.generateFilterExpression(sarg);
+ } catch (UnsupportedOperationException e) {
+ LOG.warn("Unable to create Iceberg filter," +
+ " continuing without metadata delete: ", e);
+ return false;
+ }
+ Table table = IcebergTableUtil.getTable(conf, hmsTable.getTTable());
+    FindFiles.Builder builder = new FindFiles.Builder(table).withRecordsMatching(exp).includeColumnStats();
+ Set<DataFile> dataFiles = Sets.newHashSet(builder.collect());
+ boolean result = true;
+ for (DataFile dataFile : dataFiles) {
+ PartitionData partitionData = (PartitionData) dataFile.partition();
+ Expression residual = ResidualEvaluator.of(table.spec(), exp, false)
+ .residualFor(partitionData);
+      StrictMetricsEvaluator strictMetricsEvaluator = new StrictMetricsEvaluator(table.schema(), residual);
+ if (!strictMetricsEvaluator.eval(dataFile)) {
+ result = false;
Review Comment:
Right now, I have changed the function to use the same utils that are used in
Spark. It's best to implement what already works well in an engine like Spark.
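To illustrate the decision the patch makes: a metadata delete is only safe when the strict evaluation proves that every row in every matching data file satisfies the delete predicate; if even one file matches only partially, the engine must fall back to a row-level delete. Below is a minimal, self-contained Java sketch of that "all files must match strictly" check. It does not use the Iceberg APIs from the diff (`StrictMetricsEvaluator`, `ResidualEvaluator`); `FileStats`, `strictlyMatchesIdLessThan`, and `canPerformMetadataDelete` are hypothetical stand-ins using only per-file min/max bounds for a single predicate `id < value`.

```java
import java.util.List;

public class MetadataDeleteSketch {
  // Hypothetical stand-in for a data file's per-column lower/upper bounds
  // (the kind of column statistics a StrictMetricsEvaluator would consult).
  record FileStats(long minId, long maxId) {}

  // Strict evaluation for "id < value": true only when EVERY row in the
  // file is guaranteed to match, i.e. the file can be dropped wholesale.
  static boolean strictlyMatchesIdLessThan(FileStats stats, long value) {
    return stats.maxId() < value;
  }

  // Metadata delete is safe only if all candidate files match strictly;
  // a single partial match forces a row-level delete instead.
  static boolean canPerformMetadataDelete(List<FileStats> candidates, long value) {
    for (FileStats f : candidates) {
      if (!strictlyMatchesIdLessThan(f, value)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<FileStats> files = List.of(new FileStats(1, 50), new FileStats(51, 100));
    System.out.println(canPerformMetadataDelete(files, 200)); // true: both files fully covered
    System.out.println(canPerformMetadataDelete(files, 60));  // false: second file only partially covered
  }
}
```

This mirrors the loop over `dataFiles` in the diff: the first file whose strict evaluation fails flips the result to false, since dropping that file from table metadata would also delete rows the predicate does not select.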
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]