okumin commented on code in PR #5409:
URL: https://github.com/apache/hive/pull/5409#discussion_r1818953918


##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java:
##########
@@ -752,6 +753,39 @@ private void addCustomSortExpr(Table table, org.apache.hadoop.hive.ql.metadata.
     ).collect(Collectors.toList()));
   }
 
+  @Override
+  public boolean supportsPartitionAwareOptimization(org.apache.hadoop.hive.ql.metadata.Table table) {
+    if (hasUndergonePartitionEvolution(table)) {
+      // Don't support complex cases yet
+      return false;
+    }
+    final List<TransformSpec> specs = getPartitionTransformSpec(table);
+    // Currently, we support the only bucket transform
+    return specs.stream().anyMatch(HiveIcebergStorageHandler::isBucket);
+  }
+
+  @Override
+  public PartitionAwareOptimizationCtx createPartitionAwareOptimizationContext(
+      org.apache.hadoop.hive.ql.metadata.Table table) {
+    // Currently, we support the only bucket transform
+    final List<String> bucketColumnNames = Lists.newArrayList();
+    final List<Integer> numBuckets = Lists.newArrayList();
+    getPartitionTransformSpec(table).stream().filter(HiveIcebergStorageHandler::isBucket).forEach(spec -> {
+      bucketColumnNames.add(spec.getColumnName());
+      numBuckets.add(spec.getTransformParam().get());
+    });
+
+    if (bucketColumnNames.isEmpty()) {
+      return null;
+    }
+    final IcebergBucketFunction bucketFunction = new IcebergBucketFunction(bucketColumnNames, numBuckets);
+    return new PartitionAwareOptimizationCtx(bucketFunction);
+  }
+
+  private static boolean isBucket(TransformSpec spec) {

Review Comment:
   I initially thought we could reuse TransformSpec when implementing this for other storage formats, Hudi for example. To be clear, I don't even know which partition specs Hudi supports, and I don't have a strong preference. I will move the logic.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.