difin commented on code in PR #5328:
URL: https://github.com/apache/hive/pull/5328#discussion_r1688390400
##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/compaction/IcebergMajorQueryCompactor.java:
##########
@@ -63,20 +63,33 @@ public boolean run(CompactorContext context) throws IOException, HiveException,
HiveConf conf = new HiveConf(context.getConf());
String partSpec = context.getCompactionInfo().partName;
+ org.apache.hadoop.hive.ql.metadata.Table table = Hive.get(conf).getTable(context.getTable().getDbName(),
+ context.getTable().getTableName());
+ Table icebergTable = IcebergTableUtil.getTable(conf, table.getTTable());
String compactionQuery;
if (partSpec == null) {
- HiveConf.setVar(conf, ConfVars.REWRITE_POLICY, RewritePolicy.ALL_PARTITIONS.name());
- compactionQuery = String.format("insert overwrite table %s select * from %<s", compactTableName);
+ if (!IcebergTableUtil.isPartitioned(icebergTable)) {
+ HiveConf.setVar(conf, ConfVars.REWRITE_POLICY, RewritePolicy.ALL_PARTITIONS.name());
Review Comment:
Starting with this PR, RewritePolicy.ALL_PARTITIONS is no longer used for
partitioned tables: a partitioned table always gets one compaction request per
partition, and each request uses RewritePolicy.PARTITION.
RewritePolicy.ALL_PARTITIONS is now used only when the table is unpartitioned,
in which case it compacts the entire table.
Should we rename it to something more accurate, such as
RewritePolicy.UNPARTITIONED or RewritePolicy.FULL_TABLE?
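To make the suggestion concrete, here is a rough sketch of how the policy
selection might read after a rename. Everything in it is hypothetical: the
class RewritePolicySketch, the helper choosePolicy, and the FULL_TABLE constant
are illustration only and are not code from this PR. The real enum may have
other values, and the PR takes the partitioned flag from
IcebergTableUtil.isPartitioned(icebergTable) rather than from a boolean
parameter.

```java
public class RewritePolicySketch {

  // Hypothetical enum showing the proposed rename: FULL_TABLE stands in for
  // today's ALL_PARTITIONS. The real RewritePolicy enum may hold other values.
  public enum RewritePolicy {
    FULL_TABLE,
    PARTITION
  }

  /**
   * Picks a rewrite policy following the behavior described in the comment.
   *
   * @param partitioned whether the table is partitioned (assumed input)
   * @param partSpec    partition name from the compaction request, or null
   */
  public static RewritePolicy choosePolicy(boolean partitioned, String partSpec) {
    if (!partitioned) {
      // Unpartitioned table: a single request compacts the entire table.
      return RewritePolicy.FULL_TABLE;
    }
    if (partSpec != null) {
      // Partitioned table: one request per partition.
      return RewritePolicy.PARTITION;
    }
    // Assumption: the comment does not cover a partitioned table without a
    // partition name, so this sketch rejects it instead of guessing.
    throw new IllegalArgumentException(
        "partitioned table without a partition name is not covered by this sketch");
  }

  public static void main(String[] args) {
    System.out.println(choosePolicy(false, null));
    System.out.println(choosePolicy(true, "dept=sales"));
  }
}
```

With a name like FULL_TABLE, the unpartitioned branch says what it does, and
nothing suggests that a partitioned table is ever compacted through it.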
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]