veghlaci05 commented on code in PR #3775:
URL: https://github.com/apache/hive/pull/3775#discussion_r1063433513
##########
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java:
##########
@@ -504,6 +504,47 @@ private CompactionType determineCompactionType(CompactionInfo ci, AcidDirectory
if (initiateMajor) return CompactionType.MAJOR;
}
+ // bucket size calculation can be resource intensive if there are numerous deltas, so we check for rebalance
+ // compaction only if the table is in an acceptable shape: no major compaction required. This means the number of
+ // files shouldn't be too high
+ if ("tez".equalsIgnoreCase(HiveConf.getVar(conf, HiveConf.ConfVars.HIVE_EXECUTION_ENGINE)) &&
Review Comment:
Yes, running a rebalance compaction on uncompacted tables could be resource
intensive due to the high number of files and folders, so I decided to schedule
rebalance compactions only on tables that are already major compacted. This
ensures that the number of deltas is relatively low.
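
To illustrate the ordering described in this comment, here is a minimal
sketch, not the code from the PR. The class `RebalanceGateSketch`, the
`Decision` enum, and the `bucketsUnbalanced` supplier are hypothetical
stand-ins; only the "tez" engine check and the "major compaction first" rule
come from the diff above. The point is that the expensive bucket size check is
evaluated lazily, and only after the cheap checks pass.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch of the gating order; not Hive's Initiator code.
final class RebalanceGateSketch {

    // Stand-in for the outcome of the decision; not Hive's CompactionType.
    enum Decision { MAJOR, REBALANCE, NONE }

    // bucketsUnbalanced represents the costly bucket size calculation.
    // It is a supplier so that it runs only when the cheap checks allow it.
    static Decision decide(String executionEngine,
                           boolean majorRequired,
                           BooleanSupplier bucketsUnbalanced) {
        // A table that still needs major compaction is never considered
        // for rebalance, so the file count stays low when we do check.
        if (majorRequired) {
            return Decision.MAJOR;
        }
        // && short-circuits: the supplier is not called for other engines.
        if ("tez".equalsIgnoreCase(executionEngine)
                && bucketsUnbalanced.getAsBoolean()) {
            return Decision.REBALANCE;
        }
        return Decision.NONE;
    }

    public static void main(String[] args) {
        System.out.println(decide("tez", true, () -> true));
        System.out.println(decide("tez", false, () -> true));
        System.out.println(decide("mr", false, () -> true));
    }
}
```

Passing the costly check as a supplier is one way to make the "only on tables
in an acceptable shape" rule explicit; the actual PR may structure this
differently.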
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]