InvisibleProgrammer opened a new pull request, #4740: URL: https://github.com/apache/hive/pull/4740
Compare the highest write ID of the compaction records when collecting the tables/partitions that are candidates for abort cleanup.

Idea: if COMPACTION_QUEUE holds a record for a table/partition whose highest write ID is greater than max(aborted write ID) for that table/partition, abort cleanup can potentially be skipped for that table/partition. Compaction already cleans up obsolete deltas and aborted deltas, so a separate abort cleanup would be redundant. This is mainly an optimisation: it can save some filesystem operations (mainly file listing during construction of the Acid state).

### What changes were proposed in this pull request?
Skip abort cleanup for a table/partition when COMPACTION_QUEUE has a record for it with a write ID higher than its highest aborted write ID.

### Why are the changes needed?
To avoid redundant cleanup work.

### Does this PR introduce _any_ user-facing change?
No

### Is the change a dependency upgrade?
No

### How was this patch tested?
New test added.
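The skip rule described in this pull request can be illustrated with a minimal Java sketch. It is not the code from the patch: the class `AbortCleanupSketch`, the method `needsAbortCleanup`, and the two in-memory maps are hypothetical stand-ins for whatever the metastore query actually returns. The sketch only shows the comparison: a table/partition is dropped from the abort cleanup candidates when its highest write ID in COMPACTION_QUEUE is strictly greater than its max aborted write ID.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from the Hive code base.
class AbortCleanupSketch {

    /**
     * Returns the tables/partitions that still need abort cleanup.
     *
     * @param maxAbortedWriteId       table/partition -> highest aborted write ID (assumed input)
     * @param highestCompactionWriteId table/partition -> highest write ID recorded in
     *                                COMPACTION_QUEUE; no entry means no compaction record
     */
    static List<String> needsAbortCleanup(Map<String, Long> maxAbortedWriteId,
                                          Map<String, Long> highestCompactionWriteId) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> candidate : maxAbortedWriteId.entrySet()) {
            Long compactionWriteId = highestCompactionWriteId.get(candidate.getKey());
            // Compaction covers the aborted deltas only if its write ID is strictly greater.
            boolean coveredByCompaction =
                compactionWriteId != null && compactionWriteId > candidate.getValue();
            if (!coveredByCompaction) {
                result.add(candidate.getKey());
            }
        }
        return result;
    }
}
```

With this sketch, a table whose max aborted write ID is 10 and whose compaction record carries write ID 15 would be skipped, while a table with no compaction record, or with a compaction write ID equal to or lower than its max aborted write ID, would stay in the candidate list.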