InvisibleProgrammer opened a new pull request, #4740:
URL: https://github.com/apache/hive/pull/4740

   Compare the highest write ID of compaction records when selecting the 
candidate tables/partitions for abort cleanup.
   
   Idea: If COMPACTION_QUEUE contains a record for a table/partition whose 
highest write ID is greater than the max(aborted write ID) for that 
table/partition, then we can potentially skip abort cleanup for that 
table/partition. Compaction will clean up both obsolete deltas and aborted 
deltas, so a separate abort cleanup is redundant here.
   
   This is more of an optimisation since it can potentially save some 
filesystem operations (mainly file-listing during construction of Acid state).
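
   The comparison described above can be sketched roughly as follows. This is 
only an in-memory illustration, not the code in this PR: the class name, the 
method name, and the string keys identifying a table/partition are all 
hypothetical, and the real check presumably runs against the metastore tables 
rather than Java maps.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Hive.
class AbortCleanupFilter {

    /**
     * Returns the tables/partitions that still need abort cleanup.
     *
     * @param maxAbortedWriteId        max(aborted write ID) per table/partition
     * @param highestCompactionWriteId highest write ID recorded in the
     *                                 compaction queue per table/partition
     */
    static List<String> needsAbortCleanup(Map<String, Long> maxAbortedWriteId,
                                          Map<String, Long> highestCompactionWriteId) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> entry : maxAbortedWriteId.entrySet()) {
            Long compactionWriteId = highestCompactionWriteId.get(entry.getKey());
            // Skip only when a compaction record exists and its highest write ID
            // is strictly greater than the max aborted write ID.
            boolean coveredByCompaction =
                compactionWriteId != null && compactionWriteId > entry.getValue();
            if (!coveredByCompaction) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

   In this sketch a table/partition with no compaction record, or with a 
compaction record whose highest write ID is equal to or lower than its max 
aborted write ID, is still returned as a candidate for abort cleanup.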
   
   ### What changes were proposed in this pull request?
   Skip abort cleanup for tables/partitions that have a COMPACTION_QUEUE 
record whose highest write ID is greater than their max aborted write ID.
   
   ### Why are the changes needed?
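   Reduce redundancy: compaction already cleans up aborted deltas, so a 
separate abort cleanup for the same table/partition repeats filesystem work.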
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### Is the change a dependency upgrade?
   No
   
   ### How was this patch tested?
   New test added.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

