[ https://issues.apache.org/jira/browse/HIVE-21225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16881614#comment-16881614 ]

Vaibhav Gumashta commented on HIVE-21225:
-----------------------------------------

[~vgarg] Thanks for the review; let me update the patch with your feedback. On the 2 issues you raised, here is what I am thinking:

1. While building directory snapshots, check for isValidBase, isCompactedBase and isRawFormat, and cache the results within each snapshot for later use.
2. For the union case, let me look at what the rest of the code is doing, but in any case, we will need to map union_mm/delta_0000001_0000001_0002/HIVE_UNION_SUBDIR_2 to delta_0000001_0000001_0002. I am guessing that other places may also be parsing the delta_0000001_0000001_0002 portion of it, but I can verify and use the same approach.

> ACID: getAcidState() should cache a recursive dir listing locally
> -----------------------------------------------------------------
>
>                 Key: HIVE-21225
>                 URL: https://issues.apache.org/jira/browse/HIVE-21225
>             Project: Hive
>          Issue Type: Improvement
>          Components: Transactions
>            Reporter: Gopal V
>            Assignee: Vaibhav Gumashta
>            Priority: Major
>         Attachments: HIVE-21225.1.patch, HIVE-21225.2.patch, 
> HIVE-21225.3.patch, HIVE-21225.4.patch, HIVE-21225.4.patch, 
> HIVE-21225.5.patch, HIVE-21225.6.patch, HIVE-21225.7.patch, 
> HIVE-21225.7.patch, HIVE-21225.8.patch, HIVE-21225.9.patch, async-pid-44-2.svg
>
>
> Currently getAcidState() makes 3 calls into the FS api which could be 
> answered by making a single recursive listDir call and reusing the same data 
> to check for isRawFormat() and isValidBase().
> All delta operations for a single partition can go against a single listed 
> directory snapshot instead of interacting with the NameNode or ObjectStore 
> within the inner loop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
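
For illustration only, here is a minimal sketch of the mapping described in point 2 of the comment: given a path such as union_mm/delta_0000001_0000001_0002/HIVE_UNION_SUBDIR_2, pick out the ACID directory component. The class AcidDirNames and the method acidDirOf are hypothetical names, not part of Hive's AcidUtils, and the prefix check (delta_, delete_delta_, base_) is an assumption about which directory names matter; the actual patch may parse paths differently.

```java
import java.util.Optional;

// Hypothetical helper for illustration; not Hive code.
final class AcidDirNames {
    private AcidDirNames() {}

    /**
     * Returns the first path component that looks like an ACID base or
     * delta directory, or empty if the path contains none.
     * Assumption: '/' separates components and the prefixes below are
     * sufficient to recognise such a directory.
     */
    static Optional<String> acidDirOf(String path) {
        for (String part : path.split("/")) {
            if (part.startsWith("delta_")
                    || part.startsWith("delete_delta_")
                    || part.startsWith("base_")) {
                return Optional.of(part);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Path taken from the comment: a union subdirectory nested under a delta.
        String p = "union_mm/delta_0000001_0000001_0002/HIVE_UNION_SUBDIR_2";
        System.out.println(acidDirOf(p).orElse("<none>")); // prints delta_0000001_0000001_0002
    }
}
```

Walking the components from the root, rather than taking the parent of the leaf, means the union subdirectory level is skipped without special-casing its name.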