Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/13665 )
Change subject: IMPALA-8663 : FileMetadataLoader should skip hidden and tmp directories ...................................................................... Patch Set 8: (6 comments) http://gerrit.cloudera.org:8080/#/c/13665/8/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java: http://gerrit.cloudera.org:8080/#/c/13665/8/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@541 PS8, Line 541: temporary directories > the code now seems to also skip any files that match this pattern, not just The latest patchset changed it so that it only skips temp directories now. I saw code in FileMetadataLoader which keeps stats related to hidden files. Didn't want to mess up those, since temp/hidden files in valid directories seem intentional and long-lived, while temp directories most are transient in most cases. http://gerrit.cloudera.org:8080/#/c/13665/8/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@545 PS8, Line 545: all underlying files (except which are : * in the ignored directories) > does this mean we no longer yield directories, and only yield files? does t Actually, I added this line in the doc since I noticed that the RecursingIterator yields files not directories even without the patch. I thought that was important and hence added it here. These semantics are not changed after the patch. http://gerrit.cloudera.org:8080/#/c/13665/8/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@568 PS8, Line 568: isS3AFileSystem(p) > why'd you change from isS3AFileSystem(fs) to isS3AFileSystem(p)? In the cas this was unintentional change. Didn't realize there are two isS3AFileSystem methods and I used the wrong one. http://gerrit.cloudera.org:8080/#/c/13665/8/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@568 PS8, Line 568: isS3AFileSystem > perhaps we can add a @VisibleForTesting way we can make this path get used Sure. Do you have any suggestions on how can we do this? static methods are hard to mock. Are you thinking of using a test-only flag? http://gerrit.cloudera.org:8080/#/c/13665/8/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@626 PS8, Line 626: LISTING_TYPE > style nit: enums should be named like classes (ListingType) Done http://gerrit.cloudera.org:8080/#/c/13665/8/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@699 PS8, Line 699: // if the current file is on a ignored path return early : if (isIgnoredPath(fileStatus)) return; > how does this prevent recursion into tmp dirs in the recursive listFiles ca yeah, I realized that when one of my own tests failed on the last patch set. The latest patch set addresses this. -- To view, visit http://gerrit.cloudera.org:8080/13665 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2c4a22908304fe9e377d77d6c18d401c3f3294aa Gerrit-Change-Number: 13665 Gerrit-PatchSet: 8 Gerrit-Owner: Vihang Karajgaonkar <vih...@cloudera.com> Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Todd Lipcon <t...@apache.org> Gerrit-Reviewer: Vihang Karajgaonkar <vih...@cloudera.com> Gerrit-Comment-Date: Thu, 27 Jun 2019 07:40:58 +0000 Gerrit-HasComments: Yes