Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/12991 )
Change subject: Initial support for recursive file listing within a partition ...................................................................... Patch Set 1: (8 comments) http://gerrit.cloudera.org:8080/#/c/12991/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12991/1//COMMIT_MSG@7 PS1, Line 7: Initial support for recursive file listing within a partition > nit: Create a jira, since the change is substantial? Done http://gerrit.cloudera.org:8080/#/c/12991/1/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/12991/1/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@154 PS1, Line 154: FlatBufferBuilder fbb = new FlatBufferBuilder(1); > Why remove the Precondition? it was sort of ugly to pass the FileSystem in here, since the FileSystem was only used for the purpose of this precondition. I think it's sort of redundant anyway since it would be tough to call this method (which takes BlockLocations) from a filesystem which doesn't yield BlockLocations http://gerrit.cloudera.org:8080/#/c/12991/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java: http://gerrit.cloudera.org:8080/#/c/12991/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@521 PS1, Line 521: if (isS3AFileSystem(fs)) { > Do you happen to know if Azure has a similar optimization builtin? It doesn't (I checked) http://gerrit.cloudera.org:8080/#/c/12991/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@556 PS1, Line 556: private static class RecursingIterator implements RemoteIterator<FileStatus> { > Should we add some unit tests for this? I was hesitant to add a unit test which is more or less redundant with the test which uses this by way of FileMetadataLoader. i.e this code is already well covered by another relatively-small test. I'm not sure there's a lot of value in an additional one here. http://gerrit.cloudera.org:8080/#/c/12991/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@560 PS1, Line 560: private FileStatus curFile; > nit:use _ for class members Done http://gerrit.cloudera.org:8080/#/c/12991/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@564 PS1, Line 564: startPath > Preconditions.checkNotNull()? Done http://gerrit.cloudera.org:8080/#/c/12991/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@569 PS1, Line 569: curFile == nul > mind adding docs? (what it means to be null) Done http://gerrit.cloudera.org:8080/#/c/12991/1/fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java File fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java: http://gerrit.cloudera.org:8080/#/c/12991/1/fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java@36 PS1, Line 36: > Add an e-e test to read a bucketized hive table? Per the commit message, we don't currently actually have the recursive loading support hooked in anywhere. I didn't want to complicate this patch with the complexities of trying to handle bucketed tables during analysis, REFRESH, etc. -- To view, visit http://gerrit.cloudera.org:8080/12991 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9b151d7abb8443c0d9de0a0d82a9f13e07ad5109 Gerrit-Change-Number: 12991 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon <t...@apache.org> Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Sudhanshu Arora <sudhan...@cloudera.com> Gerrit-Reviewer: Todd Lipcon <t...@apache.org> Gerrit-Reviewer: Vihang Karajgaonkar <vih...@cloudera.com> Gerrit-Comment-Date: Thu, 25 Apr 2019 07:17:46 +0000 Gerrit-HasComments: Yes