Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-4029: Reduce memory requirements for storing file metadata ......................................................................
Patch Set 2: (11 comments) http://gerrit.cloudera.org:8080/#/c/6406/2/common/fbs/CMakeLists.txt File common/fbs/CMakeLists.txt: PS2, Line 64: message(${JAVA_FE_ARGS}) > Looks like debug printing. Remove? (below too) Why? http://gerrit.cloudera.org:8080/#/c/6406/2/fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java File fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java: PS2, Line 67: storageIdtoInt > (Unrelated to your change). Looks obsolete. Done http://gerrit.cloudera.org:8080/#/c/6406/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: Line 119: FlatBufferBuilder fbb = new FlatBufferBuilder(1); > Preconditions.checkState(isBlockMdSupported(fs))? Done PS2, Line 124: } else { : locations = fileSystem.getFileBlockLocations(fileStatus, 0, fileStatus.getLen()); : } > Do we really need this? Per my understanding, we only fetch LocatedFileStat If fileStatus is not an instance of LocatedFileStatus we don't have the block locations. E.g. during refresh. Line 141: ListMap<TNetworkAddress> hostIndex) { > Preconditions.checkState(!isBlockMdSupported(fs)) We don't have the filesystem here. PS2, Line 425: fbFileBlock_.replicaHostIdxs(replicaIdx); > - replicaHostIdxs is [ushort] as per fbs definition in which case idx shoul In Java there is no unsigned short. Hence, ushort is stored as in int in flatbuffer java-generated code. I like the two masks as it makes the intention more explicit. Line 463: public static class FileDescStats { > For supportability purposes, I'm wondering if we should track the remote_ne That's a good point. I plan on augmenting this in a separate patch along with other useful stats. http://gerrit.cloudera.org:8080/#/c/6406/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: PS2, Line 344: = > nit: spacing Done Line 365: private static String findCommonPrefix(String a, String b) { > Use Strings.commonPrefix() from guava? Good point. Didn't know this existed. Line 1933: Collections.sort(orderedFds, new Comparator<FileDescriptor>() { > Can we just make FileDescriptor implement Comparator<FileDescriptor> and ad How are we going to call decompressFileName that references a structure (file name prefixes) in HdfsTable if we do that? http://gerrit.cloudera.org:8080/#/c/6406/2/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: Line 93: protected boolean storedInImpaladCatalogCache_ = false; > What is the motivation to track this? A quick look at the places it is used To avoid initializing data structures that aren't used in the catalog. Tests also need to access this because of the different way we have for loading metadata. -- To view, visit http://gerrit.cloudera.org:8080/6406 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I483d3cadc9d459f71a310c35a130d073597b0983 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis <dtsirogian...@cloudera.com> Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com> Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-HasComments: Yes