Alex Behm has posted comments on this change.

Change subject: IMPALA-4029: Reduce memory requirements for storing file 
metadata
......................................................................


Patch Set 4:

(11 comments)

Seems fine. Nothing major left to do here.

http://gerrit.cloudera.org:8080/#/c/6406/3/common/fbs/CatalogObjects.fbs
File common/fbs/CatalogObjects.fbs:

Line 40:   // Total length of the block
> You mean why not compute it on the fly by iterating on the file blocks?
It can be computed based on the offset, file length, and block size. No need to 
even iterate.

All but the last block have length == block size. You can determine last block 
based on its offset and compute the length with file length - offset.

We don't have to do this now, but seems simple enough to do later.


Line 65:   last_modification_time: long (id: 3);
> Hm, not sure I understand what the ask is here? Should I try to merge this 
I was mostly wondering if the field order matters. For example, could we save 
some space by moving the compression field last? Having it here in the middle, 
followed by a long seems like fb "might" add padding to align the 
last_modification_time member.


http://gerrit.cloudera.org:8080/#/c/6406/3/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

Line 227:           FbCompression.name(fbFileDescriptor_.compression()));
> I don't think any of these statements is making a call to NN. Which one are
I though loc.getNames() might because it can throw an IOException. I looked at 
the HDFS code and I don't see any reason why that method should throw... 
Bizarre.


http://gerrit.cloudera.org:8080/#/c/6406/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

Line 199:     private static int[] loadBlockMetadata(FlatBufferBuilder fbb,
createBlockMetadata()?

I'm not sure what the "load" was supposed to mean


Line 303:     private final boolean isCached_;
hasCachedReplica_


Line 372:     private static short[] loadDiskIds(BlockLocation location, 
FileDescStats stats) {
mapDiskIds() or createDiskIds()?

I'm not sure what "load" means here.


Line 380:         return Shorts.toArray(Collections.<Short>emptyList());
why not:

return new short[0];


Line 387:       List<Short> diskIDs = new ArrayList<Short>(storageIds.length);
Why not a short[]? The size is fixed and you can even use the 'i' index in the 
loop.


Line 403:     public boolean isCached() { return isCached_; }
hasCachedReplica()


http://gerrit.cloudera.org:8080/#/c/6406/4/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

Line 110:     return storedInImpaladCatalogCache_ || 
RuntimeEnv.INSTANCE.isTestEnv();
Comment on the test env condition


Line 287:     // Indicate that this table is stored in an Impalad catalog cache.
no need for comment


-- 
To view, visit http://gerrit.cloudera.org:8080/6406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I483d3cadc9d459f71a310c35a130d073597b0983
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-HasComments: Yes

Reply via email to