Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11182 )
Change subject: IMPALA-7436: initial fetch-from-catalogd implementation ...................................................................... Patch Set 5: (11 comments) initial pass. still making my way through the providers. http://gerrit.cloudera.org:8080/#/c/11182/5/be/src/service/fe-support.cc File be/src/service/fe-support.cc: http://gerrit.cloudera.org:8080/#/c/11182/5/be/src/service/fe-support.cc@536 PS5, Line 536: NULL nit: use nullptr for these to be consistent within the function. ideally we convert the whole file while here, but I think its worth it to keep the diffs lower for this change. http://gerrit.cloudera.org:8080/#/c/11182/5/common/thrift/CatalogService.thrift File common/thrift/CatalogService.thrift: http://gerrit.cloudera.org:8080/#/c/11182/5/common/thrift/CatalogService.thrift@283 PS5, Line 283: partitions is this guaranteed to have info for each partition that was requested? if not all partitions are present (either due to an error or staleness), is an error returned or is this list best-effort? http://gerrit.cloudera.org:8080/#/c/11182/5/common/thrift/CatalogService.thrift@311 PS5, Line 311: 2: optional TCatalogServiceRequestHeader header heh, I omitted these for the incremental stats pull and nothing complained. probably a missing sanity check somewhere... http://gerrit.cloudera.org:8080/#/c/11182/5/common/thrift/CatalogService.thrift@324 PS5, Line 324: // The status of the operation, OK if the operation was successful. nit: indentation off http://gerrit.cloudera.org:8080/#/c/11182/5/common/thrift/CatalogService.thrift@325 PS5, Line 325: optional make it required if it'll always be set? I think I went with not setting it if there was no error, but if it will be set to ok for that case, that it can be required. http://gerrit.cloudera.org:8080/#/c/11182/5/common/thrift/CatalogService.thrift@417 PS5, Line 417: TGetPartialCatalogObjectResponse GetPartialCatalogObject( pls add a comment http://gerrit.cloudera.org:8080/#/c/11182/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/11182/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1861 PS5, Line 1861: table I had a bit of trouble with this one for pulling incremental stats. For example, if the table is incomplete (e.g., an exception was throws while loading). Similarly, if the table is not null, and not incomplete, is it guaranteed to be loaded (e.g., isLoaded() == true)? The comment for getOrLoadTable allows for the case of a "not yet loaded" table... a bit unclear if that corresponds to isLoaded() == false or an IncompleteTable. http://gerrit.cloudera.org:8080/#/c/11182/5/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/11182/5/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@100 PS5, Line 100: * index 'dstIndex' instead of the original index 'origIndex'. perhaps add why this is needed. http://gerrit.cloudera.org:8080/#/c/11182/5/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/11182/5/fe/src/main/java/org/apache/impala/catalog/Table.java@381 PS5, Line 381: public TGetPartialCatalogObjectResponse getPartialInfo( pls add a comment for this one stating that its only called on the catalog, e.g., the source of truth that we happen to be using for metadata (either direct or catalogd). easy to lose context with this class which is shared between two uses. http://gerrit.cloudera.org:8080/#/c/11182/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/11182/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@909 PS5, Line 909: ndvCap nit: looks fine, but pls add a reference to this fix to the commit message (just a heads-up for me to know what to expect). http://gerrit.cloudera.org:8080/#/c/11182/5/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/11182/5/fe/src/main/java/org/apache/impala/service/Frontend.java@1051 PS5, Line 1051: new EventSequence("Query Compilation"); This'll show up as the timeline for the query that succeeds. I think it would be useful to break this up differently so that the timeline also includes the time needed to retry, when it comes up. I doubt anyone would look to the logs to check whether their "slow" query can be explained by the retry log message on L1040. -- To view, visit http://gerrit.cloudera.org:8080/11182 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If49207fc592b1cc552fbcc7199568b6833f86901 Gerrit-Change-Number: 11182 Gerrit-PatchSet: 5 Gerrit-Owner: Todd Lipcon <t...@apache.org> Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Tianyi Wang <tw...@cloudera.com> Gerrit-Reviewer: Todd Lipcon <t...@apache.org> Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com> Gerrit-Comment-Date: Wed, 15 Aug 2018 17:38:38 +0000 Gerrit-HasComments: Yes