Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8958 )

Change subject: IMPALA-5152: Introduce metadata loading phase
......................................................................


Patch Set 1:

(11 comments)

Patch looks good to me overall, have a few suggestions.

http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
File fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java:

http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java@367
PS1, Line 367:
IMO, this method should be documented, given this is the starting point of 
analysis.


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@309
PS1, Line 309:     private final Map<TableName, Table> loadedTables;
nit: My feeling is that we can call this referencedTables_ (and backup with a 
comment saying they are loaded). Reason being, we are separating loading and 
analysis part, so anything that reached here means it is already loaded. Please 
ignore if you disagree, I'm not too strong about this either.


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@2372
PS1, Line 2372:     if (table.isLoaded() && table instanceof IncompleteTable) {
Preconditions.checkState(table.isLoaded())?


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/service/Frontend.java@835
PS1, Line 835:         // Database does not exist.
Maybe catch DatabaseNotFoundEx? CatalogEx seems generic (I checked 
Catalog#getTable() seems to be throwing DBNotFoundEx after you removed 
ImpaladCatalog#getTable()).


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/service/Frontend.java@838
PS1, Line 838:       if (!tbl.isLoaded()) {
Should we handle tbl.isLoaded() && tbl instanceof IncompleteTable here and fail 
fast instead?


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/service/Frontend.java@859
PS1, Line 859:   private Set<TableName> collectTableCandidates(StatementBase 
stmt, String defaultDb) {
I feel like this method belongs to the AnalysisCtx() and not frontend?

Then we could just refactor requestTableLoadAndWait() to 
requestTableLoadAndWait(List<collectedTableRefsToLoad>....)


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/service/Frontend.java@910
PS1, Line 910:     if (missingTbls.isEmpty()) return new LoadedTables(catalog, 
loadedTbls);
Should we update something in the timeline to reflect that all the metadata is 
locally available?


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/service/Frontend.java@931
PS1, Line 931: currCatalog != catalog
Not fully familiar with != implementation, so I could be wrong on this one.

Shouldn't we do something like (currCatalog.serviceId != catalog.serviceId) ? 
Else, any change to the catalog, like a new updates etc, can alter the 
hashCode() resulting in an incorrect equals() ?


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/service/Frontend.java@942
PS1, Line 942:           LOG.warn(String.format("Waiting for table metadata. " +
Should we dump current missingTables() too?


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/service/Frontend.java@951
PS1, Line 951:           getMissingTables(catalog, missingTbls, defaultDb, 
loadedTbls);
Make getMissingTables() always use getCatalog() instead of passing it around?


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/service/Frontend.java@958
PS1, Line 958: issueLoadRequest
Not sure I follow the need for this. L960 makes sure missingTbls is up to date 
and we always need to load the remaining ones right?

I don't see a case when issueLoadRequest is false and missingTbls is not empty. 
In other words, every time we pass the loop condition, we need to load 
something.



--
To view, visit http://gerrit.cloudera.org:8080/8958
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I68d32d5acd4a6f6bc6cedb05e6cc5cf604d24a55
Gerrit-Change-Number: 8958
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Comment-Date: Tue, 09 Jan 2018 01:21:20 +0000
Gerrit-HasComments: Yes

Reply via email to