Dimitris Tsirogiannis has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8958 )

Change subject: IMPALA-5152: Introduce metadata loading phase
......................................................................


Patch Set 1:

(19 comments)

First round of comments. Overall approach looks reasonable to me.

http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java
File fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java:

http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java@73
PS1, Line 73:     if (!fromClauseOnly) tblRefs.add(new 
TableRef(tableName_.toPath(), null));
Maybe replace the if check with a Preconditions.checkState(!fromClauseOnly); 
since it doesn't make sense to call this function with true in this context.


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
File fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java:

http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java@66
PS1, Line 66: // Set in analyzeAndAuthorize().
            :   private ImpaladCatalog catalog_;
If I understand it correctly, that shouldn't be needed, given that analysis can 
proceed with the LoadedTables state.


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java@518
PS1, Line 518: analysisResult_.isAlterTableStmt() ||
             :             analysisResult_.isAlterViewStmt());
Curious, where did this come from?


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@2361
PS1, Line 2361: or the db
I don't think the existence of db is checked in this function anymore.


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/FromClause.java
File fe/src/main/java/org/apache/impala/analysis/FromClause.java:

http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/FromClause.java@68
PS1, Line 68: public
@Override


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/StatementBase.java
File fe/src/main/java/org/apache/impala/analysis/StatementBase.java:

http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/StatementBase.java@64
PS1, Line 64: public abstract void collectTableRefs(List<TableRef> tblRefs, 
boolean fromClauseOnly);
If most of the statement classes that extend StatementBase don't return 
anything you can just provide a default implementation here and avoid adding a 
function in all these cases.


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/service/Frontend.java@823
PS1, Line 823:    * ignored and not returned or added to 'loadedTables'.
Comment where defaultDb is used.


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/service/Frontend.java@825
PS1, Line 825: private Set<TableName> getMissingTables
nit: can you put the following two functions below requestTblLoadAndWait so 
that the code reads top-down?


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/service/Frontend.java@866
PS1, Line 866: org.apache.impala.analysis.Path.getCandidateTables
nit: you can use import static if you want to avoid the classpath inside the 
function.


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/service/Frontend.java@872
PS1, Line 872: LoadedTables
The name sounds a bit generic. Maybe LoadedTablesCache or something to indicate 
the temporal nature of it (i.e. it is specific to a query not a global thing).


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/service/Frontend.java@873
PS1, Line 873: public final ImpaladCatalog catalog;
I see that this is used for initializing the Analyzer. But given that the 
analysis proceeds based on the loaded tables (my understanding from the commit 
msg), does the Analyzer still need it?


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/service/Frontend.java@885
PS1, Line 885: If non-NULL
You mean the timeline, no?


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/service/Frontend.java@889
PS1, Line 889: defaultDb
nit: I would call it sessionDb


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/service/Frontend.java@903
PS1, Line 903: public
public?


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/service/Frontend.java@916
PS1, Line 916: requestedTbls
It seems that this is only used for the log message in L965. Can't you use a 
counter instead?


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/service/Frontend.java@919
PS1, Line 919: final long debugLoggingFrequency = 10;
             :     final long retryLoadFrequency = 20;
nit: move them at the beginning of this function?


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/service/Frontend.java@924
PS1, Line 924: if (issueLoadRequest) {
Since we know there are missing tbls (condition in while), why do we need this 
check? The way I read this is "there are missing tbls but I will not ask them 
to be loaded" and I am not sure I understand the reason.


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/service/Frontend.java@943
PS1, Line 943: Tables remaining: %s
If views are used, that number may go up and down and I am wondering if this 
may cause some confusion to anyone looking at the logs.


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/service/Frontend.java@1091
PS1, Line 1091: AnalysisContext analysisCtx = new AnalysisContext(queryCtx, 
authzConfig_, timeline);
              :     AnalysisResult analysisResult =
              :         
analysisCtx.analyzeAndAuthorize(queryCtx.client_request.stmt, this);
As is today, we have the following call sequence: 
Frontend::createExecRequest()->AnalysisContext::analyzeAndAuthorize()->Frontend::requestTableLoadAndWait().
 I think it may make more sense, to first collect the required tbls state (i.e. 
load any missing ones) and pass that to the AnalysisContext to perform the 
analysis.



--
To view, visit http://gerrit.cloudera.org:8080/8958
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I68d32d5acd4a6f6bc6cedb05e6cc5cf604d24a55
Gerrit-Change-Number: 8958
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com>
Gerrit-Comment-Date: Thu, 11 Jan 2018 00:26:57 +0000
Gerrit-HasComments: Yes

Reply via email to