Attila Jeges has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7564 )

Change subject: IMPALA-3548: Prune runtime filters based on query options in 
the FE
......................................................................


Patch Set 10:

(26 comments)

http://gerrit.cloudera.org:8080/#/c/7564/10/be/src/service/fe-support.cc
File be/src/service/fe-support.cc:

http://gerrit.cloudera.org:8080/#/c/7564/10/be/src/service/fe-support.cc@455
PS10, Line 455:     jbyteArray thrift_in_query_options) {
> Not sure what 'thrift_in_query_options' means. How about using 'tquery_opti
I've renamed the parameter.

I decided to pass in the existing query options because the thrift-generated 
C++ and Java TQueryOptions classes don't handle the isset flags properly.

If I don't pass in the existing query options, then:

1. I'd have to create an "empty" TQueryOptions instance inside this function.

2. The __isset member of that instance would contain true values for every 
option even though they haven't been set explicitly. Therefore I'd have to 
overwrite __isset with false values somehow before calling 
impala::ParseQueryOptions().

3. When FeSupport.ParseQueryOptions() deserializes the TQueryOptions instance 
returned by Java_org_apache_impala_service_FeSupport_NativeParseQueryOptions(), 
it would still find that some query options have been set, even if they haven't 
been touched (e.g. TQueryOptions.setExplain_levelIsSet() always returns true 
since explain_level != null even if it hasn't been set explicitly).

I'm not sure why thrift-generated classes behave this way. Maybe there's a 
better way to work around the isset bugs, but I couldn't think of any.


http://gerrit.cloudera.org:8080/#/c/7564/10/be/src/service/fe-support.cc@457
PS10, Line 457:   DeserializeThriftMsg(env, thrift_in_query_options, &options);
> Needs error handling, e.g. using THROW_IF_ERROR_RET
Done


http://gerrit.cloudera.org:8080/#/c/7564/10/be/src/service/fe-support.cc@458
PS10, Line 458:   const char *o = env->GetStringUTFChars(options_str, NULL);
> * You need to release the string UTF chars, take a look at JniUtfCharGuard
Done


http://gerrit.cloudera.org:8080/#/c/7564/10/be/src/service/fe-support.cc@461
PS10, Line 461:   TParseQueryOptionsResult result;
> I think it's simpler to convert all Status to an exception. That way we don
Done


http://gerrit.cloudera.org:8080/#/c/7564/10/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

http://gerrit.cloudera.org:8080/#/c/7564/10/fe/src/main/java/org/apache/impala/planner/Planner.java@119
PS10, Line 119:     // Create runtime filters. 'singleNodePlan' now points to 
the root of the distributed
> Comment is wrong. 'singleNodePlan' definitely does not point to the root of
Done


http://gerrit.cloudera.org:8080/#/c/7564/10/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java:

http://gerrit.cloudera.org:8080/#/c/7564/10/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@354
PS10, Line 354:     public void addTarget(RuntimeFilterTarget target) { 
targets_.add(target); }
> add newline before
Done


http://gerrit.cloudera.org:8080/#/c/7564/10/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@397
PS10, Line 397:     Preconditions.checkState(maxNumFilters >= 0);
> Why is 0 not a valid value?
0 is a valid value. MAX_NUM_RUNTIME_FILTERS=0 effectively removes every runtime 
filter.


http://gerrit.cloudera.org:8080/#/c/7564/10/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@423
PS10, Line 423:       filter.computeNdvEstimate();
> My understanding is that IMPALA-3450 has been fixed and we can remove this
Done


http://gerrit.cloudera.org:8080/#/c/7564/10/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@537
PS10, Line 537:    * The assigned filters are the ones for which 'scanNode' can 
be used a destination
> can be used as a destination node
Done


http://gerrit.cloudera.org:8080/#/c/7564/10/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@539
PS10, Line 539:    * 1. If DISABLE_ROW_RUNTIME_FILTERING query option is set, a 
filter is only assigned to
> If the DISABLE_ROW_RUNTIME_FILTERING query option is set ...
Done


http://gerrit.cloudera.org:8080/#/c/7564/10/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@541
PS10, Line 541:    * 2. If RUNTIME_FILTER_MODE query option is set to LOCAL, a 
filter is only assigned to
> If the RUNTIME_FILTER_MODE query option is set ...
Done


http://gerrit.cloudera.org:8080/#/c/7564/10/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@542
PS10, Line 542:    *    'scanNode' if the filter is local to the scan node.
> This doesn't really explain what the LOCAL option means. How about:
Done


http://gerrit.cloudera.org:8080/#/c/7564/10/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@563
PS10, Line 563:       if (!isSingleNodeExec && runtimeFilterMode == 
TRuntimeFilterMode.LOCAL &&
> Why the !isSingleNodeExec condition? I think that !isLocalTarget implies !i
Correct, removed !isSingleNodeExec


http://gerrit.cloudera.org:8080/#/c/7564/10/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@576
PS10, Line 576:   static private boolean IsLocalTarget(RuntimeFilter filter, 
ScanNode targetNode) {
> isLocalTarget()
Done


http://gerrit.cloudera.org:8080/#/c/7564/10/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@583
PS10, Line 583:   static private boolean IsBoundByPartitionColumns(Analyzer 
analyzer, Expr targetExpr,
> isBoundByPartitionColumns()
Done


http://gerrit.cloudera.org:8080/#/c/7564/10/fe/src/main/java/org/apache/impala/service/FeSupport.java
File fe/src/main/java/org/apache/impala/service/FeSupport.java:

http://gerrit.cloudera.org:8080/#/c/7564/10/fe/src/main/java/org/apache/impala/service/FeSupport.java@92
PS10, Line 92:   public native static byte[] NativeParseQueryOptions(String 
optionsStr,
> to be clearer call the first arg csvQueryOptions
Done


http://gerrit.cloudera.org:8080/#/c/7564/10/fe/src/main/java/org/apache/impala/service/FeSupport.java@93
PS10, Line 93:
> Can we get rid of the second argument and use PlannerTestBase.mergeQueryOpt
See my response to be/src/service/fe-support.cc L455.

The thrift-generated TQueryOptions classes don't handle the isset flags 
properly which would confuse mergeQueryOptions().


http://gerrit.cloudera.org:8080/#/c/7564/10/fe/src/main/java/org/apache/impala/service/FeSupport.java@282
PS10, Line 282:   public static TParseQueryOptionsResult 
ParseQueryOptions(String optionsStr,
> Public function needs a comment.
I've modified the method to return TQueryOptions and throw an exception if 
there was an error.


http://gerrit.cloudera.org:8080/#/c/7564/10/fe/src/test/java/org/apache/impala/planner/PlannerTest.java
File fe/src/test/java/org/apache/impala/planner/PlannerTest.java:

http://gerrit.cloudera.org:8080/#/c/7564/10/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@300
PS10, Line 300:   public void testRuntimeFilterPruning() {
> testRuntimeFilterQueryOptions()?
Done


http://gerrit.cloudera.org:8080/#/c/7564/10/fe/src/test/java/org/apache/impala/testutil/TestFileParser.java
File fe/src/test/java/org/apache/impala/testutil/TestFileParser.java:

http://gerrit.cloudera.org:8080/#/c/7564/10/fe/src/test/java/org/apache/impala/testutil/TestFileParser.java@288
PS10, Line 288:         parseQueryOptions(currentTestCase);
> Why do we need to call this here and in L338? Isn't the call in L338 enough
L288-L293 handles test cases found in the middle of the test file.
L338-L344 handles the very last test case found at the end of the test file.
We need to parse the query-options section in both places.


http://gerrit.cloudera.org:8080/#/c/7564/10/testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-pruning.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-pruning.test:

http://gerrit.cloudera.org:8080/#/c/7564/10/testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-pruning.test@2
PS10, Line 2: select count(*) from functional.alltypes a
> When using join hints it is recommended to also use straight_join. Also ple
Done


http://gerrit.cloudera.org:8080/#/c/7564/10/testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-pruning.test@85
PS10, Line 85: select count(*) from functional.alltypes a
> Can we make it more obvious which runtime filters are the most selective? T
Done


http://gerrit.cloudera.org:8080/#/c/7564/10/testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-pruning.test@91
PS10, Line 91: ---- PLAN
> To reduce the number of test lines, let's only keep those sections that are
Done


http://gerrit.cloudera.org:8080/#/c/7564/10/testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-pruning.test@167
PS10, Line 167: # DISABLE_ROW_RUNTIME_FILTERING is set: only partition column 
filters are applied
> What's the effect of setting MAX_NUM_RUNTIME_FILTERS=3 in this test? Does i
Yes, it does. I've added a new test case to test this.


http://gerrit.cloudera.org:8080/#/c/7564/10/testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-pruning.test@252
PS10, Line 252: # RUNTIME_FILTER_MODE is set to LOCAL: only local filters are 
applied
> Same question as above. What happens when we set MAX_NUM_RUNTIME_FILTERS=4.
Yes, it does. I've added a new test case to test this.


http://gerrit.cloudera.org:8080/#/c/7564/10/testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-pruning.test@418
PS10, Line 418: # RUNTIME_FILTER_MODE is OFF: no filters are applied
> Is this the same as MAX_NUM_RUNTIME_FILTERS=0?
Yes, it is. Added a new test case for MAX_NUM_RUNTIME_FILTERS=0



--
To view, visit http://gerrit.cloudera.org:8080/7564
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id0f0b200e02442edcad8df3979f652d66c6e52eb
Gerrit-Change-Number: 7564
Gerrit-PatchSet: 10
Gerrit-Owner: Attila Jeges <atti...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <atti...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mjac...@apache.org>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <phi...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Comment-Date: Tue, 24 Oct 2017 18:59:40 +0000
Gerrit-HasComments: Yes

Reply via email to