Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7793 )

Change subject: IMPALA-4252: Min-max runtime filters for Kudu
......................................................................


Patch Set 6:

(18 comments)

http://gerrit.cloudera.org:8080/#/c/7793/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/7793/6//COMMIT_MSG@15
PS6, Line 15: them into the Kudu client. Because the Kudu client doesn't 
provide a
> Is there any documentation for Kudu about what ordering Kudu uses for compa
I asked the Kudu team, and its not documented anywhere, but they confirmed they 
use the same ordering as Impala does.


http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/exec/filter-context.h
File be/src/exec/filter-context.h:

http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/exec/filter-context.h@106
PS6, Line 106: bloom_filter
> has_bloom_filter() ? At callsites it reads like this should return a filter
Done


http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/exec/filter-context.h@109
PS6, Line 109: min_max_filter
> has_min_max_filter?
Done


http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/runtime/coordinator.cc@1220
PS6, Line 1220:       MinMaxFilter::Or(src_type(), params.min_max_filter, 
min_max_filter_.get());
> I mentioned this in a later comment, but we should consider what happens if
Done


http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/runtime/runtime-filter-bank.cc
File be/src/runtime/runtime-filter-bank.cc:

http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/runtime/runtime-filter-bank.cc@235
PS6, Line 235:   MinMaxFilter* min_max_filter = MinMaxFilter::Create(type, 
&obj_pool_);
> (nit) - could combine into one line.
Done


http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/runtime/runtime-filter-ir.cc
File be/src/runtime/runtime-filter-ir.cc:

http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/runtime/runtime-filter-ir.cc@27
PS6, Line 27:   // to a) the atomicity of / pointer assignments and b) the x86 
TSO memory model.
> Comment not relevant any more since we're using actual atomics?
Done


http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter-ir.cc
File be/src/util/min-max-filter-ir.cc:

http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter-ir.cc@53
PS6, Line 53:       min_str = string(value->ptr, value->len);
> We should think through the behaviour with very large strings. It may make
Done


http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.h
File be/src/util/min-max-filter.h:

http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.h@83
PS6, Line 83: #define NUMERIC_MIN_MAX_FILTER(NAME, TYPE)                        
 \
> Yeah, the macro seems like it may be the least of the evils.
Done


http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.cc
File be/src/util/min-max-filter.cc:

http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.cc@24
PS6, Line 24: "
> #include "runtime/timestamp-value.inline.h"
Done


http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.cc@77
PS6, Line 77:   std::string NAME##MinMaxFilter::DebugString() const {           
             \
> nit: don't need std:: prefix in .cc files.
Done


http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.cc@139
PS6, Line 139:       out->min.__set_string_val(std::min(in.min.string_val, 
out->min.string_val));
> I feel like we should use our StringValue::Compare() function instead of re
Done


http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.cc@291
PS6, Line 291:       BigIntMinMaxFilter::Or(in, out);
> Shouldn't this be calling the timestamp method?
TColumnData doesn't have a timestamp field, so I convert timestamps with 
UtcToUnixTimeMicros and pass them around in thrift as longs. (noted in a 
comment now)

Unless I'm missing a reason that comparing timestamps with way won't lead to 
the same results as comparing the unconverted TimestampValues.


http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.cc@327
PS6, Line 327:   }
> Timestamp?
Same as above.


http://gerrit.cloudera.org:8080/#/c/7793/6/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java:

http://gerrit.cloudera.org:8080/#/c/7793/6/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@379
PS6, Line 379:         if (slotRef == null || slotRef.getDesc().getColumn() == 
null
> Are there planner tests for all these cases?
Done


http://gerrit.cloudera.org:8080/#/c/7793/6/testdata/workloads/functional-query/queries/QueryTest/runtime_filters.test
File testdata/workloads/functional-query/queries/QueryTest/runtime_filters.test:

http://gerrit.cloudera.org:8080/#/c/7793/6/testdata/workloads/functional-query/queries/QueryTest/runtime_filters.test@30
PS6, Line 30: kudu
> Can we make this something like table_format=kudu so that it's a bit more s
Done


http://gerrit.cloudera.org:8080/#/c/7793/6/tests/query_test/test_runtime_filters.py
File tests/query_test/test_runtime_filters.py:

http://gerrit.cloudera.org:8080/#/c/7793/6/tests/query_test/test_runtime_filters.py@61
PS6, Line 61:       lambda v: v.get_value('table_format').file_format not in 
['hbase', 'kudu'])
> nit: 4 character indents on line continuations (I guess this is preserved f
Done


http://gerrit.cloudera.org:8080/#/c/7793/6/tests/util/test_file_parser.py
File tests/util/test_file_parser.py:

http://gerrit.cloudera.org:8080/#/c/7793/6/tests/util/test_file_parser.py@252
PS6, Line 252:           allowed_formats = ['kudu']
> Thanks for adding this validation, hopefully will lead people in the right 
Done


http://gerrit.cloudera.org:8080/#/c/7793/6/tests/util/test_file_parser.py@254
PS6, Line 254: hint
> hint? Isn't it a table format?
Done



--
To view, visit http://gerrit.cloudera.org:8080/7793
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02bad890f5b5f78388a3041bf38f89369b5e2f1c
Gerrit-Change-Number: 7793
Gerrit-PatchSet: 6
Gerrit-Owner: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward #345
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mjac...@apache.org>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokh...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Wed, 18 Oct 2017 19:08:39 +0000
Gerrit-HasComments: Yes

Reply via email to