[Impala-ASF-CR] Make build scripts more friendly to Ubuntu 16.04
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/8262 ) Change subject: Make build scripts more friendly to Ubuntu 16.04 .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8262/1/bin/bootstrap_build.sh File bin/bootstrap_build.sh: http://gerrit.cloudera.org:8080/#/c/8262/1/bin/bootstrap_build.sh@23 PS1, Line 23: # 1. At least 8GB of free disk space > Is this still true? I vaguely recollect running out of space in a vm that h Yes, check the df output: https://jenkins.impala.io/view/Experimental/job/ubuntu-16.04-build-only/4/consoleFull You may have been loading data, building static (and building all tests), or other developer activities. -- To view, visit http://gerrit.cloudera.org:8080/8262 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8196a2a87bce5893a349a1b290c3f3d04fd80317 Gerrit-Change-Number: 8262 Gerrit-PatchSet: 1 Gerrit-Owner: Jim Apple Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Thu, 12 Oct 2017 05:00:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/8235 ) Change subject: IMPALA-5429: Multi threaded block metadata loading .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/8235/5/be/src/catalog/catalog.cc File be/src/catalog/catalog.cc: http://gerrit.cloudera.org:8080/#/c/8235/5/be/src/catalog/catalog.cc@42 PS5, Line 42: DEFINE_int32(max_s3_parts_parallel_load, 10, I would be more aggressive with this parameter and put it at 20. http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@787 PS5, Line 787: int threadPoolSize = FileSystemUtil.supportsStorageIds(tableFs) ? What is the expected behavior for tables with mixed FSs? As a mix of S3 and HDFS partitions. http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@801 PS5, Line 801: getLoadingThreadPoolSize > can different partitions have different number of files? if so, work across Parallelization of metadata loading is done on per partition granularity. -- To view, visit http://gerrit.cloudera.org:8080/8235 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481 Gerrit-Change-Number: 8235 Gerrit-PatchSet: 5 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 12 Oct 2017 04:51:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Make build scripts more friendly to Ubuntu 16.04
Zoram Thanga has posted comments on this change. ( http://gerrit.cloudera.org:8080/8262 ) Change subject: Make build scripts more friendly to Ubuntu 16.04 .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8262/1/bin/bootstrap_build.sh File bin/bootstrap_build.sh: http://gerrit.cloudera.org:8080/#/c/8262/1/bin/bootstrap_build.sh@23 PS1, Line 23: # 1. At least 8GB of free disk space Is this still true? I vaguely recollect running out of space in a vm that had 32GB of disk space. -- To view, visit http://gerrit.cloudera.org:8080/8262 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8196a2a87bce5893a349a1b290c3f3d04fd80317 Gerrit-Change-Number: 8262 Gerrit-PatchSet: 1 Gerrit-Owner: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Thu, 12 Oct 2017 04:48:32 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Make build scripts more friendly to Ubuntu 16.04
Jim Apple has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8262 Change subject: Make build scripts more friendly to Ubuntu 16.04 .. Make build scripts more friendly to Ubuntu 16.04 This commit bundles two changes. The first extracts from bootstrap_development.sh the commands to prepare a system to build-and-test without actually doing so. This enables custom build commands that might not load the test data, for instance. The second changes bootstrap_build.sh to work with Ubuntu 16.04. It should still work with Ubuntu 14.04, but I don't anticipate that being part of the Jenkins pre-merge job anymore, so I removed that from the comment at the top of the file explaining what it does. Change-Id: I8196a2a87bce5893a349a1b290c3f3d04fd80317 --- M bin/bootstrap_build.sh M bin/bootstrap_development.sh A bin/bootstrap_system.sh 3 files changed, 227 insertions(+), 174 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/8262/1 -- To view, visit http://gerrit.cloudera.org:8080/8262 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I8196a2a87bce5893a349a1b290c3f3d04fd80317 Gerrit-Change-Number: 8262 Gerrit-PatchSet: 1 Gerrit-Owner: Jim Apple
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8023 ) Change subject: IMPALA-4856: Port data stream service to KRPC .. Patch Set 3: (34 comments) Another batch of comments... http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h File be/src/runtime/krpc-data-stream-mgr.h: http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@63 PS3, Line 63: ongoing transmission from a client to a server as a 'stream' I don't think that's accurate. see questions in krpc-data-stream-recvr.h about stream definition. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@93 PS3, Line 93: process it immediately, add it to a fixed-size 'batch queue' for later processing XXX check whether these are really different http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@164 PS3, Line 164: deferred processing is that because the recvr hasn't showed up yet, or because the stream's queue is full, or both? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@235 PS3, Line 235: node_id dest_node_id? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@239 PS3, Line 239: Ownership of the receiver is shared between this DataStream mgr instance and the : /// caller. that seems unnecessary but don't change it now. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@246 PS3, Line 246: 'proto_batch'? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@248 PS3, Line 248: . 'request'. Also document what 'response' and 'context' are. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@266 PS3, Line 266: Notifies the receiver is this an RPC handler? I think we should just be explicit about which of these methods are RPC handlers. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@267 PS3, Line 267: The RPC what RPC is this talking about? If this is a handler, then it's clear. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@274 PS3, Line 274: Closes Does it close or cancel? (or is there no difference?) http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@284 PS3, Line 284: RPCs which were buffered To be consistent with terminology used in class comment, maybe say "deferred RPCs" http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@340 PS3, Line 340: ragment instance id, 0 what is that saying? is that a misplaced comma or am I reading this wrong? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@341 PS3, Line 341: instance id changes I don't understand this. it kinda sounds like we're trying to be able to find all instances of this fragment, but then wouldn't we iterate until the fragment id changes (not the instance id)? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@349 PS3, Line 349: struct EarlySendersList { hmm, I guess we need this now that we can't block the RPC thread? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@358 PS3, Line 358: Time Monotonic time http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@374 PS3, Line 374: time monotonic time http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@382 PS3, Line 382: boost::unordered_set closed_stream_cache_; all this parallel startup stuff really needs to be revisited (but not for this change). it's too complex and brittle. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@386 PS3, Line 386: Deserialize maybe call it DeserializeDeferred() or DeserializeWorker() to make it clearer that this is only for the deferred (slow) path, since the normal path will also have to deserialize (but doesn't use this code). http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@404 PS3, Line 404: void EnqueueRowBatch(DeserializeWorkItem&& payload); how about grouping this with Deferred function above since it's related. Also, I think the name should be less generic. Like maybe EnqueueDeferredBatch() or EnqueueDeferredRpc() (does the response happen before or after this deferred work?) http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@413 PS3, Line 413: block what's that? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@414 PS3, Line 414: Status I think that status is not getting checked by the caller. I thought Tim made Status warn on unused result -- why is it not catching this? (Or do we still need to annotate each method?). http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-da
[Impala-ASF-CR] IMPALA-6040: skip test multi compression types where Hive isn't supported
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8259 ) Change subject: IMPALA-6040: skip test_multi_compression_types where Hive isn't supported .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8259 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3ad4b72839f8ac3bcb824287d02dd6964eea3e3e Gerrit-Change-Number: 8259 Gerrit-PatchSet: 2 Gerrit-Owner: Michael Brown Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 12 Oct 2017 02:16:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6040: skip test multi compression types where Hive isn't supported
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8259 ) Change subject: IMPALA-6040: skip test_multi_compression_types where Hive isn't supported .. IMPALA-6040: skip test_multi_compression_types where Hive isn't supported A recent commit "IMPALA-5448: fix invalid number of splits reported in Parquet scan node" neglected to account for the fact that in some environments, Impala runs without Hive. The typical pattern for tests that use Hive is skip them if they are executed against such environments. Change-Id: I3ad4b72839f8ac3bcb824287d02dd6964eea3e3e Reviewed-on: http://gerrit.cloudera.org:8080/8259 Reviewed-by: Michael Brown Tested-by: Impala Public Jenkins --- M tests/query_test/test_scanners.py 1 file changed, 4 insertions(+), 0 deletions(-) Approvals: Michael Brown: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8259 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I3ad4b72839f8ac3bcb824287d02dd6964eea3e3e Gerrit-Change-Number: 8259 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Brown Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.
Tim Wood has posted comments on this change. ( http://gerrit.cloudera.org:8080/8102 ) Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for Impala. .. Patch Set 17: (1 comment) http://gerrit.cloudera.org:8080/#/c/8102/17/testdata/workloads/tpcds/queries/tpcds-q26.test File testdata/workloads/tpcds/queries/tpcds-q26.test: http://gerrit.cloudera.org:8080/#/c/8102/17/testdata/workloads/tpcds/queries/tpcds-q26.test@6 PS17, Line 6: truncate(avg(cs_quantity) * 10.0) / 10 agg1, > The query text in this file now reliably produces the results in this file, I set the scale adjustment to 10^5 in this line and adjusted 3 result lines so the approximation of +1/3 (.3) would be clearer. -- To view, visit http://gerrit.cloudera.org:8080/8102 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66 Gerrit-Change-Number: 8102 Gerrit-PatchSet: 17 Gerrit-Owner: Tim Wood Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Wood Gerrit-Comment-Date: Thu, 12 Oct 2017 02:09:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected()
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8255 ) Change subject: IMPALA-5940: Avoid log spew by using Status::Expected() .. IMPALA-5940: Avoid log spew by using Status::Expected() This change converts two more callers of Status() to Status::Expected() to avoid log spew and the overhead of stack crawling. These two call sites can easily fill the log when there is any network issue. Testing done: Ran concurrent TPC-DS workload with 256 users and verified the log spew is not there. Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3 Reviewed-on: http://gerrit.cloudera.org:8080/8255 Reviewed-by: Michael Ho Tested-by: Impala Public Jenkins --- M be/src/rpc/thrift-client.cc M be/src/service/impala-server.cc 2 files changed, 9 insertions(+), 4 deletions(-) Approvals: Michael Ho: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8255 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3 Gerrit-Change-Number: 8255 Gerrit-PatchSet: 4 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected()
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8255 ) Change subject: IMPALA-5940: Avoid log spew by using Status::Expected() .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8255 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3 Gerrit-Change-Number: 8255 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Thu, 12 Oct 2017 01:34:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5789: Add always false flag in bloom filter
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8170 ) Change subject: IMPALA-5789: Add always_false flag in bloom filter .. Patch Set 4: (3 comments) I need to do a proper pass over this. Did a quick pass now and had some questions. http://gerrit.cloudera.org:8080/#/c/8170/4/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/8170/4/be/src/runtime/coordinator.cc@a20 PS4, Line 20: Thanks for doing cleanup here. How did you determine which includes were needed? Is there a good tool to use? http://gerrit.cloudera.org:8080/#/c/8170/4/be/src/runtime/runtime-filter-ir.cc File be/src/runtime/runtime-filter-ir.cc: http://gerrit.cloudera.org:8080/#/c/8170/4/be/src/runtime/runtime-filter-ir.cc@27 PS4, Line 27: if (UNLIKELY(bloom_filter_->AlwaysFalse())) return false; Won't these branches be likely in some cases? IMO best to only use these annotations when it's very strongly unlikely, e.g. error paths. http://gerrit.cloudera.org:8080/#/c/8170/4/be/src/util/bloom-filter.h File be/src/util/bloom-filter.h: http://gerrit.cloudera.org:8080/#/c/8170/4/be/src/util/bloom-filter.h@200 PS4, Line 200: if (UNLIKELY(always_false_)) return false; Doesn't this mean that RuntimeFilter::Eval() checks always_false_ twice? Maybe best to only check it in here. -- To view, visit http://gerrit.cloudera.org:8080/8170 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If680240a3cd4583fc97c3192177d86d9567c4f8d Gerrit-Change-Number: 8170 Gerrit-PatchSet: 4 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 12 Oct 2017 01:07:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2494: Support for byte array encoded decimals in Parquet scanner
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/7822 ) Change subject: IMPALA-2494: Support for byte array encoded decimals in Parquet scanner .. Patch Set 4: (12 comments) Did a first pass over it - thought I should flush out comments since you've been waiting quite a while for feedback here. http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-column-readers.cc File be/src/exec/parquet-column-readers.cc: http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-column-readers.cc@1277 PS4, Line 1277: switch (node.element->type) { This case is only going to get bigger with the follow on patches - it would be best to make it a separate function. http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-common.h File be/src/exec/parquet-common.h: http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-common.h@213 PS4, Line 213: template Logical type and physical type I think aren't quite right, since e.g. "StringValue" isn't a logical type. It's really decoding/encoding between Parquet physical types and Impala's internal representation. Maybe INTERNAL_TYPE and PARQUET_PHYSICAL_TYPE? Or PARQUET_TYPE seems ok since that's what parquet calls it. http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-common.h@376 PS4, Line 376: inline int ParquetPlainEncoder::Decode( I think you can avoid stamping this out if you leave it still parameterized by the internal Impala type, since the logic is the same in all cases. There may be an opportunity to reduce the redundancy above too, since there are functions above with identical bodies. I tried this out on a dummy program and it looks like it's possible: #include template int foo(T* a, S* b) { return 0; } template <> int foo(int* a, int* b) { return 1; } template int foo(T* a, double* b) { return 2; } int main(int argc, char**argv) { int a, b, c; double x, y, z; printf("%d\n", foo(&a, &b)); printf("%d\n", foo(&a, &x)); } http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-metadata-utils.cc File be/src/exec/parquet-metadata-utils.cc: http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-metadata-utils.cc@66 PS4, Line 66: bool IsSupportedPhysicalType(PrimitiveType impala_type, let's make it static - we don't want to export the symbol to be linked with code outside this file. http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-metadata-utils.cc@70 PS4, Line 70: ( unnecessary parens http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-metadata-utils.cc@173 PS4, Line 173: // TODO: Is this check too strict? I don't see why this shouldn't match the file metadata, this seems valid to me. http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-metadata-utils.cc@182 PS4, Line 182: parquet::SchemaElement* Why not pass by const ref like above? http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-metadata-utils.cc@209 PS4, Line 209: stringstream ss; Can you convert to using Substitute() while we're here? We've been very gradually converting to using that for cases like this. http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-plain-test.cc File be/src/exec/parquet-plain-test.cc: http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-plain-test.cc@176 PS4, Line 176: TestType(Decimal4Value(test_val), It would be nice to combine the FIXED_LEN_BYTE_ARRAY and BYTE_ARRAY tests. Mostly the tests are the same except for the expected size, right? Could we make them table-driven so that we test all the same cases but just have different expected output? http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-plain-test.cc@193 PS4, Line 193: int32_t Any reason to use sizeof(int..) instead of sizeof(Decimal*Value) like above? Or if this is the better way to express it, should we change it above? http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-plain-test.cc@212 PS4, Line 212: for (int i = 1; i <=16; ++i) { nit: missing space http://gerrit.cloudera.org:8080/#/c/7822/4/testdata/data/binary_decimal_dictionary.parquet File testdata/data/binary_decimal_dictionary.parquet: PS4: Can you update the README to describe how these files were generated. -- To view, visit http://gerrit.cloudera.org:8080/7822 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2c0e881045109f337fecba53fec21f9cfb9e619e Gerrit-Change-Number: 7822 Gerrit-PatchSet: 4 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 12 Oct 2017 00:52:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2494: Support for byte array encoded decimals in Parquet scanner
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/7822 ) Change subject: IMPALA-2494: Support for byte array encoded decimals in Parquet scanner .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/7822/2/be/src/exec/parquet-column-stats.inline.h File be/src/exec/parquet-column-stats.inline.h: http://gerrit.cloudera.org:8080/#/c/7822/2/be/src/exec/parquet-column-stats.inline.h@89 PS2, Line 89: switch(parquet_type){ > Do you mean to have a Decode wrapper around the templatized Decode methods? The former, so that the interface of Decode() is simpler. How this is implemented seems more a concern of the decoder than the column stats. http://gerrit.cloudera.org:8080/#/c/7822/2/be/src/exec/parquet-common.h File be/src/exec/parquet-common.h: http://gerrit.cloudera.org:8080/#/c/7822/2/be/src/exec/parquet-common.h@237 PS2, Line 237: ByteSize > Do you mean to have something like I think this particular call here will always return sizeof(int32_t) (line 220). I'd just put that here, since your change explicitly documents that as a template parameter. http://gerrit.cloudera.org:8080/#/c/7822/2/be/src/exec/parquet-common.h@338 PS2, Line 338: template <> > thats kinda difficult because DecimalUtil::DecodeFromFixedLenByteArray is a I'm not sure I'm following: it looks like the next three methods are exactly the same. Couldn't you move them into a new method DecodeDecimalValue(const uint8_t* buffer, const uint8_t* buffer_end, int fixed_len_size, T* v) and then call it and return in one line here? I may be missing something though :) -- To view, visit http://gerrit.cloudera.org:8080/7822 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2c0e881045109f337fecba53fec21f9cfb9e619e Gerrit-Change-Number: 7822 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 12 Oct 2017 00:33:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4252: Min-max runtime filters for Kudu
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/7793 ) Change subject: IMPALA-4252: Min-max runtime filters for Kudu .. Patch Set 6: (17 comments) http://gerrit.cloudera.org:8080/#/c/7793/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/7793/6//COMMIT_MSG@15 PS6, Line 15: them into the Kudu client. Because the Kudu client doesn't provide a Is there any documentation for Kudu about what ordering Kudu uses for comparisons for different types? I guess we already push filters so we're already depending on the orderings being the same as Impala, but I'd be interested to know. We had a similar discussion about Parquet and the spec got clarified a lot as a result (https://github.com/apache/parquet-format/blob/master/LogicalTypes.md now specifies it). Parquet-MR also had various odd behaviour that got shaken out as a result of this. http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/exec/filter-context.h File be/src/exec/filter-context.h: http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/exec/filter-context.h@106 PS6, Line 106: bloom_filter has_bloom_filter() ? At callsites it reads like this should return a filter rather than a bool. http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/exec/filter-context.h@109 PS6, Line 109: min_max_filter has_min_max_filter? http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/runtime/coordinator.cc@1220 PS6, Line 1220: MinMaxFilter::Or(src_type(), params.min_max_filter, min_max_filter_.get()); I mentioned this in a later comment, but we should consider what happens if the min/max values are very large - maybe thre's some memory management similar to above. http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/runtime/runtime-filter-bank.cc File be/src/runtime/runtime-filter-bank.cc: http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/runtime/runtime-filter-bank.cc@235 PS6, Line 235: MinMaxFilter* min_max_filter = MinMaxFilter::Create(type, &obj_pool_); (nit) - could combine into one line. http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/runtime/runtime-filter-ir.cc File be/src/runtime/runtime-filter-ir.cc: http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/runtime/runtime-filter-ir.cc@27 PS6, Line 27: // to a) the atomicity of / pointer assignments and b) the x86 TSO memory model. Comment not relevant any more since we're using actual atomics? http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter-ir.cc File be/src/util/min-max-filter-ir.cc: http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter-ir.cc@53 PS6, Line 53: min_str = string(value->ptr, value->len); We should think through the behaviour with very large strings. It may make sense to disable the filters if the strings are too large instead of allocating arbitrarily large amounts of memory. Maybe we could also do some trickery with truncating the strings to avoid handling the filters not existing. E.g. if we have a 4 byte limit on min/max, replace min="aaa" with min="" and max="zz" with max="zzz(". Interested to hear what you think - might be too clever but seems like it could work. It would be good to also allocate tracked memory for the string. The Parquet ColumnStats class solves a similar problem using StringBuffers that allocate from a MemPool. http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.h File be/src/util/min-max-filter.h: http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.h@83 PS6, Line 83: #define NUMERIC_MIN_MAX_FILTER(NAME, TYPE) \ Yeah, the macro seems like it may be the least of the evils. http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.cc File be/src/util/min-max-filter.cc: http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.cc@77 PS6, Line 77: std::string NAME##MinMaxFilter::DebugString() const { \ nit: don't need std:: prefix in .cc files. http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.cc@139 PS6, Line 139: out->min.__set_string_val(std::min(in.min.string_val, out->min.string_val)); I feel like we should use our StringValue::Compare() function instead of relying on std::string's comparison function. I'm not sure if they are exactly the same but if we compare these the same way as we compare StringValue's it's easier to reason about. http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.cc@291 PS6, Line 291: BigIntMinMaxFilter::Or(in, out); Shouldn't this be calling the timestamp method? http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.cc@327 PS6, Line 327: } Timestamp? http://gerrit.cloudera.org:8080/#/c/7793/6/fe/src/main/java/org/apache/impala/planner/RuntimeFilt
[Impala-ASF-CR] IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.
Tim Wood has posted comments on this change. ( http://gerrit.cloudera.org:8080/8102 ) Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for Impala. .. Patch Set 17: (1 comment) http://gerrit.cloudera.org:8080/#/c/8102/17/testdata/workloads/tpcds/queries/tpcds-q26.test File testdata/workloads/tpcds/queries/tpcds-q26.test: http://gerrit.cloudera.org:8080/#/c/8102/17/testdata/workloads/tpcds/queries/tpcds-q26.test@6 PS17, Line 6: truncate(avg(cs_quantity) * 10.0) / 10 agg1, > What is the reason behind changing the query text along with the results? The query text in this file now reliably produces the results in this file, and the results match what I received as TPC-DS reference results from Greg. -- To view, visit http://gerrit.cloudera.org:8080/8102 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66 Gerrit-Change-Number: 8102 Gerrit-PatchSet: 17 Gerrit-Owner: Tim Wood Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Wood Gerrit-Comment-Date: Thu, 12 Oct 2017 00:05:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.
Tim Wood has posted comments on this change. ( http://gerrit.cloudera.org:8080/8102 ) Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for Impala. .. Patch Set 16: Passing run of gerrit-verify-dryrun-external: https://jenkins.impala.io/job/gerrit-verify-dryrun-external/17/ -- To view, visit http://gerrit.cloudera.org:8080/8102 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66 Gerrit-Change-Number: 8102 Gerrit-PatchSet: 16 Gerrit-Owner: Tim Wood Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Wood Gerrit-Comment-Date: Thu, 12 Oct 2017 00:01:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8242 ) Change subject: IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node .. IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node Since we introduced the FLAGS_is_coordinator, we've forgotten to disable the coordinator specific thread pools on nodes that have only the executor role. This patch fixes that. Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62 Reviewed-on: http://gerrit.cloudera.org:8080/8242 Reviewed-by: Dan Hecht Tested-by: Impala Public Jenkins --- M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h 2 files changed, 16 insertions(+), 6 deletions(-) Approvals: Dan Hecht: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8242 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62 Gerrit-Change-Number: 8242 Gerrit-PatchSet: 4 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8242 ) Change subject: IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8242 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62 Gerrit-Change-Number: 8242 Gerrit-PatchSet: 3 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Oct 2017 22:32:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6040: skip test multi compression types where Hive isn't supported
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8259 ) Change subject: IMPALA-6040: skip test_multi_compression_types where Hive isn't supported .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1333/ -- To view, visit http://gerrit.cloudera.org:8080/8259 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3ad4b72839f8ac3bcb824287d02dd6964eea3e3e Gerrit-Change-Number: 8259 Gerrit-PatchSet: 2 Gerrit-Owner: Michael Brown Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 11 Oct 2017 22:27:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6040: skip test multi compression types where Hive isn't supported
Michael Brown has posted comments on this change. ( http://gerrit.cloudera.org:8080/8259 ) Change subject: IMPALA-6040: skip test_multi_compression_types where Hive isn't supported .. Patch Set 2: Code-Review+2 Hit IMPALA-6027; rebased to include its fix and carry +2. -- To view, visit http://gerrit.cloudera.org:8080/8259 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3ad4b72839f8ac3bcb824287d02dd6964eea3e3e Gerrit-Change-Number: 8259 Gerrit-PatchSet: 2 Gerrit-Owner: Michael Brown Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 11 Oct 2017 22:26:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6040: skip test multi compression types where Hive isn't supported
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8259 ) Change subject: IMPALA-6040: skip test_multi_compression_types where Hive isn't supported .. Patch Set 1: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1328/ -- To view, visit http://gerrit.cloudera.org:8080/8259 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3ad4b72839f8ac3bcb824287d02dd6964eea3e3e Gerrit-Change-Number: 8259 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Brown Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 11 Oct 2017 22:15:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading
Dimitris Tsirogiannis has posted comments on this change. ( http://gerrit.cloudera.org:8080/8235 ) Change subject: IMPALA-5429: Multi threaded block metadata loading .. Patch Set 5: (22 comments) http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@215 PS5, Line 215: // Block metadata loading stats for a single HDFS path. nit: File/block (since we're also loading/refreshing files). Also, you may want to change the name of the private class to reflect that, e.g. PathMetadataLoadingStats or something like that. http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@217 PS5, Line 217: loadedFiles_ You may want to add a comment here. What is loaded vs refreshed? Is the one superset of the other. Good to clarify to avoid confusion. http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@218 PS5, Line 218: _ I believe the convention is that we don't use '_' for public members. http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@231 PS5, Line 231: Runnable I don't know how this is used later on, but alternatively you can make PathBlockMetadataLoadRequest implement Callable, hence returning the stats when calling call(). Now you seem to access the stats only through the debugString() function. http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@247 PS5, Line 247: Blocks on the loadBlockMetadata() call. Not following this comment. run() either calls refreshBlockMetadata() or loadBlockMetadata(), so I can't really interpret what this comment is saying. http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@333 PS5, Line 333: loadBlockMetadata I know I am guilty for some of these names but maybe rename this to "resetAndLoadBlockMetadata"? Then it is more clear what the differences are between this function and rerfreshBlockMetadata(). http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@363 PS5, Line 363: numUnknownDiskIds Are you overriding or incrementing the value of numUnknownDiskIds in the create() function? http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@368 PS5, Line 368: newFileDescs Hm, that doesn't seem particularly safe (i.e. using the same list for every partition). Are we certain that any other partition modification operation (e.g. alter partition set location) will not try to override the file descriptors, thereby affecting all the other partitions? http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@380 PS5, Line 380: changed :* mtime It's not just the changed mtime that we're looking for in order to determiner modification, so you may want to either remove this or mention all the criteria we use. I prefer the former. http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@383 PS5, Line 383: The initial table load still uses the listFiles() :* on the data directory that fetches both the FileStatus as well as BlockLocations in :* a single call. remove http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@398 PS5, Line 398: get(0) Comment why we take the file descriptors of one partition and that it doesn't really matter which one we choose. http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@399 PS5, Line 399: public String apply(FileDescriptor desc) { : return desc.getFileName(); : } Add @Override and make it a single line (if it fits) http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@448 PS5, Line 448: (fd == null) || (fd.getFileLength() != status.getLen()) || : (fd.getModificationTime() != status.getModificationTime()); I think we were also checking if the partition was cached. Isn't this check needed anymore? http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@455 PS5, Line 455: refreshFileMetadata Maybe call it refreshPartitionStorageMetadata? Overall, it may make sense to replace "File/Block" with "Storage" whenever it makes sense. Thoughts? http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@694 PS5, Line 694: Exception Why this change? http://gerrit.cloudera.org:8080/#/c/8235/5/fe
[Impala-ASF-CR] IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/7938 ) Change subject: IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork .. Patch Set 7: I think this is really close. Please consider adding more tests as suggested in the comments. -- To view, visit http://gerrit.cloudera.org:8080/7938 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7 Gerrit-Change-Number: 7938 Gerrit-PatchSet: 7 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Oct 2017 22:06:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/7938 ) Change subject: IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork .. Patch Set 7: (2 comments) http://gerrit.cloudera.org:8080/#/c/7938/7/be/src/rpc/thrift-server-test.cc File be/src/rpc/thrift-server-test.cc: http://gerrit.cloudera.org:8080/#/c/7938/7/be/src/rpc/thrift-server-test.cc@96 PS7, Line 96: int kdc_port = GetServerPort(); static http://gerrit.cloudera.org:8080/#/c/7938/7/be/src/rpc/thrift-server-test.cc@247 PS7, Line 247: As discussed offline, may be it'd help to test with short renewal period to exercise the renewal path too if possible. -- To view, visit http://gerrit.cloudera.org:8080/7938 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7 Gerrit-Change-Number: 7938 Gerrit-PatchSet: 7 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Oct 2017 22:06:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker errors
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8233 ) Change subject: IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker errors .. Patch Set 3: (4 comments) http://gerrit.cloudera.org:8080/#/c/8233/3/be/src/codegen/llvm-codegen.cc File be/src/codegen/llvm-codegen.cc: http://gerrit.cloudera.org:8080/#/c/8233/3/be/src/codegen/llvm-codegen.cc@1628 PS3, Line 1628: LOG(INFO) << diagnostic_handler->error_str_; Include the query id for context, which should be accessible via state_? Will be hard to track down where the log message is coming from otherwise. http://gerrit.cloudera.org:8080/#/c/8233/3/be/src/codegen/llvm-codegen.cc@1634 PS3, Line 1634: (std:: Should be able to drop the std:: - we import it into the namespace in common/names.h (there is other code in the codebase that is inconsistent about this). http://gerrit.cloudera.org:8080/#/c/8233/3/be/src/codegen/llvm-codegen.cc@1635 PS3, Line 1635: error_str_.clear(); > I might be confused about the C++11 stuff, but given that you just std::mov Yup http://gerrit.cloudera.org:8080/#/c/8233/1/tests/query_test/test_codegen.py File tests/query_test/test_codegen.py: http://gerrit.cloudera.org:8080/#/c/8233/1/tests/query_test/test_codegen.py@45 PS1, Line 45: def test_codegen_diagnostic_handler(self, vector): > I don't know LLVM well enough to suggest how to get at these errors. Yeah I think if you directly create a function or global variable with a conflicting name in the module you could induce an error that way. -- To view, visit http://gerrit.cloudera.org:8080/8233 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I907ff1f8975c02d422b4a669afbfc2e536ddf1ff Gerrit-Change-Number: 8233 Gerrit-PatchSet: 3 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Wed, 11 Oct 2017 21:51:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6027: Retry downloading toolchain components.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8258 ) Change subject: IMPALA-6027: Retry downloading toolchain components. .. IMPALA-6027: Retry downloading toolchain components. We've seen intermittent 500 errors when downloading the toolchain from S3 over the HTTPS URLs. As a first stab, this commit retries 3 times, with some jitter. I also changed the threadpool introduced previously to have a limit of 4 threads, because that's sufficient to get the speed improvement. The 500 errors have been observed both before and after the threadpool change. For testing, I ran the straight-forward case directly. I introduced a broken version string to observe that retries would happen on any error from wget. Change-Id: I7669c7d41240aa0eb43c30d5bf2bd5c01b66180b Reviewed-on: http://gerrit.cloudera.org:8080/8258 Reviewed-by: Thomas Tauber-Marshall Reviewed-by: Michael Brown Tested-by: Impala Public Jenkins --- M bin/bootstrap_toolchain.py 1 file changed, 15 insertions(+), 4 deletions(-) Approvals: Thomas Tauber-Marshall: Looks good to me, but someone else must approve Michael Brown: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8258 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I7669c7d41240aa0eb43c30d5bf2bd5c01b66180b Gerrit-Change-Number: 8258 Gerrit-PatchSet: 3 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-6027: Retry downloading toolchain components.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8258 ) Change subject: IMPALA-6027: Retry downloading toolchain components. .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8258 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7669c7d41240aa0eb43c30d5bf2bd5c01b66180b Gerrit-Change-Number: 8258 Gerrit-PatchSet: 2 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Wed, 11 Oct 2017 21:45:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker errors
anujphadke has posted comments on this change. ( http://gerrit.cloudera.org:8080/8233 ) Change subject: IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker errors .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.cc File be/src/codegen/llvm-codegen.cc: http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.cc@1621 PS2, Line 1621: DiagnosticHandler* diagnostic_handler = reinterpret_cast(context); > I dont think there is any scenario as far as how we are using this handler, You are right. I confused this with something I read about LLVMContext usage online. I don't think we need to check for a null here. -- To view, visit http://gerrit.cloudera.org:8080/8233 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I907ff1f8975c02d422b4a669afbfc2e536ddf1ff Gerrit-Change-Number: 8233 Gerrit-PatchSet: 3 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Wed, 11 Oct 2017 21:43:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/8023 ) Change subject: IMPALA-4856: Port data stream service to KRPC .. Patch Set 3: (23 comments) http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/rpc/rpc-mgr.cc File be/src/rpc/rpc-mgr.cc: http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/rpc/rpc-mgr.cc@77 PS3, Line 77: Should we add a log message stating which services we registered with KRPC? It might be useful later on as we add more services, while trying to debug user issues to know which services are on KRPC and which are on thrift. Granted there are other ways to find that, but this is easily accessible and straightforward. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h File be/src/runtime/krpc-data-stream-mgr.h: http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@314 PS3, Line 314: w susper-nit: Capital 'W' http://gerrit.cloudera.org:8080/#/c/8023/1/be/src/runtime/krpc-data-stream-mgr.cc File be/src/runtime/krpc-data-stream-mgr.cc: http://gerrit.cloudera.org:8080/#/c/8023/1/be/src/runtime/krpc-data-stream-mgr.cc@198 PS1, Line 198: deseria > Deserialization pool's purpose is to avoid executing deserialization in lin Hmm, this seems like it would be a nice thing to have. Is the absence of a MemTracker the only hindrance to early deserialization? Is there some way we could add this to the process MemTracker if we can't attribute it to a query? If it's too complicated for now, let's track this with a JIRA and write down some ideas there for the next KRPC milestone. http://gerrit.cloudera.org:8080/#/c/8023/1/be/src/runtime/krpc-data-stream-mgr.cc@227 PS1, Line 227: // to make sure that the close operation is performed so add to per-recvr list of : // pending closes. It's possible for a sender to issue EOS RPC without sending any : // rows if no rows are m > This may be a bit subtle but this is equivalent to the logic in the non-KRP Ah you're right. Case 2 is what changed in IMPALA-5199, but it looks like that's automatically fixed here. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.cc File be/src/runtime/krpc-data-stream-mgr.cc: http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.cc@112 PS3, Line 112: for (unique_ptr& ctx : early_senders.waiting_sender_ctxs) { : EnqueueRowBatch({recvr->fragment_instance_id(), move(ctx)}); : num_senders_waiting_->Increment(-1); : } : for (unique_ptr& ctx : early_senders.closed_sender_ctxs) { : recvr->RemoveSender(ctx->request->sender_id()); : Status::OK().ToProto(ctx->response->mutable_status()); : ctx->context->RespondSuccess(); : num_senders_waiting_->Increment(-1); : } It's not possible for the same sender to be in waiting_senders_ctxs and closed_sender_ctxs for a given receiver right? Because if it would, it would make more sense to service the 'closed_sender_ctxs' before the 'waiting_sender_ctxs' since we may as well close the receiver instead of wasting CPU processing those RPCs for a while. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.cc@140 PS3, Line 140: while (range.first != range.second) { : shared_ptr recvr = range.first->second; : if (recvr->fragment_instance_id() == finst_id && : recvr->dest_node_id() == node_id) { : return recvr; : } : ++range.first; : } I'm thinking it makes sense to prioritize finding the receiver with the assumption that we will find it in the receiver_map_, rather than assume that it most likely will already be unregistered. In other words, I think it may be more beneficial CPU-wise for general workloads to look in the 'receiver_map_' before looking into the 'closed_stream_cache_'. What do you think? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.cc@151 PS3, Line 151: KrpcDataStreamMgr::AddEarlySender We could merge the implementations of AddEarlySender() and AddEarlyClosedSender() by using templates and some extra params, but maybe the code complexity isn't worth it. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.cc@174 PS3, Line 174: // that it is still preparing, so add payload to per-receiver list. Add comment "In the worst case, this RPC is so late that the receiver is already unregistered and removed from the closed_stream_cache_, in which case it will be responded to by the Maintenance thread after FLAGS_datastream_sender_timeout_ms." http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.cc@242 PS3, Line 242: {
[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected()
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8255 ) Change subject: IMPALA-5940: Avoid log spew by using Status::Expected() .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1332/ -- To view, visit http://gerrit.cloudera.org:8080/8255 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3 Gerrit-Change-Number: 8255 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Oct 2017 21:34:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected()
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/8255 ) Change subject: IMPALA-5940: Avoid log spew by using Status::Expected() .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8255 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3 Gerrit-Change-Number: 8255 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Oct 2017 21:18:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected()
Hello Sailesh Mukil, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8255 to look at the new patch set (#3). Change subject: IMPALA-5940: Avoid log spew by using Status::Expected() .. IMPALA-5940: Avoid log spew by using Status::Expected() This change converts two more callers of Status() to Status::Expected() to avoid log spew and the overhead of stack crawling. These two call sites can easily fill the log when there is any network issue. Testing done: Ran concurrent TPC-DS workload with 256 users and verified the log spew is not there. Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3 --- M be/src/rpc/thrift-client.cc M be/src/service/impala-server.cc 2 files changed, 9 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/8255/3 -- To view, visit http://gerrit.cloudera.org:8080/8255 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3 Gerrit-Change-Number: 8255 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/8102 ) Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for Impala. .. Patch Set 17: (1 comment) http://gerrit.cloudera.org:8080/#/c/8102/17/testdata/workloads/tpcds/queries/tpcds-q26.test File testdata/workloads/tpcds/queries/tpcds-q26.test: http://gerrit.cloudera.org:8080/#/c/8102/17/testdata/workloads/tpcds/queries/tpcds-q26.test@6 PS17, Line 6: truncate(avg(cs_quantity) * 10.0) / 10 agg1, What is the reason behind changing the query text along with the results? -- To view, visit http://gerrit.cloudera.org:8080/8102 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66 Gerrit-Change-Number: 8102 Gerrit-PatchSet: 17 Gerrit-Owner: Tim Wood Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Wood Gerrit-Comment-Date: Wed, 11 Oct 2017 19:32:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.
Hello Matthew Mulder, Michael Brown, David Knupp, Alex Behm, Mostafa Mokhtar, Michael Ho, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8102 to look at the new patch set (#17). Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for Impala. .. IMPALA-5376: Implement all TPCDS test cases or alternates for Impala. Main source for TPCDS query and result definitions: https://github.com/gregrahn/tpcds-kit. TPC-DS v2.5.0 qualification queries from G. Rahn, Cloudera, Inc. Data set constructed in mini-cluster using $IMPALA_HOME/buildall.sh -testdata This commit continues previous work on IMPALA-5376 in the ASF Impala repo and the Cloudera Gerrit service. This commit splits multi-query tests in the TPC-DS suite definition into one query and result set per test file, as the test framework requires. Names for such files have -1, -2... inner suffixes. The portion of the TPC-DS test suite in this commit passes. It contains no failures, as reflected by runs of $IMPALA_HOME/tests/run-tests.py query_test/test_tpcds_queries.py ... IMPALA-6007 addresses the TPC-DS cases that require skipping (because we don't support them or they flap) or expected-failure (xfail, because we support them but they fail due to bugs.) These require some added tooling for non-Pytest frameworks like the stress test to avoid attempting them until they work. Tests that flap are marked to skip, with a bug ID, since they don't reliably pass or xfail. Expected result sets come from the TPC-DS kit. Some TPC-DS test cases in this commit have been modified in sematically-neutral ways so as to pass on Impala. The tests/query_test/test_tpcds_queries.py driver file is authoritative for the active/skip/xfail status for each case and a brief reason. The following list describes the current status as: --- test-name deviance from TPC-DS spec changes made --- tpcds-q22a.test RESULT MISMATCH in LSD of AVG() values Fixed AVG()s --- tpcds-q26.test RESULT MISMATCH in LSD of AVG() values -- ADDED TRUNCATE()s TO AVG()s IN SELECT COLUMNS --- tpcds-q30.test UNRECOGNIZED CHARACTER ABSENT, IMPALA-5961. --- tpcds-q31.test RESULT MISMATCH in LSD of DECIMAL values ABSENT, IMPALA-5956. --- tpcds-q35a.test RESULT MISMATCH ABSENT, IMPALA-5950. --- tpcds-q36a.test RESULT MISMATCH ABSENT, IMPALA-4741 --- tpcds-q47.test RESULT MISMATCH in LSD of DECIMAL values ADDED TRUNCATE(2) TO 8th COLUMN OF WITH TABLE, TAKE ACTUAL RESULT AS EXPECTED. --- tpcds-q48.test RESULT MISMATCH in scalar value ABSENT, IMPALA-5950. --- tpcds-q49.test RESULT MISMATCH in LSD of DECIMAL values ABSENT, IMPALA-5945 --- tpcds-q57.test RESULT MISMATCH, excess scale in DECIMAL values FIXED, ADDED TRUNCATE(2) AROUND 6th COLUMN. --- tpcds-q58.test RESULT MISMATCH in DECIMAL values ABSENT, IMPALA-5946 --- tpcds-q59.test RESULT MISMATCH, excess scale in DECIMAL values FIXED, ADDED TRUNCATE(2) AROUND 4th-10th COLUMNS. --- tpcds-q61.test RESULT MISMATCH in DECIMAL value FIXED. CAST RESULT QUOTIENT TO DECIMAL(15, 4), TAKE ACTUAL RESULT AS EXPECTED --- tpcds-q63.test RESULT MISMATCH, excess scale in DECIMAL values ADDED TRUNCATE(2) TO 3rd COLUMN --- tpcds-q64.test RESULT MISMATCH ADDED ORDER BY COLUMNS. --- tpcds-q66.test RESULT MISMATCH ABSENT, IMPALA-4741 --- tpcds-q77a.test RESULT MISMATCH FIXED. TAKE ACTUAL RESULT AS EXPECTED --- tpcds-q78.test RESULT MISMATCH FIXED. TAKE ACTUAL RESULT AS EXPECTED --- tpcds-q83.test RESULT MISMATCH ABSENT, IMPALA-5945. --- tpcds-q85.test MISSING TABLE "reason" ABSENT, IMPALA-5960 --- tpcds-q86a.test RESULT MISMATCH FIXED. TAKE ACTUAL RESULT AS EXPECTED --- tpcds-q89.test RESULT MISMATCH, DECIMAL values flap ABSENT, ADDED ROUND(2) TO 8th COLUMN, TAKE ACTUAL RESULTS AS EXPECTED, IMPALA-5956. --- tpcds-q90.test RESULT MISMATCH ABSENT, IMPALA-5945. --- tpcds-q93.test MISSING TABLE "reason" ABSENT, IMPALA-5960 --- tpcds-q98.test RESULT MISMATCH FIXED, ADDED ROUND() TO LAST COLUMN Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66 --- A testdata/workloads/tpcds/queries/tpcds-q10a.test A testdata/workloads/tpcds/queries/tpcds-q11.test A testdata/workloads/tpcds/queries/tpcds-q12.test A testdata/workloads/tpcds/queries/tpcds-q13.test A testdata/workloads/tpcds/queries/tpcds-q15.test A testdata/workloads/tpcds/queries/tpcds-q16.test A testdata/workloads/tpcds/queries/tpcds-q17.test A testdata/workloads/tpcds/queries/tpcds-q18a.test A testdata/workloads/tpcds/queries/tpcds-q20.test A testdata/workloads/tpcds/queries/tpcds-q21.test A testdata/workloads/tpcds/queries/tpcds-q22a.test D testdata/workloads/tpcds/queries/tpcds-q23-1.test D testdata/workloads/tpcds/queries/tpcds-q23-2.test A testdata/workloads/tpcds/queries/tpcds-q25.test A testdata/workloads/tpcds/queries/tpcds-q26.test D testdata/workloads/tpcds/queries/tpcds-q27.test D testdata/workloads/tpcds/queries/tpcds-q27a.test M testdata/workloads/tpcds/queries/tpcds-q28.test A testda
[Impala-ASF-CR] IMPALA-5789: Add always false flag in bloom filter
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/8170 ) Change subject: IMPALA-5789: Add always_false flag in bloom filter .. Patch Set 4: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/8170 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If680240a3cd4583fc97c3192177d86d9567c4f8d Gerrit-Change-Number: 8170 Gerrit-PatchSet: 4 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 11 Oct 2017 19:22:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/8023 ) Change subject: IMPALA-4856: Port data stream service to KRPC .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc File be/src/runtime/krpc-data-stream-sender.cc: http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@644 PS3, Line 644: for (int i = 0; i < channels_.size(); ++i) { The RowBatch is serialized once per channel which is very wasteful. IMPALA-6041. Compare to https://github.com/michaelhkw/incubator-impala/blob/krpc-testing-hung/be/src/runtime/data-stream-sender.cc#L429 -- To view, visit http://gerrit.cloudera.org:8080/8023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1 Gerrit-Change-Number: 8023 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Oct 2017 19:14:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected()
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8255 ) Change subject: IMPALA-5940: Avoid log spew by using Status::Expected() .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8255 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3 Gerrit-Change-Number: 8255 Gerrit-PatchSet: 2 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Oct 2017 19:08:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected()
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/8255 ) Change subject: IMPALA-5940: Avoid log spew by using Status::Expected() .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/8255/2/be/src/rpc/thrift-client.cc File be/src/rpc/thrift-client.cc: http://gerrit.cloudera.org:8080/#/c/8255/2/be/src/rpc/thrift-client.cc@74 PS2, Line 74: are typo: aren't -- To view, visit http://gerrit.cloudera.org:8080/8255 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3 Gerrit-Change-Number: 8255 Gerrit-PatchSet: 2 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Oct 2017 19:00:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected()
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/8255 ) Change subject: IMPALA-5940: Avoid log spew by using Status::Expected() .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/rpc/thrift-client.cc File be/src/rpc/thrift-client.cc: http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/rpc/thrift-client.cc@71 PS1, Line 71: } > I think we should make it clear why we use Expected() when we do (since we Done http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/service/impala-server.cc@1173 PS1, Line 1173: INTERNAL_ERROR > hmm, if this is expected (as the comment claims), then this shouldn't be an Fixed. Was just preserving the previous logic. -- To view, visit http://gerrit.cloudera.org:8080/8255 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3 Gerrit-Change-Number: 8255 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Oct 2017 18:59:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected()
Hello Sailesh Mukil, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8255 to look at the new patch set (#2). Change subject: IMPALA-5940: Avoid log spew by using Status::Expected() .. IMPALA-5940: Avoid log spew by using Status::Expected() This change converts two more callers of Status() to Status::Expected() to avoid log spew and the overhead of stack crawling. These two call sites can easily fill the log when there is any network issue. Testing done: Ran concurrent TPC-DS workload with 256 users and verified the log spew is not there. Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3 --- M be/src/rpc/thrift-client.cc M be/src/service/impala-server.cc 2 files changed, 9 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/8255/2 -- To view, visit http://gerrit.cloudera.org:8080/8255 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3 Gerrit-Change-Number: 8255 Gerrit-PatchSet: 2 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8235 ) Change subject: IMPALA-5429: Multi threaded block metadata loading .. Patch Set 5: (9 comments) http://gerrit.cloudera.org:8080/#/c/8235/5/be/src/catalog/catalog.cc File be/src/catalog/catalog.cc: http://gerrit.cloudera.org:8080/#/c/8235/5/be/src/catalog/catalog.cc@35 PS5, Line 35: DEFINE_int32(num_metadata_loading_threads, 16, : "(Advanced) The number of metadata loading threads (degree of parallelism) to use " : "when loading catalog metadata."); I'm confused by the commit message which talks about not loading from hms using multiple threads and this flag which indicates that hms is loaded using multiple threads. http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@217 PS5, Line 217: public int loadedFiles_ = 0; : public int refreshedFiles_ = 0; : public int ignoredFiles_ = 0; add comments for these-- see the question regarding refreshedFiles below, for example. http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@368 PS5, Line 368: for (HdfsPartition partition: partitions) partition.setFileDescriptors( Am I misreading this or does each partition get set to the same list of newly found descriptors? http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@426 PS5, Line 426: new Reference(Long.valueOf(0) why not use numUnknownDiskIds here? http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@431 PS5, Line 431: ++loadStats.refreshedFiles_; does refreshedFiles mean "file blocks reloaded" or "file checked for reload and possibly reloaded"? would be good to track how many times the if-block on L418 was entered since this method is intended to be used when few changes are present. http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@433 PS5, Line 433: for (HdfsPartition partition: partitions) partition.setFileDescriptors(n same question as in the load method. http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@773 PS5, Line 773: HDFS and S3 just to clarify, HdfsTable covers both hdfs table metadata as well as metadata needed for s3? http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@783 PS5, Line 783: getFileSystem(CONF) I noticed that this is called in many places in this class-- is it bc a given table can be stored on multiple filesystems? http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@801 PS5, Line 801: getLoadingThreadPoolSize can different partitions have different number of files? if so, work across threads may vary. what's costly here: per file call, per partition call, or number of blocks per file? -- To view, visit http://gerrit.cloudera.org:8080/8235 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481 Gerrit-Change-Number: 8235 Gerrit-PatchSet: 5 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 11 Oct 2017 18:51:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8242 ) Change subject: IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8242 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62 Gerrit-Change-Number: 8242 Gerrit-PatchSet: 3 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Oct 2017 18:31:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/8051 ) Change subject: IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals .. Patch Set 4: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/8051 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8216f4c0f100c1bd68891cd6048236bfe4c205f0 Gerrit-Change-Number: 8051 Gerrit-PatchSet: 4 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker Gerrit-Comment-Date: Wed, 11 Oct 2017 18:36:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/8051 ) Change subject: IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/8051/3/be/src/exprs/decimal-operators-ir.cc File be/src/exprs/decimal-operators-ir.cc: http://gerrit.cloudera.org:8080/#/c/8051/3/be/src/exprs/decimal-operators-ir.cc@613 PS3, Line 613: if (seconds < numeric_limits::min() || : seconds > numeric_limits::max()) { : // TimeStampVal() takes int64_t. : return TimestampVal::null(); > Good question - my guess is that the compiler is smart enough to eliminate Looks like it should, but it may be worth confirming, e.g. with the compiler explorer. -- To view, visit http://gerrit.cloudera.org:8080/8051 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8216f4c0f100c1bd68891cd6048236bfe4c205f0 Gerrit-Change-Number: 8051 Gerrit-PatchSet: 3 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker Gerrit-Comment-Date: Wed, 11 Oct 2017 18:36:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8242 ) Change subject: IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1329/ -- To view, visit http://gerrit.cloudera.org:8080/8242 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62 Gerrit-Change-Number: 8242 Gerrit-PatchSet: 3 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Oct 2017 18:31:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5736: Add impala-shell argument to set default query options
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/8038 ) Change subject: IMPALA-5736: Add impala-shell argument to set default query options .. Patch Set 4: (10 comments) Thanks! I think this is much clearer than the previous iteration. All of my comments are either nits or requests to add a few more comments. http://gerrit.cloudera.org:8080/#/c/8038/4/shell/impala_shell.py File shell/impala_shell.py: http://gerrit.cloudera.org:8080/#/c/8038/4/shell/impala_shell.py@686 PS4, Line 686: tmp_set = {} Add a comment: # Use a temporary to avoid changing set_query_options during iteration. (An alternative is to use del, but use .items() instead of .iteritems()) http://gerrit.cloudera.org:8080/#/c/8038/4/shell/impala_shell.py@1330 PS4, Line 1330: if __name__ == "__main__": Perhaps we need a big comment: """ There are two types of options: shell options and query_options. You can set these in the shell itself, which overrides settings on the command line, which override the defaults in impala_shell_config_defaults.py. Query options have no defaults within the impala-shell, but they do have defaults on the server. """ http://gerrit.cloudera.org:8080/#/c/8038/4/shell/impala_shell.py@1350 PS4, Line 1350: # default options loaded in from impala_shell_config_defaults.py Let's take the time to update this comment by disambiguating "shell options" vs "query options" here? http://gerrit.cloudera.org:8080/#/c/8038/4/shell/impala_shell.py@1354 PS4, Line 1354: impala_shell_defaults.update(loaded_shell_options) I think "impala_query_options_default" is empty, but this assymetry between impala_shell_defaults and query_options is weird. It's weird that you're updating impala_shell_defaults, rather than updating "shell_options". http://gerrit.cloudera.org:8080/#/c/8038/4/shell/impala_shell.py@1437 PS4, Line 1437: query_options.update( Add comment: Override query_options with those specified on the command line. http://gerrit.cloudera.org:8080/#/c/8038/4/shell/option_parser.py File shell/option_parser.py: http://gerrit.cloudera.org:8080/#/c/8038/4/shell/option_parser.py@36 PS4, Line 36: def convert_loaded_shell_option(option, value, default_options, config_filename): Please document config_filename. I'm unclear how it's used. http://gerrit.cloudera.org:8080/#/c/8038/4/shell/option_parser.py@47 PS4, Line 47: # if the option is not set to either true or false, use the default Do we need to log about the ignored value here? http://gerrit.cloudera.org:8080/#/c/8038/4/shell/option_parser.py@36 PS4, Line 36: def convert_loaded_shell_option(option, value, default_options, config_filename): : """Converts some values based on their type in default_options : """ : if default_options[option] in [True, False]: : # validate the option if it can only be a boolean value : # the only choice for these options is true or false : if value.lower() == "true": : return True : elif value.lower() == 'false': : return False : else: : # if the option is not set to either true or false, use the default : return default_options[option] : elif value.lower() == "none": : return None : elif option.lower() == "config_file": : return config_filename This function is mixing 2-space indent and 4-space indent. Please clean up. http://gerrit.cloudera.org:8080/#/c/8038/4/shell/option_parser.py@164 PS4, Line 164: "Only specify this as a option in the commandline." s/as a option/as an option/ http://gerrit.cloudera.org:8080/#/c/8038/4/shell/option_parser.py@183 PS4, Line 183: help="Sets default query options." Add: "May be specified multiple times." Unless it's clear from the usage? -- To view, visit http://gerrit.cloudera.org:8080/8038 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I26a3b67230c80a99bd246b6af205d558fec9a986 Gerrit-Change-Number: 8038 Gerrit-PatchSet: 4 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Wed, 11 Oct 2017 18:42:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/8242 ) Change subject: IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/8242/2/be/src/runtime/exec-env.h File be/src/runtime/exec-env.h: http://gerrit.cloudera.org:8080/#/c/8242/2/be/src/runtime/exec-env.h@186 PS2, Line 186: Thread > I think this is meant to be a thread pool just for ExecQueryFInstances RPC You're right. Done. -- To view, visit http://gerrit.cloudera.org:8080/8242 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62 Gerrit-Change-Number: 8242 Gerrit-PatchSet: 3 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Oct 2017 18:20:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node
Hello Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8242 to look at the new patch set (#3). Change subject: IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node .. IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node Since we introduced the FLAGS_is_coordinator, we've forgotten to disable the coordinator specific thread pools on nodes that have only the executor role. This patch fixes that. Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62 --- M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h 2 files changed, 16 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/8242/3 -- To view, visit http://gerrit.cloudera.org:8080/8242 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62 Gerrit-Change-Number: 8242 Gerrit-PatchSet: 3 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-6040: skip test multi compression types where Hive isn't supported
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8259 ) Change subject: IMPALA-6040: skip test_multi_compression_types where Hive isn't supported .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1328/ -- To view, visit http://gerrit.cloudera.org:8080/8259 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3ad4b72839f8ac3bcb824287d02dd6964eea3e3e Gerrit-Change-Number: 8259 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Brown Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 11 Oct 2017 18:16:29 + Gerrit-HasComments: No
[Impala-ASF-CR] Allow the SASL protocol service name to be configurable
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/8230 ) Change subject: Allow the SASL protocol service name to be configurable .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8230/1/be/src/kudu/rpc/client_negotiation.cc File be/src/kudu/rpc/client_negotiation.cc: http://gerrit.cloudera.org:8080/#/c/8230/1/be/src/kudu/rpc/client_negotiation.cc@111 PS1, Line 111: const boost::optional& authn_token, > Should we just take that patch to make (this and) future cherry picks easie That patch is a tidy cleanup that adds the 'modernize-pass-by-value' rule, which we don't use ourselves. Looking over the patch, it looks like not having it could cause a few more conflicts, but they can all be trivially resolved just as I did here. Also, we will keep running into this issue of cherry-picked patches that conflict due to not having the entire history. So it's something we need to solve on a case by case basis. My take is, let's only cherry-pick the patches that we really need. So as for this patch, since it doesn't help us a lot, I think we can leave it out. -- To view, visit http://gerrit.cloudera.org:8080/8230 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9e30fe4461893b67527333259579e2304b19af1e Gerrit-Change-Number: 8230 Gerrit-PatchSet: 1 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Oct 2017 18:12:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8242 ) Change subject: IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/8242/2/be/src/runtime/exec-env.h File be/src/runtime/exec-env.h: http://gerrit.cloudera.org:8080/#/c/8242/2/be/src/runtime/exec-env.h@186 PS2, Line 186: General I think this is meant to be a thread pool just for ExecQueryFInstances RPC (and that's what's meant by the "exec_rpc" in the name). -- To view, visit http://gerrit.cloudera.org:8080/8242 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62 Gerrit-Change-Number: 8242 Gerrit-PatchSet: 2 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Oct 2017 18:06:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6040: skip test multi compression types where Hive isn't supported
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8259 ) Change subject: IMPALA-6040: skip test_multi_compression_types where Hive isn't supported .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8259 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3ad4b72839f8ac3bcb824287d02dd6964eea3e3e Gerrit-Change-Number: 8259 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Brown Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 11 Oct 2017 18:03:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/8242 ) Change subject: IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node .. Patch Set 2: > It'd be good to update the header comment for these fields to say > that they are set only for coordinator role. Done. -- To view, visit http://gerrit.cloudera.org:8080/8242 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62 Gerrit-Change-Number: 8242 Gerrit-PatchSet: 2 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Oct 2017 18:01:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node
Hello Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8242 to look at the new patch set (#2). Change subject: IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node .. IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node Since we introduced the FLAGS_is_coordinator, we've forgotten to disable the coordinator specific thread pools on nodes that have only the executor role. This patch fixes that. Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62 --- M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h 2 files changed, 16 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/8242/2 -- To view, visit http://gerrit.cloudera.org:8080/8242 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62 Gerrit-Change-Number: 8242 Gerrit-PatchSet: 2 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht
[Impala-ASF-CR] Allow the SASL protocol service name to be configurable
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8230 ) Change subject: Allow the SASL protocol service name to be configurable .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8230/1/be/src/kudu/rpc/client_negotiation.cc File be/src/kudu/rpc/client_negotiation.cc: http://gerrit.cloudera.org:8080/#/c/8230/1/be/src/kudu/rpc/client_negotiation.cc@111 PS1, Line 111: const boost::optional& authn_token, > There was a conflict here since we don't have this patch from the kudu code Should we just take that patch to make (this and) future cherry picks easier? -- To view, visit http://gerrit.cloudera.org:8080/8230 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9e30fe4461893b67527333259579e2304b19af1e Gerrit-Change-Number: 8230 Gerrit-PatchSet: 1 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Oct 2017 17:57:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Allow the SASL protocol service name to be configurable
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/8230 ) Change subject: Allow the SASL protocol service name to be configurable .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8230/1/be/src/kudu/rpc/server_negotiation.cc File be/src/kudu/rpc/server_negotiation.cc: http://gerrit.cloudera.org:8080/#/c/8230/1/be/src/kudu/rpc/server_negotiation.cc@128 PS1, Line 128: const security::TokenVerifier* token_verifier, > Similarly, there was a conflict here since we don't have that same patch fr Oops, sorry this wasn't a conflict. I misspoke. It was just in client_negotiation. -- To view, visit http://gerrit.cloudera.org:8080/8230 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9e30fe4461893b67527333259579e2304b19af1e Gerrit-Change-Number: 8230 Gerrit-PatchSet: 1 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Oct 2017 17:55:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Allow the SASL protocol service name to be configurable
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/8230 ) Change subject: Allow the SASL protocol service name to be configurable .. Patch Set 1: (2 comments) > Is this a conflict free cherry-pick from something already in Kudu > codebase? Yes this is a cherry-pick from the following kudu commit: https://github.com/apache/kudu/commit/31d58522b50aea948f977c7cbc8b64f1b849f323 There were a couple of conflicts, but with simple resolutions. http://gerrit.cloudera.org:8080/#/c/8230/1/be/src/kudu/rpc/client_negotiation.cc File be/src/kudu/rpc/client_negotiation.cc: http://gerrit.cloudera.org:8080/#/c/8230/1/be/src/kudu/rpc/client_negotiation.cc@111 PS1, Line 111: const boost::optional& authn_token, There was a conflict here since we don't have this patch from the kudu code base: https://github.com/apache/kudu/commit/50c7d3249ab5ca19fb4d3c0c8748a4a1c5945a12#diff-ced439e8f22cae7f007af6cb1daf945a The above patch changes removes the const ref on this variable. But the resolution was simple. http://gerrit.cloudera.org:8080/#/c/8230/1/be/src/kudu/rpc/server_negotiation.cc File be/src/kudu/rpc/server_negotiation.cc: http://gerrit.cloudera.org:8080/#/c/8230/1/be/src/kudu/rpc/server_negotiation.cc@128 PS1, Line 128: const security::TokenVerifier* token_verifier, Similarly, there was a conflict here since we don't have that same patch from the kudu code base: https://github.com/apache/kudu/commit/50c7d3249ab5ca19fb4d3c0c8748a4a1c5945a12#diff-ced439e8f22cae7f007af6cb1daf945a The above patch changes removes the const ref on this variable. But the resolution was simple. -- To view, visit http://gerrit.cloudera.org:8080/8230 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9e30fe4461893b67527333259579e2304b19af1e Gerrit-Change-Number: 8230 Gerrit-PatchSet: 1 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Oct 2017 17:54:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8242 ) Change subject: IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node .. Patch Set 1: It'd be good to update the header comment for these fields to say that they are set only for coordinator role. -- To view, visit http://gerrit.cloudera.org:8080/8242 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62 Gerrit-Change-Number: 8242 Gerrit-PatchSet: 1 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Comment-Date: Wed, 11 Oct 2017 17:53:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6027: Retry downloading toolchain components.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8258 ) Change subject: IMPALA-6027: Retry downloading toolchain components. .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1327/ -- To view, visit http://gerrit.cloudera.org:8080/8258 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7669c7d41240aa0eb43c30d5bf2bd5c01b66180b Gerrit-Change-Number: 8258 Gerrit-PatchSet: 2 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Wed, 11 Oct 2017 17:47:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker errors
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/8233 ) Change subject: IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker errors .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/8233/3/be/src/codegen/llvm-codegen.cc File be/src/codegen/llvm-codegen.cc: http://gerrit.cloudera.org:8080/#/c/8233/3/be/src/codegen/llvm-codegen.cc@308 PS3, Line 308: } nit: This might look like "... to main module.Bla bla." I.e., no space after the period. You might want to add '<< " "' in here to make it look nicer. http://gerrit.cloudera.org:8080/#/c/8233/3/be/src/codegen/llvm-codegen.cc@1635 PS3, Line 1635: I might be confused about the C++11 stuff, but given that you just std::move'd it, I don't think you need to clear it. http://gerrit.cloudera.org:8080/#/c/8233/1/tests/query_test/test_codegen.py File tests/query_test/test_codegen.py: http://gerrit.cloudera.org:8080/#/c/8233/1/tests/query_test/test_codegen.py@45 PS1, Line 45: def test_codegen_diagnostic_handler(self, vector): > I tried moving this to llvm-codegen-test.cc but it turns out that a lot ug I don't know LLVM well enough to suggest how to get at these errors. That said, LoadModuleFromFile() takes a regular file, I think, and not one on HDFS. You might be able to just cp ./llvm-ir/test-loop.bc to a temporary file to induce this. (Or create a second test-loop that has a link-time conflict: presumably name collisions cause errors?) -- To view, visit http://gerrit.cloudera.org:8080/8233 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I907ff1f8975c02d422b4a669afbfc2e536ddf1ff Gerrit-Change-Number: 8233 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Wed, 11 Oct 2017 17:43:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Allow the SASL protocol service name to be configurable
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8230 ) Change subject: Allow the SASL protocol service name to be configurable .. Patch Set 1: Is this a conflict free cherry-pick from something already in Kudu codebase? -- To view, visit http://gerrit.cloudera.org:8080/8230 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9e30fe4461893b67527333259579e2304b19af1e Gerrit-Change-Number: 8230 Gerrit-PatchSet: 1 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Oct 2017 17:44:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected()
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8255 ) Change subject: IMPALA-5940: Avoid log spew by using Status::Expected() .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/rpc/thrift-client.cc File be/src/rpc/thrift-client.cc: http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/rpc/thrift-client.cc@71 PS1, Line 71: } I think we should make it clear why we use Expected() when we do (since we don't want to normally use it). So how about a quick comment about that explaining under what condition this error is expected. http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/service/impala-server.cc@1173 PS1, Line 1173: INTERNAL_ERROR hmm, if this is expected (as the comment claims), then this shouldn't be an INTERNAL_ERROR. INTERNAL_ERROR means we've hit a bug of some sort (i.e. some invariant was violated). That's not this case. Anyway, you don't have to fix that now if you don't want. -- To view, visit http://gerrit.cloudera.org:8080/8255 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3 Gerrit-Change-Number: 8255 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Oct 2017 17:39:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6040: skip test multi compression types where Hive isn't supported
Michael Brown has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8259 Change subject: IMPALA-6040: skip test_multi_compression_types where Hive isn't supported .. IMPALA-6040: skip test_multi_compression_types where Hive isn't supported A recent commit "IMPALA-5448: fix invalid number of splits reported in Parquet scan node" neglected to account for the fact that in some environments, Impala runs without Hive. The typical pattern for tests that use Hive is skip them if they are executed against such environments. Change-Id: I3ad4b72839f8ac3bcb824287d02dd6964eea3e3e --- M tests/query_test/test_scanners.py 1 file changed, 4 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/8259/1 -- To view, visit http://gerrit.cloudera.org:8080/8259 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I3ad4b72839f8ac3bcb824287d02dd6964eea3e3e Gerrit-Change-Number: 8259 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Brown
[Impala-ASF-CR] IMPALA-2758: Change BufferedTupleStream::GetRows to returning multi batches
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8226 ) Change subject: IMPALA-2758: Change BufferedTupleStream::GetRows to returning multi batches .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8226/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8226/1//COMMIT_MSG@10 PS1, Line 10: which returns multiple batches instead of one. > I'm ok with the stream staying pinned, the main goal was to remove the GetR WFM, agree the important step here is to eliminate GetRows(). -- To view, visit http://gerrit.cloudera.org:8080/8226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3831c38994da2b69775a9809ff01de5d23584414 Gerrit-Change-Number: 8226 Gerrit-PatchSet: 1 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 11 Oct 2017 16:55:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6027: Retry downloading toolchain components.
Michael Brown has posted comments on this change. ( http://gerrit.cloudera.org:8080/8258 ) Change subject: IMPALA-6027: Retry downloading toolchain components. .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8258 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7669c7d41240aa0eb43c30d5bf2bd5c01b66180b Gerrit-Change-Number: 8258 Gerrit-PatchSet: 2 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Wed, 11 Oct 2017 16:53:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6027: Retry downloading toolchain components.
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/8258 ) Change subject: IMPALA-6027: Retry downloading toolchain components. .. Patch Set 2: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/8258 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7669c7d41240aa0eb43c30d5bf2bd5c01b66180b Gerrit-Change-Number: 8258 Gerrit-PatchSet: 2 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Wed, 11 Oct 2017 16:48:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6027: Retry downloading toolchain components.
Philip Zeyliger has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/8258 ) Change subject: IMPALA-6027: Retry downloading toolchain components. .. IMPALA-6027: Retry downloading toolchain components. We've seen intermittent 500 errors when downloading the toolchain from S3 over the HTTPS URLs. As a first stab, this commit retries 3 times, with some jitter. I also changed the threadpool introduced previously to have a limit of 4 threads, because that's sufficient to get the speed improvement. The 500 errors have been observed both before and after the threadpool change. For testing, I ran the straight-forward case directly. I introduced a broken version string to observe that retries would happen on any error from wget. Change-Id: I7669c7d41240aa0eb43c30d5bf2bd5c01b66180b --- M bin/bootstrap_toolchain.py 1 file changed, 15 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/8258/2 -- To view, visit http://gerrit.cloudera.org:8080/8258 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7669c7d41240aa0eb43c30d5bf2bd5c01b66180b Gerrit-Change-Number: 8258 Gerrit-PatchSet: 2 Gerrit-Owner: Philip Zeyliger
[Impala-ASF-CR] IMPALA-6027: Retry downloading toolchain components.
Philip Zeyliger has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8258 Change subject: IMPALA-6027: Retry downloading toolchain components. .. IMPALA-6027: Retry downloading toolchain components. We've seen intermittent 500 errors when downloading the toolchain from S3 over the HTTPS URLs. As a first stab, this commit retries 3 times, with some jitter. I also changed the threadpool introduced previously to have a limit of 4 threads, because that's sufficient to get the speed improvement. The 500 errors have been observed both before and after the threadpool change. For testing, I ran the straight-forward case directly. I introduced a broken version string to observe that retries would happen on any error from wget. Change-Id: I7669c7d41240aa0eb43c30d5bf2bd5c01b66180b --- M bin/bootstrap_toolchain.py 1 file changed, 15 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/8258/1 -- To view, visit http://gerrit.cloudera.org:8080/8258 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I7669c7d41240aa0eb43c30d5bf2bd5c01b66180b Gerrit-Change-Number: 8258 Gerrit-PatchSet: 1 Gerrit-Owner: Philip Zeyliger
[Impala-ASF-CR] IMPALA-4704: Disallow client connections to imapalad until catalog is received.
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8202 ) Change subject: IMPALA-4704: Disallow client connections to imapalad until catalog is received. .. Patch Set 7: A Frontend test was asserting an error message removed by this change. Latest patch fixes that test. All tests pass with this latest patch. -- To view, visit http://gerrit.cloudera.org:8080/8202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6 Gerrit-Change-Number: 8202 Gerrit-PatchSet: 7 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 11 Oct 2017 16:33:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4704: Disallow client connections to imapalad until catalog is received.
Hello Philip Zeyliger, Balazs Jeszenszky, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8202 to look at the new patch set (#7). Change subject: IMPALA-4704: Disallow client connections to imapalad until catalog is received. .. IMPALA-4704: Disallow client connections to imapalad until catalog is received. Currently, impalad starts beeswax and hs2 servers even if the catalog has not yet been received. As a result, client connections see an error message stating that the impalad is not yet ready. This patch changes the impalad startup sequence to wait until the catalog is received before opening beeswax and hs2 ports and starting their servers. Testing: - python e2e tests that start a cluster without a catalog and check that client connections are rejected as expected. Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/testutil/in-process-servers.cc M bin/start-impala-cluster.py M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java M fe/src/test/java/org/apache/impala/service/FrontendTest.java M tests/common/custom_cluster_test_suite.py M tests/common/impala_cluster.py M tests/common/impala_service.py A tests/custom_cluster/test_catalog_wait.py M tests/custom_cluster/test_coordinators.py 13 files changed, 157 insertions(+), 35 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/8202/7 -- To view, visit http://gerrit.cloudera.org:8080/8202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6 Gerrit-Change-Number: 8202 Gerrit-PatchSet: 7 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/8051 ) Change subject: IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/8051/3/be/src/exprs/decimal-operators-ir.cc File be/src/exprs/decimal-operators-ir.cc: http://gerrit.cloudera.org:8080/#/c/8051/3/be/src/exprs/decimal-operators-ir.cc@613 PS3, Line 613: if (seconds < numeric_limits::min() || : seconds > numeric_limits::max()) { : // TimeStampVal() takes int64_t. : return TimestampVal::null(); > Is this branch now also evaluated for Decimal[4/8]Values? If so, will it ha Good question - my guess is that the compiler is smart enough to eliminate this branch for 4/8. http://gerrit.cloudera.org:8080/#/c/8051/3/be/src/exprs/decimal-operators.h File be/src/exprs/decimal-operators.h: http://gerrit.cloudera.org:8080/#/c/8051/3/be/src/exprs/decimal-operators.h@166 PS3, Line 166: // > nit: Keep comment formatting consistent with the rest of the file. Done http://gerrit.cloudera.org:8080/#/c/8051/3/be/src/exprs/decimal-operators.h@168 PS3, Line 168: TDecimal > Did you follow some other example h No, I just wanted to highlight that this T is a kind of decimal. -- To view, visit http://gerrit.cloudera.org:8080/8051 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8216f4c0f100c1bd68891cd6048236bfe4c205f0 Gerrit-Change-Number: 8051 Gerrit-PatchSet: 3 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker Gerrit-Comment-Date: Wed, 11 Oct 2017 15:44:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals
Hello Lars Volker, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8051 to look at the new patch set (#4). Change subject: IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals .. IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals The timestamp conversion from negative fractional Decimal types (interpreted as unix timestamp) was wrong - the whole part was rounded toward zero, and fractional part was being added instead of being subtracted. This is fixed by subtracting the fractional part in case of negative decimals. Example for the wrong behaviour: +-+ | cast(cast(-0.1 as decimal(18,10)) as timestamp) | +-+ | 1970-01-01 00:00:00.1 | +-+ while casting to double works correctly: +-+ | cast(cast(-0.1 as double) as timestamp) | +-+ | 1969-12-31 23:59:59.9 | +-+ Change-Id: I8216f4c0f100c1bd68891cd6048236bfe4c205f0 --- M be/src/exprs/decimal-operators-ir.cc M be/src/exprs/decimal-operators.h M be/src/exprs/expr-test.cc M be/src/runtime/decimal-value.h 4 files changed, 44 insertions(+), 36 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/8051/4 -- To view, visit http://gerrit.cloudera.org:8080/8051 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8216f4c0f100c1bd68891cd6048236bfe4c205f0 Gerrit-Change-Number: 8051 Gerrit-PatchSet: 4 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker
[Impala-ASF-CR] IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/8051 ) Change subject: IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/8051/3/be/src/exprs/decimal-operators-ir.cc File be/src/exprs/decimal-operators-ir.cc: http://gerrit.cloudera.org:8080/#/c/8051/3/be/src/exprs/decimal-operators-ir.cc@613 PS3, Line 613: if (seconds < numeric_limits::min() || : seconds > numeric_limits::max()) { : // TimeStampVal() takes int64_t. : return TimestampVal::null(); Is this branch now also evaluated for Decimal[4/8]Values? If so, will it have a perf impact? http://gerrit.cloudera.org:8080/#/c/8051/3/be/src/exprs/decimal-operators.h File be/src/exprs/decimal-operators.h: http://gerrit.cloudera.org:8080/#/c/8051/3/be/src/exprs/decimal-operators.h@166 PS3, Line 166: // nit: Keep comment formatting consistent with the rest of the file. http://gerrit.cloudera.org:8080/#/c/8051/3/be/src/exprs/decimal-operators.h@168 PS3, Line 168: TDecimal This naming convention usually indicates Thrift structures throughout the codebase. Did you follow some other example here? Otherwise "const T& decimal_value" seems more consistent. -- To view, visit http://gerrit.cloudera.org:8080/8051 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8216f4c0f100c1bd68891cd6048236bfe4c205f0 Gerrit-Change-Number: 8051 Gerrit-PatchSet: 3 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker Gerrit-Comment-Date: Wed, 11 Oct 2017 15:21:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/8051 ) Change subject: IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals .. Patch Set 2: (6 comments) http://gerrit.cloudera.org:8080/#/c/8051/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8051/2//COMMIT_MSG@9 PS2, Line 9: The timestamp conversion from negative fractional Decimal types : (interpreted as unix timestamp) was wrong - the whole part : was rounded toward zero, and fractional part was being added : instead of being subtracted. > The commit message should include a brief description of how change fixes i Done http://gerrit.cloudera.org:8080/#/c/8051/2//COMMIT_MSG@14 PS2, Line 14: Example for the wrong behaivour: > Spelling. Please consider setting up an automatic spell checker in your tex I am still trying different linux text editors. http://gerrit.cloudera.org:8080/#/c/8051/2/be/src/exprs/decimal-operators-ir.cc File be/src/exprs/decimal-operators-ir.cc: http://gerrit.cloudera.org:8080/#/c/8051/2/be/src/exprs/decimal-operators-ir.cc@617 PS2, Line 617: if(dv.is_negative()) nanoseconds *= -1; > I think it might be easier to read if you extract the sign in in line 610 a I refactored this part a bit, now this "if" is no longer duplicated. http://gerrit.cloudera.org:8080/#/c/8051/2/be/src/exprs/decimal-operators-ir.cc@625 PS2, Line 625: int64_t > why is this 64bit? You are right, a 32 bit int is always enough for the nanoseconds part. I have changed to code accordingly. http://gerrit.cloudera.org:8080/#/c/8051/2/be/src/exprs/decimal-operators-ir.cc@640 PS2, Line 640: int128_t > why is this 128bit? Done http://gerrit.cloudera.org:8080/#/c/8051/2/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/8051/2/be/src/exprs/expr-test.cc@2363 PS2, Line 2363: TestTimestampValue("cast(cast(-123.45 as decimal(9,2)) as timestamp)", > If you include a negative example in the commit message, it should also be I have added a very similar test and changed the commit message to contain the same example. The new test is useful because there are more than 9 fractional digits, which exercises a different branch in DecimalOperators::ConvertToNanoseconds. -- To view, visit http://gerrit.cloudera.org:8080/8051 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8216f4c0f100c1bd68891cd6048236bfe4c205f0 Gerrit-Change-Number: 8051 Gerrit-PatchSet: 2 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker Gerrit-Comment-Date: Wed, 11 Oct 2017 14:43:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals
Hello Lars Volker, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8051 to look at the new patch set (#3). Change subject: IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals .. IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals The timestamp conversion from negative fractional Decimal types (interpreted as unix timestamp) was wrong - the whole part was rounded toward zero, and fractional part was being added instead of being subtracted. This is fixed by subtracting the fractional part in case of negative decimals. Example for the wrong behaviour: +-+ | cast(cast(-0.1 as decimal(18,10)) as timestamp) | +-+ | 1970-01-01 00:00:00.1 | +-+ while casting to double works correctly: +-+ | cast(cast(-0.1 as double) as timestamp) | +-+ | 1969-12-31 23:59:59.9 | +-+ Change-Id: I8216f4c0f100c1bd68891cd6048236bfe4c205f0 --- M be/src/exprs/decimal-operators-ir.cc M be/src/exprs/decimal-operators.h M be/src/exprs/expr-test.cc M be/src/runtime/decimal-value.h 4 files changed, 43 insertions(+), 36 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/8051/3 -- To view, visit http://gerrit.cloudera.org:8080/8051 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8216f4c0f100c1bd68891cd6048236bfe4c205f0 Gerrit-Change-Number: 8051 Gerrit-PatchSet: 3 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Lars Volker