[Impala-ASF-CR] IMPALA-7051: Serialize Maven invocations.
Philip Zeyliger has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10460 Change subject: IMPALA-7051: Serialize Maven invocations. .. IMPALA-7051: Serialize Maven invocations. I've observed some rare cases where Impala fails to build. I believe it's because two Maven targets (yarn-extras and ext-data-source) are being executed simultaneously. Maven's handling of ~/.m2/repository, for example, is known to be not safe. This patch serializes the Maven builds with the following dependency graph: fe -> yarn-extras -> ext-data-source -> impala-parent The ordering of yarn-extras -> ext-data-source is arbitrary. I decided that this artificial dependency was the clearest way to prevent parallel executions. Having mvn-quiet.sh take a lock seemed considerably more complex. Change-Id: Ie24f34f421bc7dcf9140938464d43400da95275e --- M common/yarn-extras/CMakeLists.txt 1 file changed, 4 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/10460/1 -- To view, visit http://gerrit.cloudera.org:8080/10460 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ie24f34f421bc7dcf9140938464d43400da95275e Gerrit-Change-Number: 10460 Gerrit-PatchSet: 1 Gerrit-Owner: Philip Zeyliger
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8523 ) Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. Patch Set 14: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 14 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Sat, 19 May 2018 03:02:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6941: load more text scanner compression plugins
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10165 ) Change subject: IMPALA-6941: load more text scanner compression plugins .. Patch Set 13: This change did not cherrypick successfully into branch 2.x. To resolve this, please do the cherry-pick manually and submit it to Gerrit at refs/for/2.x or add an exception to the branch 2.x copy of bin/ignored_commits.json. Thanks, your friendly bot at https://jenkins.impala.io/job/cherrypick-2.x-and-test/518/ . -- To view, visit http://gerrit.cloudera.org:8080/10165 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6 Gerrit-Change-Number: 10165 Gerrit-PatchSet: 13 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Sat, 19 May 2018 01:57:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6317: Add -cmake only option to buildall.sh
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10455 ) Change subject: IMPALA-6317: Add -cmake_only option to buildall.sh .. Patch Set 3: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2511/ -- To view, visit http://gerrit.cloudera.org:8080/10455 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If31a4e29425a6a20059cba2f43b72e4fb908018f Gerrit-Change-Number: 10455 Gerrit-PatchSet: 3 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 19 May 2018 01:50:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Hello Lars Volker, Dimitris Tsirogiannis, Alex Behm, Mostafa Mokhtar, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8523 to look at the new patch set (#14). Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. IMPALA-5931: Generates scan ranges in planner for s3/adls Currently, for filesystems that do not include physical block information (e.g., block replica locations, caching), synthetic blocks are generated and stored in the catalog when metadata is loaded. Example file systems for which this is done includes S3, ADLS, and local fs. This change avoids generating these blocks when metadata is loaded. Instead, scan ranges are directly generated from such files by the backend coordinator. Previously, all scan ranges were produced by the planner in HDFSScanNode in the frontend. Now, those files without block information are sent to the coordinator represented by a split specification that determines how the coordinator will create scan ranges to send to executors. This change reduces the space needed in the catalog and reduces the scan range data structures that are passed from the frontend to the backend when planning and coordinating a query. In addition a bug is avoided where non-splittable files were being split anyways to support the query parameter that places a limit on scan ranges. Testing: - added backend scheduler tests - mixed-filesystems test covers tables/queries with multiple fs's. - local fs tests cover the code paths in this change - all core tests pass when configured with s3 - manually tried larger local filesystem tables (tpch) with multiple partitions and observed the same scan ranges. - TODO: adls testing Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 --- M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler-test-util.cc M be/src/scheduling/scheduler-test-util.h M be/src/scheduling/scheduler-test.cc M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/util/CMakeLists.txt A be/src/util/flat_buffer.cc A be/src/util/flat_buffer.h M common/thrift/Frontend.thrift M common/thrift/PlanNodes.thrift M common/thrift/Planner.thrift M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java 21 files changed, 676 insertions(+), 241 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8523/14 -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 14 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8523 ) Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. Patch Set 13: (4 comments) http://gerrit.cloudera.org:8080/#/c/8523/13/be/src/scheduling/scheduler.cc File be/src/scheduling/scheduler.cc: http://gerrit.cloudera.org:8080/#/c/8523/13/be/src/scheduling/scheduler.cc@302 PS13, Line 302: for (const TScanRangeLocationList& range : entry.second.concrete_range) { : expanded_locations.push_back(range); : } > that could be just expanded_locations.insert(concrete_range.begin(), concre Done http://gerrit.cloudera.org:8080/#/c/8523/13/common/thrift/Planner.thrift File common/thrift/Planner.thrift: http://gerrit.cloudera.org:8080/#/c/8523/13/common/thrift/Planner.thrift@108 PS13, Line 108: concrete_range > plural since the field is a list: concrete_ranges, split_specs Done http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java File fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java: http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java@224 PS13, Line 224: scanRangeSpecs_.getSplit_specSize() > this seems wrong - a single spec may result in multiple hosts, no? removed. this class never adds specs, but added a precondition check in init in the event that's changed for some reason. http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1126 PS13, Line 1126: scanRangeSpecs_.getSplit_specSize(); > shouldn't that do some calculation based on file size and blocks size and i good catch. needed to change other places that directly use the size of the spec list. -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 13 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Sat, 19 May 2018 01:39:32 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7039: Ignore the port in HBase planner tests
Taras Bobrovytsky has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10459 Change subject: IMPALA-7039: Ignore the port in HBase planner tests .. IMPALA-7039: Ignore the port in HBase planner tests Before this patch, we used to check the HBase port in the HBase planner tests. This caused a failure when HBase was running on a different port than expected. We fix the problem in this patch by not checking the HBase port. Testing: ran the FE tests and they passed. Change-Id: I8eb7628061b2ebaf84323b37424925e9a64f70a0 --- M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java M testdata/workloads/functional-planner/queries/PlannerTest/hbase.test M testdata/workloads/functional-planner/queries/PlannerTest/joins.test 3 files changed, 20 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/10459/1 -- To view, visit http://gerrit.cloudera.org:8080/10459 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I8eb7628061b2ebaf84323b37424925e9a64f70a0 Gerrit-Change-Number: 10459 Gerrit-PatchSet: 1 Gerrit-Owner: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-7050: [DOCS] Document the max serialized incremental stat size setting
Hello Balazs Jeszenszky, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10457 to look at the new patch set (#2). Change subject: IMPALA-7050: [DOCS] Document the max serialized incremental stat size setting .. IMPALA-7050: [DOCS] Document the max serialized incremental stat size setting Change-Id: Ifa80325f0008d42a9cc8178e7c144fc2b49d7d4e --- M docs/topics/impala_perf_stats.xml M docs/topics/impala_shell_options.xml 2 files changed, 31 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/10457/2 -- To view, visit http://gerrit.cloudera.org:8080/10457 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifa80325f0008d42a9cc8178e7c144fc2b49d7d4e Gerrit-Change-Number: 10457 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-7050: [DOCS] Document the max serialized incremental stat size setting
Alex Rodoni has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10457 Change subject: IMPALA-7050: [DOCS] Document the max serialized incremental stat size setting .. IMPALA-7050: [DOCS] Document the max serialized incremental stat size setting Change-Id: Ifa80325f0008d42a9cc8178e7c144fc2b49d7d4e --- M docs/topics/impala_perf_stats.xml 1 file changed, 23 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/10457/1 -- To view, visit http://gerrit.cloudera.org:8080/10457 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ifa80325f0008d42a9cc8178e7c144fc2b49d7d4e Gerrit-Change-Number: 10457 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni
[Impala-ASF-CR] IMPALA-6070: Adding ASAN, --tail to test-with-docker.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10319 ) Change subject: IMPALA-6070: Adding ASAN, --tail to test-with-docker. .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10319 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I51451cdf1352fc0f9516d729b9a77700488d993f Gerrit-Change-Number: 10319 Gerrit-PatchSet: 4 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Sat, 19 May 2018 00:37:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6070: Adding ASAN, --tail to test-with-docker.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10319 ) Change subject: IMPALA-6070: Adding ASAN, --tail to test-with-docker. .. IMPALA-6070: Adding ASAN, --tail to test-with-docker. * Adds -ASAN suites to test-with-docker. * Adds --tail flag, which starts a tail subprocess. This isn't pretty (there's potential for overlap), but it's a dead simple way to keep an eye on what's going on. * Fixes a bug wherein I could call "docker rm " twice simultaneously, which would make Docker fail the second call, and then fail the related "docker rmi". It's better to serialize, and I did that with a simple lock. Change-Id: I51451cdf1352fc0f9516d729b9a77700488d993f Reviewed-on: http://gerrit.cloudera.org:8080/10319 Reviewed-by: Joe McDonnell Tested-by: Impala Public Jenkins --- M docker/entrypoint.sh M docker/test-with-docker.py 2 files changed, 48 insertions(+), 7 deletions(-) Approvals: Joe McDonnell: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/10319 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I51451cdf1352fc0f9516d729b9a77700488d993f Gerrit-Change-Number: 10319 Gerrit-PatchSet: 5 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR] IMPALA-6035: Add query options to limit thread reservation
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/10365 ) Change subject: IMPALA-6035: Add query options to limit thread reservation .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/10365/6/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/10365/6/be/src/scheduling/admission-controller.cc@419 PS6, Line 419: DCHECK_GT(e.second.instance_params.size(), 0); after switch to using "e.first" instead of "e.second.instance_params[0]->host" I think we can get rid of this DCHECK. Or was this used for another reason? http://gerrit.cloudera.org:8080/#/c/10365/4/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/10365/4/be/src/service/query-options.cc@672 PS4, Line 672: Only -1 > Ideally no, but I wanted to keep it consistent with other options that beha makes sense. Lets document the -1 option in the thrift declaration too then -- To view, visit http://gerrit.cloudera.org:8080/10365 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5b5bbbdad5cd6b24442eb6c99a4d38c2ad710007 Gerrit-Change-Number: 10365 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 19 May 2018 00:34:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Moving default sanitizer options into init.cc from shell scripts.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10404 ) Change subject: Moving default sanitizer options into init.cc from shell scripts. .. Moving default sanitizer options into init.cc from shell scripts. When running tests with ASAN, you need to set ASAN_OPTIONS explicitly, to avoid various failures. In particular, backend tests fail complaining about memory leaks and tests that use the parquet-reader binary complain similarly. It turns out that we can shove the default options into our code base directly, avoiding the need for users to set it explicitly. I've done the same thing for TSAN and UBSAN. I've manually checked that these are being read. In the UBSAN case, I checked both with gdb and with inotifywatch on the suppressions file. Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b Reviewed-on: http://gerrit.cloudera.org:8080/10404 Reviewed-by: Jim Apple Tested-by: Impala Public Jenkins --- M be/CMakeLists.txt M be/src/common/init.cc M bin/run-backend-tests.sh M bin/start-catalogd.sh M bin/start-impalad.sh M bin/start-statestored.sh 6 files changed, 28 insertions(+), 16 deletions(-) Approvals: Jim Apple: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/10404 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b Gerrit-Change-Number: 10404 Gerrit-PatchSet: 7 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] Moving default sanitizer options into init.cc from shell scripts.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10404 ) Change subject: Moving default sanitizer options into init.cc from shell scripts. .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10404 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b Gerrit-Change-Number: 10404 Gerrit-PatchSet: 6 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 19 May 2018 00:26:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6035: Add query options to limit thread reservation
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10365 ) Change subject: IMPALA-6035: Add query options to limit thread reservation .. Patch Set 4: (4 comments) http://gerrit.cloudera.org:8080/#/c/10365/4/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/10365/4/be/src/scheduling/admission-controller.cc@414 PS4, Line 414: int64_t max_min_mem_reservation_bytes = -1; > (yet another rename nit for this variable. Dont feel too strongly about it Done http://gerrit.cloudera.org:8080/#/c/10365/4/be/src/scheduling/admission-controller.cc@414 PS4, Line 414: int64_t max_min_mem_reservation_bytes = -1; : const TNetworkAddress* max_min_mem_reservation_bytes_addr = nullptr; : int64_t cluster_min_mem_reservation_bytes = 0; : int64_t max_thread_reservation = 0; : const TNetworkAddress* max_thread_reservation_addr = nullptr; : int64_t cluster_thread_reservation = 0; > it it dosent hurt readability then we can use a pair for b Done http://gerrit.cloudera.org:8080/#/c/10365/4/be/src/scheduling/admission-controller.cc@424 PS4, Line 424: e.second.instance_params[0] > can use e.first instead of this since Done http://gerrit.cloudera.org:8080/#/c/10365/4/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/10365/4/be/src/service/query-options.cc@672 PS4, Line 672: Only -1 > do we need both 0 and -1 to represent "no limit"? Ideally no, but I wanted to keep it consistent with other options that behave like this. I guess one alternative is to make "0" mean literally zero and have it reject all queries :). But I think that might cause more confusion. -- To view, visit http://gerrit.cloudera.org:8080/10365 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5b5bbbdad5cd6b24442eb6c99a4d38c2ad710007 Gerrit-Change-Number: 10365 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 19 May 2018 00:03:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6035: Add query options to limit thread reservation
Hello Bikramjeet Vig, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10365 to look at the new patch set (#5). Change subject: IMPALA-6035: Add query options to limit thread reservation .. IMPALA-6035: Add query options to limit thread reservation Adds two options: THREAD_RESERVATION_LIMIT and THREAD_RESERVATION_AGGREGATE_LIMIT, which are both enforced by admission control based on planner resource requirements and the schedule. The mechanism used is the same as the minimum reservation checks. THREAD_RESERVATION_LIMIT limits the total number of reserved threads in fragments scheduled on a single backend. THREAD_RESERVATION_AGGREGATE_LIMIT limits the sum of reserved threads across all fragments. This also slightly improves the minimum reservation error message to include the host name. Testing: Added end-to-end tests that exercise the code paths. Ran core tests. Change-Id: I5b5bbbdad5cd6b24442eb6c99a4d38c2ad710007 --- M be/src/scheduling/admission-controller.cc M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler.cc M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M testdata/workloads/functional-query/queries/QueryTest/admission-reject-min-reservation.test A testdata/workloads/functional-query/queries/QueryTest/resource-limits.test M testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters.test A tests/query_test/test_resource_limits.py 12 files changed, 248 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/10365/5 -- To view, visit http://gerrit.cloudera.org:8080/10365 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5b5bbbdad5cd6b24442eb6c99a4d38c2ad710007 Gerrit-Change-Number: 10365 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-3134: Support different proc mem limits among impalads for admission control checks
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10396 ) Change subject: IMPALA-3134: Support different proc mem limits among impalads for admission control checks .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/10396/3/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/10396/3/be/src/scheduling/admission-controller.cc@408 PS3, Line 408: = : make_pair(nullptr, std::numeric_limits::max()) I think you can just initialise this directly: pair min_proc_mem_limit(nullptr, std::numeric_limits::max()); -- To view, visit http://gerrit.cloudera.org:8080/10396 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idb72eee790cc17466bbfa82e30f369a65f2b060e Gerrit-Change-Number: 10396 Gerrit-PatchSet: 3 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 May 2018 23:54:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async
Hello Tim Armstrong, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10060 to look at the new patch set (#11). Change subject: IMPALA-5216: Make admission control queuing async .. IMPALA-5216: Make admission control queuing async Implement asynchronous admission control queuing. This is achieved by running the admission control code-path in a separate thread. Major changes include: propagating cancellation to the admission control thread and dequeuing thread, and adding a new Query Operation State called "PENDING" that represents the state between completion of planning and starting of query execution. Testing: - Added a deterministic end to end test - Ran multiple stress tests successfully with a cancellation probability of 60% and with different values for the following parameters: max_requests, queue_wait_timeout_ms. Ensured that the impalad was in a valid state afterwards (no orphan fragments or wrong metrics). - Ran all exhaustive tests and ASAN core tests successfully. TODO: change terminology of "in_flight_query" to "submitted_queries". Need to identify all references of this terminology, eg. in comments, tests, variable names, etc. Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf --- M be/src/common/atomic.h M be/src/common/logging.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/scheduling/query-schedule.cc M be/src/scheduling/query-schedule.h M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/debug-util.cc M be/src/util/debug-util.h M be/src/util/promise-test.cc M be/src/util/promise.h M common/thrift/ImpalaService.thrift M tests/authorization/test_authorization.py M tests/beeswax/impala_beeswax.py M tests/common/impala_connection.py M tests/custom_cluster/test_admission_controller.py M tests/custom_cluster/test_krpc_mem_usage.py M tests/hs2/hs2_test_suite.py M tests/hs2/test_hs2.py M tests/query_test/test_observability.py M www/query_backends.tmpl 29 files changed, 636 insertions(+), 227 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/10060/11 -- To view, visit http://gerrit.cloudera.org:8080/10060 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf Gerrit-Change-Number: 10060 Gerrit-PatchSet: 11 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/10060 ) Change subject: IMPALA-5216: Make admission control queuing async .. Patch Set 10: (1 comment) http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc@1147 PS7, Line 1147: return beeswax::QueryState::EXCEPTION; : } : } : } : : void ClientRequestState::FinishExecQueryOrDmlRequest() { : #ifndef NDEBUG : SleepIfSetInDebugOptions(schedule_->query_options(), SLEEP_BEFORE_ADMISSION_MS); : #endif : DCHECK(exec_env_->admission_controller() != nullptr); : Status admit_status = ExecEnv::GetInstance()->admission_controller()->AdmitQuery( : s > Yes, it is equivalent to cancellation happening right after this check, whi Done. -- To view, visit http://gerrit.cloudera.org:8080/10060 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf Gerrit-Change-Number: 10060 Gerrit-PatchSet: 10 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 May 2018 23:24:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8523 ) Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. Patch Set 13: (4 comments) Thanks, that looks simpler and clearer now. Just some minor things. http://gerrit.cloudera.org:8080/#/c/8523/13/be/src/scheduling/scheduler.cc File be/src/scheduling/scheduler.cc: http://gerrit.cloudera.org:8080/#/c/8523/13/be/src/scheduling/scheduler.cc@302 PS13, Line 302: for (const TScanRangeLocationList& range : entry.second.concrete_range) { : expanded_locations.push_back(range); : } that could be just expanded_locations.insert(concrete_range.begin(), concrete_range.end())? http://gerrit.cloudera.org:8080/#/c/8523/13/common/thrift/Planner.thrift File common/thrift/Planner.thrift: http://gerrit.cloudera.org:8080/#/c/8523/13/common/thrift/Planner.thrift@108 PS13, Line 108: concrete_range plural since the field is a list: concrete_ranges, split_specs http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java File fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java: http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java@224 PS13, Line 224: scanRangeSpecs_.getSplit_specSize() this seems wrong - a single spec may result in multiple hosts, no? Though I guess for hbase this won't be set so in practice doesn't matter. But should we instead just assert that the split_spec list size is 0? http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1126 PS13, Line 1126: scanRangeSpecs_.getSplit_specSize(); shouldn't that do some calculation based on file size and blocks size and is splittable? i.e. after your change we'll get a different number for numRemoteRanges when running on S3, right? -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 13 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 18 May 2018 23:09:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/10413 ) Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/10413/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/10413/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@129 PS4, Line 129: static FileDescriptor createEc(FileStatus fileStatus, BlockLocation[] blockLocations, The logic here seems almost identical to the create() function. Maybe modify the create() function to accept the isEC argument instead of creating a new function? -- To view, visit http://gerrit.cloudera.org:8080/10413 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84 Gerrit-Change-Number: 10413 Gerrit-PatchSet: 4 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tianyi Wang Gerrit-Comment-Date: Fri, 18 May 2018 23:05:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Tianyi Wang has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/10413 ) Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests .. IMPALA-7019: Schedule EC as remote & disable failed tests This patch schedules HDFS EC files without considering locality. Failed tests are disabled and a jenkins build should succeed with export ERASURE_COINDG=true. Testing: It passes core tests. Cherry-picks: not for 2.x. Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84 --- M common/fbs/CatalogObjects.fbs M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M tests/common/skip.py M tests/custom_cluster/test_admission_controller.py M tests/custom_cluster/test_hdfs_fd_caching.py M tests/metadata/test_explain.py M tests/query_test/test_hdfs_caching.py M tests/query_test/test_insert.py M tests/query_test/test_insert_parquet.py M tests/query_test/test_mt_dop.py M tests/query_test/test_nested_types.py M tests/query_test/test_queries.py M tests/query_test/test_query_mem_limit.py M tests/query_test/test_scanners.py M tests/util/filesystem_utils.py 17 files changed, 88 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/10413/4 -- To view, visit http://gerrit.cloudera.org:8080/10413 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84 Gerrit-Change-Number: 10413 Gerrit-PatchSet: 4 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tianyi Wang
[Impala-ASF-CR] IMPALA-6317: Add -cmake only option to buildall.sh
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10455 ) Change subject: IMPALA-6317: Add -cmake_only option to buildall.sh .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2511/ -- To view, visit http://gerrit.cloudera.org:8080/10455 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If31a4e29425a6a20059cba2f43b72e4fb908018f Gerrit-Change-Number: 10455 Gerrit-PatchSet: 3 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 May 2018 22:28:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6317: Add -cmake only option to buildall.sh
David Knupp has posted comments on this change. ( http://gerrit.cloudera.org:8080/10455 ) Change subject: IMPALA-6317: Add -cmake_only option to buildall.sh .. Patch Set 3: Code-Review+2 Fixed a typo in commit msg, carrying +2. -- To view, visit http://gerrit.cloudera.org:8080/10455 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If31a4e29425a6a20059cba2f43b72e4fb908018f Gerrit-Change-Number: 10455 Gerrit-PatchSet: 3 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 May 2018 22:28:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6317: Add -cmake only option to buildall.sh
Hello Philip Zeyliger, Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10455 to look at the new patch set (#3). Change subject: IMPALA-6317: Add -cmake_only option to buildall.sh .. IMPALA-6317: Add -cmake_only option to buildall.sh It's sometimes useful to be able to build a complete Impala dev environment without necessarily building the Impala binary itself -- e.g., when one wants to use the internal test framework to run tests against an instance of Impala running on a remote cluster. - This patch adds a -cmake_only flag to buildall.sh, which then gets propagated to the make_impala.sh. - Added a missing line to the help text re: passing the -ninja command line option Change-Id: If31a4e29425a6a20059cba2f43b72e4fb908018f --- M buildall.sh 1 file changed, 5 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10455/3 -- To view, visit http://gerrit.cloudera.org:8080/10455 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If31a4e29425a6a20059cba2f43b72e4fb908018f Gerrit-Change-Number: 10455 Gerrit-PatchSet: 3 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/10413 ) Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests .. Patch Set 3: (4 comments) http://gerrit.cloudera.org:8080/#/c/10413/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10413/3//COMMIT_MSG@12 PS3, Line 12: Mention how this patch was tested. Did you run the exhaustive build? http://gerrit.cloudera.org:8080/#/c/10413/3/tests/common/skip.py File tests/common/skip.py: http://gerrit.cloudera.org:8080/#/c/10413/3/tests/common/skip.py@29 PS3, Line 29: IS_ISILON, : IS_LOCAL, : IS_HDFS, : IS_S3, : IS_ADLS, : SECONDARY_FILESYSTEM, : IS_EC This should be sorted alphabetically http://gerrit.cloudera.org:8080/#/c/10413/3/tests/common/skip.py@150 PS3, Line 150: doesn't work. "do not work" http://gerrit.cloudera.org:8080/#/c/10413/3/tests/util/filesystem_utils.py File tests/util/filesystem_utils.py: http://gerrit.cloudera.org:8080/#/c/10413/3/tests/util/filesystem_utils.py@33 PS3, Line 33: os.getenv("ERASURE_CODING") We want to check if ERASURE_CODING == true here. -- To view, visit http://gerrit.cloudera.org:8080/10413 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84 Gerrit-Change-Number: 10413 Gerrit-PatchSet: 3 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tianyi Wang Gerrit-Comment-Date: Fri, 18 May 2018 21:57:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10116 ) Change subject: IMPALA-6131: Track time of last statistics update in metadata .. Patch Set 13: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2507/ -- To view, visit http://gerrit.cloudera.org:8080/10116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5 Gerrit-Change-Number: 10116 Gerrit-PatchSet: 13 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 18 May 2018 21:55:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6317: Add -cmake only option to buildall.sh
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/10455 ) Change subject: IMPALA-6317: Add -cmake_only option to buildall.sh .. Patch Set 2: Code-Review+2 Thanks! -- To view, visit http://gerrit.cloudera.org:8080/10455 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If31a4e29425a6a20059cba2f43b72e4fb908018f Gerrit-Change-Number: 10455 Gerrit-PatchSet: 2 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 May 2018 21:53:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6317: Add -cmake only option to buildall.sh
David Knupp has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10455 Change subject: IMPALA-6317: Add -cmake_only option to buildall.sh .. IMPALA-6317: Add -cmake_only option to buildall.sh It's sometimes useful to be able to build a complete Impala dev environment without necessarily building the Impala binary itself -- e.g., when one wants to use the internal test framework to run tests against an instance of Impala running on a remote cluster. - This patch add a -cmake_only flag to buildall.sh, which then gets propagated to the make_impala.sh. - Added a missing line to the help text re: passing the -ninja command line option Change-Id: If31a4e29425a6a20059cba2f43b72e4fb908018f --- M buildall.sh 1 file changed, 5 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10455/2 -- To view, visit http://gerrit.cloudera.org:8080/10455 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: If31a4e29425a6a20059cba2f43b72e4fb908018f Gerrit-Change-Number: 10455 Gerrit-PatchSet: 2 Gerrit-Owner: David Knupp
[Impala-ASF-CR] [DOCS] Fixed misleading documentation on Impala + HDFS caching
Alex Rodoni has removed Jim Apple from this change. ( http://gerrit.cloudera.org:8080/10454 ) Change subject: [DOCS] Fixed misleading documentation on Impala + HDFS caching .. Removed reviewer Jim Apple. -- To view, visit http://gerrit.cloudera.org:8080/10454 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteReviewer Gerrit-Change-Id: I63cd1ff7b885a094a4a3e91c31101d25414b4db7 Gerrit-Change-Number: 10454 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni
[Impala-ASF-CR] [DOCS] Fixed misleading documentation on Impala + HDFS caching
Alex Rodoni has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10454 Change subject: [DOCS] Fixed misleading documentation on Impala + HDFS caching .. [DOCS] Fixed misleading documentation on Impala + HDFS caching Change-Id: I63cd1ff7b885a094a4a3e91c31101d25414b4db7 --- M docs/topics/impala_perf_hdfs_caching.xml 1 file changed, 5 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/10454/1 -- To view, visit http://gerrit.cloudera.org:8080/10454 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I63cd1ff7b885a094a4a3e91c31101d25414b4db7 Gerrit-Change-Number: 10454 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni
[Impala-ASF-CR] IMPALA-5737: Tighten minicluster memory limit
Tianyi Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/10277 ) Change subject: IMPALA-5737: Tighten minicluster memory limit .. Patch Set 6: > Patch Set 6: > > I kicked off https://jenkins.impala.io/job/test-with-docker-parameterized/13 > with this change to see how it'll work. The logic in impala-config.sh is > affected by cgroups which might be set in Docker. > > I didn't look to see why GVO 2501 failed. GVO 2501 failed because I forgot to git add a file and I didn't run it on jenkins after the renaming. I'm testing it and will re-run the docker job afterwards. Thanks! -- To view, visit http://gerrit.cloudera.org:8080/10277 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If8240551e726c6da546a926a1ce3444f41ef87fe Gerrit-Change-Number: 10277 Gerrit-PatchSet: 6 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Comment-Date: Fri, 18 May 2018 21:29:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Hello Lars Volker, Dimitris Tsirogiannis, Alex Behm, Mostafa Mokhtar, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8523 to look at the new patch set (#13). Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. IMPALA-5931: Generates scan ranges in planner for s3/adls Currently, for filesystems that do not include physical block information (e.g., block replica locations, caching), synthetic blocks are generated and stored in the catalog when metadata is loaded. Example file systems for which this is done includes S3, ADLS, and local fs. This change avoids generating these blocks when metadata is loaded. Instead, scan ranges are directly generated from such files by the backend coordinator. Previously, all scan ranges were produced by the planner in HDFSScanNode in the frontend. Now, those files without block information are sent to the coordinator represented by a split specification that determines how the coordinator will create scan ranges to send to executors. This change reduces the space needed in the catalog and reduces the scan range data structures that are passed from the frontend to the backend when planning and coordinating a query. In addition a bug is avoided where non-splittable files were being split anyways to support the query parameter that places a limit on scan ranges. Testing: - added backend scheduler tests - mixed-filesystems test covers tables/queries with multiple fs's. - local fs tests cover the code paths in this change - all core tests pass when configured with s3 - manually tried larger local filesystem tables (tpch) with multiple partitions and observed the same scan ranges. - TODO: adls testing Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 --- M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler-test-util.cc M be/src/scheduling/scheduler-test-util.h M be/src/scheduling/scheduler-test.cc M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/util/CMakeLists.txt A be/src/util/flat_buffer.cc A be/src/util/flat_buffer.h M common/thrift/Frontend.thrift M common/thrift/PlanNodes.thrift M common/thrift/Planner.thrift M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java 21 files changed, 665 insertions(+), 240 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8523/13 -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 13 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8523 ) Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. Patch Set 12: (2 comments) http://gerrit.cloudera.org:8080/#/c/8523/12/common/thrift/Frontend.thrift File common/thrift/Frontend.thrift: http://gerrit.cloudera.org:8080/#/c/8523/12/common/thrift/Frontend.thrift@375 PS12, Line 375: per_node_scan_ranges > Thanks, the new thrift structures make more sense to me. Maybe it's simpler yeah, it lets me revert to upfront expansion as I had couple of changes back (but without mixed concrete/spec) and get rid of that iterator. http://gerrit.cloudera.org:8080/#/c/8523/12/common/thrift/PlanNodes.thrift File common/thrift/PlanNodes.thrift: http://gerrit.cloudera.org:8080/#/c/8523/12/common/thrift/PlanNodes.thrift@202 PS12, Line 202: THdfsFileSplitGeneratorSpec > update Done -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 12 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 18 May 2018 21:24:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5737: Tighten minicluster memory limit
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/10277 ) Change subject: IMPALA-5737: Tighten minicluster memory limit .. Patch Set 6: I kicked off https://jenkins.impala.io/job/test-with-docker-parameterized/13 with this change to see how it'll work. The logic in impala-config.sh is affected by cgroups which might be set in Docker. I didn't look to see why GVO 2501 failed. -- To view, visit http://gerrit.cloudera.org:8080/10277 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If8240551e726c6da546a926a1ce3444f41ef87fe Gerrit-Change-Number: 10277 Gerrit-PatchSet: 6 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Comment-Date: Fri, 18 May 2018 21:23:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6909: [DOCS] SET ROW FORMAT in ALTER TABLE
Alex Rodoni has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10452 Change subject: IMPALA-6909: [DOCS] SET ROW FORMAT in ALTER TABLE .. IMPALA-6909: [DOCS] SET ROW FORMAT in ALTER TABLE Change-Id: Ib680f25d8c929e1194a2274777f83851b04bcb93 --- M docs/topics/impala_alter_table.xml 1 file changed, 36 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/10452/1 -- To view, visit http://gerrit.cloudera.org:8080/10452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ib680f25d8c929e1194a2274777f83851b04bcb93 Gerrit-Change-Number: 10452 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni
[Impala-ASF-CR] IMPALA-6070: Adding ASAN, --tail to test-with-docker.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10319 ) Change subject: IMPALA-6070: Adding ASAN, --tail to test-with-docker. .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2510/ -- To view, visit http://gerrit.cloudera.org:8080/10319 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I51451cdf1352fc0f9516d729b9a77700488d993f Gerrit-Change-Number: 10319 Gerrit-PatchSet: 4 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 18 May 2018 21:15:36 + Gerrit-HasComments: No
[Impala-ASF-CR](2.x) IMPALA-7035: Configure jceks.key.serialFilter for KMS.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10446 ) Change subject: IMPALA-7035: Configure jceks.key.serialFilter for KMS. .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: comment Gerrit-Change-Id: I2d21c9cce3b91e8fd8b2b4f1cda75e3958c977d5 Gerrit-Change-Number: 10446 Gerrit-PatchSet: 1 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 18 May 2018 21:14:09 + Gerrit-HasComments: No
[Impala-ASF-CR](2.x) IMPALA-7035: Configure jceks.key.serialFilter for KMS.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10446 ) Change subject: IMPALA-7035: Configure jceks.key.serialFilter for KMS. .. IMPALA-7035: Configure jceks.key.serialFilter for KMS. Configures a Java property for KMS to account for JDK 8u171's security fixes. I was seeing impala-py.test tests/metadata/test_hdfs_encryption.py fail with the following error: AssertionError: Error creating encryption zone: RemoteException: Can't recover key for testkey1 from keystore file:/home/impdev/Impala/testdata/cluster/cdh6/node-1/data/kms.keystore The issue is described in HDFS-13494, and I imagine it'll be fixed in due time. In the meanwhile, setting this property seems to do the trick. Change-Id: I2d21c9cce3b91e8fd8b2b4f1cda75e3958c977d5 Reviewed-on: http://gerrit.cloudera.org:8080/10418 Reviewed-by: Joe McDonnell Tested-by: Impala Public Jenkins Reviewed-on: http://gerrit.cloudera.org:8080/10446 --- M testdata/cluster/node_templates/cdh5/etc/init.d/kms 1 file changed, 6 insertions(+), 0 deletions(-) Approvals: Joe McDonnell: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/10446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: merged Gerrit-Change-Id: I2d21c9cce3b91e8fd8b2b4f1cda75e3958c977d5 Gerrit-Change-Number: 10446 Gerrit-PatchSet: 2 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR] Moving default sanitizer options into init.cc from shell scripts.
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/10404 ) Change subject: Moving default sanitizer options into init.cc from shell scripts. .. Patch Set 6: Jenkins is complaining about our HBase planner tests, which have been flaky recently. The "all-builds" job succeeded. Anyway, I asked Jenkins to try again. -- To view, visit http://gerrit.cloudera.org:8080/10404 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b Gerrit-Change-Number: 10404 Gerrit-PatchSet: 6 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 May 2018 21:11:41 + Gerrit-HasComments: No
[Impala-ASF-CR] Moving default sanitizer options into init.cc from shell scripts.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10404 ) Change subject: Moving default sanitizer options into init.cc from shell scripts. .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2509/ -- To view, visit http://gerrit.cloudera.org:8080/10404 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b Gerrit-Change-Number: 10404 Gerrit-PatchSet: 6 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 May 2018 21:11:05 + Gerrit-HasComments: No
[Impala-ASF-CR] Moving default sanitizer options into init.cc from shell scripts.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10404 ) Change subject: Moving default sanitizer options into init.cc from shell scripts. .. Patch Set 6: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2506/ -- To view, visit http://gerrit.cloudera.org:8080/10404 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b Gerrit-Change-Number: 10404 Gerrit-PatchSet: 6 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 May 2018 21:04:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7011: Simplify PlanRootSink control logic
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10449 ) Change subject: IMPALA-7011: Simplify PlanRootSink control logic .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/10449/4/be/src/exec/plan-root-sink.cc File be/src/exec/plan-root-sink.cc: http://gerrit.cloudera.org:8080/#/c/10449/4/be/src/exec/plan-root-sink.cc@102 PS4, Line 102: consumer_cv_.NotifyAll(); > We could make these NotifyOne() but I think it's inconsequential yeah seems fine to me the way it was since there's no important difference in this case. -- To view, visit http://gerrit.cloudera.org:8080/10449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc75617a253fd43a6122baa4b4dc7aeb1dbe633f Gerrit-Change-Number: 10449 Gerrit-PatchSet: 4 Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 May 2018 21:01:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7011: Simplify PlanRootSink control logic
Hello Michael Ho, Tim Armstrong, Bikramjeet Vig, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10449 to look at the new patch set (#5). Change subject: IMPALA-7011: Simplify PlanRootSink control logic .. IMPALA-7011: Simplify PlanRootSink control logic 1) The eos_ and sender_done_ bits really encode three possible states that the sender can be in. Make this explicit using an enum with three values. 2) The purpose of CloseConsumer() has changed over time and we can clean this up now: a) Originally, it looks like it was used to unblock the sender when the consumer finishes before eos, but also keep the sink alive long enough for the coordinator. This is no longer necessary now that control structures are owned by the QueryState whose lifetime is controlled by a reference count taken by the coordinator. So, we don't need the coordinator to tell the sink it's done calling it and we don't need the consumer_done_ state. b) Later on, CloseConsumer() was used as a cancellation mechinism. We need to keep this around (or use timeouts on the condvars) to kick both the consumer and producer on cancellation. But let's make the cancellation logic similar to the exec nodes and other sinks by driving the cancellation using the RuntimeState's cancellation flag. Now that CloseConsumer() is only about cancellation, rename it to Cancel() (later we may promote it to DataSink and implement in the data stream sender as well). Testing: - Exhaustive - Minicluster concurrent_select.py stress Change-Id: Ifc75617a253fd43a6122baa4b4dc7aeb1dbe633f --- M be/src/exec/plan-root-sink.cc M be/src/exec/plan-root-sink.h M be/src/runtime/coordinator.cc M be/src/runtime/fragment-instance-state.cc 4 files changed, 58 insertions(+), 60 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/10449/5 -- To view, visit http://gerrit.cloudera.org:8080/10449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifc75617a253fd43a6122baa4b4dc7aeb1dbe633f Gerrit-Change-Number: 10449 Gerrit-PatchSet: 5 Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7011: Simplify PlanRootSink control logic
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10449 ) Change subject: IMPALA-7011: Simplify PlanRootSink control logic .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h File be/src/exec/plan-root-sink.h: http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h@49 PS3, Line 49: > dup Done -- To view, visit http://gerrit.cloudera.org:8080/10449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc75617a253fd43a6122baa4b4dc7aeb1dbe633f Gerrit-Change-Number: 10449 Gerrit-PatchSet: 5 Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 May 2018 20:59:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5706: Spilling sort optimisations
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9943 ) Change subject: IMPALA-5706: Spilling sort optimisations .. Patch Set 13: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9943 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I74857c1694802e81f1cfc765d2b4e8bc644387f9 Gerrit-Change-Number: 9943 Gerrit-PatchSet: 13 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 May 2018 20:48:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7011: Simplify PlanRootSink control logic
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10449 ) Change subject: IMPALA-7011: Simplify PlanRootSink control logic .. Patch Set 4: Code-Review+1 (1 comment) LGTM, someone else should +2 once everyone is happy http://gerrit.cloudera.org:8080/#/c/10449/4/be/src/exec/plan-root-sink.cc File be/src/exec/plan-root-sink.cc: http://gerrit.cloudera.org:8080/#/c/10449/4/be/src/exec/plan-root-sink.cc@102 PS4, Line 102: consumer_cv_.NotifyAll(); We could make these NotifyOne() but I think it's inconsequential -- To view, visit http://gerrit.cloudera.org:8080/10449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc75617a253fd43a6122baa4b4dc7aeb1dbe633f Gerrit-Change-Number: 10449 Gerrit-PatchSet: 4 Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 May 2018 20:47:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7011: Simplify PlanRootSink control logic
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10449 ) Change subject: IMPALA-7011: Simplify PlanRootSink control logic .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h File be/src/exec/plan-root-sink.h: http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h@49 PS3, Line 49: the dup -- To view, visit http://gerrit.cloudera.org:8080/10449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc75617a253fd43a6122baa4b4dc7aeb1dbe633f Gerrit-Change-Number: 10449 Gerrit-PatchSet: 3 Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 May 2018 20:43:53 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7011: Simplify PlanRootSink control logic
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10449 ) Change subject: IMPALA-7011: Simplify PlanRootSink control logic .. Patch Set 3: (9 comments) http://gerrit.cloudera.org:8080/#/c/10449/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10449/3//COMMIT_MSG@7 PS3, Line 7: IMPALA_7011 > nit: IMPALA-7011 Done http://gerrit.cloudera.org:8080/#/c/10449/3//COMMIT_MSG@20 PS3, Line 20: control > nit: controlled Done http://gerrit.cloudera.org:8080/#/c/10449/3//COMMIT_MSG@24 PS3, Line 24: to > nit: extra word? Done http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h File be/src/exec/plan-root-sink.h: http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h@57 PS3, Line 57: respecively > respectively Done http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h@102 PS3, Line 102: Canecl > Cancel Done http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h@106 PS3, Line 106: /// State of the sender: > Producer instead of sender? Seems inconsistent with above. Did the opposite, only because the code variables use sender more (so changed the comments from producer to sender above). http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h@107 PS3, Line 107: MORE_ROWS > nit: how about ROWS_PENDING Done http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.cc File be/src/exec/plan-root-sink.cc: http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.cc@a76 PS3, Line 76: > just curious, was this dead code all along or is there ever a possibility o I didn't look back at git history, but like you said, it's clearly a dead conditional. It may have been possible to pass batch == nullptr before the FragmentInstanceState refactor. Okay, looked at git history, and this was a do-while loop (so the dereference wasn't there. But still, ValidateCollectionSlots() dereferences. Plus, it doesn't make sense to do the check after the Wait() anyway. So, I'm not really sure why it was written. http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.cc@116 PS3, Line 116: enounters > encounters Done -- To view, visit http://gerrit.cloudera.org:8080/10449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc75617a253fd43a6122baa4b4dc7aeb1dbe633f Gerrit-Change-Number: 10449 Gerrit-PatchSet: 3 Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 May 2018 20:19:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7011: Simplify PlanRootSink control logic
Hello Tim Armstrong, Bikramjeet Vig, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10449 to look at the new patch set (#4). Change subject: IMPALA-7011: Simplify PlanRootSink control logic .. IMPALA-7011: Simplify PlanRootSink control logic 1) The eos_ and sender_done_ bits really encode three possible states that the sender can be in. Make this explicit using an enum with three values. 2) The purpose of CloseConsumer() has changed over time and we can clean this up now: a) Originally, it looks like it was used to unblock the sender when the consumer finishes before eos, but also keep the sink alive long enough for the coordinator. This is no longer necessary now that control structures are owned by the QueryState whose lifetime is controlled by a reference count taken by the coordinator. So, we don't need the coordinator to tell the sink it's done calling it and we don't need the consumer_done_ state. b) Later on, CloseConsumer() was used as a cancellation mechinism. We need to keep this around (or use timeouts on the condvars) to kick both the consumer and producer on cancellation. But let's make the cancellation logic similar to the exec nodes and other sinks by driving the cancellation using the RuntimeState's cancellation flag. Now that CloseConsumer() is only about cancellation, rename it to Cancel() (later we may promote it to DataSink and implement in the data stream sender as well). Testing: - Exhaustive - Minicluster concurrent_select.py stress Change-Id: Ifc75617a253fd43a6122baa4b4dc7aeb1dbe633f --- M be/src/exec/plan-root-sink.cc M be/src/exec/plan-root-sink.h M be/src/runtime/coordinator.cc M be/src/runtime/fragment-instance-state.cc 4 files changed, 58 insertions(+), 60 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/10449/4 -- To view, visit http://gerrit.cloudera.org:8080/10449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifc75617a253fd43a6122baa4b4dc7aeb1dbe633f Gerrit-Change-Number: 10449 Gerrit-PatchSet: 4 Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] [DOCS] Sentry is required for Impala to enable delegation
Hello Sailesh Mukil, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10451 to look at the new patch set (#2). Change subject: [DOCS] Sentry is required for Impala to enable delegation .. [DOCS] Sentry is required for Impala to enable delegation Change-Id: I002d3d33eee6a9b9336f21c81a4de75ed3bd5efb --- M docs/topics/impala_delegation.xml 1 file changed, 42 insertions(+), 51 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/10451/2 -- To view, visit http://gerrit.cloudera.org:8080/10451 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I002d3d33eee6a9b9336f21c81a4de75ed3bd5efb Gerrit-Change-Number: 10451 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] [DOCS] Sentry is required for Impala to enable delegation
Alex Rodoni has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10451 Change subject: [DOCS] Sentry is required for Impala to enable delegation .. [DOCS] Sentry is required for Impala to enable delegation Change-Id: I002d3d33eee6a9b9336f21c81a4de75ed3bd5efb --- M docs/topics/impala_delegation.xml 1 file changed, 43 insertions(+), 51 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/10451/1 -- To view, visit http://gerrit.cloudera.org:8080/10451 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I002d3d33eee6a9b9336f21c81a4de75ed3bd5efb Gerrit-Change-Number: 10451 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni
[Impala-ASF-CR] IMPALA-3134: Support different proc mem limits among impalads for admission control checks
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10396 ) Change subject: IMPALA-3134: Support different proc mem limits among impalads for admission control checks .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/10396/3/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/10396/3/be/src/scheduling/admission-controller.cc@408 PS3, Line 408: pair min_proc_mem_limit = : make_pair(nullptr, std::numeric_limits::max()); > @Tim: let me know if you decide not to adopt this pattern for your patch fo WFM http://gerrit.cloudera.org:8080/#/c/10396/3/tests/custom_cluster/test_admission_controller.py File tests/custom_cluster/test_admission_controller.py: http://gerrit.cloudera.org:8080/#/c/10396/3/tests/custom_cluster/test_admission_controller.py@454 PS3, Line 454: exec_options['mem_limit'] = "3G" Does this line do anything? It's already set to 3G above. Is it intentional that num_nodes=1 still at this point? It might be easier to follow if we just created a copy of exec_options for each query rather than mutating the one version. We use the copy() function in a few other tests for this purpose. What do you think? http://gerrit.cloudera.org:8080/#/c/10396/3/tests/custom_cluster/test_admission_controller.py@473 PS3, Line 473:"2.00 GB but only 1.00 GB out of 2.00 GB was available.", str(e)), str(e) long line -- To view, visit http://gerrit.cloudera.org:8080/10396 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idb72eee790cc17466bbfa82e30f369a65f2b060e Gerrit-Change-Number: 10396 Gerrit-PatchSet: 3 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 May 2018 19:20:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA 7011: Simplify PlanRootSink control logic
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10449 ) Change subject: IMPALA_7011: Simplify PlanRootSink control logic .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h File be/src/exec/plan-root-sink.h: http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h@57 PS3, Line 57: respecively respectively http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h@106 PS3, Line 106: /// State of the sender: Producer instead of sender? Seems inconsistent with above. http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.cc File be/src/exec/plan-root-sink.cc: http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.cc@116 PS3, Line 116: enounters encounters -- To view, visit http://gerrit.cloudera.org:8080/10449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc75617a253fd43a6122baa4b4dc7aeb1dbe633f Gerrit-Change-Number: 10449 Gerrit-PatchSet: 3 Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 May 2018 19:00:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Tianyi Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/10413 ) Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests .. Patch Set 3: Using the original block length/offset breaks more tests. I disabled them. -- To view, visit http://gerrit.cloudera.org:8080/10413 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84 Gerrit-Change-Number: 10413 Gerrit-PatchSet: 3 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Tianyi Wang Gerrit-Comment-Date: Fri, 18 May 2018 18:50:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Tianyi Wang has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/10413 ) Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests .. IMPALA-7019: Schedule EC as remote & disable failed tests This patch schedules HDFS EC files without considering locality. Failed tests are disabled and a jenkins build should succeed with export ERASURE_COINDG=true. Cherry-picks: not for 2.x. Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84 --- M common/fbs/CatalogObjects.fbs M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M tests/common/skip.py M tests/custom_cluster/test_admission_controller.py M tests/custom_cluster/test_hdfs_fd_caching.py M tests/metadata/test_explain.py M tests/query_test/test_hdfs_caching.py M tests/query_test/test_insert.py M tests/query_test/test_insert_parquet.py M tests/query_test/test_mt_dop.py M tests/query_test/test_nested_types.py M tests/query_test/test_queries.py M tests/query_test/test_query_mem_limit.py M tests/query_test/test_scanners.py M tests/util/filesystem_utils.py 17 files changed, 87 insertions(+), 19 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/10413/3 -- To view, visit http://gerrit.cloudera.org:8080/10413 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84 Gerrit-Change-Number: 10413 Gerrit-PatchSet: 3 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Tianyi Wang
[Impala-ASF-CR] IMPALA 7011: Simplify PlanRootSink control logic
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/10449 ) Change subject: IMPALA_7011: Simplify PlanRootSink control logic .. Patch Set 3: Code-Review+1 (5 comments) http://gerrit.cloudera.org:8080/#/c/10449/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10449/3//COMMIT_MSG@7 PS3, Line 7: IMPALA_7011 nit: IMPALA-7011 http://gerrit.cloudera.org:8080/#/c/10449/3//COMMIT_MSG@20 PS3, Line 20: control nit: controlled http://gerrit.cloudera.org:8080/#/c/10449/3//COMMIT_MSG@24 PS3, Line 24: to nit: extra word? http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h File be/src/exec/plan-root-sink.h: http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.h@107 PS3, Line 107: MORE_ROWS nit: how about ROWS_PENDING http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.cc File be/src/exec/plan-root-sink.cc: http://gerrit.cloudera.org:8080/#/c/10449/3/be/src/exec/plan-root-sink.cc@a76 PS3, Line 76: just curious, was this dead code all along or is there ever a possibility of passing a null batch to Send()(granted that would crash at L67 anyway or at L73 on a release build)? -- To view, visit http://gerrit.cloudera.org:8080/10449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc75617a253fd43a6122baa4b4dc7aeb1dbe633f Gerrit-Change-Number: 10449 Gerrit-PatchSet: 3 Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 May 2018 18:29:07 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10116 ) Change subject: IMPALA-6131: Track time of last statistics update in metadata .. Patch Set 13: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2507/ -- To view, visit http://gerrit.cloudera.org:8080/10116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5 Gerrit-Change-Number: 10116 Gerrit-PatchSet: 13 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 18 May 2018 18:03:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/10116 ) Change subject: IMPALA-6131: Track time of last statistics update in metadata .. Patch Set 13: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5 Gerrit-Change-Number: 10116 Gerrit-PatchSet: 13 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 18 May 2018 18:02:41 + Gerrit-HasComments: No
[Impala-ASF-CR] Moving default sanitizer options into init.cc from shell scripts.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10404 ) Change subject: Moving default sanitizer options into init.cc from shell scripts. .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2506/ -- To view, visit http://gerrit.cloudera.org:8080/10404 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b Gerrit-Change-Number: 10404 Gerrit-PatchSet: 6 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 May 2018 17:31:40 + Gerrit-HasComments: No
[Impala-ASF-CR](2.x) IMPALA-7035: Configure jceks.key.serialFilter for KMS.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10446 ) Change subject: IMPALA-7035: Configure jceks.key.serialFilter for KMS. .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2505/ -- To view, visit http://gerrit.cloudera.org:8080/10446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: comment Gerrit-Change-Id: I2d21c9cce3b91e8fd8b2b4f1cda75e3958c977d5 Gerrit-Change-Number: 10446 Gerrit-PatchSet: 1 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 18 May 2018 17:29:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/10116 ) Change subject: IMPALA-6131: Track time of last statistics update in metadata .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/10116/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/10116/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2868 PS12, Line 2868: // this would make it necessary to reload the table after alter_table in order to > I agree that today we will always reload the table metadata from the HMS an Thanks for the explanation! I have rewritten the comment, I hope that it is better now. -- To view, visit http://gerrit.cloudera.org:8080/10116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5 Gerrit-Change-Number: 10116 Gerrit-PatchSet: 13 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 18 May 2018 17:04:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata
Hello Lars Volker, Zoltan Borok-Nagy, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10116 to look at the new patch set (#13). Change subject: IMPALA-6131: Track time of last statistics update in metadata .. IMPALA-6131: Track time of last statistics update in metadata The timestamp of the last COMPUTE STATS operation is saved to table property "impala.lastComputeStatsTime". The format is the same as in "transient_lastDdlTime", so the two can be compared to check if the schema has changed since computing statistics. Other changes: - Handling of "transient_lastDdlTime" is simplified - the old logic set it to current time + 1, if the old version was >= current time, to ensure that it is always increased by DDL operations. This was useful in the past, as IMPALA-387 used lastDdlTime to check if partition data needs to be reloaded, but since IMPALA-1480, Impala does not rely on lastDdlTime at all. - Computing / setting stats on HDFS tables no longer increases "transient_lastDdlTime". - When Kudu tables are (re)loaded, it is checked if their HMS representation is up to date, and if it is, then IMetaStoreClient.alter_table() is not called. The old logic always called alter_table() after loading metadata from Kudu. This change was needed to ensure that "transient_lastDdlTime" works similarly in HDFS and Kudu tables, and should also make (re)loading Kudu tables faster. Notes: - Kudu will be able to sync its tables to HMS in the near future (see KUDU-2191), so the Kudu metadata handling in Impala may need to be redesigned. Testing: tests/metadata/test_last_ddl_time_update.py is extended by - also checking "impala.lastComputeStatsTime" - testing more SQL statements - tests for Kudu tables Note that test_last_ddl_time_update.py is ran only in exhaustive testing. Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5 --- M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java M tests/metadata/test_last_ddl_time_update.py 8 files changed, 228 insertions(+), 173 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/10116/13 -- To view, visit http://gerrit.cloudera.org:8080/10116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5 Gerrit-Change-Number: 10116 Gerrit-PatchSet: 13 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/10116 ) Change subject: IMPALA-6131: Track time of last statistics update in metadata .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/10116/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/10116/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2868 PS12, Line 2868: // catalogd would read it back after alter_table, but as this would not work for > Do you have an example in mind when the table should not be reloaded? I loo I agree that today we will always reload the table metadata from the HMS and set a new msTbl. That behavior is not by design though - I think it's unintentional and probably a bug (see for example IMPALA-6994). Personally, I think this comment is difficult to understand because it's explanation relies on (1) understanding what callers will do after calling this function, and (2) understanding what is done for Kudu tables, so there's some loss of abstraction. Imo, it would be clearer to declare a general policy of setting the lastDdlTime in Impala to not rely on the HMS setting it and to avoid having to fetch it from HMS if possible. -- To view, visit http://gerrit.cloudera.org:8080/10116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5 Gerrit-Change-Number: 10116 Gerrit-PatchSet: 12 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 18 May 2018 16:39:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6994: Avoid reloading a table's HMS data for file-only operations.
Pranay Singh has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10450 Change subject: IMPALA-6994: Avoid reloading a table's HMS data for file-only operations. .. IMPALA-6994: Avoid reloading a table's HMS data for file-only operations. Catalog currently loads the HMS data even, if only the file metadata load is requested. This change detects some of those updates and tries to skip the loading of HMS data if only the file metadata load is requested. Testing: Ran the core test without failures. Change-Id: Iaabdf38af3f30c65ada9734eb471dbfa6ecdd74a --- M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java 1 file changed, 39 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/10450/1 -- To view, visit http://gerrit.cloudera.org:8080/10450 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Iaabdf38af3f30c65ada9734eb471dbfa6ecdd74a Gerrit-Change-Number: 10450 Gerrit-PatchSet: 1 Gerrit-Owner: Pranay Singh
[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/10116 ) Change subject: IMPALA-6131: Track time of last statistics update in metadata .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/10116/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/10116/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2868 PS12, Line 2868: // catalogd would read it back after alter_table, but as this would not work for > I don't think it's true that we'd always read it back after calling this fu Do you have an example in mind when the table should not be reloaded? I looked at the callers of applyAlterTable(): loadTableMetadata() is called after alter*() + dropTableStats(), and createTable() only creates an "incomplete table", so the table will be reloaded before it is actually used. Because of this, my impression was that tables are always reloaded by design. -- To view, visit http://gerrit.cloudera.org:8080/10116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5 Gerrit-Change-Number: 10116 Gerrit-PatchSet: 12 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 18 May 2018 15:43:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA 7011: Simplify PlanRootSink control logic
Dan Hecht has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/10449 ) Change subject: IMPALA_7011: Simplify PlanRootSink control logic .. IMPALA_7011: Simplify PlanRootSink control logic 1) The eos_ and sender_done_ bits really encode three possible states that the sender can be in. Make this explicit using an enum with three values. 2) The purpose of CloseConsumer() has changed over time and we can clean this up now: a) Originally, it looks like it was used to unblock the sender when the consumer finishes before eos, but also keep the sink alive long enough for the coordinator. This is no longer necessary now that control structures are owned by the QueryState whose lifetime is control by a reference count taken by the coordinator. So, we don't need the coordinator to tell the sink it's done calling it and we don't need the consumer_done_ state. b) Later on, CloseConsumer() was used to as a cancellation mechinism. We need to keep this around (or use timeouts on the condvars) to kick both the consumer and producer on cancellation. But let's make the cancellation logic similar to the exec nodes and other sinks by driving the cancellation using the RuntimeState's cancellation flag. Now that CloseConsumer() is only about cancellation, rename it to Cancel() (later we may promote it to DataSink and implement in the data stream sender as well). Testing: - Exhaustive - Minicluster concurrent_select.py stress Change-Id: Ifc75617a253fd43a6122baa4b4dc7aeb1dbe633f --- M be/src/exec/plan-root-sink.cc M be/src/exec/plan-root-sink.h M be/src/runtime/coordinator.cc M be/src/runtime/fragment-instance-state.cc 4 files changed, 57 insertions(+), 59 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/10449/3 -- To view, visit http://gerrit.cloudera.org:8080/10449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifc75617a253fd43a6122baa4b4dc7aeb1dbe633f Gerrit-Change-Number: 10449 Gerrit-PatchSet: 3 Gerrit-Owner: Dan Hecht
[Impala-ASF-CR] IMPALA-5706: Spilling sort optimisations
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/9943 ) Change subject: IMPALA-5706: Spilling sort optimisations .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/9943/12/tests/query_test/test_sort.py File tests/query_test/test_sort.py: http://gerrit.cloudera.org:8080/#/c/9943/12/tests/query_test/test_sort.py@135 PS12, Line 135: selected as the pivot (the old method), the sorter tends to get stuck selecting the > I gave this a second look and apparently I mis-measured something the last As discussed offline, I couldn't reduce the runtime of this test. As agreed, I'm dropping it as the majority of the spilling sorts have already been covered by the existing tests. -- To view, visit http://gerrit.cloudera.org:8080/9943 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I74857c1694802e81f1cfc765d2b4e8bc644387f9 Gerrit-Change-Number: 9943 Gerrit-PatchSet: 13 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 May 2018 10:20:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5706: Spilling sort optimisations
Hello Tim Armstrong, Csaba Ringhofer, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9943 to look at the new patch set (#13). Change subject: IMPALA-5706: Spilling sort optimisations .. IMPALA-5706: Spilling sort optimisations This patch covers multiple changes with the purpose of optimizing spilling sort mechanism: - Remove the hard-coded maximum limit of buffers that can be used for merging the sorted runs. Instead this number is calculated based on the available memory through buffer pool. - The already sorted runs are distributed more optimally between the last intermediate merge and the final merge to avoid that a heavy intermediate merge is followed by a light final merge. - Right before starting the merging phase Sorter tries to allocate additional memory through the buffer pool. - An output run is not allocated anymore for the final merge. Note, double-buffering the runs during a merge was also planned with this patch. However, performance testing showed that except some exotic queries with unreasonably small amount of buffer pool memory available double-buffering doesn't add to the overall performance. It's basically because the half of the available buffers have to be sacrificed to do double-buffering and as a result the merge tree can get deeper. In addition the amount of I/O wait time is not reaching the level where double-buffering could countervail the reduced number of runs during a particular merge. Performance measurements were made during manual testing to verify that this is in fact an optimization: - In case doing a sort on top of a join when working with a restricted amount of memory then the Sort node successfully allocates additional memory right before the merging phase. This is feasible because once Join finishes sending new input data and calls InputDone() then it releases memory that can be picked up by the Sorter. This results in shallower merging trees (more runs grabbed for a merge). - On a multi-node cluster I verified that in cases when at least one merging step is done then this change reduces the execution time for sorts. - The more merging steps are done the bigger the performance gain is compared to the baseline. Change-Id: I74857c1694802e81f1cfc765d2b4e8bc644387f9 --- M be/src/runtime/sorter.cc M be/src/runtime/sorter.h M tests/query_test/test_sort.py 3 files changed, 149 insertions(+), 114 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/9943/13 -- To view, visit http://gerrit.cloudera.org:8080/9943 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I74857c1694802e81f1cfc765d2b4e8bc644387f9 Gerrit-Change-Number: 9943 Gerrit-PatchSet: 13 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] Moving default sanitizer options into init.cc from shell scripts.
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/10404 ) Change subject: Moving default sanitizer options into init.cc from shell scripts. .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10404 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b Gerrit-Change-Number: 10404 Gerrit-PatchSet: 6 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 May 2018 07:08:19 + Gerrit-HasComments: No