[Impala-ASF-CR] IMPALA-10185 Use bool stats for selectivity calculations.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16462 ) Change subject: IMPALA-10185 Use bool stats for selectivity calculations. .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7266/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I95c1c7c915bf6bca13fe006c0531c33988187d12 Gerrit-Change-Number: 16462 Gerrit-PatchSet: 2 Gerrit-Owner: Shant Hovsepian Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 24 Sep 2020 06:42:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10185 Use bool stats for selectivity calculations.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16462 ) Change subject: IMPALA-10185 Use bool stats for selectivity calculations. .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6471/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/16462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I95c1c7c915bf6bca13fe006c0531c33988187d12 Gerrit-Change-Number: 16462 Gerrit-PatchSet: 2 Gerrit-Owner: Shant Hovsepian Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Comment-Date: Thu, 24 Sep 2020 06:22:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10185 Use bool stats for selectivity calculations.
Shant Hovsepian has posted comments on this change. ( http://gerrit.cloudera.org:8080/16462 ) Change subject: IMPALA-10185 Use bool stats for selectivity calculations. .. Patch Set 2: Just a small change nothing urgent. -- To view, visit http://gerrit.cloudera.org:8080/16462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I95c1c7c915bf6bca13fe006c0531c33988187d12 Gerrit-Change-Number: 16462 Gerrit-PatchSet: 2 Gerrit-Owner: Shant Hovsepian Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 24 Sep 2020 06:23:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10185 Use bool stats for selectivity calculations.
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16462 to look at the new patch set (#2). Change subject: IMPALA-10185 Use bool stats for selectivity calculations. .. IMPALA-10185 Use bool stats for selectivity calculations. Factor in numTrues and numFalses stats when computing selectivity for boolean columns. Testing: * New test method in ExprCardinalityTest Change-Id: I95c1c7c915bf6bca13fe006c0531c33988187d12 --- M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java 2 files changed, 39 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/16462/2 -- To view, visit http://gerrit.cloudera.org:8080/16462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I95c1c7c915bf6bca13fe006c0531c33988187d12 Gerrit-Change-Number: 16462 Gerrit-PatchSet: 2 Gerrit-Owner: Shant Hovsepian Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian
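The idea in the commit message above (use numTrues and numFalses when estimating selectivity on a boolean column) can be illustrated with a minimal sketch. This is not the code in BinaryPredicate.java: the function name, the choice of the row count as denominator, and the convention that a negative count means "statistic unavailable" are all assumptions made for illustration.

```python
from typing import Optional


def bool_eq_selectivity(literal: bool, num_trues: int, num_falses: int,
                        num_rows: int) -> Optional[float]:
    """Estimate the fraction of rows matching `col = literal`.

    Hypothetical helper: negative counts are treated as "statistic
    unavailable" (an assumption of this sketch). In that case None is
    returned so the caller can fall back to another estimate.
    """
    if num_rows <= 0 or num_trues < 0 or num_falses < 0:
        return None
    matching = num_trues if literal else num_falses
    # Clamp in case stale stats exceed the current row count.
    return min(1.0, matching / num_rows)
```

Dividing by the row count rather than by numTrues + numFalses means NULL rows lower the estimate for both `= TRUE` and `= FALSE`, which matches SQL semantics where a NULL comparison never passes the predicate.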
[Impala-ASF-CR] IMPALA-10183: Fix hitting DCHECK when cancelling a query with result spooling
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16489 ) Change subject: IMPALA-10183: Fix hitting DCHECK when cancelling a query with result spooling .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16489 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaba0ed729ef984f9c51347df02e9fb6f90bc71e0 Gerrit-Change-Number: 16489 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 24 Sep 2020 05:04:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10183: Fix hitting DCHECK when cancelling a query with result spooling
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16489 ) Change subject: IMPALA-10183: Fix hitting DCHECK when cancelling a query with result spooling .. IMPALA-10183: Fix hitting DCHECK when cancelling a query with result spooling BufferedPlanRootSink has a Promise, all_results_spooled_, that could be accessed by different threads, e.g. the fragment execution thread and cancellation threads. The main purpose of setting this Promise is to unblock the coordinator if it's waiting for this. So we can simply declare this Promise's mode to be MULTIPLE_PRODUCER to avoid hitting the DCHECK in Promise.Set(). Tests: - Run TestResultSpoolingFailpoints::test_failpoints for more than 4000 iterations Change-Id: Iaba0ed729ef984f9c51347df02e9fb6f90bc71e0 Reviewed-on: http://gerrit.cloudera.org:8080/16489 Reviewed-by: Tim Armstrong Tested-by: Impala Public Jenkins --- M be/src/exec/buffered-plan-root-sink.cc M be/src/exec/buffered-plan-root-sink.h 2 files changed, 11 insertions(+), 15 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/16489 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Iaba0ed729ef984f9c51347df02e9fb6f90bc71e0 Gerrit-Change-Number: 16489 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
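The difference between a single-producer and a multiple-producer Promise, as described in the commit message above, can be sketched as follows. This is a simplified Python model, not Impala's C++ Promise class: the class layout is hypothetical, and the "first value wins" behaviour in multiple-producer mode is an assumption of the sketch.

```python
import threading
from enum import Enum


class PromiseMode(Enum):
    SINGLE_PRODUCER = 1
    MULTIPLE_PRODUCER = 2


class Promise:
    """A value that is set once and can be awaited by any number of readers."""

    def __init__(self, mode=PromiseMode.SINGLE_PRODUCER):
        self._mode = mode
        self._cond = threading.Condition()
        self._is_set = False
        self._value = None

    def set(self, value):
        with self._cond:
            if self._is_set:
                # Stands in for the DCHECK: a second set() is a bug in
                # single-producer mode but harmless in multiple-producer mode.
                if self._mode is PromiseMode.SINGLE_PRODUCER:
                    raise AssertionError("Promise set twice")
                return self._value  # assumption: the first value wins
            self._value = value
            self._is_set = True
            self._cond.notify_all()
            return value

    def get(self):
        with self._cond:
            while not self._is_set:
                self._cond.wait()
            return self._value
```

With MULTIPLE_PRODUCER, a fragment execution thread and a cancellation thread can both call set() without tripping the assertion, and any waiter is unblocked by whichever call arrives first.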
[Impala-ASF-CR] IMPALA-9792: Add ability to split kudu scan ranges
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16385 ) Change subject: IMPALA-9792: Add ability to split kudu scan ranges .. Patch Set 6: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6469/ -- To view, visit http://gerrit.cloudera.org:8080/16385 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia02fd94cc1d13c61bc6cb0765dd2cbe90e9a5ce8 Gerrit-Change-Number: 16385 Gerrit-PatchSet: 6 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 24 Sep 2020 04:57:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10112: Remove FpRateTooHigh() check for blom filter
Shant Hovsepian has posted comments on this change. ( http://gerrit.cloudera.org:8080/16499 ) Change subject: IMPALA-10112: Remove FpRateTooHigh() check for blom filter .. Patch Set 1: (4 comments) Thanks for implementing and testing this out Riza! http://gerrit.cloudera.org:8080/#/c/16499/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16499/1//COMMIT_MSG@24 PS1, Line 24: inacurate nit: inaccurate http://gerrit.cloudera.org:8080/#/c/16499/1//COMMIT_MSG@9 PS1, Line 9: This patch remove FpRateTooHigh() check for bloom filter that can : disable filter if the observed false-positive probability (FPP) rate is : higher than FLAGS_max_filter_error_rate. Such filter with high FPP rate : is still worth to evaluate for several reasons: : : 1. Partition filters are probably still worth evaluating even if there :are false positives, because it's cheap and eliminating a partition :is still beneficial. : 2. Runtime filters are dynamically disabled on the scan side if they are :ineffective. An always true filter is also still being evaluated and :not entirely free. : 3. The disabling is fairly unlikely to kick in for partitioned joins :because it's only applied to a small subset of the filter, before the :Or() operation. : 4. FpRateTooHigh() use num_build_rows to approximate actual FPP rate of :resulting filter. This can be inacurate because it does not take :account of duplicate values of the filter key on the build side. : : This patch also remove some tests in test_runtime_filters.py that check : cancellation of filters having high FPP rate. nit: The grammar here is a little hard to parse, might want to run through it again or pass it through a grammar checker? http://gerrit.cloudera.org:8080/#/c/16499/1//COMMIT_MSG@33 PS1, Line 33: little to no performance regression If you have the performance numbers handy it would be good to include them in the commit message to aid any future readers; however it's not critical if rerunning the benchmark is required. 
http://gerrit.cloudera.org:8080/#/c/16499/1/be/src/exec/partitioned-hash-join-builder.cc File be/src/exec/partitioned-hash-join-builder.cc: http://gerrit.cloudera.org:8080/#/c/16499/1/be/src/exec/partitioned-hash-join-builder.cc@a941 PS1, Line 941: Is the always_true_filter dead code now or is it used elsewhere? -- To view, visit http://gerrit.cloudera.org:8080/16499 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id9f8f40764b4f6664cc81b0da428afea8e3588d4 Gerrit-Change-Number: 16499 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 24 Sep 2020 03:36:01 + Gerrit-HasComments: Yes
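Point 4 of the quoted commit message (estimating the FPP from num_build_rows ignores duplicate keys) can be illustrated with the textbook false-positive formula for a standard Bloom filter, (1 - e^(-k*n/m))^k. This is only a sketch: Impala's filter is a different variant, and the filter size, hash count, and row counts below are made-up values, not figures from the patch.

```python
import math


def bloom_fpp(num_bits: int, num_hashes: int, num_inserted: int) -> float:
    """Textbook false-positive probability of a standard Bloom filter."""
    if num_inserted <= 0:
        return 0.0
    return (1.0 - math.exp(-num_hashes * num_inserted / num_bits)) ** num_hashes


# Hypothetical build side: 1,000,000 rows but only 10,000 distinct join keys.
NUM_BITS = 1 << 20
NUM_HASHES = 8
fpp_from_rows = bloom_fpp(NUM_BITS, NUM_HASHES, 1_000_000)
fpp_from_distinct = bloom_fpp(NUM_BITS, NUM_HASHES, 10_000)
```

Inserting the same key repeatedly sets no new bits, so only the distinct count affects the real false-positive rate. In this made-up case the row-based estimate is close to 1 while the distinct-based estimate is far below one in a million, so a check driven by the row count would disable a filter that is still highly selective.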
[Impala-ASF-CR] IMPALA-10164: Supporting HadoopCatalog for Iceberg table
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/16446 ) Change subject: IMPALA-10164: Supporting HadoopCatalog for Iceberg table .. Patch Set 12: Hi Zoltan, I have already modified the test case that failed in local catalog mode, but Jenkins still failed on patch set 12: https://jenkins.impala.io/job/pre-review-test/713/ This also seems to be related to IMPALA-9923. The failing log is load-tpcds-core-hive-generated-orc-def-block.sql.log, and the error is 'Error: Error while compiling statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. java.io.IOException: Fail to get checksum, since file /test-warehouse/managed/tpcds.store_sales_orc_def/ss_sold_date_sk=2452612/base_005/_orc_acid_version is under construction. (state=08S01,code=1) java.sql.SQLException: Error while compiling statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. java.io.IOException: Fail to get checksum, since file /test-warehouse/managed/tpcds.store_sales_orc_def/ss_sold_date_sk=2452612/base_005/_orc_acid_version is under construction.' -- To view, visit http://gerrit.cloudera.org:8080/16446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic1893c50a633ca22d4bca6726c9937b026f5d5ef Gerrit-Change-Number: 16446 Gerrit-PatchSet: 12 Gerrit-Owner: wangsheng Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Thu, 24 Sep 2020 02:26:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10112: Remove FpRateTooHigh() check for blom filter
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16499 ) Change subject: IMPALA-10112: Remove FpRateTooHigh() check for blom filter .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/16499/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16499/1//COMMIT_MSG@7 PS1, Line 7: blom bloom -- To view, visit http://gerrit.cloudera.org:8080/16499 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id9f8f40764b4f6664cc81b0da428afea8e3588d4 Gerrit-Change-Number: 16499 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 24 Sep 2020 02:03:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10112: Remove FpRateTooHigh() check for blom filter
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16499 ) Change subject: IMPALA-10112: Remove FpRateTooHigh() check for blom filter .. Patch Set 1: (5 comments) http://gerrit.cloudera.org:8080/#/c/16499/1/be/src/exec/partitioned-hash-join-builder.cc File be/src/exec/partitioned-hash-join-builder.cc: http://gerrit.cloudera.org:8080/#/c/16499/1/be/src/exec/partitioned-hash-join-builder.cc@927 PS1, Line 927: // Use 'num_build_rows' to estimate FP-rate of each Bloom filter, and publish Comment is out of date. http://gerrit.cloudera.org:8080/#/c/16499/1/be/src/exec/partitioned-hash-join-builder.cc@936 PS1, Line 936: // TODO: Consider checking this every few batches or so. Comment is out of date http://gerrit.cloudera.org:8080/#/c/16499/1/testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test File testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test: http://gerrit.cloudera.org:8080/#/c/16499/1/testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test@11 PS1, Line 11: #SET RUNTIME_FILTER_WAIT_TIME_MS=3; You can remove the commented out test. It makes sense to leave the description of the test case about as a tombstone for the test, but I don't think we need the result of it. http://gerrit.cloudera.org:8080/#/c/16499/1/testdata/workloads/functional-query/queries/QueryTest/bloom_filters_wait.test File testdata/workloads/functional-query/queries/QueryTest/bloom_filters_wait.test: PS1: Maybe just delete this file and the python test that uses it http://gerrit.cloudera.org:8080/#/c/16499/1/tests/query_test/test_runtime_filters.py File tests/query_test/test_runtime_filters.py: http://gerrit.cloudera.org:8080/#/c/16499/1/tests/query_test/test_runtime_filters.py@195 PS1, Line 195: # IMPALA-10112: This test is disabled because high FP rate check has been removed Just delete the test, no need to keep it around, we can always restore it from version control. 
-- To view, visit http://gerrit.cloudera.org:8080/16499 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id9f8f40764b4f6664cc81b0da428afea8e3588d4 Gerrit-Change-Number: 16499 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 24 Sep 2020 01:54:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9930 (part 2): Introduce new admission control rpc service
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16412 ) Change subject: IMPALA-9930 (part 2): Introduce new admission control rpc service .. Patch Set 4: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/7265/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/16412 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I594fc593a27b24b6952e381a9bc1a9a5c6b757ae Gerrit-Change-Number: 16412 Gerrit-PatchSet: 4 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Thu, 24 Sep 2020 01:43:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9930 (part 2): Introduce new admission control rpc service
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16412 ) Change subject: IMPALA-9930 (part 2): Introduce new admission control rpc service .. Patch Set 3: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/7264/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/16412 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I594fc593a27b24b6952e381a9bc1a9a5c6b757ae Gerrit-Change-Number: 16412 Gerrit-PatchSet: 3 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Thu, 24 Sep 2020 01:41:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9930 (part 1): Initial refactor for admission control service
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16411 ) Change subject: IMPALA-9930 (part 1): Initial refactor for admission control service .. Patch Set 3: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/7263/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/16411 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7974a979cf05ed569f31e1ab20694e29fd3e4508 Gerrit-Change-Number: 16411 Gerrit-PatchSet: 3 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Thu, 24 Sep 2020 01:41:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9930 (part 2): Introduce new admission control rpc service
Hello Sahil Takiar, Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16412 to look at the new patch set (#4). Change subject: IMPALA-9930 (part 2): Introduce new admission control rpc service .. IMPALA-9930 (part 2): Introduce new admission control rpc service This patch introduces a new krpc service, AdmissionControlService, which coordinators can use to submit queries for admission. This patch adds some simple configuration flags that make it possible to have coordinators use this service to submit their queries for admission to other coordinators. These flags exist only to make this patch testable and will be replaced when the separate admission control daemon is introduced in IMPALA-9975. The interface consists of the following RPCs: - AdmitQuery: takes a TQueryExecRequest and a TQueryOptions (serialized into sidecars), places the request on a queue to be processed by a thread pool and then immediately returns. - GetQueryStatus: takes a query id and returns the current admission status, including the QuerySchedulePB if admission has completed successfully but the query has not been released yet. - ReleaseQueryBackends: called when individual backends complete but the overall query is still running to release resources incrementally. This RPC will be called at most O(log(# backends)) times per query due to BackendResourceState, which batches backends to release together. - ReleaseQuery: called when the query has completely finished. Releases all remaining resources. - CancelAdmission: called if a query is cancelled before an admission decision has been made to indicate that it should no longer be considered for admission. The majority of the patch consists of two classes: - AdmissionControlClient: used to abstract whether admission is being performed locally or remotely. In the local case, it is basically just a wrapper around AdmissionController. 
In the remote case, it handles serializing/deserializing of RPC params, polling GetQueryStatus() until a decision has been made, etc. - AdmissionControlService: exports the RPC interface and acts as a wrapper around AdmissionController. Testing: - Modified existing admission control tests to run both with and without the admission control service enabled, including both the functional and stress tests. The 'num_queries' param in the stress test is modified to only use a single value to reduce the number of tests that are run and keep the running time reasonable. - Ran tpch10 on a local minicluster and observed no significant regressions. Change-Id: I594fc593a27b24b6952e381a9bc1a9a5c6b757ae --- M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/scheduling/CMakeLists.txt M be/src/scheduling/admission-control-client.cc M be/src/scheduling/admission-control-client.h A be/src/scheduling/admission-control-service.cc A be/src/scheduling/admission-control-service.h M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/scheduling/local-admission-control-client.cc M be/src/scheduling/local-admission-control-client.h A be/src/scheduling/remote-admission-control-client.cc A be/src/scheduling/remote-admission-control-client.h M be/src/scheduling/schedule-state.cc M be/src/scheduling/schedule-state.h M be/src/service/client-request-state.cc M be/src/service/impala-http-handler.cc M be/src/util/sharded-query-map-util.cc M common/protobuf/admission_control_service.proto M tests/common/resource_pool_config.py M tests/custom_cluster/test_admission_controller.py M tests/hs2/hs2_test_suite.py M tests/util/web_pages_util.py 24 files changed, 1,217 insertions(+), 186 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/16412/4 -- To view, visit http://gerrit.cloudera.org:8080/16412 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings 
Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I594fc593a27b24b6952e381a9bc1a9a5c6b757ae Gerrit-Change-Number: 16412 Gerrit-PatchSet: 4 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar
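The claim in the commit message above that ReleaseQueryBackends is called at most O(log(# backends)) times per query can be illustrated with a small simulation. The batching rule used here (send a release once the completed-but-unreleased backends make up at least half of those still held) is an assumption chosen to produce logarithmic behaviour. It is not taken from BackendResourceState, and the class and function names are hypothetical.

```python
class BackendReleaseBatcher:
    """Hypothetical model of batching backend releases into few RPCs."""

    def __init__(self, num_backends: int):
        self.in_use = num_backends   # backends whose resources are still held
        self.pending = 0             # completed but not yet released
        self.release_calls = 0       # simulated ReleaseQueryBackends RPCs

    def backend_completed(self) -> None:
        self.pending += 1
        # Assumed rule: release once pending backends reach half of in_use.
        threshold = max(1, self.in_use // 2)
        if self.pending >= threshold:
            self.release_calls += 1
            self.in_use -= self.pending
            self.pending = 0


def count_release_calls(num_backends: int) -> int:
    """Complete backends one at a time and count the release RPCs sent."""
    batcher = BackendReleaseBatcher(num_backends)
    for _ in range(num_backends):
        batcher.backend_completed()
    assert batcher.in_use == 0
    return batcher.release_calls
```

Under this rule each release roughly halves the number of backends still held, so 1024 backends completing one after another trigger 11 release calls rather than 1024.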
[Impala-ASF-CR] IMPALA-9930 (part 2): Introduce new admission control rpc service
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/16412 ) Change subject: IMPALA-9930 (part 2): Introduce new admission control rpc service .. Patch Set 4: (23 comments) http://gerrit.cloudera.org:8080/#/c/16412/1/be/src/scheduling/admission-control-client.cc File be/src/scheduling/admission-control-client.cc: http://gerrit.cloudera.org:8080/#/c/16412/1/be/src/scheduling/admission-control-client.cc@210 PS1, Line 210: > line too long (91 > 90) Done http://gerrit.cloudera.org:8080/#/c/16412/3/be/src/scheduling/admission-control-client.cc File be/src/scheduling/admission-control-client.cc: http://gerrit.cloudera.org:8080/#/c/16412/3/be/src/scheduling/admission-control-client.cc@31 PS3, Line 31: const string AdmissionControlClient::QUERY_EVENT_SUBMIT_FOR_ADMISSION = > line too long (95 > 90) Done http://gerrit.cloudera.org:8080/#/c/16412/3/be/src/scheduling/admission-control-client.cc@33 PS3, Line 33: const string AdmissionControlClient::QUERY_EVENT_QUEUED = "Queued"; > line too long (93 > 90) Done http://gerrit.cloudera.org:8080/#/c/16412/1/be/src/scheduling/admission-controller.h File be/src/scheduling/admission-controller.h: http://gerrit.cloudera.org:8080/#/c/16412/1/be/src/scheduling/admission-controller.h@339 PS1, Line 339: std::string* request_pool = nullptr); > line too long (108 > 90) Done http://gerrit.cloudera.org:8080/#/c/16412/1/be/src/scheduling/admission-controller.h@343 PS1, Line 343: /// 'timeout_ms' has passed. 
If the function returns due to a timeout 'wait_timed_out' > line too long (99 > 90) Done http://gerrit.cloudera.org:8080/#/c/16412/1/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/16412/1/be/src/scheduling/admission-controller.cc@1116 PS1, Line 1116: DebugActionNoFail(request.query_options, "AC_BEFORE_ADMISSION"); > line too long (92 > 90) Done http://gerrit.cloudera.org:8080/#/c/16412/1/be/src/scheduling/admission-controller.cc@1292 PS1, Line 1292: const ErrorMsg& rejected_msg = ErrorMsg(TErrorCode::ADMISSION_TIMED_OUT, > line too long (97 > 90) Done http://gerrit.cloudera.org:8080/#/c/16412/3/be/src/scheduling/schedule-state.h File be/src/scheduling/schedule-state.h: http://gerrit.cloudera.org:8080/#/c/16412/3/be/src/scheduling/schedule-state.h@150 PS3, Line 150: /// For testing only: specify 'init=false' to build a ScheduleState object but do not > line too long (99 > 90) Done http://gerrit.cloudera.org:8080/#/c/16412/1/tests/custom_cluster/test_admission_controller.py File tests/custom_cluster/test_admission_controller.py: http://gerrit.cloudera.org:8080/#/c/16412/1/tests/custom_cluster/test_admission_controller.py@602 PS1, Line 602: > flake8: E501 line too long (94 > 90 characters) Done http://gerrit.cloudera.org:8080/#/c/16412/1/tests/custom_cluster/test_admission_controller.py@803 PS1, Line 803: > flake8: E501 line too long (91 > 90 characters) Done http://gerrit.cloudera.org:8080/#/c/16412/1/tests/custom_cluster/test_admission_controller.py@816 PS1, Line 816: > flake8: E501 line too long (95 > 90 characters) Done http://gerrit.cloudera.org:8080/#/c/16412/1/tests/custom_cluster/test_admission_controller.py@1082 PS1, Line 1082: > flake8: E501 line too long (113 > 90 characters) Done http://gerrit.cloudera.org:8080/#/c/16412/1/tests/custom_cluster/test_admission_controller.py@1140 PS1, Line 1140: > flake8: E501 line too long (92 > 90 characters) Done 
http://gerrit.cloudera.org:8080/#/c/16412/1/tests/custom_cluster/test_admission_controller.py@1287 PS1, Line 1287: > flake8: E501 line too long (94 > 90 characters) Done http://gerrit.cloudera.org:8080/#/c/16412/1/tests/custom_cluster/test_admission_controller.py@1318 PS1, Line 1318: > flake8: E501 line too long (97 > 90 characters) Done http://gerrit.cloudera.org:8080/#/c/16412/1/tests/custom_cluster/test_admission_controller.py@1321 PS1, Line 1321: > flake8: E501 line too long (92 > 90 characters) Done http://gerrit.cloudera.org:8080/#/c/16412/1/tests/custom_cluster/test_admission_controller.py@1903 PS1, Line 1903: # Each query mem limit (set the query option to override the per-host memory > flake8: E302 expected 2 blank lines, found 1 Done http://gerrit.cloudera.org:8080/#/c/16412/1/tests/custom_cluster/test_admission_controller.py@1919 PS1, Line 1919: > flake8: E501 line too long (97 > 90 characters) Done http://gerrit.cloudera.org:8080/#/c/16412/1/tests/custom_cluster/test_admission_controller.py@1922 PS1, Line 1922: > flake8: E501 line too long (92 > 90 characters) Done http://gerrit.cloudera.org:8080/#/c/16412/1/tests/hs2/hs2_test_suite.py File tests/hs2/hs2_test_suite.py: http://gerrit.cloudera.org:8080/#/c/16412/1/tests/hs2/hs2_test_suite.py@338 PS1, Line 338: 0 > flake8: E251 unexpected spaces around keyword / parameter equals Done http://gerrit.cloudera.o
[Impala-ASF-CR] [WIP] IMPALA-9930 (part 2): Introduce new admission control rpc service
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16412 ) Change subject: [WIP] IMPALA-9930 (part 2): Introduce new admission control rpc service .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/16412/3/be/src/scheduling/admission-control-client.cc File be/src/scheduling/admission-control-client.cc: http://gerrit.cloudera.org:8080/#/c/16412/3/be/src/scheduling/admission-control-client.cc@31 PS3, Line 31: const string AdmissionControlClient::QUERY_EVENT_SUBMIT_FOR_ADMISSION = "Submit for admission"; line too long (95 > 90) http://gerrit.cloudera.org:8080/#/c/16412/3/be/src/scheduling/admission-control-client.cc@33 PS3, Line 33: const string AdmissionControlClient::QUERY_EVENT_COMPLETED_ADMISSION = "Completed admission"; line too long (93 > 90) http://gerrit.cloudera.org:8080/#/c/16412/3/be/src/scheduling/schedule-state.h File be/src/scheduling/schedule-state.h: http://gerrit.cloudera.org:8080/#/c/16412/3/be/src/scheduling/schedule-state.h@150 PS3, Line 150: /// For testing only: specify 'init=false' to build a ScheduleState object but do not run Init(). line too long (99 > 90) -- To view, visit http://gerrit.cloudera.org:8080/16412 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I594fc593a27b24b6952e381a9bc1a9a5c6b757ae Gerrit-Change-Number: 16412 Gerrit-PatchSet: 3 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Thu, 24 Sep 2020 01:20:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP] IMPALA-9930 (part 2): Introduce new admission control rpc service
Hello Sahil Takiar, Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16412 to look at the new patch set (#3). Change subject: [WIP] IMPALA-9930 (part 2): Introduce new admission control rpc service .. [WIP] IMPALA-9930 (part 2): Introduce new admission control rpc service This patch introduces a new krpc service, AdmissionControlService, which coordinators can use to submit queries for admission. This patch adds some simple configuration flags that make it possible to have coordinators use this service to submit their queries for admission to other coordinators. These flags exist only to make this patch testable and will be replaced when the separate admission control daemon is introduced in IMPALA-9975. The interface consists of the following RPCs: - AdmitQuery: takes a TQueryExecRequest and a TQueryOptions (serialized into sidecars), places the request on a queue to be processed by a thread pool and then immediately returns. - GetQueryStatus: takes a query id and returns the current admission status, including the QuerySchedulePB if admission has completed successfully but the query has not been released yet. - ReleaseQueryBackends: called when individual backends complete but the overall query is still running to release resources incrementally. This RPC will be called at most O(log(# backends)) times per query due to BackendResourceState, which batches backends to release together. - ReleaseQuery: called when the query has completely finished. Releases all remaining resources. - CancelAdmission: called if a query is cancelled before an admission decision has been made to indicate that it should no longer be considered for admission. The majority of the patch consists of two classes: - AdmissionControlClient: used to abstract whether admission is being performed locally or remotely. In the local case, it is basically just a wrapper around AdmissionController. 
In the remote case, it handles serializing/deserializing of RPC params, polling GetQueryStatus() until a decision has been made, etc. - AdmissionControlService: exports the RPC interface and acts as a wrapper around AdmissionController. Testing: - Modified existing admission control tests to run both with and without the admission control service enabled, including both the functional and stress tests. The 'num_queries' param in the stress test is modified to only use a single value to reduce the number of tests that are run and keep the running time reasonable. - Ran tpch10 on a local minicluster and observed no significant regressions. Change-Id: I594fc593a27b24b6952e381a9bc1a9a5c6b757ae --- M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/scheduling/CMakeLists.txt M be/src/scheduling/admission-control-client.cc M be/src/scheduling/admission-control-client.h A be/src/scheduling/admission-control-service.cc A be/src/scheduling/admission-control-service.h M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/scheduling/local-admission-control-client.cc M be/src/scheduling/local-admission-control-client.h A be/src/scheduling/remote-admission-control-client.cc A be/src/scheduling/remote-admission-control-client.h M be/src/scheduling/schedule-state.cc M be/src/scheduling/schedule-state.h M be/src/service/client-request-state.cc M be/src/service/impala-http-handler.cc M be/src/util/sharded-query-map-util.cc M common/protobuf/admission_control_service.proto M tests/common/resource_pool_config.py M tests/custom_cluster/test_admission_controller.py M tests/hs2/hs2_test_suite.py M tests/util/web_pages_util.py 24 files changed, 1,213 insertions(+), 184 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/16412/3 -- To view, visit http://gerrit.cloudera.org:8080/16412 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings 
Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I594fc593a27b24b6952e381a9bc1a9a5c6b757ae Gerrit-Change-Number: 16412 Gerrit-PatchSet: 3 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar
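The submit-then-poll flow described for the remote case in the message above (AdmitQuery returns immediately, then GetQueryStatus is polled until a decision exists) can be sketched roughly as follows. This is only an illustration in Python, not the patch's C++ code: `FakeAdmissionService`, `AdmissionStatus`, `submit_for_admission`, the poll interval and the `max_polls` limit are all hypothetical names and values invented for the sketch.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class AdmissionStatus:
    # Hypothetical status record. Per the message, the real RPC returns a
    # QuerySchedulePB once admission has completed successfully.
    decided: bool = False
    admitted: bool = False
    schedule: Optional[str] = None


class FakeAdmissionService:
    """In-memory stand-in for the remote service (illustration only)."""

    def __init__(self, polls_until_decision: int = 3) -> None:
        self._polls_until_decision = polls_until_decision
        self._polls_left: Dict[str, int] = {}

    def admit_query(self, query_id: str) -> None:
        # Like AdmitQuery: enqueue the request and return immediately.
        self._polls_left[query_id] = self._polls_until_decision

    def get_query_status(self, query_id: str) -> AdmissionStatus:
        # Like GetQueryStatus: report "undecided" until the simulated
        # worker has finished with the request.
        self._polls_left[query_id] -= 1
        if self._polls_left[query_id] > 0:
            return AdmissionStatus()
        return AdmissionStatus(
            decided=True, admitted=True, schedule=f"schedule-for-{query_id}")

    def cancel_admission(self, query_id: str) -> None:
        # Like CancelAdmission: stop considering the query for admission.
        self._polls_left.pop(query_id, None)


def submit_for_admission(service: FakeAdmissionService, query_id: str,
                         poll_interval_s: float = 0.0,
                         max_polls: int = 100) -> AdmissionStatus:
    """Submits a query, then polls until an admission decision is made."""
    service.admit_query(query_id)
    for _ in range(max_polls):
        status = service.get_query_status(query_id)
        if status.decided:
            return status
        time.sleep(poll_interval_s)
    # Give up: withdraw the request so it is no longer considered.
    service.cancel_admission(query_id)
    raise TimeoutError(f"no admission decision for {query_id}")
```

A caller would use `submit_for_admission(FakeAdmissionService(), "q1")` and read the schedule from the returned status. Whether the real client bounds the number of polls or cancels on timeout is not stated in the message; both are assumptions of this sketch.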
[Impala-ASF-CR] IMPALA-9930 (part 1): Initial refactor for admission control service
Hello Sahil Takiar, Joe McDonnell, Wenzhe Zhou, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16411 to look at the new patch set (#3).

Change subject: IMPALA-9930 (part 1): Initial refactor for admission control service
..

IMPALA-9930 (part 1): Initial refactor for admission control service

This patch contains the following refactors that are needed for the admission control service, in order to make the main patch easier to review:
- Adds a new class AdmissionControlClient which will be used to abstract the logic for submitting queries to either a local or remote admission controller out from ClientRequestState/Coordinator. Currently only local submission is supported.
- SubmitForAdmission now takes a BackendId representing the coordinator instead of assuming that the local impalad will be the coordinator.
- The CRS_BEFORE_ADMISSION debug action is moved into SubmitForAdmission() so that it will be executed on whichever daemon is performing admission control rather than always on the coordinator (needed for TestAdmissionController.test_cancellation).
- ShardedQueryMap is extended to allow keys to be either TUniqueId or UniqueIdPB and Add(), Get(), and Delete() convenience functions are added.
- Some utils related to serializing Thrift objects into sidecars are added.

Testing:
- Passed a run of existing core tests.

Change-Id: I7974a979cf05ed569f31e1ab20694e29fd3e4508
---
A be/src/rpc/sidecar-util.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/query-driver.cc
M be/src/runtime/query-driver.h
M be/src/scheduling/CMakeLists.txt
A be/src/scheduling/admission-control-client.cc
A be/src/scheduling/admission-control-client.h
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
A be/src/scheduling/local-admission-control-client.cc
A be/src/scheduling/local-admission-control-client.h
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/service/CMakeLists.txt
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/control-service.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
D be/src/service/query-driver-map.cc
D be/src/service/query-driver-map.h
M be/src/util/CMakeLists.txt
A be/src/util/sharded-query-map-util.cc
M be/src/util/sharded-query-map-util.h
M common/thrift/ImpalaService.thrift
M tests/custom_cluster/test_admission_controller.py
M tests/custom_cluster/test_restart_services.py
M tests/query_test/test_observability.py
29 files changed, 520 insertions(+), 237 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/16411/3
--
To view, visit http://gerrit.cloudera.org:8080/16411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7974a979cf05ed569f31e1ab20694e29fd3e4508
Gerrit-Change-Number: 16411
Gerrit-PatchSet: 3
Gerrit-Owner: Thomas Tauber-Marshall
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell
Gerrit-Reviewer: Sahil Takiar
Gerrit-Reviewer: Wenzhe Zhou
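For readers unfamiliar with the idea behind a sharded map with Add(), Get() and Delete() helpers, as mentioned in the message above, here is a minimal Python sketch. It is not the ShardedQueryMap implementation: the class name `ShardedMap`, the shard count, and the use of `KeyError` in place of an error Status are assumptions made for illustration. Query ids are modelled as `(hi, lo)` integer tuples, so any hashable id representation would work as a key.

```python
import threading
from typing import Any, Dict, Hashable, List, Tuple


class ShardedMap:
    """Map split into independently locked shards to reduce lock contention."""

    def __init__(self, num_shards: int = 4) -> None:
        # Each shard pairs its own lock with its own dictionary.
        self._shards: List[Tuple[threading.Lock, Dict[Hashable, Any]]] = [
            (threading.Lock(), {}) for _ in range(num_shards)
        ]

    def _shard(self, key: Hashable) -> Tuple[threading.Lock, Dict[Hashable, Any]]:
        # The key's hash selects the shard, so a key always maps to one shard.
        return self._shards[hash(key) % len(self._shards)]

    def add(self, key: Hashable, value: Any) -> None:
        """Inserts 'value' under 'key'; raises KeyError if 'key' already exists."""
        lock, data = self._shard(key)
        with lock:
            if key in data:
                raise KeyError(f"duplicate key: {key!r}")
            data[key] = value

    def get(self, key: Hashable) -> Any:
        """Returns the value for 'key'; raises KeyError if it is absent."""
        lock, data = self._shard(key)
        with lock:
            return data[key]

    def delete(self, key: Hashable) -> None:
        """Removes 'key'; raises KeyError if it is absent."""
        lock, data = self._shard(key)
        with lock:
            del data[key]
```

Only the shard that owns a key is locked during an operation, so lookups for unrelated queries do not serialize on a single global lock.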
[Impala-ASF-CR] IMPALA-9930 (part 1): Initial refactor for admission control service
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/16411 ) Change subject: IMPALA-9930 (part 1): Initial refactor for admission control service .. Patch Set 3: (8 comments) http://gerrit.cloudera.org:8080/#/c/16411/2/be/src/rpc/sidecar-util.h File be/src/rpc/sidecar-util.h: http://gerrit.cloudera.org:8080/#/c/16411/2/be/src/rpc/sidecar-util.h@83 PS2, Line 83: Status > In the original code in Coordinator::BackendState::ExecAsync, this Status w I don't think so - Expected() is for common errors (which this shouldn't be) and just prevents logging of a stack trace (which we actually need now, since this function could be called from multiple places and the error doesn't have any way to indicate what sidecar is too big). http://gerrit.cloudera.org:8080/#/c/16411/2/be/src/scheduling/admission-control-client.h File be/src/scheduling/admission-control-client.h: http://gerrit.cloudera.org:8080/#/c/16411/2/be/src/scheduling/admission-control-client.h@35 PS2, Line 35: // Creates a new AdmissionControlClient and returns it in 'client'. : static void Create( : const TUniqueId& query_id, std::unique_ptr* client); : : // Called to schedule and admit the query. Blocks until an admission decision is made. : virtual Status SubmitForAdmission(const AdmissionController::AdmissionRequest& request, : std::unique_ptr* schedule_result) = 0; : : // Called when the query has completed to release all of its resources. : virtual void ReleaseQue > nit: docs Done http://gerrit.cloudera.org:8080/#/c/16411/2/be/src/scheduling/admission-control-client.h@47 PS2, Line 47: // for the query on those backends. 
> nit: docs Done http://gerrit.cloudera.org:8080/#/c/16411/2/be/src/scheduling/admission-control-client.h@51 PS2, Line 51: virtual void CancelAdmission() = 0; > would it be better to model AdmissionControlClient as an purely virtual cla Done http://gerrit.cloudera.org:8080/#/c/16411/2/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/16411/2/be/src/scheduling/admission-controller.cc@1525 PS2, Line 1525: It' > nit: It's Done http://gerrit.cloudera.org:8080/#/c/16411/2/be/src/scheduling/admission-controller.cc@1527 PS2, Line 1527: membership > nit: typo Done http://gerrit.cloudera.org:8080/#/c/16411/2/be/src/util/sharded-query-map-util.h File be/src/util/sharded-query-map-util.h: http://gerrit.cloudera.org:8080/#/c/16411/2/be/src/util/sharded-query-map-util.h@61 PS2, Line 61: // Returns the value associated with 'key' in 'value', returning an error if 'key' > nit: docs Done http://gerrit.cloudera.org:8080/#/c/16411/2/be/src/util/sharded-query-map-util.h@165 PS2, Line 165: : GenericScopedShardedMapRef(query_id, sharded_map) {} : }; : : } // namespace impala > should all of this (and possibly more) be moved into a .cc file? I remember Done -- To view, visit http://gerrit.cloudera.org:8080/16411 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7974a979cf05ed569f31e1ab20694e29fd3e4508 Gerrit-Change-Number: 16411 Gerrit-PatchSet: 3 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Thu, 24 Sep 2020 01:19:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10170: Data race on Webserver::UrlHandler::is_on_nav_bar_
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16459 ) Change subject: IMPALA-10170: Data race on Webserver::UrlHandler::is_on_nav_bar_ .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16459 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6923af4754e3fe72b8b04c5303a1e7a79da7613a Gerrit-Change-Number: 16459 Gerrit-PatchSet: 6 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Thu, 24 Sep 2020 00:45:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10170: Data race on Webserver::UrlHandler::is_on_nav_bar_
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16459 ) Change subject: IMPALA-10170: Data race on Webserver::UrlHandler::is_on_nav_bar_ .. IMPALA-10170: Data race on Webserver::UrlHandler::is_on_nav_bar_ This data race can be reproduced by TestCompactCatalogUpdates.test_restart_catalogd, although it does not seem to always occur. The data race was originally reported in a Jenkins job, but I could not reproduce it locally. The fix is to acquire a read lock while reading some UrlHandler objects. I cleaned up some of the other involved variables and made them const. These variables are set during construction time, and never modified afterwards. Testing: * Ran be and custom cluster TSAN tests Change-Id: I6923af4754e3fe72b8b04c5303a1e7a79da7613a Reviewed-on: http://gerrit.cloudera.org:8080/16459 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/util/webserver.cc M be/src/util/webserver.h 2 files changed, 15 insertions(+), 12 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16459 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I6923af4754e3fe72b8b04c5303a1e7a79da7613a Gerrit-Change-Number: 16459 Gerrit-PatchSet: 7 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Wenzhe Zhou
[Impala-ASF-CR] IMPALA-10178 Run-time profile shall report skews
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16474 ) Change subject: IMPALA-10178 Run-time profile shall report skews .. Patch Set 20: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7262/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I91041f2856eef8293ea78f1721f97469062589a1 Gerrit-Change-Number: 16474 Gerrit-PatchSet: 20 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 24 Sep 2020 00:14:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10178 Run-time profile shall report skews
Qifan Chen has uploaded a new patch set (#20). ( http://gerrit.cloudera.org:8080/16474 ) Change subject: IMPALA-10178 Run-time profile shall report skews .. IMPALA-10178 Run-time profile shall report skews This fix addresses the current limitation in runtime profile that skews existing in certain operators such as the rows read counter (RowsRead) in the scan operators are not reported. A skew condition exists when the number of rows processed at each operator instance is not about the same and can be detected through standard deviation (stddev). A high stddev (say > 5) usually implies the existence of skew. With the fix, such skew is detected for the following counters 1. RowsRead in HDFS_SCAN_NODE and KUDU_SCAN_NODE profile 2. ProbeRows and BuildRows in HASH_JOIN_NODE profile 3. RowsReturned in GroupingAggregator, EXCHANGE and SORT_NODE profile and reported as follows: 1. In a new skew summary in execution profile that lists the names of the operators with skews; 2. In each corresponding operator in the averaged profile, the name of the counter, the list of values of the counter across the impalad backend processes, and the stddev value. Examples of skews reported for a hash join and an hdfs scan. ... ... Execution Profile ... ... ... ... skew(s) found at: HASH_JOIN_NODE (id=4), HDFS_SCAN_NODE (id=0) Per Node Peak Memory Usage: ... Per Node Bytes Read: ... Per Node User Time: ... Per Node System Time: ... ... HASH_JOIN_NODE (id=4): ... Skew details: ProbeRows ([16904,17750,19197], stddev=946.77) ... ... HDFS_SCAN_NODE (id=0): ... Skew details: RowsRead ([913887,917913,1048604], stddev=62578.85) Testing: 1. Added a new test test_skew_reporting_in_runtime_profile in test_observability.py to verify that the skews are reported. 2. Ran Core tests successfully. 
Change-Id: I91041f2856eef8293ea78f1721f97469062589a1 --- M be/src/runtime/coordinator.cc M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M be/src/util/stat-util.h M tests/query_test/test_hash_join_timer.py M tests/query_test/test_observability.py 7 files changed, 201 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/16474/20 -- To view, visit http://gerrit.cloudera.org:8080/16474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I91041f2856eef8293ea78f1721f97469062589a1 Gerrit-Change-Number: 16474 Gerrit-PatchSet: 20 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
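The skew check described in the message above can be illustrated with a short Python sketch. It assumes a population standard deviation (the figures quoted in the example profile are consistent with that) and uses the message's rough "say > 5" figure as a default threshold. The function name `skew_details` and the exact threshold semantics are assumptions, not the code in stat-util.h or runtime-profile.cc.

```python
from statistics import pstdev
from typing import Optional, Sequence


def skew_details(counter_name: str, values: Sequence[int],
                 threshold: float = 5.0) -> Optional[str]:
    """Returns a 'Skew details'-style string, or None if no skew is detected.

    'values' holds one counter value per operator instance. 'threshold' is
    an illustrative cut-off taken from the commit message's "say > 5".
    """
    if len(values) < 2:
        # A single instance cannot be skewed relative to others.
        return None
    stddev = pstdev(values)  # population standard deviation
    if stddev <= threshold:
        return None
    joined = ",".join(str(v) for v in values)
    return f"{counter_name} ([{joined}], stddev={stddev:.2f})"


print(skew_details("ProbeRows", [16904, 17750, 19197]))
print(skew_details("RowsRead", [913887, 917913, 1048604]))
```

Run on the counter values from the example profile, this reproduces the quoted stddev figures of 946.77 and 62578.85. Note that a fixed threshold on the raw standard deviation does not scale with the magnitude of the counter, which is a limitation of this sketch.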
[Impala-ASF-CR] IMPALA-9792: Add ability to split kudu scan ranges
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16385 ) Change subject: IMPALA-9792: Add ability to split kudu scan ranges .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7261/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16385 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia02fd94cc1d13c61bc6cb0765dd2cbe90e9a5ce8 Gerrit-Change-Number: 16385 Gerrit-PatchSet: 6 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 Sep 2020 23:55:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10183: Fix hitting DCHECK when cancelling a query with result spooling
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16489 ) Change subject: IMPALA-10183: Fix hitting DCHECK when cancelling a query with result spooling .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6470/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16489 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaba0ed729ef984f9c51347df02e9fb6f90bc71e0 Gerrit-Change-Number: 16489 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 Sep 2020 23:48:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10183: Fix hitting DCHECK when cancelling a query with result spooling
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16489 ) Change subject: IMPALA-10183: Fix hitting DCHECK when cancelling a query with result spooling .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/16489/1/be/src/exec/buffered-plan-root-sink.cc File be/src/exec/buffered-plan-root-sink.cc: http://gerrit.cloudera.org:8080/#/c/16489/1/be/src/exec/buffered-plan-root-sink.cc@168 PS1, Line 168: MonotonicStopWatch wait_timeout_timer; > Back in my PhD days I spent a bunch of time doing a literature survey on pr Haha, awesome! :D -- To view, visit http://gerrit.cloudera.org:8080/16489 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaba0ed729ef984f9c51347df02e9fb6f90bc71e0 Gerrit-Change-Number: 16489 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 Sep 2020 23:47:53 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9792: Add ability to split kudu scan ranges
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16385 ) Change subject: IMPALA-9792: Add ability to split kudu scan ranges .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7260/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16385 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia02fd94cc1d13c61bc6cb0765dd2cbe90e9a5ce8 Gerrit-Change-Number: 16385 Gerrit-PatchSet: 5 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 Sep 2020 23:44:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9792: Add ability to split kudu scan ranges
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16385 ) Change subject: IMPALA-9792: Add ability to split kudu scan ranges .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6469/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16385 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia02fd94cc1d13c61bc6cb0765dd2cbe90e9a5ce8 Gerrit-Change-Number: 16385 Gerrit-PatchSet: 6 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 Sep 2020 23:35:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9792: Add ability to split kudu scan ranges
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/16385 ) Change subject: IMPALA-9792: Add ability to split kudu scan ranges .. Patch Set 6: Code-Review+2 Rebased. Carrying forward Tim's +2 -- To view, visit http://gerrit.cloudera.org:8080/16385 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia02fd94cc1d13c61bc6cb0765dd2cbe90e9a5ce8 Gerrit-Change-Number: 16385 Gerrit-PatchSet: 6 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 Sep 2020 23:34:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9792: Add ability to split kudu scan ranges
Hello Grant Henke, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16385 to look at the new patch set (#6). Change subject: IMPALA-9792: Add ability to split kudu scan ranges .. IMPALA-9792: Add ability to split kudu scan ranges This patch adds the ability to split kudu scan token via the provided kudu java API. A query option "TARGETED_KUDU_SCAN_RANGE_LENGTH" has been added to set the scan range length used in this implementation. Potential benefit: This helps increase parallelism during scanning which can result in more efficient use of CPU with higher mt_dop. Limitation: - The scan range length sent to kudu is just a hint and does not guarantee that the token will be split at that limit. - Comes at an added cost of an RPC to tablet server per token in order to split it. A slow tablet server which can already slow down scanning during execution can now also potentially slow down planning. - Also adds the cost of an RPC per token to open a new scanner for it on the kudu side. Therefore, scanning many smaller split tokens can slow down scanning and we can also lose benefits of scanning a single large token sequentially with a single scanner. 
Testing: - Added an e2e test Change-Id: Ia02fd94cc1d13c61bc6cb0765dd2cbe90e9a5ce8 --- M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M tests/query_test/test_kudu.py 7 files changed, 86 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/85/16385/6 -- To view, visit http://gerrit.cloudera.org:8080/16385 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia02fd94cc1d13c61bc6cb0765dd2cbe90e9a5ce8 Gerrit-Change-Number: 16385 Gerrit-PatchSet: 6 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9792: Add ability to split kudu scan ranges
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/16385 ) Change subject: IMPALA-9792: Add ability to split kudu scan ranges .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/16385/4/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/16385/4/common/thrift/ImpalaService.thrift@583 PS4, Line 583: CONVERT_LEGACY_HIVE_PARQUET_UTC_TIMESTAMPS = 112 > nit: missing blank line Done -- To view, visit http://gerrit.cloudera.org:8080/16385 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia02fd94cc1d13c61bc6cb0765dd2cbe90e9a5ce8 Gerrit-Change-Number: 16385 Gerrit-PatchSet: 5 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 Sep 2020 23:24:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9792: Add ability to split kudu scan ranges
Hello Grant Henke, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16385 to look at the new patch set (#5). Change subject: IMPALA-9792: Add ability to split kudu scan ranges .. IMPALA-9792: Add ability to split kudu scan ranges This patch adds the ability to split kudu scan token via the provided kudu java API. A query option "TARGETED_KUDU_SCAN_RANGE_LENGTH" has been added to set the scan range length used in this implementation. Potential benefit: This helps increase parallelism during scanning which can result in more efficient use of CPU with higher mt_dop. Limitation: - The scan range length sent to kudu is just a hint and does not guarantee that the token will be split at that limit. - Comes at an added cost of an RPC to tablet server per token in order to split it. A slow tablet server which can already slow down scanning during execution can now also potentially slow down planning. - Also adds the cost of an RPC per token to open a new scanner for it on the kudu side. Therefore, scanning many smaller split tokens can slow down scanning and we can also lose benefits of scanning a single large token sequentially with a single scanner. 
Testing: - Added an e2e test Change-Id: Ia02fd94cc1d13c61bc6cb0765dd2cbe90e9a5ce8 --- M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M tests/query_test/test_kudu.py 7 files changed, 85 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/85/16385/5 -- To view, visit http://gerrit.cloudera.org:8080/16385 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia02fd94cc1d13c61bc6cb0765dd2cbe90e9a5ce8 Gerrit-Change-Number: 16385 Gerrit-PatchSet: 5 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-10112: Remove FpRateTooHigh() check for blom filter
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16499 ) Change subject: IMPALA-10112: Remove FpRateTooHigh() check for blom filter .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7259/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16499 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id9f8f40764b4f6664cc81b0da428afea8e3588d4 Gerrit-Change-Number: 16499 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 Sep 2020 21:32:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10178 Run-time profile shall report skews
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16474 ) Change subject: IMPALA-10178 Run-time profile shall report skews .. Patch Set 19: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7258/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I91041f2856eef8293ea78f1721f97469062589a1 Gerrit-Change-Number: 16474 Gerrit-PatchSet: 19 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 Sep 2020 21:28:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10178 Run-time profile shall report skews
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16474 ) Change subject: IMPALA-10178 Run-time profile shall report skews .. Patch Set 18: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7257/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I91041f2856eef8293ea78f1721f97469062589a1 Gerrit-Change-Number: 16474 Gerrit-PatchSet: 18 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 Sep 2020 21:16:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10112: Remove FpRateTooHigh() check for blom filter
Riza Suminto has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16499 )

Change subject: IMPALA-10112: Remove FpRateTooHigh() check for blom filter
..

IMPALA-10112: Remove FpRateTooHigh() check for blom filter

This patch removes the FpRateTooHigh() check for bloom filters, which can disable a filter if the observed false-positive probability (FPP) rate is higher than FLAGS_max_filter_error_rate. A filter with a high FPP rate is still worth evaluating for several reasons:
1. Partition filters are probably still worth evaluating even if there are false positives, because it's cheap and eliminating a partition is still beneficial.
2. Runtime filters are dynamically disabled on the scan side if they are ineffective. An always true filter is also still being evaluated and not entirely free.
3. The disabling is fairly unlikely to kick in for partitioned joins because it's only applied to a small subset of the filter, before the Or() operation.
4. FpRateTooHigh() uses num_build_rows to approximate the actual FPP rate of the resulting filter. This can be inaccurate because it does not account for duplicate values of the filter key on the build side.

This patch also removes some tests in test_runtime_filters.py that check cancellation of filters having a high FPP rate.

Testing:
- Ran and passed core tests.
- Manually tested and verified on a real large cluster (TPC-DS 10TB scale) that the removal of the high FPP check incurs little to no performance regression. The TPC-DS queries used for testing are Q14a, Q50, Q64, Q71, Q84, Q93, and a modification of Q93 where we replace the left outer join with an inner join.

Change-Id: Id9f8f40764b4f6664cc81b0da428afea8e3588d4
---
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-filter-bank.h
M testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test
M testdata/workloads/functional-query/queries/QueryTest/bloom_filters_wait.test
M tests/query_test/test_runtime_filters.py
6 files changed, 44 insertions(+), 57 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/16499/1
--
To view, visit http://gerrit.cloudera.org:8080/16499
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Id9f8f40764b4f6664cc81b0da428afea8e3588d4
Gerrit-Change-Number: 16499
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto
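Point 4 of the message above (estimating FPP from the build-side row count overstates it when keys repeat) can be illustrated with the textbook Bloom filter approximation (1 - e^(-kn/m))^k. This Python sketch is not Impala's formula: Impala's filters are block-based and may use a different estimate. The function names, the filter size, the hash count and the 0.75 error-rate limit are all illustrative assumptions.

```python
import math


def bloom_fpp(num_keys: int, num_bits: int, num_hashes: int) -> float:
    """Textbook false-positive estimate for a classic Bloom filter.

    Uses (1 - e^(-k*n/m))^k with n inserted keys, m bits and k hash
    functions. This is an approximation, not Impala's block-filter formula.
    """
    return (1.0 - math.exp(-num_hashes * num_keys / num_bits)) ** num_hashes


def fp_rate_too_high(num_keys: int, num_bits: int, num_hashes: int,
                     max_error_rate: float) -> bool:
    """Hypothetical analogue of the removed check: compare estimate to a limit."""
    return bloom_fpp(num_keys, num_bits, num_hashes) > max_error_rate


# Illustrative numbers: 1,000,000 build rows holding only 50,000 distinct keys.
NUM_BITS = 1 << 20
NUM_HASHES = 8
MAX_ERROR_RATE = 0.75  # illustrative limit, chosen for this sketch

by_rows = bloom_fpp(1_000_000, NUM_BITS, NUM_HASHES)
by_distinct = bloom_fpp(50_000, NUM_BITS, NUM_HASHES)
print(f"estimate from row count:      {by_rows:.4f}")
print(f"estimate from distinct count: {by_distinct:.6f}")
```

Because duplicate keys set the same bits, only distinct keys affect the filter's real false-positive rate. With these numbers the row-count estimate exceeds the limit while the distinct-count estimate is far below it, so a check driven by the row count would disable a filter that is actually selective.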
[Impala-ASF-CR] IMPALA-10178 Run-time profile shall report skews
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16474 ) Change subject: IMPALA-10178 Run-time profile shall report skews .. Patch Set 19: (2 comments) http://gerrit.cloudera.org:8080/#/c/16474/19/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/16474/19/tests/query_test/test_observability.py@806 PS19, Line 806: = flake8: E225 missing whitespace around operator http://gerrit.cloudera.org:8080/#/c/16474/19/tests/query_test/test_observability.py@814 PS19, Line 814: flake8: E221 multiple spaces before operator -- To view, visit http://gerrit.cloudera.org:8080/16474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I91041f2856eef8293ea78f1721f97469062589a1 Gerrit-Change-Number: 16474 Gerrit-PatchSet: 19 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 Sep 2020 21:08:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10178 Run-time profile shall report skews
Qifan Chen has uploaded a new patch set (#19). ( http://gerrit.cloudera.org:8080/16474 ) Change subject: IMPALA-10178 Run-time profile shall report skews .. IMPALA-10178 Run-time profile shall report skews This fix addresses the current limitation in runtime profile that skews existing in certain operators such as the rows read counter (RowsRead) in the scan operators are not reported. A skew condition exists when the number of rows processed at each operator instance is not about the same and can be detected through standard deviation (stddev). A high stddev (say > 5) usually implies the existence of skew. With the fix, such skew is detected for the following counters 1. RowsRead in HDFS_SCAN_NODE and KUDU_SCAN_NODE profile 2. ProbeRows and BuildRows in HASH_JOIN_NODE profile 3. RowsReturned in GroupingAggregator, EXCHANGE and SORT_NODE profile and reported as follows: 1. In a new skew summary in execution profile that lists the names of the operators with skews; 2. In each corresponding operator in the averaged profile, the name of the counter, the list of values of the counter across the impalad backend processes, and the stddev value. Examples of skews reported for a hash join and an hdfs scan. ... ... Execution Profile ... ... ... ... skew(s) found at: HASH_JOIN_NODE (id=4), HDFS_SCAN_NODE (id=0) Per Node Peak Memory Usage: ... Per Node Bytes Read: ... Per Node User Time: ... Per Node System Time: ... ... HASH_JOIN_NODE (id=4): ... Skew details: ProbeRows ([16904,17750,19197], stddev=946.77) ... ... HDFS_SCAN_NODE (id=0): ... Skew details: RowsRead ([913887,917913,1048604], stddev=62578.85) Testing: 1. Added a new test test_skew_reporting_in_runtime_profile in test_observability.py to verify that the skews are reported. 2. Ran Core tests successfully. 
Change-Id: I91041f2856eef8293ea78f1721f97469062589a1 --- M be/src/runtime/coordinator.cc M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M be/src/util/stat-util.h M tests/query_test/test_hash_join_timer.py M tests/query_test/test_observability.py 7 files changed, 195 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/16474/19 -- To view, visit http://gerrit.cloudera.org:8080/16474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I91041f2856eef8293ea78f1721f97469062589a1 Gerrit-Change-Number: 16474 Gerrit-PatchSet: 19 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
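The skew check described in the commit message above can be sketched as follows. This is a minimal illustration in Python, not the Impala implementation (which lives in the C++ backend). The helper names `population_stddev` and `skew_details` are hypothetical, and the default threshold of 5 is an assumption taken from the "say > 5" wording in the commit message.

```python
import math


def population_stddev(values):
    """Return the population standard deviation (divides by N, not N - 1)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def skew_details(counter_name, values, threshold=5.0):
    """Return a 'Skew details'-style string, or None when no skew is flagged.

    Hypothetical helper: the threshold default mirrors the "say > 5" remark
    in the commit message and is not taken from the Impala source.
    """
    if not values:
        return None
    stddev = population_stddev(values)
    if stddev <= threshold:
        return None
    joined = ",".join(str(v) for v in values)
    return "%s ([%s], stddev=%.2f)" % (counter_name, joined, stddev)


# Per-instance counter values copied from the example profile above.
print(skew_details("ProbeRows", [16904, 17750, 19197]))
print(skew_details("RowsRead", [913887, 917913, 1048604]))
```

With the per-instance values from the example profile, the sketch reproduces the reported stddev figures (946.77 and 62578.85), which is consistent with a population rather than a sample standard deviation.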
[Impala-ASF-CR] IMPALA-10178 Run-time profile shall report skews
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16474 ) Change subject: IMPALA-10178 Run-time profile shall report skews .. Patch Set 18: (4 comments) http://gerrit.cloudera.org:8080/#/c/16474/18/be/src/util/stat-util.h File be/src/util/stat-util.h: http://gerrit.cloudera.org:8080/#/c/16474/18/be/src/util/stat-util.h@45 PS18, Line 45: /// Computes the mean and the standard deviation (population) from an array of line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/16474/18/tests/query_test/test_hash_join_timer.py File tests/query_test/test_hash_join_timer.py: http://gerrit.cloudera.org:8080/#/c/16474/18/tests/query_test/test_hash_join_timer.py@141 PS18, Line 141: ; flake8: E703 statement ends with a semicolon http://gerrit.cloudera.org:8080/#/c/16474/18/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/16474/18/tests/query_test/test_observability.py@806 PS18, Line 806: = flake8: E225 missing whitespace around operator http://gerrit.cloudera.org:8080/#/c/16474/18/tests/query_test/test_observability.py@814 PS18, Line 814: flake8: E221 multiple spaces before operator -- To view, visit http://gerrit.cloudera.org:8080/16474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I91041f2856eef8293ea78f1721f97469062589a1 Gerrit-Change-Number: 16474 Gerrit-PatchSet: 18 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 Sep 2020 20:55:15 + Gerrit-HasComments: Yes
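For readers unfamiliar with the flake8 codes in this message, the sketch below shows compliant forms of the three flagged Python patterns. The variable names are illustrative only and are not taken from the test files under review.

```python
# E703: a statement must not end with a semicolon.
#   flagged:   timeout_s = 60;
timeout_s = 60

# E225: operators, including '=', need whitespace on both sides.
#   flagged:   retries=timeout_s // 20
retries = timeout_s // 20

# E221: only a single space is allowed before an operator.
#   flagged:   label   = "skew"
label = "skew"

print(timeout_s, retries, label)
```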
[Impala-ASF-CR] IMPALA-10178 Run-time profile shall report skews
Qifan Chen has uploaded a new patch set (#18). ( http://gerrit.cloudera.org:8080/16474 )

Change subject: IMPALA-10178 Run-time profile shall report skews
..

IMPALA-10178 Run-time profile shall report skews

This fix addresses a current limitation of the runtime profile: skews in certain operator counters, such as the rows read counter (RowsRead) in the scan operators, are not reported. A skew condition exists when the number of rows processed at each operator instance is not about the same, and it can be detected through the standard deviation (stddev). A high stddev (say > 5) usually implies the existence of skew.

With the fix, such skew is detected for the following counters:
1. RowsRead in HDFS_SCAN_NODE and KUDU_SCAN_NODE profile
2. ProbeRows and BuildRows in HASH_JOIN_NODE profile
3. RowsReturned in GroupingAggregator, EXCHANGE and SORT_NODE profile

and reported as follows:
1. In a new skew summary in the execution profile that lists the names of the operators with skews;
2. In each corresponding operator in the averaged profile, the name of the counter, the list of values of the counter across the impalad backend processes, and the stddev value.

Examples of skews reported for a hash join and an hdfs scan:

  ... ...
  Execution Profile ...
  ... ... ...
  skew(s) found at: HASH_JOIN_NODE (id=4), HDFS_SCAN_NODE (id=0)
  Per Node Peak Memory Usage: ...
  Per Node Bytes Read: ...
  Per Node User Time: ...
  Per Node System Time: ...
  ...
  HASH_JOIN_NODE (id=4): ...
    Skew details: ProbeRows ([16904,17750,19197], stddev=946.77)
  ... ...
  HDFS_SCAN_NODE (id=0): ...
    Skew details: RowsRead ([913887,917913,1048604], stddev=62578.85)

Testing:
1. Added a new test test_skew_reporting_in_runtime_profile in test_observability.py to verify that the skews are reported.
2. Ran Core tests successfully.
Change-Id: I91041f2856eef8293ea78f1721f97469062589a1 --- M be/src/runtime/coordinator.cc M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M be/src/util/stat-util.h M tests/query_test/test_hash_join_timer.py M tests/query_test/test_observability.py 7 files changed, 195 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/16474/18 -- To view, visit http://gerrit.cloudera.org:8080/16474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I91041f2856eef8293ea78f1721f97469062589a1 Gerrit-Change-Number: 16474 Gerrit-PatchSet: 18 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9382: part 2/3: aggregate profiles sent to coordinator
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16057 ) Change subject: IMPALA-9382: part 2/3: aggregate profiles sent to coordinator .. Patch Set 11: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/7255/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/16057 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic680cbfe94c939c2a8fad9d0943034ed058c6bca Gerrit-Change-Number: 16057 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 Sep 2020 20:21:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9382: part 2/3: aggregate profiles sent to coordinator
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16057 ) Change subject: IMPALA-9382: part 2/3: aggregate profiles sent to coordinator .. Patch Set 12: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/7256/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/16057 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic680cbfe94c939c2a8fad9d0943034ed058c6bca Gerrit-Change-Number: 16057 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 Sep 2020 20:23:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9382: part 2/3: aggregate profiles sent to coordinator
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16057 ) Change subject: IMPALA-9382: part 2/3: aggregate profiles sent to coordinator .. Patch Set 11: Rebased this. It's ready for review. I decided to push out the final work to a third patch to avoid expanding this one further. -- To view, visit http://gerrit.cloudera.org:8080/16057 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic680cbfe94c939c2a8fad9d0943034ed058c6bca Gerrit-Change-Number: 16057 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 Sep 2020 20:03:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9382: part 2/3: aggregate profiles sent to coordinator
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16057 to look at the new patch set (#12).

Change subject: IMPALA-9382: part 2/3: aggregate profiles sent to coordinator
..

IMPALA-9382: part 2/3: aggregate profiles sent to coordinator

This reworks the status reporting so that serialized AggregatedRuntimeProfile objects are sent from executors to coordinators. These profiles are substantially denser and faster to process for higher mt_dop values.

The aggregation is also done in a single step, merging the aggregated thrift profile from the executor directly into the final aggregated profile, instead of converting it to an unaggregated profile first.

The changes required were:
* A new Update() method for AggregatedRuntimeProfile that updates the profile from a serialized AggregatedRuntimeProfile for a subset of the instances. The code is generalized from the existing InitFromThrift() code path.
* Per-fragment reports included in the status report protobuf when --gen_experimental_profile=true.
* Logic on the coordinator that either consumes a serialized AggregatedRuntimeProfile per fragment, when --gen_experimental_profile=true, or consumes a serialized RuntimeProfile per finstance otherwise.

This also adds support for event sequences and time series in the aggregated profile, so the amount of information in the aggregated profile is now on par with the basic profile.

We also finish off support for the JSON profile. The JSON profile is more stripped down because we do not need to round-trip profiles via JSON and it is a much less dense profile representation.

Part 3 will clean up and improve the display of the profile.

Testing:
* Add sanity tests for aggregated runtime profile.
* Ran core tests.
Change-Id: Ic680cbfe94c939c2a8fad9d0943034ed058c6bca --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/fragment-state.cc M be/src/runtime/fragment-state.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/service/impala-server.cc M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M common/protobuf/control_service.proto M common/thrift/ImpalaInternalService.thrift M common/thrift/RuntimeProfile.thrift A testdata/workloads/functional-query/queries/QueryTest/runtime-profile-aggregated.test A tests/custom_cluster/test_runtime_profile.py 17 files changed, 892 insertions(+), 184 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/16057/12 -- To view, visit http://gerrit.cloudera.org:8080/16057 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic680cbfe94c939c2a8fad9d0943034ed058c6bca Gerrit-Change-Number: 16057 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
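The single-step merge described in the commit message above can be pictured with a small Python sketch. Everything here is an assumption for illustration: the class name `AggregatedProfileSketch`, the `update` signature, and the idea that each executor reports a contiguous range of instance indexes are not taken from the Impala source, whose actual `Update()` method is C++ and works on thrift structures.

```python
class AggregatedProfileSketch:
    """Toy stand-in for an aggregated profile: one value slot per instance."""

    def __init__(self, num_instances):
        self.num_instances = num_instances
        # counter name -> list with one slot per fragment instance
        self.counters = {}

    def update(self, partial_counters, start_idx):
        """Merge a partial aggregate covering instances from start_idx on.

        partial_counters maps a counter name to the values for a contiguous
        subset of instances (an assumption of this sketch). The values are
        copied straight into the final per-instance slots, so no
        per-instance profile object is rebuilt along the way.
        """
        for name, values in partial_counters.items():
            if start_idx + len(values) > self.num_instances:
                raise ValueError("partial aggregate exceeds instance range")
            slots = self.counters.setdefault(name, [None] * self.num_instances)
            slots[start_idx:start_idx + len(values)] = values


final = AggregatedProfileSketch(num_instances=4)
final.update({"RowsRead": [10, 12]}, start_idx=0)  # report from one executor
final.update({"RowsRead": [11, 40]}, start_idx=2)  # report from another
print(final.counters["RowsRead"])
```

The point of the sketch is only the shape of the operation: each status report already carries values for several instances, and the coordinator writes them into the final aggregate in one pass.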
[Impala-ASF-CR] IMPALA-9711: incrementally update aggregate profile
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15931 ) Change subject: IMPALA-9711: incrementally update aggregate profile .. Patch Set 9: Rebased this onto master. -- To view, visit http://gerrit.cloudera.org:8080/15931 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib03e79a40a33d8e74464640ae5f95a1467a6713a Gerrit-Change-Number: 15931 Gerrit-PatchSet: 9 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 Sep 2020 20:00:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9382: part 2/3: aggregate profiles sent to coordinator
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16057 ) Change subject: IMPALA-9382: part 2/3: aggregate profiles sent to coordinator .. Patch Set 11: (2 comments) http://gerrit.cloudera.org:8080/#/c/16057/11/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: http://gerrit.cloudera.org:8080/#/c/16057/11/be/src/util/runtime-profile.cc@2132 PS11, Line 2132: void RuntimeProfileBase::AveragedCounter::ToJsonImpl(Document& document, Value* val) const { line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/16057/11/tests/custom_cluster/test_runtime_profile.py File tests/custom_cluster/test_runtime_profile.py: http://gerrit.cloudera.org:8080/#/c/16057/11/tests/custom_cluster/test_runtime_profile.py@21 PS11, Line 21: class TestRuntimeProfile(CustomClusterTestSuite): flake8: E302 expected 2 blank lines, found 1 -- To view, visit http://gerrit.cloudera.org:8080/16057 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic680cbfe94c939c2a8fad9d0943034ed058c6bca Gerrit-Change-Number: 16057 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 Sep 2020 20:01:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9382: part 2/3: aggregate profiles sent to coordinator
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16057 to look at the new patch set (#11).

Change subject: IMPALA-9382: part 2/3: aggregate profiles sent to coordinator
..

IMPALA-9382: part 2/3: aggregate profiles sent to coordinator

This reworks the status reporting so that serialized AggregatedRuntimeProfile objects are sent from executors to coordinators. These profiles are substantially denser and faster to process for higher mt_dop values.

The aggregation is also done in a single step, merging the aggregated thrift profile from the executor directly into the final aggregated profile, instead of converting it to an unaggregated profile first.

The changes required were:
* A new Update() method for AggregatedRuntimeProfile that updates the profile from a serialized AggregatedRuntimeProfile for a subset of the instances. The code is generalized from the existing InitFromThrift() code path.
* Per-fragment reports included in the status report protobuf when --gen_experimental_profile=true.
* Logic on the coordinator that either consumes a serialized AggregatedRuntimeProfile per fragment, when --gen_experimental_profile=true, or consumes a serialized RuntimeProfile per finstance otherwise.

This also adds support for event sequences and time series in the aggregated profile, so the amount of information in the aggregated profile is now on par with the basic profile.

We also finish off support for the JSON profile. The JSON profile is more stripped down because we do not need to round-trip profiles via JSON and it is a much less dense profile representation.

Part 3 will clean up and improve the display of the profile.

Testing:
* Add sanity tests for aggregated runtime profile.
* Ran core tests.
Change-Id: Ic680cbfe94c939c2a8fad9d0943034ed058c6bca --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/fragment-state.cc M be/src/runtime/fragment-state.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/service/impala-server.cc M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M common/protobuf/control_service.proto M common/thrift/ImpalaInternalService.thrift M common/thrift/RuntimeProfile.thrift A testdata/workloads/functional-query/queries/QueryTest/runtime-profile-aggregated.test A tests/custom_cluster/test_runtime_profile.py 17 files changed, 890 insertions(+), 184 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/16057/11 -- To view, visit http://gerrit.cloudera.org:8080/16057 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic680cbfe94c939c2a8fad9d0943034ed058c6bca Gerrit-Change-Number: 16057 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-10170: Data race on Webserver::UrlHandler::is_on_nav_bar_
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16459 ) Change subject: IMPALA-10170: Data race on Webserver::UrlHandler::is_on_nav_bar_ .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6468/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16459 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6923af4754e3fe72b8b04c5303a1e7a79da7613a Gerrit-Change-Number: 16459 Gerrit-PatchSet: 6 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Wed, 23 Sep 2020 19:03:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10170: Data race on Webserver::UrlHandler::is_on_nav_bar_
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16459 ) Change subject: IMPALA-10170: Data race on Webserver::UrlHandler::is_on_nav_bar_ .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16459 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6923af4754e3fe72b8b04c5303a1e7a79da7613a Gerrit-Change-Number: 16459 Gerrit-PatchSet: 6 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Wed, 23 Sep 2020 19:03:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10170: Data race on Webserver::UrlHandler::is_on_nav_bar_
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16459 ) Change subject: IMPALA-10170: Data race on Webserver::UrlHandler::is_on_nav_bar_ .. Patch Set 6: I'm not really sure what happened in the last run. I'm seeing some weird errors in the impalad logs, but no clear indication of why the processes crashed. Re-running to see if it reproduces. -- To view, visit http://gerrit.cloudera.org:8080/16459 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6923af4754e3fe72b8b04c5303a1e7a79da7613a Gerrit-Change-Number: 16459 Gerrit-PatchSet: 6 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Wed, 23 Sep 2020 19:04:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10078: Proper codegen for KuduPartitionExpr
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16419 ) Change subject: IMPALA-10078: Proper codegen for KuduPartitionExpr .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/16419/7/be/CMakeLists.txt File be/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/16419/7/be/CMakeLists.txt@339 PS7, Line 339: "-isystem${BOOST_INCLUDEDIR}" > This seems to imply that boost includes are read from the system directorie -isystem mostly behaves the same as -I but has implications for search order and warnings - https://gcc.gnu.org/onlinedocs/gcc/Directory-Options.html Although I think the fact that we have -I/usr/include on the search path means that /usr/include has priority over the boost headers. I think that's creeping in from OPENSSL_INCLUDE_DIR. Maybe OPENSSL_INCLUDE_DIR and SASL_INCLUDE_DIR should be -isystem as well. -- To view, visit http://gerrit.cloudera.org:8080/16419 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifcae34f71b407837e2c5f1b97aa230e490a268df Gerrit-Change-Number: 16419 Gerrit-PatchSet: 7 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 Sep 2020 18:37:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [not for merge] Investigating clang to compilation issue
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16497 ) Change subject: [not for merge] Investigating clang to compilation issue .. Patch Set 5: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/7254/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/16497 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iecd13502dd5ff90b63c8271cc62df254bebe06b0 Gerrit-Change-Number: 16497 Gerrit-PatchSet: 5 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 23 Sep 2020 18:14:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10178 Run-time profile shall report skews
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16474 ) Change subject: IMPALA-10178 Run-time profile shall report skews .. Patch Set 16: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7252/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I91041f2856eef8293ea78f1721f97469062589a1 Gerrit-Change-Number: 16474 Gerrit-PatchSet: 16 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 Sep 2020 18:10:08 + Gerrit-HasComments: No
[Impala-ASF-CR] [not for merge] Investigating clang to compilation issue
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16497 ) Change subject: [not for merge] Investigating clang to compilation issue .. Patch Set 4: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/7253/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/16497 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iecd13502dd5ff90b63c8271cc62df254bebe06b0 Gerrit-Change-Number: 16497 Gerrit-PatchSet: 4 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 23 Sep 2020 18:08:47 + Gerrit-HasComments: No
[Impala-ASF-CR] [not for merge] Investigating clang to compilation issue
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16497 to look at the new patch set (#5). Change subject: [not for merge] Investigating clang to compilation issue .. [not for merge] Investigating clang to compilation issue Change-Id: Iecd13502dd5ff90b63c8271cc62df254bebe06b0 --- M be/CMakeLists.txt 1 file changed, 1 insertion(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/97/16497/5 -- To view, visit http://gerrit.cloudera.org:8080/16497 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iecd13502dd5ff90b63c8271cc62df254bebe06b0 Gerrit-Change-Number: 16497 Gerrit-PatchSet: 5 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-10078: Proper codegen for KuduPartitionExpr
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/16419 ) Change subject: IMPALA-10078: Proper codegen for KuduPartitionExpr .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/16419/7/be/CMakeLists.txt File be/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/16419/7/be/CMakeLists.txt@337 PS7, Line 337: KUDU_INLCUDE_DIR I realized that this is a typo: "inlcude" vs "include". Probably it was instantiated with an empty string, which messed up compilation somehow. -- To view, visit http://gerrit.cloudera.org:8080/16419 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifcae34f71b407837e2c5f1b97aa230e490a268df Gerrit-Change-Number: 16419 Gerrit-PatchSet: 7 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 Sep 2020 18:01:07 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [not for merge] Investigating clang to compilation issue
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16497 to look at the new patch set (#4). Change subject: [not for merge] Investigating clang to compilation issue .. [not for merge] Investigating clang to compilation issue Change-Id: Iecd13502dd5ff90b63c8271cc62df254bebe06b0 --- M be/CMakeLists.txt 1 file changed, 1 insertion(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/97/16497/4 -- To view, visit http://gerrit.cloudera.org:8080/16497 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iecd13502dd5ff90b63c8271cc62df254bebe06b0 Gerrit-Change-Number: 16497 Gerrit-PatchSet: 4 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] [not for merge] Testing whether changing impala-ir.cc leads to compilation issues
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16497 ) Change subject: [not for merge] Testing whether changing impala-ir.cc leads to compilation issues .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7251/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16497 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iecd13502dd5ff90b63c8271cc62df254bebe06b0 Gerrit-Change-Number: 16497 Gerrit-PatchSet: 3 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 23 Sep 2020 17:53:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10178 Run-time profile shall report skews
Qifan Chen has uploaded a new patch set (#16). ( http://gerrit.cloudera.org:8080/16474 )

Change subject: IMPALA-10178 Run-time profile shall report skews
..

IMPALA-10178 Run-time profile shall report skews

This fix addresses a current limitation of the runtime profile: skews in certain operator counters, such as the rows read counter (RowsRead) in the scan operators, are not reported. A skew condition exists when the number of rows processed at each operator instance is not about the same, and it can be detected through the standard deviation (stddev). A high stddev (say > 5) usually implies the existence of skew.

With the fix, such skew is detected in an averaged fragment profile for the following counters:
1. RowsRead in HDFS_SCAN_NODE and KUDU_SCAN_NODE profile
2. ProbeRows and BuildRows in HASH_JOIN_NODE profile
3. RowsReturned in GroupingAggregator, EXCHANGE and SORT_NODE profile

and reported as follows:
1. In the skew summary section, which lists the names of the operators with skews;
2. In each corresponding operator, the names of the counters and the corresponding stddev values.

Examples of skews reported for a hash join and an hdfs scan:

  Averaged Fragment F00:(Total: 1s075ms, non-child: 26.919ms, ...
  ...
  ...
  num instances: 3
  skew(s) found at: HASH_JOIN_NODE (id=4), HDFS_SCAN_NODE (id=0)
  HASH_JOIN_NODE (id=4):(Total: 1s204ms, non-child: 2.166ms, ...
    Skew details: ProbeRows ([16904, 17750, 19197], stddev=946.77)
  ... ...
  HDFS_SCAN_NODE (id=0):(Total: 1s032ms, non-child: 1s032ms, ...
    Skew details: RowsRead ([913887, 917913, 1048604], stddev=62578.85)

Testing:
1. Added a new test test_skew_reporting_in_runtime_profile to test_observability.py to verify that the skews are reported.
2. Ran Core tests successfully.
Change-Id: I91041f2856eef8293ea78f1721f97469062589a1 --- M be/src/runtime/coordinator-backend-state.cc M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M be/src/util/stat-util.h M tests/query_test/test_observability.py 6 files changed, 195 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/16474/16 -- To view, visit http://gerrit.cloudera.org:8080/16474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I91041f2856eef8293ea78f1721f97469062589a1 Gerrit-Change-Number: 16474 Gerrit-PatchSet: 16 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-10178 Run-time profile shall report skews
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16474 ) Change subject: IMPALA-10178 Run-time profile shall report skews .. Patch Set 16: (8 comments) http://gerrit.cloudera.org:8080/#/c/16474/16/be/src/util/runtime-profile-counters.h File be/src/util/runtime-profile-counters.h: http://gerrit.cloudera.org:8080/#/c/16474/16/be/src/util/runtime-profile-counters.h@414 PS16, Line 414: /// all valid raw values backing this average counter. line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/16474/16/be/src/util/runtime-profile-counters.h@415 PS16, Line 415: /// line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/16474/16/be/src/util/runtime-profile-counters.h@417 PS16, Line 417: /// all valid raw values and the population stddev in the form of: line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/16474/16/be/src/util/runtime-profile-counters.h@419 PS16, Line 419: /// line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/16474/16/be/src/util/stat-util.h File be/src/util/stat-util.h: http://gerrit.cloudera.org:8080/#/c/16474/16/be/src/util/stat-util.h@45 PS16, Line 45: /// Computes the mean and the standard deviation (population) from an array of line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/16474/16/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/16474/16/tests/query_test/test_observability.py@801 PS16, Line 801: # flake8: E265 block comment should start with '# ' http://gerrit.cloudera.org:8080/#/c/16474/16/tests/query_test/test_observability.py@807 PS16, Line 807: = flake8: E225 missing whitespace around operator http://gerrit.cloudera.org:8080/#/c/16474/16/tests/query_test/test_observability.py@817 PS16, Line 817: flake8: E221 multiple spaces before operator -- To view, visit http://gerrit.cloudera.org:8080/16474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings 
Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I91041f2856eef8293ea78f1721f97469062589a1 Gerrit-Change-Number: 16474 Gerrit-PatchSet: 16 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 Sep 2020 17:53:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10178 Run-time profile shall report skews
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/16474 ) Change subject: IMPALA-10178 Run-time profile shall report skews .. Patch Set 15: (7 comments) http://gerrit.cloudera.org:8080/#/c/16474/15//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16474/15//COMMIT_MSG@22 PS15, Line 22: 3. RowsReturned in GroupingAggregator profile > It would be good to add this for sort operations too. We have SortDataSize Good point. Added the following: {"EXCHANGE_NODE", "RowsReturned"} and {"SORT_NODE", "RowsReturned"}. Since sort does not drop tuples, I guess RowsReturned should be OK. http://gerrit.cloudera.org:8080/#/c/16474/15//COMMIT_MSG@36 PS15, Line 36: skew(s) found at: HASH_JOIN_NODE (id=4), HDFS_SCAN_NODE (id=0) > I thought a bit about whether using the info string was the right approach Yeah. The skew summary follows the current model by adding some extra info strings to the aggregated profile. http://gerrit.cloudera.org:8080/#/c/16474/15/be/src/util/runtime-profile-counters.h File be/src/util/runtime-profile-counters.h: http://gerrit.cloudera.org:8080/#/c/16474/15/be/src/util/runtime-profile-counters.h@415 PS15, Line 415: o. > can you explicitly say that it's returned in 'details'. Done http://gerrit.cloudera.org:8080/#/c/16474/15/be/src/util/runtime-profile-counters.h@416 PS15, Line 416: > nit: convention is to use pointer for output args. 
Done http://gerrit.cloudera.org:8080/#/c/16474/15/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: http://gerrit.cloudera.org:8080/#/c/16474/15/be/src/util/runtime-profile.cc@1969 PS15, Line 1969: int num_valid_values = NumValidValues(); > I don't think this quite works since valid values could be added concurrent Done http://gerrit.cloudera.org:8080/#/c/16474/15/be/src/util/stat-util.h File be/src/util/stat-util.h: http://gerrit.cloudera.org:8080/#/c/16474/15/be/src/util/stat-util.h@28 PS15, Line 28: /// Computes standard deviation given mean > Is this the population standard deviation or the sample standard deviation? Added some comments to clarify that it is the population version that is computed. Also added 'P' to the function name. http://gerrit.cloudera.org:8080/#/c/16474/15/be/src/util/stat-util.h@28 PS15, Line 28: /// Computes standard deviation given mean > I guess this documentation was already missing but would be good to fix Done -- To view, visit http://gerrit.cloudera.org:8080/16474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I91041f2856eef8293ea78f1721f97469062589a1 Gerrit-Change-Number: 16474 Gerrit-PatchSet: 15 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 Sep 2020 17:51:29 + Gerrit-HasComments: Yes
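For readers following the population vs. sample question in the review above, here is a minimal sketch of computing the mean and the population standard deviation from an array of values. It is not the code under review: the function name ComputeMeanAndPopulationStddev and its signature are hypothetical, chosen only to illustrate the two points raised in the comments (divide by N rather than N-1, and use pointers for output arguments).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only; not the actual stat-util.h API.
// Computes the mean and the population standard deviation of 'values'.
// Results are returned through the pointer output arguments 'mean' and 'stddev'.
// Returns false (leaving the outputs untouched) if 'values' is empty.
bool ComputeMeanAndPopulationStddev(
    const std::vector<double>& values, double* mean, double* stddev) {
  assert(mean != nullptr);
  assert(stddev != nullptr);
  if (values.empty()) return false;

  const double n = static_cast<double>(values.size());

  double sum = 0.0;
  for (double v : values) sum += v;
  const double m = sum / n;

  double sum_sq_diff = 0.0;
  for (double v : values) {
    const double d = v - m;
    sum_sq_diff += d * d;
  }

  *mean = m;
  // Population version: divide by N. The sample version would divide by N - 1.
  *stddev = std::sqrt(sum_sq_diff / n);
  return true;
}
```

For the values 2, 4, 4, 4, 5, 5, 7, 9 this yields a mean of 5 and a population standard deviation of 2. How the patch turns such a statistic into a skew report is not stated in these messages, so that step is deliberately left out.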
[Impala-ASF-CR] IMPALA-10078: Proper codegen for KuduPartitionExpr
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/16419 ) Change subject: IMPALA-10078: Proper codegen for KuduPartitionExpr .. Patch Set 7: -Code-Review I have created a review to test what leads to the failures, and it seems that the changes in be/CMakeLists.txt What I see as a sure way to avoid the issue is to wrap the call to Kudu behind one more non-Clang compiled function/class, and remove the new Kudu headers from IR compilation. It would be nicer, of course, to investigate further and find out exactly what the problem is. -- To view, visit http://gerrit.cloudera.org:8080/16419 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifcae34f71b407837e2c5f1b97aa230e490a268df Gerrit-Change-Number: 16419 Gerrit-PatchSet: 7 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 Sep 2020 17:36:00 + Gerrit-HasComments: No
[Impala-ASF-CR] [not for merge] Testing whether changing impala-ir.cc leads to compilation issues
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16497 to look at the new patch set (#3). Change subject: [not for merge] Testing whether changing impala-ir.cc leads to compilation issues .. [not for merge] Testing whether changing impala-ir.cc leads to compilation issues Change-Id: Iecd13502dd5ff90b63c8271cc62df254bebe06b0 --- M be/CMakeLists.txt M be/src/codegen/impala-ir.cc 2 files changed, 3 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/97/16497/3 -- To view, visit http://gerrit.cloudera.org:8080/16497 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iecd13502dd5ff90b63c8271cc62df254bebe06b0 Gerrit-Change-Number: 16497 Gerrit-PatchSet: 3 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] [not for merge] Testing whether changing impala-ir.cc leads to compilation issues
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16497 ) Change subject: [not for merge] Testing whether changing impala-ir.cc leads to compilation issues .. Patch Set 2: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/7250/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/16497 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iecd13502dd5ff90b63c8271cc62df254bebe06b0 Gerrit-Change-Number: 16497 Gerrit-PatchSet: 2 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 23 Sep 2020 17:27:19 + Gerrit-HasComments: No
[Impala-ASF-CR] [not for merge] Testing whether changing impala-ir.cc leads to compilation issues
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16497 to look at the new patch set (#2). Change subject: [not for merge] Testing whether changing impala-ir.cc leads to compilation issues .. [not for merge] Testing whether changing impala-ir.cc leads to compilation issues Change-Id: Iecd13502dd5ff90b63c8271cc62df254bebe06b0 --- M be/CMakeLists.txt M be/src/codegen/impala-ir.cc 2 files changed, 4 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/97/16497/2 -- To view, visit http://gerrit.cloudera.org:8080/16497 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iecd13502dd5ff90b63c8271cc62df254bebe06b0 Gerrit-Change-Number: 16497 Gerrit-PatchSet: 2 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] [not for merge] Testing whether changing impala-ir.cc leads to compilation issues
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16497 ) Change subject: [not for merge] Testing whether changing impala-ir.cc leads to compilation issues .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7249/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16497 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iecd13502dd5ff90b63c8271cc62df254bebe06b0 Gerrit-Change-Number: 16497 Gerrit-PatchSet: 1 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 23 Sep 2020 17:02:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10078: Proper codegen for KuduPartitionExpr
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/16419 ) Change subject: IMPALA-10078: Proper codegen for KuduPartitionExpr .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/16419/7/be/CMakeLists.txt File be/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/16419/7/be/CMakeLists.txt@339 PS7, Line 339: "-isystem${BOOST_INCLUDEDIR}" This seems to imply that boost includes are read from the system directories. My best guess is that the changes in this file are the cause of the build failures. -- To view, visit http://gerrit.cloudera.org:8080/16419 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifcae34f71b407837e2c5f1b97aa230e490a268df Gerrit-Change-Number: 16419 Gerrit-PatchSet: 7 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 Sep 2020 16:52:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [not for merge] Testing whether changing impala-ir.cc leads to compilation issues
Csaba Ringhofer has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16497 ) Change subject: [not for merge] Testing whether changing impala-ir.cc leads to compilation issues .. [not for merge] Testing whether changing impala-ir.cc leads to compilation issues Change-Id: Iecd13502dd5ff90b63c8271cc62df254bebe06b0 --- M be/src/codegen/impala-ir.cc 1 file changed, 2 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/97/16497/1 -- To view, visit http://gerrit.cloudera.org:8080/16497 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Iecd13502dd5ff90b63c8271cc62df254bebe06b0 Gerrit-Change-Number: 16497 Gerrit-PatchSet: 1 Gerrit-Owner: Csaba Ringhofer
[Impala-ASF-CR] IMPALA-10164: Supporting HadoopCatalog for Iceberg table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16446 ) Change subject: IMPALA-10164: Supporting HadoopCatalog for Iceberg table .. Patch Set 10: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6467/ -- To view, visit http://gerrit.cloudera.org:8080/16446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic1893c50a633ca22d4bca6726c9937b026f5d5ef Gerrit-Change-Number: 16446 Gerrit-PatchSet: 10 Gerrit-Owner: wangsheng Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Wed, 23 Sep 2020 12:49:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8304: Generate JUnitXML if a command run by CMake fails
Laszlo Gaal has posted comments on this change. ( http://gerrit.cloudera.org:8080/12668 ) Change subject: IMPALA-8304: Generate JUnitXML if a command run by CMake fails .. Patch Set 9: Code-Review+1 Thanks for the enhancement, Joe. I especially enjoyed the bash redirection artwork :) Carrying and agreeing with +1 from David; letting some other folks take a look. -- To view, visit http://gerrit.cloudera.org:8080/12668 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If71f2faf3ab5052b56b38f1b291fee53c390ce23 Gerrit-Change-Number: 12668 Gerrit-PatchSet: 9 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Wed, 23 Sep 2020 12:12:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10175: Extend error msg when CAST(FORMAT) fails for DATE
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/16473 ) Change subject: IMPALA-10175: Extend error msg when CAST(FORMAT) fails for DATE .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/16473/6/be/src/exprs/cast-functions-ir.cc File be/src/exprs/cast-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/16473/6/be/src/exprs/cast-functions-ir.cc@364 PS6, Line 364: stitute("S > can format_ctx be NULL here? Yes, I found the same thing to be the reason for the verify job failures. Thanks for confirming. -- To view, visit http://gerrit.cloudera.org:8080/16473 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4e379f0f112e83e1511edb170bbe41f903972622 Gerrit-Change-Number: 16473 Gerrit-PatchSet: 8 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 23 Sep 2020 09:37:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10164: Supporting HadoopCatalog for Iceberg table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16446 ) Change subject: IMPALA-10164: Supporting HadoopCatalog for Iceberg table .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7248/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic1893c50a633ca22d4bca6726c9937b026f5d5ef Gerrit-Change-Number: 16446 Gerrit-PatchSet: 11 Gerrit-Owner: wangsheng Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Wed, 23 Sep 2020 08:26:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10164: Supporting HadoopCatalog for Iceberg table
wangsheng has uploaded a new patch set (#11). ( http://gerrit.cloudera.org:8080/16446 ) Change subject: IMPALA-10164: Supporting HadoopCatalog for Iceberg table .. IMPALA-10164: Supporting HadoopCatalog for Iceberg table This patch mainly implements creating Iceberg tables through HadoopCatalog. Before this patch only the HadoopTables API was supported; now HadoopCatalog can also be used to create Iceberg tables. A managed table can be created with SQL like this: CREATE TABLE default.iceberg_test ( level string, event_time timestamp, message string ) STORED AS ICEBERG LOCATION 'hdfs://test-warehouse/hadoop_catalog_test' TBLPROPERTIES ('iceberg.catalog'='hadoop.catalog'); Two values ('hadoop.catalog', 'hadoop.tables') are now supported for 'iceberg.catalog'. If this property is not specified in the SQL, the default catalog type is 'hadoop.catalog'. An external Iceberg table can be created with SQL like this: CREATE EXTERNAL TABLE default.iceberg_test_external STORED AS ICEBERG LOCATION 'hdfs://test-warehouse/hadoop_catalog_test' TBLPROPERTIES ('iceberg.catalog'='hadoop.catalog', 'iceberg.table_name'='default.iceberg_test'); 'iceberg.table_name' is the name of the managed Iceberg table, just like 'kudu.table_name' when creating an external Kudu table. If this property is not specified in the SQL, Impala uses the database and table name to load the Iceberg table, which would be 'default.iceberg_test_external' in the SQL above. This property cannot be set for a managed table. 
Testing: - Create table tests in functional_schema_template.sql - Iceberg table create test in test_iceberg.py - Iceberg table query test in test_scanners.py Change-Id: Ic1893c50a633ca22d4bca6726c9937b026f5d5ef --- M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java A testdata/data/iceberg_test/hadoop_catalog/hadoop_catalog_test/functional_parquet/hadoop_catalog_test/data/event_time_hour=2020-01-01-08/action=view/1-1-bc402da0-b562-4310-9001-06f9b6b0f9ae-0.parquet A testdata/data/iceberg_test/hadoop_catalog/hadoop_catalog_test/functional_parquet/hadoop_catalog_test/data/event_time_hour=2020-01-01-08/action=view/6-6-d253aefa-65fc-4698-8f26-b155fc965cf6-0.parquet A testdata/data/iceberg_test/hadoop_catalog/hadoop_catalog_test/functional_parquet/hadoop_catalog_test/data/event_time_hour=2020-01-01-08/action=view/9-9-5d04b016-05e1-43fc-b4a0-0e0df52a5035-0.parquet A testdata/data/iceberg_test/hadoop_catalog/hadoop_catalog_test/functional_parquet/hadoop_catalog_test/data/event_time_hour=2020-01-01-08/action=view/00017-17-20b92523-c3b9-401d-b429-363c245dbe9c-0.parquet A testdata/data/iceberg_test/hadoop_catalog/hadoop_catalog_test/functional_parquet/hadoop_catalog_test/data/event_time_hour=2020-01-01-08/action=view/00023-23-c86370cf-10a1-4e49-86dc-b094fe739aa6-0.parquet A 
testdata/data/iceberg_test/hadoop_catalog/hadoop_catalog_test/functional_parquet/hadoop_catalog_test/data/event_time_hour=2020-01-01-08/action=view/00027-27-f32f86fa-286f-4cd3-8337-98685c48176d-0.parquet A testdata/data/iceberg_test/hadoop_catalog/hadoop_catalog_test/functional_parquet/hadoop_catalog_test/data/event_time_hour=2020-01-01-08/action=view/00030-30-b18d2bbc-46a2-4040-a4a8-7488447de3b6-0.parquet A testdata/data/iceberg_test/hadoop_catalog/hadoop_catalog_test/functional_parquet/hadoop_catalog_test/data/event_time_hour=2020-01-01-08/action=view/00031-31-c9bda250-ed1c-4868-bbf1-f2aad65fa80c-0.parquet A testdata/data/iceberg_test/hadoop_catalog/hadoop_catalog_test/functional_parquet/hadoop_catalog_test/data/event_time_hour=2020-01-01-09/action=click/4-4-0ed77823-ded1-4a12-9e03-4027cd43966a-0.parquet A testdata/data/iceberg_test/hadoop_catalog/hadoop_catalog_test/functional_parquet/hadoop_catalog_test/data/event_time_hour=2020-01-01-09/action=click/00014-14-f698d7a4-245f-44d5-8a59-ed511854c8f8-0.parquet A testdata/data/iceberg_test/hadoop_catalog/hadoop_catalog_test/functional_parquet/hadoop_catalog_test/data/event_time_hour=2020-01-01-09/action=click/00015-15-7c1d5490-91f7-47bd-a3b6-e86caa7fe47d-0.parquet A testdata/data/iceberg_test/hadoop_catalog/hadoop_catalog_test/functional_parquet/hadoop_catalog_test/data/event_time_hour=2020-01-01-09/action=click/00019-19-d2ef5fcf-4346-421f-b2ef-1f9d55fb4c84-0.parquet A testdata/data/iceberg_test/hadoop_catalog/hadoop_catalog_test/functional_parquet/hadoop_catalog_test/data/event_t
[Impala-ASF-CR] IMPALA-10164: Supporting HadoopCatalog for Iceberg table
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/16446 ) Change subject: IMPALA-10164: Supporting HadoopCatalog for Iceberg table .. Patch Set 10: > The data loading failure was probably due to IMPALA-9923 so I > restarted the verify job. > However, later I've found that there were some failing Iceberg > tests in the dockerised environment, so it's probably reproducible > in local catalog mode: > > bin/start-impala-cluster.py --impalad_args --enable_minidumps=false > --impalad_args --use_local_catalog=true --catalogd_args > --catalog_topic_mode=minimal > > https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/3199/testReport/ ok, I will test local catalog mode in my own environment with these new test cases. -- To view, visit http://gerrit.cloudera.org:8080/16446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic1893c50a633ca22d4bca6726c9937b026f5d5ef Gerrit-Change-Number: 16446 Gerrit-PatchSet: 10 Gerrit-Owner: wangsheng Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Wed, 23 Sep 2020 07:37:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10164: Supporting HadoopCatalog for Iceberg table
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16446 ) Change subject: IMPALA-10164: Supporting HadoopCatalog for Iceberg table .. Patch Set 9: The data loading failure was probably due to IMPALA-9923 so I restarted the verify job. However, later I've found that there were some failing Iceberg tests in the dockerised environment, so it's probably reproducible in local catalog mode: bin/start-impala-cluster.py --impalad_args --enable_minidumps=false --impalad_args --use_local_catalog=true --catalogd_args --catalog_topic_mode=minimal https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/3199/testReport/ -- To view, visit http://gerrit.cloudera.org:8080/16446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic1893c50a633ca22d4bca6726c9937b026f5d5ef Gerrit-Change-Number: 16446 Gerrit-PatchSet: 9 Gerrit-Owner: wangsheng Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Wed, 23 Sep 2020 07:34:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10164: Supporting HadoopCatalog for Iceberg table
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/16446 ) Change subject: IMPALA-10164: Supporting HadoopCatalog for Iceberg table .. Patch Set 10: I see some Iceberg test related failures as well. Can these be related to this change? https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/3199/ -- To view, visit http://gerrit.cloudera.org:8080/16446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic1893c50a633ca22d4bca6726c9937b026f5d5ef Gerrit-Change-Number: 16446 Gerrit-PatchSet: 10 Gerrit-Owner: wangsheng Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Wed, 23 Sep 2020 07:29:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10164: Supporting HadoopCatalog for Iceberg table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16446 ) Change subject: IMPALA-10164: Supporting HadoopCatalog for Iceberg table .. Patch Set 10: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic1893c50a633ca22d4bca6726c9937b026f5d5ef Gerrit-Change-Number: 16446 Gerrit-PatchSet: 10 Gerrit-Owner: wangsheng Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Wed, 23 Sep 2020 07:27:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10164: Supporting HadoopCatalog for Iceberg table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16446 ) Change subject: IMPALA-10164: Supporting HadoopCatalog for Iceberg table .. Patch Set 10: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6467/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic1893c50a633ca22d4bca6726c9937b026f5d5ef Gerrit-Change-Number: 16446 Gerrit-PatchSet: 10 Gerrit-Owner: wangsheng Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Wed, 23 Sep 2020 07:27:03 + Gerrit-HasComments: No