[Impala-ASF-CR] IMPALA-9711: incrementally update aggregate profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15931 ) Change subject: IMPALA-9711: incrementally update aggregate profile .. Patch Set 14: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6505/ -- To view, visit http://gerrit.cloudera.org:8080/15931 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib03e79a40a33d8e74464640ae5f95a1467a6713a Gerrit-Change-Number: 15931 Gerrit-PatchSet: 14 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 03 Oct 2020 05:28:34 + Gerrit-HasComments: No
[Impala-ASF-CR] Pin the json-smart version to 2.3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16536 ) Change subject: Pin the json-smart version to 2.3 .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16536 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff92a61c9c3164e7e0c63c7569178415dcba9fb4 Gerrit-Change-Number: 16536 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 03 Oct 2020 02:05:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10210: Skip Authentication for connection from a trusted domain
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16542 ) Change subject: IMPALA-10210: Skip Authentication for connection from a trusted domain .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7352/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16542 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I09234078e2314dbc3177d0e869ae028e216ca699 Gerrit-Change-Number: 16542 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Sat, 03 Oct 2020 02:02:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10210: Skip Authentication for connection from a trusted domain
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/16542 ) Change subject: IMPALA-10210: Skip Authentication for connection from a trusted domain .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/16542/1/be/src/transport/THttpServer.cpp File be/src/transport/THttpServer.cpp: http://gerrit.cloudera.org:8080/#/c/16542/1/be/src/transport/THttpServer.cpp@195 PS1, Line 195: transport_->getOrigin() for connections originating from the localhost, this returns "localhost" which is not a valid IP and fails the check even if trusted_domain is set to localhost. Do you think we should add a special case to check for domain names or only "localhost" in particular? -- To view, visit http://gerrit.cloudera.org:8080/16542 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I09234078e2314dbc3177d0e869ae028e216ca699 Gerrit-Change-Number: 16542 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Sat, 03 Oct 2020 01:43:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10210: Skip Authentication for connection from a trusted domain
Bikramjeet Vig has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16542 Change subject: IMPALA-10210: Skip Authentication for connection from a trusted domain .. IMPALA-10210: Skip Authentication for connection from a trusted domain Adds the ability to skip authentication for connection requests originating from a trusted domain over the hs2 http endpoint and the http webserver endpoint. The trusted domain can be specified using the newly added "--trusted_domain" startup flag. Additionally, if the startup flag "--trusted_domain_use_xff_header" is set to true, impala will switch to using the 'X-Forwarded-For' HTML header to extract the origin address while attempting to check if the connection originated from a trusted domain. Testing: Added tests for both the hs2 http endpoint and the webserver http endpoint Change-Id: I09234078e2314dbc3177d0e869ae028e216ca699 --- M be/src/rpc/CMakeLists.txt R be/src/rpc/authentication-util.cc R be/src/rpc/authentication-util.h M be/src/rpc/authentication.cc M be/src/transport/THttpServer.cpp M be/src/transport/THttpServer.h M be/src/util/webserver.cc M be/src/util/webserver.h M common/thrift/metrics.json M fe/src/test/java/org/apache/impala/customcluster/LdapHS2Test.java M fe/src/test/java/org/apache/impala/customcluster/LdapWebserverTest.java 11 files changed, 420 insertions(+), 45 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/16542/1 -- To view, visit http://gerrit.cloudera.org:8080/16542 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I09234078e2314dbc3177d0e869ae028e216ca699 Gerrit-Change-Number: 16542 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig
[Impala-ASF-CR] IMPALA-9711: incrementally update aggregate profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15931 ) Change subject: IMPALA-9711: incrementally update aggregate profile .. Patch Set 13: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7351/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15931 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib03e79a40a33d8e74464640ae5f95a1467a6713a Gerrit-Change-Number: 15931 Gerrit-PatchSet: 13 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 03 Oct 2020 00:26:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10205: Replace MD5 with Murmur3 for generating datafile path hash
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16534 ) Change subject: IMPALA-10205: Replace MD5 with Murmur3 for generating datafile path hash .. Patch Set 1: I think we should just figure out how to get rid of the md5 use here. I took a look and I'm really not seeing a benefit to the current approach compared to using the path directly and choosing a better thrift structure. I think if we use the path as the key in the java map, there should be no space overhead - it looks like DataFile.path() will just return a reference to the path String in DataFile - there's no copy or anything. Then for TIcebergTable we don't need to represent it as a map, we can just use a list and construct the java map in loadFileDescFromThrift -- To view, visit http://gerrit.cloudera.org:8080/16534 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If7c805f2fdf0cf5a69738579c7e55f4bd047ed59 Gerrit-Change-Number: 16534 Gerrit-PatchSet: 1 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Sat, 03 Oct 2020 00:27:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9711: incrementally update aggregate profile
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15931 ) Change subject: IMPALA-9711: incrementally update aggregate profile .. Patch Set 14: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15931 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib03e79a40a33d8e74464640ae5f95a1467a6713a Gerrit-Change-Number: 15931 Gerrit-PatchSet: 14 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 03 Oct 2020 00:06:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9711: incrementally update aggregate profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15931 ) Change subject: IMPALA-9711: incrementally update aggregate profile .. Patch Set 14: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6505/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/15931 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib03e79a40a33d8e74464640ae5f95a1467a6713a Gerrit-Change-Number: 15931 Gerrit-PatchSet: 14 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 03 Oct 2020 00:07:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9711: incrementally update aggregate profile
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15931 ) Change subject: IMPALA-9711: incrementally update aggregate profile .. Patch Set 13: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/15931/11/be/src/runtime/coordinator-backend-state.h File be/src/runtime/coordinator-backend-state.h: http://gerrit.cloudera.org:8080/#/c/15931/11/be/src/runtime/coordinator-backend-state.h@289 PS11, Line 289: /// True if 'agg_profile_' for the fragment is up-to-date with 'profile_'. : /// Set to false in Update() when a new profile is received for this instance : /// and back to true in Upd > Nit: This comment is a bit out of date with the new approach. Done -- To view, visit http://gerrit.cloudera.org:8080/15931 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib03e79a40a33d8e74464640ae5f95a1467a6713a Gerrit-Change-Number: 15931 Gerrit-PatchSet: 13 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 03 Oct 2020 00:06:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9711: incrementally update aggregate profile
Hello Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15931 to look at the new patch set (#13). Change subject: IMPALA-9711: incrementally update aggregate profile .. IMPALA-9711: incrementally update aggregate profile In order to not cause additional work in the default mode, we still only compute the average once per instance, when it completes or when the query finishes. When --gen_experimental_profile=true, we update the aggregated profile for each status report, so that the live profile can be viewed as the query executes. The implications of this are as follows: * More work is done on the KRPC control service RPC thread (although this is largely moot after part 2 of IMPALA-9382 where we merge into the aggregated profile directly, so avoid the extra update). * For complex multi-stage queries, the profile merging work is done earlier as each stage completes, therefore the critical path of the query is shortened * Multiple RPC threads may be merging profiles concurrently * Multiple threads may be calling AggregatedRuntimeProfile::Update() on the same profile, whereas previously all merging was done by a single thread. I looked through the locking in that function to check correctness. Testing: Ran core tests. Ran a subset of the Python tests under TSAN, confirmed no races were introduced in this code. Change-Id: Ib03e79a40a33d8e74464640ae5f95a1467a6713a --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/util/runtime-profile.cc 4 files changed, 85 insertions(+), 27 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/15931/13 -- To view, visit http://gerrit.cloudera.org:8080/15931 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib03e79a40a33d8e74464640ae5f95a1467a6713a Gerrit-Change-Number: 15931 Gerrit-PatchSet: 13 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9930 (part 2): Introduce new admission control rpc service
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16412 ) Change subject: IMPALA-9930 (part 2): Introduce new admission control rpc service .. Patch Set 6: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/7350/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/16412 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I594fc593a27b24b6952e381a9bc1a9a5c6b757ae Gerrit-Change-Number: 16412 Gerrit-PatchSet: 6 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 02 Oct 2020 23:17:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9930 (part 2): Introduce new admission control rpc service
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16412 ) Change subject: IMPALA-9930 (part 2): Introduce new admission control rpc service .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/16412/6/be/src/scheduling/admission-control-service.cc File be/src/scheduling/admission-control-service.cc: http://gerrit.cloudera.org:8080/#/c/16412/6/be/src/scheduling/admission-control-service.cc@236 PS6, Line 236: admission_state->query_exec_request.query_ctx.client_request.query_options, admission_state->summary_profile, line too long (117 > 90) -- To view, visit http://gerrit.cloudera.org:8080/16412 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I594fc593a27b24b6952e381a9bc1a9a5c6b757ae Gerrit-Change-Number: 16412 Gerrit-PatchSet: 6 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 02 Oct 2020 22:56:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9930 (part 2): Introduce new admission control rpc service
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/16412 ) Change subject: IMPALA-9930 (part 2): Introduce new admission control rpc service .. Patch Set 6: (20 comments) http://gerrit.cloudera.org:8080/#/c/16412/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16412/5//COMMIT_MSG@12 PS5, Line 12: This patch adds some simple configuration flags that make it possible > not sure I understand the plan here. this patch adds flags is --is_admissio My intention with IMPALA-9975 is that there will be a whole new binary, called 'admissiond', which will be what actually exposes the AdmissionControlService, rather than having an impalad do it. At that point, we won't need the --is_admission_controller flag because there will be two possible setups: traditional distributed admission control mode (which impalads detect by checking if admission_control_service_addr is empty), and new admission control service mode (where the admissiond will know to always export the AdmissionControlService). I think this will be cleaner than having the admission control service run as an impalad with --is_admission_controller=true because: - There are a bunch of things that impalads need that the admissiond won't (DataStreamService, ControlService, the catalog client, the hs2 and beeswax interfaces, etc.) and explicitly separating out an admissiond makes it more natural to exclude these things from it - I think it makes integration with external deployment systems more natural, eg. in a kubernetes setup you might have an istio policy for exposing various ports for impalads and you'll need a separate policy for the admissiond anyways since the ports that it needs exposed are different. It seems nicer to just have admissiond be its own separate thing, rather than always special-casing the handling of impalads depending on whether they expose AdmissionControlService. I'll be utilizing the same functionality we already use for the statestored and catalogd where its really all the same binary behind the scenes to limit linking time (see daemon-main.cc), so in a way the admissiond will just be an impalad with --is_admission_controller=true always set. I can certainly see some downsides to this approach (eg. changes to legacy deployment systems will be larger, the need for generating another docker container image, etc.), so I'm open to the possibility this is not the right approach, depending on what others think. I already have the admissiond stuff basically written and tested if you want me to submit it as a WIP (though its rough at the moment) http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-control-service.h File be/src/scheduling/admission-control-service.h: http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-control-service.h@76 PS5, Line 76: AdmitQueryRequestP > nit: typo Done http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-control-service.h@105 PS5, Line 105: RuntimeProfile* summary_profile; > might be worth mentioning that this is passed to AdmissionController::Submi Done http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-control-service.h@118 PS5, Line 118: to > nit: typo Done http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-control-service.cc File be/src/scheduling/admission-control-service.cc: http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-control-service.cc@85 PS5, Line 85: Admission > nit: typo Done http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-control-service.cc@142 PS5, Line 142: shared_ptr admission_state; > might be too verbose to log. I moved it to a higher VLOG level. http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-control-service.cc@155 PS5, Line 155: > why wait at all? won't waiting tie up one of the RPC threads? I think the c Done http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-control-service.cc@159 PS5, Line 159: } else { > is this expected? do want to DCHECK if the admit_status is not Status::OK() Done http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-control-service.cc@197 PS5, Line 197: const ReleaseQueryBackendsRequestPB* req, ReleaseQueryBackendsResponsePB* resp, > why does the lock need to be acquired here? Done http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-control-service.cc@209 PS5, Line 209: req->query_id(), host_addrs); > might be too verbose to log Moved to a higher vlog level http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-control-service.cc@219 PS5, Line 219: admission_state->admit_outcome.Set(AdmissionOutcome::CANCELLED); > why does the lock need to be acquired here? Done http://gerrit.cloudera.org:8080/#/c/16412/5
[Impala-ASF-CR] IMPALA-9930 (part 2): Introduce new admission control rpc service
Hello Sahil Takiar, Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16412 to look at the new patch set (#6). Change subject: IMPALA-9930 (part 2): Introduce new admission control rpc service .. IMPALA-9930 (part 2): Introduce new admission control rpc service This patch introduces a new krpc service, AdmissionControlService, which coordinators can use to submit queries for admission. This patch adds some simple configuration flags that make it possible to have coordinators use this service to submit their queries for admission to other coordinators. These flags are only to make this patch testable will be replaced when the separate admission control daemon is introduced in IMPALA-9975. The interface consists of the following RPCs: - AdmitQuery: takes a TQueryExecRequest and a TQueryOptions (serialized into sidecars), places the request on a queue to be processed by a thread pool and then immediately returns. - GetQueryStatus: takes a query id and returns the current admission status, including the QuerySchedulePB if admission has completed successfully but the query has not been released yet. - ReleaseQueryBackends: called when individual backends complete but the overall query is still running to release resources incrementally. This RPC will be called at most O(log(# backends)) per query due to BackendResourceState, which batches backends to release together. - ReleaseQuery: called when the query has completely finished. Releases all remaining resources. - CancelAdmission: called if a query is cancelled before an admission decision has been made to indicate that it should no longer be considered for admission. The majority of the patch consists of two classes: - AdmissionControlClient: used to abstract whether admission is being performed locally or remotely. In the local case, it is basically just a wrapper around AdmissionController. In the remote case, it handles serializing/deserializing of RPC params, polling GetQueryStatus() until a decision has been made, etc. - AdmissionControlService: exports the RPC interface and acts as a wrapper around AdmissionController. Some notable changes involved: - AdmissionController::SubmitForAdmission() no longer blocks while a query is queued. Instead, a new function CheckQueued() can be used to monitor the admission status of a queued query. - Adding events to the query timeline is moved out of AdmissionController and into the AdmissionControlClient classes, so that it always happens on the coordinator. - When a cluster is run in the new admission control service mode, only the impalad that is performing admission control exposes the /admission http endpoint. Observability will be cleaned up in a subsequent patch. Testing: - Modified existing admission control tests to run both with and without the admission control service enabled, including both the functional and stress tests. The 'num_queries' param in the stress test is modified to only use a single value to reduce the number of tests that are run and keep the running time reasonable. - Ran tpch10 on a local minicluster and observed no significant regressions. Change-Id: I594fc593a27b24b6952e381a9bc1a9a5c6b757ae --- M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/scheduling/CMakeLists.txt M be/src/scheduling/admission-control-client.cc M be/src/scheduling/admission-control-client.h A be/src/scheduling/admission-control-service.cc A be/src/scheduling/admission-control-service.h M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/scheduling/local-admission-control-client.cc M be/src/scheduling/local-admission-control-client.h A be/src/scheduling/remote-admission-control-client.cc A be/src/scheduling/remote-admission-control-client.h M be/src/scheduling/schedule-state.cc M be/src/scheduling/schedule-state.h M be/src/service/client-request-state.cc M be/src/service/impala-http-handler.cc M be/src/util/sharded-query-map-util.cc M common/protobuf/admission_control_service.proto M tests/common/resource_pool_config.py M tests/custom_cluster/test_admission_controller.py M tests/hs2/hs2_test_suite.py M tests/util/web_pages_util.py 24 files changed, 1,210 insertions(+), 189 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/16412/6 -- To view, visit http://gerrit.cloudera.org:8080/16412 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I594fc593a27b24b6952e381a9bc1a9a5c6b757ae Gerrit-Change-Number: 16412 Gerrit-PatchSet: 6 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takia
[Impala-ASF-CR] Pin the json-smart version to 2.3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16536 ) Change subject: Pin the json-smart version to 2.3 .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6504/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/16536 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff92a61c9c3164e7e0c63c7569178415dcba9fb4 Gerrit-Change-Number: 16536 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 02 Oct 2020 20:50:33 + Gerrit-HasComments: No
[Impala-ASF-CR] Pin the json-smart version to 2.3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16536 ) Change subject: Pin the json-smart version to 2.3 .. Patch Set 3: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6503/ -- To view, visit http://gerrit.cloudera.org:8080/16536 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff92a61c9c3164e7e0c63c7569178415dcba9fb4 Gerrit-Change-Number: 16536 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 02 Oct 2020 20:49:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10205: Replace MD5 with Murmur3 for generating datafile path hash
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16534 ) Change subject: IMPALA-10205: Replace MD5 with Murmur3 for generating datafile path hash .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7349/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16534 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If7c805f2fdf0cf5a69738579c7e55f4bd047ed59 Gerrit-Change-Number: 16534 Gerrit-PatchSet: 3 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 02 Oct 2020 20:20:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10205: Replace MD5 with Murmur3 for generating datafile path hash
Wenzhe Zhou has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/16534 ) Change subject: IMPALA-10205: Replace MD5 with Murmur3 for generating datafile path hash .. IMPALA-10205: Replace MD5 with Murmur3 for generating datafile path hash Current code generates data path hash in MD5 for Iceberg Table. But MD5 is one of forbidden algorithms for FIPS. Even for non-security purposes, like hash map, we still cannot use MD5 algorithm. This patch replaces MD5 with non-cryptographic hash function murmur3_128, which generates hash value with same length as MD5. Testing: - Passed core tests. Change-Id: If7c805f2fdf0cf5a69738579c7e55f4bd047ed59 --- M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java 6 files changed, 22 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/16534/3 -- To view, visit http://gerrit.cloudera.org:8080/16534 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If7c805f2fdf0cf5a69738579c7e55f4bd047ed59 Gerrit-Change-Number: 16534 Gerrit-PatchSet: 3 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10188: Remove unused WebDAV functions
Abhishek Rawat has posted comments on this change. ( http://gerrit.cloudera.org:8080/16538 ) Change subject: IMPALA-10188: Remove unused WebDAV functions .. Patch Set 1: Sure. I will open a PR for the upstream squeasel and once its merged I will update this review. -- To view, visit http://gerrit.cloudera.org:8080/16538 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9b16116e86b6d39112eace0f3a783aa7f2319a53 Gerrit-Change-Number: 16538 Gerrit-PatchSet: 1 Gerrit-Owner: Abhishek Rawat Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Fri, 02 Oct 2020 19:42:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10188: Remove unused WebDAV functions
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/16538 ) Change subject: IMPALA-10188: Remove unused WebDAV functions .. Patch Set 1: Did you have plans to upstream this change into squeasel: https://github.com/cloudera/squeasel It would be nice not to diverge with upstream. Also AFAIK Kudu also uses squeasel and would probably also want this change (I really can't imagine if they use webdav, but might be worth asking). I'm not aware of any other projects that use it -- To view, visit http://gerrit.cloudera.org:8080/16538 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9b16116e86b6d39112eace0f3a783aa7f2319a53 Gerrit-Change-Number: 16538 Gerrit-PatchSet: 1 Gerrit-Owner: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Fri, 02 Oct 2020 19:35:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10016: Split jars for Impala exec and coord Docker images
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16320 ) Change subject: IMPALA-10016: Split jars for Impala exec and coord Docker images .. Patch Set 7: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6502/ -- To view, visit http://gerrit.cloudera.org:8080/16320 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I899859a38d8ccab890de889a49ef132a89289dfd Gerrit-Change-Number: 16320 Gerrit-PatchSet: 7 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 02 Oct 2020 19:34:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10188: Remove unused WebDAV functions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16538 ) Change subject: IMPALA-10188: Remove unused WebDAV functions .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7348/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16538 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9b16116e86b6d39112eace0f3a783aa7f2319a53 Gerrit-Change-Number: 16538 Gerrit-PatchSet: 1 Gerrit-Owner: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Fri, 02 Oct 2020 19:32:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10188: Remove unused WebDAV functions
Abhishek Rawat has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16538 Change subject: IMPALA-10188: Remove unused WebDAV functions .. IMPALA-10188: Remove unused WebDAV functions Removed PROPFIND and MKCOL methods from squeasel. Testing: - Ran webserver-test - Validated that the impalad debug UI came up fine and made sure that all the existing links work as expected. Change-Id: I9b16116e86b6d39112eace0f3a783aa7f2319a53 --- M be/src/thirdparty/squeasel/squeasel.c M be/src/util/webserver-test.cc 2 files changed, 5 insertions(+), 110 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/16538/1 -- To view, visit http://gerrit.cloudera.org:8080/16538 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I9b16116e86b6d39112eace0f3a783aa7f2319a53 Gerrit-Change-Number: 16538 Gerrit-PatchSet: 1 Gerrit-Owner: Abhishek Rawat Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-10205: Replace MD5 hash with Murmur3 for generating datafile path hash
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16534 ) Change subject: IMPALA-10205: Replace MD5 hash with Murmur3 for generating datafile path hash .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7347/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16534 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If7c805f2fdf0cf5a69738579c7e55f4bd047ed59 Gerrit-Change-Number: 16534 Gerrit-PatchSet: 2 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 02 Oct 2020 18:44:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10178 Run-time profile shall report skews
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16474 ) Change subject: IMPALA-10178 Run-time profile shall report skews .. Patch Set 35: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7346/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I91041f2856eef8293ea78f1721f97469062589a1 Gerrit-Change-Number: 16474 Gerrit-PatchSet: 35 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 02 Oct 2020 18:36:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10205: Replace MD5 hash with Murmur3 for generating datafile path hash
Wenzhe Zhou has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/16534 ) Change subject: IMPALA-10205: Replace MD5 hash with Murmur3 for generating datafile path hash .. IMPALA-10205: Replace MD5 hash with Murmur3 for generating datafile path hash Current code generates data path hash in MD5 for Iceberg Table. But MD5 is one of forbidden algorithms for FIPS. Even for non-security purposes, like hash map, we still cannot use MD5 algorithm. Thia patch replaces MD5 with non-cryptographic hash function Murmur3. Testing: - Passed core tests. Change-Id: If7c805f2fdf0cf5a69738579c7e55f4bd047ed59 --- M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java 6 files changed, 22 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/16534/2 -- To view, visit http://gerrit.cloudera.org:8080/16534 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If7c805f2fdf0cf5a69738579c7e55f4bd047ed59 Gerrit-Change-Number: 16534 Gerrit-PatchSet: 2 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10178 Run-time profile shall report skews
Qifan Chen has uploaded a new patch set (#35). ( http://gerrit.cloudera.org:8080/16474 ) Change subject: IMPALA-10178 Run-time profile shall report skews .. IMPALA-10178 Run-time profile shall report skews This fix addresses the current limitation in runtime profile that skews existing in certain operators such as the rows read counter (RowsRead) in the scan operators are not reported. A skew condition exists when the number of rows processed at each operator instance is not about the same and can be detected through coefficient of variation (CoV). A high CoV (say > 1.0) usually implies the existence of skew. With the fix, such skew is detected for the following counters 1. RowsRead in HDFS_SCAN_NODE and KUDU_SCAN_NODE 2. ProbeRows and BuildRows in HASH_JOIN_NODE 3. RowsReturned in GroupingAggregator, EXCHANGE and SORT_NODE and reported as follows: 1. In execution profile, a new skew summary that lists the names of the operators with skews; 2. In the averaged profile for the corresponding operator, the list of values of the counter across all fragment instances in the backend processes; 3. Skew detection formula: CoV > limit and mean > 5,000 4. A new query option 'report_skew_limit' < 0: disable skew reporting >= 0: enable skew reporting and supply the CoV limit Examples of skews reported for a hash join and an hdfs scan. In execution profile: ... ... skew(s) found at: HASH_JOIN_NODE (id=4), HDFS_SCAN_NODE (id=0) Per Node Peak Memory Usage: ... ... ... In averaged profiles: HDFS_SCAN_NODE (id=2): ... Skew details: RowsRead ([2004992,1724693,2001351], stddev/mean=0.07, mean=1910345) Testing: 1. Added test_skew_reporting_in_runtime_profile in test_observability.py to verify that the skews are reported. 2. Ran Core tests successfully. Change-Id: I91041f2856eef8293ea78f1721f97469062589a1 --- M be/src/runtime/coordinator.cc M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M be/src/util/stat-util.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M tests/query_test/test_hash_join_timer.py M tests/query_test/test_observability.py 11 files changed, 243 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/16474/35 -- To view, visit http://gerrit.cloudera.org:8080/16474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I91041f2856eef8293ea78f1721f97469062589a1 Gerrit-Change-Number: 16474 Gerrit-PatchSet: 35 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9711: incrementally update aggregate profile
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/15931 ) Change subject: IMPALA-9711: incrementally update aggregate profile .. Patch Set 12: Code-Review+2 (1 comment) This makes sense to me. I had one small nit about a comment, but I'm going ahead and +2'ing it. http://gerrit.cloudera.org:8080/#/c/15931/11/be/src/runtime/coordinator-backend-state.h File be/src/runtime/coordinator-backend-state.h: http://gerrit.cloudera.org:8080/#/c/15931/11/be/src/runtime/coordinator-backend-state.h@289 PS11, Line 289: /// If UpdateExecStats() has been called since the last time 'profile_' was : /// updated with a status report. I.e. Update() was called without a corresponding : /// UpdateExecStats() call. Nit: This comment is a bit out of date with the new approach. -- To view, visit http://gerrit.cloudera.org:8080/15931 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib03e79a40a33d8e74464640ae5f95a1467a6713a Gerrit-Change-Number: 15931 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 02 Oct 2020 18:03:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10205: Replace MD5 hash with SHA-512 for generating datafile path hash
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/16534 ) Change subject: IMPALA-10205: Replace MD5 hash with SHA-512 for generating datafile path hash .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/16534/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16534/1//COMMIT_MSG@11 PS1, Line 11: MD5 with FIPS-approved algorithm SHA-512. > The FIPS Support document, which I forwarded to you in email, says "Using o Since CryptoComply libraries have no way to distinguish between hash function being used for non-security purposes versus security purposes, maybe we should use non-cryptographic hash function, like fast-hash, murmur-hash. -- To view, visit http://gerrit.cloudera.org:8080/16534 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If7c805f2fdf0cf5a69738579c7e55f4bd047ed59 Gerrit-Change-Number: 16534 Gerrit-PatchSet: 1 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 02 Oct 2020 17:13:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10205: Replace MD5 hash with SHA-512 for generating datafile path hash
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/16534 ) Change subject: IMPALA-10205: Replace MD5 hash with SHA-512 for generating datafile path hash .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/16534/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16534/1//COMMIT_MSG@11 PS1, Line 11: MD5 with FIPS-approved algorithm SHA-512. > The current code that uses MD5 doesn't hash for cryptographic reasons, but The FIPS Support document, which I forwarded to you in email, says "Using only FIPS-approved algorithms. The most common forbidden algorithms are MD5 and RC4. Note that the CryptoComply libraries have no way of distinguishing between MD5 being used for non-security purposes versus security purposes, so avoid all use of forbidden algorithms." So we cannot use MD5 for non-security purposes, like hash map. The document list approved hash algorithms. It suggests to avoid SHA-1, SHA-224, SHA-256 since these algorithms are soon to be deprecated. -- To view, visit http://gerrit.cloudera.org:8080/16534 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If7c805f2fdf0cf5a69738579c7e55f4bd047ed59 Gerrit-Change-Number: 16534 Gerrit-PatchSet: 1 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 02 Oct 2020 16:40:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Pin the json-smart version to 2.3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16536 ) Change subject: Pin the json-smart version to 2.3 .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7345/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16536 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff92a61c9c3164e7e0c63c7569178415dcba9fb4 Gerrit-Change-Number: 16536 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 02 Oct 2020 15:49:03 + Gerrit-HasComments: No
[Impala-ASF-CR] Pin the json-smart version to 2.3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16536 ) Change subject: Pin the json-smart version to 2.3 .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6503/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/16536 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff92a61c9c3164e7e0c63c7569178415dcba9fb4 Gerrit-Change-Number: 16536 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 02 Oct 2020 15:24:10 + Gerrit-HasComments: No
[Impala-ASF-CR] Pin the json-smart version to 2.3
Hello Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16536 to look at the new patch set (#2). Change subject: Pin the json-smart version to 2.3 .. Pin the json-smart version to 2.3 With some maven repositories, Impala builds have been picking up json-smart with version 2.3-SNAPSHOT. This is not intentional (and it doesn't reproduce with public repositories). To improve the consistency of the build, pin the json-smart version to 2.3 with appropriate exclusions to prevent alternate versions. This also fixes up bin/jenkins/get_maven_statistics.sh to handle cases where maven didn't download anything. Testing: - Ran core job Change-Id: Iff92a61c9c3164e7e0c63c7569178415dcba9fb4 --- M bin/jenkins/get_maven_statistics.sh M fe/pom.xml M shaded-deps/hive-exec/pom.xml 3 files changed, 87 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/16536/2 -- To view, visit http://gerrit.cloudera.org:8080/16536 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iff92a61c9c3164e7e0c63c7569178415dcba9fb4 Gerrit-Change-Number: 16536 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-10184: Add PARTITON BY SPEC to SHOW CREATE TABLE for Iceberg Tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16512 ) Change subject: IMPALA-10184: Add PARTITON BY SPEC to SHOW CREATE TABLE for Iceberg Tables .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16512 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie4c43b75057807ab513a220d348155be2487e714 Gerrit-Change-Number: 16512 Gerrit-PatchSet: 7 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 02 Oct 2020 15:08:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10016: Split jars for Impala exec and coord Docker images
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16320 ) Change subject: IMPALA-10016: Split jars for Impala exec and coord Docker images .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16320 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I899859a38d8ccab890de889a49ef132a89289dfd Gerrit-Change-Number: 16320 Gerrit-PatchSet: 7 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 02 Oct 2020 14:09:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10016: Split jars for Impala exec and coord Docker images
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16320 ) Change subject: IMPALA-10016: Split jars for Impala exec and coord Docker images .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6502/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16320 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I899859a38d8ccab890de889a49ef132a89289dfd Gerrit-Change-Number: 16320 Gerrit-PatchSet: 7 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 02 Oct 2020 14:09:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10184: Add PARTITON BY SPEC to SHOW CREATE TABLE for Iceberg Tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16512 ) Change subject: IMPALA-10184: Add PARTITON BY SPEC to SHOW CREATE TABLE for Iceberg Tables .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7344/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16512 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie4c43b75057807ab513a220d348155be2487e714 Gerrit-Change-Number: 16512 Gerrit-PatchSet: 7 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 02 Oct 2020 11:40:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10184: Add PARTITON BY SPEC to SHOW CREATE TABLE for Iceberg Tables
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/16512 ) Change subject: IMPALA-10184: Add PARTITON BY SPEC to SHOW CREATE TABLE for Iceberg Tables .. Patch Set 7: (1 comment) I had to rebase with master + due to the HadoopTable changes I had to make some adjustments in the tests. http://gerrit.cloudera.org:8080/#/c/16512/5/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java: http://gerrit.cloudera.org:8080/#/c/16512/5/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@79 PS5, Line 79: erties > nit: do we want to change the wording to use "default", or maybe "current"? I didn't find "default" self-descriptive enough. let's say "current" here. -- To view, visit http://gerrit.cloudera.org:8080/16512 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie4c43b75057807ab513a220d348155be2487e714 Gerrit-Change-Number: 16512 Gerrit-PatchSet: 7 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 02 Oct 2020 11:21:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10184: Add PARTITON BY SPEC to SHOW CREATE TABLE for Iceberg Tables
Hello Zoltan Borok-Nagy, wangsheng, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16512 to look at the new patch set (#7). Change subject: IMPALA-10184: Add PARTITON BY SPEC to SHOW CREATE TABLE for Iceberg Tables .. IMPALA-10184: Add PARTITON BY SPEC to SHOW CREATE TABLE for Iceberg Tables A SHOW CREATE TABLE output didn't contain the PARTITION BY SPEC section for partitioned Iceberg tables. This patch addresses this shortcoming. Change-Id: Ie4c43b75057807ab513a220d348155be2487e714 --- M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java M fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test 11 files changed, 173 insertions(+), 31 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/16512/7 -- To view, visit http://gerrit.cloudera.org:8080/16512 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie4c43b75057807ab513a220d348155be2487e714 Gerrit-Change-Number: 16512 Gerrit-PatchSet: 7 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] Pin the json-smart version to 2.3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16536 ) Change subject: Pin the json-smart version to 2.3 .. Patch Set 1: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6501/ -- To view, visit http://gerrit.cloudera.org:8080/16536 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff92a61c9c3164e7e0c63c7569178415dcba9fb4 Gerrit-Change-Number: 16536 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 02 Oct 2020 10:10:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10184: Add PARTITON BY SPEC to SHOW CREATE TABLE for Iceberg Tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16512 ) Change subject: IMPALA-10184: Add PARTITON BY SPEC to SHOW CREATE TABLE for Iceberg Tables .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7343/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16512 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie4c43b75057807ab513a220d348155be2487e714 Gerrit-Change-Number: 16512 Gerrit-PatchSet: 6 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 02 Oct 2020 09:49:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10184: Add PARTITON BY SPEC to SHOW CREATE TABLE for Iceberg Tables
Hello Zoltan Borok-Nagy, wangsheng, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16512 to look at the new patch set (#6). Change subject: IMPALA-10184: Add PARTITON BY SPEC to SHOW CREATE TABLE for Iceberg Tables .. IMPALA-10184: Add PARTITON BY SPEC to SHOW CREATE TABLE for Iceberg Tables A SHOW CREATE TABLE output didn't contain the PARTITION BY SPEC section for partitioned Iceberg tables. This patch addresses this shortcoming. Change-Id: Ie4c43b75057807ab513a220d348155be2487e714 --- M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java M fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test 11 files changed, 173 insertions(+), 32 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/16512/6 -- To view, visit http://gerrit.cloudera.org:8080/16512 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie4c43b75057807ab513a220d348155be2487e714 Gerrit-Change-Number: 16512 Gerrit-PatchSet: 6 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10172: Support Hive metastore managed locations for databases
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/16529 ) Change subject: IMPALA-10172: Support Hive metastore managed locations for databases .. Patch Set 4: (1 comment) When a managed table is created I think the table should inherit the managed table location, somewhere here: https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java#L127 What do you think? http://gerrit.cloudera.org:8080/#/c/16529/4/testdata/workloads/functional-query/queries/QueryTest/describe-db.test File testdata/workloads/functional-query/queries/QueryTest/describe-db.test: http://gerrit.cloudera.org:8080/#/c/16529/4/testdata/workloads/functional-query/queries/QueryTest/describe-db.test@47 PS4, Line 47: nit: double space -- To view, visit http://gerrit.cloudera.org:8080/16529 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I925632a43ff224f762031e89981896722e453399 Gerrit-Change-Number: 16529 Gerrit-PatchSet: 4 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Comment-Date: Fri, 02 Oct 2020 09:15:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10184: Add PARTITON BY SPEC to SHOW CREATE TABLE for Iceberg Tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16512 ) Change subject: IMPALA-10184: Add PARTITON BY SPEC to SHOW CREATE TABLE for Iceberg Tables .. Patch Set 5: Code-Review+2 (1 comment) LGTM! http://gerrit.cloudera.org:8080/#/c/16512/5/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java: http://gerrit.cloudera.org:8080/#/c/16512/5/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@79 PS5, Line 79: latest nit: do we want to change the wording to use "default", or maybe "current"? I find the word "latest" a bit confusing. -- To view, visit http://gerrit.cloudera.org:8080/16512 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie4c43b75057807ab513a220d348155be2487e714 Gerrit-Change-Number: 16512 Gerrit-PatchSet: 5 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 02 Oct 2020 08:47:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10205: Replace MD5 hash with SHA-512 for generating datafile path hash
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16534 ) Change subject: IMPALA-10205: Replace MD5 hash with SHA-512 for generating datafile path hash .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/16534/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16534/1//COMMIT_MSG@11 PS1, Line 11: MD5 with FIPS-approved algorithm SHA-512. The current code that uses MD5 doesn't hash for cryptographic reasons, but only to get a shorter string from the file path that can be used in lookup tables. Are we restricted to FIPS-approved hash algorithms in such cases as well? (I'd think it's just an internal implementation detail. E.g. hash tables also use all kinds of hash algorithms internally that are not approved by FIPS). If so, I see that only SHA algorithms are approved by FIPS. I think SHA-512 is an overkill for our use case. How about SHA-1 or SHA-256? Or maybe we could just use the file path without any hashing. -- To view, visit http://gerrit.cloudera.org:8080/16534 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If7c805f2fdf0cf5a69738579c7e55f4bd047ed59 Gerrit-Change-Number: 16534 Gerrit-PatchSet: 1 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 02 Oct 2020 08:41:17 + Gerrit-HasComments: Yes