[Impala-ASF-CR] IMPALA-8503: add option to start Kudu cluster with HMS integration
Hao Hao has posted comments on this change. ( http://gerrit.cloudera.org:8080/13248 ) Change subject: IMPALA-8503: add option to start Kudu cluster with HMS integration .. Patch Set 3: (4 comments) > (6 comments) > > I don't have enough context for high-level review of this > changelist, but it seems the comment that Thomas added makes sense. > > Also, what is the way to configure the way how services started via > these scripts? I know that in case of UNIX init.d scripts there > are configuration files in /etc that are picked up by the scripts. > Using environment variables makes it look like something similar to > that approach. Ok, since both of you think using env is better, I updated the patch accordingly. Thanks! http://gerrit.cloudera.org:8080/#/c/13248/1/testdata/cluster/admin File testdata/cluster/admin: http://gerrit.cloudera.org:8080/#/c/13248/1/testdata/cluster/admin@425 PS1, Line 425: > drop Done http://gerrit.cloudera.org:8080/#/c/13248/1/testdata/cluster/admin@427 PS1, Line 427: > drop Done http://gerrit.cloudera.org:8080/#/c/13248/1/testdata/cluster/admin@428 PS1, Line 428: M > drop Done http://gerrit.cloudera.org:8080/#/c/13248/1/testdata/cluster/admin@497 PS1, Line 497: function restart { : if [[ $# -ne 1 ]]; then : echo restart must be called with a single argument -- the service to restart. 1>&2 : exit 1 : fi : local SERVICE=$1 : stop $SERVICE : start $SERVICE : } : : function delete_data { : # Delete namenode, datanode and KMS data while preserving directory structure. : rm -rf "$NODES_DIR/$NODE_PREFIX"*/data/dfs/{nn,dn}/* : rm -f "$NODES_DIR/$NODE_PREFIX"*/data/kms.keystore : delete_kudu_data : } > Did you verify this works as expected? I'm a bit confused about check at L Yeah, I verified. What is the confusing point? -- To view, visit http://gerrit.cloudera.org:8080/13248 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I734d14ede6a03ad52e820e38a1fbcbac0a40ede2 Gerrit-Change-Number: 13248 Gerrit-PatchSet: 3 Gerrit-Owner: Hao Hao Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Fri, 17 May 2019 05:57:32 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8504: Support CREATE TABLE statement with Kudu/HMS integration
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13318 ) Change subject: IMPALA-8504: Support CREATE TABLE statement with Kudu/HMS integration .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/13318/2/fe/src/test/java/org/apache/impala/analysis/AuditingTest.java File fe/src/test/java/org/apache/impala/analysis/AuditingTest.java: http://gerrit.cloudera.org:8080/#/c/13318/2/fe/src/test/java/org/apache/impala/analysis/AuditingTest.java@65 PS2, Line 65: new TAccessEvent("_impala_builtins", TCatalogObjectType.DATABASE, "VIEW_METADATA"))); line too long (93 > 90) -- To view, visit http://gerrit.cloudera.org:8080/13318 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I465673d749221bd5f3772814b1c22c2673a53f5c Gerrit-Change-Number: 13318 Gerrit-PatchSet: 2 Gerrit-Owner: Hao Hao Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Fri, 17 May 2019 05:56:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8504: Support CREATE TABLE statement with Kudu/HMS integration
Hello Thomas Marshall, Mike Percy, Alexey Serbin, Andrew Wong, Adar Dembo, Grant Henke, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13318 to look at the new patch set (#2). Change subject: IMPALA-8504: Support CREATE TABLE statement with Kudu/HMS integration .. IMPALA-8504: Support CREATE TABLE statement with Kudu/HMS integration This commits support the syntax of CREATE TABLE (and CTAS) statements for managed Kudu tables with Kudu/HMS integration. A follow up patch will address the actaul handling of CREATE TABLE statement with Kudu/HMS integration. For managed table the syntax remains the same. However, the detailed changes includes: 1) Kudu table will always be created with the new Kudu storage handler 'org.apache.kudu.hive.KuduStorageHandler' even when Kudu/HMS integration is disabled. The legacy storage handler will be eventually deprecated. 2) When Kudu/HMS integration is enabled, the Kudu table underneath the managed HMS table will follow the naming convention 'db_name.table_name' instead of 'impala::db_name.table_name'. 3) Add 'kudu.table_id' table property to be used with Kudu/HMS integration. This commits also extracts Kudu related DDL parsing and analyzing tests, so that they can be run with or without Kudu/HMS integration enabled. Change-Id: I465673d749221bd5f3772814b1c22c2673a53f5c --- M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/KuduUtil.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java A fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java A fe/src/test/java/org/apache/impala/analysis/AuditingKuduTest.java M fe/src/test/java/org/apache/impala/analysis/AuditingTest.java A fe/src/test/java/org/apache/impala/analysis/KuduHMSIntegrationTest.java M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java 13 files changed, 825 insertions(+), 572 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/13318/2 -- To view, visit http://gerrit.cloudera.org:8080/13318 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I465673d749221bd5f3772814b1c22c2673a53f5c Gerrit-Change-Number: 13318 Gerrit-PatchSet: 2 Gerrit-Owner: Hao Hao Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-8504: Introduce the new Kudu storage handler
Hao Hao has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13358 Change subject: IMPALA-8504: Introduce the new Kudu storage handler .. IMPALA-8504: Introduce the new Kudu storage handler This commits introduces the new Kudu storage handler 'org.apache.kudu.hive.KuduStorageHandler' that is used in Kudu/HMS integration. Tables with either the legacy storage handler 'com.cloudera.kudu.hive.KuduStorageHandler' or the new one are now considered as Kudu tables. A series of follow-up patches will be rely on the new storage handler to support Kudu/HMS integration in Impala. Change-Id: I75bcd5246005f4e35251aef9219f4d07eeb87dc6 --- M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java 2 files changed, 16 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/13358/1 -- To view, visit http://gerrit.cloudera.org:8080/13358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I75bcd5246005f4e35251aef9219f4d07eeb87dc6 Gerrit-Change-Number: 13358 Gerrit-PatchSet: 1 Gerrit-Owner: Hao Hao
[Impala-ASF-CR] IMPALA-8121: part 3: invalidate on memory pressure
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/13302 ) Change subject: IMPALA-8121: part 3: invalidate on memory pressure .. IMPALA-8121: part 3: invalidate on memory pressure Enable --invalidate_tables_on_memory_pressure=true on catalogd so that catalog can't hit out-of-memory. Testing: Ran core tests. Change-Id: I11d55ef0058abcf70f75b10ae9d89a0274859969 Reviewed-on: http://gerrit.cloudera.org:8080/13302 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M docker/catalogd/Dockerfile 1 file changed, 2 insertions(+), 1 deletion(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/13302 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I11d55ef0058abcf70f75b10ae9d89a0274859969 Gerrit-Change-Number: 13302 Gerrit-PatchSet: 8 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-8121: part 3: invalidate on memory pressure
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13302 ) Change subject: IMPALA-8121: part 3: invalidate on memory pressure .. Patch Set 7: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/13302 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I11d55ef0058abcf70f75b10ae9d89a0274859969 Gerrit-Change-Number: 13302 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Fri, 17 May 2019 04:54:14 + Gerrit-HasComments: No
[Impala-ASF-CR] Bump toolchain version to 35-4c4c185a57 to pick up CMake 3.14.3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13355 ) Change subject: Bump toolchain version to 35-4c4c185a57 to pick up CMake 3.14.3 .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4274/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/13355 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I251f1cd5e0f02abf5275f2f5aa37fac32532315e Gerrit-Change-Number: 13355 Gerrit-PatchSet: 1 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Fri, 17 May 2019 04:46:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7608: Estimate row count from file size when no stats available
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/12974 ) Change subject: IMPALA-7608: Estimate row count from file size when no stats available .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1198 PS8, Line 1198: // estimated row count by ESTIMATED_COMPRESSION_FACTOR. Should we estimate the compression factor differently depending on the file format? eg text I would guess would have a much lower compression factor than parquet. Might be interesting to look at some emprical data here since changing it later could change plans, and seems not too hard to do a little research -- To view, visit http://gerrit.cloudera.org:8080/12974 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic414121c8df0d5222e4aeea096b5365beb04568a Gerrit-Change-Number: 12974 Gerrit-PatchSet: 8 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Fri, 17 May 2019 04:44:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] HS2 + HTTP(S) + BASIC/LDAP based thrift server endpoint
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/13299 ) Change subject: HS2 + HTTP(S) + BASIC/LDAP based thrift server endpoint .. Patch Set 2: (17 comments) http://gerrit.cloudera.org:8080/#/c/13299/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13299/2//COMMIT_MSG@21 PS2, Line 21: Before this patch is committed, the intention is : to submit the changes to those files that are shown in this review as : a patch on Impala's native-toolchain Thrift. It seems to me like it would be easier to just copy-paste this, since the Thrift Protocol/Transport interfaces are typically pretty stable across Thrift versions. Or, contribute the changes upstream to Thrift itself. Carrying a largeish patch in the context of native-toolchain seems kinda like worst of both worlds to me. http://gerrit.cloudera.org:8080/#/c/13299/2//COMMIT_MSG@28 PS2, Line 28: sends a huge payload that can potentially crash the server. yea, assuming this is meant to be exposed to the internet, we might want to consider fuzz testing the pre-authenticated endpoint in an ASAN build. http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc File be/src/rpc/authentication.cc: http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@502 PS2, Line 502: LOG(ERROR) << "Failed to determine buffer length to decode base64 auth string."; for all of the log messages in this function, can we include the remote socket address? http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@505 PS2, Line 505: char out[out_max]; stack allocating here seems quite dangerous without constraining the length (eg to 1kb or something) http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@507 PS2, Line 507: if (!Base64Decode(base64, in_len, out_max, out, _len)) { Looking at Base64Decode, it doesn't seem to null-terminate the output, but down a few lines you're using 'out' as if it's a null-terminated C string. I'd feel safer about all this code if we used C++ strings. gutil has this function: bool Base64Unescape(const char* src, int slen, string* dest); which will take care of all the buffer sizing stuff for you http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@521 PS2, Line 521: size_t pass_len = out_len - user_len - 1; again I'd feel safer about the above code if we used C++ strings, like: pair u_p = strings::Split(up_string, strings::delimiter::Limit(":", 1)); (I can't necessarily spot a bug in your C string manipulation, but I also know that the above won't have a bug) http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@957 PS2, Line 957: SetConnectionUsername this function is a bit weirdly named, since in the HTTP case, it isn't setting the connection username a tall, but rather setting up a hook that will later set a connection username. Maybe a moot point based on my comment elsewhere that this needs to be per-request state rather than per-connection state for HTTP http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@977 PS2, Line 977: const string& err = Substitute("Bad transport type: $0", underlying_transport_type); can we just LOG(FATAL) here since it would be a coding bug? http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@998 PS2, Line 998: return Status::Expected(err); same http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/service/impala-server.cc@2424 PS2, Line 2424: if (hs2_http_port > 0) { Maybe we should use '-1' to mean disable? port 0 usually means "use an ephemeral port' http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/transport/THttpServer.h File be/src/transport/THttpServer.h: http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/transport/THttpServer.h@71 PS2, Line 71: std::vector that's a bit of an odd choice of type instead of std::string http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/transport/THttpServer.h@80 PS2, Line 80: THttpServerTransportFactory(bool requireBasicAuth = false) explicit http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/transport/THttpServer.cpp File be/src/transport/THttpServer.cpp: http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/transport/THttpServer.cpp@68 PS2, Line 68: strncmp probably needs to be made case-insensitive (odd that x-forwarded-for is not using THRIFT_strncasecmp). http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/transport/THttpServer.cpp@69 PS2, Line 69: 7 probably need to also check that sz >= 7, otherwise we might read past the end of value (one of the advantages of pulling THttpServer into Impala code itself instead of trying to patch Thrift woudl be we could use nice convenience functions like
[Impala-ASF-CR] Add USE CDP HIVE=true case to build-all-flag-combinations.sh
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/13335 ) Change subject: Add USE_CDP_HIVE=true case to build-all-flag-combinations.sh .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/13335 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3f8e689242e20efb37fbadf7c04764ea8ffb9a9f Gerrit-Change-Number: 13335 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Fri, 17 May 2019 03:38:57 + Gerrit-HasComments: No
[Impala-ASF-CR] Bump toolchain version to 35-4c4c185a57 to pick up CMake 3.14.3
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/13355 ) Change subject: Bump toolchain version to 35-4c4c185a57 to pick up CMake 3.14.3 .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/13355 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I251f1cd5e0f02abf5275f2f5aa37fac32532315e Gerrit-Change-Number: 13355 Gerrit-PatchSet: 1 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Fri, 17 May 2019 03:36:01 + Gerrit-HasComments: No
[Impala-ASF-CR] Allow data cache to be enabled optionally when running tests
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/13330 ) Change subject: Allow data cache to be enabled optionally when running tests .. Allow data cache to be enabled optionally when running tests This change adds two environment variables ${DATA_CACHE_DIR} and ${DATA_CACHE_SIZE} which specify the root directory path of the data cache and the data cache size respectively. If both are set, Impala cluster will be started with data cache enabled when running E2E tests. When running with HDFS, it will also set the config option --always_use_data_cache to true to force use of the data cache all the time. Change-Id: I09117ab289c2355408212a5fc6493ab751f4853b Reviewed-on: http://gerrit.cloudera.org:8080/13330 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M bin/run-all-tests.sh 1 file changed, 15 insertions(+), 0 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/13330 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I09117ab289c2355408212a5fc6493ab751f4853b Gerrit-Change-Number: 13330 Gerrit-PatchSet: 6 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho
[Impala-ASF-CR] Allow data cache to be enabled optionally when running tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13330 ) Change subject: Allow data cache to be enabled optionally when running tests .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/13330 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I09117ab289c2355408212a5fc6493ab751f4853b Gerrit-Change-Number: 13330 Gerrit-PatchSet: 5 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Fri, 17 May 2019 01:43:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13357 ) Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature .. Patch Set 4: Verified+1 Build Successful https://jenkins.impala.io/job/gerrit-docs-auto-test/334/ : Doc tests passed. -- To view, visit http://gerrit.cloudera.org:8080/13357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1 Gerrit-Change-Number: 13357 Gerrit-PatchSet: 4 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Fri, 17 May 2019 01:00:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8460: Simplify cluster membership management
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13207 ) Change subject: IMPALA-8460: Simplify cluster membership management .. Patch Set 10: (15 comments) Thanks for your improvements - I think this is good. I really like the test, it'll be valuable in future. http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc File be/src/scheduling/cluster-membership-mgr-test.cc: http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@35 PS10, Line 35: /// handling code in 4 subsequently more sophisticated ways: Maybe it's worth mentioning that it's effectively simulating the statestore except updates are done serially http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@48 PS10, Line 48: rng_.Reset(seed); I don't know about the pros/cons of Kudu's RNG but we have some test utilities that handle seeding and allow overriding with an env variable - RandTestUtil::SeedRng(). I'd be inclined to keep things consistent within our test code. http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@64 PS10, Line 64: /// Backends that are created but offline, i.e. they have no ClusterMembershipMgr and It might be helpful to document the states that a backend can be in (and how that's represented in terms of membership in these lists). http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@86 PS10, Line 86: pull poll? http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@168 PS10, Line 168: // Broadcast to all other backends I was trying to think if there were any interesting cases where different CMMs saw deltas with different timing, e.g. two executors quiesce simultaneously so don't see the other's quiescence before they mark themselves as quiesced. The test framework I think is simulating serial updates but the statestore polls for updates in parallel. I think this is unimportant since state of other backends doesn't affect the deltas pushed out in any way and each key is single writer, but just wanted to double check that you agreed with my thinking. http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@174 PS10, Line 174: send sent http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@178 PS10, Line 178: // TODO Make these a function ? http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@202 PS10, Line 202: /// statestore directly. I'm not sure what "driving the interface with the statestore" really means. http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@358 PS10, Line 358: const or constexpr http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@359 PS10, Line 359: p_add These are constants, right, so "const double P_ADD" http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@363 PS10, Line 363: // empty comment. Actually would be good to document that these are cumulative counts. http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@384 PS10, Line 384: if (p < p_delete && !quiescing_.empty()) { I feel like we should also be deleting non-quiescing backends, to simulate non-graceful failures. I don't think it adds much complexity (famous last words). http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.h File be/src/scheduling/cluster-membership-mgr.h: http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.h@79 PS8, Line 79: // implicitly-defined default and copy constructors. > I'm not sure I'm following your thoughts behind documenting it. Should we d I think both - i think they're pretty tightly coupled anyway. Implicit copy constructors for complex structs are a bit of a code smell - I'd be suspicious if I came across this code and think that maybe it was accidentally copying things. I can also see devs accidentally breaking copy construction if they didn't realise it was meant to be used that way. http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.cc File be/src/scheduling/cluster-membership-mgr.cc: http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.cc@188 PS8, Line 188: ckend explicit > I went with 1. Do you think we should expose the fact that we rely on a hos I guess I don't see a reason why we need to change the interface, so this is fine. http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.cc@299 PS8, Line 299: LOG(WARNING) << "Error updating frontend
[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13357 ) Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature .. Patch Set 4: Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/334/ Testing docs change - this change appears to modify docs/ and no code. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/13357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1 Gerrit-Change-Number: 13357 Gerrit-PatchSet: 4 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Fri, 17 May 2019 00:50:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
Hello Sahil Takiar, Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13357 to look at the new patch set (#4). Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature .. IMPALA-8490: [DOCS] Describe the S3 file handle caching feature Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1 --- M docs/topics/impala_scalability.xml 1 file changed, 109 insertions(+), 68 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/13357/4 -- To view, visit http://gerrit.cloudera.org:8080/13357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1 Gerrit-Change-Number: 13357 Gerrit-PatchSet: 4 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar
[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/13357 ) Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/13357/3/docs/topics/impala_scalability.xml File docs/topics/impala_scalability.xml: http://gerrit.cloudera.org:8080/#/c/13357/3/docs/topics/impala_scalability.xml@1022 PS3, Line 1022: One scalability aspect that affects heavily loaded clusters is the number of calls made > I think we want to explicitly mention the HDFS NameNode here as NameNode sc Done http://gerrit.cloudera.org:8080/#/c/13357/3/docs/topics/impala_scalability.xml@1032 PS3, Line 1032: You can reduce the number of calls made to your files system by enabling the file handle > "made to your files system" --> "made to your file system's metadata layer" Done -- To view, visit http://gerrit.cloudera.org:8080/13357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1 Gerrit-Change-Number: 13357 Gerrit-PatchSet: 3 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Fri, 17 May 2019 00:50:07 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/13357 ) Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/13357/3/docs/topics/impala_scalability.xml File docs/topics/impala_scalability.xml: http://gerrit.cloudera.org:8080/#/c/13357/3/docs/topics/impala_scalability.xml@1022 PS3, Line 1022: One scalability aspect that affects heavily loaded clusters is the number of calls made I think we want to explicitly mention the HDFS NameNode here as NameNode scalability issues are a big issue for Impala. Would revise to something like: "Once scalability aspect that affects heavily loaded clusters is the load on the metadata layer from looking up the details as each file is opened. On HDFS that can lead to increased load on the NameNode, and on S3 this can lead to an excessive number of S3 metadata requests." http://gerrit.cloudera.org:8080/#/c/13357/3/docs/topics/impala_scalability.xml@1032 PS3, Line 1032: You can reduce the number of calls made to your files system by enabling the file handle "made to your files system" --> "made to your file system's metadata layer" -- To view, visit http://gerrit.cloudera.org:8080/13357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1 Gerrit-Change-Number: 13357 Gerrit-PatchSet: 3 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Fri, 17 May 2019 00:33:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8460: Simplify cluster membership management
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13207 ) Change subject: IMPALA-8460: Simplify cluster membership management .. Patch Set 10: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/3258/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/13207 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib3cf9a8bb060d0c6e9ec8868b7b21ce01f8740a3 Gerrit-Change-Number: 13207 Gerrit-PatchSet: 10 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 17 May 2019 00:27:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13357 ) Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature .. Patch Set 3: Verified+1 Build Successful https://jenkins.impala.io/job/gerrit-docs-auto-test/333/ : Doc tests passed. -- To view, visit http://gerrit.cloudera.org:8080/13357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1 Gerrit-Change-Number: 13357 Gerrit-PatchSet: 3 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Fri, 17 May 2019 00:22:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/13357 ) Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/13357/2/docs/topics/impala_scalability.xml File docs/topics/impala_scalability.xml: http://gerrit.cloudera.org:8080/#/c/13357/2/docs/topics/impala_scalability.xml@1118 PS2, Line 1118: > For this section, and the ones below, just we just reference that the this I combined the 2 sections into one. -- To view, visit http://gerrit.cloudera.org:8080/13357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1 Gerrit-Change-Number: 13357 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Fri, 17 May 2019 00:16:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13357 ) Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature .. Patch Set 3: Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/333/ Testing docs change - this change appears to modify docs/ and no code. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/13357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1 Gerrit-Change-Number: 13357 Gerrit-PatchSet: 3 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Fri, 17 May 2019 00:15:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
Hello Sahil Takiar, Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13357 to look at the new patch set (#3). Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature .. IMPALA-8490: [DOCS] Describe the S3 file handle caching feature Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1 --- M docs/topics/impala_scalability.xml 1 file changed, 108 insertions(+), 69 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/13357/3 -- To view, visit http://gerrit.cloudera.org:8080/13357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1 Gerrit-Change-Number: 13357 Gerrit-PatchSet: 3 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar
[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13357 ) Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature .. Patch Set 1: Verified+1 Build Successful https://jenkins.impala.io/job/gerrit-docs-auto-test/331/ : Doc tests passed. -- To view, visit http://gerrit.cloudera.org:8080/13357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1 Gerrit-Change-Number: 13357 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Thu, 16 May 2019 23:57:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13357 ) Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature .. Patch Set 2: Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/332/ Testing docs change - this change appears to modify docs/ and no code. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/13357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1 Gerrit-Change-Number: 13357 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 16 May 2019 23:51:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13357 ) Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature .. Patch Set 2: Verified+1 Build Successful https://jenkins.impala.io/job/gerrit-docs-auto-test/332/ : Doc tests passed. -- To view, visit http://gerrit.cloudera.org:8080/13357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1 Gerrit-Change-Number: 13357 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Thu, 16 May 2019 23:59:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/13357 ) Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/13357/2/docs/topics/impala_scalability.xml File docs/topics/impala_scalability.xml: http://gerrit.cloudera.org:8080/#/c/13357/2/docs/topics/impala_scalability.xml@1118 PS2, Line 1118: For this section, and the ones below, just we just reference that the this is the same file handle cache used for HDFS files? Seems better than duplicating these sections, especially since I don't think these sections include anything unique to S3. -- To view, visit http://gerrit.cloudera.org:8080/13357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1 Gerrit-Change-Number: 13357 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Thu, 16 May 2019 23:57:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/13357 ) Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature .. Patch Set 2: Sorry for the messy diffs! The actual changes start from L1027. Thanks! -- To view, visit http://gerrit.cloudera.org:8080/13357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1 Gerrit-Change-Number: 13357 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Thu, 16 May 2019 23:52:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13357 to look at the new patch set (#2). Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature .. IMPALA-8490: [DOCS] Describe the S3 file handle caching feature Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1 --- M docs/topics/impala_scalability.xml 1 file changed, 116 insertions(+), 41 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/13357/2 -- To view, visit http://gerrit.cloudera.org:8080/13357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1 Gerrit-Change-Number: 13357 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13357 ) Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature .. Patch Set 1: Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/331/ Testing docs change - this change appears to modify docs/ and no code. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/13357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1 Gerrit-Change-Number: 13357 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 16 May 2019 23:49:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
Alex Rodoni has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13357 Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature .. IMPALA-8490: [DOCS] Describe the S3 file handle caching feature Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1 --- M docs/topics/impala_scalability.xml 1 file changed, 499 insertions(+), 736 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/13357/1 -- To view, visit http://gerrit.cloudera.org:8080/13357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1 Gerrit-Change-Number: 13357 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni
[Impala-ASF-CR] IMPALA-8460: Simplify cluster membership management
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/13207 ) Change subject: IMPALA-8460: Simplify cluster membership management .. Patch Set 10: (11 comments) Thanks for the comments. Please see PS10. http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.h File be/src/scheduling/cluster-membership-mgr.h: http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.h@67 PS8, Line 67: typedef std::shared_ptr BackendDescriptorPtr; > I think I would prefer if there was something in the name of this and Snaps Would BackendDescriptorSPtr work for you? I'd like to keep it from getting out-of-hand long, Maybe BeDescSharedPtr would work, too? http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.h@70 PS8, Line 70: typedef std::unordered_map BackendIdMap; > nit: std::string in headers here and elsewhere. I guess a "using" is in a h Done http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.h@79 PS8, Line 79: // implicitly-defined default and copy constructors. > Probably worth documenting that we expect this to be copy-constructed. I'm not sure I'm following your thoughts behind documenting it. Should we document it for users of the struct to mark it as the intended behavior? Or for future developers who want to make changes and need to keep copy-construction supported? For now I assumed the former and added some documentation in comment and the code. http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.h@121 PS8, Line 121: /// steps. > Can you document thread safety for these functions and the basics of the ca Done http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.h@133 PS8, Line 133: /// membership. This callback will only be called when backends are deleted from the > Valid to call any time before Init()? Done http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.cc File be/src/scheduling/cluster-membership-mgr.cc: http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.cc@39 PS8, Line 39: LOG(INFO) << "Starting cluster membership manager"; > Consider DCHECK(TestInfo::is_test()); Thx, TIL. http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.cc@188 PS8, Line 188: ckend explicit > Yeah I like 1) since it makes the comparison more explicit. We should add a I went with 1. Do you think we should expose the fact that we rely on a host:port combination through the interface? I.e., should if be RemoveExecutor(string krpc_address) or similar instead of passing the full be desc? http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.cc@235 PS8, Line 235: AddLo > nit: unnecessary std:: Done http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.cc@299 PS8, Line 299: LOG(WARNING) << "Error updating frontend membership snapshot: " << status.GetDetail(); > This method lacks a check to compare the local be desc inside current_backe Done http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.cc@313 PS8, Line 313: auto it = state.current_backends.find(local_backend_id_); > Would this consistency check have detected the bug you found? Do we need to Yes, that's how I found it, a host trying to quiesce itself will hit the CheckConsistency DCHECK after trying to remove its own backend descriptor. http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/executor-group.h File be/src/scheduling/executor-group.h: http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/executor-group.h@41 PS8, Line 41: /// host/IP address. > Worth making explicit that this can be copy-constructed. Either by document Done -- To view, visit http://gerrit.cloudera.org:8080/13207 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib3cf9a8bb060d0c6e9ec8868b7b21ce01f8740a3 Gerrit-Change-Number: 13207 Gerrit-PatchSet: 10 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 16 May 2019 23:37:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8460: Simplify cluster membership management
Hello Michael Ho, Thomas Marshall, Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13207 to look at the new patch set (#10). Change subject: IMPALA-8460: Simplify cluster membership management .. IMPALA-8460: Simplify cluster membership management This change adds a class to track cluster membership called ClusterMembershipMgr. It replaces the logic that was partially duplicated between the ImpalaServer and the Coordinator and makes sure that the local backend descriptor is consistent (IMPALA-8469). The ClusterMembershipMgr maintains a view of the cluster membership and incorporates incoming updates from the statestore. It also registers the local backend with the statestore after startup. Clients can obtain a consistent, immutable snapshot of the current cluster membership from the ClusterMembershipMgr. Additionally, callbacks can be registered to receive notifications of cluster membership changes. The ImpalaServer and Frontend use this mechanism. This change also unifies the naming of executor-related classes, in particular it renames "BackendConfig" to "ExecutorGroup". In anticipation of a subsequent change (IMPALA-8484), it adds maps to store multiple executor groups. This change also disables the generation of default operators from the thrift files and instead adds explicit implementations for the ones that we rely on. This forces us to explicitly specify comparators when manipulating containers of thrift structs and will help prevent accidental bugs. Testing: This change adds a backend unit test for the new cluster membership manager. The observable behavior of Impala does not change, and the existing scheduler unit test and end to end tests should make sure of that. Change-Id: Ib3cf9a8bb060d0c6e9ec8868b7b21ce01f8740a3 --- M be/src/benchmarks/scheduler-benchmark.cc M be/src/common/logging.h M be/src/gutil/strings/split.cc M be/src/gutil/strings/split.h M be/src/rpc/thrift-util.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/krpc-data-stream-mgr.cc M be/src/runtime/query-state.h M be/src/scheduling/CMakeLists.txt D be/src/scheduling/backend-config-test.cc D be/src/scheduling/backend-config.cc D be/src/scheduling/backend-config.h A be/src/scheduling/cluster-membership-mgr-test.cc A be/src/scheduling/cluster-membership-mgr.cc A be/src/scheduling/cluster-membership-mgr.h A be/src/scheduling/cluster-membership-test-util.cc A be/src/scheduling/cluster-membership-test-util.h A be/src/scheduling/executor-group-test.cc A be/src/scheduling/executor-group.cc A be/src/scheduling/executor-group.h M be/src/scheduling/query-schedule.cc M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler-test-util.cc M be/src/scheduling/scheduler-test-util.h M be/src/scheduling/scheduler-test.cc M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/statestore/statestore-subscriber.cc M be/src/statestore/statestore-subscriber.h M be/src/statestore/statestore.cc M be/src/testutil/in-process-servers.cc M be/src/util/container-util.h M be/src/util/network-util.cc M be/src/util/network-util.h M be/src/util/uid-util-test.cc M common/thrift/CMakeLists.txt M common/thrift/Frontend.thrift M common/thrift/StatestoreService.thrift M fe/src/main/java/org/apache/impala/service/Frontend.java M tests/custom_cluster/test_coordinators.py 44 files changed, 1,831 insertions(+), 1,008 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/13207/10 -- To view, visit http://gerrit.cloudera.org:8080/13207 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib3cf9a8bb060d0c6e9ec8868b7b21ce01f8740a3 Gerrit-Change-Number: 13207 Gerrit-PatchSet: 10 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-8460: Simplify cluster membership management
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13207 ) Change subject: IMPALA-8460: Simplify cluster membership management .. Patch Set 10: (3 comments) http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc File be/src/scheduling/cluster-membership-mgr-test.cc: http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@86 PS10, Line 86: // The empty delta is used to pull subscribers for updates without sending new changes. line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@180 PS10, Line 180: bool is_quiescing = find(quiescing_.begin(), quiescing_.end(), be) != quiescing_.end(); line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@388 PS10, Line 388: quiescing_.erase(remove(quiescing_.begin(), quiescing_.end(), be), quiescing_.end()); line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/13207 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib3cf9a8bb060d0c6e9ec8868b7b21ce01f8740a3 Gerrit-Change-Number: 13207 Gerrit-PatchSet: 10 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 16 May 2019 23:37:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8121: part 3: invalidate on memory pressure
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13302 ) Change subject: IMPALA-8121: part 3: invalidate on memory pressure .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4273/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/13302 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I11d55ef0058abcf70f75b10ae9d89a0274859969 Gerrit-Change-Number: 13302 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 16 May 2019 23:34:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8121: part 3: invalidate on memory pressure
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13302 ) Change subject: IMPALA-8121: part 3: invalidate on memory pressure .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/13302 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I11d55ef0058abcf70f75b10ae9d89a0274859969 Gerrit-Change-Number: 13302 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 16 May 2019 23:34:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8121: part 3: invalidate on memory pressure
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13302 ) Change subject: IMPALA-8121: part 3: invalidate on memory pressure .. Patch Set 6: Test failure was a flake -- To view, visit http://gerrit.cloudera.org:8080/13302 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I11d55ef0058abcf70f75b10ae9d89a0274859969 Gerrit-Change-Number: 13302 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 16 May 2019 23:34:39 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13352 ) Change subject: [WIP] IMPALA-8473: publish lineage info via hook .. Patch Set 7: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/3257/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 7 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Thu, 16 May 2019 22:22:22 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook
radford nguyen has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/13352 ) Change subject: [WIP] IMPALA-8473: publish lineage info via hook .. [WIP] IMPALA-8473: publish lineage info via hook This commit introduces a hook mechanism for publishing, lineage data specifically, but query information more generally, from Impala. Hooks can be consumed by downstream consumers (i.e. runtime dependencies) at supported places during Impala execution: - impalad startup - post-query execution The consumers are to be frontend Java dependencies intiated at runtime. The hadoop property `impala.post.exec.hooks` specifies a comma-separated list of hook consumer implementation classes that are instantiated and registered at impala start up. Lineage information is passed from the backend after a query completes (but before it returns) and given to every hook to execute asynchronously. IOW, a query may complete and return to the user before any or all hooks have completed executing. Tests: TBD Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc A fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java A fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java A fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/test/java/org/apache/impala/hooks/QueryExecHookManagerTest.java 8 files changed, 436 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/7 -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 7 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] Bump toolchain version to 35-4c4c185a57 to pick up CMake 3.14.3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13355 ) Change subject: Bump toolchain version to 35-4c4c185a57 to pick up CMake 3.14.3 .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3256/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13355 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I251f1cd5e0f02abf5275f2f5aa37fac32532315e Gerrit-Change-Number: 13355 Gerrit-PatchSet: 1 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 16 May 2019 21:42:57 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13352 ) Change subject: [WIP] IMPALA-8473: publish lineage info via hook .. Patch Set 7: (14 comments) http://gerrit.cloudera.org:8080/#/c/13352/6/be/src/service/frontend.h File be/src/service/frontend.h: http://gerrit.cloudera.org:8080/#/c/13352/6/be/src/service/frontend.h@190 PS6, Line 190: PostExecHook > nit: PostQueryExecHook is probably a better name? Done http://gerrit.cloudera.org:8080/#/c/13352/2/fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java File fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java: http://gerrit.cloudera.org:8080/#/c/13352/2/fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java@36 PS2, Line 36: > It is dangerous to let plugin fail Impala in this way. Could impala catches This is during impala daemon start-up; is it not preferred to prevent start-up if there is a configuration problem? http://gerrit.cloudera.org:8080/#/c/13352/2/fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java@47 PS2, Line 47: > "when a (qualifying) Impala query" should it be "after a (qualifying) Impal "has executed" already communicates that the point of time in question is in the past, so "after" is redundant. But I think it's clearer, so I'll change it to "after" http://gerrit.cloudera.org:8080/#/c/13352/2/fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java@48 PS2, Line 48: > should "has executed" be "has been executed"? This is just a difference between active voice and passive voice. No difference in meaning http://gerrit.cloudera.org:8080/#/c/13352/2/fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java@63 PS2, Line 63: > why don't you have cleanup function, which will be called by Impala when Im Because of the way impala shuts down, it will never be guaranteed to execute. This is covered and commented on in the design doc http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java File fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java: http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java@28 PS6, Line 28: lineageGraph > nit: use _ suffix naming convention for member varaibles, i.e. lineageGraph kk http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java@31 PS6, Line 31: Objects.requireNonNull(lineageGraph).deepCopy(); > I don't think we want to do a deep copy each time. Each time it's constructed? Or called? Because a single context object is passed to all the hook implementations http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java File fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java: http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@33 PS6, Line 33: singleton > i'm not 100% convinced why this needs to be a singleton. Can Frontend owns I agree. But what about JniFrontend owning it? http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@78 PS6, Line 78: private static final int NUM_HOOK_THREADS = 3; > should this be configurable? Probably. I'd like to get the hook configuration sorted first before I tackle this though, since they should probably use the same config mechanism http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@97 PS6, Line 97: Configuration config > I don't think passing hadoop configuration is correct here. Impala doesn't Yeah that sounds better. Using a backend flag http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@98 PS6, Line 98: LOG.info("impalaStartup hook invoked with config: {}", config); > config can get pretty big, we don't want to print it on info log level The `toString` of the config object actually only prints the file names, not their content. http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@152 PS6, Line 152: @Override : protected void finalize() throws Throwable { : this.cleanUp(); : super.finalize(); : } > i don't know if it's a good idea do this in finalize(), maybe use Runtime.a Will do http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@175 PS6, Line 175: void > can we return a List instead of void in case we want to do somethin Do you intend on passing the futures back to the backend? I fully support at least storing the futures here in case we decide to take action later on, but since this is the "manager", i think that any course of action
[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13352 ) Change subject: [WIP] IMPALA-8473: publish lineage info via hook .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/13352/7/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java File fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java: http://gerrit.cloudera.org:8080/#/c/13352/7/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@55 PS7, Line 55: * is performed asynchronously during {@link #executePostQueryHooks(PostQueryHookContext)}. line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 7 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Thu, 16 May 2019 21:35:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8438: Store WriteId and ValidWriteId list for table and partition
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/13215 ) Change subject: IMPALA-8438: Store WriteId and ValidWriteId list for table and partition .. IMPALA-8438: Store WriteId and ValidWriteId list for table and partition This happens when tables load metadata from HMS. Add MetastoreShim functions to support HMS3 only functions. Add validwriteIdlists to query profile through timeline. Tests: Manually tests HMS2 and HMS3, using log files to check Unit tests against HMS3 ToDo: WriteId and valid writeIds can be fetched in other time, need more study on that. Profile example: Query Compilation: 5s057ms - Metadata load started: 63.006ms (63.006ms) - Metadata load finished. loaded-tables=2/2...: 4s801ms (4s738ms) - Loaded ValidWriteIdLists: acid.insert_only_no_partitions:6:9223372036854775807:: acid.insert_only_with_partitions:3:9223372036854775807:: : 4s921ms (120.580ms) - Analysis finished: 4s929ms (8.013ms) Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54 Reviewed-on: http://gerrit.cloudera.org:8080/13215 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M common/thrift/CatalogObjects.thrift A fe/src/compat-hive-2/java/org/apache/hadoop/hive/common/ValidWriteIdList.java M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java M fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java M fe/src/main/java/org/apache/impala/catalog/FeTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java M fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java 15 files changed, 348 insertions(+), 3 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/13215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54 Gerrit-Change-Number: 13215 Gerrit-PatchSet: 19 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-8438: Store WriteId and ValidWriteId list for table and partition
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13215 ) Change subject: IMPALA-8438: Store WriteId and ValidWriteId list for table and partition .. Patch Set 18: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/13215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54 Gerrit-Change-Number: 13215 Gerrit-PatchSet: 18 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 16 May 2019 21:12:15 + Gerrit-HasComments: No
[Impala-ASF-CR] Bump toolchain version to 35-4c4c185a57 to pick up CMake 3.14.3
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13355 ) Change subject: Bump toolchain version to 35-4c4c185a57 to pick up CMake 3.14.3 .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/13355 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I251f1cd5e0f02abf5275f2f5aa37fac32532315e Gerrit-Change-Number: 13355 Gerrit-PatchSet: 1 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 16 May 2019 20:59:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8560: Prometheus metrics support in Impala
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13345 ) Change subject: IMPALA-8560: Prometheus metrics support in Impala .. Patch Set 2: (13 comments) Some more style nits and I some requests for additional test coverage. You probably have a better understanding of the prometheus format than I do, so if you can think of more edge cases that I didn't ask for tests for, please add them. http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/collection-metrics.h File be/src/util/collection-metrics.h: http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/collection-metrics.h@80 PS2, Line 80: + << You can also just combine these lines, i.e. *val << name << " " << s; http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/collection-metrics.h@170 PS2, Line 170: string nit: std::string http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/collection-metrics.h@173 PS2, Line 173: *val << name; You can combine these lines by chaining <<. If you do that here and below it will be a lot more concise. http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/histogram-metric.h File be/src/util/histogram-metric.h: http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/histogram-metric.h@75 PS2, Line 75: string nit: std::string http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/histogram-metric.h@80 PS2, Line 80: *value << "{le=0.2} "; Same comment about chaining << to get this into fewer lines. If you look at some of the LOG statements in the code we even have multi-line chains in a lot of places. http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/metrics-test.cc File be/src/util/metrics-test.cc: http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/metrics-test.cc@470 PS2, Line 470: + nit: << instead of + here and in below lines http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/metrics-test.cc@477 PS2, Line 477: TEST_F(MetricsTest, CountersPrometheus) { Can you add a BYTE unit counter? http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/metrics-test.cc@488 PS2, Line 488: AddMetricDef("gauge", TMetricKind::GAUGE, TUnit::NONE); Can you add tests for the supported unit types: $ grep unit common/thrift/metrics.json | sort | uniq "units": "BYTES", "units": "NONE", "units": "TIME_MS", "units": "TIME_NS", "units": "TIME_S", "units": "UNIT", http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/metrics-test.cc@495 PS2, Line 495: TEST_F(MetricsTest, PropertiesPrometheus) { Can you test some more edge cases with the strings, e.g. what happens if there's a line break in the string? It looks like the prometheus format requires strings to be escaped: https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/metrics-test.cc@513 PS2, Line 513: AssertPrometheus(stats_val, "stats_metric", Do we have test coverage for all the floating point formats in the spec: https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/metrics-test.cc@514 PS2, Line 514: "stats_metric_count 2\nstats_metric_last 20.00\nstats_metric_min " Can you format this in a way that makes it easier to see the lines, i.e. "stats_metric_count 2\n" "stats_metric_last 20.00\n" "stats_metric_min 10\n" ... http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/metrics-test.cc@533 PS2, Line 533: "histogram-metric{le=0.2} 2500\nhistogram-metric{le=0.5} " Same comment on formatting. http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/metrics-test.cc@543 PS2, Line 543: exp_val << "# HELP description\n# TYPE counter1 COUNTER\ncounter1 2048\n# HELP " same comment on formatting. -- To view, visit http://gerrit.cloudera.org:8080/13345 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5349085a2007b568cb97f9b8130804ea64d7bb08 Gerrit-Change-Number: 13345 Gerrit-PatchSet: 2 Gerrit-Owner: Harshil Gerrit-Reviewer: Harshil Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 16 May 2019 20:57:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8560: Prometheus metrics support in Impala
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13345 ) Change subject: IMPALA-8560: Prometheus metrics support in Impala .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3255/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13345 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5349085a2007b568cb97f9b8130804ea64d7bb08 Gerrit-Change-Number: 13345 Gerrit-PatchSet: 2 Gerrit-Owner: Harshil Gerrit-Reviewer: Harshil Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 16 May 2019 20:50:06 + Gerrit-HasComments: No
[Impala-ASF-CR] Bump toolchain version to 35-4c4c185a57 to pick up CMake 3.14.3
Hello Todd Lipcon, Tim Armstrong, Joe McDonnell, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/13355 to review the following change. Change subject: Bump toolchain version to 35-4c4c185a57 to pick up CMake 3.14.3 .. Bump toolchain version to 35-4c4c185a57 to pick up CMake 3.14.3 Fixes in the new CMake version include: - improved rule generation for ninja - forbidding the use of select() during CMake runs, which caused hangs in the past. Private builds have passed in the following configurations: - on Ubuntu 14.04 using ninja - on CentOS 7.4 and CnetOS 6.4 using GNU make - in a Docker container (using docker/test-with-docker.py) using GNU make on Ubuntu 16.04 and Ubuntu 18.04 Change-Id: I251f1cd5e0f02abf5275f2f5aa37fac32532315e --- M bin/impala-config.sh 1 file changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/13355/1 -- To view, visit http://gerrit.cloudera.org:8080/13355 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I251f1cd5e0f02abf5275f2f5aa37fac32532315e Gerrit-Change-Number: 13355 Gerrit-PatchSet: 1 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-8344: Add support for running the minicluster with S3Guard
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/13020 ) Change subject: IMPALA-8344: Add support for running the minicluster with S3Guard .. Patch Set 4: (6 comments) mostly minor comments, overall LGTM pending a few comments http://gerrit.cloudera.org:8080/#/c/13020/4/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/13020/4/bin/impala-config.sh@341 PS4, Line 341: export S3GUARD_METADATASTORE_IMPL="org.apache.hadoop.fs.s3a.s3guard.NullMetadataStore" Is this necessary? I think the default impl uses the NullMetadataStore already, right? http://gerrit.cloudera.org:8080/#/c/13020/4/testdata/bin/load-test-warehouse-snapshot.sh File testdata/bin/load-test-warehouse-snapshot.sh: http://gerrit.cloudera.org:8080/#/c/13020/4/testdata/bin/load-test-warehouse-snapshot.sh@67 PS4, Line 67: hadoop s3guard prune -seconds 1 -meta "dynamodb://${S3GUARD_DYNAMODB_TABLE}" \ is it necessary to prune a newly initialized S3Guard table? http://gerrit.cloudera.org:8080/#/c/13020/4/tests/common/impala_test_suite.py File tests/common/impala_test_suite.py: http://gerrit.cloudera.org:8080/#/c/13020/4/tests/common/impala_test_suite.py@a175 PS4, Line 175: Should we delete the S3Client as well? It doesn't look like it is used anywhere else, right? We can remove the boto3 dependency as well, which decreases the number of Python dependencies we have. http://gerrit.cloudera.org:8080/#/c/13020/4/tests/util/hdfs_util.py File tests/util/hdfs_util.py: http://gerrit.cloudera.org:8080/#/c/13020/4/tests/util/hdfs_util.py@158 PS4, Line 158: def __init__(self, filesystem_type="HDFS"): the filesystem_type is just purely for debug messages right? might be useful to mention that http://gerrit.cloudera.org:8080/#/c/13020/4/tests/util/hdfs_util.py@163 PS4, Line 163: hadoop_command = ['hadoop', 'fs'] + command nit: hdfs instead of hadoop http://gerrit.cloudera.org:8080/#/c/13020/4/tests/util/hdfs_util.py@170 PS4, Line 170: def create_file(self, path, file_data, overwrite=True): nit: would be nice to have some docs for these methods, e.g. mention that create_file actually writes to a temp file on the local fs first and then uploads the file to the target fs -- To view, visit http://gerrit.cloudera.org:8080/13020 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3c748529a494bb6e70fec96dc031523ff79bf61d Gerrit-Change-Number: 13020 Gerrit-PatchSet: 4 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Thu, 16 May 2019 20:41:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Allow data cache to be enabled optionally when running tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13330 ) Change subject: Allow data cache to be enabled optionally when running tests .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4272/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/13330 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I09117ab289c2355408212a5fc6493ab751f4853b Gerrit-Change-Number: 13330 Gerrit-PatchSet: 5 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Thu, 16 May 2019 20:25:49 + Gerrit-HasComments: No
[Impala-ASF-CR] Allow data cache to be enabled optionally when running tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13330 ) Change subject: Allow data cache to be enabled optionally when running tests .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/13330 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I09117ab289c2355408212a5fc6493ab751f4853b Gerrit-Change-Number: 13330 Gerrit-PatchSet: 5 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Thu, 16 May 2019 20:25:48 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13352 ) Change subject: [WIP] IMPALA-8473: publish lineage info via hook .. Patch Set 5: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/3254/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 5 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Thu, 16 May 2019 20:20:40 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13352 ) Change subject: [WIP] IMPALA-8473: publish lineage info via hook .. Patch Set 3: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/3253/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 3 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Thu, 16 May 2019 20:20:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8560: Prometheus metrics support in Impala
Harshil has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/13345 ) Change subject: IMPALA-8560: Prometheus metrics support in Impala .. IMPALA-8560: Prometheus metrics support in Impala -- This change adds Prometheus text explosion format metric generation. -- More details can be found below: -- https://prometheus.io/docs/instrumenting/exposition_formats -- Added unit test to test this change Change-Id: I5349085a2007b568cb97f9b8130804ea64d7bb08 --- M be/src/util/collection-metrics.h M be/src/util/histogram-metric.h M be/src/util/metrics-test.cc M be/src/util/metrics.cc M be/src/util/metrics.h 5 files changed, 280 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/13345/2 -- To view, visit http://gerrit.cloudera.org:8080/13345 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5349085a2007b568cb97f9b8130804ea64d7bb08 Gerrit-Change-Number: 13345 Gerrit-PatchSet: 2 Gerrit-Owner: Harshil Gerrit-Reviewer: Harshil Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6042: Allow Impala shell to use a global impalarc config
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/13313 ) Change subject: IMPALA-6042: Allow Impala shell to use a global impalarc config .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/13313/1/shell/option_parser.py File shell/option_parser.py: http://gerrit.cloudera.org:8080/#/c/13313/1/shell/option_parser.py@260 PS1, Line 260: help=SUPPRESS_HELP > One potential use case that maybe useful is is to specify a different locat This is one of the ways we thought of while coming up with this solution, so I am completely on board if you think this use case will be common. The only reason I am usually skeptical about adding env variables is that, if you are sourcing a bunch of them in and the naming is ambiguous then there can be some conflicting ones and it can get difficult to manage. We will have to be super careful about naming this so its super obvious what it is used for, how about IMPALA_SHELL_GLOBAL_CONFIG_FILE. Also if we want users to set this then we might want to expose this somewhere in the help text and make sure we update the impala docs. -- To view, visit http://gerrit.cloudera.org:8080/13313 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3a3179b6d9c9e3b2b01d6d3c5847cadb68782816 Gerrit-Change-Number: 13313 Gerrit-PatchSet: 4 Gerrit-Owner: Ethan Xue Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Ethan Xue Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 16 May 2019 19:56:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Allow running backend tests sharded and in parallel
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/13290 ) Change subject: Allow running backend tests sharded and in parallel .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/13290/2/be/CMakeLists.txt File be/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/13290/2/be/CMakeLists.txt@675 PS2, Line 675: RUN_SERIAL I haven't use this before. If we have -j 8 on our ctest invocation, and expr-test and decompress-test are sharded, expr-test shards may overlap with decompress-test shards. Is that right? (I don't mean to imply any problem with that.) -- To view, visit http://gerrit.cloudera.org:8080/13290 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0d44475e3f5a45c806f00d5003eeadf59683b009 Gerrit-Change-Number: 13290 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 16 May 2019 19:58:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6042: Allow Impala shell to use a global impalarc config
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/13313 ) Change subject: IMPALA-6042: Allow Impala shell to use a global impalarc config .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/13313/1/shell/option_parser.py File shell/option_parser.py: http://gerrit.cloudera.org:8080/#/c/13313/1/shell/option_parser.py@260 PS1, Line 260: help=SUPPRESS_HELP > This is one of the ways we thought of while coming up with this solution, s +1 on IMPALA_SHELL_GLOBAL_CONFIG_FILE since this is an Impala shell specific config. -- To view, visit http://gerrit.cloudera.org:8080/13313 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3a3179b6d9c9e3b2b01d6d3c5847cadb68782816 Gerrit-Change-Number: 13313 Gerrit-PatchSet: 4 Gerrit-Owner: Ethan Xue Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Ethan Xue Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 16 May 2019 19:58:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13352 ) Change subject: [WIP] IMPALA-8473: publish lineage info via hook .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/13352/3/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/13352/3/be/src/service/impala-server.cc@496 PS3, Line 496: // line has trailing whitespace -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 3 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 16 May 2019 19:23:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13352 ) Change subject: [WIP] IMPALA-8473: publish lineage info via hook .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/13352/5/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/13352/5/be/src/service/impala-server.cc@496 PS5, Line 496: // line has trailing whitespace -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 5 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 16 May 2019 19:34:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13352 ) Change subject: [WIP] IMPALA-8473: publish lineage info via hook .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/13352/5/fe/src/main/java/org/apache/impala/hooks/ImpalaHookContext.java File fe/src/main/java/org/apache/impala/hooks/ImpalaHookContext.java: http://gerrit.cloudera.org:8080/#/c/13352/5/fe/src/main/java/org/apache/impala/hooks/ImpalaHookContext.java@31 PS5, Line 31: this.lineageGraph = Objects.requireNonNull(lineageGraph).deepCopy(); Lot of defensive copies because `TLineageGraph` is mutable, and context objects are given by reference to potentially multiple hooks. This initial copy may effect performance since it occurs even if no hooks are registered. This could be avoided if we provide an additional flag to disable hooks, independent of hook registration. Then we could avoid the JNI call and this constructor altogether. Feedback? -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 5 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Thu, 16 May 2019 19:34:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook
radford nguyen has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/13352 ) Change subject: [WIP] IMPALA-8473: publish lineage info via hook .. [WIP] IMPALA-8473: publish lineage info via hook This commit introduces a hook mechanism for publishing, lineage data specifically, but query information more generally, from Impala. Hooks can be consumed by downstream consumers (i.e. runtime dependencies) at supported places during Impala execution: - impalad startup - post-query execution The consumers are to be frontend Java dependencies intiated at runtime. The hadoop property `impala.post.exec.hooks` specifies a comma-separated list of hook consumer implementation classes that are instantiated and registered at impala start up. Lineage information is passed from the backend after a query completes (but before it returns) and given to every hook to execute asynchronously. IOW, a query may complete and return to the user before any or all hooks have completed executing. Tests: TBD Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc A fe/src/main/java/org/apache/impala/hooks/ImpalaExecHookManager.java A fe/src/main/java/org/apache/impala/hooks/ImpalaHookContext.java A fe/src/main/java/org/apache/impala/hooks/ImpalaQueryExecHook.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java 7 files changed, 361 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/5 -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 5 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook
radford nguyen has removed Todd Lipcon from this change. ( http://gerrit.cloudera.org:8080/13352 ) Change subject: [WIP] IMPALA-8473: publish lineage info via hook .. Removed reviewer Todd Lipcon. -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteReviewer Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 3 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Fredy Wijaya
[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook
radford nguyen has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13352 Change subject: [WIP] IMPALA-8473: publish lineage info via hook .. [WIP] IMPALA-8473: publish lineage info via hook This commit introduces a hook mechanism for publishing, lineage data specifically, but query information more generally, from Impala. Hooks can be consumed by downstream consumers (i.e. runtime dependencies) at supported places during Impala execution: - impalad startup - post-query execution The consumers are to be frontend Java dependencies intiated at runtime. The hadoop property `impala.post.exec.hooks` specifies a comma-separated list of hook consumer implementation classes that are instantiated and registered at impala start up. Lineage information is passed from the backend after a query completes (but before it returns) and given to every hook to execute asynchronously. IOW, a query may complete and return to the user before any or all hooks have completed executing. Tests: TBD Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc A fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java A fe/src/main/java/org/apache/impala/hooks/ImpalaExecHookManager.java A fe/src/main/java/org/apache/impala/hooks/ImpalaHookContext.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java 7 files changed, 361 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/3 -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 3 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] Allow data cache to be enabled optionally when running tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13330 ) Change subject: Allow data cache to be enabled optionally when running tests .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3252/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13330 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I09117ab289c2355408212a5fc6493ab751f4853b Gerrit-Change-Number: 13330 Gerrit-PatchSet: 4 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Thu, 16 May 2019 19:12:40 + Gerrit-HasComments: No
[Impala-ASF-CR] Allow data cache to be enabled optionally when running tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13330 ) Change subject: Allow data cache to be enabled optionally when running tests .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3251/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13330 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I09117ab289c2355408212a5fc6493ab751f4853b Gerrit-Change-Number: 13330 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Thu, 16 May 2019 19:11:42 + Gerrit-HasComments: No
[Impala-ASF-CR] DWX-124: Prometheus metrics support in Impala
Harshil has posted comments on this change. ( http://gerrit.cloudera.org:8080/13345 ) Change subject: DWX-124: Prometheus metrics support in Impala .. Patch Set 1: (19 comments) http://gerrit.cloudera.org:8080/#/c/13345/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13345/1//COMMIT_MSG@9 PS1, Line 9: -- This change adds Prometheus text explosion format metric generation in impala, more details > Can you wrap lines to 72 chars in commit messages. Done http://gerrit.cloudera.org:8080/#/c/13345/1//COMMIT_MSG@13 PS1, Line 13: -- Doc that talk about this approach is here: https://docs.google.com/document/d/1KeOuFowH83q9l7iGHFMXL2mAejt4EZCjn3XfKDu7sho/edit?usp=sharing > This doc isn't publicly accesssible. Done http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/collection-metrics.h File be/src/util/collection-metrics.h: http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/collection-metrics.h@170 PS1, Line 170: string > Use std::string in header files (we have a policy not to import std::string Done http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/collection-metrics.h@170 PS1, Line 170: string name, std::stringstream* val, std::stringstream* metric_kind) { > It's annoying that the original author of this file didn't move the ToJson( ok thanks http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/collection-metrics.h@172 PS1, Line 172: string collect_data; > We should just append to val instead of adding an intermediate string. Done http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/collection-metrics.h@175 PS1, Line 175: tatic_cast( > Why are the casts needed? this seems suspicious to me - like it might not w Done http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/collection-metrics.h@178 PS1, Line 178: collect_data += name + "_last " + std::to_string(value_) + "\n"; > For string concatenation we either prefer using << to append to a stringstr Done http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/collection-metrics.h@200 PS1, Line 200: sqrt > std::sqrt Done http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/collection-metrics.h@204 PS1, Line 204: *metric_kind << "# TYPE " + name + " counter"; > use << to append to the stream instead of string concatenation. Done http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/histogram-metric.h File be/src/util/histogram-metric.h: http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/histogram-metric.h@81 PS1, Line 81: histogram_data += > Same comments about using << to append directly to value Done http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/histogram-metric.h@99 PS1, Line 99: *metric_kind << "# TYPE " + name + " histogram"; > Same comment about << Done http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/metrics.h File be/src/util/metrics.h: http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/metrics.h@171 PS1, Line 171: string > std::string in headers Done http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/metrics.h@173 PS1, Line 173: *metric_kind << "# TYPE " + name + " " + PrintThriftEnum(kind()).c_str(); > Use << instead of + Done http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/metrics.cc File be/src/util/metrics.cc: http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/metrics.cc@190 PS1, Line 190: return; > return not needed Done http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/metrics.cc@206 PS1, Line 206: > Don't need to add vertical whitespace Done http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/metrics.cc@220 PS1, Line 220: void MetricGroup::ToPrometheus(bool include_children, std::stringstream* out_val) { > Same comments apply about using << instead of strings. Done http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/metrics.cc@220 PS1, Line 220: std:: > shouldn't need to use std:: prefix in .cc files. Done http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/metrics.cc@245 PS1, Line 245: > unnecessary vertical whitespace and return Done http://gerrit.cloudera.org:8080/#/c/13345/1/bin/distcc/distcc_server_setup.sh File bin/distcc/distcc_server_setup.sh: http://gerrit.cloudera.org:8080/#/c/13345/1/bin/distcc/distcc_server_setup.sh@58 PS1, Line 58: if ! [[ $LSB_VERSION == 14.04 || $LSB_VERSION == 16.04 || $LSB_VERSION == 18.04 ]]; then > Great if this works - but can you do this in a separate change? I also want I am using ubuntu 18.04, so I tested this by running it on host that I have. -- To view, visit http://gerrit.cloudera.org:8080/13345 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5349085a2007b568cb97f9b8130804ea64d7bb08 Gerrit-Change-Number: 13345 Gerrit-PatchSet: 1 Gerrit-Owner: Harshil Gerrit-Reviewer: Harshil Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] Allow data cache to be enabled optionally when running tests
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/13330 ) Change subject: Allow data cache to be enabled optionally when running tests .. Patch Set 4: Code-Review+2 This looks good to me. -- To view, visit http://gerrit.cloudera.org:8080/13330 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I09117ab289c2355408212a5fc6493ab751f4853b Gerrit-Change-Number: 13330 Gerrit-PatchSet: 4 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Thu, 16 May 2019 19:12:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8116: [DOCS] A new doc for Impala Scaling Limits
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13277 ) Change subject: IMPALA-8116: [DOCS] A new doc for Impala Scaling Limits .. Patch Set 4: Verified+1 Build Successful https://jenkins.impala.io/job/gerrit-docs-auto-test/330/ : Doc tests passed. -- To view, visit http://gerrit.cloudera.org:8080/13277 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie6df672e5de1fb2d34f6b78524e8f20e85ea34fb Gerrit-Change-Number: 13277 Gerrit-PatchSet: 4 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 16 May 2019 19:00:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6042: Allow Impala shell to use a global impalarc config
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13313 ) Change subject: IMPALA-6042: Allow Impala shell to use a global impalarc config .. Patch Set 4: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/3250/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/13313 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3a3179b6d9c9e3b2b01d6d3c5847cadb68782816 Gerrit-Change-Number: 13313 Gerrit-PatchSet: 4 Gerrit-Owner: Ethan Xue Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Ethan Xue Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 16 May 2019 18:52:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8116: [DOCS] A new doc for Impala Scaling Limits
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/13277 ) Change subject: IMPALA-8116: [DOCS] A new doc for Impala Scaling Limits .. Patch Set 3: (6 comments) http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml File docs/topics/impala_scaling_limits.xml: http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml@77 PS3, Line 77: Number of Impalad Coordinators: 1 coordinator for every 50 executors > for at most every 50 executors (could be misinterpreted as-is) Done http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml@148 PS3, Line 148: Number of views > Also total, per database? Done http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml@152 PS3, Line 152: Number of user-defined functions > Also total, per database? Done http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml@167 PS3, Line 167: Number of block per file > HDFS blocks? Done http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml@171 PS3, Line 171: Footer size > Footer size isn't so much under user control - it's a function of the amoun Removed http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml@282 PS3, Line 282: fragment > fragments Done -- To view, visit http://gerrit.cloudera.org:8080/13277 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie6df672e5de1fb2d34f6b78524e8f20e85ea34fb Gerrit-Change-Number: 13277 Gerrit-PatchSet: 3 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 16 May 2019 18:48:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8116: [DOCS] A new doc for Impala Scaling Limits
Hello Lars Volker, Fredy Wijaya, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13277 to look at the new patch set (#4). Change subject: IMPALA-8116: [DOCS] A new doc for Impala Scaling Limits .. IMPALA-8116: [DOCS] A new doc for Impala Scaling Limits - Listed the known/tested SCALING Limits. - Unknown limits are marked hidden for now. When the numbers are available, will remove the hidden tag. Change-Id: Ie6df672e5de1fb2d34f6b78524e8f20e85ea34fb --- M docs/impala.ditamap A docs/topics/impala_scaling_limits.xml 2 files changed, 365 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/77/13277/4 -- To view, visit http://gerrit.cloudera.org:8080/13277 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie6df672e5de1fb2d34f6b78524e8f20e85ea34fb Gerrit-Change-Number: 13277 Gerrit-PatchSet: 4 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-8116: [DOCS] A new doc for Impala Scaling Limits
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13277 ) Change subject: IMPALA-8116: [DOCS] A new doc for Impala Scaling Limits .. Patch Set 4: Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/330/ Testing docs change - this change appears to modify docs/ and no code. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/13277 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie6df672e5de1fb2d34f6b78524e8f20e85ea34fb Gerrit-Change-Number: 13277 Gerrit-PatchSet: 4 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 16 May 2019 18:49:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7608: Estimate row count from file size when no stats available
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12974 ) Change subject: IMPALA-7608: Estimate row count from file size when no stats available .. Patch Set 8: (20 comments) Another round of thoughts on the test infrastructure. I think the product code looks fine, just want to make sure the tests are more maintainable and easier to work on in the future. http://gerrit.cloudera.org:8080/#/c/12974/8/be/src/service/query-options.h File be/src/service/query-options.h: http://gerrit.cloudera.org:8080/#/c/12974/8/be/src/service/query-options.h@48 PS8, Line 48: // The Debug webpage won't be able to be connected once the DCHECK fails. Remove the last line, no need to describe symptoms of a DCHECK here. http://gerrit.cloudera.org:8080/#/c/12974/8/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/12974/8/common/thrift/ImpalaService.thrift@391 PS8, Line 391: DISABLE_HDFS_NUM_ROWS_EST EST->ESTIMATE. No real need to abbreviate in a safety valve argument that will be rarely used. http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@285 PS8, Line 285: // Added this field to prevent the case in the method getStatsNumRows when Can you rewrite the comment to just describe what the value is. No need to include the story about why you added it here. http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@288 PS8, Line 288: private double defaultRowWidth_ = 1.0; Should be a constant, not a variable, i.e. private static double DEFAULT_ROW_WIDTH; Maybe also rename to indicate that it's an estimate, not the real row width, i.e. DEFAULT_ROW_WIDTH_ESTIMATE. http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1148 PS8, Line 1148: private long getStatsNumRows(Analyzer analyzer) { I would prefer if you just passed in the query options to this methods. Otherwise when reading code one might think that it's using the analyzer in a more complex way. http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1190 PS8, Line 1190: // In the case when there is no Column defined, we use an ultimate Do we have a test case for this? http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java File fe/src/test/java/org/apache/impala/planner/CardinalityTest.java: PS8: We should also test the nodes with the new estimate functionality disabled in these unit tests. Just to make sure that all branches in the code that handle missing stats are covered. http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@234 PS8, Line 234: // The cardinality of an AggregateNode is always 1 no matter Not true if there's a group by - we should test that too. http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@262 PS8, Line 262: path.add(0); Can we use something cleaner for constant lists like: Arrays.asList(0, 1, 0); http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@454 PS8, Line 454: GENERATE_DISTRIBUTED_PLAN I think this new planner test option is awkward since the planner tests already have a different mechanism to test single-node plans. Also the default for planner tests is to generate a distributed plan, which makes it more confusing. I'd propose that we instead implement it purely within this file instead - i.e. just add an argument to verifyCardinality. http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@482 PS8, Line 482: protected void verifyCardinality(String query, long expected) { For the tests that rely on file-size-based estimates, I think we want some margin of error for the cardinality estimates.I think Paul added something for the planner tests. The problem is that file sizes may vary a little bit, e.g. because of changes to the file writers or random variation, so cardinality estimates based on file size can vary too. http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/test/java/org/apache/impala/planner/PlannerTest.java File fe/src/test/java/org/apache/impala/planner/PlannerTest.java: http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@253 PS8, Line 253: public void testFkPkJoinDetectionWithHDFSNumRowsEstEnabled() { Let's call the ones with the default options testFkPkJoinDetection() - no need to embed the fact that it's
[Impala-ASF-CR] Allow data cache to be enabled optionally when running tests
Hello Lars Volker, Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13330 to look at the new patch set (#4). Change subject: Allow data cache to be enabled optionally when running tests .. Allow data cache to be enabled optionally when running tests This change adds two environment variables ${DATA_CACHE_DIR} and ${DATA_CACHE_SIZE} which specify the root directory path of the data cache and the data cache size respectively. If both are set, Impala cluster will be started with data cache enabled when running E2E tests. When running with HDFS, it will also set the config option --always_use_data_cache to true to force use of the data cache all the time. Change-Id: I09117ab289c2355408212a5fc6493ab751f4853b --- M bin/run-all-tests.sh 1 file changed, 15 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/13330/4 -- To view, visit http://gerrit.cloudera.org:8080/13330 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I09117ab289c2355408212a5fc6493ab751f4853b Gerrit-Change-Number: 13330 Gerrit-PatchSet: 4 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho
[Impala-ASF-CR] Allow data cache to be enabled optionally when running tests
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/13330 ) Change subject: Allow data cache to be enabled optionally when running tests .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/13330/2/bin/run-all-tests.sh File bin/run-all-tests.sh: http://gerrit.cloudera.org:8080/#/c/13330/2/bin/run-all-tests.sh@64 PS2, Line 64: > Do we know if the choice of filesystem matters for the cache? For example, It may matter if the filesystem doesn't support hole punching. I leave it as empty in the new patch. http://gerrit.cloudera.org:8080/#/c/13330/2/bin/run-all-tests.sh@78 PS2, Line 78: _START_CLUSTER_ARGS="${TEST_S > I think the data caching is orthogonal to the filesystem, and we might want Done -- To view, visit http://gerrit.cloudera.org:8080/13330 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I09117ab289c2355408212a5fc6493ab751f4853b Gerrit-Change-Number: 13330 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Thu, 16 May 2019 18:11:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Allow data cache to be enabled optionally when running tests
Hello Lars Volker, Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13330 to look at the new patch set (#3). Change subject: Allow data cache to be enabled optionally when running tests .. Allow data cache to be enabled optionally when running tests This change adds two environment variables ${DATA_CACHE_DIR} and ${DATA_CACHE_SIZE} which specify the root directory path of the data cache and the data cache size respectively. If both are set, Impala cluster will be started with data cache enabled when running E2E tests. When running with HDFS, it will also set the config option --always_use_data_cache to true to force use of the data cache all the time. Change-Id: I09117ab289c2355408212a5fc6493ab751f4853b --- M bin/run-all-tests.sh 1 file changed, 15 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/13330/3 -- To view, visit http://gerrit.cloudera.org:8080/13330 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I09117ab289c2355408212a5fc6493ab751f4853b Gerrit-Change-Number: 13330 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker
[Impala-ASF-CR] IMPALA-6042: Allow Impala shell to use a global impalarc config
Ethan Xue has posted comments on this change. ( http://gerrit.cloudera.org:8080/13313 ) Change subject: IMPALA-6042: Allow Impala shell to use a global impalarc config .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/13313/1/shell/impala_shell.py File shell/impala_shell.py: http://gerrit.cloudera.org:8080/#/c/13313/1/shell/impala_shell.py@1628 PS1, Line 1628: test_global_config = os.path.expanduser(options.global_config) : if os.path.isfile(test_global_config): : # Always output the source of the global config if verbose : if options.verbose: : print_to_stderr( : "Loading in options from global config file: %s \n" % test_global_config) : if test_global_config != default_global_config: : default_global_config = test_global_config : elif test_global_config != default_global_config: : print_to_stderr('%s not found.\n' % test_global_config) : sys.exit(1) : : # use path to custom file specified by user in config_file option : input_config = os.path.expanduser(options.config_file) : # verify input_config, if found : if input_config != default_user_config: : if os.path.isfile(input_config): : if options.verbose: : print_to_stderr("Loading in options from config file: %s \n" % input_config) : # command > One confusion I had earlier was particularly this line: input_config != use Ah I see what you're getting at, and I agree. Thanks! http://gerrit.cloudera.org:8080/#/c/13313/1/shell/impala_shell.py@1656 PS1, Line 1656: > nit: formatting is a bit odd. In Python, we usually indent parser.option_li Done -- To view, visit http://gerrit.cloudera.org:8080/13313 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3a3179b6d9c9e3b2b01d6d3c5847cadb68782816 Gerrit-Change-Number: 13313 Gerrit-PatchSet: 4 Gerrit-Owner: Ethan Xue Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Ethan Xue Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 16 May 2019 17:56:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6042: Allow Impala shell to use a global impalarc config
Hello Fredy Wijaya, Todd Lipcon, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13313 to look at the new patch set (#4). Change subject: IMPALA-6042: Allow Impala shell to use a global impalarc config .. IMPALA-6042: Allow Impala shell to use a global impalarc config Currently, impalarc files can be specified on a per-user basis (stored in ~/.impalarc), and they aren't created by default. The Impala shell should pick up /etc/impalarc as well, in addition to the user-specific configurations. The intent here is to allow a "global" configuration of the shell by a system administrator. Change-Id: I3a3179b6d9c9e3b2b01d6d3c5847cadb68782816 --- M shell/impala_shell.py M shell/impala_shell_config_defaults.py M shell/option_parser.py A tests/shell/good_impalarc2 M tests/shell/test_shell_commandline.py 5 files changed, 78 insertions(+), 21 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/13313/4 -- To view, visit http://gerrit.cloudera.org:8080/13313 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3a3179b6d9c9e3b2b01d6d3c5847cadb68782816 Gerrit-Change-Number: 13313 Gerrit-PatchSet: 4 Gerrit-Owner: Ethan Xue Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Ethan Xue Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-8116: [DOCS] A new doc for Impala Scaling Limits
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13277 ) Change subject: IMPALA-8116: [DOCS] A new doc for Impala Scaling Limits .. Patch Set 3: (6 comments) http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml File docs/topics/impala_scaling_limits.xml: http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml@77 PS3, Line 77: Number of Impalad Coordinators: 1 coordinator for every 50 executors for at most every 50 executors (could be misinterpreted as-is) http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml@148 PS3, Line 148: Number of views Also total, per database? http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml@152 PS3, Line 152: Number of user-defined functions Also total, per database? http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml@167 PS3, Line 167: Number of block per file HDFS blocks? http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml@171 PS3, Line 171: Footer size Footer size isn't so much under user control - it's a function of the amount of metadata per file, which is mostly a funciton of the number of columns and row groups http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml@282 PS3, Line 282: fragment fragments -- To view, visit http://gerrit.cloudera.org:8080/13277 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie6df672e5de1fb2d34f6b78524e8f20e85ea34fb Gerrit-Change-Number: 13277 Gerrit-PatchSet: 3 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 16 May 2019 17:45:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8443: Record time spent in authorization in the runtime profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13353 ) Change subject: IMPALA-8443: Record time spent in authorization in the runtime profile .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3249/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13353 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5bb85e57fcc75d41f3eb2911e6d375e0da6f82ae Gerrit-Change-Number: 13353 Gerrit-PatchSet: 2 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 16 May 2019 17:24:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8443: Record time spent in authorization in the runtime profile
Tamas Mate has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13353 Change subject: IMPALA-8443: Record time spent in authorization in the runtime profile .. IMPALA-8443: Record time spent in authorization in the runtime profile The analysis and authorization is handled together as authorization depends on the results of the analysis. I have moved the events inside the analyzeAndAuthorize method and split them, this will help locating which part of the analysis was slow. Change-Id: I5bb85e57fcc75d41f3eb2911e6d375e0da6f82ae --- M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/service/Frontend.java 2 files changed, 2 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/13353/2 -- To view, visit http://gerrit.cloudera.org:8080/13353 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I5bb85e57fcc75d41f3eb2911e6d375e0da6f82ae Gerrit-Change-Number: 13353 Gerrit-PatchSet: 2 Gerrit-Owner: Tamas Mate
[Impala-ASF-CR] IMPALA-8438: Store WriteId and ValidWriteId list for table and partition
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13215 ) Change subject: IMPALA-8438: Store WriteId and ValidWriteId list for table and partition .. Patch Set 18: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4271/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/13215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54 Gerrit-Change-Number: 13215 Gerrit-PatchSet: 18 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 16 May 2019 15:51:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8438: Store WriteId and ValidWriteId list for table and partition
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13215 ) Change subject: IMPALA-8438: Store WriteId and ValidWriteId list for table and partition .. Patch Set 18: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/13215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54 Gerrit-Change-Number: 13215 Gerrit-PatchSet: 18 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 16 May 2019 15:51:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8369 : Skip test owner privileges test when running against Hive-3
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/13339 ) Change subject: IMPALA-8369 : Skip test_owner_privileges test when running against Hive-3 .. IMPALA-8369 : Skip test_owner_privileges test when running against Hive-3 Currently, when running with USE_CDP_HIVE=true, Sentry service's sync with HMS is very slow. This is most likely due to the fact that in HMS-3 the notification events are generated using the JSONMessageFactory provided by Metastore, unlike in case of HMS-2 setup. When running against HMS-2, Sentry provides its own MessageFactory implementation which has its limitations and cannot be used in HMS-3. In order to fix this Sentry should add support for the out-of-box message factory available in Hive-3 (See SENTRY-2518). Due to these additional delays from Sentry test_owner_privileges fails due to race conditions between the cached information in catalog and Sentry server (See IMPALA-8550). This patch disables this test when running against HMS-3 until we fix the issues both on the Sentry and Impala side. Testing done: 1. Confirmed the test is skipped when using USE_CDP_HIVE=true 2. Confirmed the test is not skipped when using USE_CDP_HIVE=false Change-Id: I9f904446f50b5095443bf27b3092a2e3665b76d3 Reviewed-on: http://gerrit.cloudera.org:8080/13339 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M tests/authorization/test_owner_privileges.py M tests/common/skip.py 2 files changed, 8 insertions(+), 1 deletion(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/13339 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I9f904446f50b5095443bf27b3092a2e3665b76d3 Gerrit-Change-Number: 13339 Gerrit-PatchSet: 5 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-8369 : Skip test owner privileges test when running against Hive-3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13339 ) Change subject: IMPALA-8369 : Skip test_owner_privileges test when running against Hive-3 .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/13339 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9f904446f50b5095443bf27b3092a2e3665b76d3 Gerrit-Change-Number: 13339 Gerrit-PatchSet: 4 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 16 May 2019 15:34:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8476: Replace Sentry admin check workaround with proper Sentry API
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/13346 ) Change subject: IMPALA-8476: Replace Sentry admin check workaround with proper Sentry API .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/13346/4/fe/src/main/java/org/apache/impala/service/JniCatalog.java File fe/src/main/java/org/apache/impala/service/JniCatalog.java: http://gerrit.cloudera.org:8080/#/c/13346/4/fe/src/main/java/org/apache/impala/service/JniCatalog.java@309 PS4, Line 309: // When there is no user having user name as user.getName() in the : // underlying OS, the method isAdmin() in SentryPolicyServiceClient.class : // would throw a SentryUserException. In this case, we also consider this : // requesting user a non Sentry administrator. reword: When a user is not defined in Sentry, isAdmin() will throw SentryUserException, we will consider this requesting user as a non-Sentry administrator. -- To view, visit http://gerrit.cloudera.org:8080/13346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a27140d401494bc372ad0da96ada57bda94cd82 Gerrit-Change-Number: 13346 Gerrit-PatchSet: 4 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 16 May 2019 14:07:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8541: Remove existing roles in order to not interfere with a test.
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/13349 ) Change subject: IMPALA-8541: Remove existing roles in order to not interfere with a test. .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/13349/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13349/1//COMMIT_MSG@7 PS1, Line 7: Remove existing roles in order to not interfere with a test. reword: Remove existing roles prior to executing the tests in AuthorizationTest http://gerrit.cloudera.org:8080/#/c/13349/1//COMMIT_MSG@8 PS1, Line 8: : The test TestShortUsernameUsed() in AuthorizationTest.java would fail if the : current user is associated with some role granted the privilege to access some : databases, e.g., the database "functional". : : Specifically, for the SQL statement "select * from alltypes", if the role is : already granted the privilege to access the database "functional", only : AnalysisException will be thrown. To address this issue, we remove all the : lingering roles before running this test. The issue isn't specific to TestShortUsernameUsed(). We can explain the issue in a more generic fashion. http://gerrit.cloudera.org:8080/#/c/13349/1/fe/src/test/java/org/apache/impala/authorization/AuthorizationTest.java File fe/src/test/java/org/apache/impala/authorization/AuthorizationTest.java: http://gerrit.cloudera.org:8080/#/c/13349/1/fe/src/test/java/org/apache/impala/authorization/AuthorizationTest.java@438 PS1, Line 438: // Remove existing roles in order to not interfere with this test. : for (TSentryRole role : sentryService_.listAllRoles(USER)) { : AUTHZ_CATALOG.removeRole(role.getRoleName()); : } We should do that in the @BeforeClass. Also, we don't actually need to remove the roles in Sentry. We can just remove the roles from the catalog, e.g. AUTHZ_CATALOG.getAuthPolicy().getAllRoles().forEach( role -> AUTHZ_CATALOG.getAuthPolicy().removeRole(role.getName())); -- To view, visit http://gerrit.cloudera.org:8080/13349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7122ee0cc90eaf7d45c5637c28c9cdb59502967b Gerrit-Change-Number: 13349 Gerrit-PatchSet: 1 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 16 May 2019 14:00:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8369 : Skip test owner privileges test when running against Hive-3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13339 ) Change subject: IMPALA-8369 : Skip test_owner_privileges test when running against Hive-3 .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/13339 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9f904446f50b5095443bf27b3092a2e3665b76d3 Gerrit-Change-Number: 13339 Gerrit-PatchSet: 4 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 16 May 2019 10:05:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8369 : Skip test owner privileges test when running against Hive-3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13339 ) Change subject: IMPALA-8369 : Skip test_owner_privileges test when running against Hive-3 .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4270/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/13339 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9f904446f50b5095443bf27b3092a2e3665b76d3 Gerrit-Change-Number: 13339 Gerrit-PatchSet: 4 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 16 May 2019 10:05:32 + Gerrit-HasComments: No