[Impala-ASF-CR] IMPALA-7734: Catalog and Statestore memz page shows useless memory
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13670 ) Change subject: IMPALA-7734: Catalog and Statestore memz page shows useless memory .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4507/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/13670 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1a42434b98aec1d9e0be6ae52325049378fdde23 Gerrit-Change-Number: 13670 Gerrit-PatchSet: 5 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tamas Mate Gerrit-Comment-Date: Thu, 20 Jun 2019 06:56:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8665 Include extra info in error message when date cast fails
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/13680 ) Change subject: IMPALA-8665 Include extra info in error message when date cast fails .. Patch Set 2: (4 comments) Thanks for taking care of this! http://gerrit.cloudera.org:8080/#/c/13680/2/be/src/exprs/cast-functions-ir.cc File be/src/exprs/cast-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/13680/2/be/src/exprs/cast-functions-ir.cc@310 PS2, Line 310: DateVal CastFunctions::CastToDateVal(FunctionContext* ctx, const StringVal& val) { Please cover this change with tests. If you grep for "String to Date parse failed." string in the tests directory you find the ones needed an update. E.g. testdata/workloads/functional-query/queries/QueryTest/date.test has some of these. http://gerrit.cloudera.org:8080/#/c/13680/2/be/src/exprs/cast-functions-ir.cc@314 PS2, Line 314: (char *) The preferred way of converting here is using reinterpret_cast as seen above. You don't have to do it twice if you move the result above to a variable. http://gerrit.cloudera.org:8080/#/c/13680/2/be/src/exprs/cast-functions-ir.cc@316 PS2, Line 316: invalidVal Have you tried to simply pass the char* here? Is that a null terminated string? http://gerrit.cloudera.org:8080/#/c/13680/2/be/src/exprs/cast-functions-ir.cc@316 PS2, Line 316: c_str() Do you need to convert the return value of Substitute() to char*? -- To view, visit http://gerrit.cloudera.org:8080/13680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If800b7696515cd61afee27220c55ff2440a86f04 Gerrit-Change-Number: 13680 Gerrit-PatchSet: 2 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 20 Jun 2019 06:48:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8630: Hash the full path when calculating consistent remote placement
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13545 ) Change subject: IMPALA-8630: Hash the full path when calculating consistent remote placement .. Patch Set 9: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/13545 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46c739fc31af539af2b3509e2a161f4e29f44d7b Gerrit-Change-Number: 13545 Gerrit-PatchSet: 9 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 20 Jun 2019 06:41:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8553,IMPALA-8552: fix checks for remote cluster
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13386 ) Change subject: IMPALA-8553,IMPALA-8552: fix checks for remote cluster .. Patch Set 12: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3691/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13386 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifa6b2a1391f53121d3d7c00c5cf0a57590899ce4 Gerrit-Change-Number: 13386 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 20 Jun 2019 06:11:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6758: Add metric for current catalog version to catalog
Vincent Tran has abandoned this change. ( http://gerrit.cloudera.org:8080/11383 ) Change subject: IMPALA-6758: Add metric for current catalog version to catalog .. Abandoned Abandoned for now until I have more time to reevaluate. -- To view, visit http://gerrit.cloudera.org:8080/11383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: I82e51e8cd2f6a77c588914242684b885816aa0a0 Gerrit-Change-Number: 11383 Gerrit-PatchSet: 2 Gerrit-Owner: Vincent Tran Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-8553,IMPALA-8552: fix checks for remote cluster
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13386 ) Change subject: IMPALA-8553,IMPALA-8552: fix checks for remote cluster .. Patch Set 12: Fix a new test case that was added in the meantime -- To view, visit http://gerrit.cloudera.org:8080/13386 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifa6b2a1391f53121d3d7c00c5cf0a57590899ce4 Gerrit-Change-Number: 13386 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 20 Jun 2019 05:32:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8553,IMPALA-8552: fix checks for remote cluster
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13386 ) Change subject: IMPALA-8553,IMPALA-8552: fix checks for remote cluster .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/13386/12/tests/common/environ.py File tests/common/environ.py: http://gerrit.cloudera.org:8080/#/c/13386/12/tests/common/environ.py@328 PS12, Line 328: i flake8: F821 undefined name 'impala_url' -- To view, visit http://gerrit.cloudera.org:8080/13386 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifa6b2a1391f53121d3d7c00c5cf0a57590899ce4 Gerrit-Change-Number: 13386 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 20 Jun 2019 05:32:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8553,IMPALA-8552: fix checks for remote cluster
Hello David Knupp, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13386 to look at the new patch set (#12). Change subject: IMPALA-8553,IMPALA-8552: fix checks for remote cluster .. IMPALA-8553,IMPALA-8552: fix checks for remote cluster Apparently IMPALA_REMOTE_URL is not generally used for remote cluster tests: only --testing_remote_cluster is reliably set. Fix the is_remote_cluster() implementation to take into account REMOTE_DATA_LOAD and --testing_remote_cluster in addition to IMPALA_REMOTE_URL. Consistently use is_remote_cluster() in other tests instead of checking the pytest flag directly. There were a few lifecycle headaches with how ImpalaTestClusterProperties is used: * common.environ is imported from conftest, which means that the top-level code in the file runs *before* pytest command-line arguments have been registered and parsed. * ImpalaTestClusterProperties is used by various code, like build_flavor_timeout(), which runs before pytest command-line arguments have been parsed. * ImpalaTestClusterProperties is called from non-pytest scripts like start-impala-cluster.py, so the command-line arguments are not available. I dealt with the above challenges by making a few changes to do the detection later: * Lazily initializing a singleton ImpalaTestClusterProperties. This was not strictly necessary but makes the whole problem less sensitive to import order and module dependencies. * Adding cluster_properties fixture to make ImpalaTestClusterProperties available in tests without additional boilerplate. * Removing the caching of the local/remote build calculation. ImpalaTestClusterProperties is instantiated outside of python tests, but is_remote_cluster() is only called from python tests, so if we check flags in is_remote_cluster() we'll get the right results reliably. As a workaround to unblock remote tests, also assume catalog_v1 if accessing the web UI fails. Testing: Ran core tests against a regular minicluster. Ran tests against a remote cluster Change-Id: Ifa6b2a1391f53121d3d7c00c5cf0a57590899ce4 --- M tests/common/environ.py M tests/common/skip.py M tests/conftest.py M tests/custom_cluster/test_admission_controller.py M tests/custom_cluster/test_automatic_invalidation.py M tests/custom_cluster/test_hive_parquet_codec_interop.py M tests/hs2/test_hs2.py M tests/metadata/test_compute_stats.py M tests/metadata/test_hms_integration.py M tests/metadata/test_metadata_query_statements.py M tests/metadata/test_refresh_partition.py M tests/query_test/test_kudu.py M tests/query_test/test_mt_dop.py M tests/query_test/test_spilling.py M tests/shell/test_shell_commandline.py M tests/shell/util.py M tests/webserver/test_web_pages.py 17 files changed, 144 insertions(+), 106 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/13386/12 -- To view, visit http://gerrit.cloudera.org:8080/13386 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifa6b2a1391f53121d3d7c00c5cf0a57590899ce4 Gerrit-Change-Number: 13386 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-8553,IMPALA-8552: fix checks for remote cluster
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13386 ) Change subject: IMPALA-8553,IMPALA-8552: fix checks for remote cluster .. Patch Set 12: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/13386 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifa6b2a1391f53121d3d7c00c5cf0a57590899ce4 Gerrit-Change-Number: 13386 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 20 Jun 2019 05:31:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8553,IMPALA-8552: fix checks for remote cluster
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13386 ) Change subject: IMPALA-8553,IMPALA-8552: fix checks for remote cluster .. Patch Set 12: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4505/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/13386 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifa6b2a1391f53121d3d7c00c5cf0a57590899ce4 Gerrit-Change-Number: 13386 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 20 Jun 2019 05:31:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7290: part 2: Add HS2 support to Impala shell
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12884 ) Change subject: IMPALA-7290: part 2: Add HS2 support to Impala shell .. Patch Set 26: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4504/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/12884 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6d5cc83d545aacc659523f29b1d6feed672e2a12 Gerrit-Change-Number: 12884 Gerrit-PatchSet: 26 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 20 Jun 2019 05:09:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7290: part 2: Add HS2 support to Impala shell
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12884 ) Change subject: IMPALA-7290: part 2: Add HS2 support to Impala shell .. Patch Set 26: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/12884 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6d5cc83d545aacc659523f29b1d6feed672e2a12 Gerrit-Change-Number: 12884 Gerrit-PatchSet: 26 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 20 Jun 2019 05:09:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7290: part 2: Add HS2 support to Impala shell
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12884 ) Change subject: IMPALA-7290: part 2: Add HS2 support to Impala shell .. Patch Set 25: Code-Review+2 carry +2 -- To view, visit http://gerrit.cloudera.org:8080/12884 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6d5cc83d545aacc659523f29b1d6feed672e2a12 Gerrit-Change-Number: 12884 Gerrit-PatchSet: 25 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 20 Jun 2019 05:09:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8553,IMPALA-8552: fix checks for remote cluster
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13386 ) Change subject: IMPALA-8553,IMPALA-8552: fix checks for remote cluster .. Patch Set 11: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/4501/ -- To view, visit http://gerrit.cloudera.org:8080/13386 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifa6b2a1391f53121d3d7c00c5cf0a57590899ce4 Gerrit-Change-Number: 13386 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 20 Jun 2019 05:08:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8553,IMPALA-8552: fix checks for remote cluster
David Knupp has posted comments on this change. ( http://gerrit.cloudera.org:8080/13386 ) Change subject: IMPALA-8553,IMPALA-8552: fix checks for remote cluster .. Patch Set 11: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/13386/8/tests/common/environ.py File tests/common/environ.py: http://gerrit.cloudera.org:8080/#/c/13386/8/tests/common/environ.py@223 PS8, Line 223: def get_instance(cls): > I think I will skip this one. Ack -- To view, visit http://gerrit.cloudera.org:8080/13386 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifa6b2a1391f53121d3d7c00c5cf0a57590899ce4 Gerrit-Change-Number: 13386 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 20 Jun 2019 01:56:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8630: Hash the full path when calculating consistent remote placement
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13545 ) Change subject: IMPALA-8630: Hash the full path when calculating consistent remote placement .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3689/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13545 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46c739fc31af539af2b3509e2a161f4e29f44d7b Gerrit-Change-Number: 13545 Gerrit-PatchSet: 8 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 20 Jun 2019 01:36:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8630: Hash the full path when calculating consistent remote placement
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13545 ) Change subject: IMPALA-8630: Hash the full path when calculating consistent remote placement .. Patch Set 9: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3690/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13545 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46c739fc31af539af2b3509e2a161f4e29f44d7b Gerrit-Change-Number: 13545 Gerrit-PatchSet: 9 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 20 Jun 2019 01:31:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8630: Hash the full path when calculating consistent remote placement
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/13545 ) Change subject: IMPALA-8630: Hash the full path when calculating consistent remote placement .. Patch Set 9: Code-Review+1 (4 comments) Thanks for the changes. I had few more nits but otherwise the change looks good to me. http://gerrit.cloudera.org:8080/#/c/13545/9/be/src/scheduling/scheduler-test-util.cc File be/src/scheduling/scheduler-test-util.cc: http://gerrit.cloudera.org:8080/#/c/13545/9/be/src/scheduling/scheduler-test-util.cc@47 PS9, Line 47: nit: double space http://gerrit.cloudera.org:8080/#/c/13545/9/be/src/scheduling/scheduler-test.cc File be/src/scheduling/scheduler-test.cc: http://gerrit.cloudera.org:8080/#/c/13545/9/be/src/scheduling/scheduler-test.cc@18 PS9, Line 18: #include Is this one needed? http://gerrit.cloudera.org:8080/#/c/13545/9/be/src/scheduling/scheduler-test.cc@228 PS9, Line 228: cluster.AddHosts(num_data_nodes, false, true); nit: use AddRemoteCluster()? http://gerrit.cloudera.org:8080/#/c/13545/9/be/src/scheduling/scheduler-test.cc@376 PS9, Line 376: cluster.AddHosts(num_impala_nodes, true, false); nit: AddRemoteCluster()? -- To view, visit http://gerrit.cloudera.org:8080/13545 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46c739fc31af539af2b3509e2a161f4e29f44d7b Gerrit-Change-Number: 13545 Gerrit-PatchSet: 9 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 20 Jun 2019 01:21:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8630: Hash the full path when calculating consistent remote placement
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13545 ) Change subject: IMPALA-8630: Hash the full path when calculating consistent remote placement .. Patch Set 9: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4503/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/13545 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46c739fc31af539af2b3509e2a161f4e29f44d7b Gerrit-Change-Number: 13545 Gerrit-PatchSet: 9 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 20 Jun 2019 01:15:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8630: Hash the full path when calculating consistent remote placement
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/13545 ) Change subject: IMPALA-8630: Hash the full path when calculating consistent remote placement .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/13545/7/be/src/scheduling/scheduler-test.cc File be/src/scheduling/scheduler-test.cc: http://gerrit.cloudera.org:8080/#/c/13545/7/be/src/scheduling/scheduler-test.cc@264 PS7, Line 264: void InitRemoteCluster(Cluster *cluster) { > Thx. I think I'd prefer to spell out the size of the cluster by calling Cre Removed the CreateStandardRemoteCluster and changed tests to call CreateRemoteCluster directly. -- To view, visit http://gerrit.cloudera.org:8080/13545 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46c739fc31af539af2b3509e2a161f4e29f44d7b Gerrit-Change-Number: 13545 Gerrit-PatchSet: 7 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 20 Jun 2019 01:14:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8630: Hash the full path when calculating consistent remote placement
Hello Lars Volker, Tim Armstrong, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13545 to look at the new patch set (#9). Change subject: IMPALA-8630: Hash the full path when calculating consistent remote placement .. IMPALA-8630: Hash the full path when calculating consistent remote placement Consistent remote placement currently uses the relative filename within a partition for the consistent hash. If the relative filenames for different partitions have a simple naming scheme, then multiple partitions may have files of the same name. This is true for some tables written by Hive (e.g. in our minicluster the tpcds.store_sales has this problem). This can lead to unbalanced placement of remote ranges. This adds a partition_path_hash to the THdfsFileSplit and THdfsFileSplitGeneratorSpec, calculated in the frontend (which has all of the partition information). The scheduler hashes this in addition to the relative path. Testing: - Added several new scheduler tests that verify the consistent remote scheduling sees blocks with different relative paths, partition paths, or offsets as distinct. - Ran core tests Change-Id: I46c739fc31af539af2b3509e2a161f4e29f44d7b --- M be/src/scheduling/scheduler-test-util.cc M be/src/scheduling/scheduler-test-util.h M be/src/scheduling/scheduler-test.cc M be/src/scheduling/scheduler.cc M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/test/java/org/apache/impala/planner/ExplainTest.java 7 files changed, 329 insertions(+), 45 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/13545/9 -- To view, visit http://gerrit.cloudera.org:8080/13545 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I46c739fc31af539af2b3509e2a161f4e29f44d7b Gerrit-Change-Number: 13545 Gerrit-PatchSet: 9 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-8630: Hash the full path when calculating consistent remote placement
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/13545 ) Change subject: IMPALA-8630: Hash the full path when calculating consistent remote placement .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/13545/7/be/src/scheduling/scheduler-test.cc File be/src/scheduling/scheduler-test.cc: http://gerrit.cloudera.org:8080/#/c/13545/7/be/src/scheduling/scheduler-test.cc@264 PS7, Line 264: /// Helper function to verify that two things are treated as distinct for consistent > Moved this to a static function on Cluster. I added a CreateStandardRemoteC Thx. I think I'd prefer to spell out the size of the cluster by calling CreateRemoteCluster(50, 30), similar to how we have several calls to cluster.AddHosts(3, true, true) through out this file. This makes it easier for the reader to see the cluster size without having to check another file. -- To view, visit http://gerrit.cloudera.org:8080/13545 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46c739fc31af539af2b3509e2a161f4e29f44d7b Gerrit-Change-Number: 13545 Gerrit-PatchSet: 8 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 20 Jun 2019 01:02:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8630: Hash the full path when calculating consistent remote placement
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/13545 ) Change subject: IMPALA-8630: Hash the full path when calculating consistent remote placement .. Patch Set 7: (9 comments) http://gerrit.cloudera.org:8080/#/c/13545/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13545/7//COMMIT_MSG@24 PS7, Line 24: see > nit: sees? Done http://gerrit.cloudera.org:8080/#/c/13545/7/be/src/scheduling/scheduler-test-util.h File be/src/scheduling/scheduler-test-util.h: http://gerrit.cloudera.org:8080/#/c/13545/7/be/src/scheduling/scheduler-test-util.h@95 PS7, Line 95: PARTITIONED_SIMPLE_NAMES > PARTITIONED_SINGLE_NAME? PARTITIONED_CONSTANT_NAMES? Changed these to PARTITIONED_SINGLE_FILENAME and PARTITIONED_UNIQUE_FILENAMES. http://gerrit.cloudera.org:8080/#/c/13545/7/be/src/scheduling/scheduler-test-util.cc File be/src/scheduling/scheduler-test-util.cc: http://gerrit.cloudera.org:8080/#/c/13545/7/be/src/scheduling/scheduler-test-util.cc@295 PS7, Line 295: // Encoding the table name and index in the file helps debugging. > Maybe include one or two examples of a path in the comment in each case? Th Added examples for each section http://gerrit.cloudera.org:8080/#/c/13545/7/be/src/scheduling/scheduler-test-util.cc@353 PS7, Line 353: GetBlockPaths(table_name, true, spec_idx, naming_policy, &relative_path, > One of the parameters could be returned by value, e.g. the partition_id I think I prefer returning multiple things all via args rather than having one via return value, so I'm going to leave this. http://gerrit.cloudera.org:8080/#/c/13545/7/be/src/scheduling/scheduler-test.cc File be/src/scheduling/scheduler-test.cc: http://gerrit.cloudera.org:8080/#/c/13545/7/be/src/scheduling/scheduler-test.cc@264 PS7, Line 264: void InitRemoteCluster(Cluster *cluster) { > This could be a member of Cluster, e.g. static Cluster CreateRemoteCluster( Moved this to a static function on Cluster. I added a CreateStandardRemoteCluster for the 50 impala node, 3 data node configuration. http://gerrit.cloudera.org:8080/#/c/13545/7/be/src/scheduling/scheduler-test.cc@305 PS7, Line 305: > Can you add brief comments above each test describing what to do (see other Added comments for the new tests http://gerrit.cloudera.org:8080/#/c/13545/7/be/src/scheduling/scheduler-test.cc@314 PS7, Line 314: Schema schema(cluster); > These loops could benefit from adding a SCOPED_TRACE for the naming policy Added SCOPED_TRACE for the naming policy. http://gerrit.cloudera.org:8080/#/c/13545/7/common/thrift/PlanNodes.thrift File common/thrift/PlanNodes.thrift: http://gerrit.cloudera.org:8080/#/c/13545/7/common/thrift/PlanNodes.thrift@195 PS7, Line 195: partition's path > For S3, this could be a bucket? Does it matter whether it uses the whole pa The partition_id is something that Impala assigns relatively arbitrarily and it is not really stable across time. For example, it is not stable across an "invalidate metadata" for a table. So, it does need to be the path. http://gerrit.cloudera.org:8080/#/c/13545/7/common/thrift/PlanNodes.thrift@196 PS7, Line 196: consistent > "consistent" here mean "given the same input, must return the same hash", r Right -- To view, visit http://gerrit.cloudera.org:8080/13545 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46c739fc31af539af2b3509e2a161f4e29f44d7b Gerrit-Change-Number: 13545 Gerrit-PatchSet: 7 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 20 Jun 2019 00:55:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8630: Hash the full path when calculating consistent remote placement
Hello Lars Volker, Tim Armstrong, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13545 to look at the new patch set (#8). Change subject: IMPALA-8630: Hash the full path when calculating consistent remote placement .. IMPALA-8630: Hash the full path when calculating consistent remote placement Consistent remote placement currently uses the relative filename within a partition for the consistent hash. If the relative filenames for different partitions have a simple naming scheme, then multiple partitions may have files of the same name. This is true for some tables written by Hive (e.g. in our minicluster the tpcds.store_sales has this problem). This can lead to unbalanced placement of remote ranges. This adds a partition_path_hash to the THdfsFileSplit and THdfsFileSplitGeneratorSpec, calculated in the frontend (which has all of the partition information). The scheduler hashes this in addition to the relative path. Testing: - Added several new scheduler tests that verify the consistent remote scheduling sees blocks with different relative paths, partition paths, or offsets as distinct. - Ran core tests Change-Id: I46c739fc31af539af2b3509e2a161f4e29f44d7b --- M be/src/scheduling/scheduler-test-util.cc M be/src/scheduling/scheduler-test-util.h M be/src/scheduling/scheduler-test.cc M be/src/scheduling/scheduler.cc M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/test/java/org/apache/impala/planner/ExplainTest.java 7 files changed, 330 insertions(+), 43 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/13545/8 -- To view, visit http://gerrit.cloudera.org:8080/13545 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I46c739fc31af539af2b3509e2a161f4e29f44d7b Gerrit-Change-Number: 13545 Gerrit-PatchSet: 8 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-8584: Add cookie support to the HTTP HS2 server
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/13672 ) Change subject: IMPALA-8584: Add cookie support to the HTTP HS2 server .. Patch Set 2: (3 comments) Just nits http://gerrit.cloudera.org:8080/#/c/13672/2/be/src/rpc/cookie-mgr.h File be/src/rpc/cookie-mgr.h: http://gerrit.cloudera.org:8080/#/c/13672/2/be/src/rpc/cookie-mgr.h@56 PS2, Line 56: // The time, in Unix millis, that the cookies was created. Used to determine when to nit: the cookie http://gerrit.cloudera.org:8080/#/c/13672/2/be/src/rpc/cookie-mgr.h@69 PS2, Line 69: // Thread that periodically iterates over all cookkies and removes one that are past nit: spelling -> cookies http://gerrit.cloudera.org:8080/#/c/13672/2/be/src/rpc/cookie-mgr.cc File be/src/rpc/cookie-mgr.cc: http://gerrit.cloudera.org:8080/#/c/13672/2/be/src/rpc/cookie-mgr.cc@54 PS2, Line 54: SleepForMs(FLAGS_max_cookie_lifetime_s * 10); Should this be * 100 ? -- To view, visit http://gerrit.cloudera.org:8080/13672 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I647c06f94ef91aa3b6413e91576c4ec506ed57f4 Gerrit-Change-Number: 13672 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 20 Jun 2019 00:36:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8553,IMPALA-8552: fix checks for remote cluster
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13386 ) Change subject: IMPALA-8553,IMPALA-8552: fix checks for remote cluster .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3688/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13386 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifa6b2a1391f53121d3d7c00c5cf0a57590899ce4 Gerrit-Change-Number: 13386 Gerrit-PatchSet: 10 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 20 Jun 2019 00:17:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8553,IMPALA-8552: fix checks for remote cluster
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13386 ) Change subject: IMPALA-8553,IMPALA-8552: fix checks for remote cluster .. Patch Set 9: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3687/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13386 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifa6b2a1391f53121d3d7c00c5cf0a57590899ce4 Gerrit-Change-Number: 13386 Gerrit-PatchSet: 9 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 20 Jun 2019 00:16:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8443: Record time spent in authorization in the runtime profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13353 ) Change subject: IMPALA-8443: Record time spent in authorization in the runtime profile .. Patch Set 10: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/4500/ -- To view, visit http://gerrit.cloudera.org:8080/13353 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5bb85e57fcc75d41f3eb2911e6d375e0da6f82ae Gerrit-Change-Number: 13353 Gerrit-PatchSet: 10 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Comment-Date: Wed, 19 Jun 2019 23:59:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8630: Hash the full path when calculating consistent remote placement
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13545 ) Change subject: IMPALA-8630: Hash the full path when calculating consistent remote placement .. Patch Set 7: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/13545 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46c739fc31af539af2b3509e2a161f4e29f44d7b Gerrit-Change-Number: 13545 Gerrit-PatchSet: 7 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 19 Jun 2019 23:42:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8553,IMPALA-8552: fix checks for remote cluster
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13386 ) Change subject: IMPALA-8553,IMPALA-8552: fix checks for remote cluster .. Patch Set 11: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4501/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/13386 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifa6b2a1391f53121d3d7c00c5cf0a57590899ce4 Gerrit-Change-Number: 13386 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 19 Jun 2019 23:38:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8553,IMPALA-8552: fix checks for remote cluster
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13386 ) Change subject: IMPALA-8553,IMPALA-8552: fix checks for remote cluster .. Patch Set 11: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/13386 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifa6b2a1391f53121d3d7c00c5cf0a57590899ce4 Gerrit-Change-Number: 13386 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 19 Jun 2019 23:38:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8553,IMPALA-8552: fix checks for remote cluster
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13386 ) Change subject: IMPALA-8553,IMPALA-8552: fix checks for remote cluster .. Patch Set 10: (1 comment) http://gerrit.cloudera.org:8080/#/c/13386/10/tests/common/environ.py File tests/common/environ.py: http://gerrit.cloudera.org:8080/#/c/13386/10/tests/common/environ.py@328 PS10, Line 328: i flake8: F821 undefined name 'impala_url' -- To view, visit http://gerrit.cloudera.org:8080/13386 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifa6b2a1391f53121d3d7c00c5cf0a57590899ce4 Gerrit-Change-Number: 13386 Gerrit-PatchSet: 10 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 19 Jun 2019 23:37:53 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8553,IMPALA-8552: fix checks for remote cluster
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13386 ) Change subject: IMPALA-8553,IMPALA-8552: fix checks for remote cluster .. Patch Set 10: Code-Review+2 carry +2 -- To view, visit http://gerrit.cloudera.org:8080/13386 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifa6b2a1391f53121d3d7c00c5cf0a57590899ce4 Gerrit-Change-Number: 13386 Gerrit-PatchSet: 10 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 19 Jun 2019 23:38:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8553,IMPALA-8552: fix checks for remote cluster
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13386 ) Change subject: IMPALA-8553,IMPALA-8552: fix checks for remote cluster .. Patch Set 8: (3 comments) http://gerrit.cloudera.org:8080/#/c/13386/8/tests/common/environ.py File tests/common/environ.py: http://gerrit.cloudera.org:8080/#/c/13386/8/tests/common/environ.py@223 PS8, Line 223: def get_instance(cls): > This is a non-blocking comment, but but there's a bit of python pedantry/ha I think I will skip this one. http://gerrit.cloudera.org:8080/#/c/13386/8/tests/common/environ.py@321 PS8, Line 321: def _get_flags_from_web_ui(self): > Also non-blocking, but a style nit here: seems like this would be a good pl Done http://gerrit.cloudera.org:8080/#/c/13386/8/tests/hs2/test_hs2.py File tests/hs2/test_hs2.py: http://gerrit.cloudera.org:8080/#/c/13386/8/tests/hs2/test_hs2.py@42 PS8, Line 42: handless > handles? Done -- To view, visit http://gerrit.cloudera.org:8080/13386 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifa6b2a1391f53121d3d7c00c5cf0a57590899ce4 Gerrit-Change-Number: 13386 Gerrit-PatchSet: 8 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 19 Jun 2019 23:37:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8553,IMPALA-8552: fix checks for remote cluster
Hello David Knupp, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13386 to look at the new patch set (#9). Change subject: IMPALA-8553,IMPALA-8552: fix checks for remote cluster .. IMPALA-8553,IMPALA-8552: fix checks for remote cluster Apparently IMPALA_REMOTE_URL is not generally used for remote cluster tests: only --testing_remote_cluster is reliably set. Fix the is_remote_cluster() implementation to take into account REMOTE_DATA_LOAD and --testing_remote_cluster in addition to IMPALA_REMOTE_URL. Consistently use is_remote_cluster() in other tests instead of checking the pytest flag directly. There were a few lifecycle headaches with how ImpalaTestClusterProperties is used: * common.environ is imported from conftest, which means that the top-level code in the file runs *before* pytest command-line arguments have been registered and parsed. * ImpalaTestClusterProperties is used by various code, like build_flavor_timeout(), which runs before pytest command-line arguments have been parsed. * ImpalaTestClusterProperties is called from non-pytest scripts like start-impala-cluster.py, so the command-line arguments are not available. I dealt with the above challenges by making a few changes to do the detection later: * Lazily initializing a singleton ImpalaTestClusterProperties. This was not strictly necessary but makes the whole problem less sensitive to import order and module dependencies. * Adding cluster_properties fixture to make ImpalaTestClusterProperties available in tests without additional boilerplate. * Removing the caching of the local/remote build calculation. ImpalaTestClusterProperties is instantiated outside of python tests, but is_remote_cluster() is only called from python tests, so if we check flags in is_remote_cluster() we'll get the right results reliably. As a workaround to unblock remote tests, also assume catalog_v1 if accessing the web UI fails. Testing: Ran core tests against a regular minicluster. Ran tests against a remote cluster Change-Id: Ifa6b2a1391f53121d3d7c00c5cf0a57590899ce4 --- M tests/common/environ.py M tests/common/skip.py M tests/conftest.py M tests/custom_cluster/test_admission_controller.py M tests/custom_cluster/test_automatic_invalidation.py M tests/hs2/test_hs2.py M tests/metadata/test_compute_stats.py M tests/metadata/test_hms_integration.py M tests/metadata/test_metadata_query_statements.py M tests/metadata/test_refresh_partition.py M tests/query_test/test_kudu.py M tests/query_test/test_mt_dop.py M tests/query_test/test_spilling.py M tests/shell/test_shell_commandline.py M tests/shell/util.py M tests/webserver/test_web_pages.py 16 files changed, 141 insertions(+), 102 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/13386/9 -- To view, visit http://gerrit.cloudera.org:8080/13386 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifa6b2a1391f53121d3d7c00c5cf0a57590899ce4 Gerrit-Change-Number: 13386 Gerrit-PatchSet: 9 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-8553,IMPALA-8552: fix checks for remote cluster
Hello David Knupp, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13386 to look at the new patch set (#10). Change subject: IMPALA-8553,IMPALA-8552: fix checks for remote cluster .. IMPALA-8553,IMPALA-8552: fix checks for remote cluster Apparently IMPALA_REMOTE_URL is not generally used for remote cluster tests: only --testing_remote_cluster is reliably set. Fix the is_remote_cluster() implementation to take into account REMOTE_DATA_LOAD and --testing_remote_cluster in addition to IMPALA_REMOTE_URL. Consistently use is_remote_cluster() in other tests instead of checking the pytest flag directly. There were a few lifecycle headaches with how ImpalaTestClusterProperties is used: * common.environ is imported from conftest, which means that the top-level code in the file runs *before* pytest command-line arguments have been registered and parsed. * ImpalaTestClusterProperties is used by various code, like build_flavor_timeout(), which runs before pytest command-line arguments have been parsed. * ImpalaTestClusterProperties is called from non-pytest scripts like start-impala-cluster.py, so the command-line arguments are not available. I dealt with the above challenges by making a few changes to do the detection later: * Lazily initializing a singleton ImpalaTestClusterProperties. This was not strictly necessary but makes the whole problem less sensitive to import order and module dependencies. * Adding cluster_properties fixture to make ImpalaTestClusterProperties available in tests without additional boilerplate. * Removing the caching of the local/remote build calculation. ImpalaTestClusterProperties is instantiated outside of python tests, but is_remote_cluster() is only called from python tests, so if we check flags in is_remote_cluster() we'll get the right results reliably. As a workaround to unblock remote tests, also assume catalog_v1 if accessing the web UI fails. Testing: Ran core tests against a regular minicluster. Ran tests against a remote cluster Change-Id: Ifa6b2a1391f53121d3d7c00c5cf0a57590899ce4 --- M tests/common/environ.py M tests/common/skip.py M tests/conftest.py M tests/custom_cluster/test_admission_controller.py M tests/custom_cluster/test_automatic_invalidation.py M tests/hs2/test_hs2.py M tests/metadata/test_compute_stats.py M tests/metadata/test_hms_integration.py M tests/metadata/test_metadata_query_statements.py M tests/metadata/test_refresh_partition.py M tests/query_test/test_kudu.py M tests/query_test/test_mt_dop.py M tests/query_test/test_spilling.py M tests/shell/test_shell_commandline.py M tests/shell/util.py M tests/webserver/test_web_pages.py 16 files changed, 142 insertions(+), 103 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/13386/10 -- To view, visit http://gerrit.cloudera.org:8080/13386 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifa6b2a1391f53121d3d7c00c5cf0a57590899ce4 Gerrit-Change-Number: 13386 Gerrit-PatchSet: 10 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-8553,IMPALA-8552: fix checks for remote cluster
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13386 ) Change subject: IMPALA-8553,IMPALA-8552: fix checks for remote cluster .. Patch Set 9: (1 comment) http://gerrit.cloudera.org:8080/#/c/13386/9/tests/common/environ.py File tests/common/environ.py: http://gerrit.cloudera.org:8080/#/c/13386/9/tests/common/environ.py@328 PS9, Line 328: i flake8: F821 undefined name 'impala_url' -- To view, visit http://gerrit.cloudera.org:8080/13386 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifa6b2a1391f53121d3d7c00c5cf0a57590899ce4 Gerrit-Change-Number: 13386 Gerrit-PatchSet: 9 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 19 Jun 2019 23:37:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8630: Hash the full path when calculating consistent remote placement
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/13545 ) Change subject: IMPALA-8630: Hash the full path when calculating consistent remote placement .. Patch Set 7: (9 comments) http://gerrit.cloudera.org:8080/#/c/13545/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13545/7//COMMIT_MSG@24 PS7, Line 24: see nit: sees? http://gerrit.cloudera.org:8080/#/c/13545/7/be/src/scheduling/scheduler-test-util.h File be/src/scheduling/scheduler-test-util.h: http://gerrit.cloudera.org:8080/#/c/13545/7/be/src/scheduling/scheduler-test-util.h@95 PS7, Line 95: PARTITIONED_SIMPLE_NAMES PARTITIONED_SINGLE_NAME? PARTITIONED_CONSTANT_NAMES? http://gerrit.cloudera.org:8080/#/c/13545/7/be/src/scheduling/scheduler-test-util.cc File be/src/scheduling/scheduler-test-util.cc: http://gerrit.cloudera.org:8080/#/c/13545/7/be/src/scheduling/scheduler-test-util.cc@295 PS7, Line 295: // Encoding the table name and index in the file helps debugging. Maybe include one or two examples of a path in the comment in each case? That might make the intent easier to follow. http://gerrit.cloudera.org:8080/#/c/13545/7/be/src/scheduling/scheduler-test-util.cc@353 PS7, Line 353: GetBlockPaths(table_name, true, spec_idx, naming_policy, &relative_path, One of the parameters could be returned by value, e.g. the partition_id http://gerrit.cloudera.org:8080/#/c/13545/7/be/src/scheduling/scheduler-test.cc File be/src/scheduling/scheduler-test.cc: http://gerrit.cloudera.org:8080/#/c/13545/7/be/src/scheduling/scheduler-test.cc@264 PS7, Line 264: void InitRemoteCluster(Cluster *cluster) { This could be a member of Cluster, e.g. static Cluster CreateRemoteCluster(num_impala_nodes, num_data_nodes). http://gerrit.cloudera.org:8080/#/c/13545/7/be/src/scheduling/scheduler-test.cc@305 PS7, Line 305: Can you add brief comments above each test describing what to do (see other tests in this file for examples)? http://gerrit.cloudera.org:8080/#/c/13545/7/be/src/scheduling/scheduler-test.cc@314 PS7, Line 314: Schema schema(cluster); These loops could benefit from adding a SCOPED_TRACE for the naming policy (https://github.com/google/googletest/blob/master/googletest/docs/advanced.md#adding-traces-to-assertions). http://gerrit.cloudera.org:8080/#/c/13545/7/common/thrift/PlanNodes.thrift File common/thrift/PlanNodes.thrift: http://gerrit.cloudera.org:8080/#/c/13545/7/common/thrift/PlanNodes.thrift@195 PS7, Line 195: partition's path For S3, this could be a bucket? Does it matter whether it uses the whole path or just the partition id? Should we just call it "partition_hash"? http://gerrit.cloudera.org:8080/#/c/13545/7/common/thrift/PlanNodes.thrift@196 PS7, Line 196: consistent "consistent" here mean "given the same input, must return the same hash", right? -- To view, visit http://gerrit.cloudera.org:8080/13545 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46c739fc31af539af2b3509e2a161f4e29f44d7b Gerrit-Change-Number: 13545 Gerrit-PatchSet: 7 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 19 Jun 2019 22:29:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8663 : FileMetadataLoader should skip listing files in hidden and tmp directories
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13665 ) Change subject: IMPALA-8663 : FileMetadataLoader should skip listing files in hidden and tmp directories .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3686/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13665 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2c4a22908304fe9e377d77d6c18d401c3f3294aa Gerrit-Change-Number: 13665 Gerrit-PatchSet: 3 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 19 Jun 2019 22:23:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8663 : FileMetadataLoader should skip listing files in hidden and tmp directories
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13665 ) Change subject: IMPALA-8663 : FileMetadataLoader should skip listing files in hidden and tmp directories .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/13665/3/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java: http://gerrit.cloudera.org:8080/#/c/13665/3/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@585 PS3, Line 585: public static final Predicate HIDDEN_DIRECTORIES_PREDICATE = fileStatus -> { line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/13665/3/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@658 PS3, Line 658: Predicate fileStatusPredicate, boolean useListStatus) throws IOException { line too long (94 > 90) -- To view, visit http://gerrit.cloudera.org:8080/13665 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2c4a22908304fe9e377d77d6c18d401c3f3294aa Gerrit-Change-Number: 13665 Gerrit-PatchSet: 3 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 19 Jun 2019 21:43:53 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8663 : FileMetadataLoader should skip listing files in hidden and tmp directories
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/13665 ) Change subject: IMPALA-8663 : FileMetadataLoader should skip listing files in hidden and tmp directories .. Patch Set 3: (6 comments) http://gerrit.cloudera.org:8080/#/c/13665/2/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java: http://gerrit.cloudera.org:8080/#/c/13665/2/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@580 PS2, Line 580: /** :* Predicate useful to filter out hidden directories and temporary staging directories :* which tools like Hive create in the table/partition directories when a query is :* inserting data into them. :*/ : public static final Predicate HIDDEN_DIRECTORIES_PREDICATE = fileStatus -> { : if (!fileStatus.isDirectory()) return false; : String filename = fileStatus.getPath().getName(); : return filename.startsWith(".") || filename.startsWith("_tmp."); : }; : : /** :* Wrapper around FileSystem.listFiles(), similar to the listStatus() wrapper above. :*/ : public static RemoteIterator listFiles(FileSystem fs, Path p, : boolean recursive) throws IOException { : try { : return new RemoteIteratorWithFilter(fs, p, recursive, false); : } catch (FileNotFoundException e) { : if (LOG.isWarnEnabled()) LOG.warn("Path does not exist: " + p.toString(), e); : return null; : } : } : : /* > You could also implement a Predicate overriding test(). Done http://gerrit.cloudera.org:8080/#/c/13665/2/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@656 PS2, Line 656: private RemoteIteratorWithFilter(FileSystem fs, Path startPath, > I think the refactor is confusing. RecursiingIterator can also have isRecur Renamed it to RemoteIteratorWithFilter which hopefully is clearer http://gerrit.cloudera.org:8080/#/c/13665/2/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@716 PS2, Line 716: > nit: no braces Done http://gerrit.cloudera.org:8080/#/c/13665/2/fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java File fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java: http://gerrit.cloudera.org:8080/#/c/13665/2/fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java@93 PS2, Line 93: hdfs://localhost:20500/ > Do we need these qualifiers? I think the Configuration should resolve it to Don't think we need them. I just followed what was being done in the other tests of this class. I personally think its clearer than letting Configuration resolve it to the default FileSystem http://gerrit.cloudera.org:8080/#/c/13665/2/fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java@100 PS2, Line 100: dstFs.deleteOnExit(tmpTestPath); > Do you need wrap the fs in try-with-resources for this to work? don't think so. This will clean up the directory when the JVM exits. The other way to do this would be a finally block to explicitly remove the directory before exiting. http://gerrit.cloudera.org:8080/#/c/13665/2/fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java@112 PS2, Line 112: 24 > Instead, we could try loading sourcePath and substitute the values? I am vary of using load() to test load() method's behavior. Seems like a anti-pattern. -- To view, visit http://gerrit.cloudera.org:8080/13665 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2c4a22908304fe9e377d77d6c18d401c3f3294aa Gerrit-Change-Number: 13665 Gerrit-PatchSet: 3 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 19 Jun 2019 21:43:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8663 : FileMetadataLoader should skip listing files in hidden and tmp directories
Vihang Karajgaonkar has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/13665 ) Change subject: IMPALA-8663 : FileMetadataLoader should skip listing files in hidden and tmp directories .. IMPALA-8663 : FileMetadataLoader should skip listing files in hidden and tmp directories The FileMetadataLoader is used to load the file information in when the table is loaded. By default, it lists all the files in the table/partition directory. Currently, it only skips the filenames which are invalid (hidden files and ones starting with "_" etc). However, it does not skip the directories which are temporary or hidden. In case of Hive when data is inserted into a table, it creates a temporary staging directory which is a hidden directory under the table location. When the insert in hive is completed, such staging directories are removed. But if there is a refresh called during that time, FileMetadataLoader will add the files in the staging directory as well. Not only this could cause temporary invalid results but it causes table to go in a bad state when these temporary directories are removed. The only work-around in such a case to issue a refresh on the table again. This patch adds logic in the filemetadataloader to ignore such temporary staging directories. Unfortunately, hadoop does not provide a API which can recursively list files in a directory and skip certain directories. This patch addes this logic of filtering into existing RecursingIterator in FileSystemUtil. Testing: 1. Added a new test in FilemetadataloaderTest and modified existing ones in AcidUtils 2. Ran concurrent inserts from Hive while issuing refresh in a loop on Impala side. Earlier this would cause the table to go into a bad state. Now, it works fine for the staging directories. It still runs into a FileNotFoundException from the impalad when there are insert overwrite statements in Hive Change-Id: I2c4a22908304fe9e377d77d6c18d401c3f3294aa --- M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java M fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java M fe/src/test/java/org/apache/impala/util/AcidUtilsTest.java 3 files changed, 89 insertions(+), 18 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/13665/3 -- To view, visit http://gerrit.cloudera.org:8080/13665 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2c4a22908304fe9e377d77d6c18d401c3f3294aa Gerrit-Change-Number: 13665 Gerrit-PatchSet: 3 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-8665 Include extra info in error message when date cast fails
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13680 ) Change subject: IMPALA-8665 Include extra info in error message when date cast fails .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3685/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If800b7696515cd61afee27220c55ff2440a86f04 Gerrit-Change-Number: 13680 Gerrit-PatchSet: 2 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 19 Jun 2019 21:40:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7534. Handle invalidation races in CatalogdMetaProvider
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/13664 ) Change subject: IMPALA-7534. Handle invalidation races in CatalogdMetaProvider .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/13664/4/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/13664/4/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@243 PS4, Line 243: final AtomicInteger piggybackSuccessCountForTests = new AtomicInteger(); > nit: final Done http://gerrit.cloudera.org:8080/#/c/13664/4/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@250 PS4, Line 250: final AtomicInteger piggybackExceptionCountForTests = new AtomicInteger(); > nit: final Done http://gerrit.cloudera.org:8080/#/c/13664/4/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@434 PS4, Line 434: returns an invalidation for the table name list > just for my sake of understanding. Is the invalidate of table name done in yea, because a table was created, the list of tables in that DB gets invalidated. -- To view, visit http://gerrit.cloudera.org:8080/13664 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I70f377db88e204825a909389f28dc3451815235c Gerrit-Change-Number: 13664 Gerrit-PatchSet: 5 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 19 Jun 2019 21:39:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8665 Include extra info in error message when date cast fails
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13680 ) Change subject: IMPALA-8665 Include extra info in error message when date cast fails .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3684/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If800b7696515cd61afee27220c55ff2440a86f04 Gerrit-Change-Number: 13680 Gerrit-PatchSet: 1 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 19 Jun 2019 21:26:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8665 Include extra info in error message when date cast fails
Jiawei Wang has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/13680 ) Change subject: IMPALA-8665 Include extra info in error message when date cast fails .. IMPALA-8665 Include extra info in error message when date cast fails It's hard for users to debug right now if users running millions of rows casting to date while cast failed without extra information. Solution: Adding a fail data value is at least a minimum requirement. Testing: query_test/test_date_queries.py test passed Change-Id: If800b7696515cd61afee27220c55ff2440a86f04 --- M be/src/exprs/cast-functions-ir.cc 1 file changed, 4 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/13680/2 -- To view, visit http://gerrit.cloudera.org:8080/13680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If800b7696515cd61afee27220c55ff2440a86f04 Gerrit-Change-Number: 13680 Gerrit-PatchSet: 2 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-8665 Include extra info in error message when date cast fails
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13680 ) Change subject: IMPALA-8665 Include extra info in error message when date cast fails .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/13680/1/be/src/exprs/cast-functions-ir.cc File be/src/exprs/cast-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/13680/1/be/src/exprs/cast-functions-ir.cc@315 PS1, Line 315: ctx->SetError(Substitute("String to Date parse failed. Invalid string val: $0", invalidVal).c_str()); line too long (105 > 90) -- To view, visit http://gerrit.cloudera.org:8080/13680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If800b7696515cd61afee27220c55ff2440a86f04 Gerrit-Change-Number: 13680 Gerrit-PatchSet: 1 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 19 Jun 2019 20:47:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8665 Include extra info in error message when date cast fails
Jiawei Wang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13680 Change subject: IMPALA-8665 Include extra info in error message when date cast fails .. IMPALA-8665 Include extra info in error message when date cast fails It's hard for users to debug right now if users running millions of rows casting to date while cast failed without extra information. Solution: Adding a fail data value is at least a minimum requirement. Testing: query_test/test_date_queries.py test passed Change-Id: If800b7696515cd61afee27220c55ff2440a86f04 --- M be/src/exprs/cast-functions-ir.cc 1 file changed, 3 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/13680/1 -- To view, visit http://gerrit.cloudera.org:8080/13680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: If800b7696515cd61afee27220c55ff2440a86f04 Gerrit-Change-Number: 13680 Gerrit-PatchSet: 1 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab
[Impala-ASF-CR] IMPALA-8682: Add authorized proxy user/group test coverage with Ranger
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13679 ) Change subject: IMPALA-8682: Add authorized proxy user/group test coverage with Ranger .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3683/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13679 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If6f797600720e6432b85cac8f13afe8fa5624596 Gerrit-Change-Number: 13679 Gerrit-PatchSet: 4 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 19 Jun 2019 19:43:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8682: Add authorized proxy user/group test coverage with Ranger
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13679 ) Change subject: IMPALA-8682: Add authorized proxy user/group test coverage with Ranger .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3682/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13679 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If6f797600720e6432b85cac8f13afe8fa5624596 Gerrit-Change-Number: 13679 Gerrit-PatchSet: 3 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 19 Jun 2019 19:27:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7534. Handle invalidation races in CatalogdMetaProvider
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/13664 ) Change subject: IMPALA-7534. Handle invalidation races in CatalogdMetaProvider .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/13664/4/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/13664/4/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@434 PS4, Line 434: returns an invalidation for the table name list just for my sake of understanding. Is the invalidate of table name done in response to create table or this is some other invalidation happening around this same time? -- To view, visit http://gerrit.cloudera.org:8080/13664 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I70f377db88e204825a909389f28dc3451815235c Gerrit-Change-Number: 13664 Gerrit-PatchSet: 4 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 19 Jun 2019 19:25:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7534. Handle invalidation races in CatalogdMetaProvider
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13664 ) Change subject: IMPALA-7534. Handle invalidation races in CatalogdMetaProvider .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3681/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13664 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I70f377db88e204825a909389f28dc3451815235c Gerrit-Change-Number: 13664 Gerrit-PatchSet: 5 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 19 Jun 2019 18:54:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8682: Add authorized proxy user/group test coverage with Ranger
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/13679 ) Change subject: IMPALA-8682: Add authorized proxy user/group test coverage with Ranger .. Patch Set 4: (3 comments) http://gerrit.cloudera.org:8080/#/c/13679/3/tests/authorization/test_authorized_proxy.py File tests/authorization/test_authorized_proxy.py: http://gerrit.cloudera.org:8080/#/c/13679/3/tests/authorization/test_authorized_proxy.py@26 PS3, Line 26: from getpass import getuser > flake8: F811 redefinition of unused 'time' from line 21 Done http://gerrit.cloudera.org:8080/#/c/13679/3/tests/authorization/test_authorized_proxy.py@120 PS3, Line 120: t > flake8: E131 continuation line unaligned for hanging indent Done http://gerrit.cloudera.org:8080/#/c/13679/3/tests/authorization/test_authorized_proxy.py@140 PS3, Line 140: t > flake8: E131 continuation line unaligned for hanging indent Done -- To view, visit http://gerrit.cloudera.org:8080/13679 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If6f797600720e6432b85cac8f13afe8fa5624596 Gerrit-Change-Number: 13679 Gerrit-PatchSet: 4 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 19 Jun 2019 18:51:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8682: Add authorized proxy user/group test coverage with Ranger
Fredy Wijaya has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/13679 ) Change subject: IMPALA-8682: Add authorized proxy user/group test coverage with Ranger .. IMPALA-8682: Add authorized proxy user/group test coverage with Ranger This patch adds a test coverage for authorized proxy user/group with Ranger. This patch also moves the authorized proxy tests into a separate file, test_authorized_proxy and refactors the tests to be more readable and reusable. Testing: - Added a new test_authorized_proxy.py - Ran all E2E authorization tests Change-Id: If6f797600720e6432b85cac8f13afe8fa5624596 --- M tests/authorization/test_authorization.py A tests/authorization/test_authorized_proxy.py 2 files changed, 278 insertions(+), 145 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/13679/4 -- To view, visit http://gerrit.cloudera.org:8080/13679 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If6f797600720e6432b85cac8f13afe8fa5624596 Gerrit-Change-Number: 13679 Gerrit-PatchSet: 4 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-8682: Add authorized proxy user/group test coverage with Ranger
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13679 ) Change subject: IMPALA-8682: Add authorized proxy user/group test coverage with Ranger .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/13679/3/tests/authorization/test_authorized_proxy.py File tests/authorization/test_authorized_proxy.py: http://gerrit.cloudera.org:8080/#/c/13679/3/tests/authorization/test_authorized_proxy.py@26 PS3, Line 26: from time import sleep, time flake8: F811 redefinition of unused 'time' from line 21 http://gerrit.cloudera.org:8080/#/c/13679/3/tests/authorization/test_authorized_proxy.py@120 PS3, Line 120: . flake8: E131 continuation line unaligned for hanging indent http://gerrit.cloudera.org:8080/#/c/13679/3/tests/authorization/test_authorized_proxy.py@140 PS3, Line 140: . flake8: E131 continuation line unaligned for hanging indent -- To view, visit http://gerrit.cloudera.org:8080/13679 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If6f797600720e6432b85cac8f13afe8fa5624596 Gerrit-Change-Number: 13679 Gerrit-PatchSet: 3 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 19 Jun 2019 18:47:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8682: Add authorized proxy user/group test coverage with Ranger
Fredy Wijaya has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13679 Change subject: IMPALA-8682: Add authorized proxy user/group test coverage with Ranger .. IMPALA-8682: Add authorized proxy user/group test coverage with Ranger This patch adds a test coverage for authorized proxy user/group with Ranger. This patch also moves the authorized proxy tests into a separate file, test_authorized_proxy and refactors the tests to be more readable and reusable. Testing: - Added a new test_authorized_proxy.py - Ran all E2E authorization tests Change-Id: If6f797600720e6432b85cac8f13afe8fa5624596 --- M tests/authorization/test_authorization.py A tests/authorization/test_authorized_proxy.py 2 files changed, 279 insertions(+), 144 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/13679/3 -- To view, visit http://gerrit.cloudera.org:8080/13679 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: If6f797600720e6432b85cac8f13afe8fa5624596 Gerrit-Change-Number: 13679 Gerrit-PatchSet: 3 Gerrit-Owner: Fredy Wijaya
[Impala-ASF-CR] IMPALA-8630: Hash the full path when calculating consistent remote placement
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13545 ) Change subject: IMPALA-8630: Hash the full path when calculating consistent remote placement .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3680/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13545 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46c739fc31af539af2b3509e2a161f4e29f44d7b Gerrit-Change-Number: 13545 Gerrit-PatchSet: 7 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 19 Jun 2019 18:43:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8553,IMPALA-8552: fix checks for remote cluster
David Knupp has posted comments on this change. ( http://gerrit.cloudera.org:8080/13386 ) Change subject: IMPALA-8553,IMPALA-8552: fix checks for remote cluster .. Patch Set 8: (3 comments) http://gerrit.cloudera.org:8080/#/c/13386/8/tests/common/environ.py File tests/common/environ.py: http://gerrit.cloudera.org:8080/#/c/13386/8/tests/common/environ.py@223 PS8, Line 223: def get_instance(cls): This is a non-blocking comment, but but there's a bit of python pedantry/hackery you can employ here to effect the Singleton pattern. Python has a __new__() method that gets called *before* __init__(). It rarely gets overridden, but for this case, you could do something like this to forgo needing a get_instance() class method: class ImpalaTestClusterProperties(object): _instance = None _build_flavor = None _library_link_type = None _web_ui_url = None def __new__(cls): if cls._instance is None: cls._web_ui_url = IMPALA_REMOTE_URL or DEFAULT_LOCAL_WEB_UI_URL if IMPALA_REMOTE_URL: # If IMPALA_REMOTE_URL is set, prefer detecting from the web UI. cls._build_flavor, cls._library_link_type =\ ImpalaTestClusterFlagsDetector.detect_using_web_ui(web_ui_url) else: cls._build_flavor, cls._library_link_type =\ ImpalaTestClusterFlagsDetector.detect_using_build_root_or_web_ui(IMPALA_HOME) cls._instance = object.__new__(cls) return cls._instance def __init__(self): self._runtime_flags = None # Lazily populated to avoid unnecessary web UI calls. @property def build_flavor(self): """ Return the correct ImpalaBuildFlavors for the Impala under test. """ return self._build_flavor @property def library_link_type(self): """ Return the library link type (either static or dynamic) for the Impala under test. """ return self._library_link_type # etc... >>> instance1 = ImpalaTestClusterProperties() >>> instance2 = ImpalaTestClusterProperties() >>> >>> print id(instance1) 4378407440 >>> print id(instance2) 4378407440 >>> instance1 == instance2 True http://gerrit.cloudera.org:8080/#/c/13386/8/tests/common/environ.py@321 PS8, Line 321: def _get_flags_from_web_ui(self): Also non-blocking, but a style nit here: seems like this would be a good place to use the @property decorator, which generally gets used where attributes need to be lazily eval'ed? @property def runtime_flags(self): if self._runtime_flags is None: # do stuff return self._runtime_flags Then later... def is_catalog_v2_cluster(self): """Whether we use CATALOG_V2 options, including local catalog and HMS notifications. For now, assume that --use_local_catalog=true implies that the others are enabled.""" try: key = "use_local_catalog" # --use_local_catalog is hidden so does not appear in JSON if disabled. return key in self.runtime_flags and self.runtime_flags[key]["current"] == "true" except Exception: if self.is_remote_cluster(): # IMPALA-8553: be more tolerant of failures on remote cluster builds. LOG.exception("Failed to get flags from web UI, assuming catalog V1") return False raise http://gerrit.cloudera.org:8080/#/c/13386/8/tests/hs2/test_hs2.py File tests/hs2/test_hs2.py: http://gerrit.cloudera.org:8080/#/c/13386/8/tests/hs2/test_hs2.py@42 PS8, Line 42: handless handles? -- To view, visit http://gerrit.cloudera.org:8080/13386 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifa6b2a1391f53121d3d7c00c5cf0a57590899ce4 Gerrit-Change-Number: 13386 Gerrit-PatchSet: 8 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 19 Jun 2019 18:38:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8443: Record time spent in authorization in the runtime profile
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/13353 ) Change subject: IMPALA-8443: Record time spent in authorization in the runtime profile .. Patch Set 10: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/13353 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5bb85e57fcc75d41f3eb2911e6d375e0da6f82ae Gerrit-Change-Number: 13353 Gerrit-PatchSet: 10 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Comment-Date: Wed, 19 Jun 2019 18:28:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8443: Record time spent in authorization in the runtime profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13353 ) Change subject: IMPALA-8443: Record time spent in authorization in the runtime profile .. Patch Set 10: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4500/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/13353 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5bb85e57fcc75d41f3eb2911e6d375e0da6f82ae Gerrit-Change-Number: 13353 Gerrit-PatchSet: 10 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Comment-Date: Wed, 19 Jun 2019 18:28:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7534. Handle invalidation races in CatalogdMetaProvider
Hello Bharath Vissapragada, Vihang Karajgaonkar, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13664 to look at the new patch set (#5). Change subject: IMPALA-7534. Handle invalidation races in CatalogdMetaProvider .. IMPALA-7534. Handle invalidation races in CatalogdMetaProvider This handles a race condition in which a cache invalidation concurrent with a cache load would potentially be skipped, causing out-of-date data to persist in the cache. This would present itself as spurious "table not found" errors. A new test case triggers the issue reliably by injecting latency into the metadata fetch RPC and running DDLs concurrently on the same database across 8 threads. With the fix, the test passes reliably. Another option to fix this might have been to switch to Caffeine instead of Guava's loading cache. However, Caffeine requires Java 8, and LocalCatalog is being backported to Impala 2.x which still can run on Java 7. So, working around the Guava issue will make backporting (and future backports) easier. Change-Id: I70f377db88e204825a909389f28dc3451815235c --- M be/src/exec/catalog-op-executor.cc M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java M tests/custom_cluster/test_local_catalog.py 4 files changed, 260 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/13664/5 -- To view, visit http://gerrit.cloudera.org:8080/13664 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I70f377db88e204825a909389f28dc3451815235c Gerrit-Change-Number: 13664 Gerrit-PatchSet: 5 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-8630: Hash the full path when calculating consistent remote placement
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13545 ) Change subject: IMPALA-8630: Hash the full path when calculating consistent remote placement .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4499/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/13545 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46c739fc31af539af2b3509e2a161f4e29f44d7b Gerrit-Change-Number: 13545 Gerrit-PatchSet: 7 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 19 Jun 2019 18:11:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8630: Hash the full path when calculating consistent remote placement
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/13545 ) Change subject: IMPALA-8630: Hash the full path when calculating consistent remote placement .. Patch Set 6: (4 comments) http://gerrit.cloudera.org:8080/#/c/13545/6/be/src/scheduling/scheduler-test.cc File be/src/scheduling/scheduler-test.cc: http://gerrit.cloudera.org:8080/#/c/13545/6/be/src/scheduling/scheduler-test.cc@267 PS6, Line 267: // Set of Impala hosts > Extra space indentation Done http://gerrit.cloudera.org:8080/#/c/13545/6/be/src/scheduling/scheduler-test.cc@282 PS6, Line 282: /// overlapping by chance is extremely low, so this test should only fail if the > Can we control how the RNG used for the random replica selection is seeded? Good point. I was thinking the probability of overlap should be small enough not to matter (~10^-18 or so), but then I noticed that we do srand(0) in SchedulerTest constructor: https://github.com/apache/impala/blob/master/be/src/scheduling/scheduler-test.cc#L31 The constructor runs for each test, so I guess these are deterministic. Changed this comment to indicate this is deterministic (which is fine, because these tests don't benefit much from actual randomness). http://gerrit.cloudera.org:8080/#/c/13545/6/common/thrift/PlanNodes.thrift File common/thrift/PlanNodes.thrift: http://gerrit.cloudera.org:8080/#/c/13545/6/common/thrift/PlanNodes.thrift@195 PS6, Line 195: // hash of the partition's path > I think it would be useful to document how the hash should be computed (sin Added a comment about using Java String.hashCode() and that it needs to be consistent across processes/machines. http://gerrit.cloudera.org:8080/#/c/13545/6/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/13545/6/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@951 PS6, Line 951: partition.getLocation().hashCode()); > This was surprisingly subtle. I remembered that hashCode() was not guarante Good point, I added a comment in the thrift file. -- To view, visit http://gerrit.cloudera.org:8080/13545 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46c739fc31af539af2b3509e2a161f4e29f44d7b Gerrit-Change-Number: 13545 Gerrit-PatchSet: 6 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 19 Jun 2019 18:03:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8630: Hash the full path when calculating consistent remote placement
Hello Lars Volker, Tim Armstrong, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13545 to look at the new patch set (#7). Change subject: IMPALA-8630: Hash the full path when calculating consistent remote placement .. IMPALA-8630: Hash the full path when calculating consistent remote placement Consistent remote placement currently uses the relative filename within a partition for the consistent hash. If the relative filenames for different partitions have a simple naming scheme, then multiple partitions may have files of the same name. This is true for some tables written by Hive (e.g. in our minicluster the tpcds.store_sales has this problem). This can lead to unbalanced placement of remote ranges. This adds a partition_path_hash to the THdfsFileSplit and THdfsFileSplitGeneratorSpec, calculated in the frontend (which has all of the partition information). The scheduler hashes this in addition to the relative path. Testing: - Added several new scheduler tests that verify the consistent remote scheduling see blocks with different relative paths, partition paths, or offsets as distinct. - Ran core tests Change-Id: I46c739fc31af539af2b3509e2a161f4e29f44d7b --- M be/src/scheduling/scheduler-test-util.cc M be/src/scheduling/scheduler-test-util.h M be/src/scheduling/scheduler-test.cc M be/src/scheduling/scheduler.cc M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/test/java/org/apache/impala/planner/ExplainTest.java 7 files changed, 281 insertions(+), 42 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/13545/7 -- To view, visit http://gerrit.cloudera.org:8080/13545 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I46c739fc31af539af2b3509e2a161f4e29f44d7b Gerrit-Change-Number: 13545 Gerrit-PatchSet: 7 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-7534. Handle invalidation races in CatalogdMetaProvider
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13664 ) Change subject: IMPALA-7534. Handle invalidation races in CatalogdMetaProvider .. Patch Set 4: Code-Review+2 Thank you for making those changes. Should be good to go once you've addressed Bharath's comments. -- To view, visit http://gerrit.cloudera.org:8080/13664 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I70f377db88e204825a909389f28dc3451815235c Gerrit-Change-Number: 13664 Gerrit-PatchSet: 4 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 19 Jun 2019 17:15:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7369: part 2: Add INTERVAL expr support and built-in functions for DATE
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/13648 ) Change subject: IMPALA-7369: part 2: Add INTERVAL expr support and built-in functions for DATE .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/13648/3/be/src/benchmarks/date-benchmark.cc File be/src/benchmarks/date-benchmark.cc: http://gerrit.cloudera.org:8080/#/c/13648/3/be/src/benchmarks/date-benchmark.cc@37 PS3, Line 37: 4.92X Nice results! Can you mention this optimization in the commit message? http://gerrit.cloudera.org:8080/#/c/13648/3/be/src/benchmarks/date-benchmark.cc@169 PS3, Line 169: data.AddRange(DateValue(1965, 1, 1), DateValue(2020, 12, 31)); It is possible that the sorted test data helps the current solution by making many branches predictable. (note that real world data is likely to be from a small range, so the current data may be more realistic than a shuffled one). Can you run the test with shuffled dates? It is not necessary to commit it, just check that there is no regression. http://gerrit.cloudera.org:8080/#/c/13648/3/be/src/runtime/date-value.cc File be/src/runtime/date-value.cc: http://gerrit.cloudera.org:8080/#/c/13648/3/be/src/runtime/date-value.cc@199 PS3, Line 199: int m = int(days_since_jan1 / 30.5); Idea for +1 DCHECK: DCHECK(month_ranges[m] > days_since_jan1) -- To view, visit http://gerrit.cloudera.org:8080/13648 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If404bffdaf055c769e79ffa8f193bac415cfdd1a Gerrit-Change-Number: 13648 Gerrit-PatchSet: 3 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 19 Jun 2019 16:51:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8630: Hash the full path when calculating consistent remote placement
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13545 ) Change subject: IMPALA-8630: Hash the full path when calculating consistent remote placement .. Patch Set 6: (4 comments) Tests look great, thanks for adding them. I had a few minor requests but can +2 after you've addressed those. http://gerrit.cloudera.org:8080/#/c/13545/6/be/src/scheduling/scheduler-test.cc File be/src/scheduling/scheduler-test.cc: http://gerrit.cloudera.org:8080/#/c/13545/6/be/src/scheduling/scheduler-test.cc@267 PS6, Line 267: // Set of Impala hosts Extra space indentation http://gerrit.cloudera.org:8080/#/c/13545/6/be/src/scheduling/scheduler-test.cc@282 PS6, Line 282: /// overlapping by chance is extremely low, so this test should only fail if the Can we control how the RNG used for the random replica selection is seeded? E.g. call srand() with a seed that we control before each run. Otherwise it might be hard to reproduce if we have occasional collisions (e.g. RNG isn't as good as we think and has more collisions than we'd expect). This isn't a blocker if it turns out to be tricky, but it's a nice-to-have. http://gerrit.cloudera.org:8080/#/c/13545/6/common/thrift/PlanNodes.thrift File common/thrift/PlanNodes.thrift: http://gerrit.cloudera.org:8080/#/c/13545/6/common/thrift/PlanNodes.thrift@195 PS6, Line 195: // hash of the partition's path I think it would be useful to document how the hash should be computed (since we rely on it being computed the same way). I.e. that it's the standard Java String.hashCode() http://gerrit.cloudera.org:8080/#/c/13545/6/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/13545/6/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@951 PS6, Line 951: partition.getLocation().hashCode()); This was surprisingly subtle. I remembered that hashCode() was not guaranteed to be consistent across JVMs: https://martin.kleppmann.com/2012/06/18/java-hashcode-unsafe-for-distributed-systems.html. But it appears that the String hashCode is actually documented as being stable: https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#hashCode-- I think it is worth documenting, but might be best to document in the thrift spec instead of here. -- To view, visit http://gerrit.cloudera.org:8080/13545 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46c739fc31af539af2b3509e2a161f4e29f44d7b Gerrit-Change-Number: 13545 Gerrit-PatchSet: 6 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 19 Jun 2019 16:48:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7734: Catalog and Statestore memz page shows useless memory
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13670 ) Change subject: IMPALA-7734: Catalog and Statestore memz page shows useless memory .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3679/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13670 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1a42434b98aec1d9e0be6ae52325049378fdde23 Gerrit-Change-Number: 13670 Gerrit-PatchSet: 5 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tamas Mate Gerrit-Comment-Date: Wed, 19 Jun 2019 15:49:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7734: Catalog and Statestore memz page shows useless memory
Tamas Mate has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/13670 ) Change subject: IMPALA-7734: Catalog and Statestore memz page shows useless memory .. IMPALA-7734: Catalog and Statestore memz page shows useless memory The MemTracker is removed from StatestoreD main and CatalogD main as it is not used. When MemTracker is not available the new method that responsible for adding mem trackers will be skipped and will not be part of the callback. This results missing document members which are not printed. Before/after pictures of the '/memz' page can be found in the Jira. Change-Id: I1a42434b98aec1d9e0be6ae52325049378fdde23 --- M be/src/catalog/catalogd-main.cc M be/src/statestore/statestored-main.cc M be/src/util/default-path-handlers.cc M www/memz.tmpl 4 files changed, 30 insertions(+), 16 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/13670/5 -- To view, visit http://gerrit.cloudera.org:8080/13670 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1a42434b98aec1d9e0be6ae52325049378fdde23 Gerrit-Change-Number: 13670 Gerrit-PatchSet: 5 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tamas Mate
[Impala-ASF-CR] IMPALA-7369: part 2: Add INTERVAL expr support and built-in functions for DATE
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13648 ) Change subject: IMPALA-7369: part 2: Add INTERVAL expr support and built-in functions for DATE .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3678/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13648 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If404bffdaf055c769e79ffa8f193bac415cfdd1a Gerrit-Change-Number: 13648 Gerrit-PatchSet: 3 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 19 Jun 2019 13:04:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8636: Implement INSERT for insert-only ACID tables
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/13559 ) Change subject: IMPALA-8636: Implement INSERT for insert-only ACID tables .. Patch Set 5: (14 comments) http://gerrit.cloudera.org:8080/#/c/13559/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13559/5//COMMIT_MSG@36 PS5, Line 36: TODO in following commits: It is not clear to me whether dynamic partitioning is supported or not. http://gerrit.cloudera.org:8080/#/c/13559/5/be/src/exec/hdfs-table-sink.cc File be/src/exec/hdfs-table-sink.cc: http://gerrit.cloudera.org:8080/#/c/13559/5/be/src/exec/hdfs-table-sink.cc@214 PS5, Line 214: // Create final_hdfs_file_name_prefix and tmp_hdfs_file_name_prefix. Can you mention the transactional naming logic in the comment? http://gerrit.cloudera.org:8080/#/c/13559/5/be/src/exec/hdfs-table-sink.cc@696 PS5, Line 696: return (IsS3APath(partition->final_hdfs_file_name_prefix.c_str()) && !overwrite_ && : state->query_options().s3_skip_insert_staging) || : IsTransactional(); nit: can you improve readability? e.g if (IsTransactional()) return true; could go to a separate line. http://gerrit.cloudera.org:8080/#/c/13559/5/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/13559/5/be/src/service/client-request-state.cc@817 PS5, Line 817: if (exec_request().__isset.transaction_id) { : RETURN_IF_ERROR(frontend_->CommitTransaction(exec_request().transaction_id)); : in_transaction_ = false; : } : RETURN_IF_ERROR(UpdateCatalog()); I am not sure here, but the order of UpdateCatalog() and CommitTransaction() seems non-trivial to me, especially with dynamic partitioning - I assume that UpdateCatalog() will create the new partitions in HMS, and in that case we can commit to partitions that do not exist yet in HMS. http://gerrit.cloudera.org:8080/#/c/13559/5/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/13559/5/fe/src/main/java/org/apache/impala/service/Frontend.java@1259 PS5, Line 1259: try { I think it would be more readable if the logic inside the try block would be in a separate function. http://gerrit.cloudera.org:8080/#/c/13559/5/fe/src/main/java/org/apache/impala/service/Frontend.java@1370 PS5, Line 1370: ImpalaException Should this only catch ImpalaException? e.g. if there is a null pointer exception for some reason, we should still abort the transaction in my opinion. http://gerrit.cloudera.org:8080/#/c/13559/5/fe/src/main/java/org/apache/impala/service/Frontend.java@1372 PS5, Line 1372: abortTransaction This can also throw an exception, for example if HMS was shut down, and this will hide the original exception. I don't know how do we generally handle this in Impala - in this case, TransactionException could get a member like Exception causeOfAbort_ or originalException_. http://gerrit.cloudera.org:8080/#/c/13559/5/testdata/datasets/functional/functional_schema_template.sql File testdata/datasets/functional/functional_schema_template.sql: http://gerrit.cloudera.org:8080/#/c/13559/5/testdata/datasets/functional/functional_schema_template.sql@2611 PS5, Line 2611: TBLPROPERTIES('transactional'='true', 'transactional_properties'='insert_only');; nit: extra ; http://gerrit.cloudera.org:8080/#/c/13559/5/testdata/workloads/functional-query/queries/QueryTest/acid-insert.test File testdata/workloads/functional-query/queries/QueryTest/acid-insert.test: http://gerrit.cloudera.org:8080/#/c/13559/5/testdata/workloads/functional-query/queries/QueryTest/acid-insert.test@7 PS5, Line 7: QUERY It would be great to also read the tables back with HMS. http://gerrit.cloudera.org:8080/#/c/13559/5/testdata/workloads/functional-query/queries/QueryTest/acid-insert.test@113 PS5, Line 113: Can you also add insert into/overwrite with dynamic partitioning? http://gerrit.cloudera.org:8080/#/c/13559/5/tests/metadata/test_hms_integration.py File tests/metadata/test_hms_integration.py: http://gerrit.cloudera.org:8080/#/c/13559/5/tests/metadata/test_hms_integration.py@662 PS5, Line 662: execute_serially As this test runs in a unique_database, I don't think that it needs to run serially. E.g. the next test needs this because it has INVALIDATE METADATA, which is a global operation. http://gerrit.cloudera.org:8080/#/c/13559/5/tests/query_test/test_insert.py File tests/query_test/test_insert.py: http://gerrit.cloudera.org:8080/#/c/13559/5/tests/query_test/test_insert.py@142 PS5, Line 142: def test_acid_insert(self, vector): What is the reason for not using a unique database? http://gerrit.cloudera.org:8080/#/c/13559/5/tests/query_test/test_insert.py@146 PS5, Line 146: if HIVE_
[Impala-ASF-CR] IMPALA-7369: part 2: Add INTERVAL expr support and built-in functions for DATE
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/13648 ) Change subject: IMPALA-7369: part 2: Add INTERVAL expr support and built-in functions for DATE .. Patch Set 3: (1 comment) Patch set #3 also contains improvements to the DateValue::ToYearMonthDay() implementation,. http://gerrit.cloudera.org:8080/#/c/13648/2/be/src/runtime/date-test.cc File be/src/runtime/date-test.cc: http://gerrit.cloudera.org:8080/#/c/13648/2/be/src/runtime/date-test.cc@952 PS2, Line 952: nt > typo: are Done -- To view, visit http://gerrit.cloudera.org:8080/13648 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If404bffdaf055c769e79ffa8f193bac415cfdd1a Gerrit-Change-Number: 13648 Gerrit-PatchSet: 3 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 19 Jun 2019 12:27:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7369: part 2: Add INTERVAL expr support and built-in functions for DATE
Attila Jeges has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/13648 ) Change subject: IMPALA-7369: part 2: Add INTERVAL expr support and built-in functions for DATE .. IMPALA-7369: part 2: Add INTERVAL expr support and built-in functions for DATE This change implements INTERVAL expression support for DATE type and adds several DATE related built-in functions. The following functions are supported in Hive: INT YEAR(DATE d) Extracts year of the 'd' date, returns it as an int in 0- range. INT MONTH(DATE d) Extracts month of the 'd' date and returns it as an int in 1-12 range. INT DAY(DATE d), INT DAYOFMONTH(DATE d) Extracts day-of-month of the 'd' date and returns it as an int in 1-31 range. INT QUARTER(DATE d) Extracts quarter of the 'd' date and returns it as an int in 1-4 range. INT DAYOFWEEK(DATE d) Extracts day-of-week of the 'd' date and returns it as an int in 1-7 range. 1 is Sunday and 7 is Saturday. INT DAYOFYEAR(DATE d) Extracts day-of-year of the 'd' date and returns it as an int in 1-366 range. INT WEEKOFYEAR(DATE d) Extracts week-of-year of the 'd' date and returns it as an int in 1-53 range. STRING DAYNAME(DATE d) Returns the day field from a 'd' date, converted to the string corresponding to that day name. The range of return values is "Sunday" to "Saturday". STRING MONTHNAME(DATE d) Returns the month field from a 'd' date, converted to the string corresponding to that month name. The range of return values is "January" to "December". DATE NEXT_DAY(DATE d, STRING weekday) Returns the first date which is later than 'd' and named as 'weekday'. 'weekday' is 3 letters or full name of the day of the week. DATE LAST_DAY(DATE d) Returns the last day of the month which the 'd' date belongs to. INT DATEDIFF(DATE d1, DATE d2) Returns the number of days from 'd1' date to 'd2' date. DATE CURRENT_DATE() Returns the current date (in the local time zone). INT INT_MONTHS_BETWEEN(DATE d1, DATE d2) Returns the number of months between 'd1' and 'd2' dates, as an int representing only the full months that passed. If 'd1' represents an earlier date than 'd2', the result is negative. DOUBLE MONTHS_BETWEEN(DATE d1, DATE d2) Returns the number of months between 'd1' and 'd2' dates. Can include a fractional part representing extra days in addition to the full months between the dates. The fractional component is computed by dividing the difference in days by 31 (regardless of the month). If 'd1' represents an earlier date than 'd2', the result is negative. DATE ADD_YEARS(DATE d, INT/BIGINT num_years), DATE SUB_YEARS(DATE d, INT/BIGINT num_years) Adds/subtracts a specified number of years to a 'd' date value. DATE ADD_MONTHS(DATE d, INT/BIGINT num_months), DATE SUB_MONTHS(DATE d, INT/BIGINT num_months) Adds/subtracts a specified number of months to a date value. If 'd' is the last day of a month, the returned date will fall on the last day of the target month too. DATE ADD_DAYS(DATE d, INT/BIGINT num_days), DATE SUB_DAYS(DATE d, INT/BIGINT num_days) Adds/subtracts a specified number of days to a date value. DATE ADD_WEEKS(DATE d, INT/BIGINT num_weeks), DATE SUB_WEEKS(DATE d, INT/BIGINT num_weeks) Adds/subtracts a specified number of weeks to a date value. The following function doesn't exist in Hive but supported by Amazon Redshift INT DATE_CMP(DATE d1, DATE d2) Compares 'd1' and 'd2' dates. Returns: 1. NULL, if either 'd1' or 'd2' is NULL 2. -1 if d1 < d2 3. 1 if d1 > d2 4. 0 if d1 == d2 (https://docs.aws.amazon.com/redshift/latest/dg/r_DATE_CMP.html) Change-Id: If404bffdaf055c769e79ffa8f193bac415cfdd1a --- M be/src/benchmarks/date-benchmark.cc M be/src/codegen/impala-ir.cc M be/src/exprs/CMakeLists.txt A be/src/exprs/date-functions-ir.cc A be/src/exprs/date-functions.h M be/src/exprs/expr-test.cc M be/src/exprs/scalar-expr-evaluator.cc M be/src/exprs/timestamp-functions-ir.cc M be/src/exprs/timestamp-functions.cc M be/src/exprs/timestamp-functions.h M be/src/exprs/udf-builtins.cc M be/src/runtime/date-test.cc M be/src/runtime/date-value.cc M be/src/runtime/date-value.h M common/function-registry/impala_functions.py M fe/src/main/java/org/apache/impala/analysis/TimestampArithmeticExpr.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java 17 files changed, 1,676 insertions(+), 174 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13648/3 -- To view, visit http://gerrit.cloudera.org:8080/13648 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If404bffdaf055c769e79ffa8f193bac415cfdd1a Gerrit-Change-Number: 13648 Gerrit-PatchSet: 3 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Revie
[Impala-ASF-CR] IMPALA-8443: Record time spent in authorization in the runtime profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13353 ) Change subject: IMPALA-8443: Record time spent in authorization in the runtime profile .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3677/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13353 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5bb85e57fcc75d41f3eb2911e6d375e0da6f82ae Gerrit-Change-Number: 13353 Gerrit-PatchSet: 10 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Comment-Date: Wed, 19 Jun 2019 12:22:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8443: Record time spent in authorization in the runtime profile
Tamas Mate has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/13353 ) Change subject: IMPALA-8443: Record time spent in authorization in the runtime profile .. IMPALA-8443: Record time spent in authorization in the runtime profile The analysis and authorization is handled together as authorization depends on the results of the analysis. The timeline EventSequence is moved into the AuthorizationContext and the markEvent is called during the postAuthorize method. In some cases the EventSequence is not available when the AuthorizationContext is created therefore it is wrapped in Optional. Change-Id: I5bb85e57fcc75d41f3eb2911e6d375e0da6f82ae --- M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java M fe/src/main/java/org/apache/impala/authorization/AuthorizationContext.java M fe/src/main/java/org/apache/impala/authorization/BaseAuthorizationChecker.java M fe/src/main/java/org/apache/impala/authorization/NoopAuthorizationFactory.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationChecker.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/util/EventSequence.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M tests/observability/test_log_fragments.py M tests/query_test/test_observability.py 13 files changed, 84 insertions(+), 27 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/13353/10 -- To view, visit http://gerrit.cloudera.org:8080/13353 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5bb85e57fcc75d41f3eb2911e6d375e0da6f82ae Gerrit-Change-Number: 13353 Gerrit-PatchSet: 10 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate
[Impala-ASF-CR] IMPALA-7467: Port ExecQueryFInstances to krpc
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/13583 ) Change subject: IMPALA-7467: Port ExecQueryFInstances to krpc .. IMPALA-7467: Port ExecQueryFInstances to krpc This patch ports the ExecQueryFInstances rpc to use KRPC. The parameters for this call contain a huge number of Thrift structs (eg. everything related to TPlanNode and TExpr), so to avoid converting all of these to protobuf and the resulting effect that would have on the FE and catalog, this patch stores most of the parameters in a sidecar (in particular the TQueryCtx, TPlanFragmentCtx's, and TPlanFragmentInstanceCtx's). Testing: - Passed a full exhaustive run on the minicluster. Set up a ten node cluster with tpch 500: - Ran perf tests: 3 iterations per tpch query, 4 concurrent streams, no perf change. - Ran the stress test for 1000 queries, passed. Change-Id: Id3f1c6099109bd8e5361188005a7d0e892147570 Reviewed-on: http://gerrit.cloudera.org:8080/13583 Reviewed-by: Thomas Tauber-Marshall Tested-by: Impala Public Jenkins --- M be/src/runtime/backend-client.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/query-exec-mgr.cc M be/src/runtime/query-exec-mgr.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/test-env.cc M be/src/service/control-service.cc M be/src/service/control-service.h M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M common/protobuf/control_service.proto M common/thrift/ImpalaInternalService.thrift M tests/custom_cluster/test_rpc_timeout.py 15 files changed, 245 insertions(+), 188 deletions(-) Approvals: Thomas Tauber-Marshall: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/13583 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Id3f1c6099109bd8e5361188005a7d0e892147570 Gerrit-Change-Number: 13583 Gerrit-PatchSet: 6 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-7467: Port ExecQueryFInstances to krpc
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13583 ) Change subject: IMPALA-7467: Port ExecQueryFInstances to krpc .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/13583 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id3f1c6099109bd8e5361188005a7d0e892147570 Gerrit-Change-Number: 13583 Gerrit-PatchSet: 5 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Wed, 19 Jun 2019 07:21:00 + Gerrit-HasComments: No