[Impala-ASF-CR] Fix diagnostics path to not include the parent dir structure
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10347 ) Change subject: Fix diagnostics path to not include the parent dir structure .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2443/ -- To view, visit http://gerrit.cloudera.org:8080/10347 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I540f6c228a0315780d45cf11961f124478b5dd0c Gerrit-Change-Number: 10347 Gerrit-PatchSet: 3 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Thu, 10 May 2018 05:53:11 + Gerrit-HasComments: No
[Impala-ASF-CR] Fix diagnostics path to not include the parent dir structure
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/10347 ) Change subject: Fix diagnostics path to not include the parent dir structure .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10347 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I540f6c228a0315780d45cf11961f124478b5dd0c Gerrit-Change-Number: 10347 Gerrit-PatchSet: 3 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Thu, 10 May 2018 05:50:30 + Gerrit-HasComments: No
[Impala-ASF-CR] Fix diagnostics path to not include the parent dir structure
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/10347 ) Change subject: Fix diagnostics path to not include the parent dir structure .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/10347/2/bin/diagnostics/collect_diagnostics.py File bin/diagnostics/collect_diagnostics.py: http://gerrit.cloudera.org:8080/#/c/10347/2/bin/diagnostics/collect_diagnostics.py@450 PS2, Line 450: # collection_root_dir is an absoulte path. There is no point in preserving its > nit: its Done http://gerrit.cloudera.org:8080/#/c/10347/2/bin/diagnostics/collect_diagnostics.py@451 PS2, Line 451: # entire directory structure in the archive, so set the arcname accordingly. > nit: "archive, so set" Done -- To view, visit http://gerrit.cloudera.org:8080/10347 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I540f6c228a0315780d45cf11961f124478b5dd0c Gerrit-Change-Number: 10347 Gerrit-PatchSet: 3 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Thu, 10 May 2018 05:49:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Fix diagnostics path to not include the parent dir structure
Hello Lars Volker, Philip Zeyliger, Kim Jin Chul, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10347 to look at the new patch set (#3). Change subject: Fix diagnostics path to not include the parent dir structure .. Fix diagnostics path to not include the parent dir structure Without the fix, the diagnostics tar file included the entire directory structure of the diagnostics root dir. Before: === $ tar tf /tmp/impala-diagnostics-2018-05-08-11-59-39-spv8Eh.tar.gz tmp/impala-diagnostics-2018-05-08-11-59-39-spv8Eh/ tmp/impala-diagnostics-2018-05-08-11-59-39-spv8Eh/stacks/ tmp/impala-diagnostics-2018-05-08-11-59-39-spv8Eh/stacks/jstack-0.txt After: = $ tar tf /tmp/impala-diagnostics-2018-05-08-12-01-51-Y0nlQI.tar.gz impala-diagnostics-2018-05-08-12-01-51-Y0nlQI/ impala-diagnostics-2018-05-08-12-01-51-Y0nlQI/stacks/ impala-diagnostics-2018-05-08-12-01-51-Y0nlQI/stacks/jstack-0.txt . Tested with python 2.6 Change-Id: I540f6c228a0315780d45cf11961f124478b5dd0c --- M bin/diagnostics/collect_diagnostics.py 1 file changed, 4 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/10347/3 -- To view, visit http://gerrit.cloudera.org:8080/10347 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I540f6c228a0315780d45cf11961f124478b5dd0c Gerrit-Change-Number: 10347 Gerrit-PatchSet: 3 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR] Fix diagnostics path to not include the parent dir structure
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/10347 ) Change subject: Fix diagnostics path to not include the parent dir structure .. Patch Set 2: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/10347/2/bin/diagnostics/collect_diagnostics.py File bin/diagnostics/collect_diagnostics.py: http://gerrit.cloudera.org:8080/#/c/10347/2/bin/diagnostics/collect_diagnostics.py@450 PS2, Line 450: # collection_root_dir is an absoulte path. There is no point in preserving it's nit: its http://gerrit.cloudera.org:8080/#/c/10347/2/bin/diagnostics/collect_diagnostics.py@451 PS2, Line 451: # entire directory structure in the archive. So set the arcname accordingly. nit: "archive, so set" -- To view, visit http://gerrit.cloudera.org:8080/10347 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I540f6c228a0315780d45cf11961f124478b5dd0c Gerrit-Change-Number: 10347 Gerrit-PatchSet: 2 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Thu, 10 May 2018 04:27:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Fix diagnostics path to not include the parent dir structure
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/10347 ) Change subject: Fix diagnostics path to not include the parent dir structure .. Patch Set 2: Kim Jun Chul, If you want to discuss a policy change about requiring JIRA mentions, it seems like the dev mailing list may be a better venue than individual code reviews. -- To view, visit http://gerrit.cloudera.org:8080/10347 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I540f6c228a0315780d45cf11961f124478b5dd0c Gerrit-Change-Number: 10347 Gerrit-PatchSet: 2 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Thu, 10 May 2018 04:24:45 + Gerrit-HasComments: No
[Impala-ASF-CR] test-with-docker: work with git worktree
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/10335 ) Change subject: test-with-docker: work with git worktree .. Patch Set 1: For small changes, I think we think it's fine to avoid the paperwork. I looked, and I'm not the only person who does it. -- To view, visit http://gerrit.cloudera.org:8080/10335 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9186e0b6f068aacc25f8d691508165c04329fa8b Gerrit-Change-Number: 10335 Gerrit-PatchSet: 1 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Thu, 10 May 2018 04:22:50 + Gerrit-HasComments: No
[Impala-ASF-CR] Remove IMPALA THRIFT JAVA VERSION and untested Darwin Thrift versions.
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/10361 ) Change subject: Remove IMPALA_THRIFT_JAVA_VERSION and untested Darwin Thrift versions. .. Patch Set 1: In terms of additional color, I've heard from multiple folks that the environment variables are a major pain. It makes it hard to get started in Impala, harder to get tools set up just right, etc. One of the motivations here is that I've run into tools that parse pom.xml files but are confused by variable references. This is a drop in a big ocean, but I think it's directionally reasonable to work towards fewer things in impala-config.sh. -- To view, visit http://gerrit.cloudera.org:8080/10361 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I98f8c0b344afd001864653659c045b5d3c3e94ac Gerrit-Change-Number: 10361 Gerrit-PatchSet: 1 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Thu, 10 May 2018 04:17:38 + Gerrit-HasComments: No
[Impala-ASF-CR] Remove IMPALA THRIFT JAVA VERSION and untested Darwin Thrift versions.
Philip Zeyliger has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10361 Change subject: Remove IMPALA_THRIFT_JAVA_VERSION and untested Darwin Thrift versions. .. Remove IMPALA_THRIFT_JAVA_VERSION and untested Darwin Thrift versions. The singular use of IMPALA_THRIFT_JAVA_VERSION was in impala-parent/pom.xml. We can reduce complexity by just inlining the version there, like we do with several other Java dependencies. Meanwhile, with the upgrade to Thrift 0.9.3, it doesn't make sense to retain the Darwin path for Thrift 0.9.2. The reality is likely that nobody is building Impala on Darwin. Change-Id: I98f8c0b344afd001864653659c045b5d3c3e94ac --- M bin/impala-config.sh M impala-parent/pom.xml 2 files changed, 1 insertion(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/10361/1 -- To view, visit http://gerrit.cloudera.org:8080/10361 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I98f8c0b344afd001864653659c045b5d3c3e94ac Gerrit-Change-Number: 10361 Gerrit-PatchSet: 1 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR](2.x) IMPALA-6948,IMPALA-6962: add end-to-end tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10363 ) Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: comment Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 Gerrit-Change-Number: 10363 Gerrit-PatchSet: 1 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 10 May 2018 03:39:12 + Gerrit-HasComments: No
[Impala-ASF-CR](2.x) IMPALA-6948,IMPALA-6962: add end-to-end tests
Impala Public Jenkins has uploaded a new patch set (#2) to the change originally created by Vuk Ercegovac. ( http://gerrit.cloudera.org:8080/10363 ) Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests .. IMPALA-6948,IMPALA-6962: add end-to-end tests Adds end-to-end tests to validate that following various metadata operations, the catalog state in catalogd and impalads is the same. For IMPALA-6962, catalogd process restart for tests is fixed. Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 Reviewed-on: http://gerrit.cloudera.org:8080/10291 Reviewed-by: Alex Behm Tested-by: Impala Public Jenkins --- M be/src/catalog/catalog-server.cc M tests/common/impala_cluster.py M tests/common/impala_service.py A tests/custom_cluster/test_metadata_replicas.py M tests/metadata/test_hms_integration.py A tests/util/hive_utils.py 6 files changed, 242 insertions(+), 68 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/10363/2 -- To view, visit http://gerrit.cloudera.org:8080/10363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 Gerrit-Change-Number: 10363 Gerrit-PatchSet: 2 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR](2.x) IMPALA-6948,IMPALA-6962: add end-to-end tests
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10363 ) Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests .. IMPALA-6948,IMPALA-6962: add end-to-end tests Adds end-to-end tests to validate that following various metadata operations, the catalog state in catalogd and impalads is the same. For IMPALA-6962, catalogd process restart for tests is fixed. Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 Reviewed-on: http://gerrit.cloudera.org:8080/10291 Reviewed-by: Alex Behm Tested-by: Impala Public Jenkins --- M be/src/catalog/catalog-server.cc M tests/common/impala_cluster.py M tests/common/impala_service.py A tests/custom_cluster/test_metadata_replicas.py M tests/metadata/test_hms_integration.py A tests/util/hive_utils.py 6 files changed, 242 insertions(+), 68 deletions(-) -- To view, visit http://gerrit.cloudera.org:8080/10363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: merged Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 Gerrit-Change-Number: 10363 Gerrit-PatchSet: 2 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/10158 ) Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify state .. Patch Set 16: Code-Review+1 (3 comments) http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@522 PS15, Line 522: // TODO: is this needed? This will also happen on the "backend" side of cancel rpc. : // And in the case of EOS, sink already knows this. : if (coord_sink_ != nullptr) coord_sink_->CloseConsumer(); > Yeah, I agree, but would prefer to do it in an independent commit in case t makes sense. http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@943 PS15, Line 943: e : backend_states_) { > In principle, I agree that's how we'd want it to work. But given the way Cl sounds good http://gerrit.cloudera.org:8080/#/c/10158/16/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/10158/16/be/src/runtime/coordinator.cc@132 PS16, Line 132: nit: extra space -- To view, visit http://gerrit.cloudera.org:8080/10158 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e Gerrit-Change-Number: 10158 Gerrit-PatchSet: 16 Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 10 May 2018 02:16:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6972: Disable parallel dataload on MINICLUSTER PROFILE=2
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10306 ) Change subject: IMPALA-6972: Disable parallel dataload on MINICLUSTER_PROFILE=2 .. IMPALA-6972: Disable parallel dataload on MINICLUSTER_PROFILE=2 There is a Hive bug in Hive 1.1.0 that can result in a NullPointerException when doing parallel Hive operations (see IMPALA-6532). Since dataload goes parallel on Hive loads starting with IMPALA-6372, dataload can hit this error on Hive 1.1.0 (i.e. IMPALA_MINICLUSTER_PROFILE=2). This is impacting builds on the 2.x branch. This disables parallel dataload for IMPALA_MINICLUSTER_PROFILE=2. IMPALA_MINICLUSTER_PROFILE=3 uses a newer version of Hive that has a fix for this, so this continues to use parallel dataload for that case. Parallelism can be reenabled when Hive 1.1.0 gets the fix from Hive 2.1.1. Change-Id: I90a0f2b3756d7192fa7db2958031b8c88eb606e6 Reviewed-on: http://gerrit.cloudera.org:8080/10306 Reviewed-by: Philip Zeyliger Tested-by: Impala Public Jenkins --- M bin/impala-config.sh M bin/load-data.py M testdata/bin/create-load-data.sh 3 files changed, 11 insertions(+), 1 deletion(-) Approvals: Philip Zeyliger: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/10306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I90a0f2b3756d7192fa7db2958031b8c88eb606e6 Gerrit-Change-Number: 10306 Gerrit-PatchSet: 5 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR] IMPALA-6972: Disable parallel dataload on MINICLUSTER PROFILE=2
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10306 ) Change subject: IMPALA-6972: Disable parallel dataload on MINICLUSTER_PROFILE=2 .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I90a0f2b3756d7192fa7db2958031b8c88eb606e6 Gerrit-Change-Number: 10306 Gerrit-PatchSet: 4 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Thu, 10 May 2018 01:30:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3833: Fix invalid data handling in Sequence and RCFile scanners
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8936 ) Change subject: IMPALA-3833: Fix invalid data handling in Sequence and RCFile scanners .. Patch Set 19: Code-Review+2 Will wait for the other prerequisite to be merged. -- To view, visit http://gerrit.cloudera.org:8080/8936 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic9cfc38af3f30c65ada9734eb471dbfa6ecdd74a Gerrit-Change-Number: 8936 Gerrit-PatchSet: 19 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Thu, 10 May 2018 00:48:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10158 ) Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify state .. Patch Set 16: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/10158/16/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/10158/16/be/src/runtime/coordinator.cc@618 PS16, Line 618: currently concurrently -- To view, visit http://gerrit.cloudera.org:8080/10158 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e Gerrit-Change-Number: 10158 Gerrit-PatchSet: 16 Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 10 May 2018 00:47:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] test-with-docker: work with git worktree
Kim Jin Chul has posted comments on this change. ( http://gerrit.cloudera.org:8080/10335 ) Change subject: test-with-docker: work with git worktree .. Patch Set 1: Why don't you put JIRA on the commit msg? -- To view, visit http://gerrit.cloudera.org:8080/10335 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9186e0b6f068aacc25f8d691508165c04329fa8b Gerrit-Change-Number: 10335 Gerrit-PatchSet: 1 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kim Jin Chul Gerrit-Comment-Date: Thu, 10 May 2018 00:34:48 + Gerrit-HasComments: No
[Impala-ASF-CR] test-with-docker: exit properly on failures
Kim Jin Chul has posted comments on this change. ( http://gerrit.cloudera.org:8080/10318 ) Change subject: test-with-docker: exit properly on failures .. Patch Set 1: Why don't you put JIRA on the commit msg? -- To view, visit http://gerrit.cloudera.org:8080/10318 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I285e2f4d07e34898d73beba857e9ac325ed4e6db Gerrit-Change-Number: 10318 Gerrit-PatchSet: 1 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Thu, 10 May 2018 00:34:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10158 ) Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify state .. Patch Set 14: (1 comment) http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@291 PS14, Line 291: ExecState exec_state_ = ExecState::EXECUTING; > I agree it's a little confusing, but there is nothing you can do with this I think the changes to the header comment made it clearer. -- To view, visit http://gerrit.cloudera.org:8080/10158 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e Gerrit-Change-Number: 10158 Gerrit-PatchSet: 14 Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 10 May 2018 00:31:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR](2.x) IMPALA-6948,IMPALA-6962: add end-to-end tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10363 ) Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2440/ -- To view, visit http://gerrit.cloudera.org:8080/10363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: comment Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 Gerrit-Change-Number: 10363 Gerrit-PatchSet: 1 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 10 May 2018 00:24:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10158 ) Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify state .. Patch Set 16: (13 comments) http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.h@322 PS15, Line 322: cu > double space snuck back in Done http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.h@341 PS15, Line 341: ith 'instance_ho > nit: update to "failed_finstance" Done http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.h@414 PS15, Line 414: > nit: extra space Done http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@a115 PS14, Line 115: : > put that back Done http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@121 PS14, Line 121: may call any > should we call this a non-OK status, because for cancellation through Cance Done http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@131 PS14, Line 131: > do you mean cancels? I think I meant to delete that. http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@147 PS15, Line 147: Status prepare_status = query_state_->WaitForPrepare(); : DCHECK(!prepare_status.ok()); : return UpdateExecState > should we call UpdateExecStateIfError() here? yes, done. http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@486 PS15, Line 486: !status.ok() && exec_status_.IsCancelled() > can we have a case with = ? Ideally no. but I think the backend can sometimes do it due to complications in the error propagation path. I would like to get to a point where this can't be cancelled, but I don't think we're there yet. http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@522 PS15, Line 522: // TODO: is this needed? This will also happen on the "backend" side of cancel rpc. : // And in the case of EOS, sink already knows this. : if (coord_sink_ != nullptr) coord_sink_->CloseConsumer(); > I dont think we need this, I looked at why we originally needed this and it Yeah, I agree, but would prefer to do it in an independent commit in case there is some unforeseen issue. http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@530 PS15, Line 530: : } else { > in case of cancelBackends(), should we also wait for resources to be freed? I think the argument can be made in both directions, but we eventually settled on the current behavior. The reasoning is documented in the header comment for ReleaseAdmissionControlResources(). I'll update this todo as to not add confusion. http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@605 PS15, Line 605: > maybe change the name to something like UpdateExecState, since this is also The "if error" was meant to say that the exec state is only updated if the status is an not-ok and only to move into an error state (as opposed to, say, cancell state). But I can see the confusion so i'll change it since you feel it'd be clearer. I guess the header comment should clarify that this shouldn't be used for cancellation. http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@639 PS15, Line 639: DCHECK(exec_rpcs_complete_barrier_ != nullptr && : exec_rpcs_complete_barrier_->pending() <= 0) << "Exec() must be called first"; : discard_result(SetNonErrorTerminalSt > does this mean we can call cancel before Exec() completes? what if cancel i Yeah, this is bogus. Fixed. The comments in the header already say Exec() needs to precede this. http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@943 PS15, Line 943: e : backend_states_) { > Do we expect these methods to be called before Exec() completes? If not, we In principle, I agree that's how we'd want it to work. But given the way ClientRequestState currently works and that things like the webserver just reach in and get the coord() object, I think they can currently call these webserver support rourtines before Exec() completes. How about I revisit this after your patch is merged? -- To view, visit http://gerrit.cloudera.org:8080/10158 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e Gerrit-Change-Number: 10158 Gerrit-PatchSet: 16 Gerrit-Owner: Dan Hecht Ger
[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state
Hello Michael Ho, Sailesh Mukil, Tim Armstrong, Bikramjeet Vig, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10158 to look at the new patch set (#16). Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify state .. IMPALA-5384, part 2: Simplify Coordinator locking and clarify state The is the final change to clarify and break up the Coordinator's lock. The state machine for the coordinator is made explicit, distinguishing between executing state and multiple terminal states. Logic to transition into a terminal state is centralized in one location and executes exactly once for each coordinator object. Derived from a patch for IMPALA_5384 by Marcel Kornacker. Testing: - exhaustive functional tests - stress test on minicluster with memory overcommitment. Verified from the logs that this exercises all these paths: - successful queries - client requested cancellation - error from exec FInstances RPC - error reported asynchronously via report status RPC - eos before backend execution completed Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e --- M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/service/client-request-state.cc M be/src/service/impala-server.h M be/src/util/counting-barrier.h 6 files changed, 389 insertions(+), 391 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/10158/16 -- To view, visit http://gerrit.cloudera.org:8080/10158 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e Gerrit-Change-Number: 10158 Gerrit-PatchSet: 16 Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR](2.x) IMPALA-6948,IMPALA-6962: add end-to-end tests
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/10363 ) Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: comment Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 Gerrit-Change-Number: 10363 Gerrit-PatchSet: 1 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Comment-Date: Thu, 10 May 2018 00:15:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6923:Update scripts in benchmark folder to store workload and few minor updates
Kim Jin Chul has posted comments on this change. ( http://gerrit.cloudera.org:8080/10100 ) Change subject: IMPALA-6923:Update scripts in benchmark folder to store workload and few minor updates .. Patch Set 8: (3 comments) http://gerrit.cloudera.org:8080/#/c/10100/8//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10100/8//COMMIT_MSG@7 PS8, Line 7: Update Please add a space in front of this. http://gerrit.cloudera.org:8080/#/c/10100/8/tests/benchmark/report_benchmark_results.py File tests/benchmark/report_benchmark_results.py: http://gerrit.cloudera.org:8080/#/c/10100/8/tests/benchmark/report_benchmark_results.py@141 PS8, Line 141: Dont I think either "Don't" or "Do not" is better than Dont. This is used for user's helper message. http://gerrit.cloudera.org:8080/#/c/10100/8/tests/benchmark/report_benchmark_results.py@1058 PS8, Line 1058: if exec_summaries[0] is None: : return "" I think this is a redundant special handling. I believe L1054-1055 can be merged into the L1058-1059 because there is no symbol dependent issue. e.g.) if options.hive_results or exec_summaries[0] is None: return "" -- To view, visit http://gerrit.cloudera.org:8080/10100 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ica7c00ad59963d466bae9e607a4692af0138962c Gerrit-Change-Number: 10100 Gerrit-PatchSet: 8 Gerrit-Owner: Nithya Janarthanan Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Nithya Janarthanan Gerrit-Comment-Date: Thu, 10 May 2018 00:12:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6997: Avoid redundant dumping in SetMemLimitExceeded()
Joe McDonnell has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10364 Change subject: IMPALA-6997: Avoid redundant dumping in SetMemLimitExceeded() .. IMPALA-6997: Avoid redundant dumping in SetMemLimitExceeded() When a UDF hits a MemLimitExceeded, the query does not immediately abort. Instead, UDFs rely on the caller checking the query_status_ periodically. This means that on some codepaths, UDFs can call SetMemLimitExceeded() many times (e.g. once per row) before the query fragment exits. RuntimeState::SetMemLimitExceeded() currently constructs a MemLimitExceeded Status and dumps it for each call, even if the query has already hit an error. This is expensive and can delay an fragment from exiting when UDFs are repeatedly hitting MemLimitExceeded. This changes SetMemLimitExceeded() to avoid dumping if the query_status_ is already not ok. Change-Id: I92b87f370a68a2f695ebbc2520a98dd143730701 --- M be/src/runtime/runtime-state.cc 1 file changed, 7 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/10364/1 -- To view, visit http://gerrit.cloudera.org:8080/10364 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I92b87f370a68a2f695ebbc2520a98dd143730701 Gerrit-Change-Number: 10364 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell
[Impala-ASF-CR](2.x) IMPALA-6948,IMPALA-6962: add end-to-end tests
Vuk Ercegovac has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10363 Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests .. IMPALA-6948,IMPALA-6962: add end-to-end tests Adds end-to-end tests to validate that following various metadata operations, the catalog state in catalogd and impalads is the same. For IMPALA-6962, catalogd process restart for tests is fixed. Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 Reviewed-on: http://gerrit.cloudera.org:8080/10291 Reviewed-by: Alex Behm Tested-by: Impala Public Jenkins (cherry picked from commit 28c1f76529b79d437b66a80b954ec227e0ddc6cd) --- M be/src/catalog/catalog-server.cc M tests/common/impala_cluster.py M tests/common/impala_service.py A tests/custom_cluster/test_metadata_replicas.py M tests/metadata/test_hms_integration.py A tests/util/hive_utils.py 6 files changed, 242 insertions(+), 68 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/10363/1 -- To view, visit http://gerrit.cloudera.org:8080/10363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: newchange Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 Gerrit-Change-Number: 10363 Gerrit-PatchSet: 1 Gerrit-Owner: Vuk Ercegovac
[Impala-ASF-CR] Fix diagnostics path to not include the parent dir structure
Kim Jin Chul has posted comments on this change. ( http://gerrit.cloudera.org:8080/10347 ) Change subject: Fix diagnostics path to not include the parent dir structure .. Patch Set 2: Why don't you put JIRA in your commit message? As far as I know, most changes include JIRA in the commit message. But some of the merged changes did not exist in JIRA. I think it's a good idea to include JIRA consistently even with small changes. What do you think? -- To view, visit http://gerrit.cloudera.org:8080/10347 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I540f6c228a0315780d45cf11961f124478b5dd0c Gerrit-Change-Number: 10347 Gerrit-PatchSet: 2 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Wed, 09 May 2018 23:36:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10060 ) Change subject: IMPALA-5216: Make admission control queuing async .. Patch Set 7: (27 comments) I need to dig into the details some more but some initial comments/questions. http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.h File be/src/scheduling/admission-controller.h: http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.h@44 PS7, Line 44: ClientRequestState::admit_outcome_ if this enum is meant to be consumed publicly, then it's best to not refer to this private variable when explaining it. let's try to explain it in terms of the public portion of the AdmissionController abstraction. http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.h@201 PS7, Line 201: admit_status admit_outcome? http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.h@204 PS7, Line 204: admit_status admit_outcome http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h File be/src/service/client-request-state.h: http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h@72 PS7, Line 72: /// Must *not* be called with lock_ held. given that the coordinator object is (unfortunately) current exposed by this class, let's document when, and in which cases, that gets created. http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h@114 PS7, Line 114: non-error states (RUNNING_STATE and FINISHED_STATE) is PENDING_STATE one of those? http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h@152 PS7, Line 152: coord_ let's try not to talk about private members when documenting the public interface (so that we can try to make sense of the abstractions without knowing all the implementation details). http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h@172 PS7, Line 172: Coordinator* coord() const { : return coord_exec_started_.Load() ? coord_.get() : nullptr; : } it's confusing that coord() is not actually an accessor given it's name. If we really need this then we should name it GetCoord() or something. But, it seems weird that we need this. We really need to cleanup the CRS states, but in the mean time, why can't we just defer setting coord_ until after coord->Exec() returns (and maybe hold the lock while setting coord_ if that makes the synchronization simpler)? http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h@323 PS7, Line 323: Exec() has been successfully called on it why do we bother creating the coordinator object ahead of when we call Exec()? i.e. what use is it before Exec() is called? http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-hs2-server.cc File be/src/service/impala-hs2-server.cc: http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-hs2-server.cc@448 PS7, Line 448: DCHECK(status.ok() || !request_state->uses_admission_control()); why is that the case, and why does this code care? http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-hs2-server.cc@464 PS7, Line 464: if (!status.ok()) goto return_error; if we hit an error here, how does the WaitAsync thread finish? http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-http-handler.cc File be/src/service/impala-http-handler.cc: http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-http-handler.cc@769 PS7, Line 769: == nullptr not your change, and let's not fix it here, but given that the coord_ ptr is set outside of holding the lock, these checks aren't really technically safe. This is another example of why we should tighten up the specification of the CRS states. http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-http-handler.cc@769 PS7, Line 769: uses_admission_control( should that check for PENDING_STATE instead, to be more specific? http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-http-handler.cc@778 PS7, Line 778: request_state->coord() != nullptr this would also be more about the states, but okay to leave it for now. http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-http-handler.cc@779 PS7, Line 779: request_state->coord()->GetTExecSummary(&summary); I guess PrintExecSummary() is okay with an empty summary? have you tested that? I think previously the summary would at least be initialized (though given that the CRS lock isn't held across setting the coord_ ptr and calling Exec(), there was a small window previously that could also lead to this empty case). http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/i
[Impala-ASF-CR] IMPALA-7003: Deflake erasure coding data loading
Tianyi Wang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10362 Change subject: IMPALA-7003: Deflake erasure coding data loading .. IMPALA-7003: Deflake erasure coding data loading Erasure coding data loading is flaky in two ways: 1. HBase sometimes doesn't work because of HBase-19369 2. Nested data loading sometimes fails because the HDFS namenode cannot find enough good datanodes. For problem 1, this patch enables erasure coding only on /test-warehouse directory. For problem 2, this patch sets dfs.namenode.redundancy.considerLoad to false, preventing namenode to exclude heavily-loaded datanodes. Change-Id: I219106cd3ec7ffab7a834700f2a722b165e5f66c --- M bin/impala-config.sh M testdata/bin/create-load-data.sh M testdata/bin/load-test-warehouse-snapshot.sh M testdata/bin/setup-hdfs-env.sh M testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl 5 files changed, 20 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/10362/1 -- To view, visit http://gerrit.cloudera.org:8080/10362 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I219106cd3ec7ffab7a834700f2a722b165e5f66c Gerrit-Change-Number: 10362 Gerrit-PatchSet: 1 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tianyi Wang
[Impala-ASF-CR] IMPALA-6819: Add new queries to targeted-perf workload
David Knupp has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9979 ) Change subject: IMPALA-6819: Add new queries to targeted-perf workload .. IMPALA-6819: Add new queries to targeted-perf workload Description: Adding new queries to the targeted-perf workload that is used by Impala performance tests run via $IMPALA_HOME/bin/run-workload.py Testing: Ran the performance tests for the targeted-perf workload and all the tests passed Change-Id: I5c415924d0bb6da1b1f5df6cb16b95a1d2eaa3ab Reviewed-on: http://gerrit.cloudera.org:8080/9979 Tested-by: Impala Public Jenkins Reviewed-by: David Knupp --- A testdata/workloads/targeted-perf/queries/primitive_conjunct_ordering_1.test A testdata/workloads/targeted-perf/queries/primitive_conjunct_ordering_2.test A testdata/workloads/targeted-perf/queries/primitive_conjunct_ordering_3.test A testdata/workloads/targeted-perf/queries/primitive_conjunct_ordering_4.test A testdata/workloads/targeted-perf/queries/primitive_count_star.test A testdata/workloads/targeted-perf/queries/primitive_decimal_arithmetic.test A testdata/workloads/targeted-perf/queries/primitive_filter_bigint_in_list.test A testdata/workloads/targeted-perf/queries/primitive_intrinsic_appx_median.test A testdata/workloads/targeted-perf/queries/primitive_intrinsic_to_date.test M testdata/workloads/targeted-perf/queries/primitive_long_predicate.test A testdata/workloads/targeted-perf/queries/primitive_many_fragments.test A testdata/workloads/targeted-perf/queries/primitive_many_independent_fragments.test A testdata/workloads/targeted-perf/queries/primitive_orderby_bigint_expression.test A testdata/workloads/targeted-perf/queries/primitive_shuffle_1mb_rows.test 14 files changed, 675 insertions(+), 171 deletions(-) Approvals: Impala Public Jenkins: Verified David Knupp: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/9979 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I5c415924d0bb6da1b1f5df6cb16b95a1d2eaa3ab Gerrit-Change-Number: 9979 Gerrit-PatchSet: 6 Gerrit-Owner: Nithya Janarthanan Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Mostafa Mokhtar
[Impala-ASF-CR] IMPALA-6819: Add new queries to targeted-perf workload
David Knupp has posted comments on this change. ( http://gerrit.cloudera.org:8080/9979 ) Change subject: IMPALA-6819: Add new queries to targeted-perf workload .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9979 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5c415924d0bb6da1b1f5df6cb16b95a1d2eaa3ab Gerrit-Change-Number: 9979 Gerrit-PatchSet: 5 Gerrit-Owner: Nithya Janarthanan Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Comment-Date: Wed, 09 May 2018 23:07:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6987: [DOCS] Update when INVALIDATE METADATA is required
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/10339 ) Change subject: IMPALA-6987: [DOCS] Update when INVALIDATE METADATA is required .. Patch Set 1: (16 comments) http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml File docs/topics/impala_invalidate_metadata.xml: http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@47 PS1, Line 47: relatively > replace: very Done http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@48 PS1, Line 48: in the common scenario of adding new data files to an existing table > replace: whenever possible (link to INVALIDATE vs. REFRESH usage page, if e Replaced with "whenever possible" No link to add, though. http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@59 PS1, Line 59: By default > replace: If there is no table specified Done http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@60 PS1, Line 60: Even for a single table, INVALIDATE METADATA is more expensive : than REFRESH, so prefer REFRESH in the common case where you add new data : files for an existing table. > Same thing mentioned ~10 lines above - remove? Removed http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@69 PS1, Line 69: Therefore, if some other entity modifies information used by Impala in the metastore : that Impala and Hive share, the information cached by Impala must be updated. However, this does not mean : that all metadata updates require an Impala update. > This is vague, explicitly state when is manual invalidate needed (see L100- Done http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@74 PS1, Line 74: : : : In Impala 1.2 and higher, a dedicated daemon (catalogd) broadcasts DDL changes made : through Impala to all Impala nodes. Formerly, after you created a database or table while connected to one : Impala node, you needed to issue an INVALIDATE METADATA statement on another Impala node : before accessing the new database or table from the other node. Now, newly created or altered objects are : picked up automatically by all Impala nodes. You must still use the INVALIDATE METADATA : technique after creating or altering objects through Hive. See : for more information on the catalog service. : : : The INVALIDATE METADATA statement is new in Impala 1.1 and higher, and takes over some of : the use cases of the Impala 1.0 REFRESH statement. Because REFRESH now : requires a table name parameter, to flush the metadata for all tables at once, use the INVALIDATE : METADATA statement. : : : > This section is very outdated and mixes up usage of old vs. new usage of RE Done http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@99 PS1, Line 99: instance > Don't mention individual instances in this context, it's the service as a w Done http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@100 PS1, Line 100: required if a change is made from another impalad : instance in your cluster, or through Hive and is distributed by : catalogd. > INVALIDATE METADATA is required when: Done http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@106 PS1, Line 106: same Impala node > This is pre-1.2 information. No INVALIDATE is needed as long as the changes Done http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@127 PS1, Line 127: INVALIDATE METADATA causes the metadata for that table to be marked as stale, and reloaded : the next time the table is referenced. For a huge table, that process could take a noticeable amount of time; > Repeat of L44-47 The whole paragraph is replaced. http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@129 PS1, Line 129: thus you might prefer to use REFRESH where practical, to avoid an unpredictable delay later, : for example if the next reference to the table is during a benchmark test. > Key information missing: use REFRESH after invalidating a specific table to Done http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@137 PS1, Line 137: (such as SequenceFile or HBase tables) > remove Done http://gerrit.cloudera.o
[Impala-ASF-CR] IMPALA-6923:Update scripts in benchmark folder to store workload and few minor updates
Nithya Janarthanan has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/10100 ) Change subject: IMPALA-6923:Update scripts in benchmark folder to store workload and few minor updates .. IMPALA-6923:Update scripts in benchmark folder to store workload and few minor updates Updates to the scripts in benchmark folder to - Store test execution results to ExecutionResults table - Store workload metrics to WorkloadMetrics - Option to mark a performance run as official - Option to skip saving of profiles Testing: Executed these script to create the database and populate the tables with the above mentioned data Change-Id: Ica7c00ad59963d466bae9e607a4692af0138962c --- M tests/benchmark/create_database.py M tests/benchmark/perf_result_datastore.py M tests/benchmark/report_benchmark_results.py 3 files changed, 252 insertions(+), 74 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/10100/8 -- To view, visit http://gerrit.cloudera.org:8080/10100 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ica7c00ad59963d466bae9e607a4692af0138962c Gerrit-Change-Number: 10100 Gerrit-PatchSet: 8 Gerrit-Owner: Nithya Janarthanan Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Nithya Janarthanan
[Impala-ASF-CR] IMPALA-6948,IMPALA-6962: add end-to-end tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10291 ) Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 Gerrit-Change-Number: 10291 Gerrit-PatchSet: 6 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 09 May 2018 22:27:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6948,IMPALA-6962: add end-to-end tests
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10291 ) Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests .. IMPALA-6948,IMPALA-6962: add end-to-end tests Adds end-to-end tests to validate that following various metadata operations, the catalog state in catalogd and impalads is the same. For IMPALA-6962, catalogd process restart for tests is fixed. Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 Reviewed-on: http://gerrit.cloudera.org:8080/10291 Reviewed-by: Alex Behm Tested-by: Impala Public Jenkins --- M be/src/catalog/catalog-server.cc M tests/common/impala_cluster.py M tests/common/impala_service.py A tests/custom_cluster/test_metadata_replicas.py M tests/metadata/test_hms_integration.py A tests/util/hive_utils.py 6 files changed, 242 insertions(+), 68 deletions(-) Approvals: Alex Behm: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/10291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 Gerrit-Change-Number: 10291 Gerrit-PatchSet: 7 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6972: Disable parallel dataload on MINICLUSTER PROFILE=2
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10306 ) Change subject: IMPALA-6972: Disable parallel dataload on MINICLUSTER_PROFILE=2 .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2437/ -- To view, visit http://gerrit.cloudera.org:8080/10306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I90a0f2b3756d7192fa7db2958031b8c88eb606e6 Gerrit-Change-Number: 10306 Gerrit-PatchSet: 4 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Wed, 09 May 2018 21:59:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6972: Disable parallel dataload on MINICLUSTER PROFILE=2
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/10306 ) Change subject: IMPALA-6972: Disable parallel dataload on MINICLUSTER_PROFILE=2 .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I90a0f2b3756d7192fa7db2958031b8c88eb606e6 Gerrit-Change-Number: 10306 Gerrit-PatchSet: 4 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Wed, 09 May 2018 21:11:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6972: Disable parallel dataload on MINICLUSTER PROFILE=2
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/10306 ) Change subject: IMPALA-6972: Disable parallel dataload on MINICLUSTER_PROFILE=2 .. Patch Set 4: Fixed variable initialization. Tested dataload locally. -- To view, visit http://gerrit.cloudera.org:8080/10306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I90a0f2b3756d7192fa7db2958031b8c88eb606e6 Gerrit-Change-Number: 10306 Gerrit-PatchSet: 4 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Wed, 09 May 2018 21:02:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6972: Disable parallel dataload on MINICLUSTER PROFILE=2
Hello Philip Zeyliger, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10306 to look at the new patch set (#4). Change subject: IMPALA-6972: Disable parallel dataload on MINICLUSTER_PROFILE=2 .. IMPALA-6972: Disable parallel dataload on MINICLUSTER_PROFILE=2 There is a Hive bug in Hive 1.1.0 that can result in a NullPointerException when doing parallel Hive operations (see IMPALA-6532). Since dataload goes parallel on Hive loads starting with IMPALA-6372, dataload can hit this error on Hive 1.1.0 (i.e. IMPALA_MINICLUSTER_PROFILE=2). This is impacting builds on the 2.x branch. This disables parallel dataload for IMPALA_MINICLUSTER_PROFILE=2. IMPALA_MINICLUSTER_PROFILE=3 uses a newer version of Hive that has a fix for this, so this continues to use parallel dataload for that case. Parallelism can be reenabled when Hive 1.1.0 gets the fix from Hive 2.1.1. Change-Id: I90a0f2b3756d7192fa7db2958031b8c88eb606e6 --- M bin/impala-config.sh M bin/load-data.py M testdata/bin/create-load-data.sh 3 files changed, 11 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/10306/4 -- To view, visit http://gerrit.cloudera.org:8080/10306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I90a0f2b3756d7192fa7db2958031b8c88eb606e6 Gerrit-Change-Number: 10306 Gerrit-PatchSet: 4 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/10158 ) Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify state .. Patch Set 15: (11 comments) http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.h@341 PS15, Line 341: failed_fragment' nit: update to "failed_finstance" http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.h@414 PS15, Line 414: nit: extra space http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@121 PS14, Line 121: s not thread should we call this a non-OK status, because for cancellation through Cancel() we should not call it error http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@131 PS14, Line 131: tate_ do you mean cancels? http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@147 PS15, Line 147: Status prepare_status = query_state_->WaitForPrepare(); : DCHECK(!prepare_status.ok()); : return prepare_status; should we call UpdateExecStateIfError() here? http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@486 PS15, Line 486: !status.ok() && exec_status_.IsCancelled() can we have a case with = ? http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@522 PS15, Line 522: // TODO: is this needed? This will also happen on the "backend" side of cancel rpc. : // And in the case of EOS, sink already knows this. : if (coord_sink_ != nullptr) coord_sink_->CloseConsumer(); I dont think we need this, I looked at why we originally needed this and it was because the fragment lifecycle was a little different and we needed to to close the sink in case the Prepare phase of the fragments failed. After patch IMPALA-2550, we handle partially-prepared state failure differently and hence dont require this anymore. Moreover, since this was used to close partially-prepared state failure and now if we fail prepare, coord_sink_ is not even set, so this is essentially redundant. http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@530 PS15, Line 530: ensure resources are : // really freed before admitting more queries? in case of cancelBackends(), should we also wait for resources to be freed? http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@605 PS15, Line 605: UpdateExecStateIfError maybe change the name to something like UpdateExecState, since this is also used to only get the overall status and it may or may not be an error status http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@639 PS15, Line 639: // Wait until backends have started executing, otherwise the cancel rpc might arrive : // before the exec rpc. : exec_rpcs_complete_barrier_->Wait(); does this mean we can call cancel before Exec() completes? what if cancel is called before exec_rpcs_complete_barrier_ is initialized and is still nullptr. We should probably just mark this method to only be called after Exec completes because currently that how it is used. http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@943 PS15, Line 943: backend_states_init_lock_ Do we expect these methods to be called before Exec() completes? If not, we can get rid of backend_states_init_lock_ and mark these methods to only be called after Exec() -- To view, visit http://gerrit.cloudera.org:8080/10158 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e Gerrit-Change-Number: 10158 Gerrit-PatchSet: 15 Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 09 May 2018 20:55:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6995: avoid DCHECK in TimestampParse::Parse()
Hello Kim Jin Chul, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10349 to look at the new patch set (#2). Change subject: IMPALA-6995: avoid DCHECK in TimestampParse::Parse() .. IMPALA-6995: avoid DCHECK in TimestampParse::Parse() The bug was that the string's length is checked before trimming leading and trailing spaces instead of afterwards. The bug has been present for a long time but couldn't hit a DCHECK until recently. Testing: Added some backend tests that reproduce the crash. Change-Id: I02a18ffd8893fe74f5830144300f745ce31477b1 --- M be/src/exprs/expr-test.cc M be/src/runtime/timestamp-parse-util.cc 2 files changed, 56 insertions(+), 51 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/10349/2 -- To view, visit http://gerrit.cloudera.org:8080/10349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I02a18ffd8893fe74f5830144300f745ce31477b1 Gerrit-Change-Number: 10349 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR](asf-site) [DOCS] Impala doc site update for 3.0
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/10322 ) Change subject: [DOCS] Impala doc site update for 3.0 .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10322 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: asf-site Gerrit-MessageType: comment Gerrit-Change-Id: Icf5927efa7baa965095a3ff2fd4ec7411313342d Gerrit-Change-Number: 10322 Gerrit-PatchSet: 5 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Ambreen Kazi Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 09 May 2018 20:55:18 + Gerrit-HasComments: No
[Impala-ASF-CR](asf-site) [DOCS] Impala doc site update for 3.0
Alex Rodoni has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10322 ) Change subject: [DOCS] Impala doc site update for 3.0 .. [DOCS] Impala doc site update for 3.0 Add Impala docs from branch master. Commit hash: f20415755401d56103df7f16348cea8ed12fb3d8 Change-Id: Icf5927efa7baa965095a3ff2fd4ec7411313342d Reviewed-on: http://gerrit.cloudera.org:8080/10322 Reviewed-by: Michael Brown Tested-by: Alex Rodoni --- A docs/build3x/html/commonltr.css A docs/build3x/html/commonrtl.css A docs/build3x/html/images/impala_arch.jpeg A docs/build3x/html/index.html A docs/build3x/html/topics/impala_abort_on_error.html A docs/build3x/html/topics/impala_adls.html A docs/build3x/html/topics/impala_admin.html A docs/build3x/html/topics/impala_admission.html A docs/build3x/html/topics/impala_aggregate_functions.html A docs/build3x/html/topics/impala_aliases.html A docs/build3x/html/topics/impala_allow_unsupported_formats.html A docs/build3x/html/topics/impala_alter_table.html A docs/build3x/html/topics/impala_alter_view.html A docs/build3x/html/topics/impala_analytic_functions.html A docs/build3x/html/topics/impala_appx_count_distinct.html A docs/build3x/html/topics/impala_appx_median.html A docs/build3x/html/topics/impala_array.html A docs/build3x/html/topics/impala_auditing.html A docs/build3x/html/topics/impala_authentication.html A docs/build3x/html/topics/impala_authorization.html A docs/build3x/html/topics/impala_avg.html A docs/build3x/html/topics/impala_avro.html A docs/build3x/html/topics/impala_batch_size.html A docs/build3x/html/topics/impala_bigint.html A docs/build3x/html/topics/impala_bit_functions.html A docs/build3x/html/topics/impala_boolean.html A docs/build3x/html/topics/impala_breakpad.html A docs/build3x/html/topics/impala_buffer_pool_limit.html A docs/build3x/html/topics/impala_char.html A docs/build3x/html/topics/impala_comments.html A docs/build3x/html/topics/impala_complex_types.html A docs/build3x/html/topics/impala_components.html A docs/build3x/html/topics/impala_compression_codec.html A docs/build3x/html/topics/impala_compute_stats.html A docs/build3x/html/topics/impala_compute_stats_min_sample_size.html A docs/build3x/html/topics/impala_concepts.html A docs/build3x/html/topics/impala_conditional_functions.html A docs/build3x/html/topics/impala_config.html A docs/build3x/html/topics/impala_config_options.html A docs/build3x/html/topics/impala_config_performance.html A docs/build3x/html/topics/impala_connecting.html A docs/build3x/html/topics/impala_conversion_functions.html A docs/build3x/html/topics/impala_count.html A docs/build3x/html/topics/impala_create_database.html A docs/build3x/html/topics/impala_create_function.html A docs/build3x/html/topics/impala_create_role.html A docs/build3x/html/topics/impala_create_table.html A docs/build3x/html/topics/impala_create_view.html A docs/build3x/html/topics/impala_databases.html A docs/build3x/html/topics/impala_datatypes.html A docs/build3x/html/topics/impala_datetime_functions.html A docs/build3x/html/topics/impala_ddl.html A docs/build3x/html/topics/impala_debug_action.html A docs/build3x/html/topics/impala_decimal.html A docs/build3x/html/topics/impala_decimal_v2.html A docs/build3x/html/topics/impala_default_join_distribution_mode.html A docs/build3x/html/topics/impala_default_spillable_buffer_size.html A docs/build3x/html/topics/impala_delegation.html A docs/build3x/html/topics/impala_delete.html A docs/build3x/html/topics/impala_describe.html A docs/build3x/html/topics/impala_development.html A docs/build3x/html/topics/impala_disable_codegen.html A docs/build3x/html/topics/impala_disable_row_runtime_filtering.html A docs/build3x/html/topics/impala_disable_streaming_preaggregations.html A docs/build3x/html/topics/impala_disable_unsafe_spills.html A docs/build3x/html/topics/impala_disk_space.html A docs/build3x/html/topics/impala_distinct.html A docs/build3x/html/topics/impala_dml.html A docs/build3x/html/topics/impala_double.html A docs/build3x/html/topics/impala_drop_database.html A docs/build3x/html/topics/impala_drop_function.html A docs/build3x/html/topics/impala_drop_role.html A docs/build3x/html/topics/impala_drop_stats.html A docs/build3x/html/topics/impala_drop_table.html A docs/build3x/html/topics/impala_drop_view.html A docs/build3x/html/topics/impala_exec_single_node_rows_threshold.html A docs/build3x/html/topics/impala_exec_time_limit_s.html A docs/build3x/html/topics/impala_explain.html A docs/build3x/html/topics/impala_explain_level.html A docs/build3x/html/topics/impala_explain_plan.html A docs/build3x/html/topics/impala_faq.html A docs/build3x/html/topics/impala_file_formats.html A docs/build3x/html/topics/impala_fixed_issues.html A docs/build3x/html/topics/impala_float.html A docs/build3x/html/topics/impala_functions.html A docs/build3x/html/topics/impala_functions_overview.html A docs/build3x/html/topics/impala_grant.html A do
[Impala-ASF-CR] IMPALA-6995: avoid DCHECK in TimestampParse::Parse()
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10349 ) Change subject: IMPALA-6995: avoid DCHECK in TimestampParse::Parse() .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/10349/1/be/src/runtime/timestamp-parse-util.cc File be/src/runtime/timestamp-parse-util.cc: http://gerrit.cloudera.org:8080/#/c/10349/1/be/src/runtime/timestamp-parse-util.cc@399 PS1, Line 399: *d = boost::gregorian::date(); : *t = boost::posix_time::time_duration(boost::posix_time::not_a_date_time); > From a code refactoring perspective, I would suggest you make a new static Created a helper function. I actually ended up refactoring the handling of lengths in this function too in order to better distinguish the different length values. -- To view, visit http://gerrit.cloudera.org:8080/10349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02a18ffd8893fe74f5830144300f745ce31477b1 Gerrit-Change-Number: 10349 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 09 May 2018 20:54:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR](asf-site) [DOCS] Impala doc site update for 3.0
Michael Brown has posted comments on this change. ( http://gerrit.cloudera.org:8080/10322 ) Change subject: [DOCS] Impala doc site update for 3.0 .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10322 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: asf-site Gerrit-MessageType: comment Gerrit-Change-Id: Icf5927efa7baa965095a3ff2fd4ec7411313342d Gerrit-Change-Number: 10322 Gerrit-PatchSet: 5 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Ambreen Kazi Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 09 May 2018 20:53:31 + Gerrit-HasComments: No
[Impala-ASF-CR] Update version to 3.1.0-SNAPSHOT
Sailesh Mukil has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10360 Change subject: Update version to 3.1.0-SNAPSHOT .. Update version to 3.1.0-SNAPSHOT Change-Id: Ia2e2df73d27fa332d17fec4e9aa469ea91b14167 --- M bin/save-version.sh 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/10360/1 -- To view, visit http://gerrit.cloudera.org:8080/10360 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ia2e2df73d27fa332d17fec4e9aa469ea91b14167 Gerrit-Change-Number: 10360 Gerrit-PatchSet: 1 Gerrit-Owner: Sailesh Mukil
[Impala-ASF-CR] IMPALA-6645: Enable disk spill encryption by default
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10345 ) Change subject: IMPALA-6645: Enable disk spill encryption by default .. Patch Set 5: Hmm, hit an assertion when running tests. I wonder if this is CentOS 6 versus Ubuntu Log file created at: 2018/05/09 19:16:57 Running on machine: ip-172-31-32-199 Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg F0509 19:16:57.214826 117722 openssl_util.h:201] Check failed: ERR_peek_error() == 0 (101171331 vs. 0) Expected no pending OpenSSL errors on kudu::Status kudu::security::TlsContext::InitiateHandshake(kudu::security::TlsHandshakeType, kudu::security::TlsHandshake*) const entry, but had: error:0607C083:digital envelope routines:EVP_CIPHER_CTX_ctrl:no cipher set:evp_enc.c:610 error:0607C083:digital envelope routines:EVP_CIPHER_CTX_ctrl:no cipher set:evp_enc.c:610 error:0607C083:digital envelope routines:EVP_CIPHER_CTX_ctrl:no cipher set:evp_enc.c:610 error:0607C083:digital envelope routines:EVP_CIPHER_CTX_ctrl:no cipher set:evp_enc.c:610 error:0607C083:digital envelope routines:EVP_CIPHER_CTX_ctrl:no cipher set:evp_enc.c:610 error:0607C083:digital envelope routines:EVP_CIPHER_CTX_ctrl:no cipher set:evp_enc.c:610 error:0607C083:digital envelope routines:EVP_CIPHER_CTX_ctrl:no cipher set:evp_enc.c:610 error:0607C083:digital envelope routines:EVP_CIPHER_CTX_ctrl:no cipher set:evp_enc.c:610 error:0607C083:digital envelope routines:EVP_CIPHER_CTX_ctrl:no cipher set:evp_enc.c:610 error:0607C083:digital envelope routines:EVP_CIPHER_CTX_ctrl:no cipher set:evp_enc.c:610 error:0607C083:digital envelope routines:EVP_CIPHER_CTX_ctrl:no cipher set:evp_enc.c:610 error:0607C083:digital envelope routines:EVP_CIPHER_CTX_ctrl:no cipher set:evp_enc.c:610 error:0607C083:digital envelope routines:EVP_CIPHER_CTX_ctrl:no cipher set:evp_enc.c:610 error:0607C083:digital envelope routines:EVP_CIPHER_CTX_ctrl:no cipher set:evp_enc.c:610 error:0607C083:digital envelope routines:EVP_CIPHER_CTX_ctrl:no cipher set:evp_enc.c:610 -- To view, visit http://gerrit.cloudera.org:8080/10345 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iee4be2a95d689f66c3663d99e4df0fb3968893a9 Gerrit-Change-Number: 10345 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 09 May 2018 20:36:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6645: Enable disk spill encryption by default
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10345 ) Change subject: IMPALA-6645: Enable disk spill encryption by default .. Patch Set 5: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2434/ -- To view, visit http://gerrit.cloudera.org:8080/10345 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iee4be2a95d689f66c3663d99e4df0fb3968893a9 Gerrit-Change-Number: 10345 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 09 May 2018 20:27:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6802 (part 3): Clean up authorization tests
Fredy Wijaya has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10358 Change subject: IMPALA-6802 (part 3): Clean up authorization tests .. IMPALA-6802 (part 3): Clean up authorization tests The third part of this patch is to rewrite the following authorization tests: - with - union - reset metadata - show Testing: - Added new authorization tests - Ran all front-end tests Change-Id: I9681cc3c7094db33ab7c5caa69b99dd803b908b7 Cherry-picks: not for 2.x --- M fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java 1 file changed, 436 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/10358/1 -- To view, visit http://gerrit.cloudera.org:8080/10358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I9681cc3c7094db33ab7c5caa69b99dd803b908b7 Gerrit-Change-Number: 10358 Gerrit-PatchSet: 1 Gerrit-Owner: Fredy Wijaya
[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/10116 ) Change subject: IMPALA-6131: Track time of last statistics update in metadata .. Patch Set 9: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/10116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5 Gerrit-Change-Number: 10116 Gerrit-PatchSet: 9 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 09 May 2018 19:17:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/10116 ) Change subject: IMPALA-6131: Track time of last statistics update in metadata .. Patch Set 9: The change broke AuthorizationTest#TestDescribeTableResults. I fixed this in patch set 9. -- To view, visit http://gerrit.cloudera.org:8080/10116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5 Gerrit-Change-Number: 10116 Gerrit-PatchSet: 9 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 09 May 2018 19:11:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata
Hello Lars Volker, Zoltan Borok-Nagy, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10116 to look at the new patch set (#9). Change subject: IMPALA-6131: Track time of last statistics update in metadata .. IMPALA-6131: Track time of last statistics update in metadata The timestamp of the last COMPUTE STATS operation is saved to table property "impala.lastComputeStatsTime". The format is the same as in "transient_lastDdlTime", so the two can be compared to check if the schema has changed since computing statistics. Other changes: - Handling of "transient_lastDdlTime" is simplified - the old logic set it to current time + 1, if the old version was >= current time, to ensure that it is always increased by DDL operations. This was useful in the past, as IMPALA-387 used lastDdlTime to check if partition data needs to be reloaded, but since IMPALA-1480, Impala does not rely on lastDdlTime at all. - Computing / setting stats on HDFS tables no longer increases "transient_lastDdlTime". - When Kudu tables are (re)loaded, it is checked if their HMS representation is up to date, and if it is, then IMetaStoreClient.alter_table() is not called. The old logic always called alter_table() after loading metadata from Kudu. This change was needed to ensure that "transient_lastDdlTime" works similarly in HDFS and Kudu tables, and should also make (re)loading Kudu tables faster. Notes: - Kudu will be able to sync its tables to HMS in the near future (see KUDU-2191), so the Kudu metadata handling in Impala may need to be redesigned. Testing: tests/metadata/test_last_ddl_time_update.py is extended by - also checking "impala.lastComputeStatsTime" - testing more SQL statements - tests for Kudu tables Note that test_last_ddl_time_update.py is ran only in exhaustive testing. Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5 --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java M tests/metadata/test_last_ddl_time_update.py 7 files changed, 202 insertions(+), 161 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/10116/9 -- To view, visit http://gerrit.cloudera.org:8080/10116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5 Gerrit-Change-Number: 10116 Gerrit-PatchSet: 9 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-6781: expand ORDER BY in some TPCH queries
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10351 ) Change subject: IMPALA-6781: expand ORDER BY in some TPCH queries .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10351 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If74d127fb57546b1948a34aa6d2e68cdc6880fae Gerrit-Change-Number: 10351 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Brown Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Wed, 09 May 2018 19:06:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6781: expand ORDER BY in some TPCH queries
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10351 ) Change subject: IMPALA-6781: expand ORDER BY in some TPCH queries .. IMPALA-6781: expand ORDER BY in some TPCH queries Fix determinism in TPCH Q3, Q10, Q18 by adding another column to the queries' ORDER BY to guarantee deterministic results. With TPCH 1 these queries were producing differing results across stress test runs. They were all valid, but the LIMIT without the more specific ORDER BY meant that several different result sets were possible. By adding these columns, we sort by a column that has uniqueness across all rows returned. Testing: Repeated runs of these specific TPCH queries via: impala-py.test -k Q18 tests/query_test/test_tpch_queries.py Stress test on a 140-node cluster with TPCH 1 loaded. Previously when using these queries, the stress test would incorrectly report incorrect results. Change-Id: If74d127fb57546b1948a34aa6d2e68cdc6880fae Reviewed-on: http://gerrit.cloudera.org:8080/10351 Reviewed-by: Michael Brown Tested-by: Impala Public Jenkins --- M testdata/workloads/tpch/queries/tpch-q10.test M testdata/workloads/tpch/queries/tpch-q18.test M testdata/workloads/tpch/queries/tpch-q3.test 3 files changed, 12 insertions(+), 6 deletions(-) Approvals: Michael Brown: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/10351 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: If74d127fb57546b1948a34aa6d2e68cdc6880fae Gerrit-Change-Number: 10351 Gerrit-PatchSet: 4 Gerrit-Owner: Michael Brown Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Michael Ho
[Impala-ASF-CR] IMPALA-6948,IMPALA-6962: add end-to-end tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10291 ) Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2435/ -- To view, visit http://gerrit.cloudera.org:8080/10291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 Gerrit-Change-Number: 10291 Gerrit-PatchSet: 6 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 09 May 2018 18:45:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6948,IMPALA-6962: add end-to-end tests
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/10291 ) Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 Gerrit-Change-Number: 10291 Gerrit-PatchSet: 6 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 09 May 2018 18:25:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6948,IMPALA-6962: add end-to-end tests
Hello Dimitris Tsirogiannis, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10291 to look at the new patch set (#6). Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests .. IMPALA-6948,IMPALA-6962: add end-to-end tests Adds end-to-end tests to validate that following various metadata operations, the catalog state in catalogd and impalads is the same. For IMPALA-6962, catalogd process restart for tests is fixed. Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 --- M be/src/catalog/catalog-server.cc M tests/common/impala_cluster.py M tests/common/impala_service.py A tests/custom_cluster/test_metadata_replicas.py M tests/metadata/test_hms_integration.py A tests/util/hive_utils.py 6 files changed, 242 insertions(+), 68 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/10291/6 -- To view, visit http://gerrit.cloudera.org:8080/10291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 Gerrit-Change-Number: 10291 Gerrit-PatchSet: 6 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6948,IMPALA-6962: add end-to-end tests
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10291 ) Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/10291/3/be/src/catalog/catalog-server.cc File be/src/catalog/catalog-server.cc: http://gerrit.cloudera.org:8080/#/c/10291/3/be/src/catalog/catalog-server.cc@304 PS3, Line 304: void CatalogServer::CatalogUrlCallback(const Webserver::ArgumentMap& args, > I just think it's weird that we don't show the dbs if there was an error wi done. makes sense, I was treating it as all-or-nothing. http://gerrit.cloudera.org:8080/#/c/10291/5/tests/custom_cluster/test_metadata_replicas.py File tests/custom_cluster/test_metadata_replicas.py: http://gerrit.cloudera.org:8080/#/c/10291/5/tests/custom_cluster/test_metadata_replicas.py@50 PS5, Line 50: self.client.execute("invalidate metadata functional.alltypes") > check that the catalog version is at least 50 at this point to make debuggi Done -- To view, visit http://gerrit.cloudera.org:8080/10291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 Gerrit-Change-Number: 10291 Gerrit-PatchSet: 5 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 09 May 2018 18:23:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6948,IMPALA-6962: add end-to-end tests
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/10291 ) Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/10291/3/be/src/catalog/catalog-server.cc File be/src/catalog/catalog-server.cc: http://gerrit.cloudera.org:8080/#/c/10291/3/be/src/catalog/catalog-server.cc@304 PS3, Line 304: void CatalogServer::CatalogUrlCallback(const Webserver::ArgumentMap& args, > did it this way to share the error handling and avoid multiple "error" fiel I just think it's weird that we don't show the dbs if there was an error with getting the catalog version. We return in L315 even though the dbs are just fine. Might make debugging slightly harder if something goes wrong. Don't think we need to do something too complicated, just want to be sure we can easily pinpoint the error if something does wrong. http://gerrit.cloudera.org:8080/#/c/10291/5/tests/custom_cluster/test_metadata_replicas.py File tests/custom_cluster/test_metadata_replicas.py: http://gerrit.cloudera.org:8080/#/c/10291/5/tests/custom_cluster/test_metadata_replicas.py@50 PS5, Line 50: self.client.execute("invalidate metadata functional.alltypes") check that the catalog version is at least 50 at this point to make debugging easier if something goes wrong (e.g. we purposely or accidentally change no-op ddls to not increment the version) -- To view, visit http://gerrit.cloudera.org:8080/10291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 Gerrit-Change-Number: 10291 Gerrit-PatchSet: 5 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 09 May 2018 18:08:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6907: Close stale connections to removed cluster members
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/10327 ) Change subject: IMPALA-6907: Close stale connections to removed cluster members .. Patch Set 4: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/10327/3/tests/custom_cluster/test_restart_services.py File tests/custom_cluster/test_restart_services.py: http://gerrit.cloudera.org:8080/#/c/10327/3/tests/custom_cluster/test_restart_services.py@57 PS3, Line 57: def test_restart_impala(self): > Yes, the way the client cache code is structured makes it hard to discern w We technically could introduce a test only function that checks the number of stale connections in the cache, and have a way to call that from here, but I think that's not necessary, and it's going to probably introduce a lot more complexity. So, I think this is fine for now. -- To view, visit http://gerrit.cloudera.org:8080/10327 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I41b7297cf665bf291b09b23524d19b1d10ab281d Gerrit-Change-Number: 10327 Gerrit-PatchSet: 4 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tianyi Wang Gerrit-Comment-Date: Wed, 09 May 2018 18:00:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5842: Write page index in Parquet files
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9693 ) Change subject: IMPALA-5842: Write page index in Parquet files .. Patch Set 16: (20 comments) http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc File be/src/exec/hdfs-parquet-table-writer.cc: http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@1225 PS15, Line 1225: > Switched to ++i. Ah, I had overlooked row_group->columns[i] http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/util/string-util-test.cc File be/src/util/string-util-test.cc: http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/util/string-util-test.cc@33 PS15, Line 33: void EvalTruncation(const string& original, const string& expected_result, > In the test it was easier to organize the code like this. Let's leave it as it is, I also don't feel strongly about it. http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/util/string-util-test.cc@68 PS15, Line 68: EvalTruncation(normal_plus_max, "abcdeg" + string(4, 0), 10, UP); > Added new tess. Already had tests with 0xFF at L57-L66 Thanks for adding more tests. I meant the exact 3-character string consisting of {0, (char)0x7F, (char)0xFF} to make sure that truncating up would hit the signed char overflow. Apologies for not being more clear. http://gerrit.cloudera.org:8080/#/c/9693/16/be/src/util/string-util.cc File be/src/util/string-util.cc: http://gerrit.cloudera.org:8080/#/c/9693/16/be/src/util/string-util.cc@46 PS16, Line 46: unsigned char uch = (*result)[i]; I think this could use a comment and/or an explicit cast to make the intent clear. http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py File tests/query_test/test_parquet_page_index.py: http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py@36 PS16, Line 36: PAGE_INDEX_TRUNCATE_LENGTH update http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py@55 PS16, Line 55: """ nit: PEP8 says the closing """ should be on a new line. You can use flake8 to check the whole file. It was recently added to our codebase: https://github.com/apache/impala/commit/2fb73f94b425fde488166a19f78050ddbc3d7b50 http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py@90 PS16, Line 90: column_info = ColumnInfo._make((schema, stats, offset_index, column_index, I think you should be able to just pass all the parameters to the ctor "ColumnInfo(schema, stats,...). Currently you wrap it in a tuple to get an iterable to pass to _make(), but that should not be necessary. http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py@96 PS16, Line 96: nametuples "ColumnInfo objects" or "column infos" http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py@97 PS16, Line 97: a row group the first row group? http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py@110 PS16, Line 110: current_loc in page_locations[1:]: You can zip a list with itself to create (prev, cur) pairs: In [6]: l = range(5) In [7]: zip(l[:-1], l[1:]) Out[7]: [(0, 1), (1, 2), (2, 3), (3, 4)] http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py@124 PS16, Line 124: null_page nit: page_is_null might be more clear http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py@182 PS16, Line 182: previous_max = None Would it make sense to validate the ordering instead of re-computing it? I.e., extract all the min_values and max_values into lists and then just all assert(sorted(min_values))? http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py@214 PS16, Line 214: sorting_column unused? http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py@216 PS16, Line 216: skip_col_idx Stale comment? http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py@249 PS16, Line 249: vector.get_value('exec_option')['num_nodes'] = 1 move closer to the comment explaining the intent http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py@321 PS16, Line 321: ends nit: end http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py@348 PS16, Line 348: followed by : # zeroes. I had missed this looking at the truncation code. Why do we keep the zeroes at the end? http://gerrit.cloudera.org:8080/#/c/9693/16/tests/util/get_parquet_metadata.py File tests/util/get_parquet_metadata.py: http://gerrit.cloudera.org:8080/#/c/9693/16/tests/util/get_parquet_metadata.py@31 PS16, Line 31: The range must start with a serialized : thrift object. That does not actually seem to be a requirement of th
[Impala-ASF-CR](asf-site) [DOCS] Impala doc site update for 3.0
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/10322 ) Change subject: [DOCS] Impala doc site update for 3.0 .. Patch Set 5: The 3.0 change log was committed. Could you review this again? Thanks! -- To view, visit http://gerrit.cloudera.org:8080/10322 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: asf-site Gerrit-MessageType: comment Gerrit-Change-Id: Icf5927efa7baa965095a3ff2fd4ec7411313342d Gerrit-Change-Number: 10322 Gerrit-PatchSet: 5 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Ambreen Kazi Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 09 May 2018 17:45:20 + Gerrit-HasComments: No
[Impala-ASF-CR](asf-site) Update download and signature links for 3.0.0 release.
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/10333 ) Change subject: Update download and signature links for 3.0.0 release. .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10333 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: asf-site Gerrit-MessageType: comment Gerrit-Change-Id: I307d786bf4709d24f1a9511dcc0e6759183e8eae Gerrit-Change-Number: 10333 Gerrit-PatchSet: 1 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 09 May 2018 17:39:42 + Gerrit-HasComments: No
[Impala-ASF-CR](asf-site) Added changelog for 3.0.0
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/10334 ) Change subject: Added changelog for 3.0.0 .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10334 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: asf-site Gerrit-MessageType: comment Gerrit-Change-Id: I895f54fc47bada6b5131f20167405781c7d284b5 Gerrit-Change-Number: 10334 Gerrit-PatchSet: 2 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 09 May 2018 17:39:53 + Gerrit-HasComments: No
[Impala-ASF-CR](asf-site) Added changelog for 3.0.0
Sailesh Mukil has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10334 ) Change subject: Added changelog for 3.0.0 .. Added changelog for 3.0.0 Change-Id: I895f54fc47bada6b5131f20167405781c7d284b5 Reviewed-on: http://gerrit.cloudera.org:8080/10334 Reviewed-by: Michael Brown Tested-by: Sailesh Mukil --- A docs/changelog-3.0.html 1 file changed, 410 insertions(+), 0 deletions(-) Approvals: Michael Brown: Looks good to me, approved Sailesh Mukil: Verified -- To view, visit http://gerrit.cloudera.org:8080/10334 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: asf-site Gerrit-MessageType: merged Gerrit-Change-Id: I895f54fc47bada6b5131f20167405781c7d284b5 Gerrit-Change-Number: 10334 Gerrit-PatchSet: 3 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR](asf-site) Update download and signature links for 3.0.0 release.
Sailesh Mukil has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10333 ) Change subject: Update download and signature links for 3.0.0 release. .. Update download and signature links for 3.0.0 release. Change-Id: I307d786bf4709d24f1a9511dcc0e6759183e8eae Reviewed-on: http://gerrit.cloudera.org:8080/10333 Reviewed-by: Michael Brown Tested-by: Sailesh Mukil --- M downloads.html 1 file changed, 26 insertions(+), 15 deletions(-) Approvals: Michael Brown: Looks good to me, approved Sailesh Mukil: Verified -- To view, visit http://gerrit.cloudera.org:8080/10333 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: asf-site Gerrit-MessageType: merged Gerrit-Change-Id: I307d786bf4709d24f1a9511dcc0e6759183e8eae Gerrit-Change-Number: 10333 Gerrit-PatchSet: 2 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-6645: Enable disk spill encryption by default
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10345 ) Change subject: IMPALA-6645: Enable disk spill encryption by default .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2434/ -- To view, visit http://gerrit.cloudera.org:8080/10345 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iee4be2a95d689f66c3663d99e4df0fb3968893a9 Gerrit-Change-Number: 10345 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 09 May 2018 17:21:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6645: Enable disk spill encryption by default
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10345 ) Change subject: IMPALA-6645: Enable disk spill encryption by default .. Patch Set 5: Code-Review+2 carry -- To view, visit http://gerrit.cloudera.org:8080/10345 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iee4be2a95d689f66c3663d99e4df0fb3968893a9 Gerrit-Change-Number: 10345 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 09 May 2018 17:20:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6645: Enable disk spill encryption by default
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10345 ) Change subject: IMPALA-6645: Enable disk spill encryption by default .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/10345/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10345/3//COMMIT_MSG@7 PS3, Line 7: IMPALA-6635: > IMPALA-6645 Done (this also addresses Kim's comment). http://gerrit.cloudera.org:8080/#/c/10345/3//COMMIT_MSG@11 PS3, Line 11: PCLMULQDQ > which CPUs introduced that? Added to comment. Any server-class x86 processors < 5 years will have it. http://gerrit.cloudera.org:8080/#/c/10345/3/tests/custom_cluster/test_disk_spill_configurations.py File tests/custom_cluster/test_disk_spill_configurations.py: http://gerrit.cloudera.org:8080/#/c/10345/3/tests/custom_cluster/test_disk_spill_configurations.py@33 PS3, Line 33: with a customer > what does that mean? I'm not sure what I meant. -- To view, visit http://gerrit.cloudera.org:8080/10345 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iee4be2a95d689f66c3663d99e4df0fb3968893a9 Gerrit-Change-Number: 10345 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 09 May 2018 17:20:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6645: Enable disk spill encryption by default
Hello Dan Hecht, Kim Jin Chul, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10345 to look at the new patch set (#4). Change subject: IMPALA-6645: Enable disk spill encryption by default .. IMPALA-6645: Enable disk spill encryption by default Perf: Targeted benchmarks with a heavily spilling query on a machine with PCLMULQDQ support show < 5% of CPU time spent in encryption and decryption. PCLMULQDQ was introduced in AMD Bulldozer (c. 2011) and Intel Westmere (c. 2010). Testing: Ran core tests with the change. Updated the custom cluster test to exercise the non-default configuration. Change-Id: Iee4be2a95d689f66c3663d99e4df0fb3968893a9 --- M be/src/runtime/tmp-file-mgr.cc R testdata/workloads/functional-query/queries/QueryTest/basic-spilling.test R tests/custom_cluster/test_disk_spill_configurations.py 3 files changed, 9 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/10345/4 -- To view, visit http://gerrit.cloudera.org:8080/10345 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iee4be2a95d689f66c3663d99e4df0fb3968893a9 Gerrit-Change-Number: 10345 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Kim Jin Chul
[Impala-ASF-CR] IMPALA-7000: [DOCS] Correct info about dedicated executors
Alex Rodoni has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10357 Change subject: IMPALA-7000: [DOCS] Correct info about dedicated executors .. IMPALA-7000: [DOCS] Correct info about dedicated executors Change-Id: I4b7e6c57188a41a45d5813882b6dbc37cf47cf1f --- M docs/topics/impala_scalability.xml 1 file changed, 7 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/10357/1 -- To view, visit http://gerrit.cloudera.org:8080/10357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I4b7e6c57188a41a45d5813882b6dbc37cf47cf1f Gerrit-Change-Number: 10357 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni
[Impala-ASF-CR] IMPALA-6980: [DOCS] You can add or change column comment in ALTER TABLE
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10317 ) Change subject: IMPALA-6980: [DOCS] You can add or change column comment in ALTER TABLE .. IMPALA-6980: [DOCS] You can add or change column comment in ALTER TABLE Change-Id: Ia317a4b74d96aa064d375f6afc114f2cc8d30bf4 Reviewed-on: http://gerrit.cloudera.org:8080/10317 Reviewed-by: Fredy Wijaya Reviewed-by: Alex Rodoni Tested-by: Impala Public Jenkins --- M docs/topics/impala_alter_table.xml 1 file changed, 2 insertions(+), 2 deletions(-) Approvals: Fredy Wijaya: Looks good to me, but someone else must approve Alex Rodoni: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/10317 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ia317a4b74d96aa064d375f6afc114f2cc8d30bf4 Gerrit-Change-Number: 10317 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-6980: [DOCS] You can add or change column comment in ALTER TABLE
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10317 ) Change subject: IMPALA-6980: [DOCS] You can add or change column comment in ALTER TABLE .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10317 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia317a4b74d96aa064d375f6afc114f2cc8d30bf4 Gerrit-Change-Number: 10317 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 09 May 2018 16:42:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6980: [DOCS] You can add or change column comment in ALTER TABLE
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10317 ) Change subject: IMPALA-6980: [DOCS] You can add or change column comment in ALTER TABLE .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-docs-submit/287/ -- To view, visit http://gerrit.cloudera.org:8080/10317 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia317a4b74d96aa064d375f6afc114f2cc8d30bf4 Gerrit-Change-Number: 10317 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 09 May 2018 16:38:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6980: [DOCS] You can add or change column comment in ALTER TABLE
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/10317 ) Change subject: IMPALA-6980: [DOCS] You can add or change column comment in ALTER TABLE .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10317 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia317a4b74d96aa064d375f6afc114f2cc8d30bf4 Gerrit-Change-Number: 10317 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 09 May 2018 16:38:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6635: Enable disk spill encryption by default
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10345 ) Change subject: IMPALA-6635: Enable disk spill encryption by default .. Patch Set 3: Code-Review+2 (3 comments) http://gerrit.cloudera.org:8080/#/c/10345/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10345/3//COMMIT_MSG@7 PS3, Line 7: IMPALA-6635: IMPALA-6645 http://gerrit.cloudera.org:8080/#/c/10345/3//COMMIT_MSG@11 PS3, Line 11: PCLMULQDQ which CPUs introduced that? http://gerrit.cloudera.org:8080/#/c/10345/3/tests/custom_cluster/test_disk_spill_configurations.py File tests/custom_cluster/test_disk_spill_configurations.py: http://gerrit.cloudera.org:8080/#/c/10345/3/tests/custom_cluster/test_disk_spill_configurations.py@33 PS3, Line 33: with a customer what does that mean? -- To view, visit http://gerrit.cloudera.org:8080/10345 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iee4be2a95d689f66c3663d99e4df0fb3968893a9 Gerrit-Change-Number: 10345 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Kim Jin Chul Gerrit-Comment-Date: Wed, 09 May 2018 15:58:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR](asf-site) Update download and signature links for 3.0.0 release.
Michael Brown has posted comments on this change. ( http://gerrit.cloudera.org:8080/10333 ) Change subject: Update download and signature links for 3.0.0 release. .. Patch Set 1: Code-Review+2 Oops, I'll try to remember that in the future. -- To view, visit http://gerrit.cloudera.org:8080/10333 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: asf-site Gerrit-MessageType: comment Gerrit-Change-Id: I307d786bf4709d24f1a9511dcc0e6759183e8eae Gerrit-Change-Number: 10333 Gerrit-PatchSet: 1 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 09 May 2018 15:54:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6781: expand ORDER BY in some TPCH queries
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10351 ) Change subject: IMPALA-6781: expand ORDER BY in some TPCH queries .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2432/ -- To view, visit http://gerrit.cloudera.org:8080/10351 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If74d127fb57546b1948a34aa6d2e68cdc6880fae Gerrit-Change-Number: 10351 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Brown Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Wed, 09 May 2018 15:43:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6781: expand ORDER BY in some TPCH queries
Michael Brown has posted comments on this change. ( http://gerrit.cloudera.org:8080/10351 ) Change subject: IMPALA-6781: expand ORDER BY in some TPCH queries .. Patch Set 3: Code-Review+2 IMPALA-6970 -- To view, visit http://gerrit.cloudera.org:8080/10351 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If74d127fb57546b1948a34aa6d2e68cdc6880fae Gerrit-Change-Number: 10351 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Brown Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Wed, 09 May 2018 15:43:16 + Gerrit-HasComments: No
[Impala-ASF-CR] Fix diagnostics path to not include the parent dir structure
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/10347 ) Change subject: Fix diagnostics path to not include the parent dir structure .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/10347/1/bin/diagnostics/collect_diagnostics.py File bin/diagnostics/collect_diagnostics.py: http://gerrit.cloudera.org:8080/#/c/10347/1/bin/diagnostics/collect_diagnostics.py@451 PS1, Line 451: arcname=os.path.basename(self.collection_root_dir)) > Some comments will be helpful to understand why an alternative name is requ Done -- To view, visit http://gerrit.cloudera.org:8080/10347 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I540f6c228a0315780d45cf11961f124478b5dd0c Gerrit-Change-Number: 10347 Gerrit-PatchSet: 1 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Wed, 09 May 2018 15:18:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Fix diagnostics path to not include the parent dir structure
Hello Philip Zeyliger, Kim Jin Chul, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10347 to look at the new patch set (#2). Change subject: Fix diagnostics path to not include the parent dir structure .. Fix diagnostics path to not include the parent dir structure Without the fix, the diagnostics tar file included the entire directory structure of the diagnostics root dir. Before: === $ tar tf /tmp/impala-diagnostics-2018-05-08-11-59-39-spv8Eh.tar.gz tmp/impala-diagnostics-2018-05-08-11-59-39-spv8Eh/ tmp/impala-diagnostics-2018-05-08-11-59-39-spv8Eh/stacks/ tmp/impala-diagnostics-2018-05-08-11-59-39-spv8Eh/stacks/jstack-0.txt After: = $ tar tf /tmp/impala-diagnostics-2018-05-08-12-01-51-Y0nlQI.tar.gz impala-diagnostics-2018-05-08-12-01-51-Y0nlQI/ impala-diagnostics-2018-05-08-12-01-51-Y0nlQI/stacks/ impala-diagnostics-2018-05-08-12-01-51-Y0nlQI/stacks/jstack-0.txt . Tested with python 2.6 Change-Id: I540f6c228a0315780d45cf11961f124478b5dd0c --- M bin/diagnostics/collect_diagnostics.py 1 file changed, 4 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/10347/2 -- To view, visit http://gerrit.cloudera.org:8080/10347 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I540f6c228a0315780d45cf11961f124478b5dd0c Gerrit-Change-Number: 10347 Gerrit-PatchSet: 2 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR] IMPALA-5842: Write page index in Parquet files
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/9693 ) Change subject: IMPALA-5842: Write page index in Parquet files .. Patch Set 15: (29 comments) http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.h File be/src/exec/hdfs-parquet-table-writer.h: http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.h@108 PS15, Line 108: PAGE_INDEX_TRUNCATE_LENGTH > PAGE_INDEX_MAX_STRING_LENGTH? Changed the name http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.h@130 PS15, Line 130: Write > nit: "Writes" Done, also fixed the previous and the next comment as well. http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc File be/src/exec/hdfs-parquet-table-writer.cc: http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@48 PS15, Line 48: using namespace parquet; > Can you remove this? I find omitting the parquet:: prefix confusing, and we I agree, done. http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@109 PS15, Line 109: HdfsSinkTable > HdfsTableSink? Done http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@118 PS15, Line 118: table_sink_mem_tracker_->Release(page_index_memory_consumption_); > Can we do this in Close() instead? We have generally been trying to do reso Done http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@312 PS15, Line 312: OffsetIndex offset_index_; > I find it useful to prefix Parquet types with parquet:: and we do that in p I'm also in favor of prefixing. Done. http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@318 PS15, Line 318: MemTracker *table_sink_mem_tracker_; > nit: MemTracker* Done http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@651 PS15, Line 651: location.compressed_page_size = -1; > Probably doesn't have much of an effect, but should we set the size to 0 fo Yeah, it makes more sense. http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@668 PS15, Line 668: location.compressed_page_size = page.header.compressed_page_size + len; > Should we add a comment for this line to explain that one includes the leng Added a comment. http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@752 PS15, Line 752: string min_val, max_val; > I think we usually do one declaration per line, but I don't feel strongly a Done http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@754 PS15, Line 754: Status s_min = TruncateMinValue(page_stats.min_value, PAGE_INDEX_TRUNCATE_LENGTH, > Could this truncate non-string values if PAGE_INDEX_TRUNCATE_LENGTH was sma Added a comment. http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@761 PS15, Line 761: column_index_.null_pages.push_back(true); > Maybe DCHECK that both are false, to catch errors that result in only one b Added DCHECK. http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@766 PS15, Line 766: Consume > We prefer TryConsume() and returning a MEM_LIMIT_EXCEEDED error synchronous Switched to TryConsume() and check its result. http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@1220 PS15, Line 1220: // Currently we only support Parquet files with one row group. > Can you add a sentence why this particular code relies on this limited supp Updated the comment. http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@1225 PS15, Line 1225: ++) > nit: ++i Switched to ++i. I need 'i' to index 'columns_' and 'row_group->columns'. The same stands for the next for-loop. http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/parquet-column-stats.h File be/src/exec/parquet-column-stats.h: http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/parquet-column-stats.h@88 PS15, Line 88: if (a != a && b != b) return 0; > Can we use std::isnan()? Yes, done. http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/parquet-column-stats.h@113 PS15, Line 113: Used for merging page statistics into column chunk statistics. > I think it's better to omit usage from the comment here. Instead say that i Changed the added comment to your sentence. http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/parquet-column-stats.h@164 PS15, Line 164: Let's a > nit: "We assume..." so it reads less like a suggestions. Done http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/parquet-column-stats.h@165 PS15, Line 165: If not all values are equal > What happens for NaN, i.e. when all values are NaN? In ColumnStats::Merge(), MinMaxTrait::Compare() will compare the NaNs, and the cu
[Impala-ASF-CR] IMPALA-5842: Write page index in Parquet files
Hello Lars Volker, Anonymous Coward #248, Tim Armstrong, Csaba Ringhofer, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9693 to look at the new patch set (#16). Change subject: IMPALA-5842: Write page index in Parquet files .. IMPALA-5842: Write page index in Parquet files This commit builds on the previous work of Pooja Nilangekar: https://gerrit.cloudera.org/#/c/7464/ The commit implements the write path of PARQUET-922: "Add column indexes to parquet.thrift". As specified in the parquet-format, Impala writes the page indexes just before the footer. This allows much more efficient page filtering than using the same information from the 'statistics' field of DataPageHeader. I updated Pooja's python tests as well. Change-Id: Icbacf7fe3b7672e3ce719261ecef445b16f8dec9 --- M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/parquet-column-stats.h M be/src/exec/parquet-column-stats.inline.h M be/src/util/CMakeLists.txt A be/src/util/string-util-test.cc A be/src/util/string-util.cc A be/src/util/string-util.h M common/thrift/parquet.thrift M testdata/bin/load-dependent-tables.sql M tests/query_test/test_chars.py A tests/query_test/test_parquet_page_index.py M tests/util/get_parquet_metadata.py 13 files changed, 999 insertions(+), 98 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/9693/16 -- To view, visit http://gerrit.cloudera.org:8080/9693 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Icbacf7fe3b7672e3ce719261ecef445b16f8dec9 Gerrit-Change-Number: 9693 Gerrit-PatchSet: 16 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Anonymous Coward #248 Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-6999: Upgrade to sqlparse-0.1.19 for Impala shell
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/10354 ) Change subject: IMPALA-6999: Upgrade to sqlparse-0.1.19 for Impala shell .. Patch Set 1: Jenkins pre-review tests passed: https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/2124/ -- To view, visit http://gerrit.cloudera.org:8080/10354 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ide51ef3ac52d25a96b0fa832e29b6535197d23cb Gerrit-Change-Number: 10354 Gerrit-PatchSet: 1 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Wed, 09 May 2018 14:51:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6987: [DOCS] Update when INVALIDATE METADATA is required
Balazs Jeszenszky has posted comments on this change. ( http://gerrit.cloudera.org:8080/10339 ) Change subject: IMPALA-6987: [DOCS] Update when INVALIDATE METADATA is required .. Patch Set 1: (18 comments) IMO the page overall needs a wider cleanup, commented on that too. http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml File docs/topics/impala_invalidate_metadata.xml: http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@47 PS1, Line 47: relatively replace: very http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@48 PS1, Line 48: in the common scenario of adding new data files to an existing table replace: whenever possible (link to INVALIDATE vs. REFRESH usage page, if exists) http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@59 PS1, Line 59: By default replace: If there is no table specified http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@60 PS1, Line 60: Even for a single table, INVALIDATE METADATA is more expensive : than REFRESH, so prefer REFRESH in the common case where you add new data : files for an existing table. Same thing mentioned ~10 lines above - remove? http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@69 PS1, Line 69: Therefore, if some other entity modifies information used by Impala in the metastore : that Impala and Hive share, the information cached by Impala must be updated. However, this does not mean : that all metadata updates require an Impala update. This is vague, explicitly state when is manual invalidate needed (see L100-102), and remove this. http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@74 PS1, Line 74: : : : In Impala 1.2 and higher, a dedicated daemon (catalogd) broadcasts DDL changes made : through Impala to all Impala nodes. Formerly, after you created a database or table while connected to one : Impala node, you needed to issue an INVALIDATE METADATA statement on another Impala node : before accessing the new database or table from the other node. Now, newly created or altered objects are : picked up automatically by all Impala nodes. You must still use the INVALIDATE METADATA : technique after creating or altering objects through Hive. See : for more information on the catalog service. : : : The INVALIDATE METADATA statement is new in Impala 1.1 and higher, and takes over some of : the use cases of the Impala 1.0 REFRESH statement. Because REFRESH now : requires a table name parameter, to flush the metadata for all tables at once, use the INVALIDATE : METADATA statement. : : : This section is very outdated and mixes up usage of old vs. new usage of REFRESH, remove. http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@99 PS1, Line 99: instance Don't mention individual instances in this context, it's the service as a whole that needs update. http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@100 PS1, Line 100: required if a change is made from another impalad : instance in your cluster, or through Hive and is distributed by : catalogd. INVALIDATE METADATA is required when: * metadata changes of existing tables are done outside of Impala * new tables are added outside of Impala, that are to be used by Impala * SERVER or DATABASE level Sentry privileges are changed outside of Impala * block metadata changes outside of Impala, but files remain the same (HDFS rebalance) * possibly when UDF jars change (needs verification) All metadata changes go through catalogd and are distributed by statestored. 'Outside of Impala' for table metadata means Hive and any other Hive client, e.g. SparkSQL. http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@106 PS1, Line 106: same Impala node This is pre-1.2 information. No INVALIDATE is needed as long as the changes are done through the Impala service (using any impalad). http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@127 PS1, Line 127: INVALIDATE METADATA causes the metadata for that table to be marked as stale, and reloaded : the next time the table is referenced. For a huge table, that process could take a noticeable amount of time; Repeat of L44-4
[Impala-ASF-CR] IMPALA-3307: Add support for IANA time-zone db
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/9986 ) Change subject: IMPALA-3307: Add support for IANA time-zone db .. Patch Set 4: Code-Review+1 (6 comments) Few nits otherwise looks good to me. The LocalToUtc performance part is optional - as it does not affect other parts of the code, it can be easily done later when general structure is already accepted by other reviewers. http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions-ir.cc File be/src/exprs/timestamp-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions-ir.cc@526 PS4, Line 526: const string& tz_name = (start_lookup.abbr != nullptr) ? start_lookup.abbr : : context->impl()->state()->local_time_zone()->name(); What is the goal of this logic? To print timezone abbreviations instead of the full names, or to distinguish between summer/winter time, or both? A comment would be nice, and maybe the logic could be moved to a TimestampValue member function like string GetTimezoneName(Timezone*). http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/runtime/timestamp-value.cc File be/src/runtime/timestamp-value.cc: http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/runtime/timestamp-value.cc@93 PS4, Line 93: inline bool CheckIfDateOutOfRange(const cctz::civil_day& date) { : static const cctz::civil_day max_date(TimestampFunctions::MAX_YEAR, 12, 31); : static const cctz::civil_day min_date(TimestampFunctions::MIN_YEAR, 1, 1); : return date < min_date || date > max_date; : } This could be simpler and possibly faster by expecting a cctz::civil_second argument and check if 1400<=year<1. http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/runtime/timestamp-value.cc@128 PS4, Line 128: cctz explains pretty well the handling of dst boundaries, maybe we could add a link to it, for example to https://github.com/google/cctz/blob/a2dd3d0fbc811fe0a1d4d2dbb0341f1a3d28cb2a/include/cctz/time_zone.h#L147 http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/runtime/timestamp-value.cc@129 PS4, Line 129: // In case of ambiguity invalidate TimestampValue. : const cctz::time_zone::civil_lookup from_cl = local_tz->lookup(from_cs); : if (UNLIKELY(from_cl.kind != cctz::time_zone::civil_lookup::UNIQUE)) { : SetToInvalidDateTime(); : } else { : cctz::time_point from_tp = from_cl.pre; : : // Convert 'from_tp' time_point to civil_second assuming 'UTC' time-zone. : cctz::civil_second to_cs = cctz::convert(from_tp, TimezoneDatabase::GetUtcTimezone()); : : // boost::gregorian::date() throws boost::gregorian::bad_year if year is not in the : // 1400.. range. Need to check validity before creating the date object. : if (UNLIKELY(CheckIfDateOutOfRange(cctz::civil_day(to_cs { I may be possible to get TimestampValue from cctz::time_zone::civil_lookup in a faster way - splitting from_tp to a day part (since a constant date) and the remainder seconds part is enough for us and should be faster then getting cctz::civil_second (which contains year/month/day split). The code could look something like this: int64 secs_since_base = from_tp - BASETIME_AS_CCTZ_SYS_SEC; time_=sec_since_base%(24*60*60)+time_.fractional_seconds(); int32 days_since_base = sec_since_base/(24*60*60); if(out_of_range(days_since_base)) SetToInvalidDateTime(); date_ = days_since_base - BASEDATE_AS_BOOST_GREG_DATE; http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/runtime/timestamp-value.cc@146 PS4, Line 146: time_ = boost::posix_time::time_duration(to_cs.hour(), to_cs.minute(), to_cs.second(), nit: long line http://gerrit.cloudera.org:8080/#/c/9986/3/be/src/util/filesystem-util.h File be/src/util/filesystem-util.h: http://gerrit.cloudera.org:8080/#/c/9986/3/be/src/util/filesystem-util.h@66 PS3, Line 66: iff > "iff" stands for "if and only if". Thanks for the explanation! -- To view, visit http://gerrit.cloudera.org:8080/9986 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77 Gerrit-Change-Number: 9986 Gerrit-PatchSet: 4 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 09 May 2018 11:47:13 + Gerrit-HasComments: Yes