[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/12846 )

Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3
..

IMPALA-8345 : Add option to set up minicluster to use Hive 3

As a first step to integrate Impala with Hive 3.1.0 this patch modifies the minicluster scripts to optionally use Hive 3.1.0 instead of CDH Hive 2.1.1. In order to make sure that existing setups don't break, this is enabled via an environment variable override to bin/impala-config.sh. When the environment variable USE_CDP_HIVE is set to true the bootstrap_toolchain script downloads the Hive 3.1.0 tarballs and extracts them in the toolchain directory. These binaries are used to start the Hive services (Hiveserver2 and metastore). The default is still CDH Hive 2.1.1.

Also, since Hive 3.1.0 uses an upgraded metastore schema, this patch makes use of a different database name so that it is easy to switch between an environment which uses the Hive 2.1.1 metastore and another which uses the Hive 3.1.0 metastore.

In order to start a minicluster which uses Hive 3.1.0 users should follow the steps below:

1. Make sure that the minicluster, if running, is stopped before you run the following commands.
2. Open a new terminal and run the following commands.

> export USE_CDP_HIVE=true
> source bin/impala-config.sh
> bin/bootstrap_toolchain.py

The above command downloads the Hive 3.1.0 tarballs and extracts them into the toolchain/cdp_components-${CDP_BUILD_NUMBER} directory. This is a no-op if the CDP_BUILD_NUMBER has not changed and if the cdp_components are already downloaded by a previous invocation of the script.

> source bin/create-test-configuration.sh -create-metastore

The above step should provide "-create-metastore" only the first time so that a new metastore db is created and the Hive 3.1.0 schema is initialized. For all subsequent invocations, the "-create-metastore" argument can be skipped. We should still source this script since the hive-site.xml of Hive 3.1.0 is different than Hive 2.1.0 and needs to be regenerated.

> testdata/bin/run-all.sh

Note that the testing was performed locally by downloading the Hive 3.1 binaries into toolchain/cdp_components-976603/apache-hive-3.1.0.6.0.99.0-9-bin. Once the binaries are available in the S3 bucket, the bootstrap_toolchain script should automatically do this for you.

Testing Done:
1. Made sure that the cluster comes up with Hive 3.1 when the steps above are performed.
2. Made sure that existing scripts work as they do currently when USE_CDP_HIVE is not set.
3. Impala cluster comes up and connects to HMS 3.1.0 (Note that Impala still uses the Hive 2.1.1 client. Upgrading client libraries in Impala will be done as a separate change)

Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605
Reviewed-on: http://gerrit.cloudera.org:8080/12846
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins
---
M bin/bootstrap_toolchain.py
M bin/create-test-configuration.sh
M bin/impala-config.sh
A fe/src/test/resources/postgresql-hive-site.xml.cdp.template
M testdata/bin/run-hive-server.sh
5 files changed, 284 insertions(+), 11 deletions(-)

Approvals:
Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/12846
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605
Gerrit-Change-Number: 12846
Gerrit-PatchSet: 12
Gerrit-Owner: Vihang Karajgaonkar
Gerrit-Reviewer: Andrew Sherman
Gerrit-Reviewer: Fredy Wijaya
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
Gerrit-Reviewer: Todd Lipcon
Gerrit-Reviewer: Vihang Karajgaonkar
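The setup steps in the commit message can be summarized as a small dry-run helper. This is only a sketch: the function name hive3_minicluster_steps and its "first" argument are hypothetical and not part of the patch. It prints the commands rather than running them, and includes -create-metastore only for the first run, as the commit message describes.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): prints the commands from the
# commit message in order, so the sequence can be reviewed before running it.
# Pass "first" on the very first run to include -create-metastore.
hive3_minicluster_steps() {
  create_flag=""
  if [ "${1:-repeat}" = "first" ]; then
    create_flag=" -create-metastore"
  fi
  echo "export USE_CDP_HIVE=true"
  echo "source bin/impala-config.sh"
  echo "bin/bootstrap_toolchain.py"
  # The configuration script is sourced on every run so that hive-site.xml
  # is regenerated; only the metastore creation flag is first-run only.
  echo "source bin/create-test-configuration.sh${create_flag}"
  echo "testdata/bin/run-all.sh"
}

hive3_minicluster_steps first
```

Calling the helper without an argument prints the same sequence without the -create-metastore flag, which matches the "subsequent invocations" case.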
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12846 ) Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3 .. Patch Set 11: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/12846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605 Gerrit-Change-Number: 12846 Gerrit-PatchSet: 11 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 28 Mar 2019 01:52:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12846 ) Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3 .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2566/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605 Gerrit-Change-Number: 12846 Gerrit-PatchSet: 10 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 27 Mar 2019 21:45:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12846 ) Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3 .. Patch Set 11: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3962/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/12846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605 Gerrit-Change-Number: 12846 Gerrit-PatchSet: 11 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 27 Mar 2019 21:06:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12846 ) Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3 .. Patch Set 11: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/12846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605 Gerrit-Change-Number: 12846 Gerrit-PatchSet: 11 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 27 Mar 2019 21:06:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/12846 )

Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3
..

Patch Set 10: Code-Review+2

(1 comment)

I saw a few +1s from Tim and Andrew. I'm going to promote it to +2. This is a good start. We can always iterate on it to improve it further.

http://gerrit.cloudera.org:8080/#/c/12846/9/bin/create-test-configuration.sh
File bin/create-test-configuration.sh:

http://gerrit.cloudera.org:8080/#/c/12846/9/bin/create-test-configuration.sh@149
PS9, Line 149: 1>${IMPALA_CLUSTER_LOGS_DIR}/schematool.log 2>&1
> schematool has a problem which prints bunch of new lines on the stdout afte
Sounds good.

--
To view, visit http://gerrit.cloudera.org:8080/12846
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605
Gerrit-Change-Number: 12846
Gerrit-PatchSet: 10
Gerrit-Owner: Vihang Karajgaonkar
Gerrit-Reviewer: Andrew Sherman
Gerrit-Reviewer: Fredy Wijaya
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
Gerrit-Reviewer: Todd Lipcon
Gerrit-Reviewer: Vihang Karajgaonkar
Gerrit-Comment-Date: Wed, 27 Mar 2019 21:05:15 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Vihang Karajgaonkar has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/12846 )

Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3
..

IMPALA-8345 : Add option to set up minicluster to use Hive 3

As a first step to integrate Impala with Hive 3.1.0 this patch modifies the minicluster scripts to optionally use Hive 3.1.0 instead of CDH Hive 2.1.1. In order to make sure that existing setups don't break, this is enabled via an environment variable override to bin/impala-config.sh. When the environment variable USE_CDP_HIVE is set to true the bootstrap_toolchain script downloads the Hive 3.1.0 tarballs and extracts them in the toolchain directory. These binaries are used to start the Hive services (Hiveserver2 and metastore). The default is still CDH Hive 2.1.1.

Also, since Hive 3.1.0 uses an upgraded metastore schema, this patch makes use of a different database name so that it is easy to switch between an environment which uses the Hive 2.1.1 metastore and another which uses the Hive 3.1.0 metastore.

In order to start a minicluster which uses Hive 3.1.0 users should follow the steps below:

1. Make sure that the minicluster, if running, is stopped before you run the following commands.
2. Open a new terminal and run the following commands.

> export USE_CDP_HIVE=true
> source bin/impala-config.sh
> bin/bootstrap_toolchain.py

The above command downloads the Hive 3.1.0 tarballs and extracts them into the toolchain/cdp_components-${CDP_BUILD_NUMBER} directory. This is a no-op if the CDP_BUILD_NUMBER has not changed and if the cdp_components are already downloaded by a previous invocation of the script.

> source bin/create-test-configuration.sh -create-metastore

The above step should provide "-create-metastore" only the first time so that a new metastore db is created and the Hive 3.1.0 schema is initialized. For all subsequent invocations, the "-create-metastore" argument can be skipped. We should still source this script since the hive-site.xml of Hive 3.1.0 is different than Hive 2.1.0 and needs to be regenerated.

> testdata/bin/run-all.sh

Note that the testing was performed locally by downloading the Hive 3.1 binaries into toolchain/cdp_components-976603/apache-hive-3.1.0.6.0.99.0-9-bin. Once the binaries are available in the S3 bucket, the bootstrap_toolchain script should automatically do this for you.

Testing Done:
1. Made sure that the cluster comes up with Hive 3.1 when the steps above are performed.
2. Made sure that existing scripts work as they do currently when USE_CDP_HIVE is not set.
3. Impala cluster comes up and connects to HMS 3.1.0 (Note that Impala still uses the Hive 2.1.1 client. Upgrading client libraries in Impala will be done as a separate change)

Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605
---
M bin/bootstrap_toolchain.py
M bin/create-test-configuration.sh
M bin/impala-config.sh
A fe/src/test/resources/postgresql-hive-site.xml.cdp.template
M testdata/bin/run-hive-server.sh
5 files changed, 284 insertions(+), 11 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/12846/10

--
To view, visit http://gerrit.cloudera.org:8080/12846
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605
Gerrit-Change-Number: 12846
Gerrit-PatchSet: 10
Gerrit-Owner: Vihang Karajgaonkar
Gerrit-Reviewer: Andrew Sherman
Gerrit-Reviewer: Fredy Wijaya
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
Gerrit-Reviewer: Todd Lipcon
Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/12846 )

Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3
..

Patch Set 9:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/12846/9/bin/bootstrap_toolchain.py
File bin/bootstrap_toolchain.py:

http://gerrit.cloudera.org:8080/#/c/12846/9/bin/bootstrap_toolchain.py@433
PS9, Line 433: def download_cdp_hive(toolchain_root):
We don't have to do it now, but at some point we should refactor this function to be more generic, so that it can download Ranger, Hive, etc.

http://gerrit.cloudera.org:8080/#/c/12846/9/bin/create-test-configuration.sh
File bin/create-test-configuration.sh:

http://gerrit.cloudera.org:8080/#/c/12846/9/bin/create-test-configuration.sh@132
PS9, Line 132: # Certain configurations (like SentrySyncHMSNotificationsPostListener) do not work
: # with HMS 3.1.0. Use a cdp specific configuration template
: generate_config postgresql-hive-site.xml.cdp.template hive-site.xml
Will this cause Sentry tests to fail when USE_CDP_HIVE=true?

http://gerrit.cloudera.org:8080/#/c/12846/9/bin/create-test-configuration.sh@149
PS9, Line 149: 1>${IMPALA_CLUSTER_LOGS_DIR}/schematool.log 2>&1
It may be better to use tee:
2>&1 | tee ${IMPALA_CLUSTER_LOGS_DIR}/schematool.log

http://gerrit.cloudera.org:8080/#/c/12846/9/fe/src/test/resources/postgresql-hive-site.xml.cdp.template
File fe/src/test/resources/postgresql-hive-site.xml.cdp.template:

http://gerrit.cloudera.org:8080/#/c/12846/9/fe/src/test/resources/postgresql-hive-site.xml.cdp.template@27
PS9, Line 27:
nit: formatting is off in this file, a lot of mixed 1 space vs 2 spaces.
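The difference between the redirect on line 149 and the tee suggestion can be shown with a stand-in command. This is a minimal sketch: noisy_tool is a hypothetical substitute for schematool, and the log file names are made up for the example. One point worth keeping in mind when switching to tee is that the pipeline's exit status is then tee's, so the tool's own status has to be read from PIPESTATUS (a bash feature).

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for schematool: writes to stdout and stderr, exits 3.
noisy_tool() {
  echo "schema initialized"
  echo "warning: something odd" >&2
  return 3
}

log_dir="$(mktemp -d)"

# Variant A (as on the reviewed line): all output goes to the log file only.
status_redirect=0
noisy_tool 1>"${log_dir}/redirect.log" 2>&1 || status_redirect=$?

# Variant B (as suggested in review): output is shown on the terminal and logged.
# The pipeline returns tee's status, so take the tool's status from PIPESTATUS.
noisy_tool 2>&1 | tee "${log_dir}/tee.log"
status_tee=${PIPESTATUS[0]}

echo "redirect status=${status_redirect}, tee status=${status_tee}"
```

Both variants write the same two lines to their log file; only variant B also prints them while the command runs.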
http://gerrit.cloudera.org:8080/#/c/12846/9/testdata/bin/run-hive-server.sh
File testdata/bin/run-hive-server.sh:

http://gerrit.cloudera.org:8080/#/c/12846/9/testdata/bin/run-hive-server.sh@72
PS9, Line 72: if [[ $USE_CDP_HIVE && -n "$SENTRY_HOME" ]]; then
: for f in ${SENTRY_HOME}/lib/sentry-binding-hive*.jar; do
: FILE_NAME=$(basename $f)
: # exclude all the hive jars from being included in the classpath since Sentry
: # depends on Hive 2.1.1
: if [[ ! $FILE_NAME == hive* ]]; then
: export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${f}
: fi
: done
: fi
nit: use 2 spaces

--
To view, visit http://gerrit.cloudera.org:8080/12846
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605
Gerrit-Change-Number: 12846
Gerrit-PatchSet: 9
Gerrit-Owner: Vihang Karajgaonkar
Gerrit-Reviewer: Andrew Sherman
Gerrit-Reviewer: Fredy Wijaya
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
Gerrit-Reviewer: Todd Lipcon
Gerrit-Reviewer: Vihang Karajgaonkar
Gerrit-Comment-Date: Wed, 27 Mar 2019 01:50:55 +
Gerrit-HasComments: Yes
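A side note on the condition quoted from run-hive-server.sh: inside [[ ]], a bare $USE_CDP_HIVE is a non-empty-string test, not a boolean test. The sketch below illustrates the difference with two hypothetical helper functions. Whether USE_CDP_HIVE is ever set to the literal string "false" in the patch is not shown in this thread, so treat that case as an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, for illustration only.
# Succeeds for any non-empty string, including the literal "false".
is_non_empty() { [[ $1 ]]; }
# Succeeds only for the literal string "true".
is_true() { [[ "$1" == "true" ]]; }

for value in "true" "false" ""; do
  if is_non_empty "$value"; then non_empty="yes"; else non_empty="no"; fi
  if is_true "$value"; then equals_true="yes"; else equals_true="no"; fi
  echo "value='${value}' non-empty=${non_empty} equals-true=${equals_true}"
done
```

If the variable can hold "false", an explicit comparison against "true" avoids treating that value as enabled.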
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12846 ) Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3 .. Patch Set 9: Thanks for addressing my feedback. I'll let someone else review and give it a +2. -- To view, visit http://gerrit.cloudera.org:8080/12846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605 Gerrit-Change-Number: 12846 Gerrit-PatchSet: 9 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 27 Mar 2019 01:18:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12846 ) Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3 .. Patch Set 9: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2550/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605 Gerrit-Change-Number: 12846 Gerrit-PatchSet: 9 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 26 Mar 2019 20:55:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Vihang Karajgaonkar has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/12846 )

Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3
..

IMPALA-8345 : Add option to set up minicluster to use Hive 3

As a first step to integrate Impala with Hive 3.1.0 this patch modifies the minicluster scripts to optionally use Hive 3.1.0 instead of CDH Hive 2.1.1. In order to make sure that existing setups don't break, this is enabled via an environment variable override to bin/impala-config.sh. When the environment variable USE_CDP_HIVE is set to true the bootstrap_toolchain script downloads the Hive 3.1.0 tarballs and extracts them in the toolchain directory. These binaries are used to start the Hive services (Hiveserver2 and metastore). The default is still CDH Hive 2.1.1.

Also, since Hive 3.1.0 uses an upgraded metastore schema, this patch makes use of a different database name so that it is easy to switch between an environment which uses the Hive 2.1.1 metastore and another which uses the Hive 3.1.0 metastore.

In order to start a minicluster which uses Hive 3.1.0 users should follow the steps below:

1. Make sure that the minicluster, if running, is stopped before you run the following commands.
2. Open a new terminal and run the following commands.

> export USE_CDP_HIVE=true
> source bin/impala-config.sh
> bin/bootstrap_toolchain.py

The above command downloads the Hive 3.1.0 tarballs and extracts them into the toolchain/cdp_components-${CDP_BUILD_NUMBER} directory. This is a no-op if the CDP_BUILD_NUMBER has not changed and if the cdp_components are already downloaded by a previous invocation of the script.

> source bin/create-test-configuration.sh -create-metastore

The above step should provide "-create-metastore" only the first time so that a new metastore db is created and the Hive 3.1.0 schema is initialized. For all subsequent invocations, the "-create-metastore" argument can be skipped. We should still source this script since the hive-site.xml of Hive 3.1.0 is different than Hive 2.1.0 and needs to be regenerated.

> testdata/bin/run-all.sh

Note that the testing was performed locally by downloading the Hive 3.1 binaries into toolchain/cdp_components-976603/apache-hive-3.1.0.6.0.99.0-9-bin. Once the binaries are available in the S3 bucket, the bootstrap_toolchain script should automatically do this for you.

Testing Done:
1. Made sure that the cluster comes up with Hive 3.1 when the steps above are performed.
2. Made sure that existing scripts work as they do currently when USE_CDP_HIVE is not set.
3. Impala cluster comes up and connects to HMS 3.1.0 (Note that Impala still uses the Hive 2.1.1 client. Upgrading client libraries in Impala will be done as a separate change)

Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605
---
M bin/bootstrap_toolchain.py
M bin/create-test-configuration.sh
M bin/impala-config.sh
A fe/src/test/resources/postgresql-hive-site.xml.cdp.template
M testdata/bin/run-hive-server.sh
5 files changed, 303 insertions(+), 11 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/12846/9

--
To view, visit http://gerrit.cloudera.org:8080/12846
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605
Gerrit-Change-Number: 12846
Gerrit-PatchSet: 9
Gerrit-Owner: Vihang Karajgaonkar
Gerrit-Reviewer: Andrew Sherman
Gerrit-Reviewer: Fredy Wijaya
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
Gerrit-Reviewer: Todd Lipcon
Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12846 ) Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3 .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2549/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605 Gerrit-Change-Number: 12846 Gerrit-PatchSet: 7 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 26 Mar 2019 19:28:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12846 ) Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3 .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2548/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605 Gerrit-Change-Number: 12846 Gerrit-PatchSet: 6 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 26 Mar 2019 19:27:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/12846 )

Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3
..

Patch Set 7: Code-Review+1

(5 comments)

A few nits, but looks good.

http://gerrit.cloudera.org:8080/#/c/12846/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12846/6//COMMIT_MSG@10
PS6, Line 10: the minicluster scripts to use Hive 3.1.0 instead of CDH Hive 2.1.1.
Nit: to optionally use

http://gerrit.cloudera.org:8080/#/c/12846/6/bin/create-test-configuration.sh
File bin/create-test-configuration.sh:

http://gerrit.cloudera.org:8080/#/c/12846/6/bin/create-test-configuration.sh@132
PS6, Line 132: # Certain configurations (like SentrySyncHMSNotificationsPostListener) does not work
s/does not/do not/

http://gerrit.cloudera.org:8080/#/c/12846/6/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/12846/6/bin/impala-config.sh@163
PS6, Line 163: export CDH_BUILD_NUMBER=909265
Nit: I wonder if CDP_BUILD_NUMBER should go here so that the 2 build numbers are together.

http://gerrit.cloudera.org:8080/#/c/12846/6/bin/impala-config.sh@534
PS6, Line 534: export HIVE_HOME="$CDP_COMPONENTS_HOME/apache-hive-${IMPALA_HIVE_VERSION}-bin"
Why is the home apache-hive-xxx in one case and hive-xxx in the other?

http://gerrit.cloudera.org:8080/#/c/12846/6/bin/impala-config.sh@748
PS6, Line 748: echo "CDP_BUILD_NUMBER= $CDP_BUILD_NUMBER"
It could be confusing that you always echo CDH_BUILD_NUMBER but only echo CDP_BUILD_NUMBER when doing a CDP build.

--
To view, visit http://gerrit.cloudera.org:8080/12846
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605
Gerrit-Change-Number: 12846
Gerrit-PatchSet: 7
Gerrit-Owner: Vihang Karajgaonkar
Gerrit-Reviewer: Andrew Sherman
Gerrit-Reviewer: Fredy Wijaya
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
Gerrit-Reviewer: Todd Lipcon
Gerrit-Reviewer: Vihang Karajgaonkar
Gerrit-Comment-Date: Tue, 26 Mar 2019 19:12:08 +
Gerrit-HasComments: Yes
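The question about the two HIVE_HOME layouts can be made concrete with a small sketch. The function select_hive_home and its positional parameters are hypothetical; only the CDP path shape (apache-hive-<version>-bin under the CDP components directory) is quoted in the review. The CDH branch assumes a hive-<version> directory under a CDH components directory, based on the "hive-xxx" wording in the comment, and the CDH version string and directory paths used below are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: picks a Hive home directory based on the USE_CDP_HIVE value.
# Arguments: use_cdp_hive, cdh_components_home, cdp_components_home, hive_version.
select_hive_home() {
  if [ "$1" = "true" ]; then
    # Layout quoted in the review for the CDP build.
    echo "$3/apache-hive-$4-bin"
  else
    # Assumed layout for the CDH build ("hive-xxx" in the review comment).
    echo "$2/hive-$4"
  fi
}

# Example paths are placeholders; the CDP version and build number appear in
# the commit message, the CDH version string here is illustrative only.
select_hive_home true /toolchain/cdh_components /toolchain/cdp_components-976603 3.1.0.6.0.99.0-9
select_hive_home false /toolchain/cdh_components /toolchain/cdp_components-976603 2.1.1-example
```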
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/12846 ) Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3 .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/12846/6/bin/bootstrap_toolchain.py File bin/bootstrap_toolchain.py: http://gerrit.cloudera.org:8080/#/c/12846/6/bin/bootstrap_toolchain.py@432 PS6, Line 432: > flake8: E302 expected 2 blank lines, found 1 Done -- To view, visit http://gerrit.cloudera.org:8080/12846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605 Gerrit-Change-Number: 12846 Gerrit-PatchSet: 7 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 26 Mar 2019 18:56:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Vihang Karajgaonkar has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/12846 )

Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3
..

IMPALA-8345 : Add option to set up minicluster to use Hive 3

As a first step to integrate Impala with Hive 3.1.0 this patch modifies the minicluster scripts to use Hive 3.1.0 instead of CDH Hive 2.1.1. In order to make sure that existing setups don't break, this is enabled via an environment variable override to bin/impala-config.sh. When the environment variable USE_CDP_HIVE is set to true the bootstrap_toolchain script downloads the Hive 3.1.0 tarballs and extracts them in the toolchain directory. These binaries are used to start the Hive services (Hiveserver2 and metastore). The default is still CDH Hive 2.1.1.

Also, since Hive 3.1.0 uses an upgraded metastore schema, this patch makes use of a different database name so that it is easy to switch between an environment which uses the Hive 2.1.1 metastore and another which uses the Hive 3.1.0 metastore.

In order to start a minicluster which uses Hive 3.1.0 users should follow the steps below:

1. Make sure that the minicluster, if running, is stopped before you run the following commands.
2. Open a new terminal and run the following commands.

> export USE_CDP_HIVE=true
> source bin/impala-config.sh
> bin/bootstrap_toolchain.py

The above command downloads the Hive 3.1.0 tarballs and extracts them into the toolchain/cdp_components-${CDP_BUILD_NUMBER} directory. This is a no-op if the CDP_BUILD_NUMBER has not changed and if the cdp_components are already downloaded by a previous invocation of the script.

> source bin/create-test-configuration.sh -create-metastore

The above step should provide "-create-metastore" only the first time so that a new metastore db is created and the Hive 3.1.0 schema is initialized. For all subsequent invocations, the "-create-metastore" argument can be skipped. We should still source this script since the hive-site.xml of Hive 3.1.0 is different than Hive 2.1.0 and needs to be regenerated.

> testdata/bin/run-all.sh

Note that the testing was performed locally by downloading the Hive 3.1 binaries into toolchain/cdp_components-976603/apache-hive-3.1.0.6.0.99.0-9-bin. Once the binaries are available in the S3 bucket, the bootstrap_toolchain script should automatically do this for you.

Testing Done:
1. Made sure that the cluster comes up with Hive 3.1 when the steps above are performed.
2. Made sure that existing scripts work as they do currently when USE_CDP_HIVE is not set.
3. Impala cluster comes up and connects to HMS 3.1.0 (Note that Impala still uses the Hive 2.1.1 client. Upgrading client libraries in Impala will be done as a separate change)

Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605
---
M bin/bootstrap_toolchain.py
M bin/create-test-configuration.sh
M bin/impala-config.sh
A fe/src/test/resources/postgresql-hive-site.xml.cdp.template
M testdata/bin/run-hive-server.sh
5 files changed, 298 insertions(+), 11 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/12846/7

--
To view, visit http://gerrit.cloudera.org:8080/12846
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605
Gerrit-Change-Number: 12846
Gerrit-PatchSet: 7
Gerrit-Owner: Vihang Karajgaonkar
Gerrit-Reviewer: Andrew Sherman
Gerrit-Reviewer: Fredy Wijaya
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
Gerrit-Reviewer: Todd Lipcon
Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12846 ) Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3 .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/12846/6/bin/bootstrap_toolchain.py File bin/bootstrap_toolchain.py: http://gerrit.cloudera.org:8080/#/c/12846/6/bin/bootstrap_toolchain.py@432 PS6, Line 432: def download_cdp_hive(toolchain_root): flake8: E302 expected 2 blank lines, found 1 -- To view, visit http://gerrit.cloudera.org:8080/12846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605 Gerrit-Change-Number: 12846 Gerrit-PatchSet: 6 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 26 Mar 2019 18:44:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Vihang Karajgaonkar has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/12846 )

Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3
..

IMPALA-8345 : Add option to set up minicluster to use Hive 3

As a first step to integrate Impala with Hive 3.1.0, this patch modifies the minicluster scripts to use Hive 3.1.0 instead of CDH Hive 2.1.1. To make sure that existing setups don't break, this is enabled via an environment variable override to bin/impala-config.sh. When the environment variable USE_CDP_HIVE is set to true, the bootstrap_toolchain script downloads the Hive 3.1.0 tarballs and extracts them in the toolchain directory. These binaries are used to start the Hive services (Hiveserver2 and metastore). The default is still CDH Hive 2.1.1.

Also, since Hive 3.1.0 uses an upgraded metastore schema, this patch uses a different database name so that it is easy to switch from an environment which uses the Hive 2.1.1 metastore to one which uses the Hive 3.1.0 metastore.

To start a minicluster which uses Hive 3.1.0, users should follow the steps below:

1. Make sure that the minicluster, if running, is stopped before you run the following commands.
2. Open a new terminal and run the following commands.

> export USE_CDP_HIVE=true
> source bin/impala-config.sh
> bin/bootstrap_toolchain.py

The above command downloads the Hive 3.1.0 tarballs and extracts them in the toolchain/cdp_components-${CDP_BUILD_NUMBER} directory. This is a no-op if the CDP_BUILD_NUMBER has not changed and if the cdp_components are already downloaded by a previous invocation of the script.

> source bin/create-test-configuration.sh -create-metastore

The above step should provide "-create-metastore" only the first time so that a new metastore db is created and the Hive 3.1.0 schema is initialized. For all subsequent invocations, the "-create-metastore" argument can be skipped. We should still source this script since the hive-site.xml of Hive 3.1.0 is different from that of Hive 2.1.0 and needs to be regenerated.

> testdata/bin/run-all.sh

Note that the testing was performed locally by downloading the Hive 3.1 binaries into toolchain/cdp_components-976603/apache-hive-3.1.0.6.0.99.0-9-bin. Once the binaries are available in the S3 bucket, the bootstrap_toolchain script should automatically do this for you.

Testing Done:
1. Made sure that the cluster comes up with Hive 3.1 when the steps above are performed.
2. Made sure that existing scripts work as they do currently when the argument is not provided.
3. Impala cluster comes up and connects to HMS 3.1.0 (Note that Impala still uses the Hive 2.1.1 client. Upgrading client libraries in Impala will be done as a separate change)

Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605
---
M bin/bootstrap_toolchain.py
M bin/create-test-configuration.sh
M bin/impala-config.sh
A fe/src/test/resources/postgresql-hive-site.xml.cdp.template
M testdata/bin/run-hive-server.sh
5 files changed, 297 insertions(+), 11 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/12846/6

--
To view, visit http://gerrit.cloudera.org:8080/12846
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605
Gerrit-Change-Number: 12846
Gerrit-PatchSet: 6
Gerrit-Owner: Vihang Karajgaonkar
Gerrit-Reviewer: Andrew Sherman
Gerrit-Reviewer: Fredy Wijaya
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
Gerrit-Reviewer: Todd Lipcon
Gerrit-Reviewer: Vihang Karajgaonkar
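The patch set 6 commit message says that bin/bootstrap_toolchain.py is a no-op when CDP_BUILD_NUMBER has not changed and the components were already downloaded. A minimal sketch of that check follows, assuming the decision is based only on whether the extracted directory exists. The function names `cdp_components_dir` and `needs_download` are hypothetical and are not taken from the actual script.

```python
import os


def cdp_components_dir(toolchain_root, cdp_build_number):
    """Build the path toolchain/cdp_components-<build number> named in the commit message."""
    return os.path.join(toolchain_root, "cdp_components-{0}".format(cdp_build_number))


def needs_download(toolchain_root, cdp_build_number, component_dir_name):
    """Return True when the extracted component directory is not present yet.

    Hypothetical helper: a changed build number yields a different parent
    directory, so a new build is treated as not yet downloaded.
    """
    target = os.path.join(
        cdp_components_dir(toolchain_root, cdp_build_number), component_dir_name)
    return not os.path.isdir(target)
```

With the values quoted in the commit message, `needs_download(toolchain, "976603", "apache-hive-3.1.0.6.0.99.0-9-bin")` would return False once that directory has been extracted, and True again after the build number changes.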
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12846 ) Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3 .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/12846/4/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/12846/4/bin/impala-config.sh@37 PS4, Line 37: # parse command line options > It seems like options to impala-config.sh are currently passed by environme Yeah, I think it would be best to avoid making this a special option that behaves differently to everything else. A lot of scripts source impala-config.sh without arguments. This works today for the two valid ways to set options - via environment variables or by setting them in impala-config-local.sh/impala-config-branch.sh -- To view, visit http://gerrit.cloudera.org:8080/12846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605 Gerrit-Change-Number: 12846 Gerrit-PatchSet: 4 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 25 Mar 2019 23:34:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/12846 ) Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3 .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/12846/4/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/12846/4/bin/impala-config.sh@37 PS4, Line 37: # parse command line options It seems like options to impala-config.sh are currently passed by environment variable, for example USE_KUDU_DEBUG_BUILD can be set before sourcing impala-config.sh. Is there a particular reason you chose to add arguments to impala-config.sh rather than using the existing mechanism? -- To view, visit http://gerrit.cloudera.org:8080/12846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605 Gerrit-Change-Number: 12846 Gerrit-PatchSet: 4 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 25 Mar 2019 22:52:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/12846 )

Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3
..

Patch Set 4:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/12846/4/bin/bootstrap_toolchain.py
File bin/bootstrap_toolchain.py:

http://gerrit.cloudera.org:8080/#/c/12846/4/bin/bootstrap_toolchain.py@434
PS4, Line 434: os.getenv("USE_CDP
do we have any utility code anywhere that's more permissive than this? I can see someone setting it to '1' and being very confused why it's not working.

http://gerrit.cloudera.org:8080/#/c/12846/4/bin/bootstrap_toolchain.py@450
PS4, Line 450: present
maybe say 'set' here since it doesn't actually need to be present? (it will be makedirred below)

http://gerrit.cloudera.org:8080/#/c/12846/4/bin/bootstrap_toolchain.py@466
PS4, Line 466: # TODO the tar file name in the cdp build don't match with the version number. Hard
             : # coding the name here currently
             : file_name = "{0}.tar.gz".format(dir_name)
is this TODO inaccurate? it looks like from the code here it does match.

http://gerrit.cloudera.org:8080/#/c/12846/4/bin/create-test-configuration.sh
File bin/create-test-configuration.sh:

http://gerrit.cloudera.org:8080/#/c/12846/4/bin/create-test-configuration.sh@146
PS4, Line 146: # Hive schema SQL scripts include other scripts using \i, which expects absolute paths.
             : # Switch to the scripts directory to make this work.
             : pushd ${HIVE_HOME}/bin
this pushd/popd is no longer relevant now that you're using schematool, right?

http://gerrit.cloudera.org:8080/#/c/12846/4/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/12846/4/bin/impala-config.sh@38
PS4, Line 38: for ARG in $*
nit: usually 'do' is on the same line

http://gerrit.cloudera.org:8080/#/c/12846/4/bin/impala-config.sh@41
PS4, Line 41: -use-hive3)
I think '--' instead of '-' is more common for long arg names

http://gerrit.cloudera.org:8080/#/c/12846/4/bin/impala-config.sh@44
PS4, Line 44: -help)
same

http://gerrit.cloudera.org:8080/#/c/12846/4/bin/impala-config.sh@49
PS4, Line 49: esac
do you want a default case here that prints usage info? otherwise a typo in the args would just be silently ignored.

http://gerrit.cloudera.org:8080/#/c/12846/4/bin/impala-config.sh@310
PS4, Line 310: export METASTORE_DB=${METASTORE_DB-"$(cut -c-63 <<< HMS$ESCAPED_IMPALA_HOME)_cdp"}
I'm assuming the 63-character 'cut' here is because of a 63-character limit in db names in postgres or something. Given that, I guess we need to cut to 59 instead of 63?

http://gerrit.cloudera.org:8080/#/c/12846/4/bin/impala-config.sh@767
PS4, Line 767: echo "IMPALA_HIVE_VERSION = $IMPALA_HIVE_VERSION"
nit: indentation off

http://gerrit.cloudera.org:8080/#/c/12846/4/fe/src/test/resources/postgresql-hive-site.xml.cdp.template
File fe/src/test/resources/postgresql-hive-site.xml.cdp.template:

http://gerrit.cloudera.org:8080/#/c/12846/4/fe/src/test/resources/postgresql-hive-site.xml.cdp.template@99
PS4, Line 99:
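Two of the review comments above lend themselves to a short illustration: the request for more permissive parsing of USE_CDP_HIVE (so that '1' also works), and the observation that appending "_cdp" after cutting to 63 characters overshoots the limit. The sketch below is not code from the patch. The helper names `env_flag` and `metastore_db_name` and the set of accepted truthy strings are assumptions, and the 63-character limit is assumed to be PostgreSQL's default maximum identifier length.

```python
import os

# Assumed set of strings treated as "true"; the real script may accept fewer.
TRUE_VALUES = {"1", "true", "yes", "on"}

# Assumption: PostgreSQL's default maximum identifier length in bytes.
PG_MAX_IDENTIFIER = 63


def env_flag(name, default=False, environ=None):
    """Read a boolean option from the environment, ignoring case and whitespace."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUE_VALUES


def metastore_db_name(escaped_impala_home, suffix="_cdp", limit=PG_MAX_IDENTIFIER):
    """Truncate the base name so that base + suffix stays within the limit.

    With the 4-character "_cdp" suffix this cuts the base to 59 characters,
    which is the value suggested in the review comment.
    """
    base = "HMS" + escaped_impala_home
    return base[: limit - len(suffix)] + suffix
```

For a long ESCAPED_IMPALA_HOME, the shell expression quoted at line 310 yields 63 + 4 = 67 characters, whereas `metastore_db_name` returns exactly 63.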
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12846 ) Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3 .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2536/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605 Gerrit-Change-Number: 12846 Gerrit-PatchSet: 4 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 25 Mar 2019 20:59:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12846 ) Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3 .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/12846/4/testdata/bin/run-hive-server.sh File testdata/bin/run-hive-server.sh: http://gerrit.cloudera.org:8080/#/c/12846/4/testdata/bin/run-hive-server.sh@69 PS4, Line 69: # CDH Hive metastore scripts do not do so. This is currently to make sure that we can run all line too long (93 > 90) -- To view, visit http://gerrit.cloudera.org:8080/12846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605 Gerrit-Change-Number: 12846 Gerrit-PatchSet: 4 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 25 Mar 2019 20:15:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Vihang Karajgaonkar has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/12846 )

Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3
..

IMPALA-8345 : Add option to set up minicluster to use Hive 3

As a first step to integrate Impala with Hive 3.1.0, this patch modifies the minicluster scripts to use Hive 3.1.0 instead of CDH Hive 2.1.1. To make sure that existing setups don't break, this option is enabled via a command line argument to bin/impala-config.sh. This command line argument (-use-hive3) sets up certain environment variables such that Hive 3.1.0 based binaries can be used to instantiate the Hive services (Hiveserver2 and metastore). The default is still Hive 2.1.1.

Also, since Hive 3.1.0 uses an upgraded metastore schema, this patch uses a different database name so that it is easy to switch from an environment which uses the Hive 2.1.1 metastore to one which uses the Hive 3.1.0 metastore. In order to do so, users should follow the steps below:

1. Open a new terminal
2. Run bin/bootstrap_toolchain.py
3. source bin/impala-config.sh -use-hive3
4. source bin/create-test-configuration.sh -create-metastore
   The above step should provide "-create-metastore" only the first time so that a new metastore db is created and the Hive 3.1.0 schema is initialized. For all subsequent invocations, the "-create-metastore" argument can be skipped. We should still source this script since the hive-site.xml of Hive 3.1.0 is slightly different from that of Hive 2.1.0 and needs to be regenerated.
5. Start services using testdata/bin/run-all.sh

Note that the testing was performed locally by downloading the Hive 3.1 binaries into toolchain/cdp_components-976603/apache-hive-3.1.0.6.0.99.0-9-bin. Once the binaries are available in the S3 bucket, the bootstrap_toolchain script should automatically do this for you.

Testing Done:
1. Made sure that the cluster comes up with Hive 3.1 when the steps above are performed.
2. Made sure that existing scripts work as they do currently when the argument is not provided.
3. Impala cluster comes up and connects to HMS 3.1.0 (Note that Impala still uses the Hive 2.1.1 client. Upgrading client libraries in Impala will be done as a separate change)

Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605
---
M bin/bootstrap_toolchain.py
M bin/create-test-configuration.sh
M bin/impala-config.sh
A fe/src/test/resources/postgresql-hive-site.xml.cdp.template
M testdata/bin/run-hive-server.sh
5 files changed, 372 insertions(+), 8 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/12846/4

--
To view, visit http://gerrit.cloudera.org:8080/12846
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605
Gerrit-Change-Number: 12846
Gerrit-PatchSet: 4
Gerrit-Owner: Vihang Karajgaonkar
Gerrit-Reviewer: Fredy Wijaya
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Todd Lipcon
Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12846 ) Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3 .. Patch Set 3: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/2535/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/12846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605 Gerrit-Change-Number: 12846 Gerrit-PatchSet: 3 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Mon, 25 Mar 2019 19:57:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12846 ) Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3 .. Patch Set 3: (4 comments) http://gerrit.cloudera.org:8080/#/c/12846/3/bin/bootstrap_toolchain.py File bin/bootstrap_toolchain.py: http://gerrit.cloudera.org:8080/#/c/12846/3/bin/bootstrap_toolchain.py@465 PS3, Line 465: p flake8: F841 local variable 'platform_label' is assigned to but never used http://gerrit.cloudera.org:8080/#/c/12846/3/bin/create-test-configuration.sh File bin/create-test-configuration.sh: http://gerrit.cloudera.org:8080/#/c/12846/3/bin/create-test-configuration.sh@132 PS3, Line 132: # Certain configurations (like SentrySyncHMSNotificationsPostListener) does not work with HMS 3.1.0 line too long (101 > 90) http://gerrit.cloudera.org:8080/#/c/12846/3/testdata/bin/run-hive-server.sh File testdata/bin/run-hive-server.sh: http://gerrit.cloudera.org:8080/#/c/12846/3/testdata/bin/run-hive-server.sh@66 PS3, Line 66: export HIVE_METASTORE_HADOOP_OPTS="-verbose:class -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=30010" line too long (121 > 90) http://gerrit.cloudera.org:8080/#/c/12846/3/testdata/bin/run-hive-server.sh@69 PS3, Line 69: # CDH Hive metastore scripts do not do so. This is currently to make sure that we can run all the tests line too long (103 > 90) -- To view, visit http://gerrit.cloudera.org:8080/12846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605 Gerrit-Change-Number: 12846 Gerrit-PatchSet: 3 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Mon, 25 Mar 2019 19:41:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Vihang Karajgaonkar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12846 )

Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3
..

IMPALA-8345 : Add option to set up minicluster to use Hive 3

As a first step to integrate Impala with Hive 3.1.0, this patch modifies the minicluster scripts to use Hive 3.1.0 instead of CDH Hive 2.1.1. To make sure that existing setups don't break, this option is enabled via a command line argument to bin/impala-config.sh. This command line argument (-use-hive3) sets up certain environment variables such that Hive 3.1.0 based binaries can be used to instantiate the Hive services (Hiveserver2 and metastore). The default is still Hive 2.1.1.

Also, since Hive 3.1.0 uses an upgraded metastore schema, this patch uses a different database name so that it is easy to switch from an environment which uses the Hive 2.1.1 metastore to one which uses the Hive 3.1.0 metastore. In order to do so, users should follow the steps below:

1. Open a new terminal
2. Run bin/bootstrap_toolchain.py
3. source bin/impala-config.sh -use-hive3
4. source bin/create-test-configuration.sh -create-metastore
   The above step should provide "-create-metastore" only the first time so that a new metastore db is created and the Hive 3.1.0 schema is initialized. For all subsequent invocations, the "-create-metastore" argument can be skipped. We should still source this script since the hive-site.xml of Hive 3.1.0 is slightly different from that of Hive 2.1.0 and needs to be regenerated.
5. Start services using testdata/bin/run-all.sh

Note that the testing was performed locally by downloading the Hive 3.1 binaries into toolchain/cdp_components-976603/apache-hive-3.1.0.6.0.99.0-9-bin. Once the binaries are available in the S3 bucket, the bootstrap_toolchain script should automatically do this for you.

Testing Done:
1. Made sure that the cluster comes up with Hive 3.1 when the steps above are performed.
2. Made sure that existing scripts work as they do currently when the argument is not provided.
3. Impala cluster comes up and connects to HMS 3.1.0 (Note that Impala still uses the Hive 2.1.1 client. Upgrading client libraries in Impala will be done as a separate change)

Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605
---
M bin/bootstrap_toolchain.py
M bin/create-test-configuration.sh
M bin/impala-config.sh
A fe/src/test/resources/postgresql-hive-site.xml.cdp.template
M testdata/bin/run-hive-server.sh
5 files changed, 374 insertions(+), 9 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/12846/3

--
To view, visit http://gerrit.cloudera.org:8080/12846
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605
Gerrit-Change-Number: 12846
Gerrit-PatchSet: 3
Gerrit-Owner: Vihang Karajgaonkar
Gerrit-Reviewer: Fredy Wijaya
Gerrit-Reviewer: Todd Lipcon