Joe McDonnell has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/13020 )
Change subject: IMPALA-8344: Add support for running the minicluster with S3Guard
......................................................................

IMPALA-8344: Add support for running the minicluster with S3Guard

Some tests can fail on S3 because some operations are eventually
consistent. S3Guard stores extra metadata in a DynamoDB table to solve
several consistency issues. This adds support for running the
minicluster on S3 with S3Guard.

S3Guard is configured by the following environment variables:
S3GUARD_ENABLED: defaults to false, set to true to enable S3Guard
S3GUARD_DYNAMODB_TABLE: name of the DynamoDB table to use. This must
  be exclusively owned by this minicluster. The dataload scripts
  initialize this table and will purge entries if the table already
  exists. The table should be in the same region as the S3_BUCKET
  for the minicluster.
S3GUARD_DYNAMODB_REGION: AWS region for S3GUARD_DYNAMODB_TABLE
These environment variables only impact S3 configurations.

The support comes from three pieces:
1. Configuration changes in core-site.xml to add the appropriate
   parameters.
2. Updating dataload to initialize/purge the S3Guard DynamoDB table
   and import data appropriately.
3. Updating tests to manipulate files through the HDFS command line
   rather than through S3 utilities. This takes the filesystem utility
   code for ABFS (which actually calls the HDFS command line), makes it
   generic, and uses it for S3Guard.
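As a rough illustration of the first piece, the mapping from the environment variables above to core-site.xml properties might look like the sketch below. This is not the contents of the core-site.xml.py added by this change. The fs.s3a.* keys are the Hadoop S3A settings for S3Guard; the function name and the TARGET_FILESYSTEM check (used here to model "only impact S3 configurations") are assumptions made for the example.

```python
import os


def s3guard_core_site_properties(env=None):
    """Return the extra core-site.xml properties implied by the S3GUARD_* variables.

    Hypothetical helper for illustration only; not code from the patch.
    """
    env = os.environ if env is None else env

    # Assumption: the minicluster's target filesystem is named by
    # TARGET_FILESYSTEM, and S3Guard settings apply only when it is "s3".
    if env.get("TARGET_FILESYSTEM") != "s3":
        return {}

    # S3GUARD_ENABLED defaults to false; only the value "true" enables S3Guard.
    if env.get("S3GUARD_ENABLED", "false").lower() != "true":
        return {}

    return {
        # Store S3Guard metadata in DynamoDB.
        "fs.s3a.metadatastore.impl":
            "org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore",
        # Table and region come straight from the environment.
        "fs.s3a.s3guard.ddb.table": env["S3GUARD_DYNAMODB_TABLE"],
        "fs.s3a.s3guard.ddb.region": env["S3GUARD_DYNAMODB_REGION"],
    }


if __name__ == "__main__":
    for name, value in sorted(s3guard_core_site_properties().items()):
        print("%s=%s" % (name, value))
```

With S3GUARD_ENABLED unset the helper returns an empty dict, so a non-S3Guard configuration is left unchanged.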
Testing:
 - Ran multiple rounds of s3 tests
 - Aborted tests in the middle and restarted the s3 tests (to test
   the s3guard reinitialization code)

Change-Id: I3c748529a494bb6e70fec96dc031523ff79bf61d
Reviewed-on: http://gerrit.cloudera.org:8080/13020
Reviewed-by: Joe McDonnell <joemcdonn...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Reviewed-by: Sahil Takiar <stak...@cloudera.com>
---
M bin/generate_xml_config.py
M bin/impala-config.sh
A bin/jenkins/release_cloud_resources.sh
M infra/python/deps/requirements.txt
M testdata/bin/load-test-warehouse-snapshot.sh
A testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.py
D testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl
M tests/common/impala_test_suite.py
M tests/query_test/test_scanners_fuzz.py
D tests/util/abfs_util.py
M tests/util/filesystem_utils.py
M tests/util/hdfs_util.py
D tests/util/s3_util.py
13 files changed, 316 insertions(+), 373 deletions(-)

Approvals:
  Joe McDonnell: Looks good to me, but someone else must approve
  Impala Public Jenkins: Verified
  Sahil Takiar: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/13020
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I3c748529a494bb6e70fec96dc031523ff79bf61d
Gerrit-Change-Number: 13020
Gerrit-PatchSet: 7
Gerrit-Owner: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: David Knupp <dkn...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <laszlo.g...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <phi...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <stak...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>