[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

2017-11-13 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8294 )

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
..


Patch Set 3:

(3 comments)

> I also don't want to block progress on making this good improvement
 > to our S3 development, but my fear is that this script will get
 > more and more out of control if we don't draw a line somewhere
 > about what belongs in it.

That's a valid point. I moved the logic to bin/check-s3-access.sh, which 
actually lets us reuse the same check elsewhere if it becomes necessary or 
useful.

http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh@294
PS2, Line 294: export LOCAL_FS="file:${WAREHOUSE_LOCATION_PREFIX}"
> Yeah, I'd question whether "aws ls" belongs in this script either.
Done, moved to bin/check-s3-access.sh.


http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh@303
PS2, Line 303: set AWS_A
> This variable will leak out into the user's shell.
Done; I moved these checks into a separate script, bin/check-s3-access.sh.


http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh@307
PS2, Line 307:
> Good point; I'll check if wget can be set up the same way.
Done. Although the check was moved to bin/check-s3-access.sh, I have replaced 
curl with wget, using equivalent parameters for silencing and short timeouts.
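
As a rough illustration of the curl-to-wget swap described above, the probe
could look like the sketch below. This is not the code in
bin/check-s3-access.sh: the function name is hypothetical, the URL is the
metadata endpoint quoted in the patch set 1 review, and wget's --read-timeout
only approximates curl's --max-time.

```shell
# Sketch only: a wget probe roughly equivalent to
#   curl -sf --connect-timeout 1 --max-time 5 URL
# has_iam_role_credentials is a hypothetical name, not taken from the patch.
IAM_CREDS_URL="http://169.254.169.254/latest/meta-data/iam/security-credentials/"

has_iam_role_credentials() {
  # A local array keeps the arguments from leaking into the caller's shell.
  local -a wget_args=(
    -q                    # quiet, like curl -s
    --connect-timeout=1   # give up quickly when not running on EC2
    --read-timeout=5      # bounds idle time; not a total limit like --max-time
    --tries=1             # wget retries by default, so limit it to one attempt
    -O /dev/null          # discard the body; only the exit status matters
  )
  wget "${wget_args[@]}" "${IAM_CREDS_URL}"
}
```

The function returns wget's exit status, so a caller can write
"if has_iam_role_credentials; then ...; fi".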



--
To view, visit http://gerrit.cloudera.org:8080/8294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 3
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 13 Nov 2017 22:26:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

2017-11-13 Thread Laszlo Gaal (Code Review)
Hello Lars Volker, Michael Brown, Jim Apple, Philip Zeyliger, Sailesh Mukil, 
David Knupp, Joe McDonnell, Tim Armstrong, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8294

to look at the new patch set (#3).

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
..

IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

For some time Impala in a production environment has been able
to access data stored in Amazon S3 buckets using credentials specified
in a number of ways:
- storing Amazon access keys in environment variables or
  in core-site.xml.
- using proprietary management tools to store Amazon access keys
  securely
- using Amazon IAM roles bound to VMs running in EC2.

The development minicluster environment used the first approach,
which risked leaking these keys.

This change enables Impala builds to use IAM
roles to access S3 buckets when running on an Amazon EC2 virtual
machine. The changes mainly ensure that environment variables and/or Jenkins
parameters carrying the traditional AWS credentials do not conflict with
credentials supplied by the IAM role attached to the VM instance.

The change also moves the logic performing the S3 access checks into a separate
script file: bin/check-s3-access.sh.

IAM role based credentials are accessible through the EC2
instance metadata mechanism; for further details see Amazon's docs at
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html#instance-metadata-security-credentials

Changes to the configuration script:
1. bin/impala-config.sh stops setting the AWS_* environment variables
   to dummy default values. When AWS credentials are not supplied in
   the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY,
   these variables are unset (removed from the environment), otherwise
   they would interfere with authentication based on the IAM role.
2. Having AWS credentials in the AWS_* environment variables is now
   optional. They are still accepted to allow for private test runs
   accessing private/nondefault buckets with custom credentials.
3. bin/impala-config.sh now calls bin/check-s3-access.sh to perform the actual
   S3-dependent checks. check-s3-access.sh contains the S3-specific logic and
   network access needed to check if the requested S3 bucket is accessible
   for the build.
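
Item 1 above can be sketched roughly as follows. This is a minimal
illustration, not the patch itself: the function name is hypothetical, and
only the condition is taken from the line quoted in the review of patch set 2.

```shell
# Hypothetical helper: drop empty AWS credential variables so that they
# cannot interfere with credentials supplied by an IAM role.
drop_empty_aws_credentials() {
  # The subshell switches xtrace off locally, so the key values are not
  # echoed to the log even when the caller runs under "set -x".
  if (set +x; [[ -z ${AWS_ACCESS_KEY_ID-} && -z ${AWS_SECRET_ACCESS_KEY-} ]]); then
    # Remove the variables entirely rather than leaving them empty.
    unset AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY
  fi
}
```

When both variables carry values, the function leaves them untouched, which
matches item 2: explicit credentials remain an accepted option.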

Changes to the minicluster configuration:
1. Security credentials for the s3n: connector, located in core-site.xml,
   are no longer replaced with actual AWS_* credentials when configuring
   the minicluster. These parameters are used for some front-end tests,
   which don't actually reach out to S3; the s3n: notation just simulates
   non-HDFS storage.
   For these tests to work, s3n: authentication parameters still need to
   exist in core-site.xml. Their values do not matter, so the configuration
   template now has fixed dummy values for these parameters.

2. Remove empty s3a: security parameter sections from core-site.xml:
   The testdata/cluster/admin setup script substitutes values from
   environment variables into core-site.xml when it sets up the minicluster
   runtime environment.

   The configuration section for s3a: credentials is now completely
   removed if both of the following conditions are met:
   - the target filesystem is set to "s3"
   - the AWS credential environment variables AWS_ACCESS_KEY_ID
     and AWS_SECRET_ACCESS_KEY are both empty or missing.

   The configuration file core-site.xml.tmpl is extended with
   comment markers that delimit the section to be removed in this case.
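
The marker-based removal described above could be sketched as below. The
marker strings, the function name, and the use of the TARGET_FILESYSTEM
variable are assumptions made for illustration; the markers actually added to
core-site.xml.tmpl and the logic in testdata/cluster/admin may differ.

```shell
# Sketch only: filter a core-site.xml template from stdin to stdout.
# The marker comments below are hypothetical placeholders.
render_core_site() {
  if [[ "${TARGET_FILESYSTEM-}" == "s3" \
        && -z "${AWS_ACCESS_KEY_ID-}" && -z "${AWS_SECRET_ACCESS_KEY-}" ]]; then
    # No explicit keys: delete everything from the begin marker through the
    # end marker, so the s3a: connector falls back to IAM role credentials.
    sed '/<!-- BEGIN_S3A_CREDENTIALS -->/,/<!-- END_S3A_CREDENTIALS -->/d'
  else
    # Keys were supplied (or the target is not s3): keep the section as-is.
    cat
  fi
}
```

Used as a filter, for example "render_core_site < core-site.xml.tmpl", the
function either drops the delimited section or passes the template through
unchanged.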

Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
---
A bin/check-s3-access.sh
M bin/impala-config.sh
M testdata/cluster/admin
M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl
4 files changed, 163 insertions(+), 28 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/8294/3
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 3
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

2017-11-08 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8294 )

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
..


Patch Set 2:

I also don't want to block progress on making this good improvement to our S3 
development, but my fear is that this script will get more and more out of 
control if we don't draw a line somewhere about what belongs in it.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 2
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 09 Nov 2017 02:22:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

2017-11-08 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8294 )

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh@294
PS2, Line 294:   if (set +x; [[ -z ${AWS_ACCESS_KEY_ID-} && -z ${AWS_SECRET_ACCESS_KEY-} ]]); then
> I'm not sure that putting the checks into a separate script would change an
Lines 252-264 seem ok - I don't have a problem with setting environment 
variables in this script.


http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh@294
PS2, Line 294:   if (set +x; [[ -z ${AWS_ACCESS_KEY_ID-} && -z ${AWS_SECRET_ACCESS_KEY-} ]]); then
> In most of the use cases the network access is avoided, it happens only if
Yeah, I'd question whether "aws ls" belongs in this script either.

The performance is one thing but I just generally think we should restrict this 
script to doing the minimum possible to set up environment variables. I don't 
see why it should be extended to do heavyweight things to validate the 
configuration - we don't do anything to validate the vast majority of other 
variables that are set in this script.

It does appear that people have added validations in an ad-hoc way before so if 
a majority of people think that that's a good idea, I can yield to that.

We should also keep in mind that this script is a crappy programming 
environment, because it's running in the context of the user's shell and we 
can't use things like "set -x" and have to be careful setting variables that we 
don't intend to leak into the user's shell.

So I think regardless this logic would be more maintainable in a separate 
script to validate the AWS config. My preference is that we also run that 
script from elsewhere to keep impala-config.sh lightweight but if other people 
feel strongly that impala-config.sh should be doing more validation of configs, 
etc then that's not the worst thing in the world.


http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh@303
PS2, Line 303: CURL_ARGS
This variable will leak out into the user's shell.
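
To illustrate the leak: impala-config.sh is sourced, so a plain assignment
inside it lands in the user's shell. The sketch below contrasts that with a
function-local variable. Both function names are hypothetical and the array
contents merely echo the curl options discussed in this review.

```shell
# Sketch only: two ways a sourced script might hold the curl arguments.
check_in_place() {
  # Plain assignment: CURL_ARGS becomes a global of the sourcing shell and
  # is still defined after the script has finished.
  CURL_ARGS=(-sf --connect-timeout 1 --max-time 5)
}

check_contained() {
  # "local" confines the array to this function call; nothing is left
  # behind in the user's shell.
  local -a CURL_ARGS=(-sf --connect-timeout 1 --max-time 5)
}
```

Running the check as a separate script, as proposed elsewhere in this thread,
gives the same isolation because the script executes in its own process.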



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 2
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 09 Nov 2017 02:21:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

2017-11-08 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8294 )

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh@307
PS2, Line 307: if ! curl "${CURL_ARGS[@]}" ; then
curl is not a development dependency in bin/bootstrap_system.sh. I'd suggest 
using wget or adding curl to the apt-get install list in bin/bootstrap_system.sh



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 2
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 18:20:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

2017-11-08 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8294 )

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh@294
PS2, Line 294:   if (set +x; [[ -z ${AWS_ACCESS_KEY_ID-} && -z ${AWS_SECRET_ACCESS_KEY-} ]]); then
> Does impala-config.sh really have to talk to the internet? It just tends to
(I think it's fine if this is put in a separate script and called from 
buildall.sh or testdata/bin/run-all.sh or something like that)



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 2
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 18:01:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

2017-11-08 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8294 )

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
..


Patch Set 2: Code-Review-1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh@294
PS2, Line 294:   if (set +x; [[ -z ${AWS_ACCESS_KEY_ID-} && -z ${AWS_SECRET_ACCESS_KEY-} ]]); then
Does impala-config.sh really have to talk to the internet? It just tends to 
cause problems if impala-config.sh does things aside from setting environment 
variables.



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 2
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 18:01:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

2017-11-08 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8294 )

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
..


Patch Set 2: Code-Review+1


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 2
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 08 Nov 2017 17:05:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

2017-11-08 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8294 )

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
..


Patch Set 2:

> Hi Laszlo,
 >
 > We're interested in this change for avoiding some compatibility
 > problems with Hadoop 3. Any news here?
 >
 > Thanks!

Hi Philip,

Sorry for the late update on this patch, I was detoured by other, more 
timeboxed activities. I have just posted a new patch set and I'm looking 
forward to your comments on it.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 2
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 08 Nov 2017 15:27:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

2017-11-08 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8294 )

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
..


Patch Set 1:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@240
PS1, Line 240: #export AWS_SECRET_ACCESS_KEY="${AWS_SECRET_ACCESS_KEY-}"
 : #export AWS_ACCESS_KEY_ID="${AWS_ACCESS_KEY_ID-}"
> Self-inflicted review finding: remove commented-out lines.
Done


http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@256
PS1, Line 256: if [[ -z ${AWS_ACCESS_KEY_ID-} ]]; then
> (Bash hackery)
Thanks for the tip, Philip, I ended up using it.


http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@268
PS1, Line 268: test
> tests
Done


http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@276
PS1, Line 276: curl -sf --connect-timeout 1 --max-time 5
> Are these curl options common across a wide variety of OSs? If they are new
Yes. I have tested the curl options on all the platforms on which the packaging 
build runs. Curl understands these options on all these platforms.


http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@276
PS1, Line 276: http://169.254.169.254/latest/meta-data/iam/security-credentials/
> Is there a AWS reference page you can add as a comment so I can read up mor
Done


http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@276
PS1, Line 276: 169.254.169.254
> http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.ht
I have added the URL of the Amazon doc page where this mechanism is described.


http://gerrit.cloudera.org:8080/#/c/8294/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java:

http://gerrit.cloudera.org:8080/#/c/8294/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@3000
PS1, Line 3000:   AnalysisError(String.format("load data inpath '%s' %s into table " +
  :   "tpch.lineitem", "s3a://bucket/test-warehouse/test.out", overwrite),
  :   "INPATH location 's3a://bucket/test-warehouse/test.out' must point to an " +
  :   "HDFS, S3A or ADL filesystem.");
> Impala cannot load data from s3n. I think this test is intended to verify o
That's right; I was pattern-matching too eagerly. Thanks for catching this.
I have in fact reverted all the s3n->s3a changes in the FE tests; looks like 
they don't in fact touch S3 and they are OK with some existing but obviously 
fake credentials in core-site.xml.


http://gerrit.cloudera.org:8080/#/c/8294/1/testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl
File testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl:

http://gerrit.cloudera.org:8080/#/c/8294/1/testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl@61
PS1, Line 61: 
> Maybe have a comment explaining testdata/cluster/admin needs this (and the
Done; I referred to the Amazon doc site as well.



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 08 Nov 2017 15:24:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

2017-11-08 Thread Laszlo Gaal (Code Review)
Hello Lars Volker, Michael Brown, Jim Apple, Philip Zeyliger, Sailesh Mukil, 
David Knupp, Joe McDonnell, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8294

to look at the new patch set (#2).

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
..

IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

For some time Impala in a production environment has been able
to access data stored in Amazon S3 buckets using credentials specified
in a number of ways:
- storing Amazon access keys in environment variables or
  in core-site.xml.
- using proprietary management tools to store Amazon access keys
  securely
- using Amazon IAM roles bound to VMs running in EC2.

The development minicluster environment used the first approach,
which risked leaking these keys.

This change paves the way for Impala development setups to use IAM
roles to access S3 buckets when running on an Amazon EC2 virtual
machine. The changes mainly ensure that traditional credentials
supplied in environment variables do not conflict with credentials
supplied by the IAM role attached to the VM instance.
The IAM role based credentials are accessible through the EC2
instance metadata mechanism; for further details see Amazon's docs at
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html#instance-metadata-security-credentials

Changes to the configuration script:
1. bin/impala-config.sh stops setting the AWS_* environment variables
   to dummy default values. When AWS credentials are not supplied in
   the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY,
   these variables are unset (removed from the environment), otherwise
   they would preempt authentication based on the IAM role.
2. Having AWS credentials in the AWS_* environment variables is now
   optional. They are still accepted to allow for private test runs
   accessing private/nondefault buckets with custom credentials.
3. bin/impala-config.sh now checks if credentials are supplied in the
   AWS_* variables or via the IAM role.

Changes to the minicluster configuration:
1. Some front-end tests still refer to the old S3 connector s3n:.
   This connector does not support S3 auth via IAM roles, but this
   is not a problem: these front-end tests don't actually reach out
   to S3; the s3n: notation is just for the FE.
   For these tests to work, authentication parameters still need to
   exist for the s3n: connector in core-site.xml, but the values do
   not matter, so the configuration template now has fixed dummy
   values for the s3n: AWS credentials.

2. Remove empty AWS credentials from core-site.xml.tmpl:
   The testdata/cluster/admin setup script substitutes values from
   environment variables into Hadoop *-site.xml configuration files
   when setting up the minicluster runtime environment.

   The configuration section for s3a: credentials is now completely
   removed if:
   - the target filesystem is set to "s3"
   - and the AWS credential environment variables AWS_ACCESS_KEY_ID
     and AWS_SECRET_ACCESS_KEY are both empty or missing.

   The configuration file core-site.xml.tmpl was extended with
   comment markers that delimit the section to be removed in this case.

Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
---
M bin/impala-config.sh
M testdata/cluster/admin
M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl
3 files changed, 92 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/8294/2
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 2
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

2017-10-24 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8294 )

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
..


Patch Set 1:

Hi Laszlo,

We're interested in this change for avoiding some compatibility problems with 
Hadoop 3. Any news here?

Thanks!


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 24 Oct 2017 23:03:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

2017-10-18 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8294 )

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@276
PS1, Line 276: 169.254.169.254
> What is this address? Can we use a domain name and https?
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html

This is EC2 magic. If you're not on EC2, this will tell you to set the 
environment variables (if you're testing S3). If you're on EC2, this will 
return something. We don't actually check that those credentials can access the 
relevant bucket, but at least it lets you through.
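
The decision flow described above might be sketched like this. It is an
illustration under assumptions, not the logic in the patch: the function name
and the output strings are hypothetical, and the metadata probe is passed in
as a command so the example does not depend on curl or wget.

```shell
# Sketch only: choose_s3_auth PROBE [ARGS...]
# Prints which credential source would be used, or fails with a hint.
choose_s3_auth() {
  # Inspect the keys in a subshell with xtrace off so they are never logged.
  if (set +x; [[ -n ${AWS_ACCESS_KEY_ID-} && -n ${AWS_SECRET_ACCESS_KEY-} ]]); then
    echo "env"
  elif (( $# > 0 )) && "$@" >/dev/null 2>&1; then
    # The probe succeeded, so the EC2 metadata service answered.
    echo "iam-role"
  else
    echo "Set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, or run on an EC2 VM with an IAM role." >&2
    return 1
  fi
}
```

As noted in the comment above, a successful probe only shows that some
credentials exist; it does not prove that they can access the relevant bucket.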



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 18 Oct 2017 21:06:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

2017-10-18 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8294 )

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@276
PS1, Line 276: 169.254.169.254
What is this address? Can we use a domain name and https?

What would a new Impala developer get as a result of this?



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 18 Oct 2017 20:55:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

2017-10-18 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8294 )

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@256
PS1, Line 256: if [[ -z ${AWS_ACCESS_KEY_ID-} ]]; then
> We may want to include a check here that "set -x" has not been enabled?
(Bash hackery)

If you want, you can use a subshell:

$ (set -x; (set +x; [ $USER == philip ])) && echo yes || echo no
+ set +x
yes
$ (set -x; (set +x; [ $USER == phili ])) && echo yes || echo no
+ set +x
no


Return values survive, and you can just do

if (set +x; [[ -z ${AWS_SECRET_ACCESS_KEY-} ]]); then
  ...
fi

I tested something similar above.



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 18 Oct 2017 17:47:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

2017-10-18 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8294 )

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@256
PS1, Line 256: if [[ -z ${AWS_ACCESS_KEY_ID-} ]]; then
We may want to include a check here that "set -x" has not been enabled?

# Prevent leaking the AWS keys to the log
set_x=0
if set +o | grep -q "set -o xtrace"; then
  set_x=1
  set +x
fi

# ... do the work that touches the AWS keys here ...

# Restore xtrace flag
if [[ $set_x -eq 1 ]]; then
  set -x
fi
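
The same save-and-restore idea can be sketched as a reusable wrapper, assuming bash. The function name run_without_xtrace is hypothetical, and the check uses the "$-" option string instead of grepping "set +o"; either test should work:

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (illustrative name): run a command with xtrace
# suppressed, then restore xtrace only if it was enabled beforehand.
run_without_xtrace() {
  local had_xtrace=0
  # $- holds the current single-letter shell options; "x" means xtrace is on.
  if [[ $- == *x* ]]; then
    had_xtrace=1
    set +x
  fi
  "$@"
  local status=$?
  if [[ $had_xtrace -eq 1 ]]; then
    set -x
  fi
  # Hand the wrapped command's exit status back to the caller.
  return $status
}
```

Note that the call to the wrapper is itself traced together with its arguments, so secrets should not be passed as arguments; pass a function or script that reads them from the environment.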



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 18 Oct 2017 17:32:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

2017-10-17 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8294 )

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8294/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java:

http://gerrit.cloudera.org:8080/#/c/8294/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@3000
PS1, Line 3000:   AnalysisError(String.format("load data inpath '%s' %s into table " +
  :   "tpch.lineitem", "s3a://bucket/test-warehouse/test.out", overwrite),
  :   "INPATH location 's3a://bucket/test-warehouse/test.out' must point to an " +
  :   "HDFS, S3A or ADL filesystem.");
Impala cannot load data from s3n. I think this test is intended to verify our 
error message when given an s3n path, so I don't think an s3a path will work 
here. The source of the error is LoadDataStmt::analyzePaths().



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 17 Oct 2017 23:30:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

2017-10-17 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8294 )

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
..


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8294/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8294/1//COMMIT_MSG@9
PS1, Line 9: JENKINS-1102 added the IAM role ImpalaDev to the Impala Jenkins workers
   : to facilitate s3 access without having to carry around AWS credentials
   : in environment variables, from where they were prone to escape to log
   : files posted in public places.
This looks like something specific to a downstream environment and should be 
removed from the commit message.


http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@268
PS1, Line 268: test
tests


http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@276
PS1, Line 276: curl -sf --connect-timeout 1 --max-time 5
Are these curl options common across a wide variety of OSs? If they are newer 
and only work under, say, Ubuntu 16, they would be a problem for contributors 
using older distributions.


http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@276
PS1, Line 276: http://169.254.169.254/latest/meta-data/iam/security-credentials/
Is there an AWS reference page you can add as a comment so I can read up more on 
this?


http://gerrit.cloudera.org:8080/#/c/8294/1/testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl
File testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl:

http://gerrit.cloudera.org:8080/#/c/8294/1/testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl@61
PS1, Line 61: 
Maybe add a comment explaining that testdata/cluster/admin needs this (and the 
END) as a marker, and that it should not be removed?



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 17 Oct 2017 23:09:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

2017-10-17 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8294 )

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@240
PS1, Line 240: #export AWS_SECRET_ACCESS_KEY="${AWS_SECRET_ACCESS_KEY-}"
 : #export AWS_ACCESS_KEY_ID="${AWS_ACCESS_KEY_ID-}"
Self-inflicted review finding: remove commented-out lines.



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 17 Oct 2017 19:45:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

2017-10-17 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8294 )


Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
..

IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

JENKINS-1102 added the IAM role ImpalaDev to the Impala Jenkins workers
to facilitate s3 access without having to carry around AWS credentials
in environment variables, from where they were prone to escape to log
files posted in public places.

This change paves the way for Impala build and test jobs to use the IAM
roles to access s3 buckets. There are a few minor changes that allow
this to happen:

Changes to the configuration script:
1. bin/impala-config.sh stops setting the AWS_* environment variables
   to dummy default values. When AWS credentials are not supplied in
   the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY,
   these variables are unset (removed from the environment), otherwise
   they would preempt authentication based on the IAM role.
2. Having AWS credentials in the AWS_* environment variables is now
   optional. They are still accepted to allow for private test runs
   accessing private/nondefault buckets with custom credentials.
3. bin/impala-config.sh now checks if credentials are supplied in the
   AWS_* variables or via the IAM role.
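
The unset-or-keep behavior described in the list above can be sketched roughly as follows, assuming bash. The function name normalize_aws_env is hypothetical, and treating "either variable missing" as "no credentials supplied" is an assumption of this sketch, not something the change description states:

```shell
#!/usr/bin/env bash
# Sketch only; the function name is hypothetical and not taken from the patch.
# Returns 0 if explicit credentials are present in the environment.
# Otherwise removes both variables and returns 1, so that empty or partial
# values cannot preempt authentication based on the IAM role.
normalize_aws_env() {
  if [[ -n "${AWS_ACCESS_KEY_ID-}" && -n "${AWS_SECRET_ACCESS_KEY-}" ]]; then
    return 0
  fi
  # Assumption: a missing or empty half of the pair means "not supplied".
  unset AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY
  return 1
}

if normalize_aws_env; then
  echo "Using AWS credentials from the environment"
else
  echo "No AWS credentials in the environment; relying on the IAM role"
fi
```

The function must be called directly, not inside $(...), because an unset performed in a subshell would not reach the calling shell.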

Changes to the frontend tests:
1. Some front-end tests still referenced the old s3 connector, s3n:,
   which does not support s3 auth via IAM roles. These locations are
   changed to use the newer s3a:, which is the connector capable of
   using IAM roles for authentication and which is used in all other
   code locations.

Changes to the minicluster setup:
1. As a corollary, the s3n: configuration sections are removed from
   core-site.xml.tmpl.
2. Remove empty AWS credentials from core-site.xml.tmpl:

   The minicluster setup script substitutes values from environment
   variables into Hadoop *-site.xml config files when setting up
   the minicluster runtime environment. The configuration file
   core-site.xml.tmpl contains a section for s3 access, including
   AWS credentials.

   Impala can now use IAM roles for s3 access; this requires the removal
   of environment variables holding AWS credentials, which
   1. breaks the substitution logic in testdata/cluster/admin, and
   2. would break the IAM-based credentials if empty credentials were
      supplied in core-site.xml.

   The fix for all of the above issues is to remove the AWS credential
   settings from the generated core-site.xml if both AWS_ACCESS_KEY_ID and
   AWS_SECRET_ACCESS_KEY environment variables are absent or empty.
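
One way such a conditional removal could look is sketched below, assuming bash and sed. The marker comments and the function name are hypothetical; the real template and testdata/cluster/admin may use different markers and a different mechanism:

```shell
#!/usr/bin/env bash
# Sketch only: the marker strings and the function name are hypothetical.
# Reads a config template on stdin and writes the result to stdout.
strip_aws_credentials_section() {
  if [[ -z "${AWS_ACCESS_KEY_ID-}" && -z "${AWS_SECRET_ACCESS_KEY-}" ]]; then
    # Both variables are absent or empty: delete every line from the BEGIN
    # marker through the END marker, inclusive.
    sed '/<!-- BEGIN AWS CREDENTIALS -->/,/<!-- END AWS CREDENTIALS -->/d'
  else
    # Credentials were supplied: pass the template through unchanged.
    cat
  fi
}
```

A call might look like "strip_aws_credentials_section < core-site.xml.tmpl > core-site.xml", run before or after the usual variable substitution depending on how the surrounding script is organized.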

Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
---
M bin/impala-config.sh
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M testdata/cluster/admin
M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl
5 files changed, 52 insertions(+), 29 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/8294/1
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal