Re: IAM Permissions for AWS / S3_to_Redshift_Operator

2019-01-30 Thread Austin Bennett
Hi Andrew, yes, this is sensible (and close to what I had in mind). I had imagined extending the copy_query to generate different text depending on whether it relies on a role or a secret/access key (to additionally keep the current/base functionality). This looks like it can work for my needs -- and hopefully wou…
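The branching Austin describes could look something like the sketch below. This is not the operator's actual API; the function name and parameters are illustrative, and it only builds the credentials fragment of the COPY text.

```python
def copy_credentials_clause(access_key=None, secret_key=None, role_arn=None):
    """Return the credentials fragment of a Redshift COPY statement.

    Prefers a role ARN (role-based auth) and falls back to access keys
    (the operator's current/base behaviour), as discussed in the thread.
    """
    if role_arn:
        return f"IAM_ROLE '{role_arn}'"
    if access_key and secret_key:
        return (f"CREDENTIALS 'aws_access_key_id={access_key};"
                f"aws_secret_access_key={secret_key}'")
    raise ValueError("need either a role ARN or an access/secret key pair")
```

The calling code would splice this fragment into the full `COPY table FROM 's3://…'` statement, so both authentication styles share one code path.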

Re: IAM Permissions for AWS / S3_to_Redshift_Operator

2019-01-30 Thread Andrew Harmon
Here is the code we came up with. I pulled out some pieces that only pertained to us. This is the execute() of the operator we wrote. Basically, it uses the PostgresHook to connect to Redshift. Then I stored the role ARN in the extras of the Airflow Postgres connection in a key ca…
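A minimal sketch of the query-building part of that approach, assuming the ARN lives in the connection's `extra` JSON under a key named `role_arn` (the thread truncates before giving the real key name, so that is an assumption, as is the CSV format):

```python
import json

def build_copy_query(extra_json, schema, table, s3_path):
    """Build a Redshift COPY whose role ARN comes from connection extras.

    extra_json is the Airflow connection's "extra" field, e.g.
    '{"role_arn": "arn:aws:iam::123456789012:role/redshift-copy"}'.
    """
    role_arn = json.loads(extra_json)["role_arn"]
    return (f"COPY {schema}.{table} FROM '{s3_path}' "
            f"IAM_ROLE '{role_arn}' FORMAT AS CSV;")
```

Inside the operator's execute(), one would then run the resulting SQL through the PostgresHook connected to the Redshift cluster.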

Re: IAM Permissions for AWS / S3_to_Redshift_Operator

2019-01-30 Thread Andrew Harmon
I think what @ash is referring to is that if you have an IAM role associated with an EC2 instance, and your AWS connection in Airflow is left blank, Boto3 will default to that role for any calls made by Boto3. However, in this instance Boto3 is not used; psycopg2 is used to make a connection to Redshift…

Re: IAM Permissions for AWS / S3_to_Redshift_Operator

2019-01-30 Thread Austin Bennett
@Andrew, indeed, having to authenticate to Redshift separately from the credentials that allow S3 access is how I work (outside of Airflow), so it is also sensible that this is how it is done in Airflow. I guess I should use ARN -- rather than IAM -- as the acronym (referring to the redshift-copy role/credentia…

Re: IAM Permissions for AWS / S3_to_Redshift_Operator

2019-01-30 Thread David Cavaletto
This is exactly how we do it. We set AWS_REGION, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN, and AWS_SECURITY_TOKEN as environment variables, and boto3 picks up the role from there. Works great. Here is a good entry point to the AWS docs explaining it: https://docs.aws.amazon.com/cli/latest/usergu…
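Concretely, David's approach amounts to exporting the variables in the worker's environment before any hook runs. A sketch in Python (all values are placeholders; in practice they come from STS or a credential broker, and boto3's default credential chain reads them without any Airflow connection being configured):

```python
import os

# Placeholders only -- real values would come from your credential source.
os.environ["AWS_REGION"] = "us-east-1"
os.environ["AWS_SECRET_ACCESS_KEY"] = "example-secret-key"
os.environ["AWS_SESSION_TOKEN"] = "example-session-token"
os.environ["AWS_SECURITY_TOKEN"] = "example-session-token"  # legacy alias
```

Setting these in the shell that launches the Airflow worker achieves the same thing; the point is that boto3 resolves them itself, so the operator code needs no credential plumbing.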

Re: IAM Permissions for AWS / S3_to_Redshift_Operator

2019-01-30 Thread Andrew Harmon
Maybe just to clarify: to connect to Redshift and issue a COPY, you'll need a Redshift username and password. You would store that in a Postgres connection. This is a username/password in Redshift, not AWS creds. The SQL text needed to issue the COPY requires either AWS creds or the ARN of a role to use. Andrew
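The two separate credential layers Andrew distinguishes can be sketched side by side (every value below is a placeholder, not a real endpoint or secret):

```python
# 1) Redshift database username/password -- stored in the Airflow Postgres
#    connection and used by the database driver (e.g. psycopg2) to open
#    the session to the cluster.
redshift_dsn = ("host=example-cluster.abc123.us-east-1.redshift.amazonaws.com "
                "port=5439 dbname=analytics user=etl_user password=example")

# 2) AWS-side authorization -- embedded in the COPY statement itself,
#    either as access keys or as a role ARN.
copy_sql = ("COPY analytics.events FROM 's3://example-bucket/events/' "
            "IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-copy' CSV;")
```

The first layer authenticates *you* to the database; the second authorizes *the cluster* to read from S3. Neither can substitute for the other.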

Re: IAM Permissions for AWS / S3_to_Redshift_Operator

2019-01-30 Thread Andrew Harmon
Hi, I extended the Redshift operator to pull the role needed for the COPY from the connection object. I stored the role ARN in the extras of the connection. Works well so far. I'm not sure if that helps or if you're looking for an out-of-the-box solution, but I'd be happy to share my code with you.

Re: IAM Permissions for AWS / S3_to_Redshift_Operator

2019-01-30 Thread Austin Bennett
@ash I'll look into that as an option. Given that I am still a novice user, I'm consistently impressed with the simplicity (once understood) given the layers of abstraction. I am not familiar enough with instance profiles to say whether that is suitable. Was reading the copy_query default ( https://…

Re: IAM Permissions for AWS / S3_to_Redshift_Operator

2019-01-30 Thread Ash Berlin-Taylor
If you create an "empty" connection of type "AWS" (i.e. don't specify a username or password), then the AWSHook/S3Hook will use instance profiles. Is that what you want? -ash
> On 30 Jan 2019, at 18:45, Austin Bennett wrote: > […]
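The fallback Ash describes can be sketched as a tiny credential-resolution step. This is a simplification, not the hook's real code: if the connection carries explicit keys they win; otherwise returning nothing lets boto3's default chain (environment variables, then the instance profile) take over.

```python
def resolve_aws_credentials(login, password):
    """Simplified sketch: explicit keys from the Airflow connection if
    present, otherwise None so boto3's default chain resolves credentials
    (env vars, shared config, EC2 instance profile)."""
    if login and password:
        return {"aws_access_key_id": login, "aws_secret_access_key": password}
    return None  # None -> boto3 default credential chain takes over
```

So an "empty" AWS connection is not a misconfiguration; it is the explicit way to say "defer to the instance profile."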

IAM Permissions for AWS / S3_to_Redshift_Operator

2019-01-30 Thread Austin Bennett
Have started to push our group to standardizing on Airflow. We still have a few large Redshift clusters. The s3_to_redshift_operator.py only appears to be written to authenticate via secret/access keys. We no longer use key-based authentication and rely upon role-based, therefore IAM groups. Wh…