This is exactly how we do it: we set AWS_REGION, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN, and AWS_SECURITY_TOKEN as environment variables, and boto3 picks up the role credentials from there. It works great. A good entry point to the AWS docs explaining it: https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html
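
For illustration, a minimal boto3 sketch of that setup. Treat it as a sketch, not the exact code we run: boto3 also needs AWS_ACCESS_KEY_ID alongside the secret key, AWS_SECURITY_TOKEN is just the legacy alias of AWS_SESSION_TOKEN, and the us-west-2 fallback below is only an example value.

    # Assumes the temporary credentials from the assumed role have already been
    # exported into the environment (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY,
    # AWS_SESSION_TOKEN / AWS_SECURITY_TOKEN). Nothing is passed in code;
    # boto3's default credential chain resolves them on its own.
    import os
    import boto3

    # botocore traditionally reads AWS_DEFAULT_REGION; mirror AWS_REGION if only
    # that one is set ("us-west-2" is just an example fallback).
    os.environ.setdefault("AWS_DEFAULT_REGION",
                          os.environ.get("AWS_REGION", "us-west-2"))

    session = boto3.session.Session()
    creds = session.get_credentials()
    print("credentials resolved via:", creds.method)  # "env" when env vars are used
    s3 = session.client("s3")  # the same chain applies to any boto3 client
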
On Wed, Jan 30, 2019 at 4:57 PM Austin Bennett <whatwouldausti...@gmail.com> wrote:

> @ash I'll look into that as an option. Given that I am still a novice user,
> I'm consistently impressed with the simplicity (once understood) given the
> layers of abstraction. I am not familiar enough with instance profiles to
> say whether that is suitable.
>
> I was reading the copy_query default (
> https://github.com/apache/airflow/blob/master/airflow/operators/s3_to_redshift_operator.py#L93
> ):
>
>     copy_query = """
>         COPY {schema}.{table}
>         FROM 's3://{s3_bucket}/{s3_key}/{table}'
>         with credentials
>         'aws_access_key_id={access_key};aws_secret_access_key={secret_key}'
>         {copy_options};
>     """.format(schema=self.schema,
>                table=self.table,
>                s3_bucket=self.s3_bucket,
>                s3_key=self.s3_key,
>                access_key=credentials.access_key,
>                secret_key=credentials.secret_key,
>                copy_options=copy_options)
>
> It seems like there could be issues if keys are not provided, and ultimately
> I'd like to issue an alternate (albeit similar) copy_query.
>
> Currently, for perhaps 100 tables, I am issuing the command:
>
>     COPY <TABLENAME>
>     FROM 's3://<LOCATION>'
>     iam_role 'arn:aws:iam::<####>:role/<redshift-copy-unload>'
>     <OPTIONS DELIMITER GZIP, etc etc>;
>
> Or, from the Redshift docs (linked in this thread):
>
>     copy catdemo
>     from 's3://awssampledbuswest2/tickit/category_pipe.txt'
>     iam_role 'arn:aws:iam::<aws-account-id>:role/<role-name>'
>     region 'us-west-2';
>
> I was anchoring on getting the exact same statement built, which seems
> straightforward (I could sketch that out if you think that would be of
> interest). Perhaps that goal makes sense; whether it is the best solution
> is certainly a different story.
>
> I must acknowledge inexperience with Airflow, so I believe there is more to
> it than my simplistic approach. Redshift with Airflow is queued for another
> couple of weeks (I haven't even touched Redshift in months); I was trying to
> get ahead of things. I can start with an empty connection type and read up
> on instance profiles.
>
> Thank you for the input/feedback!
>
>
> On Wed, Jan 30, 2019 at 1:56 PM Ash Berlin-Taylor <a...@apache.org> wrote:
>
> > If you create an "empty" connection of type "AWS" (i.e. don't specify a
> > username or password) then the AWSHook/S3Hook will use instance profiles.
> >
> > Is that what you want?
> >
> > -ash
> >
> > > On 30 Jan 2019, at 18:45, Austin Bennett <whatwouldausti...@gmail.com>
> > > wrote:
> > >
> > > I have started to push our group to standardize on Airflow. We still
> > > have a few large Redshift clusters.
> > >
> > > The s3_to_redshift_operator.py only appears to be written to
> > > authenticate via secret/access keys. We no longer use key-based
> > > authentication and rely upon role-based authentication, and therefore
> > > IAM roles.
> > >
> > > What would be the recommended way, in Airflow, to rely upon IAM for
> > > authentication for COPY commands to Redshift? Should I rely on a
> > > plugin? I'd be interested to contribute to the project by
> > > updating/extending s3_to_redshift_operator.py, should that be welcome.
> > >
> > > References:
> > > * https://docs.aws.amazon.com/redshift/latest/dg/r_COPY.html
> > > * https://airflow.readthedocs.io/en/latest/_modules/airflow/operators/s3_to_redshift_operator.html
> > > * https://github.com/apache/airflow/blob/master/airflow/operators/s3_to_redshift_operator.py
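
For concreteness, here is a rough sketch of the alternate copy_query described in the quoted message above. It mirrors the operator's current template but swaps the key-based credentials clause for the iam_role form shown in the Redshift docs. The iam_role_arn parameter is hypothetical (the operator does not have it today), so treat this as a starting point for a patch rather than existing API:

    # Hypothetical: self.iam_role_arn would be a new operator argument holding
    # the ARN of a role that Redshift is allowed to assume for COPY/UNLOAD.
    copy_query = """
        COPY {schema}.{table}
        FROM 's3://{s3_bucket}/{s3_key}/{table}'
        iam_role '{iam_role_arn}'
        {copy_options};
    """.format(schema=self.schema,
               table=self.table,
               s3_bucket=self.s3_bucket,
               s3_key=self.s3_key,
               iam_role_arn=self.iam_role_arn,
               copy_options=copy_options)
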