Maybe just to clarify: to connect to Redshift and issue a COPY, you'll need
a Redshift username and password, which you would store in a Postgres
connection. That is a username/password in Redshift, not AWS creds. The SQL
text needed to issue the COPY then requires either AWS creds or the ARN of
a role to use.
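For concreteness, a rough sketch of how those two pieces fit together (the
connection id and role ARN below are made up, and this is untested):

    from airflow.hooks.postgres_hook import PostgresHook

    # Redshift username/password live in a Postgres-type connection.
    hook = PostgresHook(postgres_conn_id='my_redshift_conn')

    # The COPY statement itself carries the role ARN (or AWS keys).
    copy_query = """
        COPY myschema.mytable
        FROM 's3://my-bucket/my-prefix/mytable'
        iam_role 'arn:aws:iam::123456789012:role/redshift-copy-unload'
        CSV GZIP;
    """
    hook.run(copy_query)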
Andrew

On Wed, Jan 30, 2019 at 6:52 PM Andrew Harmon <andrewhhar...@gmail.com>
wrote:

> Hi, I extended the Redshift operator to pull the role needed for the
> copy from the connection object. I stored the role ARN in the extras of
> the connection. Works well so far. I'm not sure if that helps or if
> you're looking for an out-of-the-box solution, but I'd be happy to share
> my code with you.
>
> Andrew
>
> On Wed, Jan 30, 2019 at 5:57 PM Austin Bennett <
> whatwouldausti...@gmail.com> wrote:
>
>> @ash I'll look into that as an option. Given I am still a novice user,
>> I'm consistently impressed with the simplicity (once understood) given
>> the layers of abstractions. I am not familiar enough with instance
>> profiles to say whether that is suitable.
>>
>> I was reading the copy_query default (
>> https://github.com/apache/airflow/blob/master/airflow/operators/s3_to_redshift_operator.py#L93
>> ):
>>
>> copy_query = """
>>     COPY {schema}.{table}
>>     FROM 's3://{s3_bucket}/{s3_key}/{table}'
>>     with credentials
>>     'aws_access_key_id={access_key};aws_secret_access_key={secret_key}'
>>     {copy_options};
>> """.format(schema=self.schema,
>>            table=self.table,
>>            s3_bucket=self.s3_bucket,
>>            s3_key=self.s3_key,
>>            access_key=credentials.access_key,
>>            secret_key=credentials.secret_key,
>>            copy_options=copy_options)
>>
>> It seems there could be issues if keys are not provided, and ultimately
>> I'd like to issue an alternate (albeit similar) copy_query.
>>
>> Currently, for perhaps 100 tables, I am issuing the command:
>>
>> COPY <TABLENAME>
>> FROM 's3://<LOCATION>'
>> iam_role 'arn:aws:iam::<####>:role/<redshift-copy-unload>'
>> <OPTIONS: DELIMITER, GZIP, etc.>;
>>
>> Or, from their docs (linked in this thread):
>>
>> copy catdemo
>> from 's3://awssampledbuswest2/tickit/category_pipe.txt'
>> iam_role 'arn:aws:iam::<aws-account-id>:role/<role-name>'
>> region 'us-west-2';
>>
>> I was anchoring on getting the exact same statement built, which seems
>> straightforward (I could sketch that out, if you think that'd be of
>> interest). Perhaps that goal makes sense (whether it is the best
>> solution is certainly a different story).
>>
>> I must acknowledge inexperience with Airflow, so I believe there is
>> more to it than my simplistic approach. Redshift with Airflow is queued
>> for another couple of weeks (I haven't even touched Redshift in
>> months); I was trying to get ahead of things. I can start with an empty
>> connection type and read up on instance profiles.
>>
>> Thank you for the input/feedback!
>>
>> On Wed, Jan 30, 2019 at 1:56 PM Ash Berlin-Taylor <a...@apache.org>
>> wrote:
>>
>> > If you create an "empty" connection of type "AWS" (i.e. don't specify
>> > a username or password) then the AWSHook/S3Hook will use instance
>> > profiles.
>> >
>> > Is that what you want?
>> >
>> > -ash
>> >
>> > > On 30 Jan 2019, at 18:45, Austin Bennett
>> > > <whatwouldausti...@gmail.com> wrote:
>> > >
>> > > We have started to push our group toward standardizing on Airflow.
>> > > We still have a few large Redshift clusters.
>> > >
>> > > The s3_to_redshift_operator.py appears to be written to
>> > > authenticate only via secret/access keys. We no longer use
>> > > key-based authentication and rely upon role-based authentication,
>> > > therefore IAM roles.
>> > >
>> > > What would be the recommended way, in Airflow, to rely upon IAM for
>> > > authentication for COPY commands to Redshift? Should I rely on a
>> > > plugin?
>> > > I'd be interested to contribute to the project by
>> > > updating/extending s3_to_redshift_operator.py, should that be
>> > > welcome.
>> > >
>> > > References:
>> > > * https://docs.aws.amazon.com/redshift/latest/dg/r_COPY.html
>> > > * https://airflow.readthedocs.io/en/latest/_modules/airflow/operators/s3_to_redshift_operator.html
>> > > * https://github.com/apache/airflow/blob/master/airflow/operators/s3_to_redshift_operator.py
>> >
>> >
> --
> Sent from Gmail Mobile
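For anyone finding this thread later, here is a rough, untested sketch of
the approach Andrew describes above: subclass the operator and read the
role ARN from the Redshift connection's Extra JSON. The 'iam_role_arn' key
name is just an example, not an Airflow convention:

    from airflow.hooks.postgres_hook import PostgresHook
    from airflow.operators.s3_to_redshift_operator import S3ToRedshiftTransfer


    class S3ToRedshiftIamTransfer(S3ToRedshiftTransfer):
        """COPY from S3 to Redshift using an IAM role instead of keys."""

        def execute(self, context):
            hook = PostgresHook(postgres_conn_id=self.redshift_conn_id)

            # Role ARN stored in the Redshift connection's Extra field,
            # e.g. {"iam_role_arn": "arn:aws:iam::<####>:role/<name>"}.
            extras = hook.get_connection(self.redshift_conn_id).extra_dejson
            iam_role = extras['iam_role_arn']

            copy_options = '\n\t\t\t'.join(self.copy_options)
            copy_query = """
                COPY {schema}.{table}
                FROM 's3://{s3_bucket}/{s3_key}/{table}'
                iam_role '{iam_role}'
                {copy_options};
            """.format(schema=self.schema,
                       table=self.table,
                       s3_bucket=self.s3_bucket,
                       s3_key=self.s3_key,
                       iam_role=iam_role,
                       copy_options=copy_options)

            self.log.info('Executing COPY command...')
            hook.run(copy_query, self.autocommit)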