A Variable is either encrypted in full as a single blob or stored entirely in plain text, so it can't hide just some of the values. I think switching to a single EMR connection type, and requiring the key/secret/role info to be in that connection, makes the most sense.
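To make the single-connection idea concrete, here is a rough sketch. It is not existing Airflow code: StandInConnection, resolve_emr_settings and the extra keys (role_arn, region_name, job_flow) are made-up names for illustration, and I'm assuming the key and secret would sit in the connection's login and password fields with everything else in the JSON extra.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class StandInConnection:
    """Stand-in for an Airflow Connection row (hypothetical, not the real model)."""
    login: Optional[str] = None     # assumed to hold the AWS access key id
    password: Optional[str] = None  # assumed to hold the AWS secret access key
    extra: Optional[str] = None     # JSON string with role, region and job flow config


def resolve_emr_settings(conn, job_flow_overrides=None):
    """Split one EMR connection into AWS credentials and a job flow config.

    The extra keys used here (role_arn, region_name, job_flow) are
    assumptions made for this sketch only.
    """
    extra = json.loads(conn.extra) if conn.extra else {}

    credentials = {
        "aws_access_key_id": conn.login,
        "aws_secret_access_key": conn.password,
        "role_arn": extra.get("role_arn"),
        "region_name": extra.get("region_name"),
    }

    # Start from the config stored on the connection, then apply
    # per-task overrides on top (shallow merge).
    job_flow = dict(extra.get("job_flow", {}))
    job_flow.update(job_flow_overrides or {})
    return credentials, job_flow
```

With something like this the operator would only need an emr_conn_id, and the Hive metastore password would stay hidden along with the rest of the connection.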
-a

> On 15 Apr 2019, at 12:03, Daniel Mateus Pires <dmate...@gmail.com> wrote:
>
> In our company we use EMR based operators a lot and it's always been
> confusing for new users to find the different kinds of EMR clusters as
> "Connections".
>
> Not sure you could just remove the aws_conn_id, because the emr_conn_id
> doesn't define which AWS account, which region, which profile to use etc.;
> this is the role of the aws_conn_id.
>
> I think Variables would make more sense, although I'm not super familiar
> with Variables either (can they hide some values?). I'm asking because it's
> common for us to have Hive metastore username and password inside the EMR
> definition, so at least Airflow Connections would hide that.
>
> On Mon, 15 Apr 2019 at 11:52, Ash Berlin-Taylor <a...@apache.org> wrote:
>
>> Or we should remove the aws_conn_id from the Emr* (hook and op) rather
>> than passing in two connection types.
>>
>> Anyone have a thought as to which way to go?
>>
>>> On 15 Apr 2019, at 11:51, Ash Berlin-Taylor <a...@apache.org> wrote:
>>>
>>> We have an EMR connection type, but the operator actually uses this as a
>>> config value, and the actual credentials come from the default aws_conn_id:
>>>
>>>     def __init__(
>>>             self,
>>>             aws_conn_id='s3_default',
>>>             emr_conn_id='emr_default',
>>>             job_flow_overrides=None,
>>>             region_name=None,
>>>             *args, **kwargs):
>>>
>>> Oh also: that _should_ not say 's3_default' anymore :D
>>>
>>> I would like to propose then that we remove the emr_default connection,
>>> and any reference to a connection in the EMR* Operators, and instead change
>>> the EMR config to come from a Variable instead.
>>>
>>> -ash