Joel Croteau created AIRFLOW-4408:
-------------------------------------

             Summary: Add template expansion support to properties arguments 
for DataProc operators
                 Key: AIRFLOW-4408
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4408
             Project: Apache Airflow
          Issue Type: Improvement
          Components: contrib, gcp, operators
    Affects Versions: 1.10.3
            Reporter: Joel Croteau


Most GCP Dataproc operators take a `properties` argument of some sort, in the 
form of a `dict` of string->string that sets details of how the job is to be 
run on the cluster and other useful properties. It would be very nice if we 
could expand templates in the components of this `dict` in the same way we do 
for other string arguments. In particular, I am enabling log aggregation on a 
cluster, and would like to specify `yarn:yarn.nodemanager.remote-app-log-dir` 
to include the cluster name, which itself includes a template for the run date.

It seems like it would be easy enough to iterate through the keys and values of 
the dict and build a new dict with the template-expanded results. One potential 
problem is that distinct templated keys might resolve to the same string after 
expansion, but this seems like a corner case we could deal with by raising an 
exception, or by expanding only the values and leaving the keys alone, as I'm 
not really sure how useful templated property names would actually be.
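
A rough sketch of what this could look like, not tied to any actual operator 
code: the helper names `render_properties` and `make_toy_renderer`, the 
`expand_keys` flag, the bucket path, and the date are all made up for 
illustration. In Airflow the `render` callable would presumably be whatever the 
operator already uses to render its other templated fields; a toy regex-based 
stand-in is used here so the example runs on its own.

```python
import re
from typing import Callable, Dict, Mapping


def render_properties(
    properties: Mapping[str, str],
    render: Callable[[str], str],
    expand_keys: bool = False,
) -> Dict[str, str]:
    """Return a new dict with templates expanded in values (and optionally keys)."""
    rendered: Dict[str, str] = {}
    for key, value in properties.items():
        new_key = render(key) if expand_keys else key
        if new_key in rendered:
            # Two distinct templated keys resolved to the same property name.
            raise ValueError(f"Duplicate property name after expansion: {new_key!r}")
        rendered[new_key] = render(value)
    return rendered


def make_toy_renderer(context: Mapping[str, str]) -> Callable[[str], str]:
    """Hypothetical stand-in for Jinja: substitutes `{{ name }}` from context."""
    pattern = re.compile(r"\{\{\s*(\w+)\s*\}\}")
    return lambda text: pattern.sub(lambda m: context[m.group(1)], text)


# Assumed context value; a real run would get this from the task context.
render = make_toy_renderer({"ds_nodash": "20190425"})
props = {
    "yarn:yarn.nodemanager.remote-app-log-dir":
        "gs://my-bucket/logs/cluster-{{ ds_nodash }}",
}
expanded = render_properties(props, render)
print(expanded)
```

With `expand_keys` left at its default, only values are rendered, so the 
collision case cannot arise; turning it on enables key expansion and the 
duplicate check.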



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
