[ 
https://issues.apache.org/jira/browse/AIRFLOW-2508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16749932#comment-16749932
 ] 

ASF GitHub Bot commented on AIRFLOW-2508:
-----------------------------------------

ashb commented on pull request #4292: [AIRFLOW-2508] Handle non string types in 
Operators templatized fields
URL: https://github.com/apache/airflow/pull/4292
 
 
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Handle non string types in render_template_from_field
> -----------------------------------------------------
>
>                 Key: AIRFLOW-2508
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2508
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: models
>    Affects Versions: 2.0.0
>            Reporter: Eugene Brown
>            Assignee: Galak
>            Priority: Minor
>              Labels: easyfix, newbie
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> The render_template_from_field method of the BaseOperator class raises an 
> exception when it encounters content that is not a string_type, list, tuple 
> or dict.
> Example exception:
> {noformat}
> airflow.exceptions.AirflowException: Type '<type 'int'>' used for parameter 
> 'job_flow_overrides[Instances][InstanceGroups][InstanceCount]' is not 
> supported for templating{noformat}
> I propose instead that when it encounters content of other types it returns 
> the content unchanged, rather than raising an exception.
> Consider this case: I extended the EmrCreateJobFlowOperator to make the 
> job_flow_overrides argument a templatable field. job_flow_overrides is a 
> dictionary with a mix of strings, integers and booleans for values.
> When I extended the class as such:
> {code:java}
> class EmrCreateJobFlowOperatorTemplateOverrides(EmrCreateJobFlowOperator):
>     template_fields = ['job_flow_overrides']{code}
> And added a task to my dag with this format:
> {code:java}
> step_create_cluster = EmrCreateJobFlowOperatorTemplateOverrides(
>     task_id="create_cluster",
>     job_flow_overrides={
>         "Name": "my-cluster {{ dag_run.conf['run_date'] }}",
>         "Instances": {
>             "InstanceGroups": [
>                 {
>                     "Name": "Master nodes",
>                     "InstanceType": "c3.4xlarge",
>                     "InstanceCount": 1
>                 },
>                 {
>                     "Name": "Slave nodes",
>                     "InstanceType": "c3.4xlarge",
>                     "InstanceCount": 4
>                 },
>                 "TerminationProtected": False
>             ]
>         },
>         "BootstrapActions": [{
>              "Name": "Custom action",
>              "ScriptBootstrapAction": {
>                  "Path": "s3://repo/{{ dag_run.conf['branch'] 
> }}/requirements.txt"
>              }
>         }],
>    },
>    aws_conn_id='aws_default',
>    emr_conn_id='aws_default',
>    dag=dag
> )
> {code}
> The exception I gave above was raised and the step failed. I think it would 
> be preferable for the method to instead pass over numeric and boolean values 
> as users may want to use template_fields in the way I have to template string 
> values in dictionaries or lists of mixed types.
> Here is the render_template_from_field method from the BaseOperator:
> {code:java}
> def render_template_from_field(self, attr, content, context, jinja_env):
>     """
>     Renders a template from a field. If the field is a string, it will
>     simply render the string and return the result. If it is a collection or
>     nested set of collections, it will traverse the structure and render
>     all strings in it.
>     """
>     rt = self.render_template
>     if isinstance(content, six.string_types):
>         result = jinja_env.from_string(content).render(**context)
>     elif isinstance(content, (list, tuple)):
>         result = [rt(attr, e, context) for e in content]
>     elif isinstance(content, dict):
>         result = {
>             k: rt("{}[{}]".format(attr, k), v, context)
>             for k, v in list(content.items())}
>     else:
>         param_type = type(content)
>         msg = (
>             "Type '{param_type}' used for parameter '{attr}' is "
>             "not supported for templating").format(**locals())
>         raise AirflowException(msg)
>     return result{code}
>  I propose that the method returns content unchanged if the content is of one 
> of (int, float, complex, bool) types. So my solution would include an extra 
> elif in the form:
> {code}
> elif isinstance(content, (int, float, complex, bool)):
>     result = content
> {code}
>  Are there any reasons this would be a bad idea?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to