[jira] [Assigned] (AIRFLOW-2009) DataFlowHook does not use correct service account

2019-01-20 Thread Feng Lu (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feng Lu reassigned AIRFLOW-2009:


Assignee: Feng Lu

> DataFlowHook does not use correct service account
> -------------------------------------------------
>
> Key: AIRFLOW-2009
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2009
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: Dataflow, hooks
>Affects Versions: 2.0.0
>Reporter: Jessica Laughlin
>Assignee: Feng Lu
>Priority: Major
>
> We have been using the DataFlowOperator to schedule DataFlow jobs.
> We found that the DataFlowHook used by the DataFlowOperator doesn't actually 
> use the passed `gcp_conn_id` to schedule the DataFlow job; it only uses it to 
> read the results afterwards. 
> Code 
> (https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/hooks/gcp_dataflow_hook.py#L158):
> _Dataflow(cmd).wait_for_done()
> _DataflowJob(self.get_conn(), variables['project'],
>              name, self.poll_sleep).wait_for_done()
> The first line here should also be using self.get_conn(). 
> For this reason, our tasks using the DataFlowOperator have actually been 
> scheduling DataFlow jobs as the default Google Compute Engine service account 
> (which has DataFlow permissions). The permissions error only surfaces in the 
> second line, where our provided service account (which does not have DataFlow 
> permissions) is used. 
> I would like to fix this bug, but have to work around it at the moment due to 
> time constraints. 
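A rough sketch of the behaviour described above, with one possible workaround. This is not Airflow's actual code; the function names and the key-file path are hypothetical. The point is that `_Dataflow(cmd)` shells out to a subprocess, which authenticates with whatever Application Default Credentials the worker environment provides (on GCE, the default Compute Engine service account) unless the credentials tied to `gcp_conn_id` are injected explicitly:

```python
import os
import subprocess


def build_launch_env(key_file=None):
    """Build the environment for the Dataflow launch subprocess.

    Without key_file, the child inherits ambient Application Default
    Credentials, so on GCE it runs as the default Compute Engine
    service account -- the behaviour reported in this issue.
    """
    env = dict(os.environ)
    if key_file:
        # Pin ADC to the service-account key associated with the
        # Airflow connection, so the launch and the polling steps
        # both use the same identity.
        env["GOOGLE_APPLICATION_CREDENTIALS"] = key_file
    return env


def launch_dataflow_job(cmd, key_file=None):
    # Run the launch command (the pipeline invocation built by the
    # hook) under the chosen credentials; polling happens separately.
    return subprocess.run(cmd, env=build_launch_env(key_file), check=True)
```

With a sketch like this, passing the connection's key file to the launch step would make the scheduling call and the subsequent `_DataflowJob(self.get_conn(), ...)` polling call agree on the service account, which is where the reported mismatch comes from.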



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (AIRFLOW-2009) DataFlowHook does not use correct service account

2019-01-17 Thread Wilson Lian (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilson Lian reassigned AIRFLOW-2009:


Assignee: (was: Wilson Lian)






[jira] [Assigned] (AIRFLOW-2009) DataFlowHook does not use correct service account

2019-01-17 Thread Wilson Lian (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilson Lian reassigned AIRFLOW-2009:


Assignee: Wilson Lian



