[jira] [Commented] (BEAM-6910) Beam does not consider BigQuery's processing location when getting query results

2020-06-01 Thread Beam JIRA Bot (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-6910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17122875#comment-17122875
 ] 

Beam JIRA Bot commented on BEAM-6910:
-

This issue is P2 but has been unassigned without any comment for 60 days so it 
has been labeled "stale-P2". If this issue is still affecting you, we care! 
Please comment and remove the label. Otherwise, in 14 days the issue will be 
moved to P3.

Please see https://beam.apache.org/contribute/jira-priorities/ for a detailed 
explanation of what these priorities mean.


> Beam does not consider BigQuery's processing location when getting query 
> results
> 
>
> Key: BEAM-6910
> URL: https://issues.apache.org/jira/browse/BEAM-6910
> Project: Beam
>  Issue Type: Bug
>  Components: dependencies, runner-dataflow, sdk-py-core
>Affects Versions: 2.11.0
> Environment: Python
>Reporter: Graham Polley
>Priority: P2
>  Labels: stale-P2
>
> When using the BigQuery source with a SQL query in a pipeline, the 
> "processing location" is not taken into consideration and the pipeline fails.
> For example, consider the following which uses {{BigQuerySource}} to read 
> from BigQuery using some SQL. The BigQuery dataset and tables are located in 
> {{australia-southeast1}}. The query is submitted successfully ([Beam works 
> out the processing location by examining the first table referenced in the 
> query and sets it 
> accordingly|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/bigquery_tools.py#L221]),
>  but when Beam attempts to poll for the job status after it has been 
> submitted, it fails because it doesn't set the {{location}} to be 
> {{australia-southeast1}}, which is required by BigQuery:
>  
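> {code:python}
> import apache_beam as beam
> # p is an existing beam.Pipeline; the table lives in australia-southeast1.
> p | 'read' >> beam.io.Read(beam.io.BigQuerySource(
>     use_standard_sql=True,
>     query='SELECT * FROM `a_project_id.dataset_in_australia.table_in_australia`'))
> {code}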
>  
> {code:java}
> HttpNotFoundError: HttpError accessing 
> :
>  response: <{'status': '404', 'content-length': '328', 'x-xss-protection': 
> '1; mode=block', 'x-content-type-options': 'nosniff', 'transfer-encoding': 
> 'chunked', 'vary': 'Origin, X-Origin, Referer', 'server': 'ESF', 
> '-content-encoding': 'gzip', 'cache-control': 'private', 'date': 'Tue, 26 Mar 
> 2019 03:11:32 GMT', 'x-frame-options': 'SAMEORIGIN', 'alt-svc': 'quic=":443"; 
> ma=2592000; v="46,44,43,39"', 'content-type': 'application/json; 
> charset=UTF-8'}>, content <{
>   "error": {
>     "code": 404,
>     "message": "Not found: Job a_project_id:5ad9cc803baa432290b6cd0203f556d9",
>     "errors": [
>       {
>         "message": "Not found: Job a_project_id:5ad9cc803baa432290b6cd0203f556d9",
>         "domain": "global",
>         "reason": "notFound"
>       }
>     ],
>     "status": "NOT_FOUND"
>   }
> }
> {code}
>  
> The problem can be seen/found here:
> [https://github.com/apache/beam/blob/v2.11.0/sdks/python/apache_beam/io/gcp/bigquery_tools.py#L571]
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/bigquery_tools.py#L357]
> The location of the job (in this case {{australia-southeast1}}) needs to 
> be set/inferred (or exposed via the API); otherwise the status poll fails.
>  For reference, Airflow had the same bug/problem: 
> [https://github.com/apache/airflow/pull/4695]
>  
>  
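
For readers of the archive, the failure mode described above can be sketched without any Google libraries. This is a minimal illustration, not Beam's code: FakeJobsService, JobReference, JobNotFound, poll_job and DEFAULT_LOCATIONS are all hypothetical names, and the rule that a lookup without a location only searches US and EU is an assumption made for the sketch. The one thing taken from the real API is that a BigQuery job reference carries a location and that the job lookup accepts it as a parameter.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


class JobNotFound(Exception):
    """Stand-in for the 404 HttpNotFoundError quoted in the report."""


@dataclass(frozen=True)
class JobReference:
    """Hypothetical mirror of the fields a job reference carries."""
    project_id: str
    job_id: str
    location: str


class FakeJobsService:
    """Hypothetical in-memory service, not a real BigQuery client."""

    # Assumption for this sketch: a lookup with no location only
    # searches these multi-regions.
    DEFAULT_LOCATIONS = ("US", "EU")

    def __init__(self) -> None:
        self._jobs: Dict[Tuple[str, str], JobReference] = {}

    def insert(self, project_id: str, job_id: str, location: str) -> JobReference:
        """Record a job and return its reference, including the location."""
        ref = JobReference(project_id, job_id, location)
        self._jobs[(project_id, job_id)] = ref
        return ref

    def get(self, project_id: str, job_id: str,
            location: Optional[str] = None) -> str:
        """Return the job state, or raise JobNotFound if it is not visible."""
        ref = self._jobs.get((project_id, job_id))
        if ref is None:
            raise JobNotFound(f"Not found: Job {project_id}:{job_id}")
        if location is None:
            visible = ref.location in self.DEFAULT_LOCATIONS
        else:
            visible = ref.location == location
        if not visible:
            raise JobNotFound(f"Not found: Job {project_id}:{job_id}")
        return "DONE"


def poll_job(service: FakeJobsService, ref: JobReference,
             pass_location: bool) -> str:
    """Poll once, optionally forwarding the location from the reference."""
    location = ref.location if pass_location else None
    return service.get(ref.project_id, ref.job_id, location=location)


if __name__ == "__main__":
    service = FakeJobsService()
    ref = service.insert("a_project_id", "job1", "australia-southeast1")
    print(poll_job(service, ref, pass_location=True))
    try:
        poll_job(service, ref, pass_location=False)
    except JobNotFound as exc:
        print("poll without location failed:", exc)
```

The design point is that the location is read from the reference returned at submission time, so the polling code never has to guess or re-derive it from the query.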



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-6910) Beam does not consider BigQuery's processing location when getting query results

2019-03-29 Thread niklas Hansson (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16804669#comment-16804669
 ] 

niklas Hansson commented on BEAM-6910:
--

As far as I understand, this issue is solved in BEAM-6909, so I will leave this 
task. Let me know if anything else needs to be done. 

> Beam does not consider BigQuery's processing location when getting query 
> results
> 
>
> Key: BEAM-6910
> URL: https://issues.apache.org/jira/browse/BEAM-6910
> Project: Beam
>  Issue Type: Bug
>  Components: dependencies, runner-dataflow, sdk-py-core
>Affects Versions: 2.11.0
> Environment: Python
>Reporter: Graham Polley
>Assignee: niklas Hansson
>Priority: Major
>





[jira] [Commented] (BEAM-6910) Beam does not consider BigQuery's processing location when getting query results

2019-03-28 Thread niklas Hansson (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16803980#comment-16803980
 ] 

niklas Hansson commented on BEAM-6910:
--

I will start to look at this! 



