[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=393781&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-393781 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 26/Feb/20 22:06
Start Date: 26/Feb/20 22:06
Worklog Time Spent: 10m
Work Description: iht commented on issue #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#issuecomment-591668641

Thank you!

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
Worklog Id: (was: 393781)
Time Spent: 4h 40m (was: 4.5h)

> BigQueryIO.Read needs permissions to create datasets to be able to run queries
>
> Key: BEAM-8458
> URL: https://issues.apache.org/jira/browse/BEAM-8458
> Project: Beam
> Issue Type: Bug
> Components: io-java-gcp
> Reporter: Israel Herraiz
> Assignee: Israel Herraiz
> Priority: Major
> Time Spent: 4h 40m
> Remaining Estimate: 0h
>
> When using {{fromQuery}}, BigQueryIO creates a temp dataset to store the results of the query.
> Therefore, Beam requires permissions to create datasets just to be able to run a query. In practice, this means that Beam requires the roles/bigquery.user role just to run queries, whereas if you use {{from}} (to read from a table), the roles/bigquery.jobUser role suffices.
> BigQueryIO.Read should have an option to set an existing dataset in which to write the temp results of a query, so that the roles/bigquery.jobUser role would be enough.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=393725&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-393725 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 26/Feb/20 19:41
Start Date: 26/Feb/20 19:41
Worklog Time Spent: 10m
Work Description: chamikaramj commented on pull request #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852

Issue Time Tracking
Worklog Id: (was: 393725)
Time Spent: 4.5h (was: 4h 20m)
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=393724&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-393724 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 26/Feb/20 19:40
Start Date: 26/Feb/20 19:40
Worklog Time Spent: 10m
Work Description: chamikaramj commented on issue #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#issuecomment-591606577

Error for failing task seems to be unrelated.

Task 'javaPreCommitPortabilityApiJava11' not found in root project 'beam'.
org.gradle.execution.TaskSelectionException: Task 'javaPreCommitPortabilityApiJava11' not found in root project 'beam'

Merging.

Issue Time Tracking
Worklog Id: (was: 393724)
Time Spent: 4h 20m (was: 4h 10m)
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=393676&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-393676 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 26/Feb/20 18:20
Start Date: 26/Feb/20 18:20
Worklog Time Spent: 10m
Work Description: chamikaramj commented on issue #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#issuecomment-591569326

Run JavaPortabilityApi PreCommit

Issue Time Tracking
Worklog Id: (was: 393676)
Time Spent: 4h 10m (was: 4h)
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=393073&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-393073 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 26/Feb/20 00:10
Start Date: 26/Feb/20 00:10
Worklog Time Spent: 10m
Work Description: chamikaramj commented on issue #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#issuecomment-591155029

Run Java PreCommit

Issue Time Tracking
Worklog Id: (was: 393073)
Time Spent: 4h (was: 3h 50m)
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=393062&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-393062 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 26/Feb/20 00:06
Start Date: 26/Feb/20 00:06
Worklog Time Spent: 10m
Work Description: chamikaramj commented on issue #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#issuecomment-591152984

Run Java PreCommit

Issue Time Tracking
Worklog Id: (was: 393062)
Time Spent: 3h 50m (was: 3h 40m)
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=392758&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-392758 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 25/Feb/20 18:25
Start Date: 25/Feb/20 18:25
Worklog Time Spent: 10m
Work Description: chamikaramj commented on issue #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#issuecomment-590996787

Run Java PostCommit

Issue Time Tracking
Worklog Id: (was: 392758)
Time Spent: 3.5h (was: 3h 20m)
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=392759&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-392759 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 25/Feb/20 18:25
Start Date: 25/Feb/20 18:25
Worklog Time Spent: 10m
Work Description: chamikaramj commented on issue #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#issuecomment-590996862

Run Dataflow ValidatesRunner

Issue Time Tracking
Worklog Id: (was: 392759)
Time Spent: 3h 40m (was: 3.5h)
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=392756&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-392756 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 25/Feb/20 18:24
Start Date: 25/Feb/20 18:24
Worklog Time Spent: 10m
Work Description: chamikaramj commented on issue #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#issuecomment-590996495

LGTM. Thanks.

Issue Time Tracking
Worklog Id: (was: 392756)
Time Spent: 3h 10m (was: 3h)
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=392757&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-392757 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 25/Feb/20 18:24
Start Date: 25/Feb/20 18:24
Worklog Time Spent: 10m
Work Description: chamikaramj commented on issue #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#issuecomment-590996675

Retest this please

Issue Time Tracking
Worklog Id: (was: 392757)
Time Spent: 3h 20m (was: 3h 10m)
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=390319&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-390319 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 20/Feb/20 23:44
Start Date: 20/Feb/20 23:44
Worklog Time Spent: 10m
Work Description: aaltay commented on issue #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#issuecomment-589420201

R: @chamikaramj / @pabloem -- could you please take a look?

Issue Time Tracking
Worklog Id: (was: 390319)
Time Spent: 3h (was: 2h 50m)
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=383681&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-383681 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 07/Feb/20 15:53
Start Date: 07/Feb/20 15:53
Worklog Time Spent: 10m
Work Description: iht commented on issue #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#issuecomment-583465741

I have now addressed all your comments, @chamikaramj.

Please have a look at the new changes. Thanks.

Issue Time Tracking
Worklog Id: (was: 383681)
Time Spent: 2h 50m (was: 2h 40m)
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=383680&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-383680 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 07/Feb/20 15:52
Start Date: 07/Feb/20 15:52
Worklog Time Spent: 10m
Work Description: iht commented on pull request #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#discussion_r376464998

## File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryQuerySourceDef.java ##
@@ -115,19 +127,26 @@ TableReference getTableReference(BigQueryOptions bqOptions, String stepUuid)
         useLegacySql,
         priority,
         location,
+        queryTempDataset,
         kmsKey);
   }

   void cleanupTempResource(BigQueryOptions bqOptions, String stepUuid) throws Exception {
+    Optional queryTempDatasetOpt = Optional.ofNullable(queryTempDataset);
     TableReference tableToRemove =
         createTempTableReference(
-            bqOptions.getProject(), createJobIdToken(bqOptions.getJobName(), stepUuid));
+            bqOptions.getProject(),
+            createJobIdToken(bqOptions.getJobName(), stepUuid),
+            queryTempDatasetOpt);

     BigQueryServices.DatasetService tableService = bqServices.getDatasetService(bqOptions);
     LOG.info("Deleting temporary table with query results {}", tableToRemove);
     tableService.deleteTable(tableToRemove);
-    LOG.info("Deleting temporary dataset with query results {}", tableToRemove.getDatasetId());
-    tableService.deleteDataset(tableToRemove.getProjectId(), tableToRemove.getDatasetId());
+    if (queryTempDatasetOpt.isPresent()) {

Review comment:
Done.

Issue Time Tracking
Worklog Id: (was: 383680)
Time Spent: 2h 40m (was: 2.5h)
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=383678&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-383678 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 07/Feb/20 15:51
Start Date: 07/Feb/20 15:51
Worklog Time Spent: 10m
Work Description: iht commented on pull request #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#discussion_r376464384

## File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java ##
@@ -1502,6 +1521,20 @@ public TableReference getTable() {
       return toBuilder().setQueryLocation(location).build();
     }

+    /**
+     * Temporary dataset reference when using {@link #fromQuery(String)}. When reading from a query,
+     * BigQuery will create a temporary dataset and a temporary table to store the results of the
+     * query. With this option, you can set an existing dataset to create the temporary table.
+     * BigQueryIO will create a temporary table in that dataset, and will remove it once it is not
+     * needed. No other tables in the dataset will be modified. If your job does not have
+     * permissions to create a new dataset, and you want to use {@link #fromQuery(String)} (for
+     * instance, to read from a view), you should use this option. Remember that the dataset must
+     * exist and your job needs permissions to create and remove tables inside that dataset.

Review comment:
I have just added two commits. If the user specifies the temp dataset and is using `fromQuery`:
* check that the specified dataset exists
* check that the destination table does not exist, to avoid overwriting any existing table in the dataset specified by the user (unlikely, due to the random generation of UUIDs for the temp tables, but not impossible)

Issue Time Tracking
Worklog Id: (was: 383678)
Time Spent: 2h 20m (was: 2h 10m)
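The two checks described in the review comment above can be sketched as follows. This is a minimal, standard-library-only illustration and not the code from the pull request: the class `TempDatasetValidationSketch`, the method `validate`, and the map that stands in for the BigQuery dataset service are all hypothetical names introduced here.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the pre-flight checks; not the Beam implementation. */
final class TempDatasetValidationSketch {

  /**
   * Validates a user-specified temp dataset before a query writes into it.
   *
   * @param datasets in-memory stand-in for the dataset service: dataset id to table ids
   * @param datasetId the existing dataset chosen by the user
   * @param tempTableId the generated name of the temporary table
   */
  static void validate(Map<String, Set<String>> datasets, String datasetId, String tempTableId) {
    Set<String> tables = datasets.get(datasetId);
    // Check 1: the dataset must already exist, because the job may not be allowed to create it.
    if (tables == null) {
      throw new IllegalArgumentException("Temp dataset does not exist: " + datasetId);
    }
    // Check 2: never overwrite a table that is already present in the user's dataset.
    if (tables.contains(tempTableId)) {
      throw new IllegalStateException(
          "Temp table already exists: " + datasetId + "." + tempTableId);
    }
  }

  public static void main(String[] args) {
    Map<String, Set<String>> datasets = new HashMap<>();
    datasets.put("existing_dataset", new HashSet<>(Collections.singleton("other_table")));
    validate(datasets, "existing_dataset", "temp_table_abc123");
    System.out.println("validation passed");
  }
}
```

Failing fast on both conditions keeps the user's dataset untouched when something is misconfigured, which matches the intent stated in the comment of not modifying any other table.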
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=383679&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-383679 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 07/Feb/20 15:51
Start Date: 07/Feb/20 15:51
Worklog Time Spent: 10m
Work Description: iht commented on pull request #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#discussion_r376464607

## File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java ##
@@ -1342,16 +1354,23 @@ void cleanup(ContextContainer c) throws Exception {
         BigQueryOptions options = c.getPipelineOptions().as(BigQueryOptions.class);
         String jobUuid = c.getJobId();

+        Optional queryTempDataset = Optional.ofNullable(getQueryTempDataset());

Review comment:
Commit added for that. See also reply to another of your comments.

Issue Time Tracking
Worklog Id: (was: 383679)
Time Spent: 2.5h (was: 2h 20m)
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=382382&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-382382 ] ASF GitHub Bot logged work on BEAM-8458: Author: ASF GitHub Bot Created on: 05/Feb/20 16:57 Start Date: 05/Feb/20 16:57 Worklog Time Spent: 10m Work Description: iht commented on pull request #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read URL: https://github.com/apache/beam/pull/9852#discussion_r375382122 ## File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryQuerySourceDef.java ## @@ -115,19 +127,26 @@ TableReference getTableReference(BigQueryOptions bqOptions, String stepUuid) useLegacySql, priority, location, +queryTempDataset, kmsKey); } void cleanupTempResource(BigQueryOptions bqOptions, String stepUuid) throws Exception { +Optional queryTempDatasetOpt = Optional.ofNullable(queryTempDataset); TableReference tableToRemove = createTempTableReference( -bqOptions.getProject(), createJobIdToken(bqOptions.getJobName(), stepUuid)); +bqOptions.getProject(), +createJobIdToken(bqOptions.getJobName(), stepUuid), +queryTempDatasetOpt); BigQueryServices.DatasetService tableService = bqServices.getDatasetService(bqOptions); LOG.info("Deleting temporary table with query results {}", tableToRemove); tableService.deleteTable(tableToRemove); -LOG.info("Deleting temporary dataset with query results {}", tableToRemove.getDatasetId()); -tableService.deleteDataset(tableToRemove.getProjectId(), tableToRemove.getDatasetId()); +if (queryTempDatasetOpt.isPresent()) { Review comment: Ok. Will push a commit with that change, and a bug fix in line 138. The logic in the if is wrong, it should when the optional is *not* present (meaning, that the user did not specify the dataset, and therefore Beam created it). This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 382382) Time Spent: 2h 10m (was: 2h) > BigQueryIO.Read needs permissions to create datasets to be able to run queries > -- > > Key: BEAM-8458 > URL: https://issues.apache.org/jira/browse/BEAM-8458 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Reporter: Israel Herraiz >Assignee: Israel Herraiz >Priority: Major > Time Spent: 2h 10m > Remaining Estimate: 0h > > When using {{fromQuery}}, BigQueryIO creates a temp dataset to store the > results of the query. > Therefore, Beam requires permissions to create datasets just to be able to > run a query. In practice, this means that Beam requires the role > bigQuery.User just to run queries, whereas if you use {{from}} (to read from > a table), the role bigQuery.jobUser suffices. > BigQueryIO.Read should have an option to set an existing dataset to write > the temp results of > a query, so it would be enough with having the role bigQuery.jobUser. -- This message was sent by Atlassian Jira (v8.3.4#803005)
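The corrected cleanup condition described in the review comment above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the PR: the class TempResourceCleanupSketch, the method cleanupActions, and the string-based "actions" are hypothetical stand-ins for the calls on BigQueryServices.DatasetService quoted in the diff.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Illustrative stand-in for the cleanup step; hypothetical, not the actual Beam code. */
final class TempResourceCleanupSketch {

  /**
   * Lists the cleanup actions in the order they would run.
   *
   * @param userProvidedDataset dataset set by the user, or empty if Beam created one
   * @param generatedDatasetId dataset id Beam would have generated for this job
   * @param tempTableId id of the temporary table holding the query results
   */
  static List<String> cleanupActions(
      Optional<String> userProvidedDataset, String generatedDatasetId, String tempTableId) {
    String datasetId = userProvidedDataset.orElse(generatedDatasetId);
    List<String> actions = new ArrayList<>();
    // The temporary table is always removed, whichever dataset holds it.
    actions.add("deleteTable " + datasetId + "." + tempTableId);
    // Corrected condition: drop the dataset only when the user did NOT supply one,
    // because only then was the dataset created by Beam.
    if (!userProvidedDataset.isPresent()) {
      actions.add("deleteDataset " + datasetId);
    }
    return actions;
  }

  public static void main(String[] args) {
    // Dataset created by Beam: table and dataset are both removed.
    System.out.println(cleanupActions(Optional.empty(), "temp_dataset_job1", "temp_table_job1"));
    // Dataset supplied by the user: only the temporary table is removed.
    System.out.println(
        cleanupActions(Optional.of("shared_ds"), "temp_dataset_job1", "temp_table_job1"));
  }
}
```

With the inverted condition from the original diff, a user-supplied dataset would have been deleted and a Beam-created one left behind, which is the opposite of what the option is meant to do.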
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=382141&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-382141 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 05/Feb/20 08:26
Start Date: 05/Feb/20 08:26
Worklog Time Spent: 10m
Work Description: iht commented on issue #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#issuecomment-582294711

I am working on addressing the suggestions from the code review, and submitting additional commits. Still WIP. I will write again once I have submitted all the changes.

Issue Time Tracking
---
Worklog Id: (was: 382141)
Time Spent: 2h (was: 1h 50m)
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=382139&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-382139 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 05/Feb/20 08:20
Start Date: 05/Feb/20 08:20
Worklog Time Spent: 10m
Work Description: iht commented on pull request #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#discussion_r375112195

## File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryQuerySourceDef.java ##
@@ -68,13 +78,15 @@ private BigQueryQuerySourceDef(
       Boolean useLegacySql,
       BigQueryIO.TypedRead.QueryPriority priority,
       String location,
+      String queryTempDataset,

Review comment:
Changed

Issue Time Tracking
---
Worklog Id: (was: 382139)
Time Spent: 1h 50m (was: 1h 40m)
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=382136&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-382136 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 05/Feb/20 08:15
Start Date: 05/Feb/20 08:15
Worklog Time Spent: 10m
Work Description: iht commented on pull request #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#discussion_r375110144

## File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java ##
@@ -692,8 +693,9 @@ static String getExtractJobId(String jobIdToken) {
     return String.format("%s-extract", jobIdToken);
   }

-  static TableReference createTempTableReference(String projectId, String jobUuid) {
-    String queryTempDatasetId = "temp_dataset_" + jobUuid;
+  static TableReference createTempTableReference(
+      String projectId, String jobUuid, Optional queryTempDatasetIdOpt) {

Review comment:
Changed.

Issue Time Tracking
---
Worklog Id: (was: 382136)
Time Spent: 1h 40m (was: 1.5h)
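The signature change to createTempTableReference quoted in the worklog above suggests that the helper falls back to a generated dataset name when no dataset is supplied. A minimal sketch of that resolution is shown below, under stated assumptions: TableRef is a hypothetical stand-in for BigQuery's TableReference, the "temp_dataset_" prefix comes from the removed line in the diff, and the "temp_table_" prefix is an assumption made only for illustration.

```java
import java.util.Optional;

/** Hypothetical sketch of resolving the temporary table location; not the Beam helper. */
final class TempTableReferenceSketch {

  /** Minimal stand-in for a table reference (project, dataset, table). */
  static final class TableRef {
    final String projectId;
    final String datasetId;
    final String tableId;

    TableRef(String projectId, String datasetId, String tableId) {
      this.projectId = projectId;
      this.datasetId = datasetId;
      this.tableId = tableId;
    }
  }

  static TableRef createTempTableReference(
      String projectId, String jobUuid, Optional<String> tempDatasetIdOpt) {
    // Use the caller's dataset when present; otherwise fall back to the per-job
    // name that the original code always generated.
    String datasetId = tempDatasetIdOpt.orElse("temp_dataset_" + jobUuid);
    // Assumption: the table id is derived from the job UUID so that it is unique per job.
    String tableId = "temp_table_" + jobUuid;
    return new TableRef(projectId, datasetId, tableId);
  }

  public static void main(String[] args) {
    TableRef generated = createTempTableReference("my-project", "job1", Optional.empty());
    TableRef provided = createTempTableReference("my-project", "job1", Optional.of("shared_ds"));
    System.out.println(generated.datasetId + "." + generated.tableId);
    System.out.println(provided.datasetId + "." + provided.tableId);
  }
}
```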
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=382134&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-382134 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 05/Feb/20 08:15
Start Date: 05/Feb/20 08:15
Worklog Time Spent: 10m
Work Description: iht commented on pull request #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#discussion_r375109916

## File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java ##
@@ -274,12 +275,12 @@
  *     .fromQuery("SELECT year, mean_temp FROM [samples.weather_stations]"));
  * }
  *
- * Users can optionally specify a query priority using {@link TypedRead#withQueryPriority(
- * TypedRead.QueryPriority)} and a geographic location where the query will be executed using {@link
- * TypedRead#withQueryLocation(String)}. Query location must be specified for jobs that are not
- * executed in US or EU, or if you are reading from an authorized view. See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query BigQuery Jobs:
- * query.
+ * Users can optionally specify a query priority using {@link
+ * TypedRead#withQueryPriority(TypedRead.QueryPriority)} and a geographic location where the query
+ * will be executed using {@link TypedRead#withQueryLocation(String)}. Query location must be
+ * specified for jobs that are not executed in US or EU, or if you are reading from an authorized

Review comment:
No, I accidentally reformatted these lines, that's why they are in the PR. Let me see if I can undo these changes.

Issue Time Tracking
---
Worklog Id: (was: 382134)
Time Spent: 1h 20m (was: 1h 10m)
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=382135&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-382135 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 05/Feb/20 08:15
Start Date: 05/Feb/20 08:15
Worklog Time Spent: 10m
Work Description: iht commented on pull request #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#discussion_r375110051

## File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java ##
@@ -1342,16 +1354,23 @@ void cleanup(ContextContainer c) throws Exception {
           BigQueryOptions options = c.getPipelineOptions().as(BigQueryOptions.class);
           String jobUuid = c.getJobId();

+          Optional queryTempDataset = Optional.ofNullable(getQueryTempDataset());

Review comment:
Got it, let me add that.

Issue Time Tracking
---
Worklog Id: (was: 382135)
Time Spent: 1.5h (was: 1h 20m)
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=381134&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-381134 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 03/Feb/20 18:51
Start Date: 03/Feb/20 18:51
Worklog Time Spent: 10m
Work Description: chamikaramj commented on pull request #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#discussion_r374276451

## File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryQuerySourceDef.java ##
@@ -115,19 +127,26 @@ TableReference getTableReference(BigQueryOptions bqOptions, String stepUuid)
         useLegacySql,
         priority,
         location,
+        queryTempDataset,
         kmsKey);
   }

   void cleanupTempResource(BigQueryOptions bqOptions, String stepUuid) throws Exception {
+    Optional queryTempDatasetOpt = Optional.ofNullable(queryTempDataset);
     TableReference tableToRemove =
         createTempTableReference(
-            bqOptions.getProject(), createJobIdToken(bqOptions.getJobName(), stepUuid));
+            bqOptions.getProject(),
+            createJobIdToken(bqOptions.getJobName(), stepUuid),
+            queryTempDatasetOpt);

     BigQueryServices.DatasetService tableService = bqServices.getDatasetService(bqOptions);
     LOG.info("Deleting temporary table with query results {}", tableToRemove);
     tableService.deleteTable(tableToRemove);
-    LOG.info("Deleting temporary dataset with query results {}", tableToRemove.getDatasetId());
-    tableService.deleteDataset(tableToRemove.getProjectId(), tableToRemove.getDatasetId());
+    if (queryTempDatasetOpt.isPresent()) {

Review comment:
Probably extract "queryTempDatasetOpt.isPresent()" to an instance variable boolean datasetProvidedByUser (before this point).

Issue Time Tracking
---
Worklog Id: (was: 381134)
Time Spent: 1h (was: 50m)
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=381137&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-381137 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 03/Feb/20 18:51
Start Date: 03/Feb/20 18:51
Worklog Time Spent: 10m
Work Description: chamikaramj commented on pull request #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#discussion_r374270958

## File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java ##
@@ -692,8 +693,9 @@ static String getExtractJobId(String jobIdToken) {
     return String.format("%s-extract", jobIdToken);
   }

-  static TableReference createTempTableReference(String projectId, String jobUuid) {
-    String queryTempDatasetId = "temp_dataset_" + jobUuid;
+  static TableReference createTempTableReference(
+      String projectId, String jobUuid, Optional queryTempDatasetIdOpt) {

Review comment:
Just "tempDatasetId" should be good I think.

Issue Time Tracking
---
Worklog Id: (was: 381137)
Time Spent: 1h 10m (was: 1h)
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=381136&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-381136 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 03/Feb/20 18:51
Start Date: 03/Feb/20 18:51
Worklog Time Spent: 10m
Work Description: chamikaramj commented on pull request #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#discussion_r374275369

## File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java ##
@@ -1502,6 +1521,20 @@ public TableReference getTable() {
       return toBuilder().setQueryLocation(location).build();
     }

+    /**
+     * Temporary dataset reference when using {@link #fromQuery(String)}. When reading from a query,
+     * BigQuery will create a temporary dataset and a temporary table to store the results of the
+     * query. With this option, you can set an existing dataset to create the temporary table.
+     * BigQueryIO will create a temporary table in that dataset, and will remove it once it is not
+     * needed. No other tables in the dataset will be modified. If your job does not have
+     * permissions to create a new dataset, and you want to use {@link #fromQuery(String)} (for
+     * instance, to read from a view), you should use this option. Remember that the dataset must
+     * exist and your job needs permissions to create and remove tables inside that dataset.

Review comment:
We should also make sure that any table that Beam creates or deletes dynamically does not conflict with an existing table in the dataset (at runtime).

Issue Time Tracking
---
Worklog Id: (was: 381136)
Time Spent: 1h 10m (was: 1h)
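One way to picture the runtime conflict check requested in the review comment above is a guard that refuses to use a temporary table id already present in the user's dataset. The sketch below is hypothetical: the class TempTableConflictSketch, the method requireUnusedTableId, and the in-memory set of existing table ids are assumptions for illustration, and the worklog does not say how (or whether) the PR implements such a check.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical guard against clobbering an existing table in a user-provided dataset. */
final class TempTableConflictSketch {

  /**
   * Returns the temporary table id if it is unused, otherwise fails fast.
   *
   * @param existingTableIds table ids already present in the dataset (assumed to be known)
   * @param tempTableId id Beam intends to create and later delete
   */
  static String requireUnusedTableId(Set<String> existingTableIds, String tempTableId) {
    if (existingTableIds.contains(tempTableId)) {
      // Creating or deleting this id would touch a table that Beam does not own.
      throw new IllegalStateException(
          "Temporary table id already exists in dataset: " + tempTableId);
    }
    return tempTableId;
  }

  public static void main(String[] args) {
    Set<String> existing = new HashSet<>(Arrays.asList("orders", "customers"));
    System.out.println(requireUnusedTableId(existing, "temp_table_job1"));
  }
}
```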
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=381138&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-381138 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 03/Feb/20 18:51
Start Date: 03/Feb/20 18:51
Worklog Time Spent: 10m
Work Description: chamikaramj commented on pull request #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#discussion_r374273965

## File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java ##
@@ -1342,16 +1354,23 @@ void cleanup(ContextContainer c) throws Exception {
           BigQueryOptions options = c.getPipelineOptions().as(BigQueryOptions.class);
           String jobUuid = c.getJobId();

+          Optional queryTempDataset = Optional.ofNullable(getQueryTempDataset());

Review comment:
If the dataset is provided by the user, we should try to validate (before pipeline submission) that it exists, unless the user specified withoutValidation().

Issue Time Tracking
---
Worklog Id: (was: 381138)
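The pre-submission validation suggested in the review comment above could look roughly like this. It is only a sketch under stated assumptions: TempDatasetValidationSketch and validateTempDataset are hypothetical names, the validationEnabled flag stands in for the effect of withoutValidation(), and the datasetExists predicate replaces a real lookup against BigQuery.

```java
import java.util.Optional;
import java.util.function.Predicate;

/** Hypothetical pre-submission check for a user-provided temporary dataset. */
final class TempDatasetValidationSketch {

  /**
   * Fails fast when validation is on and the user-provided dataset does not exist.
   *
   * @param tempDataset dataset set by the user, or empty if Beam will create one
   * @param validationEnabled false when the user opted out of validation
   * @param datasetExists lookup that reports whether a dataset id exists (assumed)
   */
  static void validateTempDataset(
      Optional<String> tempDataset, boolean validationEnabled, Predicate<String> datasetExists) {
    // Nothing to check if the user opted out or did not provide a dataset.
    if (!validationEnabled || !tempDataset.isPresent()) {
      return;
    }
    String datasetId = tempDataset.get();
    if (!datasetExists.test(datasetId)) {
      throw new IllegalArgumentException("Temporary dataset does not exist: " + datasetId);
    }
  }

  public static void main(String[] args) {
    // Passes: the dataset is reported as existing.
    validateTempDataset(Optional.of("shared_ds"), true, id -> id.equals("shared_ds"));
    // Passes: validation disabled, so the missing dataset is not checked.
    validateTempDataset(Optional.of("missing_ds"), false, id -> false);
    System.out.println("validation sketch finished");
  }
}
```

Failing before submission surfaces a misconfigured dataset at pipeline construction time instead of at the first query job.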
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=381139&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-381139 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 03/Feb/20 18:51
Start Date: 03/Feb/20 18:51
Worklog Time Spent: 10m
Work Description: chamikaramj commented on pull request #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#discussion_r374272780

## File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java ##
@@ -274,12 +275,12 @@
  *     .fromQuery("SELECT year, mean_temp FROM [samples.weather_stations]"));
  * }
  *
- * Users can optionally specify a query priority using {@link TypedRead#withQueryPriority(
- * TypedRead.QueryPriority)} and a geographic location where the query will be executed using {@link
- * TypedRead#withQueryLocation(String)}. Query location must be specified for jobs that are not
- * executed in US or EU, or if you are reading from an authorized view. See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query BigQuery Jobs:
- * query.
+ * Users can optionally specify a query priority using {@link
+ * TypedRead#withQueryPriority(TypedRead.QueryPriority)} and a geographic location where the query
+ * will be executed using {@link TypedRead#withQueryLocation(String)}. Query location must be
+ * specified for jobs that are not executed in US or EU, or if you are reading from an authorized

Review comment:
Did you mean to mention "withQueryTempDataset" here?

Issue Time Tracking
---
Worklog Id: (was: 381139)
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=381135&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-381135 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 03/Feb/20 18:51
Start Date: 03/Feb/20 18:51
Worklog Time Spent: 10m
Work Description: chamikaramj commented on pull request #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#discussion_r374275949

## File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryQuerySourceDef.java ##
@@ -68,13 +78,15 @@ private BigQueryQuerySourceDef(
       Boolean useLegacySql,
       BigQueryIO.TypedRead.QueryPriority priority,
       String location,
+      String queryTempDataset,

Review comment:
Just "tempDatasetId"

Issue Time Tracking
---
Worklog Id: (was: 381135)
Time Spent: 1h 10m (was: 1h)
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=380548&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380548 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 02/Feb/20 15:44
Start Date: 02/Feb/20 15:44
Worklog Time Spent: 10m
Work Description: iht commented on issue #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#issuecomment-581147697

(PR text updated to provide more details about the intent of this PR)

Issue Time Tracking
---
Worklog Id: (was: 380548)
Time Spent: 50m (was: 40m)
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=370033&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-370033 ]

ASF GitHub Bot logged work on BEAM-8458:

Author: ASF GitHub Bot
Created on: 10/Jan/20 19:51
Start Date: 10/Jan/20 19:51
Worklog Time Spent: 10m
Work Description: iht commented on issue #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#issuecomment-573181124

I will resolve the conflicts, and will add some more docs for this PR.

Issue Time Tracking
---
Worklog Id: (was: 370033)
Time Spent: 40m (was: 0.5h)
[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries
[ https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=367895&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367895 ] ASF GitHub Bot logged work on BEAM-8458: Author: ASF GitHub Bot Created on: 08/Jan/20 00:57 Start Date: 08/Jan/20 00:57 Worklog Time Spent: 10m Work Description: stale[bot] commented on issue #9852: [BEAM-8458] Add option to set temp dataset in BigQueryIO.Read URL: https://github.com/apache/beam/pull/9852#issuecomment-571842474 This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the d...@beam.apache.org list. Thank you for your contributions. Issue Time Tracking --- Worklog Id: (was: 367895) Time Spent: 0.5h (was: 20m)