[ https://issues.apache.org/jira/browse/BEAM-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chamikara Jayalath resolved BEAM-5434.
--------------------------------------
    Resolution: Fixed
    Fix Version/s: Not applicable

> Issue with BigQueryIO in Template
> ---------------------------------
>
>                 Key: BEAM-5434
>                 URL: https://issues.apache.org/jira/browse/BEAM-5434
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-core
>    Affects Versions: 2.5.0
>            Reporter: Amarendra Kumar
>            Assignee: Chamikara Jayalath
>            Priority: Major
>             Fix For: Not applicable
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> I am trying to build a Google Dataflow template to be run from a Cloud Function.
> The issue is with BigQueryIO trying to execute a SQL query.
> The opening step of my Dataflow template is:
> {code:java}
> BigQueryIO.readTableRows()
>     .withQueryLocation("US")
>     .withoutValidation()
>     .fromQuery(options.getSql())
>     .usingStandardSql()
> {code}
> When the template is triggered for the first time, it runs fine.
> But when it is triggered a second time, it fails with the following error:
> {code}
> java.io.FileNotFoundException: No files matched spec: gs://test-notification/temp/Notification/BigQueryExtractTemp/34d42a122600416c9ea748a6e325f87a/000000000000.avro
> 	at org.apache.beam.sdk.io.FileSystems.maybeAdjustEmptyMatchResult(FileSystems.java:172)
> 	at org.apache.beam.sdk.io.FileSystems.match(FileSystems.java:158)
> 	at org.apache.beam.sdk.io.FileBasedSource.createReader(FileBasedSource.java:329)
> 	at com.google.cloud.dataflow.worker.WorkerCustomSources$1.iterator(WorkerCustomSources.java:360)
> 	at com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:177)
> 	at com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:158)
> 	at com.google.cloud.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:75)
> 	at com.google.cloud.dataflow.worker.BatchDataflowWorker.executeWork(BatchDataflowWorker.java:391)
> 	at com.google.cloud.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:360)
> 	at com.google.cloud.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:288)
> 	at com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:134)
> 	at com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:114)
> 	at com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:101)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> {code}
> In the second run, why is the process expecting a file in that GCS location?
> This file does get created while the job is running during the first run, but it also gets deleted after the job completes.
> How are the two jobs related?
> Could you please let me know if I am missing something, or whether this is a bug?

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
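> For context (not from this report): a common cause of this symptom is that a BigQuery query source is evaluated when the template is built, so the temporary extract path gets baked into the template and a second run looks for Avro files the first run already deleted. Beam's {{BigQueryIO}} exposes {{withTemplateCompatibility()}} for exactly this case; a hedged sketch, assuming {{options.getSql()}} is the same {{ValueProvider}}-backed option as in the original snippet:
> {code:java}
> // Sketch only: withTemplateCompatibility() asks BigQueryIO to re-evaluate
> // the query source on each template execution instead of reusing the
> // extract files produced when the template was first run.
> BigQueryIO.readTableRows()
>     .withTemplateCompatibility()
>     .fromQuery(options.getSql())
>     .usingStandardSql()
>     .withQueryLocation("US")
>     .withoutValidation()
> {code}
> Whether this applies here depends on how the template was created and the Beam version in use; the actual resolution for this issue is tracked in the linked JIRA discussion.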