[ https://issues.apache.org/jira/browse/BEAM-9043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brian Hulette updated BEAM-9043:
--------------------------------
    Labels:   (was: stale-P2)

> BigQueryIO fails cryptically if gcpTempLocation is set and tempLocation is not
> ------------------------------------------------------------------------------
>
>                 Key: BEAM-9043
>                 URL: https://issues.apache.org/jira/browse/BEAM-9043
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-gcp
>            Reporter: Brian Hulette
>            Priority: P2
>
> The following error arises when running a pipeline that uses BigQueryIO with
> gcpTempLocation set and tempLocation not set. We should either handle this
> case gracefully, or throw a more helpful error like "please specify
> tempLocation".
> {code:java}
> 2019-12-24 13:06:18 WARN  UnboundedReadFromBoundedSource:152 - Exception while splitting org.apache.beam.sdk.io.gcp.bigquery.BigQueryQuerySource@5d21202d, skips the initial splits.
> java.lang.NullPointerException
>         at java.util.regex.Matcher.getTextLength(Matcher.java:1283)
>         at java.util.regex.Matcher.reset(Matcher.java:309)
>         at java.util.regex.Matcher.<init>(Matcher.java:229)
>         at java.util.regex.Pattern.matcher(Pattern.java:1093)
>         at org.apache.beam.sdk.io.FileSystems.parseScheme(FileSystems.java:447)
>         at org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:533)
>         at org.apache.beam.sdk.io.gcp.bigquery.BigQueryHelpers.resolveTempLocation(BigQueryHelpers.java:706)
>         at org.apache.beam.sdk.io.gcp.bigquery.BigQuerySourceBase.extractFiles(BigQuerySourceBase.java:125)
>         at org.apache.beam.sdk.io.gcp.bigquery.BigQuerySourceBase.split(BigQuerySourceBase.java:148)
>         at org.apache.beam.runners.core.construction.UnboundedReadFromBoundedSource$BoundedToUnboundedSourceAdapter.split(UnboundedReadFromBoundedSource.java:144)
>         at org.apache.beam.runners.dataflow.internal.CustomSources.serializeToCloudSource(CustomSources.java:87)
>         at org.apache.beam.runners.dataflow.ReadTranslator.translateReadHelper(ReadTranslator.java:51)
>         at org.apache.beam.runners.dataflow.DataflowRunner$StreamingUnboundedRead$ReadWithIdsTranslator.translate(DataflowRunner.java:1590)
>         at org.apache.beam.runners.dataflow.DataflowRunner$StreamingUnboundedRead$ReadWithIdsTranslator.translate(DataflowRunner.java:1587)
>         at org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator.visitPrimitiveTransform(DataflowPipelineTranslator.java:475)
>         at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:665)
>         at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
>         at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
>         at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
>         at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
>         at org.apache.beam.sdk.runners.TransformHierarchy$Node.access$600(TransformHierarchy.java:317)
>         at org.apache.beam.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:251)
>         at org.apache.beam.sdk.Pipeline.traverseTopologically(Pipeline.java:460)
>         at org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator.translate(DataflowPipelineTranslator.java:414)
>         at org.apache.beam.runners.dataflow.DataflowPipelineTranslator.translate(DataflowPipelineTranslator.java:173)
>         at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:763)
>         at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:186)
>         at org.apache.beam.sdk.Pipeline.run(Pipeline.java:315)
>         at org.apache.beam.sdk.Pipeline.run(Pipeline.java:301)
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
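
For illustration only, the "more helpful error" the report asks for might look roughly like the guard below. This is a minimal sketch in plain Java, not Beam code: the class `TempLocationCheck`, the method `requireTempLocation`, and the exact message wording are hypothetical names chosen for this example. It only shows the idea of rejecting a null temp location before it reaches a regex matcher, which is where the NullPointerException in the trace above originates.

```java
public class TempLocationCheck {

    /**
     * Hypothetical guard (not part of Beam): returns the given temp location,
     * or fails early with an actionable message instead of letting a null
     * value propagate into scheme parsing.
     */
    public static String requireTempLocation(String tempLocation) {
        if (tempLocation == null || tempLocation.isEmpty()) {
            throw new IllegalArgumentException(
                "BigQueryIO needs a temp location: please specify tempLocation");
        }
        return tempLocation;
    }

    public static void main(String[] args) {
        // A set value passes through unchanged.
        System.out.println(requireTempLocation("gs://example-bucket/tmp"));

        // A missing value yields a descriptive error rather than a bare NPE.
        try {
            requireTempLocation(null);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Where such a check would belong in Beam (for example before the call into `FileSystems.matchNewResource` seen in the trace), and whether falling back to gcpTempLocation is preferable to failing, is left open by the report.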