That was it! Thanks Lukasz. I had to use a custom assembly to get around this. Thanks!
On Thu, Jun 21, 2018 at 3:28 PM Lukasz Cwik <lc...@google.com> wrote:

> The FileSystems API uses a ServiceLoader[1] to find Apache Beam FileSystem
> implementations. The ServiceLoader works by finding "service" files on the
> classpath containing a list of classes implementing the Apache Beam
> FileSystem API. The way in which you're creating an executable jar is likely
> dropping or incorrectly merging service files. The most common case is that
> you're using the Maven shade plugin and you haven't configured it to use the
> services file resource transformer[2]. If you are packaging your executable
> jar a different way, you'll want to look up the documentation for your tool
> and see how it can properly deal with the service files.
>
> 1: https://docs.oracle.com/javase/7/docs/api/java/util/ServiceLoader.html
> 2: https://maven.apache.org/plugins/maven-shade-plugin/examples/resource-transformers.html#ServicesResourceTransformer
>
> On Thu, Jun 21, 2018 at 12:06 PM Sameer Abhyankar <saabhyan...@google.com>
> wrote:
>
>> Hello!
>>
>> I am trying to package a Beam Dataflow pipeline as a self-executing jar
>> using these
>> <https://beam.apache.org/documentation/runners/dataflow/#self-executing-jar>
>> instructions.
>> However, I am running into a weird issue when attempting to execute this
>> jar.
>>
>> My pipeline needs to read a file (Avro schema .avsc) from GCS outside of
>> a PCollection before starting to work with PCollections. In order to do
>> that I use the FileSystems API. This works perfectly fine when I execute
>> the pipeline via mvn compile exec:java ..
>>
>> However, if I attempt to run this as a jar, it appears to treat the GCS
>> path as local and fails with a FileNotFoundException.
>>
>> Exception in thread "main" java.io.FileNotFoundException:
>> /some/local/filesystem/path/myproject/gs:/my-gcs-bucket/schema/my-schema.avsc
>> (No such file or directory)
>>     at java.io.FileInputStream.open0(Native Method)
>>     at java.io.FileInputStream.open(FileInputStream.java:195)
>>     at java.io.FileInputStream.<init>(FileInputStream.java:138)
>>     at org.apache.beam.sdk.io.LocalFileSystem.open(LocalFileSystem.java:113)
>>     at org.apache.beam.sdk.io.LocalFileSystem.open(LocalFileSystem.java:78)
>>     at org.apache.beam.sdk.io.FileSystems.open(FileSystems.java:262)
>>
>> (Note that the input path is correct with the double slash, but the error
>> seems to strip that out,
>> e.g.: --inputPath=gs://my-gcs-bucket/schema/my-schema.avsc)
>>
>> Any pointers on what might be causing this?
>>
>> Thanks,
>> - Sameer
>>
>
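For anyone who hits the same symptom, one way to check whether the service files survived packaging is to list what the classpath actually contains. The sketch below uses only the JDK and is not taken from the thread. The class name `ServiceFileCheck` is hypothetical, and the default service name `org.apache.beam.sdk.io.FileSystemRegistrar` is an assumption about which interface Beam registers its file systems under, so verify it against your Beam version.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

// Hypothetical diagnostic helper, not part of Apache Beam.
public class ServiceFileCheck {

    /**
     * Collects the provider class names listed in every
     * META-INF/services/<serviceName> file visible on the classpath.
     */
    static List<String> listProviders(String serviceName) {
        List<String> providers = new ArrayList<>();
        try {
            ClassLoader loader = ServiceFileCheck.class.getClassLoader();
            Enumeration<URL> files =
                    loader.getResources("META-INF/services/" + serviceName);
            while (files.hasMoreElements()) {
                URL file = files.nextElement();
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(file.openStream(), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        // Service files allow '#' comments; drop them.
                        int hash = line.indexOf('#');
                        if (hash >= 0) {
                            line = line.substring(0, hash);
                        }
                        line = line.trim();
                        if (!line.isEmpty()) {
                            providers.add(line);
                        }
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return providers;
    }

    public static void main(String[] args) {
        // Assumed default; pass another service interface name as the first argument.
        String service = args.length > 0
                ? args[0]
                : "org.apache.beam.sdk.io.FileSystemRegistrar";
        List<String> providers = listProviders(service);
        if (providers.isEmpty()) {
            System.out.println("No service file entries found for " + service);
        } else {
            for (String provider : providers) {
                System.out.println(provider);
            }
        }
    }
}
```

If this class is run with the packaged jar on the classpath and prints no GCS-related registrar (or nothing at all), the packaging step probably dropped or overwrote the service files instead of merging them, which would match the fallback to LocalFileSystem in the stack trace above.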