Beam has a file systems abstraction [1] that we use for artifact staging.

The GCS upload buffer option presumably has no effect when other file
systems are used.
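
As a rough illustration of why both kinds of path can work: the abstraction
picks an implementation from the scheme of the path. The sketch below is a
self-contained toy, not Beam's actual code. The class name, the registry
contents, the example paths, and the exact scheme pattern are all assumptions
made for illustration.

```java
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy illustration of scheme-based file system lookup; not Beam's actual code. */
public class SchemeDispatchSketch {
  // Assumed pattern: a URI-style scheme followed by ":/".
  private static final Pattern SCHEME =
      Pattern.compile("(?<scheme>[a-zA-Z][-a-zA-Z0-9+.]*):/.*");

  // Hypothetical registry mapping a scheme to a description of its handler.
  private static final Map<String, String> REGISTRY =
      Map.of("file", "local file system", "gs", "GCS file system");

  /** Returns the lower-cased scheme of the spec, or "file" if it has none. */
  static String schemeOf(String spec) {
    Matcher m = SCHEME.matcher(spec);
    return m.matches() ? m.group("scheme").toLowerCase(Locale.ROOT) : "file";
  }

  /** Looks up the handler registered for the scheme of the given spec. */
  static String handlerFor(String spec) {
    String scheme = schemeOf(spec);
    String handler = REGISTRY.get(scheme);
    if (handler == null) {
      throw new IllegalArgumentException("No file system registered for scheme: " + scheme);
    }
    return handler;
  }

  public static void main(String[] args) {
    // A plain path has no scheme, so it falls back to the local file system.
    System.out.println(handlerFor("/tmp/beam-artifact-staging"));
    // A gs:// path is routed to the GCS handler.
    System.out.println(handlerFor("gs://my-bucket/staging"));
  }
}
```

In this picture, an option that only the GCS handler reads would simply go
unused when the artifacts directory is a local path.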

[1] https://beam.apache.org/releases/javadoc/2.16.0/org/apache/beam/sdk/io/FileSystems.html

On Tue, Dec 31, 2019 at 9:51 AM Yu Watanabe <[email protected]> wrote:

> Dear developers.
>
> I would like to ask a question regarding the artifacts directory for the
> job server used in the portable runner.
>
> Recently, I had a chat with another Beam user and just realized that
> "--artifacts-directory" supports both a local filesystem path and a GCS
> path.
>
> How does Beam handle both the local filesystem and GCS for the artifacts
> directory?
>
> "FlinkJobServerDriver.java" uses GcsOptions when setting up
> PipelineOptions, so I thought a GCS bucket was the only option,
> but it's not.
> ---------------------------------------------------------------------
>     // TODO: Expose the fileSystem related options.
>     PipelineOptions options = PipelineOptionsFactory.create();
>     // Limiting gcs upload buffer to reduce memory usage while doing
>     // parallel artifact uploads.
>     options.as(GcsOptions.class).setGcsUploadBufferSizeBytes(1024 * 1024);
>     // Register standard file systems.
>     FileSystems.setDefaultPipelineOptions(options);
>     fromParams(args).run();
> ---------------------------------------------------------------------
>
> I would appreciate some clarification on this question.
>
> Best Regards,
> Yu Watanabe
>
> --
> Yu Watanabe
> [email protected]
>
