Hi Beamers,

We can use artifact staging to make sure SDK workers have access to a
pipeline's dependencies. However, artifact staging is not always necessary:
for example, one can make sure ahead of time that the environment already
contains all the dependencies. Even so, my understanding is that an artifact
manifest is written and read whether or not any artifacts are staged. For
example:

INFO AbstractArtifactRetrievalService: GetManifest for
/tmp/beam-artifact-staging/.../MANIFEST -> 0 artifacts

This can be a hassle, because users must set up a staging directory that
all workers can access, even if it isn't used aside from the (empty)
manifest [1]. Thomas mentioned that at Lyft they bypass artifact staging
altogether [2]. So I was wondering, do you all think it would be reasonable
or useful to create an "off switch" for artifact staging?
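
To make the idea concrete, here is a rough sketch in plain Python. It is not
Beam code: the function name get_manifest, the staging_enabled flag, and the
JSON manifest layout are all hypothetical and only illustrate the behavior I
have in mind, namely that with the switch off, no shared directory is touched.

```python
import json
import os
import tempfile


def get_manifest(staging_dir, staging_enabled=True):
    """Return the list of artifact names a worker should fetch.

    Hypothetical helper for illustration only, not a Beam API.
    """
    if not staging_enabled:
        # Off switch: report zero artifacts without touching any directory.
        return []
    # Assumed layout: a JSON file named MANIFEST inside the staging directory.
    manifest_path = os.path.join(staging_dir, "MANIFEST")
    with open(manifest_path) as f:
        return json.load(f).get("artifacts", [])


# With the switch off, a staging directory that does not exist is fine.
no_staging = get_manifest("/nonexistent/staging/dir", staging_enabled=False)

# With the switch on, even an empty manifest needs a readable directory.
with tempfile.TemporaryDirectory() as shared_dir:
    with open(os.path.join(shared_dir, "MANIFEST"), "w") as f:
        json.dump({"artifacts": []}, f)
    with_staging = get_manifest(shared_dir)
```

Both calls yield an empty artifact list, but only the second one requires a
directory that every worker can reach, which is the setup cost I would like
to make optional.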

Thanks,
Kyle

[1]
https://lists.apache.org/thread.html/d293b4158f266be1cb6c99c968535706f491fdfcd4bb20c4e30939bb@%3Cdev.beam.apache.org%3E
[2]
https://issues.apache.org/jira/browse/BEAM-5187?focusedCommentId=16972715&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16972715
