I am deploying Beam pipelines with the DataflowRunner and would like to
move more of the pipeline options to the ValueProvider interface so they
can be specified at runtime rather than at template compile time, but I'm
running into various issues.

First, it's been unclear to the operations engineers deploying pipelines
which options are runtime and which are compile-time. The engineers are
typically using the gcloud command-line interface to deploy rather than the
console, so I don't see much benefit from implementing a metadata json file.

AFAICT, Dataflow happily accepts values for runtime parameters when
compiling the template, but those values are completely ignored and need to
be defined again at runtime, furthering the confusion. Should I just go
ahead and add "(runtime parameter)" to each of the @Description annotations
to at least document the distinction in --help output?
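Concretely, the convention I have in mind would look something like the
following (the interface and option names here are made up for illustration,
not from an actual pipeline):

```java
import org.apache.beam.sdk.options.Description;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.ValueProvider;

public interface MyOptions extends PipelineOptions {
  // Runtime parameter: only read via .get() inside DoFns at execution time.
  @Description("Input file pattern (runtime parameter)")
  ValueProvider<String> getInputFile();
  void setInputFile(ValueProvider<String> value);

  // Compile-time option: baked into the template when it is built.
  @Description("Number of output shards")
  int getNumShards();
  void setNumShards(int value);
}
```

That way `--help=MyOptions` would at least tell the operations engineers
which flags are safe to pass at execution time.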

Finally, it's not clear whether the @Default annotation supports runtime
parameters. The Dataflow docs show an example where @Default is used on a
ValueProvider [0], but this doesn't appear to actually have any effect. If
I don't pass in a value for a runtime parameter when executing a template,
the pipeline throws a "Value only available at runtime" exception on calls
to .get(), rather than returning the default value. Have others encountered
this? Is there a pattern for providing defaults for runtime parameters?
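In case it helps frame the question, the workaround I've been experimenting
with is a small wrapper that falls back to a hard-coded default when the
provider has no value. The sketch below uses a stripped-down stand-in for
Beam's ValueProvider (just get()/isAccessible()) so it runs standalone; the
real RuntimeValueProvider's semantics may well differ, which is part of what
I'm asking about:

```java
public class DefaultingProviderSketch {
  // Minimal stand-in for org.apache.beam.sdk.options.ValueProvider; the
  // real interface is also serializable so it can travel with the template.
  interface Provider<T> {
    T get();
    boolean isAccessible();
  }

  // Wrap a provider so get() returns the fallback when no value is set.
  static <T> Provider<T> withDefault(Provider<T> inner, T fallback) {
    return new Provider<T>() {
      public T get() {
        return inner.isAccessible() ? inner.get() : fallback;
      }
      public boolean isAccessible() {
        return true; // some value (possibly the fallback) is always available
      }
    };
  }

  public static void main(String[] args) {
    // Simulate a runtime parameter that was never supplied at execution time.
    Provider<String> unset = new Provider<String>() {
      public String get() {
        throw new IllegalStateException("Value only available at runtime");
      }
      public boolean isAccessible() { return false; }
    };
    System.out.println(withDefault(unset, "gs://bucket/default").get());

    // Simulate a parameter that was supplied when the template was executed.
    Provider<String> set = new Provider<String>() {
      public String get() { return "gs://bucket/supplied"; }
      public boolean isAccessible() { return true; }
    };
    System.out.println(withDefault(set, "gs://bucket/default").get());
  }
}
```

The annoyance is that this moves the default out of the options interface
and into pipeline code, which is exactly what @Default was supposed to avoid.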

[0]
https://cloud.google.com/dataflow/docs/guides/templates/creating-templates#using-valueprovider-in-your-pipeline-options
