Agree that pipeline options lack some mechanism for scoping. It is also not
always possible distinguish options meant to be consumed at pipeline
construction time, by the runner, by the SDK harness, by the user code or
any combination -- and this causes confusion every now and then.

For Dataflow, we have been using "experiments" for arbitrary
runner-specific options. It's simply a string list pipeline option that all
SDKs support and, for Go at least, is sent to portable runners. Flink can
do the same in the short term to move forward.

Henning


On Fri, Oct 12, 2018 at 8:50 AM Thomas Weise <t...@apache.org> wrote:

> [moving to the list]
>
> The requirement driving this part of the change was to allow a user to
> specify pipeline options that a runner supports without having to declare
> those in each language SDK.
>
> In the specific scenario, we have options that the Flink runner supports
> (and can validate), that are not enumerated in the Python SDK.
>
> I think we have a bigger problem scoping pipeline options. For example,
> the runner options are dumped into the SDK worker. There is also a
> possibility of name collisions. So I think this would benefit from broader
> feedback.
>
> Thanks,
> Thomas
>
>
> ---------- Forwarded message ---------
> From: Charles Chen <notificati...@github.com>
> Date: Fri, Oct 12, 2018 at 8:36 AM
> Subject: Re: [apache/beam] [BEAM-5442] Store duplicate unknown options in
> a list argument (#6600)
> To: apache/beam <b...@noreply.github.com>
> Cc: Thomas Weise <thomas.we...@gmail.com>, Mention <
> ment...@noreply.github.com>
>
>
> CC: @tweise <https://github.com/tweise>
>
> —
> You are receiving this because you were mentioned.
> Reply to this email directly, view it on GitHub
> <https://github.com/apache/beam/pull/6600#issuecomment-429367754>, or mute
> the thread
> <https://github.com/notifications/unsubscribe-auth/AAQGDwwt15R85eq9pySUisyxq2HYz-Vyks5ukLcLgaJpZM4XMo-T>
> .
>

Reply via email to