There is RunnerOptions already. Its options are populated by querying the
job service. Any portable runner can provide its own list of
runner-specific options through that mechanism.
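
A minimal sketch of that mechanism, under stated assumptions: FakeJobService,
OptionDescriptor, describe_pipeline_options and the two option names are
hypothetical stand-ins for illustration, not the actual Beam job service API.
It only shows the idea of building a parser from whatever options a runner
reports.

```python
import argparse
from dataclasses import dataclass


@dataclass
class OptionDescriptor:
    """Hypothetical description of one option, as a job service might report it."""
    name: str
    description: str
    default: str


class FakeJobService:
    """Stand-in for a runner's job service (assumption, not the real Beam API)."""

    def describe_pipeline_options(self):
        return [
            OptionDescriptor("parallelism", "Degree of parallelism.", "1"),
            OptionDescriptor("checkpoint_interval", "Checkpoint interval in ms.", "0"),
        ]


def build_runner_parser(job_service):
    """Create an argparse parser from the options the job service reports."""
    parser = argparse.ArgumentParser(add_help=False)
    for opt in job_service.describe_pipeline_options():
        parser.add_argument("--" + opt.name, default=opt.default, help=opt.description)
    return parser


parser = build_runner_parser(FakeJobService())
# Flags the runner did not report are left over for other option classes.
known, unknown = parser.parse_known_args(["--parallelism", "4", "--other_flag", "x"])
print(known.parallelism, known.checkpoint_interval, unknown)
```

Using parse_known_args keeps the runner-reported options separate from flags
that belong to other option classes.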

*From: *Reza Rokni <r...@google.com>
*Date: *Mon, May 6, 2019 at 2:57 PM
*To: * <dev@beam.apache.org>

So the options here would be moved to runner options?
>
> https://beam.apache.org/releases/pydoc/2.12.0/_modules/apache_beam/options/pipeline_options.html#WorkerOptions
>
> In Java they are in DataflowPipelineWorkerPoolOptions and of course we
> have FlinkPipelineOptions etc...
>
> *From: *Chamikara Jayalath <chamik...@google.com>
> *Date: *Tue, 7 May 2019 at 05:29
> *To: *dev
>
>
>> On Mon, May 6, 2019 at 2:13 PM Lukasz Cwik <lc...@google.com> wrote:
>>
>>> There were also discussions[1] in the past about scoping PipelineOptions
>>> to specific PTransforms. Would scoping PipelineOptions to PTransforms make
>>> this a more general solution?
>>>
>>> 1:
>>> https://lists.apache.org/thread.html/05f849d39788cb0af840cb9e86ca631586783947eb4e5a1774b647d1@%3Cdev.beam.apache.org%3E
>>>
>>
>> Is this just for pipeline construction time or also for runtime? Trying
>> to scope options for transforms at runtime might complicate things in the
>> presence of optimizations such as fusion.
>>
>>
>>>
>>> On Mon, May 6, 2019 at 12:02 PM Ankur Goenka <goe...@google.com> wrote:
>>>
>>>> Having namespaces for options makes sense.
>>>> I think a help command that prints all the options for a given
>>>> runner name would also be useful.
>>>> As for the scope of namespacing, I think that assigning a logical
>>>> namespace gives more flexibility around how and where we declare options.
>>>> It also makes future refactoring possible.
>>>>
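
A rough sketch of such a help command, assuming a hypothetical registry that
maps runner names to their options. OPTIONS_BY_RUNNER, format_runner_help and
the help strings are illustrative names, not part of Beam.

```python
# Hypothetical registry: runner name -> {option name: help text}.
OPTIONS_BY_RUNNER = {
    "DataflowRunner": {"worker_machine_type": "Machine type for worker VMs."},
    "FlinkRunner": {"parallelism": "Degree of parallelism."},
}


def format_runner_help(runner_name, registry=OPTIONS_BY_RUNNER):
    """Return a help listing of every option registered for one runner."""
    options = registry.get(runner_name)
    if options is None:
        raise ValueError("unknown runner: " + runner_name)
    lines = ["Options for " + runner_name + ":"]
    for name in sorted(options):
        lines.append("  --" + name + "  " + options[name])
    return "\n".join(lines)


help_text = format_runner_help("FlinkRunner")
print(help_text)
```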
>>>>
>>>> On Mon, May 6, 2019 at 7:50 AM Maximilian Michels <m...@apache.org>
>>>> wrote:
>>>>
>>>>> Good points. As already mentioned there is no namespacing between the
>>>>> different pipeline option classes. In particular, there is no separate
>>>>> namespace for system and user options which is most concerning.
>>>>>
>>>>> I'm in favor of an optional namespace using the class name of the
>>>>> defining pipeline option class. That way we would at least be able to
>>>>> resolve duplicate option names. For example, if there was an "optionX"
>>>>> in both class A and class B, we could use "A#optionX" to refer to the
>>>>> one from class A.
>>>>>
>>>>
>> I think this solves the original problem. Runner-specific options will
>> have unique names that include the runner (via the options class). I guess
>> to be complete we also have to include the package (module for Python)?
>> If an option is globally unique, users should be able to specify it
>> without qualifying it (at least for backwards compatibility).
>>
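
The resolution rule discussed here (an optional "Class#option" qualifier,
with unqualified names still accepted when they are globally unique) could be
sketched as follows. resolve_option and AmbiguousOptionError are hypothetical
names; only the "A#optionX" syntax comes from the thread.

```python
class AmbiguousOptionError(KeyError):
    """Raised when an unqualified option name is declared by several classes."""


def resolve_option(name, options_by_class):
    """Return (class_name, option_name) for a possibly qualified option name."""
    if "#" in name:
        # Qualified form: "ClassName#optionName".
        class_name, option = name.split("#", 1)
        if option not in options_by_class.get(class_name, ()):
            raise KeyError(name)
        return class_name, option
    # Unqualified form: accepted only if exactly one class declares it.
    owners = sorted(c for c, opts in options_by_class.items() if name in opts)
    if not owners:
        raise KeyError(name)
    if len(owners) > 1:
        raise AmbiguousOptionError(
            "%r is declared in %s; qualify it, e.g. %s#%s"
            % (name, owners, owners[0], name)
        )
    return owners[0], name


declared = {"A": {"optionX", "optionY"}, "B": {"optionX"}}
print(resolve_option("A#optionX", declared))  # qualified: always unambiguous
print(resolve_option("optionY", declared))    # unique: no qualifier needed
```

Passing the bare name "optionX" with these declarations raises
AmbiguousOptionError, which is the duplicate case the qualifier is meant to
resolve.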
>>
>>>
>>>>> -Max
>>>>>
>>>>> On 04.05.19 02:23, Reza Rokni wrote:
>>>>> > Great point Lukasz, worker machine could be relevant to multiple
>>>>> > runners.
>>>>> >
>>>>> > Perhaps for parameters that could have multiple runner relevance, the
>>>>> > doc could be rephrased to reflect its potential multiple uses. For
>>>>> > example change the help information to start with a generic reference
>>>>> > "worker type on the runner" followed by runner specific behavior
>>>>> > expected for RunnerA, RunnerB etc...
>>>>> >
>>>>> > But I do worry that without prefix even generic options could cause
>>>>> > confusion. For example if the use of --network is substantially
>>>>> > different between runnerA vs runnerB then the user will only have this
>>>>> > information by reading the help. It will also mean that a pipeline
>>>>> > which is expected to work both on-premise on RunnerA and in the cloud
>>>>> > on RunnerB could fail because the format of the options to pass to
>>>>> > --network are different.
>>>>> >
>>>>> > Cheers
>>>>> >
>>>>> > Reza
>>>>> >
>>>>> > *From: *Kenneth Knowles <k...@apache.org <mailto:k...@apache.org>>
>>>>> > *Date: *Sat, 4 May 2019 at 03:54
>>>>> > *To: *dev
>>>>> >
>>>>> >     Even though they are in classes named for specific runners, they
>>>>> >     are not namespaced. All PipelineOptions exist in a global
>>>>> >     namespace, so option names need to be chosen very precisely.
>>>>> >
>>>>> >     It is a good point that even though there may be multiple uses for
>>>>> >     "machine type" they are probably not going to both happen at the
>>>>> >     same time.
>>>>> >
>>>>> >     If it becomes an issue, another thing we could do would be to add
>>>>> >     namespacing support so options have less spooky action, or at
>>>>> >     least have a way to resolve it when it happens by accident.
>>>>> >
>>>>> >     Kenn
>>>>> >
>>>>> >     On Fri, May 3, 2019 at 10:43 AM Chamikara Jayalath
>>>>> >     <chamik...@google.com <mailto:chamik...@google.com>> wrote:
>>>>> >
>>>>> >         Also, we do have runner specific options classes where truly
>>>>> >         runner specific options can go.
>>>>> >
>>>>> >         https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineOptions.java
>>>>> >         https://github.com/apache/beam/blob/master/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkPipelineOptions.java
>>>>> >
>>>>> >         On Fri, May 3, 2019 at 9:50 AM Ahmet Altay <al...@google.com
>>>>> >         <mailto:al...@google.com>> wrote:
>>>>> >
>>>>> >             I agree, that is a good point.
>>>>> >
>>>>> >             *From: *Lukasz Cwik <lc...@google.com
>>>>> >             <mailto:lc...@google.com>>
>>>>> >             *Date: *Fri, May 3, 2019 at 9:37 AM
>>>>> >             *To: *dev
>>>>> >
>>>>> >                 The concept of a machine type isn't necessarily
>>>>> >                 limited to Dataflow. If it made sense for a runner,
>>>>> >                 they could use AWS/Azure machine types as well.
>>>>> >
>>>>> >                 On Fri, May 3, 2019 at 9:32 AM Ahmet Altay
>>>>> >                 <al...@google.com <mailto:al...@google.com>> wrote:
>>>>> >
>>>>> >                     This idea was discussed in a PR a few months ago,
>>>>> >                     and a JIRA issue was filed as a follow-up [1]. IMO,
>>>>> >                     it makes sense to use a namespace prefix. The
>>>>> >                     primary issue here is that such a change will very
>>>>> >                     likely be backward incompatible and would be hard
>>>>> >                     to do before the next major version.
>>>>> >
>>>>> >                     [1] https://issues.apache.org/jira/browse/BEAM-6531
>>>>> >
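
One way to soften the backward-compatibility concern is to accept both
spellings for a transition period. The sketch below uses plain argparse, not
Beam's own option classes; --dataflow_worker_machine_type is the hypothetical
prefixed name proposed in this thread, and the machine type value is only an
illustrative input.

```python
import argparse

parser = argparse.ArgumentParser()
# Register the prefixed name and keep the existing name as an alias,
# so both spellings populate the same destination.
parser.add_argument(
    "--dataflow_worker_machine_type",
    "--worker_machine_type",
    dest="worker_machine_type",
    default=None,
    help="Machine type for workers (prefixed and legacy spellings accepted).",
)

new_style = parser.parse_args(["--dataflow_worker_machine_type", "n1-standard-4"])
old_style = parser.parse_args(["--worker_machine_type", "n1-standard-4"])
print(new_style.worker_machine_type == old_style.worker_machine_type)
```

Existing command lines keep working under this scheme, while the legacy
spelling can be deprecated and removed at the next major version.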
>>>>> >                     *From: *Reza Rokni <r...@google.com
>>>>> >                     <mailto:r...@google.com>>
>>>>> >                     *Date: *Thu, May 2, 2019 at 8:00 PM
>>>>> >                     *To: * <dev@beam.apache.org
>>>>> >                     <mailto:dev@beam.apache.org>>
>>>>> >
>>>>> >                         Hi,
>>>>> >
>>>>> >                         Was reading this SO question:
>>>>> >
>>>>> >                         https://stackoverflow.com/questions/53833171/googlecloudoptions-doesnt-have-all-options-that-pipeline-options-has
>>>>> >
>>>>> >                         And noticed that in
>>>>> >
>>>>> >                         https://beam.apache.org/releases/pydoc/2.12.0/_modules/apache_beam/options/pipeline_options.html#WorkerOptions
>>>>> >
>>>>> >                         The option is called --worker_machine_type.
>>>>> >
>>>>> >                         I wonder if runner specific options should
>>>>> >                         have the runner in the prefix? Something like
>>>>> >                         --dataflow_worker_machine_type?
>>>>> >
>>>>> >                         Cheers
>>>>> >                         Reza
>>>>> >
>>>>> >                         --
>>>>> >
>>>>> >                         This email may be confidential and
>>>>> privileged.
>>>>> >                         If you received this communication by
>>>>> mistake,
>>>>> >                         please don't forward it to anyone else,
>>>>> please
>>>>> >                         erase all copies and attachments, and please
>>>>> let
>>>>> >                         me know that it has gone to the wrong person.
>>>>> >
>>>>> >                         The above terms reflect a potential business
>>>>> >                         arrangement, are provided solely as a basis
>>>>> for
>>>>> >                         further discussion, and are not intended to
>>>>> be
>>>>> >                         and do not constitute a legally binding
>>>>> >                         obligation. No legally binding obligations
>>>>> will
>>>>> >                         be created, implied, or inferred until an
>>>>> >                         agreement in final form is executed in
>>>>> writing
>>>>> >                         by all parties involved.
>>>>> >
>>>>> >
>>>>> >
>>>>>
>>>>
>
