Hi,

Coming back to this, is the general consensus that this can be addressed
via https://issues.apache.org/jira/browse/BEAM-6531 in Beam 3.0?

Cheers
Reza

On Tue, 7 May 2019 at 23:15, Valentyn Tymofieiev <valen...@google.com>
wrote:

> I think using RunnerOptions was an idea at some point, but in Python, we
> ended up parsing options from the runner API without populating
> RunnerOptions, and RunnerOptions was eventually removed [1].
>
> If we decide to rename options, a path forward may be to have runners
> recognize both old and new names until Beam 3.0, but update the codebase,
> examples, and documentation to use the new names.
>
> [1]
> https://github.com/apache/beam/commit/f3623e8ba2257f7659ccb312dc2574f862ef41b5#diff-525d5d65bedd7ea5e6fce6e4cd57e153L815
>
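A minimal sketch of the "recognize both old and new names" idea above, using Python's standard argparse rather than Beam's own option classes. The new spelling --dataflow_worker_machine_type is the hypothetical rename floated at the start of this thread, and the helper names and the sample machine type are assumptions made only for illustration.

```python
import argparse


def build_parser():
    """Build a parser that accepts both the old and the new option name."""
    parser = argparse.ArgumentParser()
    # Both spellings write to the same destination, so existing command
    # lines keep working while docs and examples move to the new name.
    # --dataflow_worker_machine_type is a hypothetical new name.
    parser.add_argument(
        '--worker_machine_type',
        '--dataflow_worker_machine_type',
        dest='worker_machine_type',
        default=None,
        help='Machine type for workers (old and new spelling accepted).')
    return parser


def parse_machine_type(argv):
    """Return the machine type from argv, ignoring unrelated flags."""
    known, _unknown = build_parser().parse_known_args(argv)
    return known.worker_machine_type
```

With this shape, dropping the old spelling in a later major version would be a one-line change in the parser, and callers that already use the new name would be unaffected.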
> *From:*Ahmet Altay <al...@google.com>
> *Date:*Mon, May 6, 2019, 6:01 PM
> *To:*dev
>
> There is RunnerOptions already. Its options are populated by querying the
>> job service. Any portable runner is able to provide a list of options that
>> is runner specific through that mechanism.
>>
>> *From: *Reza Rokni <r...@google.com>
>> *Date: *Mon, May 6, 2019 at 2:57 PM
>> *To: * <dev@beam.apache.org>
>>
>> So the options here would be moved to runner options?
>>>
>>> https://beam.apache.org/releases/pydoc/2.12.0/_modules/apache_beam/options/pipeline_options.html#WorkerOptions
>>>
>>> In Java they are in DataflowPipelineWorkerPoolOptions and of course we
>>> have FlinkPipelineOptions etc...
>>>
>>> *From: *Chamikara Jayalath <chamik...@google.com>
>>> *Date: *Tue, 7 May 2019 at 05:29
>>> *To: *dev
>>>
>>>
>>>> On Mon, May 6, 2019 at 2:13 PM Lukasz Cwik <lc...@google.com> wrote:
>>>>
>>>>> There were also discussions[1] in the past about scoping
>>>>> PipelineOptions to specific PTransforms. Would scoping PipelineOptions to
>>>>> PTransforms make this a more general solution?
>>>>>
>>>>> 1:
>>>>> https://lists.apache.org/thread.html/05f849d39788cb0af840cb9e86ca631586783947eb4e5a1774b647d1@%3Cdev.beam.apache.org%3E
>>>>>
>>>>
>>>> Is this just for pipeline construction time or also for runtime?
>>>> Trying to scope options for transforms at runtime might complicate things
>>>> in the presence of optimizations such as fusion.
>>>>
>>>>
>>>>>
>>>>> On Mon, May 6, 2019 at 12:02 PM Ankur Goenka <goe...@google.com>
>>>>> wrote:
>>>>>
>>>>>> Having namespaces for options makes sense.
>>>>>> I think a help command to print all the options for a given runner
>>>>>> name would also be useful.
>>>>>> As for the scope of namespacing, I think that assigning a logical
>>>>>> namespace gives more flexibility around how and where we declare
>>>>>> options.
>>>>>> It also makes future refactoring possible.
>>>>>>
>>>>>>
>>>>>> On Mon, May 6, 2019 at 7:50 AM Maximilian Michels <m...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> Good points. As already mentioned, there is no namespacing between
>>>>>>> the different pipeline option classes. In particular, there is no
>>>>>>> separate namespace for system and user options, which is most
>>>>>>> concerning.
>>>>>>>
>>>>>>> I'm in favor of an optional namespace using the class name of the
>>>>>>> defining pipeline option class. That way we would at least be able
>>>>>>> to resolve duplicate option names. For example, if there was an
>>>>>>> "optionX" in both class A and class B, we could use "A#optionX" to
>>>>>>> refer to the one from class A.
>>>>>>>
>>>>>>
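A rough sketch of how the "A#optionX" qualification could be resolved, written in plain Python and not taken from Beam. The registry shape (options class name mapped to the option names it declares), the function name, and the rule that an unqualified name is accepted only when exactly one class declares it are all assumptions for illustration.

```python
def resolve_option(name, registry):
    """Resolve 'optionX' or 'A#optionX' to a (class_name, option) pair.

    registry maps an options class name to the set of option names it
    declares. Both the registry shape and the error handling are
    illustrative assumptions.
    """
    if '#' in name:
        class_name, option = name.split('#', 1)
        if option not in registry.get(class_name, set()):
            raise KeyError('unknown option: %s' % name)
        return class_name, option
    # Unqualified: accept only if exactly one class declares the option.
    owners = sorted(c for c, opts in registry.items() if name in opts)
    if not owners:
        raise KeyError('unknown option: %s' % name)
    if len(owners) > 1:
        raise ValueError(
            'ambiguous option %s; qualify it as one of: %s'
            % (name, ', '.join('%s#%s' % (c, name) for c in owners)))
    return owners[0], name


# Hypothetical registry mirroring the example: both A and B declare optionX.
REGISTRY = {
    'A': {'optionX', 'optionY'},
    'B': {'optionX'},
}
```

Here resolve_option('A#optionX', REGISTRY) picks the option from class A, while a bare 'optionX' is rejected as ambiguous and a bare 'optionY' still resolves without qualification.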
>>>> I think this solves the original problem. Runner specific options will
>>>> have unique names that include the runner (in the options class). I guess
>>>> to be complete we also have to include the package (module for Python)?
>>>> If an option is globally unique, users should be able to specify it
>>>> without qualifying (at least for backwards compatibility).
>>>>
>>>>
>>>>>
>>>>>>> -Max
>>>>>>>
>>>>>>> On 04.05.19 02:23, Reza Rokni wrote:
>>>>>>> > Great point Lukasz, worker machine could be relevant to multiple
>>>>>>> > runners.
>>>>>>> >
>>>>>>> > Perhaps for parameters that could have multiple runner relevance,
>>>>>>> > the doc could be rephrased to reflect its potential multiple uses.
>>>>>>> > For example, change the help information to start with a generic
>>>>>>> > reference, "worker type on the runner", followed by the
>>>>>>> > runner-specific behavior expected for RunnerA, RunnerB, etc.
>>>>>>> >
>>>>>>> > But I do worry that without a prefix even generic options could
>>>>>>> > cause confusion. For example, if the use of --network is
>>>>>>> > substantially different between RunnerA and RunnerB, then the user
>>>>>>> > will only have this information by reading the help. It will also
>>>>>>> > mean that a pipeline which is expected to work both on-premise on
>>>>>>> > RunnerA and in the cloud on RunnerB could fail because the format
>>>>>>> > of the options to pass to --network is different.
>>>>>>> >
>>>>>>> > Cheers
>>>>>>> >
>>>>>>> > Reza
>>>>>>> >
>>>>>>> > *From: *Kenneth Knowles <k...@apache.org>
>>>>>>> > *Date: *Sat, 4 May 2019 at 03:54
>>>>>>> > *To: *dev
>>>>>>> >
>>>>>>> >     Even though they are in classes named for specific runners,
>>>>>>> >     they are not namespaced. All PipelineOptions exist in a global
>>>>>>> >     namespace, so option names need to be very precise.
>>>>>>> >
>>>>>>> >     It is a good point that even though there may be multiple uses
>>>>>>> >     for "machine type", they are probably not going to both happen
>>>>>>> >     at the same time.
>>>>>>> >
>>>>>>> >     If it becomes an issue, another thing we could do would be to
>>>>>>> >     add namespacing support so options have less spooky action, or
>>>>>>> >     at least have a way to resolve it when it happens by accident.
>>>>>>> >
>>>>>>> >     Kenn
>>>>>>> >
>>>>>>> >     On Fri, May 3, 2019 at 10:43 AM Chamikara Jayalath
>>>>>>> >     <chamik...@google.com> wrote:
>>>>>>> >
>>>>>>> >         Also, we do have runner specific options classes where
>>>>>>> >         truly runner specific options can go.
>>>>>>> >
>>>>>>> >
>>>>>>> https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineOptions.java
>>>>>>> >
>>>>>>> https://github.com/apache/beam/blob/master/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkPipelineOptions.java
>>>>>>> >
>>>>>>> >         On Fri, May 3, 2019 at 9:50 AM Ahmet Altay
>>>>>>> >         <al...@google.com> wrote:
>>>>>>> >
>>>>>>> >             I agree, that is a good point.
>>>>>>> >
>>>>>>> >             *From: *Lukasz Cwik <lc...@google.com>
>>>>>>> >             *Date: *Fri, May 3, 2019 at 9:37 AM
>>>>>>> >             *To: *dev
>>>>>>> >
>>>>>>> >                 The concept of a machine type isn't necessarily
>>>>>>> >                 limited to Dataflow. If it made sense for a runner,
>>>>>>> >                 they could use AWS/Azure machine types as well.
>>>>>>> >
>>>>>>> >                 On Fri, May 3, 2019 at 9:32 AM Ahmet Altay
>>>>>>> >                 <al...@google.com> wrote:
>>>>>>> >
>>>>>>> >                     This idea was discussed in a PR a few months
>>>>>>> >                     ago, and a JIRA was filed as a follow-up [1].
>>>>>>> >                     IMO, it makes sense to use a namespace prefix.
>>>>>>> >                     The primary issue here is that such a change
>>>>>>> >                     will very likely be backward incompatible and
>>>>>>> >                     would be hard to do before the next major
>>>>>>> >                     version.
>>>>>>> >
>>>>>>> >                     [1] https://issues.apache.org/jira/browse/BEAM-6531
>>>>>>> >
>>>>>>> >                     *From: *Reza Rokni <r...@google.com>
>>>>>>> >                     *Date: *Thu, May 2, 2019 at 8:00 PM
>>>>>>> >                     *To: * <dev@beam.apache.org>
>>>>>>> >
>>>>>>> >                         Hi,
>>>>>>> >
>>>>>>> >                         Was reading this SO question:
>>>>>>> >
>>>>>>> >                         https://stackoverflow.com/questions/53833171/googlecloudoptions-doesnt-have-all-options-that-pipeline-options-has
>>>>>>> >
>>>>>>> >                         And noticed that in
>>>>>>> >
>>>>>>> >                         https://beam.apache.org/releases/pydoc/2.12.0/_modules/apache_beam/options/pipeline_options.html#WorkerOptions
>>>>>>> >
>>>>>>> >                         The option is called --worker_machine_type.
>>>>>>> >
>>>>>>> >                         I wonder if runner specific options should
>>>>>>> >                         have the runner in the prefix? Something like
>>>>>>> >                         --dataflow_worker_machine_type?
>>>>>>> >
>>>>>>> >                         Cheers
>>>>>>> >                         Reza
>>>>>>> >
>>>>>>> >                         --
>>>>>>> >
>>>>>>> >                         This email may be confidential and
>>>>>>> >                         privileged. If you received this
>>>>>>> >                         communication by mistake, please don't
>>>>>>> >                         forward it to anyone else, please erase
>>>>>>> >                         all copies and attachments, and please let
>>>>>>> >                         me know that it has gone to the wrong
>>>>>>> >                         person.
>>>>>>> >
>>>>>>> >                         The above terms reflect a potential
>>>>>>> >                         business arrangement, are provided solely
>>>>>>> >                         as a basis for further discussion, and are
>>>>>>> >                         not intended to be and do not constitute a
>>>>>>> >                         legally binding obligation. No legally
>>>>>>> >                         binding obligations will be created,
>>>>>>> >                         implied, or inferred until an agreement in
>>>>>>> >                         final form is executed in writing by all
>>>>>>> >                         parties involved.
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>>
>>>>>>
>>>
>>>
>>

