There were also discussions[1] in the past about scoping PipelineOptions to
specific PTransforms. Would scoping PipelineOptions to PTransforms make
this a more general solution?

1:
https://lists.apache.org/thread.html/05f849d39788cb0af840cb9e86ca631586783947eb4e5a1774b647d1@%3Cdev.beam.apache.org%3E
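
As a rough illustration of the class-name namespacing suggested further
down the thread ("A#optionX"), here is a minimal Python sketch. The names
OptionRegistry, register and get are hypothetical and made up for this
example; they are not part of Beam's PipelineOptions API. The sketch only
assumes that each option knows the name of the class that defines it.

```python
# Hypothetical sketch: none of these names are Beam APIs.
from collections import defaultdict


class OptionRegistry:
    """Stores option values per defining class and resolves lookups."""

    def __init__(self):
        # option name -> {defining class name -> value}
        self._values = defaultdict(dict)

    def register(self, owner, name, value):
        """Record `value` for option `name` as defined by class `owner`."""
        self._values[name][owner] = value

    def get(self, key):
        """Resolve "Owner#name", or a bare "name" if it is unambiguous."""
        owner, sep, name = key.rpartition("#")
        candidates = self._values.get(name, {})
        if sep:
            # Qualified lookup: the namespace picks exactly one definition.
            if owner not in candidates:
                raise KeyError(key)
            return candidates[owner]
        if not candidates:
            raise KeyError(key)
        if len(candidates) > 1:
            # Bare lookup of a duplicated name: refuse to guess.
            owners = ", ".join(sorted(candidates))
            raise ValueError(f"{name!r} is ambiguous; defined by: {owners}")
        return next(iter(candidates.values()))


registry = OptionRegistry()
registry.register("A", "optionX", 1)
registry.register("B", "optionX", 2)
registry.register("A", "optionY", 3)

print(registry.get("A#optionX"))  # qualified lookup of a duplicated name
print(registry.get("optionY"))    # bare lookup still works when unique
```

Keeping the namespace optional means existing bare option names would
continue to resolve as long as they are unique, and only the duplicated
ones would need the qualified form.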

On Mon, May 6, 2019 at 12:02 PM Ankur Goenka <goe...@google.com> wrote:

> Having namespaces for options makes sense.
> I think a help command that prints all the options for a given runner
> name would also be useful.
> As for the scope of namespacing, I think that assigning a logical
> namespace gives more flexibility around how and where we declare options.
> It also makes future refactoring possible.
>
>
> On Mon, May 6, 2019 at 7:50 AM Maximilian Michels <m...@apache.org> wrote:
>
>> Good points. As already mentioned there is no namespacing between the
>> different pipeline option classes. In particular, there is no separate
>> namespace for system and user options which is most concerning.
>>
>> I'm in favor of an optional namespace using the class name of the
>> defining pipeline option class. That way we would at least be able to
>> resolve duplicate option names. For example, if there was an "optionX"
>> in both class A and class B, we could use "A#optionX" to refer to the
>> one from class A.
>>
>> -Max
>>
>> On 04.05.19 02:23, Reza Rokni wrote:
>> > Great point Lukasz, worker machine type could be relevant to multiple
>> > runners.
>> >
>> > Perhaps for parameters that could be relevant to multiple runners, the
>> > doc could be rephrased to reflect the potential multiple uses. For
>> > example, change the help information to start with a generic reference,
>> > "worker type on the runner", followed by the runner-specific behavior
>> > expected for RunnerA, RunnerB, etc.
>> >
>> > But I do worry that without a prefix even generic options could cause
>> > confusion. For example, if the use of --network is substantially
>> > different between RunnerA and RunnerB, then the user will only learn
>> > this by reading the help. It will also mean that a pipeline which
>> > is expected to work both on-premise on RunnerA and in the cloud on
>> > RunnerB could fail because the format of the value passed to
>> > --network is different.
>> >
>> > Cheers
>> >
>> > Reza
>> >
>> > *From: *Kenneth Knowles <k...@apache.org <mailto:k...@apache.org>>
>> > *Date: *Sat, 4 May 2019 at 03:54
>> > *To: *dev
>> >
>> >     Even though they are in classes named for specific runners, they are
>> >     not namespaced. All PipelineOptions exist in a global namespace, so
>> >     their names need to be chosen carefully and be very precise.
>> >
>> >     It is a good point that even though there may be multiple uses for
>> >     "machine type" they are probably not going to both happen at the
>> >     same time.
>> >
>> >     If it becomes an issue, another thing we could do would be to add
>> >     namespacing support so options have less spooky action, or at least
>> >     have a way to resolve it when it happens on accident.
>> >
>> >     Kenn
>> >
>> >     On Fri, May 3, 2019 at 10:43 AM Chamikara Jayalath
>> >     <chamik...@google.com <mailto:chamik...@google.com>> wrote:
>> >
>> >         Also, we do have runner specific options classes where truly
>> >         runner specific options can go.
>> >
>> >         https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineOptions.java
>> >
>> >         https://github.com/apache/beam/blob/master/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkPipelineOptions.java
>> >
>> >         On Fri, May 3, 2019 at 9:50 AM Ahmet Altay <al...@google.com
>> >         <mailto:al...@google.com>> wrote:
>> >
>> >             I agree, that is a good point.
>> >
>> >             *From: *Lukasz Cwik <lc...@google.com <mailto:lc...@google.com>>
>> >             *Date: *Fri, May 3, 2019 at 9:37 AM
>> >             *To: *dev
>> >
>> >                 The concept of a machine type isn't necessarily limited
>> >                 to Dataflow. If it made sense for a runner, they could
>> >                 use AWS/Azure machine types as well.
>> >
>> >                 On Fri, May 3, 2019 at 9:32 AM Ahmet Altay
>> >                 <al...@google.com <mailto:al...@google.com>> wrote:
>> >
>> >                     This idea was discussed in a PR a few months ago,
>> >                     and a JIRA issue was filed as a follow-up [1]. IMO,
>> >                     it makes sense to use a namespace prefix. The
>> >                     primary issue here is that such a change will very
>> >                     likely be backward incompatible and would be hard
>> >                     to do before the next major version.
>> >
>> >                     [1] https://issues.apache.org/jira/browse/BEAM-6531
>> >
>> >                     *From: *Reza Rokni <r...@google.com
>> >                     <mailto:r...@google.com>>
>> >                     *Date: *Thu, May 2, 2019 at 8:00 PM
>> >                     *To: * <dev@beam.apache.org
>> >                     <mailto:dev@beam.apache.org>>
>> >
>> >                         Hi,
>> >
>> >                         I was reading this SO question:
>> >
>> >                         https://stackoverflow.com/questions/53833171/googlecloudoptions-doesnt-have-all-options-that-pipeline-options-has
>> >
>> >                         and noticed that in
>> >
>> >                         https://beam.apache.org/releases/pydoc/2.12.0/_modules/apache_beam/options/pipeline_options.html#WorkerOptions
>> >
>> >                         the option is called --worker_machine_type.
>> >
>> >                         I wonder if runner specific options should have
>> >                         the runner in the prefix? Something like
>> >                         --dataflow_worker_machine_type?
>> >
>> >                         Cheers
>> >                         Reza
>> >
>> >                         --
>> >
>> >                         This email may be confidential and privileged.
>> >                         If you received this communication by mistake,
>> >                         please don't forward it to anyone else, please
>> >                         erase all copies and attachments, and please let
>> >                         me know that it has gone to the wrong person.
>> >
>> >                         The above terms reflect a potential business
>> >                         arrangement, are provided solely as a basis for
>> >                         further discussion, and are not intended to be
>> >                         and do not constitute a legally binding
>> >                         obligation. No legally binding obligations will
>> >                         be created, implied, or inferred until an
>> >                         agreement in final form is executed in writing
>> >                         by all parties involved.
>> >
>> >
>>
>
