Good points. As already mentioned, there is no namespacing between the different pipeline option classes. Most concerning, there is no separate namespace for system and user options.

I'm in favor of an optional namespace using the class name of the defining pipeline option class. That way we would at least be able to resolve duplicate option names. For example, if "optionX" were defined in both class A and class B, we could use "A#optionX" to refer to the one from class A.
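
To make the "A#optionX" idea concrete, here is a minimal sketch of how such a lookup could behave. It is plain Python and does not use any Beam API; OptionRegistry, AmbiguousOptionError, and the "#" separator handling are hypothetical names chosen only for illustration.

```python
class AmbiguousOptionError(KeyError):
    """Raised when an unqualified name is defined by more than one class."""


class OptionRegistry:
    """Hypothetical registry of options keyed by name and defining class."""

    def __init__(self):
        # option name -> {defining class name -> value}
        self._by_name = {}

    def register(self, class_name, option, value):
        self._by_name.setdefault(option, {})[class_name] = value

    def resolve(self, key):
        """Look up 'optionX' or the qualified form 'A#optionX'."""
        class_name, sep, option = key.rpartition("#")
        owners = self._by_name.get(option, {})
        if sep:
            # Qualified lookup: the namespace picks exactly one definition.
            return owners[class_name]
        if not owners:
            raise KeyError(key)
        if len(owners) > 1:
            # Unqualified and defined more than once: ask for a namespace.
            raise AmbiguousOptionError(
                f"{option} is defined by {sorted(owners)}; use Class#{option}"
            )
        return next(iter(owners.values()))


registry = OptionRegistry()
registry.register("A", "optionX", "value-from-A")
registry.register("B", "optionX", "value-from-B")
registry.register("A", "optionY", "only-in-A")
```

With this shape the namespace stays optional: registry.resolve("optionY") works unqualified, and only the duplicated "optionX" has to be written as "A#optionX" or "B#optionX".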

-Max

On 04.05.19 02:23, Reza Rokni wrote:
Great point, Lukasz. The worker machine type could be relevant to multiple runners.

Perhaps for parameters that could be relevant to multiple runners, the doc could be rephrased to reflect the potential multiple uses. For example, change the help information to start with a generic reference such as "worker type on the runner", followed by the runner-specific behavior expected for RunnerA, RunnerB, etc.

But I do worry that without a prefix even generic options could cause confusion. For example, if the use of --network is substantially different between RunnerA and RunnerB, then the user will only learn this by reading the help. It also means that a pipeline which is expected to work both on-premise on RunnerA and in the cloud on RunnerB could fail because the format of the value passed to --network is different.
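
As a rough illustration of the prefix idea, the sketch below uses plain argparse rather than Beam's own option classes. RunnerA, RunnerB, the runner_a_ and runner_b_ prefixes, and the example values are all hypothetical.

```python
import argparse


def add_runner_options(parser, prefix=""):
    """Register a runner's --network flag, optionally with a runner prefix."""
    parser.add_argument(f"--{prefix}network")


def build_unprefixed():
    """Both hypothetical runners register --network in one shared namespace."""
    parser = argparse.ArgumentParser()
    add_runner_options(parser)  # RunnerA
    add_runner_options(parser)  # RunnerB: argparse rejects the duplicate flag
    return parser


def build_prefixed():
    """Each runner gets its own flag, so each value can use its own format."""
    parser = argparse.ArgumentParser()
    add_runner_options(parser, "runner_a_")
    add_runner_options(parser, "runner_b_")
    return parser


args = build_prefixed().parse_args(
    ["--runner_a_network", "default", "--runner_b_network", "10.0.0.0/24"]
)
```

Calling build_unprefixed() raises argparse.ArgumentError because the same flag is added twice, whereas the prefixed parser accepts both values side by side.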

Cheers

Reza

*From: *Kenneth Knowles <k...@apache.org <mailto:k...@apache.org>>
*Date: *Sat, 4 May 2019 at 03:54
*To: *dev

    Even though they are in classes named for specific runners, they are
    not namespaced. All PipelineOptions exist in a global namespace so
    they need to be careful to be very precise.

    It is a good point that even though there may be multiple uses for
    "machine type", they are probably not going to both happen at the
    same time.

    If it becomes an issue, another thing we could do would be to add
    namespacing support so options have less spooky action, or at least
    have a way to resolve it when it happens by accident.

    Kenn

    On Fri, May 3, 2019 at 10:43 AM Chamikara Jayalath
    <chamik...@google.com <mailto:chamik...@google.com>> wrote:

        Also, we do have runner-specific options classes where truly
        runner-specific options can go.

        https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineOptions.java
        https://github.com/apache/beam/blob/master/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkPipelineOptions.java

        On Fri, May 3, 2019 at 9:50 AM Ahmet Altay <al...@google.com
        <mailto:al...@google.com>> wrote:

            I agree, that is a good point.

            *From: *Lukasz Cwik <lc...@google.com <mailto:lc...@google.com>>
            *Date: *Fri, May 3, 2019 at 9:37 AM
            *To: *dev

                The concept of a machine type isn't necessarily limited
                to Dataflow. If it made sense for a runner, they could
                use AWS/Azure machine types as well.

                On Fri, May 3, 2019 at 9:32 AM Ahmet Altay
                <al...@google.com <mailto:al...@google.com>> wrote:

                    This idea was discussed in a PR a few months ago,
                    and a JIRA was filed as a follow-up [1]. IMO, it makes
                    sense to use a namespace prefix. The primary issue
                    here is that such a change will very likely be
                    backward incompatible and would be hard to do
                    before the next major version.

                    [1] https://issues.apache.org/jira/browse/BEAM-6531

                    *From: *Reza Rokni <r...@google.com
                    <mailto:r...@google.com>>
                    *Date: *Thu, May 2, 2019 at 8:00 PM
                    *To: * <dev@beam.apache.org
                    <mailto:dev@beam.apache.org>>

                        Hi,

                        I was reading this SO question:

                        https://stackoverflow.com/questions/53833171/googlecloudoptions-doesnt-have-all-options-that-pipeline-options-has

                        And noticed that in

                        https://beam.apache.org/releases/pydoc/2.12.0/_modules/apache_beam/options/pipeline_options.html#WorkerOptions

                        the option is called --worker_machine_type.

                        I wonder if runner-specific options should have
                        the runner in the prefix? Something like
                        --dataflow_worker_machine_type?

                        Cheers
                        Reza

--
                        This email may be confidential and privileged.
                        If you received this communication by mistake,
                        please don't forward it to anyone else, please
                        erase all copies and attachments, and please let
                        me know that it has gone to the wrong person.

                        The above terms reflect a potential business
                        arrangement, are provided solely as a basis for
                        further discussion, and are not intended to be
                        and do not constitute a legally binding
                        obligation. No legally binding obligations will
                        be created, implied, or inferred until an
                        agreement in final form is executed in writing
                        by all parties involved.


