+1

On 22.05.19 04:28, Reza Rokni wrote:
Hi,

Coming back to this, is the general consensus that this can be addressed via https://issues.apache.org/jira/browse/BEAM-6531 in Beam 3.0?

Cheers
Reza

On Tue, 7 May 2019 at 23:15, Valentyn Tymofieiev <valen...@google.com> wrote:

    I think using RunnerOptions was an idea at some point, but in
    Python, we ended up parsing options from the runner api without
    populating RunnerOptions, and RunnerOptions was eventually removed [1].

    If we decide to rename options, a path forward may be to have
    runners recognize both old and new names until Beam 3.0, but update
    codebase, examples and documentation to use new names.

    [1]
    
https://github.com/apache/beam/commit/f3623e8ba2257f7659ccb312dc2574f862ef41b5#diff-525d5d65bedd7ea5e6fce6e4cd57e153L815
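
The transition path described above (runners accept both the old and the new option name until Beam 3.0) could look roughly like the sketch below. It uses only Python's standard argparse module and is not Beam's actual option-parsing code. The new flag name --dataflow_worker_machine_type is the spelling suggested earlier in this thread; the DeprecatedAlias class and the build_parser function are hypothetical names introduced for illustration.

```python
import argparse
import warnings


class DeprecatedAlias(argparse.Action):
    """Stores the value and warns when the old flag spelling is used.

    Hypothetical helper for illustration; not part of Beam.
    """

    OLD_NAME = "--worker_machine_type"
    NEW_NAME = "--dataflow_worker_machine_type"  # assumed new name

    def __call__(self, parser, namespace, values, option_string=None):
        if option_string == self.OLD_NAME:
            warnings.warn(
                f"{self.OLD_NAME} is deprecated; use {self.NEW_NAME}",
                DeprecationWarning,
                stacklevel=2,
            )
        setattr(namespace, self.dest, values)


def build_parser():
    parser = argparse.ArgumentParser()
    # Both spellings write to the same destination, so downstream code
    # does not need to know which one the user typed.
    parser.add_argument(
        DeprecatedAlias.NEW_NAME,
        DeprecatedAlias.OLD_NAME,
        dest="worker_machine_type",
        action=DeprecatedAlias,
        default=None,
    )
    return parser
```

Existing pipelines that pass --worker_machine_type keep working and only see a deprecation warning, while examples and documentation can switch to the new name right away.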

    *From: *Ahmet Altay <al...@google.com>
    *Date: *Mon, May 6, 2019, 6:01 PM
    *To: *dev

        There is RunnerOptions already. Its options are populated by
        querying the job service. Any portable runner is able to provide
        a list of options that is runner specific through that mechanism.

        *From: *Reza Rokni <r...@google.com>
        *Date: *Mon, May 6, 2019 at 2:57 PM
        *To: * <dev@beam.apache.org>

            So the options here would be moved to runner options?
            
https://beam.apache.org/releases/pydoc/2.12.0/_modules/apache_beam/options/pipeline_options.html#WorkerOptions

            In Java they are in DataflowPipelineWorkerPoolOptions and of
            course we have FlinkPipelineOptions etc...

            *From: *Chamikara Jayalath <chamik...@google.com>
            *Date: *Tue, 7 May 2019 at 05:29
            *To: *dev


                On Mon, May 6, 2019 at 2:13 PM Lukasz Cwik
                <lc...@google.com> wrote:

                    There were also discussions[1] in the past about
                    scoping PipelineOptions to specific PTransforms.
                    Would scoping PipelineOptions to PTransforms make
                    this a more general solution?

                    1:
                    
https://lists.apache.org/thread.html/05f849d39788cb0af840cb9e86ca631586783947eb4e5a1774b647d1@%3Cdev.beam.apache.org%3E


                Is this just for pipeline construction time or also for
                runtime? Trying to scope options for transforms at
                runtime might complicate things in the presence of
                optimizations such as fusion.


                    On Mon, May 6, 2019 at 12:02 PM Ankur Goenka
                    <goe...@google.com> wrote:

                        Having namespaces for options makes sense.
                        I think a help command that prints all the
                        options for a given runner name would also be useful.
                        As for the scope of namespacing, I think that
                        assigning a logical namespace gives more
                        flexibility around how and where we declare
                        options. It also makes future refactoring possible.


                        On Mon, May 6, 2019 at 7:50 AM Maximilian
                        Michels <m...@apache.org> wrote:

                            Good points. As already mentioned there is
                            no namespacing between the
                            different pipeline option classes. In
                            particular, there is no separate
                            namespace for system and user options which
                            is most concerning.

                            I'm in favor of an optional namespace using
                            the class name of the
                            defining pipeline option class. That way we
                            would at least be able to
                            resolve duplicate option names. For example,
                            if there was an "optionX"
                            in class A and B, we could use "A#optionX"
                            to refer to it from class A.


                I think this solves the original problem. Runner
                specific options will have unique names that include
                the runner (in the options class). I guess to be complete we
                also have to include the package (module for Python)?
                If an option is globally unique, users should be able to
                specify it without qualifying (at least for backwards
                compatibility).
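
The lookup rule discussed here (an optional "Class#option" qualifier, with unqualified names still accepted when they are globally unique) can be sketched as follows. This is a minimal illustration under stated assumptions, not Beam's implementation: resolve_option, AmbiguousOptionError and the shape of the declared mapping are all hypothetical. The key before "#" could just as well be a package- or module-qualified class name, as suggested above.

```python
class AmbiguousOptionError(ValueError):
    """Raised when an unqualified option name is declared by several classes."""


def resolve_option(name, declared):
    """Resolve an option name to (owner_class, option).

    `declared` maps an options class name to the set of option names it
    defines (a hypothetical registry used only for this sketch).
    """
    if "#" in name:
        # Qualified form, e.g. "A#optionX": look only in the named class.
        owner, option = name.split("#", 1)
        if option not in declared.get(owner, ()):
            raise KeyError(name)
        return owner, option

    # Unqualified form: allowed only if exactly one class declares it.
    owners = sorted(cls for cls, opts in declared.items() if name in opts)
    if not owners:
        raise KeyError(name)
    if len(owners) > 1:
        raise AmbiguousOptionError(
            f"{name!r} is declared by {owners}; qualify it as 'Class#{name}'"
        )
    return owners[0], name


# Example registry: "optionX" is declared twice, "optionY" only once.
DECLARED = {"A": {"optionX", "optionY"}, "B": {"optionX"}}
```

With this registry, resolve_option("A#optionX", DECLARED) selects the option from class A, resolve_option("optionY", DECLARED) works without a qualifier, and resolve_option("optionX", DECLARED) raises AmbiguousOptionError because both A and B declare it.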


                            -Max

                            On 04.05.19 02:23, Reza Rokni wrote:
                             > Great point Lukasz, worker machine could
                            be relevant to multiple runners.
                             >
                             > Perhaps for parameters that could have
                            multiple runner relevance, the
                             > doc could be rephrased to reflect its
                            potential multiple uses. For
                             > example change the help information to
                            start with a generic reference "
                             > worker type on the runner" followed by
                            runner specific behavior expected
                             > for RunnerA, RunnerB etc...
                             >
                             > But I do worry that without prefix even
                            generic options could cause
                             > confusion. For example if the use of
                            --network is substantially
                             > different between runnerA vs runnerB then
                            the user will only have this
                             > information by reading the help. It will
                            also mean that a pipeline which
                             > is expected to work both on-premise on
                            RunnerA and in the cloud on
                             > RunnerB could fail because the format of
                            the options to pass to
                             > --network are different.
                             >
                             > Cheers
                             >
                             > Reza
                             >
                             > *From: *Kenneth Knowles <k...@apache.org>
                             > *Date: *Sat, 4 May 2019 at 03:54
                             > *To: *dev
                             >
                             >     Even though they are in classes named
                            for specific runners, they are
                             >     not namespaced. All PipelineOptions
                            exist in a global namespace so
                             >     they need to be careful to be very
                            precise.
                             >
                             >     It is a good point that even though
                            there may be multiple uses for
                             >     "machine type" they are probably not
                            going to both happen at the
                             >     same time.
                             >
                             >     If it becomes an issue, another thing
                            we could do would be to add
                             >     namespacing support so options have
                            less spooky action, or at least
                             >     have a way to resolve it when it
                            happens by accident.
                             >
                             >     Kenn
                             >
                             >     On Fri, May 3, 2019 at 10:43 AM Chamikara Jayalath
                             >     <chamik...@google.com> wrote:
                             >
                             >         Also, we do have runner specific
                            options classes where truly
                             >         runner specific options can go.
                             >
                             >
                            
https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineOptions.java
                             >
                            
https://github.com/apache/beam/blob/master/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkPipelineOptions.java
                             >
                             >         On Fri, May 3, 2019 at 9:50 AM Ahmet Altay
                             >         <al...@google.com> wrote:
                             >
                             >             I agree, that is a good point.
                             >
                             >             *From: *Lukasz Cwik <lc...@google.com>
                             >             *Date: *Fri, May 3, 2019 at
                            9:37 AM
                             >             *To: *dev
                             >
                             >                 The concept of a machine
                            type isn't necessarily limited
                             >                 to Dataflow. If it made
                            sense for a runner, they could
                             >                 use AWS/Azure machine
                            types as well.
                             >
                             >                 On Fri, May 3, 2019 at 9:32 AM Ahmet Altay
                             >                 <al...@google.com> wrote:
                             >
                             >                     This idea was
                            discussed in a PR a few months ago,
                             >                     and JIRA was filed as
                            a follow up [1]. IMO, it makes
                             >                     sense to use a
                            namespace prefix. The primary issue
                             >                     here is that such a
                            change will very likely be a
                             >                     backward incompatible
                            change and would be hard to do
                             >                     before the next major
                            version.
                             >
                             >                     [1]
                            https://issues.apache.org/jira/browse/BEAM-6531
                             >
                             >                     *From: *Reza Rokni <r...@google.com>
                             >                     *Date: *Thu, May 2,
                            2019 at 8:00 PM
                             >                     *To: * <dev@beam.apache.org>
                             >
                             >                         Hi,
                             >
                             >                         Was reading this
                            SO question:
                             >
                             >
                            
https://stackoverflow.com/questions/53833171/googlecloudoptions-doesnt-have-all-options-that-pipeline-options-has
                             >
                             >                         And noticed that in
                             >
                             >
                            
https://beam.apache.org/releases/pydoc/2.12.0/_modules/apache_beam/options/pipeline_options.html#WorkerOptions
                             >
                             >                         The option is
                            called --worker_machine_type.
                             >
                             >                         I wonder if
                            runner specific options should have
                             >                         the runner in the
                            prefix? Something like
                             >                         --dataflow_worker_machine_type?
                             >
                             >                         Cheers
                             >                         Reza
                             >
                             >                         --
                             >
                             >                         This email may be
                            confidential and privileged.
                             >                         If you received
                            this communication by mistake,
                             >                         please don't
                            forward it to anyone else, please
                             >                         erase all copies
                            and attachments, and please let
                             >                         me know that it
                            has gone to the wrong person.
                             >
                             >                         The above terms
                            reflect a potential business
                             >                         arrangement, are
                            provided solely as a basis for
                             >                         further
                            discussion, and are not intended to be
                             >                         and do not
                            constitute a legally binding
                             >                         obligation. No
                            legally binding obligations will
                             >                         be created,
                            implied, or inferred until an
                             >                         agreement in
                            final form is executed in writing
                             >                         by all parties
                            involved.
                             >
                             >
                             >






