I also think that at a high level the success of Beam as a
project/community and as a piece of software depends on having
multiple viable runners with a healthy set of users and contributors.
The pieces that are missing, in my view:
*User-focused comparison of runners (and IOs)*
+1 to Jesse's points. Automated capability tests don't really help here.
Benchmarks will be part of the story but are worth very little on
their own. Focusing on these is just choosing to measure things that
are easy to measure instead of addressing what is important, which is
in the end almost always qualitative.
*Automated integration tests on clusters*
We do need to know that runners and IOs "work" in a basic yes/no
manner on every commit/release, beyond unit tests. I am not really
willing to strongly claim to a potential user that something "works"
without this level of automation.
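To make the yes/no bar concrete, here is a rough sketch of the kind of check I mean, built on the Java SDK's PAssert and picking the runner purely from command-line options (the class name, elements, and flags are only illustrative, not an existing test):

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.testing.PAssert;
import org.apache.beam.sdk.transforms.Count;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.values.PCollection;

// Illustrative cluster smoke test: run with e.g. --runner=FlinkRunner plus the
// runner's own cluster options; passing means "works" at this basic level.
public class ClusterSmokeIT {
  public static void main(String[] args) {
    PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
    Pipeline p = Pipeline.create(options);

    PCollection<Long> count =
        p.apply(Create.of("a", "b", "c")).apply(Count.<String>globally());

    // The assertion fails the job if the result is wrong.
    PAssert.thatSingleton(count).isEqualTo(3L);

    p.run().waitUntilFinish();
  }
}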
*More uniform operational experiences*
Setting up your Spark/Flink/Apex deployment will naturally be different.
Launching a Beam pipeline on it should not be.
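As a sketch of what that looks like from the user side, the pipeline code below stays identical and only the launch flags change; the flag names depend on each runner's options class, so treat them as illustrative:

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;

// The pipeline code stays the same; only the launch flags change, e.g.
//   --runner=SparkRunner --sparkMaster=spark://host:7077
//   --runner=FlinkRunner --flinkMaster=host:6123
//   --runner=ApexRunner
public class RunnerAgnosticLaunch {
  public static void main(String[] args) {
    PipelineOptions options =
        PipelineOptionsFactory.fromArgs(args).withValidation().create();
    Pipeline p = Pipeline.create(options);
    p.apply(Create.of("the same pipeline, whatever the runner"));
    p.run().waitUntilFinish();
  }
}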
*Portability: Any SDK on any runner*
We now have one SDK on master and one SDK on a dev branch that both support portable execution to some degree. Unfortunately we have no major
open source runner that supports portability*. "Java on any runner" is
not compelling enough any more, if it ever was.
----
Reviews: I agree our response latency is too slow. I do not agree that
our quality bar is too high; I think we should raise it
*significantly*. Our codebase fails tests for long periods. Our tests
need to be green enough that we are comfortable blocking merges *even
for unrelated failures*. We should be able to cut a release any time,
modulo known blocker-level bugs.
Runner dev: I think Etienne's point about making it more uniform to add features to all runners is actually quite important, since the
portability framework is a lot harder than "translate a Beam ParDo to
XYZ's FlatMap" where they are both Java. And even the support code
we've been building is not obvious to use and probably won't be for
the foreseeable future. This fits well into the "Ben thread" on
technical ideas so I'll comment there.
Kenn
*We do have a local batch-only portable runner in Python
On Fri, Jan 26, 2018 at 10:09 AM, Lukasz Cwik <lc...@google.com> wrote:
Etienne, for the cross-runner coherence, the portability framework is attempting to create an API across all runners for job management and job execution. A lot of work still needs to be done to define and implement these APIs and migrate runners and SDKs to support them, since the current set of Java APIs is ad hoc in usage and purpose. In my opinion, development should really be focused on migrating runners and SDKs to use these APIs to get developer coherence. Work is slowly progressing on integrating them into the Java, Python, and Go SDKs and there are several JIRA issues in this regard, but involvement from more people could help.
Some helpful pointers are:
https://s.apache.org/beam-runner-api
https://s.apache.org/beam-fn-api
https://issues.apache.org/jira/browse/BEAM-3515?jql=project%20%3D%20BEAM%20AND%20labels%20%3D%20portability
On Fri, Jan 26, 2018 at 7:21 AM, Etienne Chauchot <echauc...@apache.org> wrote:
Hi all,
Does anyone have comments about my point about dev coherence
across the runners?
Thanks
Etienne
On 22/01/2018 at 16:16, Etienne Chauchot wrote:
Thanks Davor for bringing this discussion up!
I particularly like that you listed the different areas of improvement and proposed assigning people based on their preferences.
I wanted to add a point about consistency across runners, but from the dev point of view: I've been working on a cross-runner feature lately (metrics push, agnostic of the runner) for which I compared the behavior of the runners and wired the feature into the Flink and Spark runners themselves. I must admit that I had a hard time figuring out how to wire it up in the different runners, and that it was completely different between them. Also, their use (or non-use) of runner-core facilities varies. The same goes for the architecture of the tests: some runners, like Spark, own their ValidatesRunner tests in the runner module, while other runners run ValidatesRunner tests that are owned by the sdk-core module. I also noticed some differences in the way streaming tests are done: for some runners, triggering streaming mode requires using an equivalent of the direct runner's TestStream in the pipeline, while for others setting streaming=true in pipelineOptions is enough.
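For illustration, the two ways of ending up in streaming mode mentioned above could look like this (a sketch against the Java SDK; the class and method names are only illustrative, and which approach a given runner honors is exactly the inconsistency described):

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.coders.StringUtf8Coder;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.options.StreamingOptions;
import org.apache.beam.sdk.testing.TestStream;

public class StreamingModeSketch {

  // Approach 1: drive streaming semantics from the pipeline itself with
  // TestStream, as the direct runner's tests do.
  static Pipeline withTestStream() {
    Pipeline p = Pipeline.create();
    TestStream<String> events =
        TestStream.create(StringUtf8Coder.of())
            .addElements("a", "b")
            .advanceWatermarkToInfinity();
    p.apply(events);
    return p;
  }

  // Approach 2: flip the streaming flag in the options (the equivalent of
  // passing --streaming=true on the command line).
  static Pipeline withStreamingFlag() {
    StreamingOptions options =
        PipelineOptionsFactory.create().as(StreamingOptions.class);
    options.setStreaming(true);
    return Pipeline.create(options);
  }
}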
=> Long story short, IMHO it could be interesting to enhance the runner API to contain more than run(). This would have the benefit of increasing coherence between runners. That said, we would need to find the right balance between too many methods in the runner API (which would reduce the flexibility of runner implementations) and too few (which would reduce coherence between the runners).
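To be clear about what "more than run()" could mean, here is a purely hypothetical sketch; none of these methods beyond run() exist in Beam today, and the names are invented only to illustrate the idea:

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.PipelineResult;
import org.apache.beam.sdk.metrics.MetricResults;

// Hypothetical contract: a single well-known place where every runner would
// wire up cross-cutting concerns (streaming mode, metrics export, ...),
// instead of each runner doing it its own way.
public interface ExtendedRunnerContract {
  // What the current PipelineRunner contract effectively requires.
  PipelineResult run(Pipeline pipeline);

  // Hypothetical: does this execution run in streaming mode?
  boolean isStreamingExecution(Pipeline pipeline);

  // Hypothetical: a uniform handle for exporting/pushing metrics.
  MetricResults metricsFor(PipelineResult result);
}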
=> In addition, to enhance coherence between the runners (from the dev point of view), having all the runners run the exact same ValidatesRunner tests in both batch and streaming modes would be awesome!
Another thing: big +1 to having a programmatic way of defining the capability matrix, as Romain suggested.
Also agree with Ismaël's point about overly flexible concepts across runners (termination, bundling, ...).
Also a big +1 to what Jesse wrote. I was myself in the user/architect position in the past, and I can confirm that all the points he mentioned are accurate.
Best,
Etienne
On 16/01/2018 at 17:39, Ismaël Mejía wrote:
Thanks Davor for opening this discussion and a HUGE +1 to doing this every year or in cycles. I will fork this thread into a new one for the culture / project management issues as suggested.

On the subject of the diversity of users across runners, I think this requires more attention to unification and implies work in at least the following areas:
* Automated validation and consistent semantics among runners

Users should be confident that moving their code from one runner to the other just works, and the only way to ensure this is by having a runner pass ValidatesRunner/TCK tests and with this 'graduate' such support, as Romain suggested. The capability matrix is really nice but it is not a programmatic way to do this. Also, individual features usually work, but feature combinations produce issues, so we need more exact semantics to avoid these.

Some parts of Beam's semantics are loose (e.g. bundle partitioning, pipeline termination, etc.). I suppose this was a design decision to allow flexibility in runner implementations, but it becomes inconvenient when users move among runners and get different results. I am not sure the current tradeoff is worth the usability sacrifice for the end user.
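For reference, the kind of test meant here follows the Java SDK's existing @Category(ValidatesRunner.class) convention; a minimal sketch (the class and test names are illustrative):

import org.apache.beam.sdk.testing.PAssert;
import org.apache.beam.sdk.testing.TestPipeline;
import org.apache.beam.sdk.testing.ValidatesRunner;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.transforms.SimpleFunction;
import org.apache.beam.sdk.values.PCollection;
import org.junit.Rule;
import org.junit.Test;
import org.junit.experimental.categories.Category;
import org.junit.runner.RunWith;
import org.junit.runners.JUnit4;

@RunWith(JUnit4.class)
public class ExampleValidatesRunnerTest {

  @Rule public final transient TestPipeline p = TestPipeline.create();

  // The ValidatesRunner category is what lets each runner's build pick up and
  // execute the same test against that runner.
  @Test
  @Category(ValidatesRunner.class)
  public void testSimpleMap() {
    PCollection<Integer> lengths =
        p.apply(Create.of("a", "bb", "ccc"))
            .apply(MapElements.via(
                new SimpleFunction<String, Integer>() {
                  @Override
                  public Integer apply(String input) {
                    return input.length();
                  }
                }));
    PAssert.that(lengths).containsInAnyOrder(1, 2, 3);
    p.run();
  }
}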
* Make user experience across runners a priority

Today all runners not only behave in different ways, but the way users publish and package their applications also differs. Of course this is not a trivial problem because deployment is normally an end-user problem, but we can improve in this area, e.g. by guaranteeing a consistent deployment mechanism across runners and making IO integration easier; for example, when using multiple IOs and switching runners it is easy to run into conflicts, and we should try to minimize this for end users.
* Simplify operational tasks among runners

We need to add a minimum degree of consistent observability across runners. Of course Beam has metrics to do this, but it is not enough; an end user who starts on one runner and moves to another has to deal with a totally different set of logs and operational issues. We can try to improve this too, of course without trying to cover the full spectrum, but at least bringing some minimum set of observability. I hope the current work on portability will bring some improvements in this area. This is crucial for users, who probably spend more time running (and dealing with issues in) their jobs than writing them.
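As a baseline, the runner-agnostic surface that exists today is the Metrics API; a short sketch of what user code can rely on across runners (the DoFn and metric names are illustrative), with everything beyond that still runner-specific:

import org.apache.beam.sdk.metrics.Counter;
import org.apache.beam.sdk.metrics.Distribution;
import org.apache.beam.sdk.metrics.Metrics;
import org.apache.beam.sdk.transforms.DoFn;

// Illustrative DoFn: these metrics are defined once in user code, but how each
// runner surfaces them operationally (logs, UIs, monitoring systems) differs.
public class ParseFn extends DoFn<String, String> {
  private final Counter malformed = Metrics.counter(ParseFn.class, "malformed-records");
  private final Distribution recordSize = Metrics.distribution(ParseFn.class, "record-size-bytes");

  @ProcessElement
  public void processElement(ProcessContext c) {
    String record = c.element();
    recordSize.update(record.length());
    if (record.isEmpty()) {
      malformed.inc();
      return;
    }
    c.output(record);
  }
}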
We also need integration tests that simulate common user scenarios and some distribution use cases. For example, probably the most common data store used for streaming is Kafka (at least in open source). We should have an IT that tests some common issues that can arise when you use Kafka: what happens if a Kafka broker goes down, does Beam continue to read without issues? What about a new leader election, do we continue to work as expected? Few projects have something like this, and it would send a clear message that Beam cares about reliability too.
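A sketch of the read side such an IT would exercise, using KafkaIO (broker addresses and topic name are illustrative; the interesting part, killing a broker or forcing a leader election and then checking for lost or duplicated records, would live in the harness around this pipeline):

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.kafka.KafkaIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.apache.kafka.common.serialization.StringDeserializer;

public class KafkaResilienceIT {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // The test harness would disrupt the Kafka cluster while this runs and
    // then verify that every produced record shows up exactly once downstream
    // (e.g. by writing `records` somewhere the harness can check).
    PCollection<KV<String, String>> records =
        p.apply(KafkaIO.<String, String>read()
            .withBootstrapServers("broker-1:9092,broker-2:9092")
            .withTopic("resilience-test")
            .withKeyDeserializer(StringDeserializer.class)
            .withValueDeserializer(StringDeserializer.class)
            .withoutMetadata());

    p.run().waitUntilFinish();
  }
}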
Apart from these, I think we also need to work on:

* Simpler APIs + user-friendly libraries

I want to add a big thanks to Jesse for his list of criteria that people look for when they choose a framework for data processing. The first point, 'Will this dramatically improve the problems I'm trying to solve?', is super important. Of course Beam has portability and a rich model as its biggest assets, but I have been consistently asked in conferences whether Beam has libraries for graph processing, CEP, machine learning, or a Scala API.

Of course we have had some progress with the recent addition of SQL, and hopefully schema-aware PCollections will help there too, but there is still some way to go. These libraries cannot be the core focus given Beam's portability goals, but they are sometimes what makes users decide whether to use a tool or not, so better to have them than not.
These are the most important issues from my point of view. My excuses for the long email, but this was the perfect moment to discuss these.

One extra point: I think we should write and agree on a concise roadmap and take a look at our progress on it at the middle and the end of the year, as other communities do.

Regards,
Ismaël
On Mon, Jan 15, 2018 at 7:49 PM, Jesse Anderson <je...@bigdatainstitute.io> wrote:
I think a focus on the runners is what's key to Beam's adoption. The runners are the foundation on which Beam sits. If the runners don't work properly, Beam won't work.

A focus on improved unit tests is a good start, but isn't all that's needed. Compatibility matrices will help you see how your runner of choice stacks up, but that requires too much knowledge of Beam's internals to be interpretable.

Imagine you're an (enterprise) architect looking at adopting Beam. What do you look at, or what do you look for, before going deeper? What would make you stick your neck out to adopt Beam? In my experience, there are several pass/fails along the way.
Here are a few of the common ones I've seen:

Will this dramatically improve the problems I'm trying to solve? (not writing APIs / a better programming model / Beam's better handling of windowing)

Can I get commercial support for Beam? (This is changing soon)

Are other people using Beam with the same configuration and use case as me? (e.g. I'm using Spark with Beam to process imagery. Are others doing this in production?)

Is there good documentation, and are there books on the subject? (Tyler's and others' book will improve this)

Can I get my team trained on this new technology? (I have Beam training and Google has some cursory training)
I think the one the community can improve on the most is the social proof of Beam. I've tried to do this (http://www.jesse-anderson.com/2017/06/beam-2-0-q-and-a/ and http://www.jesse-anderson.com/2016/07/question-and-answers-with-the-apache-beam-team/). We need to get the message out more about people using Beam in production, which configuration they have, and what their results were. I think we have the social proof on Dataflow, but not as much on Spark/Flink/Apex.
I think it's important to note that these checks don't look at the hardcore language or API semantics that we're working on. These are much later stage issues, if they're ever used at all.

In my experience with other open source adoption at enterprises, it starts with architects and works its way around the organization from there.

Thanks,
Jesse
On Mon, Jan 15, 2018 at 8:14 AM Ted Yu <yuzhih...@gmail.com> wrote:
bq. are hard to detect in our unit-test framework

Looks like more integration tests would help discover bugs / regressions more quickly. If the committer reviewing a PR has a concern in this regard, the concern should be stated on the PR so that the contributor (and reviewer) can spend more time solidifying the solution.

bq. I've gone and fixed these issues myself when merging

We can make the checkstyle rules stricter so that code wouldn't pass the build without addressing commonly known issues.

Cheers
On Sun, Jan 14, 2018 at 12:37 PM, Reuven Lax <re...@google.com> wrote:
I agree with the sentiment, but I don't completely agree with the criteria.

I think we need to be much better about reviewing PRs. Some PRs languish for too long before the reviewer gets to them (and I've been guilty of this too), which does not send a good message. Also, new PRs sometimes languish because there is no reviewer assigned; maybe we could write a gitbot to automatically assign a reviewer to every new PR?
Also, I think that the bar for merging a PR from a contributor should not be "the PR is perfect." It's perfectly fine to merge a PR that still has some issues (especially if the issues are stylistic). In the past when I've done this, I've gone and fixed these issues myself when merging. It was a bit more work for me to fix these things myself, but it was a small price to pay in order to portray Beam as a welcoming place for contributions.
On the other hand, "the build does not break" is - in my opinion - too weak of a criterion for merging. A few reasons for this:

* Beam is a data-processing framework, and data integrity is paramount. If a reviewer sees an issue that could lead to data loss (or duplication, or corruption), I don't think that PR should be merged. Historically many such issues only actually manifest at scale, and are hard to detect in our unit-test framework. (We also need to invest in more at-scale tests to catch such issues.)

* Beam guarantees backwards compatibility for users (except across major versions). If a bad API gets merged and released (and the chance of "forgetting" about it before the release is cut is unfortunately high), we are stuck with it. This is less of an issue for many other open-source projects that do not make such a compatibility guarantee, as they are able to simply remove or fix the API in the next version.
I think we still need honest review of PRs, with the criteria being stronger than "the build doesn't break." However, reviewers also need to be reasonable about what they ask for.

Reuven
On Sun, Jan 14, 2018 at 11:19 AM, Ted Yu <yuzhih...@gmail.com> wrote:
bq. if a PR is basically right (it does what it should) without breaking the build, then it has to be merged fast

+1 on the above. This would give contributors positive feedback.
On Sun, Jan 14, 2018 at 8:13 AM, Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
Hi Davor,

Thanks a lot for this e-mail.

I would like to emphasize two areas where we have to improve:

1. Apache way and community. We still have to focus on and be dedicated to our communities (both user & dev). Helping, encouraging, and growing our communities is key for the project. Building bridges between communities is also very important. We have to be more "accessible": sometimes simplifying our discussions and showing more interest and openness toward proposals would help as well. I think we do a good job already: we just have to improve.
2. Execution: a successful project is a project with regular activity in terms of releases, fixes, and improvements.

Regarding PRs, I think today we have PRs that stay open for a long time, and I think for three reasons:
- some are not ready or not good enough; no question about these ones
- some need a reviewer and a speed-up: we have to keep an eye on the open PRs and review them asap
- some are under review but with a lot of "ping pong" and long discussion, not always justified.

I already said this on the mailing list but, as in other Apache projects, if a PR is basically right (it does what it should) without breaking the build, then it has to be merged fast. If it requires additional changes (tests, polishing, improvements, ...), they can be addressed in new PRs.
As already mentioned in the Beam 2.3.0 thread, we have to adopt a regular schedule for releases. It's a best effort to have a release every 2 months, whatever the release contains. That's essential to maintain good activity in the project and for the third-party projects using Beam.

Again, don't get me wrong: we already do a good job! These are just areas where I think we have to improve.

Anyway, thanks for all the hard work we are doing all together!

Regards
JB
On 13/01/2018 05:12, Davor Bonaci wrote:

Hi everyone --

Apache Beam was established as a top-level project a year ago (on December 21, to be exact). This first anniversary is a great opportunity for us to look back at the past year, celebrate its successes, learn from any mistakes we have made, and plan for the next 1+ years.

I’d like to invite everyone in the community, particularly users and observers on this mailing list, to participate in this discussion. Apache Beam is your project and I, for one, would much appreciate your candid thoughts and comments. Just as some other projects do, I’d like to make this “state of the project” discussion an annual tradition in this community.
In terms of successes, the availability of the first stable release, version 2.0.0, was the biggest and most important milestone last year. Additionally, we have expanded the project’s breadth with new components, including several new runners, SDKs, and DSLs, and interconnected a large number of storage/messaging systems with new Beam IOs. In terms of community growth, crossing 200 lifetime individual contributors and achieving 76 contributors to a single release were other highlights. We have doubled the number of committers, and invited a handful of new PMC members. Thanks to each and every one of you for making all of this possible in our first year.
On the other hand, in such a young project as Beam, there are naturally many areas for improvement. This is the principal purpose of this thread (and any of its forks). To organize the separate discussions, I’d suggest forking separate threads for different discussion areas:

* Culture and governance (anything related to people and their behavior)
* Community growth (what can we do to further grow a diverse and vibrant community)
* Technical execution (anything related to releases, their frequency, website, infrastructure)
* Feature roadmap for 2018 (what can we do to make the project more attractive to users, Beam 3.0, etc.).
I know many passionate folks who particularly care about each of these areas, but let me call on some folks from the community to get things started: Ismael for culture, Gris for community, JB for technical execution, and Ben for feature roadmap.

Perhaps we can use this thread to discuss project-wide vision. To seed that discussion, I’d start somewhat provocatively -- we aren’t doing so well on the diversity of users across runners, which is very important to the realization of the project’s vision. Would you agree, and would you be willing to make it the project’s #1 priority for the next 1-2 years?

Thanks -- and please join us in what would hopefully be a productive and informative discussion that shapes the future of this project!

Davor