Re: [DISCUSS] META-FLIP: Sticking (or not) to a strict FLIP voting process

Becket Qin Thu, 27 Jun 2019 19:06:57 -0700

Thanks a lot for bringing this up, Aljoscha.

Big +1 to the following:


1. Stick to a strict FLIP voting process.
In practice, I rarely see a FLIP with a voting thread. In fact, a search of
the mail archive
<http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/template/NamlServlet.jtp?macro=search_page&node=1&query=subject%3AVOTE%2CFLIP&days=0>
turns up only 3 FLIPs with a voting thread, and unfortunately none of them
met the lazy majority requirement of 3 binding +1 votes. Yet we have 11
adopted-but-unreleased FLIPs and 16 released FLIPs.
Even though we claimed *"These proposals are more serious than code changes
and more serious even than release votes."*, we did not really treat them
seriously. Skipping the vote effectively renders the effort put into a FLIP
futile. This leads to a few consequences:
a) The conclusion of the FLIP is never really finalized. People may change
the FLIP at will during the implementation.
b) Some "adopted" FLIPs contain only conceptual ideas rather than the
necessary concrete interfaces, which leaves a lot of problems for the
implementation phase.
c) New contributors are completely confused about how to contribute. The
voting threads seem to have died, and yet someone else's code magically
gets checked in without a passed FLIP. These "good citizens" may feel
excluded and simply leave the chaos.
d) API changes / user-visible behavior changes may be checked in without
being carefully inspected. Fixing them later requires hacky tricks in order
to keep backwards compatibility.

So a huge +1 to sticking to the FLIP voting process.

2. Stick to the definition of major changes. Generally speaking, any
user-visible change should go through a FLIP.
    - Some changes may be small in terms of patch size, but their impact
could be huge. Take metrics as an example: imagine a cloud service provider
that relies on a metric for alerting or for billing its customers. Any
change to such a metric will have a huge impact on them.
    - Sometimes there might be no "interface" change per se, but the
behavior of a method is slightly changed. Even that can be very annoying to
some users. So I think any user-visible change should go through a FLIP.

3. Generally speaking, make each FLIP completable in a reasonable amount of
time. Some large changes may need multiple FLIPs.
   - I agree with Dawid that a long-lasting FLIP can be problematic, as it
could become obsolete before the work is done and the original proposal
might need to be changed multiple times. That said, it might be a little
difficult to set a standard for what counts as a long-lasting FLIP.
   - Sometimes a long-lasting FLIP may be necessary, e.g. for a big new
module / functionality. Those FLIPs are rare and usually more independent.
We may need to treat them case by case.

4. Take the votes from both committers and PMC members as binding.


In addition, I'd like to propose the following:

1. Always discuss the FLIP based on a FLIP wiki page instead of a Google
doc. It is perfectly fine to use a Google doc to explain things, but the
FLIP wiki page is the official source for the proposal. The discussion and
vote need to be based on that.

According to the process of FLIP
<https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals#FlinkImprovementProposals-Process>,
one should create a FLIP wiki page "before" starting a discussion ML
thread. The discussion is supposed to happen on the ML, but based on the
FLIP wiki. This process has some benefits:
    a) Since every FLIP must provide the necessary information, such as
public interface changes / behavior changes / a migration plan, the
authors are forced to think about them.
    b) Even if a FLIP is finally rejected, we have all the history of it.
These are valuable assets of the project and would give a good reference
for later contributors.

However, in practice, what people usually do is use a Google doc for
discussion and only create a FLIP wiki page after the idea is accepted by
the community. There are a few problems with this:
a) The Google docs may be organized in various ways and something important
might be missing. This sometimes hurts review efficiency, as people
might have to ask some basic questions.
b) More importantly, rejected proposals are silently lost without
any history - later contributors will not be able to know what happened
before, and there is no guarantee that the Google docs will always be
accessible.
c) From a process perspective, one may be confused about whether a
discussion thread on the FLIP wiki is still needed if people have agreed on
the Google doc. (At least I always feel a little awkward after the Google
doc has been agreed upon.)

2. The public interface change proposal should be concrete in each FLIP,
instead of conceptual. This avoids surprises in the implementation phase.

3. An adopted FLIP should mostly be "immutable". Any change to an adopted
FLIP requires a new voting process. For minor changes, a Lazy Approval
process can be applied, i.e. announce the change in the voting ML thread and
get at least one binding +1. In case of any -1, a new lazy majority vote is
required.
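
To make the two voting rules concrete, here is a minimal sketch of how they
could be evaluated. It is only an illustration under stated assumptions, not
an existing tool: the names Vote, BINDING_ROLES, passes_lazy_majority and
minor_change_outcome are hypothetical, binding roles follow point 4 above
(committers and PMC members), and "any -1" is read here as any binding -1.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Assumption: committer and PMC votes are both binding (point 4 above).
BINDING_ROLES = {"committer", "pmc"}


@dataclass(frozen=True)
class Vote:
    voter: str
    role: str   # "committer", "pmc", or "contributor" (hypothetical labels)
    value: int  # +1, 0, or -1


def binding_votes(votes: Iterable[Vote]) -> List[Vote]:
    """Keep only the votes cast by voters whose role is binding."""
    return [v for v in votes if v.role in BINDING_ROLES]


def passes_lazy_majority(votes: Iterable[Vote]) -> bool:
    """Lazy majority: at least 3 binding +1 and more binding +1 than -1."""
    binding = binding_votes(votes)
    plus = sum(1 for v in binding if v.value == 1)
    minus = sum(1 for v in binding if v.value == -1)
    return plus >= 3 and plus > minus


def minor_change_outcome(votes: Iterable[Vote]) -> str:
    """Lazy approval for a minor change to an already adopted FLIP.

    Assumption: only a binding -1 escalates to a full lazy majority vote.
    """
    binding = binding_votes(votes)
    if any(v.value == -1 for v in binding):
        return "needs-lazy-majority-vote"
    if any(v.value == 1 for v in binding):
        return "approved"
    return "pending"
```

Under this reading, a thread with two binding +1 votes and any number of
non-binding +1 votes would not pass, which matches the situation described
in point 1 above.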

As someone deeply involved in Kafka and in the design and execution of the
KIP process, I have seen how critical such a process is to the health of
projects that keep going through tons of changes. I believe that the FLIP
process could play a more effective role in organizing major changes and
improving the overall contribution efficiency and code quality / stability
of Flink. To achieve that, we really have to take the FLIP process
seriously, follow it ourselves and mentor the community to do the same.

Thanks,

Jiangjie (Becket) Qin

On Thu, Jun 27, 2019 at 10:28 PM Stephan Ewen <[email protected]> wrote:

> +1 to re-think the FLIP process a bit.
>
> I think more explicit approval is certainly a good idea.
> Who can vote on FLIPs is a question to be answered, though. I think PMCs
> only would be a bit too strict.
>
> On Thu, Jun 27, 2019 at 11:38 AM Hequn Cheng <[email protected]> wrote:
>
> > Hi all,
> >
> > Thanks for raising the nice discussion @Aljoscha.
> >
> > +1 to sticking to the "lazy majority" voting process.
> > It is good to get more people involved in the design discussion and get
> > enough binding votes.
> >
> > As for the scope of the FLIP, previous replies show a lot of good
> thoughts.
> > On the other hand, I think we can also define some scope that which
> should
> > *not* be a FLIP.
> > Sometimes it is easier for us to list a blacklist.
> >
> > Best, Hequn
> >
> > On Thu, Jun 27, 2019 at 5:27 PM Biao Liu <[email protected]> wrote:
> >
> > > Hi community,
> > >
> > > Thanks Aljoscha for bringing us this discussion.
> > >
> > > As Aljoscha said, "lazy majority" is always the voting rule of FLIP. It
> > > > seems that people just ignored or didn't realize this rule.
> > > My concern is that what we can do to make sure developers will obey the
> > > rules.
> > > I think Kurt has given a good suggestion. Since the community is
> growing
> > > bigger and bigger, maybe we need some volunteers to host the progress
> of
> > > FLIP. Like start a discussion/voting in ML or update the sheet of FLIP
> > > document [1].
> > >
> > > 1.
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals
> > >
> > >
> > >
> > > > On Thu, Jun 27, 2019 at 2:56 PM Dawid Wysakowicz <[email protected]> wrote:
> > >
> > > > Hi all,
> > > >
> > > > > I do very much agree with the statement from Aljoscha's initial
> message,
> > > > which is currently also expressed in the description page of a FLIP.
> > > >
> > > > These will stick around for quite a while after they’re implemented
> and
> > > the PMC (and the committers) has the burden of maintaining them. I
> think
> > > > that therefore FLIP votes are even more important than release votes,
> > > because they steer the long time direction of Flink.
> > > >
> > > >
> > > > Therefore I think we should enforce following the lazy majority
> > approach.
> > > > I will probably just repeat what was already said, but I do think
> this
> > > > would make the decisions more visible, easier to reference in case of
> > > > related decisions, and also this would show if the community has
> > capacity
> > > > to implement the FLIP. Nowadays, even if a FLIP is "accepted" it
> might
> > be
> > > > just stale because there are no committers that have the capacity to
> > help
> > > > with the changes.
> > > >
> > > > Another, maybe an orthogonal issue, is that we could maybe use this
> > > > process for agreeing on a scope of a release. I think it might make
> > sense
> > > > to construct a release plan of an accepted FLIPs. This would enforce
> > > better
> > > > scoping of FLIPs, as they would have to fit into a single release. In
> > my
> > > > > opinion FLIPs that span multiple releases (thus even over multiple
> > years)
> > > > are rarely relevant in the future anymore, as the project evolves and
> > it
> > > > usually makes sense to revisit the original proposal anyway. This
> would
> > > > have the benefits that:
> > > >
> > > >    - we have a clear scope for a release rather than just a vague
> list
> > of
> > > >    features that we want to have.
> > > >    - the whole community is on the same page what a certain feature
> > means
> > > >    - the scope does not change drastically during the development
> > period
> > > >
> > > > As for what should and what should not deserve a FLIP, I actually
> quite
> > > > like the definition in the FLIPs page[1]. I think it does make sense
> to
> > > > have a FLIP, and as a result a voting process, for any *public* or
> > major
> > > > change. I agree with Gordon. Even if the change is trivial it might
> > > affect
> > > > external systems/users and it is also a commitment from the
> community.
> > > > Therefore I think they deserve a vote.
> > > >
> > > > Lastly, I think Jark raised a valid point. We should have a clear
> > > > understanding what binding votes in this case mean. I think it makes
> > > sense
> > > > to consider PMC's and committers' votes as binding for FLIPs voting.
> > > > Otherwise we would lose the aspect of committing to help with getting
> > the
> > > > FLIP into the codebase.
> > > >
> > > > To sum up I would opt for enforcing the lazy majority. I would
> suggest
> > to
> > > > consider constructing a release plan with a list of accepted FLIPs.
> > > >
> > > > Best,
> > > >
> > > > Dawid
> > > >
> > > >
> > > > [1]
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals#FlinkImprovementProposals-Whatisconsidereda%22majorchange%22thatneedsaFLIP
> > > > ?
> > > > On 27/06/2019 04:15, Jark Wu wrote:
> > > >
> > > > +1 for sticking to the lazy majority voting.
> > > >
> > > > A question from my side, the 3+1 votes are binding votes which only
> > > active
> > > > (i.e. non-emeritus) committers and PMC members have?
> > > >
> > > >
> > > > Best,
> > > > Jark
> > > >
> > > >
> > > > On Wed, 26 Jun 2019 at 19:07, Tzu-Li (Gordon) Tai <
> [email protected]
> > >
> > > <[email protected]>
> > > > wrote:
> > > >
> > > >
> > > > +1 to enforcing lazy majority voting for future FLIPs, starting from
> > > FLIPs
> > > > that are still currently under discussion (by the time we've agreed
> on
> > > the
> > > > FLIP voting process).
> > > >
> > > > My two cents concerning "what should and shouldn't be a FLIP":
> > > >
> > > > I can understand Chesnay's argument about how some FLIPs, while
> meeting
> > > the
> > > > criteria defined by the FLIP guidelines, feel to not be sufficiently
> > > large
> > > > to justify a FLIP.
> > > > As a matter of fact, the FLIP guidelines explicitly mention that
> > "Exposed
> > > > Monitoring Information" is considered public interface; I guess that
> > was
> > > > why this FLIP came around in the first place.
> > > > I was also hesitant in whether or not the recent FLIP about keyed
> state
> > > > snapshot binary format unification (FLIP-41) deserves to be a FLIP,
> > since
> > > > the complexity of the change is rather small.
> > > >
> > > > However, with the fact that these changes indeed touch the general
> > public
> > > > interface of Flink, the scope (including all potential 3rd party
> > > projects)
> > > > is strictly speaking hard to define.
> > > > Outcomes of such changes, even if the complexity of the change is
> > rather
> > > > trivial, can still stick around for quite a while.
> > > > In this case, IMO the value of proposing a FLIP for such a change is
> > less
> > > > about discussing design or implementation details, and more on the
> fact
> > > > that said change requires an official vote for approval from the
> > > community.
> > > >
> > > > Best,
> > > > Gordon
> > > >
> > > > On Wed, Jun 26, 2019 at 5:50 PM Chesnay Schepler <[email protected]
> >
> > <
> > > [email protected]>
> > > > wrote:
> > > >
> > > >
> > > > The FLIP guidelines disagree with your first point.
> > > >
> > > > The guidelines are a bit contradictory as at some places we say that
> > > > FLIPs are for major features, and in other places say they are for
> any
> > > > changes to the public API.
> > > > This very point came up in the recent FLIP about standardizing
> metrics.
> > > > Metrics are somewhat part of the public API, and thus you can
> interpret
> > > > the guidelines to say that you need a FLIP. But in terms of scope, I
> > > > believed it to not be sufficiently large to justify a FLIP.
> > > >
> > > > Overall I'm very much in favor of sticking to the lazy majority
> voting
> > > > scheme and enforcing it,
> > > > but I do think we have to reevaluate what changes require a FLIP and
> > > > which don't.
> > > >
> > > > On 26/06/2019 11:37, Aljoscha Krettek wrote:
> > > >
> > > > Hi All,
> > > >
> > > > When we originally introduced the FLIP process (which is based on the
> > > >
> > > > KIP process from Kafka and refers to the Kafka bylaws for how votes
> > work)
> > > > voting was set to be “lazy majority”. This means that a FLIP vote
> > > >
> > > > "requires
> > > >
> > > > 3 binding +1 votes and more binding +1 votes than -1 votes” [1][2].
> > > > Currently, we treat FLIP votes more like “lazy Approval”, i.e. if
> there
> > > >
> > > > are
> > > >
> > > > no -1 votes FLIP are often accepted, if there is a VOTE thread at
> all.
> > > >
> > > > I propose that we stick to the original process or update our FLIP
> > > >
> > > > document to a voting scheme that we agree on. I’m in favour of
> sticking
> > > > with “lazy majority”, for these reasons:
> > > >
> > > > 1. FLIPs should typically be used for deeper changes of Flink. These
> > > >
> > > > will stick around for quite a while after they’re implemented and the
> > PMC
> > > > (and the committers) has the burden of maintaining them. I think that
> > > > > therefore FLIP votes are even more important than release votes,
> > because
> > > > they steer the long time direction of Flink.
> > > >
> > > > 2. Requiring at least 3 +1 votes means that there is more work needed
> > > >
> > > > for getting a FLIP accepted. I think this is a good thing because it
> > will
> > > > require people to be more involved in the direction of the project.
> And
> > > >
> > > > if
> > > >
> > > > there are not enough +1 votes on a FLIP, this is a signal that there
> is
> > > >
> > > > not
> > > >
> > > > enough interest in the feature or that there is not enough bandwidth
> > for
> > > > working on a feature.
> > > >
> > > > 3. This is more an “optics” thing, but I think having clear rules and
> > > >
> > > > sticking to them makes it easier for an international community (like
> > the
> > > > Apache Flink community) to work together and collaborate. If there is
> > > > preferential treatment for certain parts of the community that makes
> it
> > > > hard for other parts to participate and get into the community and
> > > > understand the workings of it.
> > > >
> > > > As a side note, I like the FLIP process because they are a place
> where
> > > >
> > > > we can keep track of important decisions and they are a place that we
> > can
> > > > point to when there is uncertainty about a certain feature in the
> > future.
> > > > For example FLIP-28 [3] (which is now discarded) would be a place
> where
> > > >
> > > > we
> > > >
> > > > record the decision that we want Flink to be Scala free in the long
> > term.
> > > > We could then point to this in the future. There are some decisions
> in
> > > > Flink that are somewhat hidden in ML discussions or Jira issues, and
> > > > therefore hard to find, for example the decision to eventually phase
> > out
> > > > the DataSet API, or the decision to drop the older Python APIs, or
> the
> > > > semantics of savepoints and checkpoints. Some FLIPs might not be
> about
> > > > implementing a certain feature but just a general direction that we
> > want
> > > >
> > > > to
> > > >
> > > > take. I think we should have more of these.
> > > >
> > > > What do you think?
> > > >
> > > > Best,
> > > > Aljoscha
> > > >
> > > > [1]
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals
> > > >
> > > > [2]
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/Bylaws#Bylaws-Approvals
> > > >
> > > > [3]
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-28%3A+Long-term+goal+of+making+flink-table+Scala-free
> > > >
> > > >
> > >
> >
>
