+1 to Henry's comment. Once this makes it to the wiki/website, the wording needs to make it clear that the governance model is unchanged.
On Mon, May 16, 2016 at 10:02 AM, Theodore Vasiloudis <theodoros.vasilou...@gmail.com> wrote:
> I like the idea of having maintainers as well; hopefully we can streamline
> the reviewing process.
>
> I can of course volunteer for the FlinkML component.
> As I've mentioned before, I'd love to get one more committer willing to
> review PRs in FlinkML; by my last count we were up to ~20 open ML-related
> PRs.
>
> Regards,
> Theodore
>
> On Mon, May 16, 2016 at 2:17 AM, Henry Saputra <henry.sapu...@gmail.com> wrote:
>
> > The maintainers concept is a good idea to make sure PRs are moved smoothly.
> >
> > But we need to make sure that this is not an additional hierarchy on top
> > of Flink PMCs.
> > This will keep us in the spirit of the ASF's "community over code".
> >
> > Please do add me as a cluster management maintainer member.
> >
> > - Henry
> >
> > On Tuesday, May 10, 2016, Stephan Ewen <se...@apache.org> wrote:
> >
> > > Hi everyone!
> > >
> > > We propose to establish some lightweight structures in the Flink open
> > > source community and development process, to help us better handle the
> > > increased interest in Flink (mailing list and pull requests), while not
> > > overwhelming the committers, and giving users and contributors a good
> > > experience.
> > >
> > > This proposal is triggered by the observation that we are reaching the
> > > limits of where the current community can support users and guide new
> > > contributors. The proposal below is based on observations and ideas
> > > from Till, Robert, and me.
> > >
> > > ========
> > > Goals
> > > ========
> > >
> > > We try to achieve the following:
> > >
> > > - Pull requests get handled in a timely fashion
> > > - New contributors are better integrated into the community
> > > - The community feels empowered on the mailing list.
> > >   But questions that need the attention of someone that has deep
> > >   knowledge of a certain part of Flink get their attention.
> > > - At the same time, the committers that are knowledgeable about many
> > >   core parts do not get completely overwhelmed.
> > > - We don't overlook threads that report critical issues.
> > > - We always have a pretty good overview of what the status of certain
> > >   parts of the system is.
> > >   -> What are often encountered known issues
> > >   -> What are the most frequently requested features
> > >
> > >
> > > ========
> > > Problems
> > > ========
> > >
> > > Looking into the process, there are two big issues:
> > >
> > > (1) Up to now, we have been relying on the fact that everything just
> > > "organizes itself", driven by best effort. That assumes that everyone
> > > feels equally responsible for every part, question, and contribution.
> > > At the current state, this is impossible to maintain; it overwhelms
> > > the committers and contributors.
> > >
> > > Example: Pull requests are picked up by whoever wants to pick them up.
> > > Pull requests that are a lot of work, have little chance of getting in,
> > > or relate to less active components are sometimes not picked up. When
> > > contributors are pretty loaded already, it may happen that no one
> > > eventually feels responsible to pick up a pull request, and it falls
> > > through the cracks.
> > >
> > > (2) There is no good overview of what are known shortcomings, efforts,
> > > and requested features for different parts of the system. This
> > > information exists in various people's heads, but is not easily
> > > accessible for new people. The Flink JIRA is not well maintained, so it
> > > is not easy to draw insights from it.
> > >
> > >
> > > ===========
> > > The Proposal
> > > ===========
> > >
> > > Since we are building a parallel system, the natural solution seems to
> > > be: partition the workload ;-)
> > >
> > > We propose to define a set of components for Flink. Each component is
> > > maintained or tracked by one or more people - let's call them
> > > maintainers. It is important to note that we don't suggest the
> > > maintainers as an authoritative role, but simply as committers or
> > > contributors that visibly step up for a certain component, and mainly
> > > track and drive the efforts pertaining to that component.
> > >
> > > It is also important to realize that we do not want to suggest that
> > > people get less involved with certain parts and components because
> > > they are not the maintainers. We simply want to make sure that each
> > > pull request or question or contribution has in the end one person (or
> > > a small set of people) responsible for catching and tracking it, if it
> > > was not worked on by the pro-active community.
> > >
> > > For some components, having multiple maintainers will be helpful. In
> > > that case, one maintainer should be the "chair" or "lead" and make sure
> > > that no issue of that component gets lost between the multiple
> > > maintainers.
> > >
> > >
> > > A maintainer's role is:
> > > -----------------------------
> > >
> > > - Have an overview of which of the open pull requests relate to their
> > >   component
> > > - Drive the pull requests relating to the component to resolution
> > >   => Moderate the decision whether the feature should be merged
> > >   => Make sure the pull request gets a shepherd.
> > >      In many cases, the maintainers would shepherd themselves.
> > >   => In case the shepherd becomes inactive, the maintainers need to
> > >      find a new shepherd.
> > >
> > > - Have an overview of what are the known issues of their component
> > > - Have an overview of what are the frequently requested features of
> > >   their component
> > >
> > > - Have an overview of which contributors are doing very good work in
> > >   their component, would be candidates for committers, and should be
> > >   mentored towards that.
> > >
> > > - Resolve email threads that have been brought to their attention,
> > >   because deeper component knowledge is required for that thread.
> > >
> > > A maintainer's role is NOT:
> > > ----------------------------------
> > >
> > > - Review all pull requests of that component
> > > - Answer every mail with questions about that component
> > > - Fix all bugs and implement all features of that component
> > >
> > >
> > > We imagine the following way that the community and the maintainers
> > > interact:
> > > ----------------------------------------------------------------------
> > >
> > > - Pull requests should be tagged by component. Since we cannot add
> > >   labels at this point, we need to rely on the following:
> > >   => The pull request opener should name the pull request like
> > >      "[FLINK-XXX] [component] Title"
> > >   => Components can be (re)tagged by adding special comments in the
> > >      pull request ("==> component client")
> > >   => With some luck, GitHub and Apache Infra will allow us to use
> > >      labels at some point
> > >
> > > - When pull requests are associated with a component, the maintainers
> > >   will manage them (decision whether to add, find shepherd, catch
> > >   dropped pull requests)
> > >
> > > - We assume that maintainers frequently reach out to other community
> > >   members and ask them if they want to shepherd a pull request.
> > >
> > > - On the mailing list, everyone should feel equally empowered to answer
> > >   and discuss. If at some point in the discussion, some deep technical
> > >   knowledge about a component is required, the maintainer(s) should be
> > >   drawn into the discussion.
> > >   Because the mailing list infrastructure has no support for tagging
> > >   threads, here are some simple workarounds:
> > >
> > >   => One possibility is to put the maintainers' mail addresses on cc
> > >      for the thread, so they get the mail not just via the mailing
> > >      list
> > >   => Another way would be to post something like "+maintainer runtime"
> > >      in the thread, and the "runtime" maintainers would have a
> > >      filter/alert on these keywords in their mail program.
> > >
> > > - We assume that maintainers will reach out to community members that
> > >   are very active and helpful in a component, and will ask them if they
> > >   want to be added as maintainers. That will make it visible that those
> > >   people are experts for that part of Flink.
> > >
> > >
> > > ======================================
> > > Maintainers: Committers and Contributors
> > > ======================================
> > >
> > > It helps if maintainers are committers (since we want them to resolve
> > > pull requests, which often involves merging them).
> > >
> > > Components with multiple maintainers can easily have non-committer
> > > contributors in addition to committer contributors.
> > >
> > >
> > > ======
> > > JIRA
> > > ======
> > >
> > > Ideally, JIRA can be used to get an overview of what are the known
> > > issues of each component, and what are common feature requests.
> > > Unfortunately, the Flink JIRA is quite unorganized right now.
> > >
> > > A natural followup effort of this proposal would be to define in JIRA
> > > the same components as we defined here, and have the maintainers keep
> > > JIRA meaningful for that particular component. That would allow us to
> > > easily generate some tables out of JIRA (like top known issues per
> > > component, most requested features) and post them on the dev list once
> > > in a while as a "state of the union" report.
> > >
> > > Initial assignment of issues to components should be made by those
> > > people opening the issue. The maintainer of that tagged component
> > > needs to change the tag if the component was classified incorrectly.
> > >
> > >
> > > ======================================
> > > Initial Components and Maintainers Suggestion
> > > ======================================
> > >
> > > Below is a suggestion of how to define components for Flink. One goal
> > > of the division was to make it obvious for the majority of questions
> > > and contributions to which component they would relate. Otherwise, if
> > > many contributions had fuzzy component associations, we would again
> > > not solve the issue of having clear responsibilities for who would
> > > track the progress and resolution.
> > >
> > > We also looked at each component and wrote the names of some people
> > > who we thought were natural experts for the components, and thus
> > > natural candidates for maintainers.
> > >
> > > **These names are only a starting point for discussion.**
> > >
> > > Once agreed upon, the components and names of maintainers should be
> > > kept in the wiki and updated as components change and people step up
> > > or down.
> > >
> > >
> > > *DataSet API* (*Fabian, Greg, Gabor*)
> > >   - Including Hadoop compat. parts
> > >
> > > *DataStream API* (*Aljoscha, Max, Stephan*)
> > >
> > > *Runtime*
> > >   - Distributed Coordination (JobManager/TaskManager, Akka) (*Till*)
> > >   - Local Runtime (Memory Management, State Backends,
> > >     Tasks/Operators) (*Stephan*)
> > >   - Network (*Ufuk*)
> > >
> > > *Client/Optimizer* (*Fabian*)
> > >
> > > *Type system / Type extractor* (*Timo*)
> > >
> > > *Cluster Management* (Yarn, Mesos, Docker, ...) (*Max, Robert*)
> > >
> > > *Libraries*
> > >   - Gelly (*Vasia, Greg*)
> > >   - ML (*Till, Theo*)
> > >   - CEP (*Till*)
> > >   - Python (*Chesnay*)
> > >
> > > *Table API & SQL* (*Fabian, Vasia, Timo, Chengxiang*)
> > >
> > > *Streaming Connectors* (*Robert, Aljoscha*)
> > >
> > > *Batch Connectors and Input/Output Formats* (*Chesnay*)
> > >
> > > *Storm Compatibility Layer* (*Mathias*)
> > >
> > > *Scala shell* (*Till*)
> > >
> > > *Startup Shell Scripts* (*Ufuk*)
> > >
> > > *Flink Build System, Maven Files* (*Robert*)
> > >
> > > *Documentation* (*Ufuk*)
> > >
> > >
> > > Please let us know what you think about this proposal.
> > > Happy discussing!
> > >
> > > Greetings,
> > > Stephan