+1 from my side. Happy to be the maintainer for Storm-Compatibility (at least I guess it's me, even though the correct spelling would be with two 't' :P)
-Matthias

On 05/12/2016 12:56 PM, Till Rohrmann wrote:
> +1 for the proposal
> On May 12, 2016 12:13 PM, "Stephan Ewen" <se...@apache.org> wrote:
>
>> Yes, Gabor Gevay, that did refer to you!
>>
>> Sorry for the ambiguity...
>>
>> On Thu, May 12, 2016 at 10:46 AM, Márton Balassi <balassi.mar...@gmail.com>
>> wrote:
>>
>>> +1 for the proposal
>>> @ggevay: I do think that it refers to you. :)
>>>
>>> On Thu, May 12, 2016 at 10:40 AM, Gábor Gévay <gga...@gmail.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> There are at least three Gábors in the Flink community, :) so
>>>> assuming that the Gábor in the list of maintainers of the DataSet API
>>>> is referring to me, I'll be happy to do it. :)
>>>>
>>>> Best,
>>>> Gábor G.
>>>>
>>>>
>>>>
>>>> 2016-05-10 11:24 GMT+02:00 Stephan Ewen <se...@apache.org>:
>>>>> Hi everyone!
>>>>>
>>>>> We propose to establish some lightweight structures in the Flink open
>>>>> source community and development process, to help us better handle
>>>>> the increased interest in Flink (mailing list and pull requests),
>>>>> while not overwhelming the committers, and giving users and
>>>>> contributors a good experience.
>>>>>
>>>>> This proposal is triggered by the observation that we are reaching
>>>>> the limits of where the current community can support users and guide
>>>>> new contributors. The below proposal is based on observations and
>>>>> ideas from Till, Robert, and me.
>>>>>
>>>>> ========
>>>>> Goals
>>>>> ========
>>>>>
>>>>> We try to achieve the following
>>>>>
>>>>> - Pull requests get handled in a timely fashion
>>>>> - New contributors are better integrated into the community
>>>>> - The community feels empowered on the mailing list.
>>>>>   But questions that need the attention of someone that has deep
>>>>>   knowledge of a certain part of Flink get their attention.
>>>>> - At the same time, the committers that are knowledgeable about many
>>>>>   core parts do not get completely overwhelmed.
>>>>> - We don't overlook threads that report critical issues.
>>>>> - We always have a pretty good overview of what the status of certain
>>>>>   parts of the system are.
>>>>>     -> What are often encountered known issues
>>>>>     -> What are the most frequently requested features
>>>>>
>>>>>
>>>>> ========
>>>>> Problems
>>>>> ========
>>>>>
>>>>> Looking into the process, there are two big issues:
>>>>>
>>>>> (1) Up to now, we have been relying on the fact that everything just
>>>>> "organizes itself", driven by best effort. That assumes that everyone
>>>>> feels equally responsible for every part, question, and contribution.
>>>>> At the current state, this is impossible to maintain; it overwhelms
>>>>> the committers and contributors.
>>>>>
>>>>> Example: Pull requests are picked up by whoever wants to pick them
>>>>> up. Pull requests that are a lot of work, have little chance of
>>>>> getting in, or relate to less active components are sometimes not
>>>>> picked up. When contributors are pretty loaded already, it may happen
>>>>> that no one eventually feels responsible to pick up a pull request,
>>>>> and it falls through the cracks.
>>>>>
>>>>> (2) There is no good overview of what are known shortcomings,
>>>>> efforts, and requested features for different parts of the system.
>>>>> This information exists in various people's heads, but is not easily
>>>>> accessible for new people. The Flink JIRA is not well maintained; it
>>>>> is not easy to draw insights from that.
>>>>>
>>>>>
>>>>> ===========
>>>>> The Proposal
>>>>> ===========
>>>>>
>>>>> Since we are building a parallel system, the natural solution seems
>>>>> to be: partition the workload ;-)
>>>>>
>>>>> We propose to define a set of components for Flink. Each component is
>>>>> maintained or tracked by one or more people - let's call them
>>>>> maintainers.
>>>>> It is important to note that we don't suggest the maintainers as an
>>>>> authoritative role, but simply as committers or contributors that
>>>>> visibly step up for a certain component, and mainly track and drive
>>>>> the efforts pertaining to that component.
>>>>>
>>>>> It is also important to realize that we do not want to suggest that
>>>>> people get less involved with certain parts and components, because
>>>>> they are not the maintainers. We simply want to make sure that each
>>>>> pull request or question or contribution has in the end one person
>>>>> (or a small set of people) responsible for catching and tracking it,
>>>>> if it was not worked on by the pro-active community.
>>>>>
>>>>> For some components, having multiple maintainers will be helpful. In
>>>>> that case, one maintainer should be the "chair" or "lead" and make
>>>>> sure that no issue of that component gets lost between the multiple
>>>>> maintainers.
>>>>>
>>>>>
>>>>> A maintainer's role is:
>>>>> -----------------------------
>>>>>
>>>>> - Have an overview of which of the open pull requests relate to their
>>>>>   component
>>>>> - Drive the pull requests relating to the component to resolution
>>>>>     => Moderate the decision whether the feature should be merged
>>>>>     => Make sure the pull request gets a shepherd.
>>>>>        In many cases, the maintainers would shepherd themselves.
>>>>>     => In case the shepherd becomes inactive, the maintainers need to
>>>>>        find a new shepherd.
>>>>>
>>>>> - Have an overview of what are the known issues of their component
>>>>> - Have an overview of what are the frequently requested features of
>>>>>   their component
>>>>>
>>>>> - Have an overview of which contributors are doing very good work in
>>>>>   their component, would be candidates for committers, and should be
>>>>>   mentored towards that.
>>>>>
>>>>> - Resolve email threads that have been brought to their attention,
>>>>>   because deeper component knowledge is required for that thread.
>>>>>
>>>>> A maintainer's role is NOT:
>>>>> ----------------------------------
>>>>>
>>>>> - Review all pull requests of that component
>>>>> - Answer every mail with questions about that component
>>>>> - Fix all bugs and implement all features of that component
>>>>>
>>>>>
>>>>> We imagine the following way that the community and the maintainers
>>>>> interact:
>>>>> ---------------------------------------------------------------------
>>>>>
>>>>> - Pull requests should be tagged by component. Since we cannot add
>>>>>   labels at this point, we need to rely on the following:
>>>>>     => The pull request opener should name the pull request like
>>>>>        "[FLINK-XXX] [component] Title"
>>>>>     => Components can be (re)tagged by adding special comments in the
>>>>>        pull request ("==> component client")
>>>>>     => With some luck, GitHub and Apache Infra will allow us to use
>>>>>        labels at some point
>>>>>
>>>>> - When pull requests are associated with a component, the maintainers
>>>>>   will manage them (decision whether to add, find shepherd, catch
>>>>>   dropped pull requests)
>>>>>
>>>>> - We assume that maintainers frequently reach out to other community
>>>>>   members and ask them if they want to shepherd a pull request.
>>>>>
>>>>> - On the mailing list, everyone should feel equally empowered to
>>>>>   answer and discuss.
>>>>>   If at some point in the discussion, some deep technical knowledge
>>>>>   about a component is required, the maintainer(s) should be drawn
>>>>>   into the discussion.
>>>>>   Because the Mailing List infrastructure has no support to tag
>>>>>   threads, here are some simple workarounds:
>>>>>
>>>>>     => One possibility is to put the maintainers' mail addresses on
>>>>>        cc for the thread, so they get the mail not just via the
>>>>>        mailing list
>>>>>     => Another way would be to post something like "+maintainer
>>>>>        runtime" in the thread and the "runtime" maintainers would
>>>>>        have a filter/alert on these keywords in their mail program.
>>>>>
>>>>> - We assume that maintainers will reach out to community members that
>>>>>   are very active and helpful in a component, and will ask them if
>>>>>   they want to be added as maintainers.
>>>>>   That will make it visible that those people are experts for that
>>>>>   part of Flink.
>>>>>
>>>>>
>>>>> ======================================
>>>>> Maintainers: Committers and Contributors
>>>>> ======================================
>>>>>
>>>>> It helps if maintainers are committers (since we want them to resolve
>>>>> pull requests which often involves merging them).
>>>>>
>>>>> Components with multiple maintainers can easily have non-committer
>>>>> contributors in addition to committer contributors.
>>>>>
>>>>>
>>>>> ======
>>>>> JIRA
>>>>> ======
>>>>>
>>>>> Ideally, JIRA can be used to get an overview of what are the known
>>>>> issues of each component, and what are common feature requests.
>>>>> Unfortunately, the Flink JIRA is quite unorganized right now.
>>>>>
>>>>> A natural followup effort of this proposal would be to define in JIRA
>>>>> the same components as we defined here, and have the maintainers keep
>>>>> JIRA meaningful for that particular component. That would allow us to
>>>>> easily generate some tables out of JIRA (like top known issues per
>>>>> component, most requested features) and post them on the dev list
>>>>> once in a while as a "state of the union" report.
>>>>>
>>>>> Initial assignment of issues to components should be made by those
>>>>> people opening the issue. The maintainer of that tagged component
>>>>> needs to change the tag, if the component was classified incorrectly.
>>>>>
>>>>>
>>>>> ======================================
>>>>> Initial Components and Maintainers Suggestion
>>>>> ======================================
>>>>>
>>>>> Below is a suggestion of how to define components for Flink. One goal
>>>>> of the division was to make it obvious for the majority of questions
>>>>> and contributions to which component they would relate. Otherwise, if
>>>>> many contributions had fuzzy component associations, we would again
>>>>> not solve the issue of having clear responsibilities for who would
>>>>> track the progress and resolution.
>>>>>
>>>>> We also looked at each component and wrote the names of some people
>>>>> who we thought were natural experts for the components, and thus
>>>>> natural candidates for maintainers.
>>>>>
>>>>> **These names are only a starting point for discussion.**
>>>>>
>>>>> Once agreed upon, the components and names of maintainers should be
>>>>> kept in the wiki and updated as components change and people step up
>>>>> or down.
>>>>>
>>>>>
>>>>> *DataSet API* (*Fabian, Greg, Gabor*)
>>>>>   - Including Hadoop compat. parts
>>>>>
>>>>> *DataStream API* (*Aljoscha, Max, Stephan*)
>>>>>
>>>>> *Runtime*
>>>>>   - Distributed Coordination (JobManager/TaskManager, Akka) (*Till*)
>>>>>   - Local Runtime (Memory Management, State Backends,
>>>>>     Tasks/Operators) (*Stephan*)
>>>>>   - Network (*Ufuk*)
>>>>>
>>>>> *Client/Optimizer* (*Fabian*)
>>>>>
>>>>> *Type system / Type extractor* (Timo)
>>>>>
>>>>> *Cluster Management* (Yarn, Mesos, Docker, ...) (*Max, Robert*)
>>>>>
>>>>> *Libraries*
>>>>>   - Gelly (*Vasia, Greg*)
>>>>>   - ML (*Till, Theo*)
>>>>>   - CEP (*Till*)
>>>>>   - Python (*Chesnay*)
>>>>>
>>>>> *Table API & SQL* (*Fabian, Vasia, Timo, Chengxiang*)
>>>>>
>>>>> *Streaming Connectors* (*Robert*, *Aljoscha*)
>>>>>
>>>>> *Batch Connectors and Input/Output Formats* (*Chesnay*)
>>>>>
>>>>> *Storm Compatibility Layer* (*Mathias*)
>>>>>
>>>>> *Scala shell* (*Till*)
>>>>>
>>>>> *Startup Shell Scripts* (Ufuk)
>>>>>
>>>>> *Flink Build System, Maven Files* (*Robert*)
>>>>>
>>>>> *Documentation* (Ufuk)
>>>>>
>>>>>
>>>>> Please let us know what you think about this proposal.
>>>>> Happy discussing!
>>>>>
>>>>> Greetings,
>>>>> Stephan