Hi,

Thanks for your explanation. I've added "Benchmarks" and renamed "Runtime / Operators".

On Mon, May 20, 2019 at 10:59 AM Piotr Nowojski <pi...@ververica.com> wrote:

> Hi,
>
> > Concrete operator implementations will then go into the "API / DataStream"?
> > (or "API / DataSet" or Table)
> > Afaik, there were some ideas to share operator implementations between
> > DataStream and Table
>
> Yes & yes. I think for now we could keep the concrete operators
> implementations under API / DataStream and we can split them out once we
> have true use case for that. Unless this is confusing for someone, in that
> case we could split it now to API / DataStream Operators.
>
> >> 2. I think we should add additional component for benchmarks and
> >> benchmarking infrastructure. While this is more complicated topic (because
> >> of the setup and how is it running), it should be on the same level as
> >> correctness tests.
> >>
> >
> > I'm not sure if it is a good idea to add a "Benchmarks" component into the
> > Flink JIRA. Afaik, the benchmarks are managed from here?
> > https://github.com/dataArtisans/flink/tree/benchmark-request
>
> Not all of them, some of them are in apache/flink. And it might be a
> subject to change in the future. Ideally we should have benchmarking code
> in the same repository, if not for some licensing issues. Also if we ever
> implement full cluster benchmarks (not using JMH), they could also reside
> in the Flink repository.
>
> Regardless of that, does it matter where the benchmarks are? In my opinion
> the only thing that matters is that benchmarks are just another form of
> tests/verification: we have unit tests, integration tests, end to end
> tests and also various level benchmarks. Why should those things be treated
> differently?
>
> > Doesn't it make sense to track issues with GH issues there?
> > Or asking more broadly, what types of issues would you see in that
> > component?
>
> Same kind of issues as for any other type of tests.
> For example:
> - release blocker Jira issue that benchmarks are broken and are not
>   testing anything (from time to time we have to fix something in the
>   benchmarking setup, and it has also happened a couple of times that
>   benchmarks have discovered some release blocker regressions in Flink)
> - Jira issue to fix some benchmark
> - Jira issue to implement a missing benchmark
> - …
>
> Piotrek
>
> > On 17 May 2019, at 14:41, Robert Metzger <rmetz...@apache.org> wrote:
> >
> > Hi,
> >
> >> 1. Renaming “Runtime / Operators” to “Runtime / Task” or something like
> >> “Runtime / Processing”. “Runtime / Operators” was confusing me, since it
> >> sounded like it covers concrete implementations of the operators, like
> >> “WindowOperator” or various join implementations.
> >>
> >
> > I'm fine with this renaming.
> > Concrete operator implementations will then go into the "API / DataStream"?
> > (or "API / DataSet" or Table)
> > Afaik, there were some ideas to share operator implementations between
> > DataStream and Table. If that's the case, we would have to find a good
> > component for that as well.
> >
> >>
> >> 2. I think we should add additional component for benchmarks and
> >> benchmarking infrastructure. While this is more complicated topic (because
> >> of the setup and how is it running), it should be on the same level as
> >> correctness tests.
> >>
> >
> > I'm not sure if it is a good idea to add a "Benchmarks" component into the
> > Flink JIRA. Afaik, the benchmarks are managed from here?
> > https://github.com/dataArtisans/flink/tree/benchmark-request
> > Doesn't it make sense to track issues with GH issues there?
> > Or asking more broadly, what types of issues would you see in that
> > component?
> >
> >>
> >> Piotrek
> >>
> >>> On 20 Feb 2019, at 10:53, Robert Metzger <rmetz...@apache.org> wrote:
> >>>
> >>> Thanks a lot Timo!
> >>>
> >>> I will start a vote Chesnay!
> >>>
> >>> On Wed, Feb 20, 2019 at 10:11 AM Timo Walther <twal...@apache.org> wrote:
> >>>
> >>>> +1 for the vote. Btw I can help cleaning up the "Table API & SQL"
> >>>> component. It seems to be the biggest with 1229 Issues.
> >>>>
> >>>> Thanks,
> >>>> Timo
> >>>>
> >>>> On 20.02.19 at 10:09, Chesnay Schepler wrote:
> >>>>> I would prefer if you'd start a vote with a new cleaned up proposal.
> >>>>>
> >>>>> On 18.02.2019 15:23, Robert Metzger wrote:
> >>>>>> I added "Runtime / Configuration" to the proposal:
> >>>>>> https://cwiki.apache.org/confluence/display/FLINK/Proposal+for+new+JIRA+Components
> >>>>>>
> >>>>>> Since this discussion has been open for 10 days, I assume we have
> >>>>>> reached consensus here. I will soon start renaming components.
> >>>>>>
> >>>>>> On Wed, Feb 13, 2019 at 10:51 AM Chesnay Schepler <ches...@apache.org> wrote:
> >>>>>>
> >>>>>>> The only parent I can think of is "Infrastructure", but I don't quite
> >>>>>>> like it :/
> >>>>>>>
> >>>>>>> +1 for "Runtime / Configuration"; this is too general to be placed in
> >>>>>>> coordination imo.
> >>>>>>>
> >>>>>>> On 12.02.2019 18:25, Robert Metzger wrote:
> >>>>>>>> Thanks a lot for your feedback Chesnay!
> >>>>>>>>
> >>>>>>>> re build/travis/release: Do you have a good idea for a common
> >>>>>>>> parent for "Build System", "Travis" and "Release System"?
> >>>>>>>>
> >>>>>>>> re legacy: Okay, I see your point. I will keep the Legacy Components
> >>>>>>>> prefix.
> >>>>>>>>
> >>>>>>>> re library: I think I don't have an argument here. My proposal is
> >>>>>>>> based on what I felt as being right :) I added the "Library / "
> >>>>>>>> prefix to the proposal.
> >>>>>>>>
> >>>>>>>> re core/config: From the proposed components, I see the best match
> >>>>>>>> with "Runtime / Coordination", but I agree that this example is
> >>>>>>>> difficult to place into my proposed scheme.
> >>>>>>>> Do you think we should introduce "Runtime / Configuration" as a
> >>>>>>>> component?
> >>>>>>>>
> >>>>>>>> I updated the proposal accordingly!
> >>>>>>>>
> >>>>>>>> On Tue, Feb 12, 2019 at 12:19 PM Chesnay Schepler <ches...@apache.org> wrote:
> >>>>>>>>
> >>>>>>>>> re build/travis/release: No, I'm against merging build system,
> >>>>>>>>> travis and release system.
> >>>>>>>>>
> >>>>>>>>> re legacy: So going forward you're proposing to move dropped
> >>>>>>>>> features into the legacy bucket and make it impossible to search
> >>>>>>>>> for specific issues for that component? There's 0 overhead to
> >>>>>>>>> having these components, so I really don't get the benefit here,
> >>>>>>>>> but see the overhead.
> >>>>>>>>> I don't buy the argument of "people will not open issues if the
> >>>>>>>>> component doesn't exist", they will just leave the component field
> >>>>>>>>> blank or add a random one (that would be wrong). In fact, if you
> >>>>>>>>> had a storm/tez component (that users would adhere to) then it
> >>>>>>>>> would be _easier_ to figure out whether an issue can be rejected
> >>>>>>>>> right away.
> >>>>>>>>>
> >>>>>>>>> re library: If you are against a library category, what's your
> >>>>>>>>> argument for a connector category?
> >>>>>>>>>
> >>>>>>>>> re tests: I don't mind "tests" being removed from tickets about
> >>>>>>>>> test instabilities, but you specified the migration as "rename E2E
> >>>>>>>>> tests" which is not equivalent.
> >>>>>>>>> Under what category would you file modifications to
> >>>>>>>>> flink-test-utils-junit?
> >>>>>>>>> I would propose to not differentiate between e2e and other tests; I
> >>>>>>>>> would go along with "Test infrastructure", and remove the major
> >>>>>>>>> "Tests" category.
> >>>>>>>>>
> >>>>>>>>> re core/config: As an example, where (under Runtime) would you
> >>>>>>>>> place the introduction of the ConfigOption class?
> >>>>>>>>>
> >>>>>>>>> On 11.02.2019 11:31, Robert Metzger wrote:
> >>>>>>>>>> Thanks a lot for your feedback!
> >>>>>>>>>>
> >>>>>>>>>> @Timo:
> >>>>>>>>>> I've followed your suggestions and updated the proposed names in
> >>>>>>>>>> the wiki.
> >>>>>>>>>> Regarding a new "SQL/Connectors" component: I (with admittedly
> >>>>>>>>>> not much knowledge) would not add this component at the moment,
> >>>>>>>>>> and put the SQL stuff into the respective connector component.
> >>>>>>>>>> It is probably pretty difficult for a user to decide whether a
> >>>>>>>>>> bug belongs to "SQL/Connector" or "Connectors/Kafka" when Kafka
> >>>>>>>>>> in SQL does not work.
> >>>>>>>>>>
> >>>>>>>>>> @Chesnay:
> >>>>>>>>>> - You are suggesting to rename "Build System" to "Maven" and still
> >>>>>>>>>> merge it with "Travis", "Release System" etc. as in the proposal?
> >>>>>>>>>>
> >>>>>>>>>> - "Runtime / Control Plane" vs "Runtime / Coordination" -- I
> >>>>>>>>>> changed the proposal
> >>>>>>>>>>
> >>>>>>>>>> - Re. "Documentation": Yes, I think that would be better in the
> >>>>>>>>>> long run.
> >>>>>>>>>> We are already in a situation where there are groups within the
> >>>>>>>>>> community focusing on certain areas of the code (such as SQL, the
> >>>>>>>>>> runtime, connectors). Those groups will monitor their components,
> >>>>>>>>>> but it will be a lot of overhead for them to monitor the
> >>>>>>>>>> "Documentation" component.
> >>>>>>>>>> We can also try to assign documentation components to both
> >>>>>>>>>> "Documentation" and the affected component, such as
> >>>>>>>>>> "Runtime / Metrics".
> >>>>>>>>>>
> >>>>>>>>>> - Removed "Misc / " prefix.
> >>>>>>>>>>
> >>>>>>>>>> - "Legacy Components": Legacy components usually have very few
> >>>>>>>>>> tickets. "Flink on Tez" has 13, "Storm Compat" ~30, and JIRA has
> >>>>>>>>>> a bulk edit feature :)
> >>>>>>>>>> The benefit of having it generalized is that people will probably
> >>>>>>>>>> not add tickets to it.
> >>>>>>>>>>
> >>>>>>>>>> - "Libraries /" prefix: I don't think that it is necessary. Some
> >>>>>>>>>> libraries might grow in the future (like the Table API), then we
> >>>>>>>>>> need to rename.
> >>>>>>>>>> The "flink-libraries" module does contain stuff like the sql
> >>>>>>>>>> client or the python api, which are already covered by other
> >>>>>>>>>> components in my proposal -- so going with the maven module
> >>>>>>>>>> structure is not an argument here.
> >>>>>>>>>>
> >>>>>>>>>> - "End to end infrastructure" and "Tests": The same argument as
> >>>>>>>>>> with the "Documentation" applies here. The maintainers of Kafka,
> >>>>>>>>>> Metrics, .. should get visibility into "their" test instabilities
> >>>>>>>>>> through "their" components.
> >>>>>>>>>> Not many people will feel responsible for the "Tests" component.
> >>>>>>>>>>
> >>>>>>>>>> For "Core" and "Configuration", I will move the tickets to the
> >>>>>>>>>> appropriate components in "Runtime /".
> >>>>>>>>>>
> >>>>>>>>>> For "API / Scala": Good point. I will add that component.
> >>>>>>>>>>
> >>>>>>>>>> How to do it? I will just go through the pain and do it.
> >>>>>>>>>>
> >>>>>>>>>> Best,
> >>>>>>>>>> Robert
> >>>>>>>>>>
> >>>>>>>>>> On Fri, Feb 8, 2019 at 2:40 PM Chesnay Schepler <ches...@apache.org> wrote:
> >>>>>>>>>>> Some concerns:
> >>>>>>>>>>>
> >>>>>>>>>>> Travis and build system / release system are entirely different.
> >>>>>>>>>>> I would even keep the release system away from the build-system,
> >>>>>>>>>>> as it is more about the release scripts and documentation, while
> >>>>>>>>>>> the latter is about maven. Actually I'd just rename build-system
> >>>>>>>>>>> to maven.
> >>>>>>>>>>>
> >>>>>>>>>>> Control Plane is a term I've never heard before in this context;
> >>>>>>>>>>> I'd replace it with Coordination.
> >>>>>>>>>>>
> >>>>>>>>>>> The "Documentation" description refers to it as a "Fallback
> >>>>>>>>>>> component". In other words, if I make a change to the metrics
> >>>>>>>>>>> documentation I shouldn't use this component any more?
> >>>>>>>>>>>
> >>>>>>>>>>> I don't see the benefit of a `Misc` major category. I'd attribute
> >>>>>>>>>>> everything that doesn't have a major category implicitly to
> >>>>>>>>>>> "Misc".
> >>>>>>>>>>>
> >>>>>>>>>>> Not a fan of a generalized "Legacy components" category; this
> >>>>>>>>>>> seems unnecessary. It's also a bit weird going forward as we'd
> >>>>>>>>>>> have to touch every JIRA for a component if we drop it.
> >>>>>>>>>>>
> >>>>>>>>>>> How come gelly/CEP don't have a Major category (libraries?)
> >>>>>>>>>>>
> >>>>>>>>>>> "End to end infrastructure" is not equivalent to "E2E tests".
> >>>>>>>>>>> Infrastructure is not about fixing failing tests, which is what
> >>>>>>>>>>> we partially used this component for so far.
> >>>>>>>>>>>
> >>>>>>>>>>> I don't believe you can get rid of the generic "Tests" component;
> >>>>>>>>>>> consider any changes to the `flink-test-utils-junit` module.
> >>>>>>>>>>>
> >>>>>>>>>>> You propose deleting "Core" and "Configuration" but haven't
> >>>>>>>>>>> listed any migration paths.
> >>>>>>>>>>>
> >>>>>>>>>>> If there's an API / Python category there should also be an
> >>>>>>>>>>> API / Scala category. This could also include the scala-shell.
> >>>>>>>>>>> Note that the existing Scala API category is not mentioned
> >>>>>>>>>>> anywhere in the document.
> >>>>>>>>>>>
> >>>>>>>>>>> How do you actually want to do the migration?
> >>>>>>>>>>>
> >>>>>>>>>>> On 08.02.2019 13:13, Timo Walther wrote:
> >>>>>>>>>>>> Hi Robert,
> >>>>>>>>>>>>
> >>>>>>>>>>>> thanks for starting this discussion. I was also about to suggest
> >>>>>>>>>>>> splitting the `Table API & SQL` component because it contains
> >>>>>>>>>>>> already more than 1000 issues.
> >>>>>>>>>>>>
> >>>>>>>>>>>> My comments:
> >>>>>>>>>>>>
> >>>>>>>>>>>> - Rename "SQL/Shell" to "SQL/Client" because the long-term goal
> >>>>>>>>>>>> might not only be a CLI interface. I would keep the generic name
> >>>>>>>>>>>> "SQL Client" for now. This is also what is written in FLIPs,
> >>>>>>>>>>>> presentations, and documentation.
> >>>>>>>>>>>> - Rename "SQL/Query Planner" to "SQL/Planner": a query is a
> >>>>>>>>>>>> read-only operation but we support things like INSERT INTO etc.
> >>>>>>>>>>>> Planner is more generic.
> >>>>>>>>>>>> - Rename "Gelly" to "Graph Processing". New users don't know
> >>>>>>>>>>>> what Gelly means. This is the only component that has a "feature
> >>>>>>>>>>>> name". I don't know if we want to stick with that in the future.
> >>>>>>>>>>>> - Not sure about this: Introduce a "SQL/Connectors"? Because SQL
> >>>>>>>>>>>> connectors are tightly bound to SQL internals but also to the
> >>>>>>>>>>>> connector itself.
> >>>>>>>>>>>> - Rename "Connectors/HCatalog" to "Connectors/Hive". This name
> >>>>>>>>>>>> is more generic and reflects the efforts about Hive Metastore
> >>>>>>>>>>>> and catalog integration that is currently taking place.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks,
> >>>>>>>>>>>> Timo
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 08.02.19 at 12:39, Robert Metzger wrote:
> >>>>>>>>>>>>> Hi all,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I am currently trying to improve how the Flink community is
> >>>>>>>>>>>>> handling incoming pull requests and JIRA tickets.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I've looked at how other big communities are handling such a
> >>>>>>>>>>>>> high number of contributions, and I found that many are using
> >>>>>>>>>>>>> GitHub labels extensively.
> >>>>>>>>>>>>> An integral part of the label use is to tag PRs with the
> >>>>>>>>>>>>> component / area they belong to. I think the most obvious and
> >>>>>>>>>>>>> logical way of tagging the PRs is by using the JIRA components.
> >>>>>>>>>>>>> This will force us to keep the JIRA tickets well-organized, if
> >>>>>>>>>>>>> we want the PRs to be organized :)
> >>>>>>>>>>>>> I will soon start a separate discussion for the GitHub labels.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Let's first discuss the JIRA components.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I've created the following Wiki page with my proposal of the
> >>>>>>>>>>>>> new components, and how to migrate from the existing components:
> >>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/Proposal+for+new+JIRA+Components
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Please comment here or directly in the Wiki to let me know
> >>>>>>>>>>>>> what you think.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Best,
> >>>>>>>>>>>>> Robert