Next YuniKorn release discussion - the next step
Hi all,

In today's community meeting, we started the discussion about the next release. For the release version, we have 2 candidates: *0.11* or *1.0*. Which one should be our next release? I would like to bring this to more people's attention and hear more thoughts from you.

The project was started back in *Jan 2019*, entered the Apache Incubator in *Jan 2020*, and now we see more and more adoption in the community. Today, our major interface (scheduler-interface) is stable, and major features such as hierarchical queues, app/node/queue sorting, placement rules, resource fairness, and resource reservation are all stable. IMO, we should aim for the 1.0 release in the next few months.

Please help by voting for the version name in this Google form: https://forms.gle/ZhmrFvZpBdXnRmeh7. Your opinion matters!

Thanks
Weiwei
Re: Next YuniKorn release discussion - the next step
Hi Weiwei,

Thanks for bringing up this discussion.

That may depend on what "1.0" means, and what bumping a major version means.

Usually a new major version (e.g. 0.x -> 1.x -> 2.x) contains some critical new features or breaking changes. One critical feature of the next release is the integration with the Spark K8S Operator, which I feel can justify a bump and attract more attention and adoption in the space. Anything else?

Another consideration about bumping major versions relates to graduation. I've seen projects use a new major release as a graduation ceremony from the incubator (some also stick with minor releases). Curious: what's the plan for graduation? Are there any blockers, or is it just a matter of time until the YuniKorn community applies for it? Does using 1.0 indicate we are preparing for it?

Thanks

On Wed, Apr 21, 2021 at 10:19 AM Weiwei Yang wrote:
> [...]
Re: Next YuniKorn release discussion - the next step
Hi Bowen,

Thanks for sharing your thoughts.

There is no standard for what a 1.0 release means. IMO, we can refer to https://en.wikipedia.org/wiki/Software_versioning#Version_1.0_as_a_milestone. Having a 1.0 release indicates that the software has its major features ready, that the public APIs are stabilized, and that it is ready for general release. This is where I think we are. 1.0 is a major milestone; that's the reason I started the vote to gather opinions.

Graduation from the incubator is a different discussion. We touched on that topic in today's community call. Given the stage of the project, I think we should start to prepare for graduation. I do not think the Apache community has very clear criteria, but based on my experience, a project must prove that it is sustainable, that its community is mature and diverse, that it follows the Apache way of releasing versions, that it has enough activity, etc. We can ask the project mentors or other IPMC members for more info and guidance. The plan in my mind is to gather enough info for graduation now, fix the remaining gaps, have another release (hopefully 1.0), and then start the process.

On Wed, Apr 21, 2021 at 9:21 PM Bowen Li wrote:
> [...]
Re: Next YuniKorn release discussion - the next step
Weiwei,

We are cleaning up the scheduler interface and moving things around for the API in the next release. We are also about to make major changes to the REST interface. Settling those and making sure we have it all correct is, I think, required before we do a 1.0 release, not as part of a 1.0 release.

We also still have a tangled build between the shim and the core, which we should solve before a 1.0 release. That untangling is needed to, for instance, allow setting a log level dynamically and exposing it via the REST interface. The same goes for the admission controller: we should move from scripts inside the scheduler image to code in the admission controller image.

That is not even talking about some of the major changes that have been pushed to the background a bit, around opentracing and/or pluggability of policies. Both are major things which, I think, indicate we are not ready for 1.0 just yet.

Wilfred

On Thu, 22 Apr 2021 at 15:38, Weiwei Yang wrote:
> [...]
Re: Next YuniKorn release discussion - the next step
Hi Wilfred,

Thanks for sharing your thoughts. Please see my comments below:

> We are cleaning up the scheduler interface and moving things around for the
> API in the next release.

I assume you are referring to https://issues.apache.org/jira/browse/YUNIKORN-486. Removing some unused protobuf messages won't cause compatibility issues.

> We are also about to make major changes to the REST interface.

The REST API changes might have some compatibility impact. One way to handle this is to maintain the old v1 API for several more releases before removing it. Mani is currently working on this; we also need to check with Mani whether this can be committed to the next release. I'll start a thread to discuss this.

> We also still have a tangled build between the shim and core which we
> should solve before a 1.0 release. That untangling is for instance needed
> to, for instance, allow setting a log level dynamically and expose it via
> the REST interface.

IMO, that is not something we need to do, at least not as a near-term goal. I don't see a problem with the current deployment model or build. Scheduler-core is a standalone project that can be built separately; the shim builds a K8s scheduler as a single binary, which is deployed as a container. We still provide the flexibility to build any other form of shim that talks to the core using gRPC. I don't see any major issue here. I suggest starting a separate thread to discuss this, and not marking it as a release blocker.

> Same for the admission controller, we should move from
> scripts inside the scheduler image to code in the admission controller
> image.

This is just a non-functional change with no API changes. It won't cause any compatibility issues. We can work on this after the 1.0 release.

Releasing a 1.0 version doesn't mean we have to get every major thing fixed (if so, we would probably never release 1.0). It is a milestone indicating that the project has reached a certain level of maturity in its life cycle. It only means that the major functionality is ready and that we do not expect any major API changes until we release 2.0.

(I also pinged our mentors to collect more opinions.)

On Thu, Apr 22, 2021 at 12:06 AM Wilfred Spiegelenburg wrote:
> [...]
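Editorial illustration: the compatibility idea raised in this message (keeping the old v1 REST API alive for several releases before removing it) can be sketched roughly as below. This is a minimal sketch in Python, not YuniKorn code. The paths `/ws/v1/queues` and `/ws/v2/queues`, the `X-API-Deprecated` header, and the names `list_queues` and `make_router` are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Tuple

# (status code, response headers, body)
Response = Tuple[int, Dict[str, str], str]


def list_queues() -> str:
    """Stand-in for a real handler; returns a fixed JSON document."""
    return '{"queues": []}'


def make_router(serve_legacy: bool) -> Callable[[str], Response]:
    """Build a path -> response function.

    serve_legacy=True models the transition releases, where the old
    (hypothetical) v1 path is still served but flagged as deprecated.
    serve_legacy=False models the release in which the old path is removed.
    """
    # Each route maps to (handler, is_deprecated).
    routes = {"/ws/v2/queues": (list_queues, False)}
    if serve_legacy:
        routes["/ws/v1/queues"] = (list_queues, True)

    def handle(path: str) -> Response:
        entry = routes.get(path)
        if entry is None:
            return 404, {}, ""
        handler, deprecated = entry
        # Hypothetical marker header so clients can detect the old API.
        headers = {"X-API-Deprecated": "true"} if deprecated else {}
        return 200, headers, handler()

    return handle
```

With `make_router(serve_legacy=True)` both paths answer and only the old one carries the marker header; with `make_router(serve_legacy=False)` the old path returns 404. The disagreement in the thread is essentially about which release should be the first to run in the second mode.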
Re: Next YuniKorn release discussion - the next step
On Fri, 23 Apr 2021 at 04:00, Weiwei Yang wrote:

> > We are cleaning up the scheduler interface and moving things around for the
> > API in the next release.
>
> I assume you are referring to
> https://issues.apache.org/jira/browse/YUNIKORN-486.
> Removing some unused protobuf messages won't cause compatibility issues

I am talking about YUNIKORN-490 and its sub-Jira: moving the API definition from the core into the SI is a bit more than just removing some unused messages. It also includes a major change of protobuf and protoc in YUNIKORN-488, as we're on a superseded release.

> The REST API changes are something that might have some compatibility
> impact. One way to handle this
> is to maintain the old v1 API for several more releases before removing
> them. Mani is currently working on this,
> we also need to check with Mani if this can be committed to the next
> release. I'll start a thread to discuss this.

This is the exact reason why we do not want to go to 1.0 now. You do not want to start a 1.0 release with an already deprecated REST interface, or with something you will deprecate immediately. If we get this out of the way in a release before 1.0, we do not have to carry around the baggage of an already deprecated REST interface. We also do not want to rush this development.

> > We also still have a tangled build between the shim and core which we
> > should solve before a 1.0 release. That untangling is for instance needed
> > to, for instance, allow setting a log level dynamically and expose it via
> > the REST interface.
>
> IMO, that is not something we need to do, at least not a near-term goal.
> I don't see there is a problem with the current deployment model, or build.
> Scheduler-core is a standalone project,
> it can be built separately; the shim builds a K8s scheduler as a single
> binary and deployed as a container.
> We still provide the flexibility to build any other form of shims to talk
> to the core using GRPC. I don't see any major issue here.
> I suggest starting a separate thread to discuss this and do not mark this
> as any release blocker.
>
> > Same for the admission controller, we should move from
> > scripts inside the scheduler image to code in the admission controller
> > image.
>
> This is just a non-functional change, no API changes. it won't cause any
> compatibility issues.
> We can work on this post 1.0 release.

It has an impact. Currently we have abandoned changing log levels for YuniKorn on the fly, as it is not possible. To get this working, the first requirement is to initialise the logging in the core code. The second step is to introduce the REST API that allows the change, and a way to communicate the log level change from the core to the shim.

I have not looked at the Opentracing changes recently, but I do think they need changes to the interface as well. Instead of rushing this, why not take a little more time to figure it out? We're not talking about years, or even a year. We're talking about 2 to 3 months at the most.

Wilfred
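Editorial illustration: the two steps described in this message (initialise logging in one place, then expose a REST call that changes the level at runtime) can be sketched as below. This is a minimal sketch in Python using only the standard `logging` module; it is not YuniKorn code and does not model the core-to-shim communication. The logger name `scheduler` and the functions `init_logging` and `handle_set_level` are hypothetical.

```python
import logging

LOGGER_NAME = "scheduler"  # hypothetical name for the shared logger


def init_logging(level: str = "INFO") -> logging.Logger:
    """Step 1: initialise the logger in one place, at start-up."""
    logger = logging.getLogger(LOGGER_NAME)
    logger.setLevel(level)
    return logger


def handle_set_level(body: str) -> int:
    """Step 2: body of a hypothetical REST handler that changes the level.

    Returns an HTTP-style status code: 200 on success, 400 if the
    requested level name is unknown.
    """
    name = body.strip().upper()
    # For a known level name, getLevelName returns its numeric value;
    # for an unknown name it returns a string, which is rejected here.
    level = logging.getLevelName(name)
    if not isinstance(level, int):
        return 400
    logging.getLogger(LOGGER_NAME).setLevel(level)
    return 200
```

Because every caller obtains the same logger object by name, a single `handle_set_level("debug")` call takes effect everywhere in the process. That only works when logging is initialised in one place, which is the dependency on untangling the build that the message points out.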
Re: Next YuniKorn release discussion - the next step
Hi Everyone!

I think the safest way for now is to go with 0.11 for the next release.

Right now I think we have a stable and tested version of YuniKorn, but we also have some bigger things ongoing, as mentioned before, such as the REST API changes and the interface changes. I would prefer to have the 1.0 release without any deprecations, so only with the new API. For that we need to test it carefully and give the users some time to get familiar with it. Until then we can keep both APIs, mark the old one as deprecated, and remove it completely in the 1.0 release.

Also, even though we have a working version of gang scheduling, I think we still have some Jiras to fix, such as all the improvement Jiras and the CRD. That one is half implemented, so it would be good to complete it, or at least decide whether we need it or not.

Regards,
Kinga

On Fri, Apr 23, 2021 at 3:42 AM Wilfred Spiegelenburg wrote:
> [...]
Re: Next YuniKorn release discussion - the next step
Hi all,

Recently we had a lot of discussions. Thanks to everyone for sharing your thoughts in the Google form; it was really helpful. Please see the result:

[image: The_Next_YuniKorn_Release__0_11_or_1_0_-_Google_Forms.png]

Even though slightly more people prefer to release the 1.0 version, we do see some valid concerns about some of the remaining issues. The best choice we have now is to release 0.11 and focus on fixing these issues, and we need to move at a faster pace to get the release done.

I propose to set the code freeze date for 0.11 to Jun 29. We also need to cut a clear scope in https://issues.apache.org/jira/projects/YUNIKORN/versions/12350025 to make sure all items fit within that timeline. We can discuss these items in the coming community meetings.

Let me know if there are any concerns, thanks!

On Wed, Apr 28, 2021 at 3:30 AM Julia Kinga Marton wrote:
> [...]