Chaoran, nice catch on this one. Unfortunate that we didn’t find it before cutting 0.12.2.
I agree with Wilfred that we can add to the release notes on the website, but that we should back port to 0.12.3 as well. I can RM that release as well, unless someone else wants to volunteer.

- Craig

> On Jan 21, 2022, at 12:44 AM, Wilfred Spiegelenburg <wilfr...@apache.org> wrote:
>
> We have seen large numbers of people running and deploying. I have
> opened a PR with the fix.
> The scheduler should not get deleted, unless scaled down on purpose.
> It should not get evicted either, it should run as a high priority pod
> unless we missed that.
> Crashing of the scheduler is a bug,
>
> We should let v0.12.2 go through as normal. In the release
> announcement we should have a section that points to known issues and
> we can reference the jira there with the workaround.
>
> The workaround is as simple as a scale down and scale up. As long as
> the admission controller is running all pods will be pushed towards
> the YuniKorn scheduler. We can start on a next release on the branch
> v0.12. We should get this case as part of our e2e tests added.
>
> Wilfred
>
> On Fri, 21 Jan 2022 at 17:15, Weiwei Yang <w...@apache.org> wrote:
>>
>> Agree, this needs to be fixed.
>> Likely we need to revoke 0.12.2 and get out a 0.12.3.
>>
>> On Thu, Jan 20, 2022 at 9:56 PM Chaoran Yu <yuchaoran2...@gmail.com> wrote:
>>
>>> Yes, Helm install and upgrade both work.
>>> The failure scenario is as follows:
>>>
>>> 1. Both the admission controller and the scheduler pods are running
>>> 2. The scheduler pod is restarted for some reason (e.g. deleted, evicted,
>>>    or crashed)
>>> 3. The new scheduler pod will be stuck in the pending state because it’s
>>>    intercepted by the admission controller (The schedulerName field is
>>>    yunikorn).
>>>
>>> I think this bug is critical because if the scheduler pod fails for any
>>> reason, someone has to manually redeploy the whole thing.
>>>
>>>> On Jan 20, 2022, at 21:45, Weiwei Yang <w...@apache.org> wrote:
>>>>
>>>> Hmmm. that is a bug. But during the release verification, I have tried
>>>> the helm install, and that works as expected. I am guessing that is
>>>> because the scheduler always gets started first. Maybe the same for the
>>>> upgrade? In this case, maybe this can work as long as people are using
>>>> helm charts to deploy yunikorn? Craig, could you please look into this
>>>> and let us know if we need to revoke the vote for 0.12.2 and have a
>>>> 0.12.3?
>>>>
>>>> Thank you Chaoran to raise this up. Much appreciated!
>>>>
>>>> On Thu, Jan 20, 2022 at 5:00 PM Chaoran Yu <yuchaoran2...@gmail.com> wrote:
>>>>
>>>>> I just spotted a bug https://issues.apache.org/jira/browse/YUNIKORN-1038.
>>>>> which is critical and worth porting back into branch 0.12
>>>>>
>>>>> On Thu, Jan 20, 2022 at 12:12 PM Sunil Govindan <sun...@apache.org> wrote:
>>>>>
>>>>>> A late +1 (binding) from me.
>>>>>>
>>>>>> I build this from source
>>>>>> - Ran basic spark job
>>>>>> - Verified UI
>>>>>> - Checked signature.
>>>>>> - Checked the images.
>>>>>>
>>>>>> Thanks
>>>>>> Sunil
>>>>>>
>>>>>> On Wed, Jan 19, 2022 at 8:44 AM Craig Condit <apa...@craigcondit.com> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> The vote to Release Apache YuniKorn (incubating) 0.12.2 RC2 has passed
>>>>>>> with 3 binding +1 votes and 3 non-binding +1 votes.
>>>>>>>
>>>>>>> Vote thread:
>>>>>>> https://lists.apache.org/thread/1gw0k0g5fy86r8ljnjttdco04w7z5j4j
>>>>>>>
>>>>>>> Thank you to all the members who helped verify this release. We will
>>>>>>> move to IPMC voting shortly.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Craig
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscr...@yunikorn.apache.org
>>> For additional commands, e-mail: dev-h...@yunikorn.apache.org
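
For readers following the thread, the failure scenario Chaoran lists can be illustrated with a toy model. This is only a sketch and not YuniKorn's admission controller code: the pod dictionaries, the function names, the `is_scheduler` key, and the `exempt_scheduler_pod` guard are all hypothetical, and the actual fix tracked in YUNIKORN-1038 may work differently.

```python
# Toy model of the scenario in this thread; NOT YuniKorn's real code.
# All names below are hypothetical and chosen for illustration only.

YUNIKORN = "yunikorn"
DEFAULT = "default-scheduler"


def admit(pod, exempt_scheduler_pod=False):
    """Mimic a mutating admission step that routes pods to YuniKorn.

    `exempt_scheduler_pod` is an assumed guard used to show one way the
    loop could be broken; it is not taken from the actual patch.
    """
    mutated = dict(pod)  # leave the caller's pod untouched
    if exempt_scheduler_pod and pod.get("is_scheduler"):
        return mutated
    mutated["schedulerName"] = YUNIKORN
    return mutated


def can_be_scheduled(pod, running_schedulers):
    """A pod leaves Pending only if the scheduler it names is running."""
    return pod["schedulerName"] in running_schedulers


if __name__ == "__main__":
    # Step 2 of the scenario: the scheduler pod was restarted, so only the
    # default scheduler is running while the admission controller stays up.
    running = {DEFAULT}
    new_scheduler_pod = {
        "name": "yunikorn-scheduler",
        "is_scheduler": True,
        "schedulerName": DEFAULT,
    }

    # Step 3: the replacement pod is intercepted and handed to a scheduler
    # that is not running, so it stays Pending.
    intercepted = admit(new_scheduler_pod)
    print("intercepted pod schedulable:", can_be_scheduled(intercepted, running))

    # With the hypothetical guard, the pod keeps its original scheduler.
    exempted = admit(new_scheduler_pod, exempt_scheduler_pod=True)
    print("exempted pod schedulable:", can_be_scheduled(exempted, running))
```

In this model a Helm install works because the scheduler is already up before anything is intercepted, which matches Weiwei's guess; the deadlock only appears once the scheduler pod has to be recreated while the admission controller is still running.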