Re: [RESULT] [VOTE] Release Apache YuniKorn (incubating) 0.12.2 RC2

2022-01-20 Thread Wilfred Spiegelenburg
We have seen large numbers of people running and deploying. I have opened a PR with the fix. The scheduler should not get deleted unless it is scaled down on purpose. It should not get evicted either; it should run as a high-priority pod, unless we missed that. Crashing of the scheduler is a bug. We

Re: [RESULT] [VOTE] Release Apache YuniKorn (incubating) 0.12.2 RC2

2022-01-20 Thread Weiwei Yang
Agree, this needs to be fixed. Likely we need to revoke 0.12.2 and get out a 0.12.3.
On Thu, Jan 20, 2022 at 9:56 PM Chaoran Yu wrote:
> Yes, Helm install and upgrade both work.
> The failure scenario is as follows:
>
> 1. Both the admission controller and the scheduler pods are running
> 2.

Re: Apache YuniKorn (Incubating) - Community Graduation Vote

2022-01-20 Thread Weiwei Yang
Hi all, most issues under the graduation preparation JIRA YUNIKORN-1005 are fixed. The remaining one is the who-are-we web page; I am currently collecting info for that, and it should be done by next week. Shall we start the vote now? I can start a new

Re: [RESULT] [VOTE] Release Apache YuniKorn (incubating) 0.12.2 RC2

2022-01-20 Thread Chaoran Yu
Yes, Helm install and upgrade both work. The failure scenario is as follows:
1. Both the admission controller and the scheduler pods are running
2. The scheduler pod is restarted for some reason (e.g. deleted, evicted, or crashed)
3. The new scheduler pod will be stuck in the pending state
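The failure scenario above can be illustrated with a minimal sketch. It assumes the admission controller works by rewriting each new pod's scheduler name so that YuniKorn schedules it; under that assumption, a restarted scheduler pod would be handed to a scheduler that is not running yet and would stay pending. The `Pod` class, the label values, the scheduler name, and the `mutate` function are all hypothetical and are not taken from the YuniKorn code base or from the PR mentioned in this thread.

```python
from dataclasses import dataclass, field

# Assumption: the scheduler pod can be recognised by these labels.
# The label keys and values are hypothetical, chosen for illustration.
SCHEDULER_POD_LABELS = {"app": "yunikorn", "component": "yunikorn-scheduler"}
YUNIKORN_SCHEDULER_NAME = "yunikorn"  # assumed scheduler name


@dataclass
class Pod:
    """Tiny stand-in for a Kubernetes pod spec (not a real client type)."""
    name: str
    labels: dict = field(default_factory=dict)
    scheduler_name: str = "default-scheduler"


def is_scheduler_pod(pod: Pod) -> bool:
    """True when every identifying label is present on the pod."""
    return all(pod.labels.get(k) == v for k, v in SCHEDULER_POD_LABELS.items())


def mutate(pod: Pod) -> Pod:
    """Assign the pod to YuniKorn unless it is the scheduler itself."""
    if is_scheduler_pod(pod):
        # Leave the scheduler pod to the default scheduler so it can start
        # even when no YuniKorn scheduler is running yet.
        return pod
    pod.scheduler_name = YUNIKORN_SCHEDULER_NAME
    return pod
```

With the `is_scheduler_pod` check removed, the restarted scheduler pod would be assigned to YuniKorn like any other pod, which matches step 3 of the scenario.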

Re: [RESULT] [VOTE] Release Apache YuniKorn (incubating) 0.12.2 RC2

2022-01-20 Thread Weiwei Yang
Hmm, that is a bug. But during the release verification I tried the helm install, and that worked as expected. I am guessing that is because the scheduler always gets started first. Maybe the same holds for the upgrade? In this case, maybe this can work as long as people are using helm charts to

Re: [RESULT] [VOTE] Release Apache YuniKorn (incubating) 0.12.2 RC2

2022-01-20 Thread Chaoran Yu
I just spotted a bug, https://issues.apache.org/jira/browse/YUNIKORN-1038, which is critical and worth backporting to branch 0.12.
On Thu, Jan 20, 2022 at 12:12 PM Sunil Govindan wrote:
> A late +1 (binding) from me.
>
> I built this from source
> - Ran basic spark job
> - Verified UI
> -

[jira] [Created] (YUNIKORN-1038) Admission controller does not ignore the YuniKorn scheduler pod

2022-01-20 Thread Chaoran Yu (Jira)
Chaoran Yu created YUNIKORN-1038:
Summary: Admission controller does not ignore the YuniKorn scheduler pod
Key: YUNIKORN-1038
URL: https://issues.apache.org/jira/browse/YUNIKORN-1038
Project: Apache

Re: [RESULT] [VOTE] Release Apache YuniKorn (incubating) 0.12.2 RC2

2022-01-20 Thread Sunil Govindan
A late +1 (binding) from me. I built this from source
- Ran a basic Spark job
- Verified the UI
- Checked the signature
- Checked the images
Thanks
Sunil
On Wed, Jan 19, 2022 at 8:44 AM Craig Condit wrote:
> Hi all,
>
> The vote to Release Apache YuniKorn (incubating) 0.12.2 RC2 has passed
> with 3

[jira] [Resolved] (YUNIKORN-1037) Update community conf links

2022-01-20 Thread Weiwei Yang (Jira)
[ https://issues.apache.org/jira/browse/YUNIKORN-1037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Weiwei Yang resolved YUNIKORN-1037.
Fix Version/s: 1.0.0
Resolution: Fixed
> Update community conf links
>

[jira] [Created] (YUNIKORN-1037) Update community conf links

2022-01-20 Thread Weiwei Yang (Jira)
Weiwei Yang created YUNIKORN-1037:
Summary: Update community conf links
Key: YUNIKORN-1037
URL: https://issues.apache.org/jira/browse/YUNIKORN-1037
Project: Apache YuniKorn
Issue Type: Bug