I'd like to start the release process.

The following items will be delivered as part of 1.5.1:
https://issues.apache.org/jira/issues/?filter=12353383

This release contains no new features, only bug fixes. No further items
will be considered unless something critical is found.

Planned schedule:
RC1 out: 10th May
Voting: from 10th May to early next week, 13th-14th May
Release: 15th-16th May

Thanks,
Peter

On Thu, May 2, 2024 at 2:11 AM Shravan Achar
<shravan.ac...@apple.com.invalid> wrote:

> I have been helping Peter with YUNIKORN-2526, and it has been a tricky
> problem to reproduce and resolve. It makes sense to continue to make
> progress on it without blocking the 1.5.1 patch release, as the release
> already has considerable fixes (re: deadlock).
>
> Shravan
>
> On 2024/04/29 15:20:27 Peter Bacsko wrote:
> > Hey Wilfred,
> >
> > Yes, I'm taking the role of release manager.
> > I cherry-picked YUNIKORN-2520 to branch-1.5.
> >
> > Regarding the remaining JIRAs, I asked PoAn Yang on Slack to take a
> > look at YUNIKORN-2057 as he originally volunteered to solve it. I told
> > him that it was not urgent, but depending on how quickly he makes
> > progress, we might re-consider our position later.
> >
> > Peter
> >
> > On Mon, Apr 29, 2024 at 5:00 AM Wilfred Spiegelenburg <wi...@apache.org>
> > wrote:
> >
> > > Peter,
> > >
> > > Thank you for starting this discussion. See inline for further
> > > comments.
> > >
> > > > Hi all,
> > > >
> > > > Due to the number of problems that we have discovered since the
> > > > release of 1.5.0, I believe it makes sense to create a new Yunikorn
> > > > release which consists of bug fixes only. If I'm not mistaken we
> > > > haven't done this before (at least since leaving the ASF incubator),
> > > > so this would be the first minor Yunikorn release.
> > >
> > > +1
> > > I am totally for releasing YuniKorn 1.5.1 with the lock fixes.
> > > Looking at all the work you have done for this release: would you be
> > > willing to also step up as a release manager for the 1.5.1 release?
> > >
> > > > There are a bunch of fixes that are already on branch-1.5:
> > > >
> > > >    - YUNIKORN-2521 Scheduler deadlock (resolved indirectly by
> > > >    YUNIKORN-2544)
> > > >    - YUNIKORN-2539 Add optional deadlock detection
> > > >    - YUNIKORN-2544 [UMBRELLA] Fix Yunikorn potential locking issues
> > > >       - YUNIKORN-2543 Fix locking in RMProxy
> > > >       - YUNIKORN-2545 Eliminate multiple lock calls from Queue
> > > >       - YUNIKORN-2548 Potential deadlock during concurrent
> > > >       bottom-up/top-down queue traversal
> > > >       - YUNIKORN-2550 Fix locking in PartitionContext
> > > >       - YUNIKORN-2552 Recursive locking when sending remove queue
> > > >       event
> > > >       - YUNIKORN-2553 [core] Enable deadlock detection during unit
> > > >       tests
> > > >       - YUNIKORN-2563 [shim] Enable deadlock detection during unit
> > > >       tests
> > > >       - YUNIKORN-2574 totalPartitionResource should not be mutated
> > > >       with AddTo/SubFrom
> > > >       - YUNIKORN-2562 Nil pointer panic in
> > > >       Application.ReplaceAllocation()
> > > >
> > >
> > > Yes for all the above.
> > >
> > > > The following is In Progress for 1.5.1:
> > > >
> > > >    - YUNIKORN-2526 Discrepancy between shim cache and core app/task
> > > >    list after scheduler restart
> > >
> > > This would be a good one to get in if we have some progress on this.
> > > Do we understand what is going on yet? I looked at the jira and am not
> > > sure if we understand the root cause.
> > >
> > > > Candidates:
> > > >
> > > >    - YUNIKORN-2520 PVC errors in AssumePod() are not handled
> > > >    properly - Resolved, only cherry-picking is needed
> > >
> > > Yes, this could be added.
> > >
> > > I also think we need to check if we have any CVE fixes that need to be
> > > added.
> > > Quick check shows these two:
> > > * golang.org/x/net 0.23 (CVE-2023-45288 or GO-2024-2687 via
> > >   YUNIKORN-2541)
> > > * google.golang.org/protobuf to v1.33.0 (CVE-2024-24786 via
> > >   YUNIKORN-2469)
> > > * build with golang 1.21.9
> > >
> > > To satisfy the scanners, although we are not affected:
> > > * K8s 1.29.4 (CVE-2024-3177)
> > >
> > >
> > > >    - YUNIKORN-2057 FindQueueByAppID is slow - Critical priority,
> > > >    "In progress" since Oct 2023
> > > >    - YUNIKORN-1089 Application handling with invalid task group
> > > >    annotations - Critical priority, no progress
> > > >    - YUNIKORN-1988 Preemption happens when a queue lower than its
> > > >    guaranteed capacity - Critical priority, "In progress" since Sep
> > > >    2023
> > >
> > > No for the last 3 mentioned. We did not block the 1.5.0 release on
> > > these and they have not made enough progress since then.
> > > I would not consider them as a possible candidate for 1.5.1
> > >
> > > Wilfred
> > >
> > > >
> > > > Thoughts, opinions? What should be the scope of 1.5.1?
> > > >
> > > > Thanks,
> > > > Peter
> > >
