Hi all,

Given the number of problems that we have discovered since the release of
1.5.0, I believe it makes sense to create a new Yunikorn release, 1.5.1,
which consists of bug fixes only. If I'm not mistaken, we haven't done this
before (at least since leaving the ASF incubator), so this would be the
first maintenance (patch) Yunikorn release.

There are a bunch of fixes that are already on branch-1.5:

   - YUNIKORN-2521 Scheduler deadlock (resolved indirectly by YUNIKORN-2544)
   - YUNIKORN-2539 Add optional deadlock detection
   - YUNIKORN-2544 [UMBRELLA] Fix Yunikorn potential locking issues
      - YUNIKORN-2543 Fix locking in RMProxy
      - YUNIKORN-2545 Eliminate multiple lock calls from Queue
      - YUNIKORN-2548 Potential deadlock during concurrent
      bottom-up/top-down queue traversal
      - YUNIKORN-2550 Fix locking in PartitionContext
      - YUNIKORN-2552 Recursive locking when sending remove queue event
      - YUNIKORN-2553 [core] Enable deadlock detection during unit tests
      - YUNIKORN-2563 [shim] Enable deadlock detection during unit tests
      - YUNIKORN-2574 totalPartitionResource should not be mutated with
      AddTo/SubFrom
      - YUNIKORN-2562 Nil pointer panic in Application.ReplaceAllocation()


The following is In Progress for 1.5.1:

   - YUNIKORN-2526 Discrepancy between shim cache and core app/task list
   after scheduler restart


Candidates:

   - YUNIKORN-2520 PVC errors in AssumePod() are not handled properly -
   Resolved, only cherry-picking is needed
   - YUNIKORN-2057 FindQueueByAppID is slow - Critical priority, "In
   progress" since Oct 2023
   - YUNIKORN-1089 Application handling with invalid task group
   annotations - Critical priority, no progress
   - YUNIKORN-1988 Preemption happens when a queue lower than its
   guaranteed capacity - Critical priority, "In progress" since Sep 2023


Thoughts, opinions? What should be the scope of 1.5.1?

Thanks,
Peter
