Scheduling is lock-free by the time we reach any of the
application.Try...() calls: the scheduling thread holds no locks until
it gets there. That is how it was designed and implemented.
When we get there nothing but the scheduling thread is allowed to make
changes to the application for the
Thanks for the replies. I managed to get good progress on this issue.
There is one thing I'd like to discuss. It is not critical, but IMO it
needs to be addressed.
The scope of the mutex-protected critical section is too large in
tryAllocate, tryReservedAllocate and
Case 1:
I am all for simplifying and removing locks. Changing the SI like you
propose will trigger a YuniKorn 2.0 as it is incompatible with the
current setup. There is a much simpler change that does not require a
2.0 version. See comments in the jira.
Case 2:
This is a bug I think, which has
I’m all for fixing these… and in general where lockless algorithms can be
implemented cleanly, I’m in favor of those implementations instead of requiring
locks, so for RMProxy I’m +1 on that. The extra memory for an RMProxy instance
is irrelevant.
The recursive locking case is a real problem,
Hi all,
after YUNIKORN-2539 got merged, we identified some potential deadlocks.
These are false positives now, but a small change can cause YuniKorn to
fall apart, so the term "potential deadlock" describes them properly.
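To make "potential deadlock" concrete, here is a minimal Go sketch of one well-known latent pattern: nested read locking on a sync.RWMutex. This is an illustration only. I am not claiming it is the pattern that was flagged, and nestedReadWouldBlock is a made-up name, not YuniKorn code. It relies on the documented sync.RWMutex behaviour that new readers are held back once a writer is waiting, and it uses TryRLock (Go 1.18+) instead of RLock so the demo cannot hang.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// nestedReadWouldBlock is a hypothetical helper for illustration.
// It reports whether a nested read lock is refused once a writer
// is queued behind the outer read lock.
func nestedReadWouldBlock() bool {
	var mu sync.RWMutex

	// Outer read lock, e.g. a method that later calls another
	// method which takes the same read lock again.
	mu.RLock()

	writerDone := make(chan struct{})
	go func() {
		mu.Lock() // waits until the outer read lock is released
		mu.Unlock()
		close(writerDone)
	}()

	// Poll with TryRLock rather than RLock: a plain RLock here would
	// block forever once the writer is queued, because the writer in
	// turn waits for the outer read lock held by this goroutine.
	blocked := false
	for i := 0; i < 1000 && !blocked; i++ {
		if mu.TryRLock() {
			mu.RUnlock() // writer not queued yet, try again shortly
			time.Sleep(time.Millisecond)
		} else {
			blocked = true
		}
	}

	mu.RUnlock() // release the outer read lock so the writer can finish
	<-writerDone
	return blocked
}

func main() {
	fmt.Println("nested read lock refused while writer waits:", nestedReadWouldBlock())
}
```

Without a concurrent writer the nested read lock always succeeds, which is why this kind of code can run fine for a long time and only hang after an unrelated change adds a writer at the wrong moment.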
Thoughts and opinions are welcome. IMO we should handle these with priority to