[jira] [Resolved] (YUNIKORN-2507) Picking Victims should consider usage and max quota for queues at each level
[ https://issues.apache.org/jira/browse/YUNIKORN-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manikandan R resolved YUNIKORN-2507.
Fix Version/s: 1.6.0
Target Version: 1.6.0
Resolution: Fixed

Merged to master

> Picking Victims should consider usage and max quota for queues at each level
>
> Key: YUNIKORN-2507
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2507
> Project: Apache YuniKorn
> Issue Type: Sub-task
> Components: core - scheduler
> Reporter: Manikandan R
> Assignee: Manikandan R
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.6.0
>
> Queue setup: root.family.parent.child[1-2]
> Max resource has been set on the parent queue, say 10 GB. Usage is slightly
> lower, say 9 GB. Ask1 (say, 2 GB) comes in for the child1 queue, whose usage
> is already below its guaranteed quota, so there is a need for preemption.
> In this case, fence selection/queue selection should not go outside the
> parent queue hierarchy, and queues at the same level as parent (the parent
> queue's siblings) should not be considered at all, because accommodating the
> ask somewhere in the parent siblings' hierarchy would violate the max
> resource quota of the parent queue.

--
This message was sent by Atlassian Jira (v8.20.10#820010)

To unsubscribe, e-mail: dev-unsubscr...@yunikorn.apache.org
For additional commands, e-mail: dev-h...@yunikorn.apache.org
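The YUNIKORN-2507 scenario can be illustrated with a small sketch. This is not YuniKorn's implementation: the `Queue` type, `fenceFor` and `inside` are hypothetical names invented here, and resources are reduced to a single GB figure. The sketch assumes the fence is the first queue, walking from the asking queue towards root, whose max quota the ask would exceed.

```go
package main

import "fmt"

// Queue is a hypothetical, simplified queue node (not YuniKorn's type).
type Queue struct {
	Name   string
	Parent *Queue
	MaxGB  int64 // 0 means no max quota is set on this queue
	UsedGB int64
}

// fenceFor walks from the asking queue towards root and returns the first
// queue whose max quota would be exceeded by the ask. Victims have to come
// from inside that queue's subtree. nil means no max quota is in the way.
func fenceFor(asking *Queue, askGB int64) *Queue {
	for cur := asking; cur != nil; cur = cur.Parent {
		if cur.MaxGB > 0 && cur.UsedGB+askGB > cur.MaxGB {
			return cur
		}
	}
	return nil
}

// inside reports whether candidate is the fence or one of its descendants.
// A nil fence places no restriction, so every candidate is accepted.
func inside(candidate, fence *Queue) bool {
	if fence == nil {
		return true
	}
	for cur := candidate; cur != nil; cur = cur.Parent {
		if cur == fence {
			return true
		}
	}
	return false
}

func main() {
	// Numbers taken from the issue: parent max 10 GB, usage 9 GB, ask 2 GB.
	root := &Queue{Name: "root", UsedGB: 29}
	family := &Queue{Name: "root.family", Parent: root, UsedGB: 29}
	parent := &Queue{Name: "root.family.parent", Parent: family, MaxGB: 10, UsedGB: 9}
	child1 := &Queue{Name: "root.family.parent.child1", Parent: parent, UsedGB: 2}
	child2 := &Queue{Name: "root.family.parent.child2", Parent: parent, UsedGB: 7}
	other := &Queue{Name: "root.family.other", Parent: family, UsedGB: 20}

	fence := fenceFor(child1, 2)
	fmt.Println("fence:", fence.Name)
	fmt.Println("child2 eligible:", inside(child2, fence))
	fmt.Println("parent's sibling eligible:", inside(other, fence))
}
```

With these numbers the fence is the parent queue, so child2 is a candidate for victims while the parent's sibling is not.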
[jira] [Created] (YUNIKORN-2520) PVC errors in AssumePod() is not handled properly
Peter Bacsko created YUNIKORN-2520:
--------------------------------------

Summary: PVC errors in AssumePod() is not handled properly
Key: YUNIKORN-2520
URL: https://issues.apache.org/jira/browse/YUNIKORN-2520
Project: Apache YuniKorn
Issue Type: Bug
Components: shim - kubernetes
Reporter: Peter Bacsko

When a volume operation in {{AssumePod()}} fails, the allocation on the core side is not removed. Although we check the result of UpdateAllocation, the error handling only logs the error:

{noformat}
if err := callback.UpdateAllocation(response); err != nil {
    rmp.handleUpdateResponseError(rmID, err)
}
...
func (rmp *RMProxy) handleUpdateResponseError(rmID string, err error) {
    log.Log(log.RMProxy).Error("failed to handle response",
        zap.String("rmID", rmID),
        zap.Error(err))
}
{noformat}

I suggest moving the volume-related code to {{Task.postTaskAllocated}}. In that case, the task will transition to the "Failed" state and we'll have the allocationID available, so we can release both the ask and the allocation:

{noformat}
func (task *Task) releaseAllocation() {
    ...
    var releaseRequest *si.AllocationRequest
    s := TaskStates()
    switch task.GetTaskState() {
    case s.New, s.Pending, s.Scheduling, s.Rejected:
        releaseRequest = common.CreateReleaseAskRequestForTask(
            task.applicationID, task.taskID, task.application.partition)  <-- release ask + allocation if possible
    default:
        if task.allocationID == "" {
            ... log error ...
            return
        }
        releaseRequest = common.CreateReleaseAllocationRequestForTask(
            task.applicationID, task.taskID, task.allocationID,
            task.application.partition, task.terminationType)
    }
    ...
{noformat}
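The flow proposed in YUNIKORN-2520 might look roughly like the sketch below. It is self-contained and does not use the shim's real types: `task`, `volumeBinder`, `releaseFn`, `okBinder`, `failingBinder` and the "Bound" state are hypothetical stand-ins. Only the idea comes from the issue: run the volume step after the allocation ID is known, and on failure mark the task "Failed" and release the allocation instead of only logging.

```go
package main

import (
	"errors"
	"fmt"
)

// task is a hypothetical, stripped-down stand-in for the shim's Task.
type task struct {
	taskID       string
	allocationID string
	state        string
}

// volumeBinder stands in for the volume operations done while assuming a pod.
type volumeBinder func(taskID string) error

// releaseFn stands in for sending a release request to the core.
type releaseFn func(taskID, allocationID string)

// postTaskAllocated records the allocation ID first, then runs the volume
// step. On failure the task is marked Failed and the allocation is released,
// so the core does not keep a stale allocation.
func (t *task) postTaskAllocated(allocationID string, bind volumeBinder, release releaseFn) {
	t.allocationID = allocationID
	if err := bind(t.taskID); err != nil {
		t.state = "Failed"
		release(t.taskID, t.allocationID)
		return
	}
	t.state = "Bound"
}

// okBinder and failingBinder are illustrative volume steps.
func okBinder(string) error { return nil }

func failingBinder(string) error { return errors.New("volume binding failed") }

func main() {
	var released []string
	release := func(taskID, allocationID string) {
		released = append(released, taskID+"/"+allocationID)
	}

	t := &task{taskID: "task-1", state: "Allocated"}
	t.postTaskAllocated("alloc-1", failingBinder, release)
	fmt.Println(t.state, released)
}
```

The point of the design is that the release request can carry the allocation ID, which is not available if the failure is only observed inside the UpdateAllocation callback.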
[jira] [Created] (YUNIKORN-2519) Remove bypass ACL check from placement rules
Wilfred Spiegelenburg created YUNIKORN-2519:
--------------------------------------

Summary: Remove bypass ACL check from placement rules
Key: YUNIKORN-2519
URL: https://issues.apache.org/jira/browse/YUNIKORN-2519
Project: Apache YuniKorn
Issue Type: Improvement
Components: core - scheduler
Reporter: Wilfred Spiegelenburg
Assignee: Wilfred Spiegelenburg

Instead of having every rule except the recovery rule return a flag saying that the ACL check must not be bypassed, special-case the recovery rule so that it bypasses the checks. The recovery queue is created without ACLs or quota and is always a leaf queue. The only rule that can return the recovery queue is the recovery rule, which is the last one in the list. Use all these facts to simplify the placement processing.
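A minimal sketch of the simplification described in YUNIKORN-2519, under stated assumptions: `rule`, `aclCheck`, `fixedRule`, `place` and `denyAll` are hypothetical names, not the core's API, and the queue name `root.@recover@` is taken from YUNIKORN-2518. The sketch also assumes that a denied ACL check moves on to the next rule, which the issue text does not state.

```go
package main

import (
	"errors"
	"fmt"
)

// recoveryQueue is the recovery queue path as named in YUNIKORN-2518.
const recoveryQueue = "root.@recover@"

// rule returns a queue name for the application, or "" when it does not match.
type rule func(appID string) string

// aclCheck reports whether the user may submit to the queue.
type aclCheck func(queue, user string) bool

// fixedRule is an illustrative rule that always returns the same queue.
func fixedRule(queue string) rule {
	return func(string) string { return queue }
}

// place runs the rules in order. No rule returns a "bypass ACL" flag:
// the recovery queue is recognised by name and skips the ACL check,
// every other queue must pass it.
func place(appID, user string, rules []rule, allowed aclCheck) (string, error) {
	for _, r := range rules {
		queue := r(appID)
		if queue == "" {
			continue // rule did not match, try the next one
		}
		if queue == recoveryQueue {
			return queue, nil // created without ACLs, nothing to check
		}
		if allowed(queue, user) {
			return queue, nil
		}
		// Assumption: a denied ACL check falls through to the next rule.
	}
	return "", errors.New("no rule placed application " + appID)
}

// denyAll is an illustrative ACL check that rejects every submission.
func denyAll(queue, user string) bool { return false }

func main() {
	// The recovery rule is last in the list, as the issue describes.
	rules := []rule{fixedRule("root.default"), fixedRule(recoveryQueue)}
	queue, err := place("app-1", "alice", rules, denyAll)
	fmt.Println(queue, err)
}
```

In the real scheduler the recovery rule presumably matches only specific applications; here it matches everything to keep the example short.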
[jira] [Created] (YUNIKORN-2518) Allow recovery queue in REST requests
Wilfred Spiegelenburg created YUNIKORN-2518:
--------------------------------------

Summary: Allow recovery queue in REST requests
Key: YUNIKORN-2518
URL: https://issues.apache.org/jira/browse/YUNIKORN-2518
Project: Apache YuniKorn
Issue Type: Improvement
Components: core - common
Reporter: Wilfred Spiegelenburg

The current checks on REST requests that require a queue path prevent looking at the {{root.@recover@}} queue. The validator filters the queue names, which makes it impossible to use the REST requests to check whether the queue has any running applications or pods after initialisation.
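One way the validator change in YUNIKORN-2518 could be sketched is shown below. The character pattern in `queuePart` and the function `validQueuePath` are assumptions made for illustration; the actual validation rules in the core may differ. Only the queue name `root.@recover@` comes from the issue.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// recoveryQueuePath is the queue named in the issue.
const recoveryQueuePath = "root.@recover@"

// queuePart is an assumed pattern for one element of a queue path.
// It deliberately rejects '@', which is what blocks the recovery queue.
var queuePart = regexp.MustCompile(`^[a-zA-Z0-9_-]{1,64}$`)

// validQueuePath accepts the recovery queue explicitly and validates every
// other path element by element.
func validQueuePath(path string) bool {
	if path == recoveryQueuePath {
		return true
	}
	for _, part := range strings.Split(path, ".") {
		if !queuePart.MatchString(part) {
			return false
		}
	}
	return true
}

func main() {
	for _, path := range []string{"root.default", recoveryQueuePath, "root.@other@"} {
		fmt.Printf("%s valid: %v\n", path, validQueuePath(path))
	}
}
```

Allowing the single well-known path, instead of loosening the pattern, keeps other names containing `@` rejected.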
Community Over Code NA 2024 Travel Assistance Applications now open!
Hello to all users, contributors and Committers!

[ You are receiving this email as a subscriber to one or more ASF project dev or user mailing lists; it is not being sent to you directly. It is important that we reach all of our users and contributors/committers so that they may get a chance to benefit from this. We apologise in advance if this doesn't interest you, but it is on topic for the mailing lists of the Apache Software Foundation, and it is important that you do not mark this as spam in your email client. Thank you! ]

The Travel Assistance Committee (TAC) are pleased to announce that travel assistance applications for Community over Code NA 2024 are now open!

We will be supporting Community over Code NA in Denver, Colorado, from October 7th to the 10th, 2024.

TAC exists to help those who would like to attend Community over Code events but are unable to do so for financial reasons. For more info on this year's applications and qualifying criteria, please visit the TAC website at < https://tac.apache.org/ >. Applications are already open on https://tac-apply.apache.org/, so don't delay!

The Apache Travel Assistance Committee will only be accepting applications from those people who are able to attend the full event.

Important: Applications close on Monday 6th May, 2024.

Applicants have until the closing date above to submit their applications (which should contain as much supporting material as required to efficiently and accurately process their request); this will enable TAC to announce successful applications shortly afterwards.

As usual, TAC expects to deal with a range of applications from a diverse range of backgrounds; therefore, we encourage (as always) anyone thinking about sending in an application to do so ASAP.

For those who will need a visa to enter the country, we advise you to apply now so that you have enough time in case of interview delays. Do not wait until you know whether you have been accepted.
We look forward to greeting many of you in Denver, Colorado, October 2024!

Kind Regards,

Gavin

(On behalf of the Travel Assistance Committee)