[
https://issues.apache.org/jira/browse/YUNIKORN-201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17188809#comment-17188809
]
Weiwei Yang commented on YUNIKORN-201:
--------------------------------------
> I can change the appstatus to scheduling state, but I would keep the Status,
> because this is a predefined subresource.
OK. This is not a MUST from what I can see. Changing it to SchedulingState is
merely to avoid giving users the impression that this field is the source of
truth for app states, which I can see confusing a lot of people. We need to
document this carefully so that it makes sense to users.
> will there be a "finished" status for the App CRD
Unfortunately, we will not be able to set a "finished" state in the app CRD
today. Only app operators understand when an app is finished/completed. The
scheduler cannot tell that from the info it has. E.g. we cannot assume a job
is completed just because no pod is running. A good example is
ScheduledSparkApplication: after one job succeeds and before the second job is
launched, there is no pod running but the app is not finished.
So the fix for YUNIKORN-26 won't be that easy. We would have to introduce a way
to get feedback from app operators, observe when an app is finished, notify
the scheduler-core, and change the state accordingly. The logic can be
different for different apps. I doubt that's something we want to do. Instead,
I suggest we simply track the scheduling state in our CRD.
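To make the point concrete, here is a minimal sketch in Python. It is not
YuniKorn code: the App class, the state names (Waiting, Pending, Running) and
both functions are hypothetical and exist only for illustration. It shows why
"no pods" cannot be read as "finished", while a scheduling state only reports
what the scheduler can actually observe.

```python
from dataclasses import dataclass
from enum import Enum


class SchedulingState(Enum):
    # Hypothetical state names, assumed only for this illustration.
    WAITING = "Waiting"   # no pods currently known to the scheduler
    PENDING = "Pending"   # at least one pod is waiting for resources
    RUNNING = "Running"   # pods are placed and none are waiting


@dataclass
class App:
    # Hypothetical view of an app as the scheduler sees it.
    app_id: str
    running_pods: int = 0
    pending_pods: int = 0


def naive_is_finished(app: App) -> bool:
    """The tempting but unsafe heuristic: no pods means the app is done."""
    return app.running_pods == 0 and app.pending_pods == 0


def scheduling_state(app: App) -> SchedulingState:
    """Report only scheduling facts; make no claim about completion."""
    if app.pending_pods > 0:
        return SchedulingState.PENDING
    if app.running_pods > 0:
        return SchedulingState.RUNNING
    return SchedulingState.WAITING


# A scheduled app between two runs: the first job succeeded, the second
# has not been launched yet, so the scheduler sees no pods at all.
between_runs = App(app_id="spark-nightly")
wrongly_finished = naive_is_finished(between_runs)
state_between_runs = scheduling_state(between_runs)
```

For the app between two runs, the naive check reports it as finished, which is
wrong, while the scheduling state is simply "Waiting" and says nothing about
whether the app has completed.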
> when the app is finished and delete the CRD the status will be changed from
> "Waiting" to none
When an app is deleted, we will make sure the corresponding app CRD is also
deleted, and subsequently we will delete the app from the scheduler. We do
not need to change the state in this case.
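The deletion order described above could be sketched roughly as follows. This
is an assumption-laden illustration in Python, not the shim's implementation:
the two dictionaries stand in for the app CRD store and the scheduler's app
registry, and on_app_deleted is a hypothetical handler name.

```python
from typing import Dict, List


def on_app_deleted(
    app_id: str,
    app_crds: Dict[str, str],
    scheduler_apps: Dict[str, str],
) -> List[str]:
    """Remove the app CRD first, then the app in the scheduler.

    Both dictionaries map an app ID to its last known scheduling state.
    No state transition is written; the entries are simply removed.
    Returns the list of actions taken, in order.
    """
    actions: List[str] = []
    if app_crds.pop(app_id, None) is not None:
        actions.append("crd-deleted")
    if scheduler_apps.pop(app_id, None) is not None:
        actions.append("scheduler-app-removed")
    return actions


# Hypothetical stand-ins for the CRD store and the scheduler registry.
crds = {"app-1": "Waiting", "app-2": "Running"}
apps = {"app-1": "Waiting", "app-2": "Running"}
actions = on_app_deleted("app-1", crds, apps)
```

After the call, "app-1" is gone from both stores and "app-2" is untouched. The
state of "app-1" is never rewritten on the way out.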
> Application tracking API and CRD
> --------------------------------
>
> Key: YUNIKORN-201
> URL: https://issues.apache.org/jira/browse/YUNIKORN-201
> Project: Apache YuniKorn
> Issue Type: New Feature
> Components: core - scheduler, scheduler-interface, shim - kubernetes
> Reporter: Weiwei Yang
> Assignee: Kinga Marton
> Priority: Major
>
> Today, YK works behind the scenes, and the workflow is as follows:
> # the app operator or job server launches a bunch of pods on K8s
> # YK gets notified and groups pods into apps based on appID
> # YK schedules the pods with respect to the app info
> This provides a simple model to integrate with existing K8s and to support
> workloads, but it has some user experience issues, such as:
> # YK can hardly manage the app lifecycle end to end. An outstanding issue is
> that we do not know when an app is finished if we only look at the pod status.
> # YK does not have the ability to admit apps. We need the ability to admit
> apps based on various conditions, e.g. resource quota, cluster overhead, ACL, etc.
> # It is hard to track app status. Sometimes an app might be pending in
> resource queues, but we do not have a good way to expose such status info.
> To further improve the user experience, we need to introduce an application
> tracking API and K8s custom resource definition (CRD). The CRD will be used
> by app operator/job server to interact with YK, to get the lifecycle fully
> controlled.